Googlizer
The Googlizer queries Google about one or more words and returns the list of web pages Google identifies as associated with the word(s).
See:
http://taporware.mcmaster.ca/~taporware/otherTools/googlizer.shtml
Description
This tool uses the Google web API to perform an internet search and then displays the results in different formats, either as a brief list or aggregated into a single page. The results can also be processed further with other TAPoRware tools. Possible search terms include:
- words;
- site date range;
- site containing specific links or similar to specific site;
- words in title;
- all words in title;
- words in URL;
- all words in URL;
- all words in text and
- specific file type.
Here is the googlizer interface:
History
The first version of the Googlizer simply returned the Google results to the user. We then changed it so that users can select results from the brief list and aggregate them into different formats:
- full text without tags
- full text with tags
- TAML format without tags
- TAML format with tags
Users can select a display format to save or to process further.
Pseudocode
Ways of Using
- Enter keywords following Google's search rules. For example, you can use + and - to include or exclude a word.
- Select one restriction in the "search limited to" field. The restrictions are self-explanatory.
- Fill in the "corresponding context" field if your restriction requires it. For example, if you select "In specified file types", enter the file types you want to search for.
- Select how many pages you want the tool to generate.
- Select what kind of results you want to see:
  - Links with brief information -- similar to the Google search result format.
  - Complete corpus of all pages -- aggregates all pages into a single page in the format you specify:
    - html format: display as a single html page.
    - plain text format: strip all html tags and format the text as well as possible into a single plain text display.
    - xml format: put each html page into an xml comment to avoid producing xml that is not well-formed.
  - Further processing.. -- if you select this one, more TAPoRware tools will be displayed for your use. Select one and fill in the html form as you would for the TAPoRware tool of the same name.
- Click the submit button to run your search.
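The three aggregation formats can be illustrated with a minimal sketch. This is not the Googlizer's own code: the function names, the `<pages>` root element, and the way hyphen runs are broken up inside comments are all assumptions made for illustration. It also ignores control characters that XML does not allow.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def to_plain_text(html_page):
    """Strip tags from one HTML page and join the remaining text."""
    parser = TextExtractor()
    parser.feed(html_page)
    parser.close()
    return " ".join(parser.chunks)


def to_xml_comment(html_page):
    """Wrap one HTML page in an XML comment.

    XML forbids "--" inside a comment, and a comment may not end in "-",
    so hyphen runs are broken up with spaces (an assumed workaround).
    """
    safe = html_page
    while "--" in safe:
        safe = safe.replace("--", "- -")
    if safe.endswith("-"):
        safe += " "
    return "<!--" + safe + "-->"


def aggregate(pages, fmt):
    """Combine several HTML pages into one document in the given format."""
    if fmt == "html":
        return "\n".join(pages)
    if fmt == "text":
        return "\n\n".join(to_plain_text(p) for p in pages)
    if fmt == "xml":
        body = "\n".join(to_xml_comment(p) for p in pages)
        # "pages" is a hypothetical root element name.
        return "<pages>\n" + body + "\n</pages>"
    raise ValueError(f"unknown format: {fmt!r}")


pages = ["<html><body><p>Hello <b>world</b></p></body></html>"]
print(aggregate(pages, "text"))
# Parsing raises ParseError if the aggregated XML is not well-formed.
ET.fromstring(aggregate(pages, "xml"))
```

Placing each page inside a comment means the XML parser never has to interpret the HTML markup, which is why unclosed or badly nested tags in the source pages do not break the aggregated document.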
CGI Interface
If you want to embed the Googlizer tool in your web page to perform TAPoRware Google searches, use the following CGI interface.
| Name | Type | Value | Description |
| search_words | text | | search key words or phrase |
| restrict | selection | nolimit, site, daterange, backlink, relatedlink, intitle, allintitle, allintext, inurl, allinurl, informats, allinformats | nolimit: no limit; site: a specific site; daterange: a date range; backlink: contains specific link; relatedlink: similar to specific page; intitle: word(s) in title; allintitle: all words in title; allintext: all words in text; inurl: word(s) in url; allinurl: all words in url; informats: in specified file types; allinformats: not in specified file types |
| r_context | text | | the value here depends on the selection above |
| aggre_pages | selection | 1, 2, 5, 10, 20, 50, 100 | # of pages you want to display or aggregate |
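As a rough sketch of how a page might call this interface, the parameters in the table can be assembled into a request URL. The endpoint below is a placeholder, since the address of the CGI script is not given here, and it is an assumption that the script accepts these parameters in a GET query string. The function name `googlizer_url` is likewise hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real CGI script URL is not given on this page.
GOOGLIZER_CGI = "http://example.org/cgi-bin/googlizer.cgi"

# Allowed values, taken from the parameter table above.
RESTRICT_VALUES = {
    "nolimit", "site", "daterange", "backlink", "relatedlink",
    "intitle", "allintitle", "allintext", "inurl", "allinurl",
    "informats", "allinformats",
}
AGGRE_PAGES_VALUES = {1, 2, 5, 10, 20, 50, 100}


def googlizer_url(search_words, restrict="nolimit", r_context="",
                  aggre_pages=10, base_url=GOOGLIZER_CGI):
    """Build a request URL from the four documented parameters."""
    if restrict not in RESTRICT_VALUES:
        raise ValueError(f"unknown restrict value: {restrict!r}")
    if aggre_pages not in AGGRE_PAGES_VALUES:
        raise ValueError(f"unsupported aggre_pages value: {aggre_pages!r}")
    params = {
        "search_words": search_words,
        "restrict": restrict,
        "r_context": r_context,
        "aggre_pages": aggre_pages,
    }
    # urlencode escapes spaces and other reserved characters.
    return base_url + "?" + urlencode(params)


print(googlizer_url("text analysis", restrict="site",
                    r_context="mcmaster.ca", aggre_pages=5))
```

Validating `restrict` and `aggre_pages` before building the URL catches typos locally instead of relying on whatever the CGI script does with unexpected values.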
Web Service Interface
TAPoRware provides web services to any non-profit organization. Here is the TAPoRware web services information:
Known Bugs
- This tool is not working reliably right now. We are trying to figure out why.
- There is a Google API restriction that sometimes blocks us. We are only allowed a limited number of queries a day. Once the daily quota is used up, Google blocks further queries.
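One way a client could avoid running into the daily limit is to count its own queries and stop before Google does. The sketch below is only an illustration of that idea, not something the Googlizer is known to do. The class name, the limit passed in, and the assumption that the quota resets at the start of each local calendar day are all hypothetical.

```python
import datetime


class DailyQuota:
    """Count queries per calendar day and refuse once a limit is reached."""

    def __init__(self, limit):
        self.limit = limit  # assumed daily allowance; the real figure is not stated here
        self._day = None
        self._used = 0

    def try_acquire(self, today=None):
        """Return True and count one query if quota remains, else False."""
        today = today or datetime.date.today()
        if today != self._day:
            # A new day has started: reset the counter (assumed reset rule).
            self._day = today
            self._used = 0
        if self._used >= self.limit:
            return False
        self._used += 1
        return True


quota = DailyQuota(limit=2)
for attempt in range(3):
    if quota.try_acquire():
        print("query", attempt, "allowed")
    else:
        print("query", attempt, "skipped: daily quota used up")
```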
To Do
-- GeoffreyRockwell - 12 May 2005
- Please add comments to this page. Tell us what you like or don't like about the tool. -- GeoffreyRockwell - 02 Jun 2005 15:10:59