If you want to use this tool from your web site, here is the CGI Interface:
Changed:
< <
(Note: If you want to upload local html text to the tool, you need to use attribute name/value pair: enctype="multipart/form-data" within the form tag)
> >
(Note: You need to include the attribute enctype="multipart/form-data" in the form tag even if you do not upload a local file, because the tool was designed to accept local file uploads.)
pattern: A sequence of characters used either with regular expression notation or for path name expansion, as a means of selecting various character strings or path names, respectively. Values are matched against patterns to see if they should be included/excluded. In patterns "*" matches any string, "?" matches any single character.
Changed:
< <
History
> >
Predefined parameter values in Tool Bar
Source: the page the user is currently in.
Element: body or set by site owner
Words listed: all words except those in the Glasgow stop-list
Stemmer: uses an inflectional stemmer to process all the words before listing
This tool can be used to list all of the words, or only user-specified words, found within a specified tag. For example, it can list all words matching a user-entered pattern, or all words except user-specified stop words. The query results can be displayed alphabetically, by frequency, by order of appearance, or in reverse alphabetical order. If no tag is specified, the <body> tag is used.
Added:
> >
Term Definitions
stop words: Words ignored in a query because they are so commonly used that they cannot contribute to relevancy. They include conjunctions, prepositions, and articles such as "and", "to", and "a".
pattern: A sequence of characters used either with regular expression notation or for path name expansion, as a means of selecting various character strings or path names, respectively. Values are matched against patterns to see if they should be included/excluded. In patterns "*" matches any string, "?" matches any single character.
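The wildcard behaviour described in the pattern definition can be sketched in Python with the standard fnmatch module, which also treats "*" as any string and "?" as any single character. This is only an illustration, not the tool's own code: the helper name select_words and the choice to ignore capitalization when matching are assumptions. Note that fnmatch additionally gives "[...]" a special meaning, which the definition above does not mention.

```python
from fnmatch import fnmatchcase


def select_words(words, pattern, exclude=False):
    """Keep (or, with exclude=True, drop) the words matching a wildcard pattern.

    Hypothetical helper: matching is done on lowercased text, which is an
    assumption about how the tool treats capitalization.
    """
    pattern = pattern.lower()
    return [w for w in words if fnmatchcase(w.lower(), pattern) != exclude]


words = ["Text", "test", "tool", "tag", "list"]

# "*" matches any string: every word starting with "t" is included.
included = select_words(words, "t*")

# "?" matches exactly one character: only "tool" fits "t??l", so it is excluded.
excluded = select_words(words, "t??l", exclude=True)
```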
You can add a button to your web page that lists all the words in that page by calling the TAPoRware CGI script.
Here is the code for this function:
<form method="post" name="textForm" enctype="multipart/form-data"
action="http://taporware.mcmaster.ca/~taporware/cgi-bin/prototype/tlistword.cgi"
onsubmit="document.textForm.texturl.value=document.location.href">
<input type="hidden" name="source" value="url">
<input type="hidden" name="texturl">
<input type="hidden" name="freetext" value="yes"/>
<input type="hidden" name="range" value="all" />
<input type="hidden" name="sorting" value="2">
<input type="hidden" name="display" value="1">
<input type="hidden" name="taporface" value="new">
<input type="submit" name="doIt" value="List All Words of the Page">
</form>
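For callers that are not a browser, the same request can be prepared from a script. The following Python sketch builds (but does not send) a multipart/form-data POST carrying the field names and values from the form above. The function name build_request is hypothetical, the meaning of each field value is not documented here and is simply copied from the form, and whether the service still answers at this address is not something this sketch verifies.

```python
import uuid
import urllib.request

# Address taken from the form's action attribute above.
TOOL_URL = "http://taporware.mcmaster.ca/~taporware/cgi-bin/prototype/tlistword.cgi"


def build_request(page_url):
    """Build a multipart/form-data POST mirroring the HTML form (hypothetical helper)."""
    # Field names and values copied from the form; texturl is the page to analyse.
    fields = {
        "source": "url",
        "texturl": page_url,
        "freetext": "yes",
        "range": "all",
        "sorting": "2",
        "display": "1",
        "taporface": "new",
        "doIt": "List All Words of the Page",
    }
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")  # closing boundary
    body = "".join(parts).encode("utf-8")
    return urllib.request.Request(
        TOOL_URL,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


req = build_request("http://example.org/page.html")
```

Passing req to urllib.request.urlopen would submit it; that step is left out so the sketch runs without network access.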
Web Service Interface
TAPoRware provides web services to any non-profit organization. Here is the TAPoRware web services information:
If you want to use this tool from you web site, here is the CGI Interface:
> >
If you want to use this tool from your web site, here is the CGI Interface:
(Note: If you want to upload local html text to the tool, you need to use attribute name/value pair: enctype="multipart/form-data" within the form tag)
This tool can be used to list all of the words found within a specified tag. The query results can be displayed alphabetically, by frequency, by order of appearance, or in reversed alphabetical order. If no tag is specified, the <body> tag is used.
> >
This tool can be used to list all of the words, or only user-specified words, found within a specified tag. For example, it can list all words matching a user-entered pattern, or all words except user-specified stop words. The query results can be displayed alphabetically, by frequency, by order of appearance, or in reverse alphabetical order. If no tag is specified, the <body> tag is used.
History
Pseudocode
Tokenize text into words using spaces and punctuation marks
Changed:
< <
Count words with similar letters ignoring capitalization
> >
Sort and count words with similar letters ignoring capitalization
Extract words based on user specified criteria
Generate output format based on user's selection
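The pseudocode steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: the function name list_words, the tokenizing regular expression, the sorting keywords, and the three-word stop list are all invented for the example, and stemming and HTML tag extraction are left out.

```python
import re
from collections import Counter

# Illustrative stop words only; this is not the tool's actual stop-list.
STOP_WORDS = {"and", "to", "a"}


def list_words(text, stop_words=frozenset(), sorting="frequency"):
    """Return (word, count) pairs for text, ordered by the chosen criterion."""
    # 1. Tokenize: runs of letters, digits, or apostrophes are words;
    #    spaces and punctuation act as separators.
    tokens = re.findall(r"[A-Za-z0-9']+", text)
    # 2. Count words, ignoring capitalization.
    counts = Counter(token.lower() for token in tokens)
    # 3. Extract words by a user-specified criterion (here: drop stop words).
    items = [(w, n) for w, n in counts.items() if w not in stop_words]
    # 4. Order the output. Counter keeps first-seen order, so "appearance"
    #    needs no further sorting.
    if sorting == "alphabetical":
        items.sort(key=lambda pair: pair[0])
    elif sorting == "reverse":
        items.sort(key=lambda pair: pair[0], reverse=True)
    elif sorting == "frequency":
        items.sort(key=lambda pair: (-pair[1], pair[0]))
    elif sorting != "appearance":
        raise ValueError(f"unknown sorting: {sorting}")
    return items


sample = "The cat and the hat. A cat!"
by_frequency = list_words(sample, STOP_WORDS, "frequency")
by_appearance = list_words(sample, STOP_WORDS, "appearance")
```

Ties in the frequency ordering are broken alphabetically here; the tool may order them differently.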
Ways of Using
Added:
> >
Enter a valid URL in the URL field, or upload a local HTML file
Enter a valid HTML tag, or a list of tags separated by commas; the default is "body"
Select which list you want to get and enter the corresponding text if necessary
Select sorting criterion
Select output format
If you want the results displayed in the same window as the TAPoRware interface, uncheck the "Open results in new window" check box
Finally, click the "Submit" button
CGI Interface
Added:
> >
If you want to use this tool from you web site, here is the CGI Interface:
(Note: If you want to upload local html text to the tool, you need to use attribute name/value pair: enctype="multipart/form-data" within the form tag)
This tool can be used to list all of the words found within a specified tag. The query results can be displayed alphabetically, by frequency, by order of appearance, or in reversed alphabetical order. If no tag is specified, the <body> tag is used.
-- MattPatey - 13 Oct 2005