Main.TAPoRwareHTMLCollocation (r1.1 vs. r1.11)
Diffs

 <<O>>  Difference Topic TAPoRwareHTMLCollocation (r1.11 - 06 Jun 2008 - LianYan)

META TOPICPARENT TAPoRware

Find Text — Collocation

See http://taporware.mcmaster.ca/~taporware/htmlTools/collocation.shtml
Line: 65 to 65

finddisp 1/2/3 selection 2 Display format which are XML text in HTML/HTML/XML tree in the order of parameter values
taporface   checkbox checked display result in a new window without graphics interface (default) or with taporware interface in the same window
Added:
>
>

Use Find Text -- Collocation TAPoRware Tool in Your Web Page

You can add a text field and a button to your web page that return the collocates of a pattern entered on that page by calling the TAPoRware CGI script.


Here is the code that you can copy and paste into your web pages:

<table style="border: solid gray 1pt"><tr><td>

<form method="post" name="htmlForm" enctype="multipart/form-data" target="_blank" action="http://taporware.mcmaster.ca/~taporware/cgi-bin/prototype/hfindtext.cgi" onsubmit="document.htmlForm.htmlurl.value=document.location.href">

<input type="hidden" name="source" value="url" />

<input type="hidden" name="htmlurl" />

<input type="hidden" name="freetext" value="yes"/>

Pattern: <input type="text" name="find_patt" />

<input type="hidden" name="context" value="1" />

<input type="hidden" name="conLeng" value="5" />

<input type="hidden" name="finddisp" value="1" />

<input type="hidden" name="sorting" value="3" />

<input type="hidden" name="taporface" value="same" />

<input type="submit" name="doIt" value="Get Collocate of the Page" />

</form>

</td></tr></table>


Web Service Interface

TAPoRware provides web services to any non-profit organization. Here is the TAPoRware web service information:

Line: 81 to 140

    • sorting -- values can be 1/2/3 corresponding to co-occurrence words by frequency/alphabetically/z-score
    • outFormat -- values are same as parameter "finddisp" in the CGI interface above
Added:
>
>

REST Service Interface

The TAPoRware REST service uses the plain-text HTTP protocol, so you can submit your request with either the POST or the GET method.

  • Service URI: http://tapor1-dev.mcmaster.ca/~restserv/html/collocates
  • Parameters:
    • htmlInput -- any HTML text
    • htmlTag -- any valid HTML tag in your submitted HTML text
    • pattern -- Unix-style pattern to find in the text
    • context -- value can be 1/2/3/4, corresponding to Words/Lines/Sentences/Paragraphs respectively
    • contextlength -- number of words/lines/sentences/paragraphs before and after the specified context
    • sorting -- sorting criterion. Values are 1/2/3, corresponding to co-occurring words by frequency/alphabetically/by z-score respectively
    • outFormat -- output format. Values are 1/2/3/4, corresponding to HTML/XML tree/XML text in HTML/tab-delimited text respectively
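
As a rough illustration of the parameter list above, the following Python sketch builds the encoded parameter string for a request. It does not contact the server. The function name `build_collocates_query` and the default values are assumptions for illustration (the defaults mirror the CGI defaults, which the REST service may not share), and the service URI may no longer be reachable.

```python
from urllib.parse import urlencode

# Service URI as listed above; availability is not guaranteed.
SERVICE_URI = "http://tapor1-dev.mcmaster.ca/~restserv/html/collocates"


def build_collocates_query(html_input, pattern, html_tag="body",
                           context=1, context_length=5,
                           sorting=1, out_format=1):
    """Return the URL-encoded parameter string (hypothetical helper).

    The default values are assumptions, not documented REST defaults.
    """
    params = {
        "htmlInput": html_input,
        "htmlTag": html_tag,
        "pattern": pattern,
        "context": context,               # 1/2/3/4 = Words/Lines/Sentences/Paragraphs
        "contextlength": context_length,
        "sorting": sorting,               # 1/2/3 = frequency/alphabetical/z-score
        "outFormat": out_format,          # 1/2/3/4 = HTML/XML tree/XML text in HTML/tab-delimited
    }
    return urlencode(params)


query = build_collocates_query("<body><p>The cat sat on the mat.</p></body>", "cat*")
get_url = SERVICE_URI + "?" + query    # GET: parameters go in the query string
post_body = query.encode("ascii")      # POST: same parameters go in the request body
```

Either `get_url` or `post_body` could then be handed to an HTTP client of your choice. Long HTML input is usually better sent with POST, since query strings have practical length limits.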

Known Bugs

To Do


 <<O>>  Difference Topic TAPoRwareHTMLCollocation (r1.10 - 28 Mar 2007 - LianYan)

META TOPICPARENT TAPoRware

Find Text — Collocation

See http://taporware.mcmaster.ca/~taporware/htmlTools/collocation.shtml
Line: 34 to 34

Ways of Using

  • Enter a valid URL in the URL field or enter a local upload html text
Changed:
<
<
  • Enter a valid html tag or tag list seperated by comma, default is "body"
  • Enter word or pattern in the corresponding text field by check the related radio button
>
>
  • Enter a valid html tag or tag list separated by comma, default is "body"
  • Enter a word or pattern in the corresponding text field by check the related radio button

  • Select the context of concordance and the length of context
Changed:
<
<
  • Select the collocates sorting critieia
>
>
  • Select the collocates sorting criteria

  • Select output format
  • If you want the results displayed in the same window with taporware interface, uncheck the check box - "Open results in new window"
  • Finally, click the "Submit" button
Line: 51 to 51

Here are the parameters:

Changed:
<
<
Parameter Name Parameter Value Control Type Default Discription
>
>
Parameter Name Parameter Value Control Type Default Description

source url/local radio button url Let user select input text (either a url or upload local html text)
htmlurl   text   A Valid URL that the pointed document should be an html text
localFile   file   The path to your local html text file
Changed:
<
<
tagtext   text body Valid html element (tag) name or multple html element name separated by comma
>
>
tagtext   text body Valid html element (tag) name or multiple html element name separated by comma

findwhat word/patt radio button word let user to select either word or pattern of the key of the concordance
find_word   text   key word of the concordance
find_patt   text   key pattern of the concordance
Line: 67 to 67

Web Service Interface

Changed:
<
<
Taporware provides web services to any non-benefit organizations. here is the taporware web services infomation:
>
>
Taporware provides web services to any non-benefit organizations. here is the taporware web services information:

Changed:
<
<
    • htmlTag -- any html element (tag) name or multple html element name separated by comma
>
>
    • htmlTag -- any html element (tag) name or multiple html element name separated by comma

    • pattern -- unix styled pattern or regular expression
Changed:
<
<
    • context -- value can be 1/2/3 which coresponding to Words/Lines/Sentences respectively
>
>
    • context -- value can be 1/2/3 which corresponding to Words/Lines/Sentences respectively

    • contextLength -- length of context
    • sorting -- values can be 1/2/3 corresponding to co-occurrence words by frequency/alphabetically/z-score
    • outFormat -- values are same as parameter "finddisp" in the CGI interface above

 <<O>>  Difference Topic TAPoRwareHTMLCollocation (r1.9 - 28 Aug 2006 - LianYan)

META TOPICPARENT TAPoRware

Find Text — Collocation

See http://taporware.mcmaster.ca/~taporware/htmlTools/collocation.shtml

 <<O>>  Difference Topic TAPoRwareHTMLCollocation (r1.8 - 18 Jul 2006 - LianYan)

META TOPICPARENT TAPoRware

Find Text — Collocation

See http://taporware.mcmaster.ca/~taporware/htmlTools/collocation.shtml
Line: 69 to 69

Taporware provides web services to any non-benefit organizations. here is the taporware web services infomation:

Changed:
<
<
>
>

  • Service Method: find_Collocation_HTML
  • parameters:
    • htmlInput -- any html string

 <<O>>  Difference Topic TAPoRwareHTMLCollocation (r1.7 - 11 Jul 2006 - LianYan)

META TOPICPARENT TAPoRware

Find Text — Collocation

See http://taporware.mcmaster.ca/~taporware/htmlTools/collocation.shtml
Line: 14 to 14

  • pattern: A sequence of characters used either with regular expression notation or for path name expansion, as a means of selecting various character strings or path names, respectively. Values are matched against patterns to see if they should be included/excluded. In patterns "*" matches any string, "?" matches any single character.
  • context: the text that occurs before and after a piece of text (or a pattern in this case).
Changed:
<
<

History

>
>

Predefined Parameter Values in Tool Bar


Added:
>
>
  • Source: the page the user is currently in.
  • Element: body or set by site owner
  • Context: words
  • Context length: 5
  • Sorting: co-occurring words by frequency
  • Display format: HTML

Pseudocode

  • Obtain HTML string by URL or from user's local disk

 <<O>>  Difference Topic TAPoRwareHTMLCollocation (r1.6 - 21 Jun 2006 - LianYan)

META TOPICPARENT TAPoRware

Find Text — Collocation

See http://taporware.mcmaster.ca/~taporware/htmlTools/collocation.shtml
Line: 8 to 8

Description

This tool takes a word from the user and returns all words directly before and directly after it based on the given context (i.e. words, lines, sentences). Results can be sorted alphabetically, by frequency, or by Z-score (an indication of how far and in what direction that item deviates from its distribution's mean, expressed in units of its distribution's standard deviation).
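
The Z-score definition above can be written as a small Python sketch. This is only the generic standard score, (value - mean) / standard deviation. The exact calculation the tool performs on span and total word counts is not documented here, and the function name `z_score` and the sample counts are made up for illustration.

```python
from statistics import mean, pstdev


def z_score(value, values):
    """Generic standard score of value relative to the distribution in values."""
    mu = mean(values)
    sigma = pstdev(values)      # population standard deviation
    if sigma == 0:
        return 0.0              # all values identical: no deviation to measure
    return (value - mu) / sigma


# Hypothetical co-occurrence counts, for illustration only.
counts = [2, 4, 4, 4, 5, 5, 7, 9]
z_high = z_score(9, counts)     # positive: above the mean
z_low = z_score(2, counts)      # negative: below the mean
```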
Added:
>
>

Term Definition

  • collocation: a sequence of words or terms which co-occur more often than would be expected by chance.
  • pattern: A sequence of characters used either with regular expression notation or for path name expansion, as a means of selecting various character strings or path names, respectively. Values are matched against patterns to see if they should be included/excluded. In patterns "*" matches any string, "?" matches any single character.
  • context: the text that occurs before and after a piece of text (or a pattern in this case).
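
To illustrate the "*" and "?" wildcards in the pattern definition above, here is a minimal Python sketch using the standard `fnmatch` module. It is assumed, not confirmed, that the tool's matching behaves like this shell-style matching. The word list is made up.

```python
from fnmatch import fnmatchcase

words = ["cat", "cats", "cot", "concatenate", "dog"]

# "*" matches any string (including the empty string).
star_matches = [w for w in words if fnmatchcase(w, "cat*")]

# "?" matches exactly one character.
qmark_matches = [w for w in words if fnmatchcase(w, "c?t")]
```

Note that `fnmatchcase` matches the whole word, so "concatenate" does not match "cat*" even though it contains "cat".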

History

Pseudocode


 <<O>>  Difference Topic TAPoRwareHTMLCollocation (r1.5 - 06 Feb 2006 - LianYan)

META TOPICPARENT TAPoRware

Find Text — Collocation

See http://taporware.mcmaster.ca/~taporware/htmlTools/collocation.shtml
Line: 32 to 32

CGI Interface

Changed:
<
<
If you want to use this tool from you web site, here is the CGI Interface:
>
>
If you want to use this tool from your web site, here is the CGI Interface:

(Note: If you want to upload local html text to the tool, you need to use attribute name/value pair: enctype="multipart/form-data" within the form tag)

Line: 50 to 50

context 1/2/3 selection Words (1) context type corresponding the values in the parameter value field: Words/Lines/Sentences.
conLeng   text 5 context length corresponding to the selected context
sorting 1/2/3 selection 1 co-occurrence words sorting: by frequency/alphabetically/by zscore corresponding the order of this parameter values
Changed:
<
<
finddisp 1/2/3 selection 2 Display foemat which are XML text in HTML/HTML/XML tree in the order of parameter values
>
>
finddisp 1/2/3 selection 2 Display format which are XML text in HTML/HTML/XML tree in the order of parameter values

taporface   checkbox checked display result in a new window without graphics interface (default) or with taporware interface in the same window

Web Service Interface


 <<O>>  Difference Topic TAPoRwareHTMLCollocation (r1.4 - 21 Dec 2005 - LianYan)

META TOPICPARENT TAPoRware

Find Text — Collocation

See http://taporware.mcmaster.ca/~taporware/htmlTools/collocation.shtml
Line: 12 to 12

Pseudocode

Added:
>
>
  • Obtain HTML string by URL or from user's local disk
  • Obtain text contained by user specified tags
  • Find user specified word/pattern along with user specified context -- concordance
  • If user selects "sorting by z-score", perform span word counting and total words counting, and then calculate the values of z-score
  • Otherwise, sort and count the words of the concordance text
  • Generate output of the collocates of the concordance text
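
The counting steps above can be sketched in Python as follows. This is a simplified illustration, not the tool's actual implementation: it assumes plain text has already been extracted from the HTML, handles only the "words" context, sorts by frequency only, and leaves out the z-score branch. The names `tokenize`, `collocates`, and `sort_by_frequency` are hypothetical.

```python
import re
from collections import Counter
from fnmatch import fnmatchcase


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[A-Za-z0-9']+", text.lower())


def collocates(text, pattern, context_length=5):
    """Count words within context_length words of each pattern match."""
    tokens = tokenize(text)
    counts = Counter()
    for i, token in enumerate(tokens):
        if fnmatchcase(token, pattern):
            left = tokens[max(0, i - context_length):i]
            right = tokens[i + 1:i + 1 + context_length]
            counts.update(left + right)
    return counts


def sort_by_frequency(counts):
    """Most frequent first; ties broken alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


text = "The cat sat on the mat. The cat ate the fish."
result = sort_by_frequency(collocates(text, "cat", context_length=2))
```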

Ways of Using

Added:
>
>
  • Enter a valid URL in the URL field or enter a local upload html text
  • Enter a valid html tag or tag list seperated by comma, default is "body"
  • Enter word or pattern in the corresponding text field by check the related radio button
  • Select the context of concordance and the length of context
  • Select the collocates sorting critieia
  • Select output format
  • If you want the results displayed in the same window with taporware interface, uncheck the check box - "Open results in new window"
  • Finally, click the "Submit" button

CGI Interface

Added:
>
>
If you want to use this tool from you web site, here is the CGI Interface: (Note: If you want to upload local html text to the tool, you need to use attribute name/value pair: enctype="multipart/form-data" within the form tag)

Here are the parameters:

| Parameter Name | Parameter Value | Control Type | Default | Discription |
| source | url/local | radio button | url | Let user select input text (either a url or upload local html text) |
| htmlurl | | text | | A Valid URL that the pointed document should be an html text |
| localFile | | file | | The path to your local html text file |
| tagtext | | text | body | Valid html element (tag) name or multple html element name separated by comma |
| findwhat | word/patt | radio button | word | let user to select either word or pattern of the key of the concordance |
| find_word | | text | | key word of the concordance |
| find_patt | | text | | key pattern of the concordance |
| context | 1/2/3 | selection | Words (1) | context type corresponding the values in the parameter value field: Words/Lines/Sentences. |
| conLeng | | text | 5 | context length corresponding to the selected context |
| sorting | 1/2/3 | selection | 1 | co-occurrence words sorting: by frequency/alphabetically/by zscore corresponding the order of this parameter values |
| finddisp | 1/2/3 | selection | 2 | Display foemat which are XML text in HTML/HTML/XML tree in the order of parameter values |
| taporface | | checkbox | checked | display result in a new window without graphics interface (default) or with taporware interface in the same window |

Web Service Interface

Added:
>
>
Taporware provides web services to any non-benefit organizations. here is the taporware web services infomation:

  • Endpoint URL: http://strange.mcmaster.ca:9982
  • Service URI: http://strange.mcmaster.ca/~taporware/webservice
  • Service Method: find_Collocation_HTML
  • parameters:
    • htmlInput -- any html string
    • htmlTag -- any html element (tag) name or multple html element name separated by comma
    • pattern -- unix styled pattern or regular expression
    • context -- value can be 1/2/3 which coresponding to Words/Lines/Sentences respectively
    • contextLength -- length of context
    • sorting -- values can be 1/2/3 corresponding to co-occurrence words by frequency/alphabetically/z-score
    • outFormat -- values are same as parameter "finddisp" in the CGI interface above

Known Bugs

To Do


 <<O>>  Difference Topic TAPoRwareHTMLCollocation (r1.3 - 15 Oct 2005 - MattPatey)

META TOPICPARENT TAPoRware

Find Text — Collocation

Changed:
<
<
See http://taporware/~taporware/htmlTools/collocation.shtml
>
>
See http://taporware.mcmaster.ca/~taporware/htmlTools/collocation.shtml



 <<O>>  Difference Topic TAPoRwareHTMLCollocation (r1.2 - 15 Oct 2005 - MattPatey)

META TOPICPARENT TAPoRware
Added:
>
>

Find Text — Collocation

See http://taporware/~taporware/htmlTools/collocation.shtml

Description


This tool takes a word from the user and returns all words directly before and directly after it based on the given context (i.e. words, lines, sentences). Results can be sorted alphabetically, by frequency, or by Z-score (an indication of how far and in what direction that item deviates from its distribution's mean, expressed in units of its distribution's standard deviation).
Added:
>
>

History

Pseudocode

Ways of Using

CGI Interface

Web Service Interface

Known Bugs

To Do


-- MattPatey - 13 Oct 2005

 <<O>>  Difference Topic TAPoRwareHTMLCollocation (r1.1 - 13 Oct 2005 - MattPatey)
Line: 1 to 1
Added:
>
>
META TOPICPARENT TAPoRware
This tool takes a word from the user and returns all words directly before and directly after it based on the given context (i.e. words, lines, sentences). Results can be sorted alphabetically, by frequency, or by Z-score (an indication of how far and in what direction that item deviates from its distribution's mean, expressed in units of its distribution's standard deviation).

-- MattPatey - 13 Oct 2005



Revision r1.1 - 13 Oct 2005 - 20:55 - MattPatey
Revision r1.11 - 06 Jun 2008 - 17:22 - LianYan