Weighted Centroid

Description

This applet displays a circular graph based on word distribution data.

The text is divided into an arbitrary number of units, which are positioned around the circumference of the circle in clockwise sequence. The more times a word appears in a particular text unit, the closer the word will be to that unit in the circle. If a word appears an equal number of times in all units, it will be located in the centre of the circle.
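As a minimal sketch of the positioning rule described above (not the applet's actual code), a word's location can be computed as the count-weighted average of the unit positions. The function names, the choice to start at the top of the circle, and the screen-style coordinates (y grows downward) are assumptions made for illustration.

```python
import math


def unit_positions(n_units, radius=1.0):
    """Place n_units points clockwise on a circle, starting at the top.

    Assumes screen coordinates (y grows downward), so the first unit
    sits at (0, -radius) and the sequence proceeds to the right.
    """
    positions = []
    for i in range(n_units):
        angle = 2 * math.pi * i / n_units
        positions.append((radius * math.sin(angle), -radius * math.cos(angle)))
    return positions


def word_position(counts, radius=1.0):
    """Return the weighted centroid of a word.

    counts[i] is the number of times the word occurs in text unit i.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("word does not occur in any unit")
    points = unit_positions(len(counts), radius)
    # Each unit pulls the word towards itself in proportion to its count.
    x = sum(c * px for c, (px, _) in zip(counts, points)) / total
    y = sum(c * py for c, (_, py) in zip(counts, points)) / total
    return (x, y)
```

With equal counts such as `[3, 3, 3, 3]` the pulls cancel and the word lands at the centre (up to floating-point rounding), while `[5, 0, 0, 0]` places it directly on the first unit.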

Words are colour coded based on the number of times they appear in the text as a whole. Blue words have the highest word count. Rolling over a word displays lines representing its connections to the units. Clicking a word keeps its lines visible after you move the mouse off it. Click the word again to remove the lines. The darker the line, the more times the word was found in that unit.

Additionally, all the words found in the graph are listed on the left side of the applet. There is a scroll bar for viewing the words, should they extend past the bottom of the applet. This list of words features the same rollover and clicking functionality as the words in the graph itself.

This tool uses the Processing library.

* This tool requires the JRE (v1.4.2 and up) in order to work properly.

Pseudocode

  • get data from the Weighted Centroid program and process it to determine maximum values
  • create nodes for each word using the data
    • calculate node position
    • calculate lines connecting node to each text unit that it occurs in
  • initialize the Processing environment
  • draw words, word list, circle, and text units
  • add listener for mouse events
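The data-preparation steps of the pseudocode can be sketched roughly as follows. This is an illustration only, not the applet's source: the `Node` class, the `build_nodes` function, and the input format (a mapping from each word to its per-unit counts) are hypothetical, and scaling line darkness by the largest per-unit count is an assumption. Node positioning, drawing and mouse handling are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    word: str
    total: int  # occurrences in the text as a whole
    # One entry per unit the word occurs in: (unit index, darkness 0..1)
    lines: list = field(default_factory=list)


def build_nodes(distribution):
    """Build one node per word from {word: [count in unit 0, unit 1, ...]}."""
    if not distribution:
        return []
    # Determine the maximum value used to scale line darkness.
    max_unit_count = max(max(counts) for counts in distribution.values())
    nodes = []
    for word, counts in distribution.items():
        node = Node(word=word, total=sum(counts))
        # Connect the node to each text unit it occurs in;
        # a higher count gives a darker line.
        for unit, count in enumerate(counts):
            if count > 0:
                node.lines.append((unit, count / max_unit_count))
        nodes.append(node)
    # Highest word count first, e.g. for colour coding and the word list.
    nodes.sort(key=lambda n: n.total, reverse=True)
    return nodes
```

For example, `build_nodes({"whale": [4, 0, 2], "sea": [1, 1, 1]})` yields a "whale" node with two lines (to units 0 and 2) followed by a "sea" node with three.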

CGI Interface

If you want to use this tool from your web site, here is the CGI interface. (Note: You need to set the attribute enctype="multipart/form-data" in the form tag, because the tool was designed to allow local file uploads, even if you do not use this feature.)

Here are the parameters:

| Parameter Name | Parameter Value | Control Type | Default | Description |
|---|---|---|---|---|
| source | url/local | radio button | url | Lets the user select the input text (either a URL or a local file for upload) |
| texturl | | text | | A valid URL pointing to a text, html, or xml file |
| localFile | | file | | The path to a local text, html, or xml file |
| freetext | on/off | checkbox | on | Turn on to treat xml/html as plain text |
| disType | 1/2/3 | radio button | 2 | Defines the granularity of the graph: paragraphs / n percentage of text / chunks of n words |
| percent | 5/10/20/50 | selection | 10 | Specifies the percentage for use with option 2 of disType |
| chunk | | text | 100 | Specifies the number of words per chunk for use with option 3 of disType |
| topfre | 0/5/10/20/50 | selection | 10 | Specifies how many of the highest frequency words are included in the result |
| stoplist | on/off | checkbox | on | Specifies whether to exclude Glasgow Stop Words from the results (on = exclude) |
| user | on/off | checkbox | off | Specifies whether to include extra user-defined words in the results |
| userword | | text | | The user-defined words to include in the results |
| HowToList | 1/2 | selection | 2 | Determines how to display the results: HTML table / Java applet |
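To illustrate what the three disType options mean, here is a rough sketch of how a text might be split into units. It is not the tool's implementation: the `split_units` function is hypothetical, paragraphs are assumed to be separated by blank lines, and the percentage is assumed to be measured in words.

```python
import re


def split_units(text, dis_type=2, percent=10, chunk=100):
    """Split text into units.

    dis_type 1 = paragraphs, 2 = percent of the text, 3 = chunks of n words.
    """
    if dis_type == 1:
        # Assumption: paragraphs are separated by one or more blank lines.
        return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    words = text.split()
    if dis_type == 2:
        # Assumption: each unit holds `percent` percent of the words,
        # rounded up so that short texts still get at least one word per unit.
        size = max(1, -(-len(words) * percent // 100))
    elif dis_type == 3:
        size = chunk
    else:
        raise ValueError("dis_type must be 1, 2 or 3")
    if size < 1:
        raise ValueError("unit size must be at least 1 word")
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
```

With the defaults (disType 2, percent 10), a 200-word text is cut into 10 units of 20 words each; the last unit may be shorter when the text length does not divide evenly.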

Use Weighted Centroid TAPoRware Tool in Your Web Page

You can add a form with a button to your web page that generates a word distribution of the current page by calling the TAPoRware CGI script.

Here is the code for the button:

<form method="post" name="textForm" enctype="multipart/form-data" target="_blank" action="http://taporware.mcmaster.ca/~taporware/cgi-bin/prototype/tweighted.cgi" onsubmit="document.textForm.texturl.value=document.location.href">
  <input type="hidden" name="source" value="url" />
  <input type="hidden" name="texturl" />
  <input type="hidden" name="freetext" value="Y" />
  <input type="hidden" name="disType" value="2" />
  <input type="hidden" name="percent" value="10" />
  <input type="hidden" name="topfre" value="30" />
  <input type="hidden" name="stoplist" value="on" />
  <input type="hidden" name="HowToList" value="2" />
  <input type="submit" name="doIt" value="Weighted Centroid" />
</form>
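The same request can also be prepared from a script instead of a browser form. The sketch below builds, but does not send, a multipart/form-data POST that mirrors the fields of the form above. The `build_request` helper is hypothetical, and whether the server accepts requests built this way has not been verified.

```python
import urllib.request
import uuid

CGI_URL = "http://taporware.mcmaster.ca/~taporware/cgi-bin/prototype/tweighted.cgi"


def build_request(text_url, fields=None):
    """Build (without sending) a multipart/form-data POST for the CGI script."""
    # Same name/value pairs as the HTML form, including the submit button.
    params = {
        "source": "url",
        "texturl": text_url,
        "freetext": "Y",
        "disType": "2",
        "percent": "10",
        "topfre": "30",
        "stoplist": "on",
        "HowToList": "2",
        "doIt": "Weighted Centroid",
    }
    params.update(fields or {})
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in params.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")  # closing boundary
    body = "".join(parts).encode("utf-8")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(CGI_URL, data=body, headers=headers, method="POST")
```

A request built with `build_request("http://example.org/text.html", {"HowToList": "1"})` asks for the HTML table rather than the applet; passing it to `urllib.request.urlopen` would submit it.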

Web Service Interface

TAPoRware provides web services to any non-profit organization. Here is the TAPoRware web service information:

  • Endpoint URL: http://taporware.mcmaster.ca:9982
  • Service URI: http://taporware.mcmaster.ca/~taporware/webservice
  • Service Method: weighted_centroid
  • Parameters:
    • textSource -- any text string. If the text format is html or xml, all the tags will be stripped
    • suboption -- subtext unit selection; the values 1/2/3 correspond to paragraph / percent of characters / chunk of text in words
    • percent -- the percentage to use when "percent of characters" is chosen in the suboption parameter
    • chunk -- the number of words per chunk to use when "chunk of text in words" is chosen in the suboption parameter
    • topWords -- number of top frequency words (may or may not exclude stop words, see below) to be investigated
    • glasgow -- a boolean value (Y) to exclude the Glasgow stop words from the top frequency word list
    • userwords -- a text field in which the user enters his or her own stop words (separated by commas). This list will be combined with the Glasgow stop list if you select it
    • outFormat -- the values 1/2 correspond to HTML and Java applet respectively
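How topWords, glasgow and userwords interact can be pictured with a small sketch. This is not the service's code: the `top_words` function is hypothetical, the tokenisation is a simplification, and `PLACEHOLDER_STOP_WORDS` is a tiny stand-in for the real Glasgow stop word list, which is not reproduced here.

```python
import re
from collections import Counter

# Tiny placeholder; the real Glasgow stop word list is much longer.
PLACEHOLDER_STOP_WORDS = {"the", "a", "of", "and", "in", "to"}


def top_words(text, top_n=10, use_stop_list=True, user_words=""):
    """Return the top_n most frequent words as (word, count) pairs.

    user_words is a comma-separated string of extra stop words; it is
    combined with the stop list when use_stop_list is True.
    """
    stop = set(PLACEHOLDER_STOP_WORDS) if use_stop_list else set()
    stop.update(w.strip().lower() for w in user_words.split(",") if w.strip())
    # Simplified tokenisation: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stop)
    return counts.most_common(top_n)
```

Turning the stop list off lets very common function words dominate the result, which is usually why it is left on by default.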

Responsibility

This tool was programmed by Andrew MacDonald as part of the TAPoR project.

-- AndrewMacdonald - 10 Apr 2006

