Distribution Graph

Description

This tool lets users see a graphical representation of how words, patterns, or HTML tags are distributed over the course of an HTML document. Results are displayed as the percentage of patterns/words found in each chunk, where a chunk is (e.g.) an HTML tag, a percentage of the document, or a block of n words. A number of options also allow the user to view and/or interact with the graph in real time.
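To make the "percentage of document" chunking concrete, here is a minimal sketch of how such a distribution could be computed. It is not the tool's actual implementation: the function name `distribution_by_percent` is hypothetical, and it assumes that each value is the share of all matches that fall inside a given segment, which is one plausible reading of the description above.

```python
import re

def distribution_by_percent(text, pattern, percent=10):
    """Return, per segment, the percentage of all matches found in that segment.

    The text is split into whitespace-separated words and divided into
    100 // percent equal segments (an assumption about how the tool chunks).
    """
    words = text.split()
    n_segments = 100 // percent
    regex = re.compile(pattern, re.IGNORECASE)
    counts = [0] * n_segments
    for i, word in enumerate(words):
        if regex.search(word):
            # Map the word position onto a segment index.
            seg = min(i * n_segments // len(words), n_segments - 1)
            counts[seg] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_segments
    return [100.0 * c / total for c in counts]
```

Calling `distribution_by_percent(page_text, "love", percent=10)` would return ten values, one per tenth of the document, which could then be plotted as a bar or line graph.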

Issues

This tool requires the JRE (v1.4.2 or later) in order to work properly.* This is likely to cause problems for OS X users with browsers other than Safari: by default, Firefox and other OS X web browsers use an older version of the JRE that is incompatible with this tool.* There is a fix that allows Firefox to use JRE 1.4.2+, which can be found here. Unfortunately, this fix does not seem to work for the OS X version of Internet Explorer.

* This is only relevant if the desired output is the Java applet.

CGI Interface

If you want to use this tool from your own web site, you can call it through its CGI interface. (Note: the form tag must include the attribute enctype="multipart/form-data", because the tool was designed to allow local file uploads. This is required even if you do not use the upload feature.)

Here are the parameters:

| Parameter Name | Parameter Value | Control Type | Default | Description |
|---|---|---|---|---|
| source | url/local | radio button | url | Lets the user select the input text (either a URL or an uploaded local HTML file) |
| htmlurl | | text | | A valid URL that points to an HTML document |
| localFile | | file | | The path to your local HTML text file |
| disType | 4/1/5 | radio | 4 | The subtext the distribution is over, corresponding to percentage/element/chunk of words respectively (note: the following 3 controls are paired with these radio buttons, in order) |
| percent | 2/5/10/25/50 | select | 10 | The percentage of text the distribution is over |
| elemonly | | text | body | The name of the HTML element the distribution is over |
| chunk | | text | 100 | The number of words per unit of subtext the distribution is over |
| relative | | checkbox | unchecked | Indicates whether the relative distribution is displayed |
| find_patt | | text | | The word or pattern whose distribution is shown |
| HowToList | 1/2/3/4 | select | 2 | The display format, corresponding to SVG/HTML/tab-delimited text/Java applet respectively |
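For callers who want to reach the CGI script from a program rather than a browser form, the following is a rough sketch of how the parameters above could be packed into a multipart/form-data request body using only the Python standard library. The helper names `build_multipart` and `make_request` are hypothetical, the target page URL is a placeholder, and the script URL is the one used in the form example on this page; whether the server accepts non-browser clients is not stated here.

```python
import urllib.request
import uuid

# Script URL as used in the HTML form example on this page.
CGI_URL = "http://taporware.mcmaster.ca/~taporware/cgi-bin/prototype/hdistrib.cgi"

def build_multipart(fields, boundary=None):
    """Encode simple text fields as a multipart/form-data body."""
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(str(value))
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    content_type = f"multipart/form-data; boundary={boundary}"
    return body, content_type

def make_request(fields):
    """Build (but do not send) a POST request for the CGI script."""
    body, content_type = build_multipart(fields)
    return urllib.request.Request(
        CGI_URL, data=body, headers={"Content-Type": content_type}, method="POST"
    )

fields = {
    "source": "url",
    "htmlurl": "http://example.org/page.html",  # placeholder page to analyse
    "disType": "4",       # distribution over a percentage of the text
    "percent": "10",
    "find_patt": "love",  # word or pattern to look for
    "HowToList": "3",     # tab-delimited text, per the table above
}
request = make_request(fields)
```

Passing `request` to `urllib.request.urlopen` would submit it. Tab-delimited output is chosen here only because it is the easiest of the listed formats to parse in a script.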

Use the TAPoRware Distribution Tool in Your Web Page

You can add a button and a text field to your web page that graph the distribution of a word or pattern in that page by calling the TAPoRware CGI script.

Here is the code for the interface:

<form method="post" name="htmlForm" enctype="multipart/form-data" target="_blank" action="http://taporware.mcmaster.ca/~taporware/cgi-bin/prototype/hdistrib.cgi" onsubmit="document.htmlForm.htmlurl.value=document.location.href">

<input type="hidden" name="source" value="url" />

<input type="hidden" name="htmlurl" />

<input type="hidden" name="disType" value="4" />

<input type="hidden" name="percent" value="10" />

<input type="hidden" name="relative" value="1" />

Word/Pattern: <input type="text" name="find_patt" />  

<input type="hidden" name="HowToList" value="4" />

<input type="submit" name="doIt" value="Submit" />

</form>

Web Service Interface

TAPoRware provides web services to any non-profit organization. Here is the TAPoRware web services information:

  • Endpoint URL: http://taporware.mcmaster.ca:9982
  • Service URI: http://taporware.mcmaster.ca/~taporware/webservice
  • Service Method: pattern_Distribution_HTML
  • parameters:
    • textSource -- any HTML text
    • option -- subtext that the distribution is over, see "disType" of CGI interface for the values
    • percent -- percentage of text, can be 2/5/10/25
    • element -- valid HTML element
    • chunk -- number of words per unit of subtext
    • relative -- Y/N, indicate if relative distribution is displayed
    • outForm -- output format; values 1/2/3/4 correspond to Java applet/HTML/tab-delimited text/SVG respectively
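The endpoint/URI/method layout above resembles a SOAP-style RPC service, although this page does not name the protocol. Under that assumption only, the sketch below shows how a request envelope for pattern_Distribution_HTML might be assembled. The helper `build_envelope` is hypothetical, the use of SOAP 1.1 and of un-namespaced parameter elements are guesses, and only the parameters listed above are included.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (assumed protocol, not confirmed by this page).
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Service URI as listed above, used here as the method namespace.
SERVICE_URI = "http://taporware.mcmaster.ca/~taporware/webservice"

def build_envelope(method, params):
    """Return a SOAP-style envelope string calling `method` with `params`."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_URI}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")

params = {
    "textSource": "<html><body><p>some text</p></body></html>",
    "option": "4",      # percentage of text, following disType in the CGI interface
    "percent": "10",
    "element": "body",
    "chunk": "100",
    "relative": "N",
    "outForm": "2",     # HTML, per the list above
}
envelope_xml = build_envelope("pattern_Distribution_HTML", params)
```

The resulting string would be posted to the endpoint URL. In practice a SOAP client library would normally generate this envelope, and the service's real message format should be checked before relying on this layout.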

To Do

  • Add title
  • Check Help - we need help
  • Get rid of "Save Data"
  • Incorporate pan/zoom feature for viewing large data sets (optional)
  • Incorporate summary of findings (i.e. document statistics, individual result information etc.)
  • Incorporate ability to change graph style on the fly (e.g. colour scheme, grid/value display etc.)
  • Include different types of graphs (i.e. point display, charts etc.)

-- MattPatey - 19 Aug 2005
