XML element and attribute pattern distribution
See
http://taporware.mcmaster.ca/~taporware/xmlTools/patterndistrib.shtml
Description
This tool generates several kinds of distribution graphs showing how a word or pattern is distributed over specific XML elements, attributes, or chunks of text.
Pseudocode
- Obtain the XML string from a URL or from the user's local disk. If the text is not XML, return an error message
- Obtain the text based on the user's selection in the "Subtext limited to" panel
- Chop the text into chunks, also based on the user's selection
- Find and record the user-specified word/pattern in each chunk. If "show relative distribution" is selected, count and record the total number of words in each chunk too
- Based on user selection, generate the corresponding graphics
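
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual implementation: the function names `chunk_by_words` and `pattern_distribution` are hypothetical, only the "number of words in each chunk" mode is covered, and the pattern is matched against whole whitespace-separated words without regard to case (so a word with attached punctuation will not match).

```python
import re
import xml.etree.ElementTree as ET


def chunk_by_words(text, size):
    """Split text into consecutive chunks of `size` words (the last may be shorter)."""
    words = text.split()
    return [words[i:i + size] for i in range(0, len(words), size)]


def pattern_distribution(xml_string, pattern, chunk_size=100, relative=False):
    """Count whole-word matches of `pattern` in each word chunk of an XML text."""
    try:
        root = ET.fromstring(xml_string)
    except ET.ParseError as exc:
        # Mirrors the first step: reject input that is not XML.
        raise ValueError(f"input is not XML: {exc}") from exc

    # Collect all text content, ignoring the markup itself.
    text = " ".join(root.itertext())
    regex = re.compile(pattern, re.IGNORECASE)

    rows = []
    for index, words in enumerate(chunk_by_words(text, chunk_size), start=1):
        hits = sum(1 for word in words if regex.fullmatch(word))
        row = {"chunk": index, "count": hits}
        if relative:
            # Relative distribution needs the total word count of the chunk.
            row["total_words"] = len(words)
            row["relative"] = hits / len(words)
        rows.append(row)
    return rows


sample = "<doc><p>the cat sat on the mat</p><p>the dog ate the bone</p></doc>"
for row in pattern_distribution(sample, r"the", chunk_size=4, relative=True):
    print(row)
```

With the sample text and a chunk size of 4 words, the three chunks contain 1, 2, and 1 occurrences of "the". Turning these rows into graphics (the last step) is left out of the sketch.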
Ways of Using
- Specify the source XML in the source panel, either as a valid URL or as a file from your local disk
- Select the subtext you want the distribution to be computed over
- Select or enter the corresponding content according to your subtext
- Check or uncheck "show relative distribution" box
- Enter word/pattern in "What to find" panel
- Select the format of distribution
CGI Interface
If you want to use this tool from your own web site, use the CGI interface described here. (Note: to upload local XML text to the tool, the form tag must include the attribute name/value pair enctype="multipart/form-data".)
Here are the parameters:
| Parameter Name | Parameter Value | Control Type | Default | Description |
| source | url/local | radio button | url | Lets the user select the input text (either a URL or an uploaded local XML text) |
| xmlurl | | text | | A valid URL pointing to an XML text |
| localFile | | file | | The path to your local XML text file |
| disType | 1/2/4/5 | radio button | 4 | Indicates how the subtext is chunked. In order of the values in the "Parameter Value" field, the options are element, attribute, percentage of text, and number of words per chunk |
| elemonly | | text | | If you want the distribution over an element, enter a valid XML element name here |
| attribute | | text | | If you want the distribution over an attribute of the element specified in the following row, enter a valid attribute name here |
| element | | text | | The element that contains the attribute |
| percent | 1/5/10/20/25 | selection | 10 | If distribution over percentage of text is selected, select a percentage number here |
| chunk | | text | 100 | If distribution over chunks of text is selected, enter the number of words per chunk in this field |
| relative | | checkbox | unchecked | Check it to display both absolute and relative word distribution in the chunks |
| pattern | | text | | This is the word/pattern you want to see in the distribution |
| HowToList | 1/2/3/4 | radio button | 2 | The format of the distribution. In order of value, the options are SVG/HTML/tab-delimited text/Java applet. Note: you need an SVG viewer to view SVG and a Java runtime to view the applet. |
| taporface | | checkbox | checked | Indicates whether to open the result in a new window (without the Taporware interface) |
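
As a rough illustration of how the parameters in the table fit together, the following Python sketch assembles a query string for the URL-source case. The form's action address is not given on this page, so `CGI_URL` is a placeholder, and whether the script accepts GET requests is also an assumption. The helper name `build_query` is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder only: the real address of the CGI script is not stated on this page.
CGI_URL = "http://example.org/cgi-bin/patterndistrib.cgi"


def build_query(xml_url, pattern, dis_type="4", how_to_list="2", **extra):
    """Assemble a request URL from the parameter names listed in the table."""
    params = {
        "source": "url",           # read the XML from a URL rather than an upload
        "xmlurl": xml_url,
        "disType": dis_type,       # 4 = distribution over percentage of text
        "pattern": pattern,
        "HowToList": how_to_list,  # 2 = HTML output
    }
    # Mode-specific fields such as percent, chunk, elemonly, or relative.
    params.update(extra)
    return CGI_URL + "?" + urlencode(params)


print(build_query("http://example.org/sample.xml", "love", percent="10"))
```

Uploading a local file would instead require a POST request encoded as multipart/form-data, as noted above, which this sketch does not cover.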
Web Service Interface
Taporware provides web services to any non-profit organization. Here is the Taporware web services information:
(Note: the form layout is customized)
- Endpoint URL: http://taporware.mcmaster.ca:9982
- Service URI: http://taporware.mcmaster.ca/~taporware/webservice
- Service Method: pattern_Distribution_XML
- Parameters:
  - xmlSource -- any well-formed XML string
  - option -- how the text is divided for the distribution:
    - 1 -- over element
    - 2 -- over attribute in a specified element
    - 4 -- over percentage of text (counted by characters)
    - 5 -- over number of words in each chunk
  - element -- for distribution over element only (place it on the same line as the corresponding selection)
  - attriName -- for distribution over attribute only
  - attriValue -- for distribution over attribute only; this is the name of the element containing the attribute above. Align these two controls with the corresponding selection
  - percent -- for distribution over percentage of the text. Please use a selection control and place it in line with the percentage selection of the subtext
  - chunk -- for distribution over chunks of words. It should be in line with the chunk of text selection
  - relative -- check to display the relative distribution as well. The checked value is "Y"
  - pattern -- the word or pattern whose distribution is to be generated
  - outFormat -- display format. The values are:
    - 1 -- Java applet
    - 2 -- HTML
    - 3 -- tab-delimited text
    - 4 -- SVG
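
The wire protocol is not stated on this page. Assuming the service speaks SOAP 1.1, with the Service URI acting as the method namespace, a request body for the method above could be assembled roughly as follows. The helper `build_envelope`, the element-per-parameter layout, and the empty strings for unused parameters are all assumptions made for illustration. Nothing is sent over the network.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 (assumed protocol)
SERVICE_URI = "http://taporware.mcmaster.ca/~taporware/webservice"


def build_envelope(method, params):
    """Build a SOAP 1.1 envelope that calls `method` with ordered (name, value) pairs."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_URI}}}{method}")
    for name, value in params:
        # Values are stored as text, so an XML source string is escaped automatically.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Parameters in the order listed above; fields unused by option 5 are left empty.
request = build_envelope("pattern_Distribution_XML", [
    ("xmlSource", "<doc><p>the cat sat on the mat</p></doc>"),
    ("option", 5),        # distribution over number of words in each chunk
    ("element", ""),
    ("attriName", ""),
    ("attriValue", ""),
    ("percent", ""),
    ("chunk", 100),
    ("relative", "Y"),
    ("pattern", "the"),
    ("outFormat", 2),     # HTML
])
print(request)
```

Under the same assumption, the resulting string would be posted to the Endpoint URL given above.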
Known Bugs
To Do
--
MattPatey - 13 Oct 2005