CAPs Finder

See http://taporware.mcmaster.ca/~taporware/betaTools/capsfinder.shtml

Description

CAPs is an abbreviation for capital letters.

This tool tries its best to recognize and find all the groups of capitalized words in the input text. However, it is very difficult to correctly recognize all the meaningful CAPs, especially when capitalized words are interleaved with non-capitalized words. If you have a better algorithm and want to share it with us, we will implement it.

Pseudocode

  • Obtain the source text from a URL or from the user's local disk. If the text format is XML or HTML, strip off all the tags
  • Chop the text into sentences
  • Remove the first word of each sentence if it matches certain criteria, for example if it is "By ", "Again ", "Now ", etc.
  • Chop each subtext obtained above into sub-subtexts at words and symbols such as "are", "is", "am", "have", "has", "weather", "|", "(", ...
  • Apply pattern matching and other methods to identify the CAPs
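
The steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the names `strip_tags`, `find_caps`, `LEADING_WORDS` and `SPLIT_WORDS` are hypothetical, the word lists contain only the examples quoted on this page, and the sentence splitter and the final regular expression are simple stand-ins for the unspecified "pattern matching and other methods".

```python
import re
from html.parser import HTMLParser

# Hypothetical word lists: this page gives only a few examples of each.
LEADING_WORDS = ("By", "Again", "Now")
SPLIT_WORDS = ("are", "is", "am", "have", "has", "weather")


class _TagStripper(HTMLParser):
    """Collects only the text content of an HTML or XML document."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(markup):
    """Return the markup with all tags removed."""
    stripper = _TagStripper()
    stripper.feed(markup)
    stripper.close()
    return "".join(stripper.parts)


def find_caps(text):
    """Return groups of consecutive capitalized words, in order of appearance."""
    # Naive sentence split: whitespace that follows '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    # Matches a listed leading word at the very start of a sentence.
    lead = re.compile(r"^(?:%s)\s+" % "|".join(LEADING_WORDS))
    # Matches the separator words (as whole words) and the symbols | and (.
    splitter = re.compile(r"\b(?:%s)\b|[|(]" % "|".join(SPLIT_WORDS))
    # One capitalized word, optionally followed by more capitalized words.
    cap_group = re.compile(r"[A-Z][\w'-]*(?:\s+[A-Z][\w'-]*)*")

    found = []
    for sentence in sentences:
        sentence = lead.sub("", sentence.strip())
        for chunk in splitter.split(sentence):
            found.extend(cap_group.findall(chunk))
    return found
```

With this sketch, `find_caps("Now the Text Analysis Portal is online.")` yields `["Text Analysis Portal"]`. A name such as "University of Alberta" would be split into two groups, which illustrates the interleaving problem mentioned in the description.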

How to Use

  • Enter a valid URL in the URL field or enter a local path to upload the source text
  • Click the "Submit" button

CGI Interface

If you want to use this tool from your own web site, use the CGI interface described here. (Note: you must set the attribute enctype="multipart/form-data" in the form tag, because the tool was designed to allow local file uploads, even if you do not use that feature.)

Here are the parameters:

| Parameter Name | Parameter Value | Control Type | Default | Description |
| source | url/local | radio button | url | Lets the user select the input text (either a URL or an uploaded local HTML text) |
| texturl | | text | | A valid URL pointing to a plain text, HTML or XML document |
| localFile | | file | | The path to your local text file |
| freetext | | hidden | Y | Set the value to "Y" to allow the tool to process HTML or XML by stripping off all the tags |
| listdisp | 2 | select | 2 | Display output in HTML format (the only format currently implemented) |
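
As a rough illustration of how these parameters fit together, the sketch below builds a multipart/form-data POST request from a script instead of a browser form. It rests on several assumptions: the endpoint URL is the one used in the form example on this page and may no longer be reachable, the helper names `encode_multipart` and `build_request` are hypothetical, and the code only constructs the request without sending it.

```python
import urllib.request
import uuid

# Endpoint copied from the form example on this page; it may no longer be live.
CGI_URL = "http://taporware.mcmaster.ca/~taporware/cgi-bin/prototype/tcapsfinder.cgi"


def encode_multipart(fields, boundary=None):
    """Encode simple text fields as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append("--" + boundary)
        lines.append('Content-Disposition: form-data; name="%s"' % name)
        lines.append("")  # blank line separates part headers from the value
        lines.append(value)
    lines.append("--" + boundary + "--")  # closing boundary
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, "multipart/form-data; boundary=" + boundary


def build_request(text_url):
    """Build (but do not send) a POST request for the given source URL."""
    fields = {
        "source": "url",      # read the text from a URL, not an upload
        "texturl": text_url,  # the document to analyse
        "freetext": "Y",      # strip HTML/XML tags first
        "listdisp": "2",      # HTML output, the only implemented format
    }
    body, content_type = encode_multipart(fields)
    return urllib.request.Request(
        CGI_URL,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Passing the result of `build_request("http://example.org/page.html")` to `urllib.request.urlopen` would submit it, assuming the service is still available at that address.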

Use CAPs Finder TAPoRware Tool in Your Web Page

You can add a button to your web page that finds the CAPs in that page by calling the TAPoRware CGI script.

Here is the code for the button interface:

<form method="post" name="textForm" enctype="multipart/form-data" target="_blank" action="http://taporware.mcmaster.ca/~taporware/cgi-bin/prototype/tcapsfinder.cgi" onsubmit="document.textForm.texturl.value=document.location.href">
  <input type="hidden" name="source" value="url" />
  <input type="hidden" name="texturl" />
  <input type="hidden" name="freetext" value="Y"/>
  <input type="hidden" name="listdisp" value="2" />
  <input type="submit" value="Find CAPs" />
</form>

Web Service Interface

TAPoRware provides web services to any non-profit organization. Here is the TAPoRware web services information:

-- LianYan - 01 Jun 2007

