Main.TAPoRwareOther (r1.1 vs. r1.4)
Diffs

 <<O>>  Difference Topic TAPoRwareOther (r1.4 - 03 Apr 2007 - LianYan)

META TOPICPARENT TAPoRware

Aggregate text from different sources

See http://taporware.mcmaster.ca/~taporware/otherTools/aggregator.shtml
Line: 27 to 27

CGI Interface

Added:
>
>
To use this tool from your own web site, submit a form to the CGI interface described here. Note: to upload local HTML text to the tool, the form tag must include the attribute enctype="multipart/form-data".

Here are the parameters:

| Parameter Name | Parameter Value | Control Type | Default | Description |
| source | url/local | radio button | url | Lets the user select the input text (either a URL or an uploaded local HTML text) |
| urls | | text area | | List of URLs whose texts are to be aggregated -- one URL per line |
| urlfile | | file | | The path to your local URL list file |
| localfile | | file | | Path to the local text file you want aggregated with the retrieved text above |
| howtoaggre | striptag/formxml | radio button | striptag | Indicates how to treat the tags in the retrieved text. If "Strip tag" is selected, all tags are stripped. Otherwise, an XML corpus is generated with non-XML text commented out |
| strip | 1/2/3/4 | selection | 1 | Output format, corresponding to HTML/XML text in HTML/XML tree/Plain text |
| taporface | | checkbox | checked | Display the result in a new window without the graphical interface (default), or with the TAPoRware interface in the same window |
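
As a rough illustration of the parameter table above, the following Python sketch assembles the form fields for a request and checks them against the documented values. The helper name build_aggregator_fields is hypothetical, and the sketch sends nothing, because the URL of the CGI script itself is not given on this page.

```python
# Hypothetical helper: builds the form fields listed in the parameter table.
# It only validates and returns a dict; it performs no network request.

ALLOWED = {
    "source": {"url", "local"},
    "howtoaggre": {"striptag", "formxml"},
    "strip": {"1", "2", "3", "4"},
}

# Defaults taken from the "Default" column of the table.
DEFAULTS = {"source": "url", "howtoaggre": "striptag", "strip": "1"}


def build_aggregator_fields(urls, **overrides):
    """Return a dict of form fields, starting from the documented defaults."""
    fields = dict(DEFAULTS)
    fields.update({name: str(value) for name, value in overrides.items()})
    for name, allowed in ALLOWED.items():
        if fields[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    # The "urls" text area expects one URL per line.
    fields["urls"] = "\n".join(u.strip() for u in urls if u.strip())
    return fields


fields = build_aggregator_fields(
    ["http://example.org/a.html", "http://example.org/b.html"], strip=4
)
```

A form that also uploads urlfile or localfile would have to be submitted as multipart/form-data; the dict built here covers only the plain fields.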

Web Service Interface

Added:
>
>
TAPoRware provides web services to any non-profit organization. Here is the TAPoRware web services information:

  • Endpoint URL: http://taporware.mcmaster.ca:9982
  • Service URI: http://taporware.mcmaster.ca/~taporware/webservice
  • Service Method: aggregator_HTML
  • parameters:
    • source1 -- any source text
    • source2 -- any source text to be aggregated with source 1
    • urls -- valid URLs delimited by "\n"; the texts they point to are aggregated with the two sources above
    • element -- the element name applied to all aggregated texts. Only text contained in this element is extracted and aggregated. Default is body
    • outFormat -- output format: 1 -- plain text, 2 -- html text
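
The endpoint/URI/method triple above reads like a SOAP RPC service, though this page does not say so explicitly. Under that assumption, a request body might be assembled roughly as follows. The function name build_envelope, the parameter order, and the use of a SOAP 1.1 envelope are all assumptions; the sketch only builds the XML and does not contact the endpoint.

```python
import xml.etree.ElementTree as ET

# Assumption: the service speaks SOAP 1.1 (not stated on this page).
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Service URI and method name as listed above.
SERVICE_URI = "http://taporware.mcmaster.ca/~taporware/webservice"
METHOD = "aggregator_HTML"


def build_envelope(source1, source2, urls, element="body", out_format=1):
    """Return a SOAP envelope (as a string) for one aggregator_HTML call."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_URI}}}{METHOD}")
    params = [
        ("source1", source1),
        ("source2", source2),
        ("urls", "\n".join(urls)),  # URLs are delimited by "\n"
        ("element", element),       # documented default: body
        ("outFormat", out_format),  # 1 = plain text, 2 = html text
    ]
    for name, value in params:
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


envelope_xml = build_envelope(
    "first text", "second text", ["http://example.org/a.html"]
)
```

The resulting string would be POSTed to the endpoint URL by whatever SOAP or HTTP client is in use; the exact headers the service expects are not documented here.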

Known Bugs

To Do


 <<O>>  Difference Topic TAPoRwareOther (r1.3 - 03 Apr 2007 - LianYan)

META TOPICPARENT TAPoRware

Aggregate text from different sources

See http://taporware.mcmaster.ca/~taporware/otherTools/aggregator.shtml
Line: 6 to 6

Description

Changed:
<
<
This tool aggregates texts/subtexts from different locations into a single text. The original texts can come from different locations, such as the internet and/or your local machine. Aggregating subtexts requires all documents to share a common subtext tag; for example, limiting the subtext to body requires all texts to have a <body> tag. The aggregator tool takes the contents from the texts to form a single new text.
>
>
This tool aggregates texts/subtexts into a single text. The original texts can come from different locations, such as the internet and/or your local machine. Aggregating subtexts requires all documents to share a common subtext tag; for example, limiting the subtext to body requires all texts to have a <body> tag. The aggregator tool takes the contents from the texts to form a single new text.

Note: If you use this tool through the TAPoR portal, you have to put something in the two source text boxes. You can choose "Text" and just enter a word or two in each box.

Deleted:
<
<

History


Pseudocode

Added:
>
>
  • Get the URLs from user input
  • Get the local text if the user specifies a local file to be aggregated with the text from the URLs
  • Retrieve the text that each URL points to
  • Strip tags based on user selection
  • Generate output based on user selected format
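
The steps above can be sketched in Python as follows. This is a minimal illustration rather than the tool's actual code: the names strip_tags and aggregate are hypothetical, retrieval is delegated to a caller-supplied fetch function so the sketch runs without network access, and only the "strip all tags, plain text out" path is shown.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and discards all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(markup):
    """Return the text content of markup with every tag removed."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


def aggregate(urls, fetch, local_text=None, strip=True):
    """Fetch each URL, append the optional local text, and join the results."""
    texts = [fetch(url) for url in urls]   # get all the text by the URLs
    if local_text is not None:             # optional local file contents
        texts.append(local_text)
    if strip:                              # strip tags based on user selection
        texts = [strip_tags(t) for t in texts]
    return "\n".join(texts)                # plain-text output


# Stand-in for real retrieval: a dict lookup instead of an HTTP request.
pages = {"a": "<p>Hello <b>world</b></p>", "b": "<div>Second</div>"}
result = aggregate(["a", "b"], pages.get, local_text="<i>local</i>")
```

In real use, fetch could wrap urllib.request.urlopen. Note that this simple extractor also keeps the contents of script and style elements, which a production version would probably want to drop.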

Ways of Using

Added:
>
>
  • Enter the URLs whose texts you want aggregated into the "URLs" text area (one URL per line), if that option is selected
  • Or, if your URLs are in a file, enter (or browse to) the local file path in the "Local file" field. The file should also contain one URL per line
  • Browse to or enter the local text path in the "Plus ..." file field if you want this text to be aggregated with the texts indicated above
  • Select how to treat the tags in the retrieved texts, if there are any
  • Select the aggregated text format
  • Click the submit button
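
Both the "URLs" text area and the URL list file use the same one-URL-per-line format. A small Python sketch of reading that format is shown below; the function name parse_url_list is hypothetical, and treating only http/https URLs with a host as "valid" is an assumption, since this page does not define validity.

```python
from urllib.parse import urlparse


def parse_url_list(text):
    """Return the URLs found in text, one per line, skipping blank lines."""
    urls = []
    for line in text.splitlines():
        candidate = line.strip()
        if not candidate:
            continue
        parts = urlparse(candidate)
        # Assumption: a valid URL has an http(s) scheme and a host.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not a valid URL: {candidate!r}")
        urls.append(candidate)
    return urls


url_list = parse_url_list(
    "http://example.org/a.html\n\n  http://example.org/b.html  \n"
)
```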

CGI Interface


 <<O>>  Difference Topic TAPoRwareOther (r1.2 - 22 Nov 2006 - GeoffreyRockwell)

META TOPICPARENT TAPoRware

Aggregate text from different sources

See http://taporware.mcmaster.ca/~taporware/otherTools/aggregator.shtml
Line: 8 to 8

Description

This tool aggregates texts/subtexts from different locations into a single text. The original texts can come from different locations, such as the internet and/or your local machine. Aggregating subtexts requires all documents to share a common subtext tag; for example, limiting the subtext to body requires all texts to have a <body> tag. The aggregator tool takes the contents from the texts to form a single new text.
Added:
>
>
Note: If you use this tool through the TAPoR portal, you have to put something in the two source text boxes. You can choose "Text" and just enter a word or two in each box.

History

Pseudocode


 <<O>>  Difference Topic TAPoRwareOther (r1.1 - 15 Oct 2005 - MattPatey)
Line: 1 to 1
Added:
>
>
META TOPICPARENT TAPoRware

Aggregate text from different sources

See http://taporware.mcmaster.ca/~taporware/otherTools/aggregator.shtml

Description

This tool aggregates texts/subtexts from different locations into a single text. The original texts can come from different locations, such as the internet and/or your local machine. Aggregating subtexts requires all documents to share a common subtext tag; for example, limiting the subtext to body requires all texts to have a <body> tag. The aggregator tool takes the contents from the texts to form a single new text.

History

Pseudocode

Ways of Using

CGI Interface

Web Service Interface

Known Bugs

To Do

-- MattPatey - 15 Oct 2005



Revision r1.1 - 15 Oct 2005 - 19:42 - MattPatey
Revision r1.4 - 03 Apr 2007 - 18:08 - LianYan