Main.TAPoRwareWordPairs (r1.1 vs. r1.3)
Diffs

 <<O>>  Difference Topic TAPoRwareWordPairs (r1.3 - 07 Jun 2007 - LianYan)

META TOPICPARENT TAPoRware

Word Pair Finder

Line: 8 to 8

<!-- This tool is being developed with Andy Hrymak from Chemical Engineering. It will find high-frequency pairs of words. It is needed to help build a subject keyword list. -->
Changed:
<
<
This tool lists word pairs of a text.
>
>
This tool finds word pairs in a source text according to the user's choice. It can list all word pairs, word pairs matching a pattern, word pairs that do not span sentences, and so on.

Pseudocode

<!--
  1. Find the top X high frequency words. Exclude stop words.
Line: 16 to 16

    1. To do this, it would find the collocates within Y places to the left or right
  1. It would generate a list of the pairings, sorted first by the key high-frequency word and then by collocation frequency
-->
Added:
>
>
  • Get the source text from the internet or from the user's local disk. If the source text is HTML or XML, strip all the tags.
  • Tokenize the text according to the user's choice. For example, if the user wants all word pairs, the token is the word. If the user wants word pairs that do not span sentences, the text is tokenized into sentences first, then into words.
  • Get all the word pairs, then filter out the pairs the user does not need.
  • Sort and count the word pairs as the user requires.
  • Generate the output in the requested format.
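
The steps above can be sketched in Python as follows. This is only a minimal illustration, not the TAPoRware implementation: the function names (`strip_tags`, `word_pairs`, `count_pairs`), the regular-expression tokenizer, and the naive sentence split on `.`, `!` and `?` are all assumptions made for the example, and fetching the text and formatting the output are left out.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TagStripper(HTMLParser):
    """Collects only the text nodes of an HTML/XML-like document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(markup):
    """Hypothetical helper: return the text of `markup` without its tags."""
    parser = _TagStripper()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.chunks)


def word_pairs(text, within_sentences=False):
    """Return adjacent word pairs, optionally without spanning sentences."""
    # Assumed sentence rule: split on runs of '.', '!' or '?'.
    units = re.split(r"[.!?]+", text) if within_sentences else [text]
    pairs = []
    for unit in units:
        words = re.findall(r"[a-z']+", unit.lower())
        pairs.extend(zip(words, words[1:]))
    return pairs


def count_pairs(pairs, pattern=None, sort_by="frequency"):
    """Filter pairs by an optional regex, then count and sort them."""
    if pattern is not None:
        rx = re.compile(pattern)
        pairs = [p for p in pairs if rx.search(" ".join(p))]
    counts = Counter(pairs)
    if sort_by == "alphabetical":
        return sorted(counts.items())
    # Default: most frequent first, ties broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

For example, `count_pairs(word_pairs(strip_tags(page), within_sentences=True))` would list the pairs of a hypothetical `page` string that do not cross a sentence boundary, most frequent first.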

How to Use

  • Enter a valid URL in the URL field, or enter a local path to upload a text
  • Select the kind of word pairs you want, then select or enter the corresponding text or option if necessary
  • Select a sorting criterion
  • Select an output format
  • If you want the results displayed in the same window as the TAPoRware interface, uncheck the "Open results in new window" check box
  • Finally, click the "Submit" button

To Do

  • Talk to Lian to see if it can be done.

 <<O>>  Difference Topic TAPoRwareWordPairs (r1.2 - 29 Mar 2007 - LianYan)

META TOPICPARENT TAPoRware

Word Pair Finder

Description

Added:
>
>
<!--

This tool is being developed with Andy Hrymak from Chemical Engineering. It will find high-frequency pairs of words. It is needed to help build a subject keyword list.
Changed:
<
<

History

Still in concept phase.
>
>
--> This tool lists word pairs of a text.

Pseudocode

Changed:
<
<
>
>
<!--

  1. Find the top X high frequency words. Exclude stop words.
  2. Find high-frequency pairs that include a word from that list. These are the high-frequency collocating pairs.
    1. To do this, it would find the collocates within Y places to the left or right
  3. It would generate a list of the pairings, sorted first by the key high-frequency word and then by collocation frequency
Changed:
<
<
>
>
-->

To Do

  • Talk to Lian to see if it can be done.

 <<O>>  Difference Topic TAPoRwareWordPairs (r1.1 - 26 Jun 2006 - GeoffreyRockwell)
Line: 1 to 1
Added:
>
>
META TOPICPARENT TAPoRware

Word Pair Finder

Description

This tool is being developed with Andy Hrymak from Chemical Engineering. It will find high-frequency pairs of words. It is needed to help build a subject keyword list.

History

Still in concept phase.

Pseudocode

  1. Find the top X high frequency words. Exclude stop words.
  2. Find high-frequency pairs that include a word from that list. These are the high-frequency collocating pairs.
    1. To do this, it would find the collocates within Y places to the left or right
  3. It would generate a list of the pairings, sorted first by the key high-frequency word and then by collocation frequency
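
One possible reading of this pseudocode, sketched in Python. The names `collocate_pairs`, `top_x`, `window_y` and the tiny `STOP_WORDS` list are hypothetical, and two points the pseudocode leaves open are settled by assumption here: stop words are also skipped as collocates, and "sorted by the key word" is taken to mean alphabetical order.

```python
import re
from collections import Counter

# Tiny illustrative stop list (an assumption); a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "in", "is", "it", "of", "the", "to"}


def collocate_pairs(text, top_x=10, window_y=1, stop_words=STOP_WORDS):
    """Count collocates of the top X words within Y places on either side."""
    words = re.findall(r"[a-z']+", text.lower())

    # Step 1: the top X high-frequency words, excluding stop words.
    freq = Counter(w for w in words if w not in stop_words)
    keys = {w for w, _ in freq.most_common(top_x)}

    # Step 2: count collocates up to Y places to the left or right of each key word.
    pair_counts = Counter()
    for i, word in enumerate(words):
        if word not in keys:
            continue
        lo = max(0, i - window_y)
        hi = min(len(words), i + window_y + 1)
        for j in range(lo, hi):
            # Assumption: stop words are not reported as collocates either.
            if j != i and words[j] not in stop_words:
                pair_counts[(word, words[j])] += 1

    # Step 3: sort by key word, then by descending collocation frequency.
    return sorted(
        pair_counts.items(),
        key=lambda kv: (kv[0][0], -kv[1], kv[0][1]),
    )
```

Each result item is a `((key_word, collocate), count)` tuple, so the output can be grouped by key word when building a subject keyword list.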

To Do

  • Talk to Lian to see if it can be done.

-- GeoffreyRockwell - 26 Jun 2006



Revision r1.1 - 26 Jun 2006 - 14:39 - GeoffreyRockwell
Revision r1.3 - 07 Jun 2007 - 13:34 - LianYan