List Words to Identify Themes

This exercise uses this recipe to identify simple themes within a sample text.

It applies the recipe to a real textual example that is freely available on the Internet, so you can follow the steps yourself and see the results.

This recipe and exercise are available as a PDF download.

Exercise Steps

  1. This exercise uses Volume 2 of Thomas Macaulay's History of England, which can be downloaded from Project Gutenberg.
  2. Run the TAPoR List Words Tool to generate a word list sorted by frequency. The result should resemble the following:
    Word     Count
    The      3591
    Of       2057
    And      1360
    To       1234
    A        850
    Was      848
    In       758
    Had      686
    Been     265
    Be       255
    Not      246
    At       240
    On       213
    From     212
    Who      201
    They     187
    Their    174
    All      153
    King     139

    The most frequently used words are function words such as 'The' and 'A'. They are not distinctive to this text, so we eliminate common function words.


  3. Run the TAPoR List Words Tool again, applying a list of words to exclude from the list. One useful stop list, the Glasgow stop words list, is available here. The result should be similar to:
    Word          Count
    King          139
    Great         115
    Parliament    92
    England       86
    House         83
    Men           81
    Time          75
    Government    74
    Charles       73
    Power         68
    Party         66
    Public        59
    Years         57
    France        56
    Long          56
    English       55
    Court         54
    Commons       53
    State         52
    Church        51
    New           46
    Man           46
    Country       46

    The list of frequent words is now more intriguing. Words such as King, Great, Parliament, England, House, Men, Time, Government, Charles, Power, Party, Public, Years and Just immediately stand out.


  4. Thus, with a single simple word-listing tool, you can easily identify the themes of power, monarchy, the common man and time in Macaulay's History of England.
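
For readers who want to reproduce the idea outside the web tool, the steps above can be roughly sketched in Python. This is only an illustration of counting words and then excluding a stop list; it is not how the TAPoR List Words Tool is implemented. The function name list_words, the sample sentence and the tiny stop list are all hypothetical stand-ins (the stop list here is not the Glasgow list).

```python
import re
from collections import Counter


def list_words(text, stop_words=frozenset()):
    """Return (word, count) pairs sorted by descending frequency.

    Hypothetical helper for illustration; words are lowercased and
    any word found in stop_words is left out of the count.
    """
    # Match runs of letters, allowing one internal apostrophe (e.g. "king's").
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common()


# Tiny made-up sample; substitute the downloaded text of the volume.
sample = "The king and the parliament met. The king had great power."

# Illustrative stop list only, far shorter than a real one.
stop = {"the", "and", "had", "a", "of", "to"}

all_words = list_words(sample)            # step 2: raw frequency list
content_words = list_words(sample, stop)  # step 3: stop words excluded

for word, count in content_words:
    print(f"{word.capitalize():<12}{count}")
```

With a full text, the first list would likely be dominated by function words, while the second should bring content words to the top, much as in the two result lists above.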

Next Steps/Further Information

-- ShawnDay - 3 November 2006
