Generate Meta Information for a Website using Text Analysis Tools
This exercise uses this Recipe to build meta tags for a web page or website using frequency lists and the Googlizer.
Exercise Steps
- This exercise uses the web page at http://www.tapor.ca/html/faq.html as its analysis sample.
- Log in to TAPoR.
- Generate a list of the most frequently used words on this web page using the TAPoR List Words Tool. Enter the URL "http://www.tapor.ca/html/faq.html" as the target. Make sure that the tool is set to list the results by frequency.
- Your result should be similar to:
Summary: There are 273 unique words and there are 616 words in total. 196 words occurred once and 31 words occurred twice.
| Words | Counts |
|---|---|
| Text | 19 |
| Project | 12 |
| Analysis | 11 |
| Tapor | 9 |
| Portal | 7 |
| Tools | 7 |
| Researchers | 7 |
| University | 6 |
| Canada | 6 |
| Electronic | 6 |
| Research | 6 |
| Texts | 5 |
| Humanities | 4 |
| Study | 4 |
| Access | 4 |
| Enable | 4 |
| Web | 3 |
| Representation | 3 |
| Information | 3 |
| Knowledge | 2 |
| Universities | 2 |
| Human | 2 |
| Scholars | 2 |
| Computing | 2 |
| Society | 2 |
| Infrastructure | 2 |
| Techniques | 2 |
| Easy | 2 |
| Textual | 2 |
| Digital | 2 |
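For readers who want to see what a frequency list involves, the following is a minimal Python sketch of the same idea, not TAPoR's actual implementation. The tokenization rule, the small stop-word list, and the function names `word_frequencies` and `summarize` are all assumptions made for illustration; TAPoR's own rules may differ, so counts produced this way will not necessarily match the table above.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the list TAPoR uses is not given in this exercise.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "that"}

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count words case-insensitively, skipping stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in stop_words)

def summarize(counts):
    """Return (unique, total, once, twice), mirroring the tool's summary line."""
    unique = len(counts)
    total = sum(counts.values())
    once = sum(1 for c in counts.values() if c == 1)
    twice = sum(1 for c in counts.values() if c == 2)
    return unique, total, once, twice

# Illustrative sample text, not taken from the TAPoR page.
sample = "Text analysis tools enable the study of text. Tools for text analysis."
counts = word_frequencies(sample)

# Print the most frequent words first, as the List Words tool does.
for word, count in counts.most_common(3):
    print(f"{word.capitalize():<12} {count}")

print(summarize(counts))
```

Sorting by frequency is what makes the list useful here: the words at the top are the ones a search engine is most likely to associate with the page.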
- Inspect this list and ask yourself whether these words define the content of your site. The sample web page describes the TAPoR project, and we want people to know that it provides a series of easily accessible, powerful text analysis tools. Are these the terms that you would expect someone to search for if they were looking for this site? If not, consider rewriting your content so that it refers more directly to what your site is about.
- If you know of other sites on your subject or a similar subject, inspect the keywords they have chosen for their meta tags.
- To automate this process, use the TAPoR Googlizer with a search term built from the keywords returned above. For this exercise, use the search term "text analysis tools" and ask for 10 links with full pages.
- Save the results to the Databench.
- Generate a list of the most frequently used words on the aggregated web pages with a List Words tool such as the TAPoR List Words Tool.
- This list should resemble:
Summary: There are 1549 unique words and there are 5479 words in total. 919 words occurred once and 248 words occurred twice.
| Words | Counts |
|---|---|
| Text | 135 |
| Electronic | 71 |
| Analysis | 62 |
| Texts | 60 |
| Tools | 39 |
| Information | 30 |
| Software | 27 |
| Language | 21 |
| Research | 21 |
| Researchers | 21 |
| Concordance | 20 |
| Study | 20 |
| Word | 19 |
| Search | 18 |
| Resources | 15 |
| Dream | 15 |
| Computer | 15 |
| Document | 14 |
| Canadian | 12 |
| Page | 12 |
| University | 12 |
| Retrieval | 12 |
| Words | 12 |
| Written | 12 |
| Documents | 11 |
| Tool | 11 |
| Project | 11 |
| Concordances | 10 |
| Version | |
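With both frequency lists in hand, one way to spot terms that similar sites use and your page does not emphasize is a simple list comparison. The sketch below is only an illustration: it uses the top ten rows of each table in this exercise, lowercased, and the helper name `missing_from` is hypothetical.

```python
# Top ten words from the sample page's frequency list (lowercased).
own_site = ["text", "project", "analysis", "tapor", "portal",
            "tools", "researchers", "university", "canada", "electronic"]

# Top ten words from the aggregated Googlizer results (lowercased).
similar_sites = ["text", "electronic", "analysis", "texts", "tools",
                 "information", "software", "language", "research", "researchers"]

def missing_from(own, others):
    """Return words in `others` that are absent from `own`, keeping the order of `others`."""
    own_set = set(own)
    return [word for word in others if word not in own_set]

candidates = missing_from(own_site, similar_sites)
print(candidates)
```

Because only the top ten rows are compared, a word such as "research" shows up as a candidate even though it appears further down the sample page's own list. Whether any candidate belongs in your keywords remains a judgment about what your site is actually about.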
- Consider the list of keywords suggested for your site and the list of keywords drawn from similar sites. Are there keywords appearing on other sites that you want to add to your list?
- Assemble a list of the top ten words that you feel describe your site, drawing on these two lists where they help.
- Construct an appropriate meta tag in the format:
<meta name="keywords" content="_your keywords here_" />
- This meta tag can now be added to the <head> section of your web page to help others find your information.
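The last two steps can also be sketched in Python. This is a minimal illustration rather than part of the TAPoR toolset: the function name `keywords_meta_tag`, the case-insensitive duplicate removal, and the example keywords are assumptions. The keywords are joined with commas, the conventional separator for this tag, and escaped so that a stray quotation mark cannot break the attribute.

```python
from html import escape

def keywords_meta_tag(keywords, limit=10):
    """Build a keywords meta tag from at most `limit` distinct keywords."""
    kept = []
    seen = set()
    for keyword in keywords:
        keyword = keyword.strip()
        key = keyword.lower()
        # Skip empty entries and case-insensitive duplicates.
        if keyword and key not in seen:
            seen.add(key)
            kept.append(keyword)
    content = ", ".join(kept[:limit])
    # quote=True escapes double quotes so the attribute value stays well-formed.
    return f'<meta name="keywords" content="{escape(content, quote=True)}" />'

# Illustrative keyword choices; pick your own from the two frequency lists.
tag = keywords_meta_tag(["text analysis", "TAPoR", "tools", "Tools", "concordance"])
print(tag)
```

The `limit` default of 10 simply mirrors the "top ten words" suggested in the exercise.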
Next Steps/Further Information
-- ShawnDay - 21 October 2006