Dictionary of Words in the Wild

What would it be like to live in a world in which there were no written words to be seen, ever? (Willard McCarty)

Important: The Dictionary Is Moving to a New Home

On Tuesday, May 5, 2009, the Dictionary of Words in the Wild will move from McMaster University to the University of Alberta, with a new domain name, http://lexigraphi.ca. We will try to minimize service interruptions during the transition, but users should be aware that the site may be down on May 5. Once the transfer is complete, the dictionary will run as before, but from a TAPoR server at the University of Alberta. We apologize for any inconvenience.

White Papers and Abstracts

DH 2008 Abstract: This is a work in progress.

Known Bugs

  • The Stats page reports a different number of words than the home page. I suspect this has to do with the Phrases in the Fields, but it would be nice to fix that.

  • If you click a word like "Time" in the word cloud to the right of the home page, it doesn't go to the page with the pictures of that word. The link says: "http://ra.tapor.ualberta.ca/~dictwordwild/list_by_word/Time" but this link doesn't work. If you search for "Time" you do get a page.

To Do

These are suggestions from users that we are looking to implement. Thanks to Willard McCarty who has provided a lot of testing and feedback.

  • We need to parse phrases so that we remove punctuation. For example, "faith," won't find the image for "faith".
  • Done (check): We have to fix the search. It seems to be greedy and also finds words that merely sound like the search term.
  • Done File images for multiple words properly under each word. At the moment images are just being filed under one word. So, an image for "Off On" ends up under "Off" but not "On". This is partially fixed.
  • Done Allow implicit words to be entered with parentheses where the word doesn't appear, but is implicit. An example would be http://tapor1-dev.mcmaster.ca/~dictwordwild/show/694 which is filed under "Average" even though the word doesn't appear.
  • Done Allow short phrasal verbs and compounds to be entered with quotation marks so they are filed as one item. An example would be "come up" or "happy days".
  • Done Allow images of longer passages to be identified as "Phrases in the Fields". These would not be filed under individual words, but the full text could be searched. When you choose a "Passage" it would change the entry interface to say Label = "Passage", bigger scrollable field, Directions = "(Type passage verbatim.)". The button would say "Create Passage".
  • Done Allow people to control capitalization so that, for example, "ER" (which stands for "Emergency Room") is not rendered as "Er". Set it so it stores the word exactly as entered. Add instructions to remind people of this.
  • New Colour scheme
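
The entry rules in the list above (strip punctuation, file multi-word images under each word, parentheses for implicit words, quotation marks for compounds, capitalization kept as typed) can be sketched as a small parser. This is only an illustration of those rules, not the Dictionary's actual code; the function name `parse_entry` and the exact syntax it accepts are assumptions.

```python
import re
import string

# Hypothetical entry syntax, inferred from the To Do list:
#   "come up"  -> quoted compound, filed as one item
#   (Average)  -> implicit word that does not appear in the image
#   faith,     -> bare word; surrounding punctuation is stripped
TOKEN = re.compile(r'"([^"]+)"|\(([^)]+)\)|(\S+)')


def parse_entry(text):
    """Return a list of (word, is_implicit) pairs.

    Capitalization is stored exactly as entered, so "ER" stays "ER".
    """
    items = []
    for quoted, implicit, bare in TOKEN.findall(text):
        raw = quoted or implicit or bare
        # Remove leading/trailing punctuation only; "don't" keeps its apostrophe.
        word = raw.strip(string.punctuation + " ")
        if word:
            items.append((word, bool(implicit)))
    return items


print(parse_entry('Off On'))                   # [('Off', False), ('On', False)]
print(parse_entry('faith,'))                   # [('faith', False)]
print(parse_entry('"come up" ER (Average)'))   # [('come up', False), ('ER', False), ('Average', True)]
```

Each returned word would then be filed separately, so an image entered as "Off On" is listed under both "Off" and "On".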

Possible To Dos

These are ideas we are not sure about.

  • Provide bulk upload feature
  • Make it easy to reset password
  • Done Let people comment on pictures if they have an account, like Flickr.
  • Let people create albums or trains of words.
  • Provide a tags field where other information could be attached like geospatial information.
  • Build example toys that demonstrate the API
  • Provide HTML code to make it easy to put automatic toys into a web site.
  • Let people view all the pictures tagged with a particular word (from Ali)
  • List all the words (word clouds?) that are used in conjunction with a particular word (from Ali)
  • Sort words by geography (cities if that information is provided, or something more specific if the data is available) (from Ali)
  • Feature a word sighting of the day (chosen randomly by the dictionary, or by a particular algorithm) (from Ali)
  • Let people tag pictures with the languages that appear in them (from Ali)
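
Two of Ali's suggestions, listing the words used in conjunction with a particular word and choosing a featured word of the day, could be sketched as follows. The data layout (a mapping from image id to the words filed for it) and both function names are assumptions made for illustration; hashing the date is just one possible "particular algorithm".

```python
import hashlib
from collections import Counter
from datetime import date


def cooccurring_words(images, word):
    """Count the words that appear in the same image as `word`.

    `images` is assumed to map an image id to the list of words filed for it.
    """
    counts = Counter()
    for words in images.values():
        if word in words:
            counts.update(w for w in set(words) if w != word)
    return counts


def featured_word(words, day):
    """Pick one word per calendar day by hashing the date.

    The choice looks random but is the same for every visitor on a given day.
    """
    ordered = sorted(words)
    digest = hashlib.sha256(day.isoformat().encode("utf-8")).digest()
    return ordered[int.from_bytes(digest[:8], "big") % len(ordered)]


images = {1: ["Off", "On"], 2: ["On", "Time"], 3: ["Time"]}
print(cooccurring_words(images, "On"))
print(featured_word({"Off", "On", "Time"}, date.today()))
```

The counts from `cooccurring_words` could feed a word cloud directly, with the count used as the weight of each word.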

Wild Winds

These are ideas in the wild.

  • Develop a "Wild Word Trap" that would capture images of passing textuality. This could be some sort of web cam with OCR built in, so that when an image it caught contained text, it would tag the image and save it.
  • "A Day of Wild Words" would try to capture all the forms of textuality a person encounters over a day, from the alarm clock in the morning to the toothpaste label at night.

Meeting Notes

Original Design

These were the original specifications.

Fields for the Database

  • Keyword (can have more than one separated by spaces)
  • Original height and width
  • Format (automatic)
  • Who submitted it (your account name)
  • Description
  • A check box for if the image is just one word or a batch of words (not implemented)
  • Date submitted (automatic)
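
As a rough illustration, the fields above could be modelled as a single record. The class name, attribute names, and the idea of deriving the format from the file extension are assumptions, not the original schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import PurePath


@dataclass
class ImageEntry:
    keywords: list            # one or more keywords
    width: int                # original width in pixels
    height: int               # original height in pixels
    submitted_by: str         # account name of the uploader
    description: str = ""
    filename: str = ""        # assumed: used to derive the format
    is_batch: bool = False    # the "one word or batch of words" check box
    submitted_at: datetime = field(default_factory=datetime.now)  # automatic

    @classmethod
    def from_form(cls, keyword_field, **kwargs):
        """Build an entry from a keyword field separated by spaces."""
        return cls(keywords=keyword_field.split(), **kwargs)

    @property
    def format(self):
        """Format, filled in automatically (here: from the file extension)."""
        return PurePath(self.filename).suffix.lstrip(".").lower()


entry = ImageEntry.from_form(
    "Off On", width=800, height=600,
    submitted_by="example_user", filename="switch.JPG",
)
print(entry.keywords, entry.format)   # ['Off', 'On'] jpg
```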

Forms for the Database

  • An upload form which is behind a password for uploading images.
    • When an image is uploaded we keep the original and downsize it to a standard size - the standard one is saved as a new image
    • When uploading you can choose to crop the image
  • A form for typing a phrase and getting the images back (this is on the home page)
  • An edit form for editing existing items - allows administrators to delete entries, add keywords, change description and so on
  • A form for seeing all the keywords for which we have images
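
The downsizing step in the upload form comes down to scaling both dimensions by the same factor. A minimal sketch of that arithmetic follows; the 640-pixel standard size is an assumed value, since the original specification does not state one.

```python
def fit_within(width, height, max_side=640):
    """Return (width, height) scaled so the longer side is at most `max_side`.

    The aspect ratio is preserved, and images that are already small enough
    are returned unchanged. `max_side=640` is an assumed standard size.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1600, 1200))   # (640, 480)
print(fit_within(300, 200))     # (300, 200)
```

The original dimensions would still be stored with the entry, and the resized copy saved as a new image alongside the original.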

Toys

The Dictionary has an API that allows text toys to be developed that use it.

80 Million Tiny Images is a related project that we should think about.

Matt Didemus Project

Matt Didemus has developed a text toy that uses the SVG browser Batik. To run the text toy, you need to download this browser from http://xmlgraphics.apache.org/batik/

Then enter this URL in the browser: http://didemus.whsites.net/4w03/getwordsfromwild.rb

This will generate a random set of words rendered with the pictures of those words coming through. Matt also drew my attention to a list of the most commonly used words in English: http://www.duboislc.org/EducationWatch/First100Words.html .

-- GeoffreyRockwell - 13 Oct 2005

