


Add a French-Language Text to TAPoR


This exercise uses this Recipe to import a French-language text into the TAPoR text analysis environment.

This exercise applies the recipe to a textual example which is freely available on the Internet so you can complete the steps yourself and see the results.

Exercise Steps


This exercise assumes that you have a text whose encoding prevents it from being analysed.
The sample text is in French and contains accented characters, but it is encoded using Windows ASCII. Although it may display correctly in your word processor, TAPoR will not interpret it correctly when you attempt to analyse it.
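The mismatch described above can be illustrated with a short Python sketch. It is not part of TAPoR and does not reproduce TAPoR's own check. It assumes that "Windows ASCII" corresponds to the Windows-1252 code page (`cp1252` in Python), which is an assumption and not something this page confirms. The helper name `is_valid_utf8` is hypothetical.

```python
# Illustrative only: why accented text saved in a Windows code page
# fails a UTF-8 validity check.
# Assumption: "Windows ASCII" means Windows-1252 ("cp1252" in Python).

sample = "à côté"

windows_bytes = sample.encode("cp1252")  # one byte per accented letter
utf8_bytes = sample.encode("utf-8")      # two bytes per accented letter


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(windows_bytes))  # False
print(is_valid_utf8(utf8_bytes))     # True
```

The same characters produce different byte sequences in the two encodings, which is why a file can look correct in a word processor and still be rejected by a tool that expects UTF-8.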


  1. Download this text from FrenchSample and save to your desktop.
  2. Log in to TAPoR.
  3. Choose the MyTexts tab in the portal.
  4. At the bottom of your list of texts, click the Add Text button. A tutorial on adding texts to TAPoR is available if you need more information.

    When you try to upload this text file, you will get the error message "Please correct these errors: File to upload: upload.invalid-type". This is because TAPoR checks the text, determines that it is encoded as Windows ASCII, and rejects it.


  5. To re-encode this text, open a text editor on your computer and follow the instructions in Recipe 22 for your particular operating system. Save the result to your desktop with the filename MyFrenchText.txt.
  6. Your text should now be saved in UTF-8 format.
  7. Add this text file to MyTexts by clicking the Add Text button in MyTexts and browsing to the saved file. Add an appropriate tag, then click the Add Text button at the bottom of the Coplet.
  8. When you receive the message that the text has been added successfully, click the button indicated to refresh your text list.
  9. To test whether the encoding has been completed correctly, generate a word list using the TAPoR List Words Tool. Use the default parameters.
  10. If the document has been successfully encoded and imported, you should obtain a result such as:

    Summary: There are 378 unique words
    and there are 744 words in total. 304 words occurred
    once and 27 words occurred twice.
    Words    Counts
    La------40
    De------39
    Pouvoir------23
    Soif------20
    Et------19
    Les------17
    L------15
    Des------15
    Du------15
    Le------11
    D------10
    Il------9
    Que------8
    Est------8
    à------7
    Une------6
    Qui------6
    En------6
    Dans------6
    S------5
    ------5
    Sa------5
    Un------5


  11. To test whether you can enter an accented word and search for it within this text, build a concordance using the TAPoR Find Words - Concordance Tool and input a word that you know occurs within the text and includes diacritical marks. In this example, search for the word in the text.
  12. Do not copy and paste this search term. Enter this word using your normal keyboard technique for entering an accented character. If you are unsure of how to do this, a tutorial is available here.
  13. If you entered the characters correctly, you should see a result similar to:

    5 entries found.
    avant J . au moment les Juifs rêvaient encore d'une
    les programmes scolaires et partout cela est encore possible des
    La vie familiale d'abord , Chirac asservit les siens à
    d'une démocratisation mal négociée , le détenteur du pouvoir veut
    ouvrir des boîtes de pandore d'où surgissent des démons incontrôlables


  14. Congratulations, you are now ready to work with French-language texts within the TAPoR environment.
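For readers who prefer a script to a text editor, the re-encoding in steps 5 and 6 and the word-list check in steps 9 and 10 can be sketched in Python. This is a minimal sketch and not the method given in Recipe 22. It assumes the source file is in Windows-1252 (`cp1252`). The function names and the input filename `FrenchSample.txt` are hypothetical. The tokenizer is a simple stand-in for the TAPoR List Words Tool, so its counts will not necessarily match the figures shown above.

```python
import re
import tempfile
from collections import Counter
from pathlib import Path


def reencode_to_utf8(src: Path, dst: Path, source_encoding: str = "cp1252") -> None:
    """Read src using source_encoding and write the same text to dst as UTF-8."""
    text = src.read_bytes().decode(source_encoding)
    dst.write_bytes(text.encode("utf-8"))


def word_counts(path: Path) -> Counter:
    """Count lowercased word tokens in a UTF-8 text file."""
    text = path.read_text(encoding="utf-8").lower()
    # \w matches Unicode letters, so accented words stay in one piece.
    return Counter(re.findall(r"\w+", text))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "FrenchSample.txt"   # hypothetical input filename
        dst = Path(tmp) / "MyFrenchText.txt"   # filename used in step 5
        # Stand-in content; replace with the downloaded sample text.
        src.write_bytes("La soif du pouvoir est là où il est".encode("cp1252"))
        reencode_to_utf8(src, dst)
        print(word_counts(dst).most_common(3))
```

If accented words appear intact in the counts, the file was decoded with the right source encoding. If they appear split or replaced by odd symbols, the assumed `source_encoding` is probably wrong for that file.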

Next Steps/Further Information

-- ShawnDay - 21 March 2007

