Help: searching the POS-tagged corpus
How do I specify what to find?
You can specify a pattern consisting of one to four words.
Always start with "Word 1" and use the following columns only if the
pattern you are looking for consists of more than one word.
You can specify up to three properties for each word.
Each property has its own cell in the grid:
- word category: verb, noun, adverb, adjective, etc.
- word lemma (or base form): e.g. for a verb, this is the infinitive.
- word form: the form as it occurs in the text. You may enter either a literal word form or a regular expression.
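The difference between a literal word form and a regular expression can be illustrated with a minimal Python sketch. It assumes that the pattern has to cover the whole word form; this help page does not state how the search engine anchors patterns, and the function name `form_matches` and the sample forms are purely illustrative.

```python
import re

def form_matches(pattern: str, form: str) -> bool:
    """Return True if the pattern matches the entire word form.

    Assumption: the pattern must cover the whole form, so a plain
    word such as "pas" matches only itself.
    """
    return re.fullmatch(pattern, form) is not None

# Illustrative word forms, not taken from the corpus
forms = ["refaire", "faire", "pas", "passe", "revient"]

# A regular expression: every form starting with "re"
print([f for f in forms if form_matches("re.*", f)])  # ['refaire', 'revient']

# A literal word form: only "pas" itself, not "passe"
print([f for f in forms if form_matches("pas", f)])   # ['pas']
```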
Examples
- To find all verbs, select "Verb" in the "Word Cat" cell for "Word 1". Leave everything else open.
| | Word 1 | Word 2 | Word 3 | Word 4 |
| Word Cat | Verb | Any | Any | Any |
| Lemma | | | | |
| Form | | | | |
- To find a sequence of two adverbs, select "Adverb" in the "Word Cat" cell of words "Word 1" and "Word 2".
| | Word 1 | Word 2 | Word 3 | Word 4 |
| Word Cat | Adverb | Adverb | Any | Any |
| Lemma | | | | |
| Form | | | | |
- To find a sequence of two adverbs, the first of which is "pas", enter this:
| | Word 1 | Word 2 | Word 3 | Word 4 |
| Word Cat | Adverb | Adverb | Any | Any |
| Lemma | | | | |
| Form | pas | | | |
- For homonyms, you can specify additional properties to select the interpretation you want. For example, to find "pas" as a noun rather than as an adverb, enter this:
| | Word 1 | Word 2 | Word 3 | Word 4 |
| Word Cat | Noun | Any | Any | Any |
| Lemma | pas | | | |
| Form | | | | |
- To find all verb forms starting with "re", select "Verb" and enter "re.*" in the cell "Form".
| | Word 1 | Word 2 | Word 3 | Word 4 |
| Word Cat | Verb | Any | Any | Any |
| Lemma | | | | |
| Form | re.* | | | |
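To make the logic of the grid more concrete, here is a small Python sketch of how such a pattern could be matched against a tagged text. It is not the actual implementation of the search engine: the `Token` and `Slot` classes, the `find_pattern` function, the category labels and the sample sentence are all assumptions made for illustration. An empty cell is represented by `None`, and the "Form" cell is assumed to match the whole word form.

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Token:
    form: str   # word form as it occurs in the text
    lemma: str  # base form
    cat: str    # word category

@dataclass
class Slot:
    """One column of the grid; None means the cell was left open."""
    cat: Optional[str] = None
    lemma: Optional[str] = None
    form: Optional[str] = None  # literal form or regular expression

    def accepts(self, tok: Token) -> bool:
        # Every filled-in cell must agree with the token
        if self.cat is not None and tok.cat != self.cat:
            return False
        if self.lemma is not None and tok.lemma != self.lemma:
            return False
        if self.form is not None and re.fullmatch(self.form, tok.form) is None:
            return False
        return True

def find_pattern(tokens, slots):
    """Return the start positions where consecutive tokens satisfy all slots."""
    n = len(slots)
    return [
        i for i in range(len(tokens) - n + 1)
        if all(slot.accepts(tok) for slot, tok in zip(slots, tokens[i:i + n]))
    ]

# Hypothetical tagged sentence: "il ne fait pas encore un pas"
tokens = [
    Token("il", "il", "Pronoun"),
    Token("ne", "ne", "Adverb"),
    Token("fait", "faire", "Verb"),
    Token("pas", "pas", "Adverb"),
    Token("encore", "encore", "Adverb"),
    Token("un", "un", "Determiner"),
    Token("pas", "pas", "Noun"),
]

# Two adverbs, the first of which is "pas"
print(find_pattern(tokens, [Slot(cat="Adverb", form="pas"), Slot(cat="Adverb")]))  # [3]

# The noun reading of the homonym "pas"
print(find_pattern(tokens, [Slot(cat="Noun", lemma="pas")]))  # [6]
```

Each column of the grid becomes one `Slot`, so a pattern of up to four words is simply a list of up to four slots checked against consecutive tokens.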
How do I stop an ongoing query?
Click the Stop button in your browser's toolbar.
How long does it take to complete a query?
It may take a while before a query (that is, its index) is completed,
depending on the number of occurrences you asked for
and the number of occurrences of the requested words in the corpus.
In the worst case, the entire corpus is scanned, which takes about 4 seconds
if there is just a single user.
To test whether everything is OK, enter a common word form (such as "est") and ask for 1 occurrence.
After you have clicked the Submit button, the result should be available very quickly.
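Why a common word with a small number of requested occurrences returns so quickly can be pictured with the following Python sketch. It assumes a simple sequential scan that stops as soon as enough occurrences have been found; the actual search engine may work differently, and the function `first_hits` and the toy corpus are illustrative only.

```python
def first_hits(corpus, wanted_form, max_hits):
    """Scan the corpus in order and stop once max_hits occurrences are found.

    Returns the positions found and the number of tokens that were scanned.
    Assumes max_hits is at least 1.
    """
    hits = []
    scanned = 0
    for position, form in enumerate(corpus):
        scanned += 1
        if form == wanted_form:
            hits.append(position)
            if len(hits) == max_hits:
                break  # enough occurrences: no need to read further
    return hits, scanned

# Toy corpus, for illustration only
corpus = ["il", "est", "là", "et", "elle", "est", "ici"]

print(first_hits(corpus, "est", 1))     # ([1], 2): stops after two tokens
print(first_hits(corpus, "est", 5))     # ([1, 5], 7): whole corpus scanned
print(first_hits(corpus, "absent", 1))  # ([], 7): worst case, nothing found
```

Under this assumption, the worst case arises when the requested words are rare or absent, or when more occurrences are requested than the corpus contains.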
Which corpora are available?
Only part of the Elicop corpus has been tagged and verified.
At present, parts of Elilap are available, but nothing from Lancom yet.
Work continues, however, and files are added from time to time.
If you need completeness, use the untagged corpus instead:
use this link.
© 2001 P. Mertens
Send your comments to Piet Mertens