Morlex is a lexical database containing over 33,000 entries, intended for both basic research and NLP applications such as information retrieval, speech synthesis and speech recognition.
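Each entry represents a lemma (i.e. a base form). The table below shows, for each part of speech, the number of entries and which information is available (+) or not available (-): analysis, generation and oral form.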
| part-of-speech | entries | analysis | generation | oral form |
|---|---|---|---|---|
| adjective | 3200 | + | - | + |
| noun | 20000 | + | - | + |
| adj/noun | 2200 | + | - | + |
| verb | 6773 | + | + | - |
| adverb | 1300 | + | - | + |
| preposition | 56 | + | - | - |
| others | 165 | + | - | - |
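The lexical database is compiled into a form that can be used by a computer program. For example, analysing the inflected forms "marchons" and "livres" yields the following lemmas and features: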
marchons = marcher,verbe,indicatif,présent,1,plur
livres = livre,noun,masc,plur
livres = livre,noun,fem,plur
livres = livrer,verbe,indicatif,présent,2,sing
livres = livrer,verbe,subjonctif,présent,2,sing
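
The analyses above map one inflected form to one or more readings, each made up of a lemma and a list of features. As a minimal sketch, assuming the analyses are available as plain `form = lemma,feature,...` lines like those shown (the actual compiled format is not described here), a lookup table could be built along these lines. The names `parse_analyses` and `lemmas` are hypothetical and are not part of Morlex.

```python
from collections import defaultdict

# Sample lines copied from the examples above. The "form = lemma,feature,..."
# layout is an assumption based on those examples, not a documented format.
SAMPLE = """\
marchons = marcher,verbe,indicatif,présent,1,plur
livres = livre,noun,masc,plur
livres = livre,noun,fem,plur
livres = livrer,verbe,indicatif,présent,2,sing
livres = livrer,verbe,subjonctif,présent,2,sing
"""


def parse_analyses(text):
    """Map each inflected form to a list of (lemma, features) pairs."""
    table = defaultdict(list)
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        form, analysis = (part.strip() for part in line.split("=", 1))
        lemma, *features = analysis.split(",")
        table[form].append((lemma, tuple(features)))
    return dict(table)


def lemmas(table, form):
    """Return the distinct lemmas of a form, in first-seen order."""
    seen = []
    for lemma, _features in table.get(form, []):
        if lemma not in seen:
            seen.append(lemma)
    return seen


analyses = parse_analyses(SAMPLE)
print(lemmas(analyses, "livres"))  # ['livre', 'livrer']
```

Keeping every reading rather than only the first preserves the ambiguity of a form such as "livres", which can be a plural noun or a form of the verb "livrer"; a later disambiguation step can then choose among the readings.
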
An on-line demonstration of verb lemmatisation is also available.