[Figure: Abstract View of Bulgarian Lexicon]

Glosa Lexicon Builder

The Glosa Lexicon Builder is a morphological database tool for rapidly developing and testing descriptions of natural language words and word formation rules. It consists of two components: a lexicon component and a rules component. Either one can be developed first. The lexicon component automatically determines derivations and inflections from the rules component, with overrides for special cases; the rules component can in turn be developed automatically from individual lexicon examples, using a special machine-learning technology.


Features of the Lexicon Builder:

Defining the Rules Component

The Lexicon Builder offers three different approaches to defining a language's morphology. These approaches can be used separately or in combination.

Affix-Specific Surface Rules

This paradigm combines phonological rules and morphotactics into a single rule formalism, using only surface symbols. Each set of rules shows the relationship between surface configurations with and without a particular affix; a single rule shows the relationship for a subset of the word forms that can take the affix. For example, the English Noun Plural suffix -s could be represented by this series of rules (B represents the sibilants s, x, and z):

[Figure: Affix-Specific Surface Rules]

The most specific rule that applies in a given case is used. Specificity is defined as:
  1. Greater length of matching terminal string.
  2. Smaller size of character class (for instance B, which contains only three members, would be more specific than C, the set of all consonants). A single character is more specific than any character class.
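The selection of the most specific rule can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Lexicon Builder's actual rule format: the rule tuples, the class C, and the function names are hypothetical, and the sample plural rules stand in for the ones shown in the figure. Only the class B (s, x, z) and the two specificity criteria come from the description above.

```python
# Character classes. B comes from the text; C is an illustrative consonant set.
CLASSES = {
    "B": frozenset("sxz"),
    "C": frozenset("bcdfghjklmnpqrstvwxz"),
}

# Hypothetical plural rules: (terminal pattern, characters to strip, text to append).
# Upper-case symbols name a character class; lower-case symbols are literals.
PLURAL_RULES = [
    ("", 0, "s"),       # default: cat -> cats
    ("B", 0, "es"),     # box -> boxes
    ("ch", 0, "es"),    # church -> churches
    ("sh", 0, "es"),    # dish -> dishes
    ("Cy", 1, "ies"),   # city -> cities
]


def matches(pattern, word):
    """True if the pattern matches the end of the word."""
    if len(pattern) > len(word):
        return False
    tail = word[len(word) - len(pattern):]
    return all(
        ch in CLASSES[sym] if sym in CLASSES else ch == sym
        for sym, ch in zip(pattern, tail)
    )


def specificity(pattern):
    """Longer patterns rank higher; ties go to the smaller character classes.

    A literal character counts as a class of size 1, so it outranks any class.
    """
    width = sum(len(CLASSES[sym]) if sym in CLASSES else 1 for sym in pattern)
    return (len(pattern), -width)


def pluralize(word):
    """Apply the most specific matching rule to the word."""
    candidates = [rule for rule in PLURAL_RULES if matches(rule[0], word)]
    pattern, strip, suffix = max(candidates, key=lambda rule: specificity(rule[0]))
    return word[:len(word) - strip] + suffix
```

With these assumed rules, `pluralize("city")` picks the two-symbol pattern `Cy` over the empty default, and `specificity("s")` ranks above `specificity("B")`, which in turn ranks above `specificity("C")`.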

General Surface Rules

To capture language-wide phonological phenomena, special rules can be applied at morpheme boundaries. These occur conceptually after an affix attachment or before an affix removal, but in practice they are composed with the affix-specific rules to form a new set of affix-specific rules. Thus we could define phonological rules similar to the plural suffix rules above that would apply in any situation where the suffix consists of a single s (i.e. Noun Plural and Verbal Third Person Singular). The plural suffix specification would then be reduced to a single rule.
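One way to picture this composition is the sketch below. It is a simplified illustration, not the tool's algorithm: the tuple layouts, the names `GENERAL_RULES`, `compose`, and `attach`, and the restriction to suffixes are all assumptions made for the example.

```python
# Hypothetical general boundary rules:
# (stem ending, affix as specified, affix as realized on the surface).
GENERAL_RULES = [
    ("s", "s", "es"),
    ("x", "s", "es"),
    ("z", "s", "es"),
]


def compose(affix_rules, general_rules):
    """Expand (stem ending, suffix) rules with the general boundary rules.

    Simplification: a general rule is merged only when its stem ending
    is at least as long as the affix rule's ending and compatible with it.
    """
    composed = list(affix_rules)
    for ending, suffix in affix_rules:
        for g_ending, g_affix, g_surface in general_rules:
            if suffix == g_affix and g_ending.endswith(ending):
                composed.append((g_ending, g_surface))
    return composed


def attach(stem, rules):
    """Attach a suffix using the rule with the longest matching stem ending."""
    applicable = [rule for rule in rules if stem.endswith(rule[0])]
    ending, suffix = max(applicable, key=lambda rule: len(rule[0]))
    return stem + suffix


# Each affix is specified by a single rule; composition supplies the rest.
NOUN_PLURAL = compose([("", "s")], GENERAL_RULES)
VERB_3SG = compose([("", "s")], GENERAL_RULES)
```

Here both affixes are written as one rule each, and the composed sets yield `attach("box", NOUN_PLURAL)` as "boxes" and `attach("pass", VERB_3SG)` as "passes", while `attach("walk", VERB_3SG)` stays "walks".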

General Lexical Symbol Rules

The Lexicon Builder supports special Lexical Symbols in affixes. As in two-level rules[1], these lexical symbols can be realized as different surface formations in different contexts. For example, one could define a lexical symbol S that is realized as es after B, o, ch, or sh, and as s the rest of the time.

[Figure: Some Lexical Symbol Rules]

The plural suffix specification is then reduced to

...<->...S.
The corresponding surface-only rules are automatically computed and can be viewed in the transformation window.

[Figure: Computed Surface Rules]
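The realization of the lexical symbol S described above can be sketched as a small function. This is a minimal illustration, assuming that a lexical affix is a plain string in which the upper-case S is the only lexical symbol; the names `ES_CONTEXTS` and `realize` are hypothetical and do not reflect the Lexicon Builder's internal representation.

```python
# Contexts in which S surfaces as "es": the sibilants B (s, x, z), o, ch, and sh.
ES_CONTEXTS = ("s", "x", "z", "o", "ch", "sh")


def realize(stem, affix):
    """Attach a lexical affix to a stem, resolving the lexical symbol S.

    S becomes "es" when the material to its left ends in one of the
    ES_CONTEXTS, and "s" otherwise. All other symbols are copied as-is.
    """
    surface = stem
    for symbol in affix:
        if symbol == "S":
            surface += "es" if surface.endswith(ES_CONTEXTS) else "s"
        else:
            surface += symbol
    return surface
```

Under these assumptions, `realize("box", "S")` gives "boxes" and `realize("cat", "S")` gives "cats", so a single affix specification covers both cases.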

Comparison of the Lexicon Builder with Two-Level Finite State Transducer based Implementations

Similarities:

Differences:


References

[1] Koskenniemi, Kimmo. 1983. Two-level morphology: a general computational model for word-form recognition and production. Publication No. 11. Helsinki: University of Helsinki Department of General Linguistics.

[2] Barton, G. Edward, Robert C. Berwick, and Eric Sven Ristad. 1987. Computational complexity and natural language. Cambridge, MA: The MIT Press. (see chapter 5, "The complexity of two-level morphology").


Demos of Lexicon Builder applications: English Lexicon

A portion of an English lexicon.


For further information and pricing, contact:

Glosa International
4538 Winona Ct.
Denver, CO 80212-2513
USA
+1-303-458-1496 (voice and fax)

