Texts are composed of Lines; Lines are composed of individual Tokens; Tokens have associated tags for grammatical, lexical, and linguistic trait information.
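The hierarchy described above can be pictured as a small nested data model. The following is only a sketch: the class names, field names, and sample values are assumptions invented for illustration and do not reflect how the site actually stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    form: str                                   # the token as transcribed
    gloss: str = ""                             # English gloss
    lemma: str = ""                             # Bulgarian standard lexeme
    grammatical: set[str] = field(default_factory=set)        # grammatical tags
    lexical: set[str] = field(default_factory=set)            # lexical tags
    linguistic_traits: set[str] = field(default_factory=set)  # linguistic trait tags


@dataclass
class Line:
    tokens: list[Token]
    translation: str = ""                       # English translation of the line


@dataclass
class Text:
    title: str
    lines: list[Line] = field(default_factory=list)

    def tokens(self):
        """Iterate over every token in the text, line by line."""
        for line in self.lines:
            yield from line.tokens


# Placeholder data, invented for illustration only (not taken from the site).
sample = Text(
    title="sample-text",
    lines=[
        Line(
            tokens=[
                Token("form-a", gloss="water", lemma="LEMMA_A",
                      grammatical={"noun", "sg"}),
                Token("form-b", gloss="drink", lemma="LEMMA_B",
                      grammatical={"verb", "3sg"}),
            ],
            translation="placeholder translation",
        )
    ],
)

print([t.gloss for t in sample.tokens()])
```

Keeping the three kinds of tag in separate sets mirrors the distinction drawn above between grammatical, lexical, and linguistic trait information.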
Each text can be viewed in three ways: Glossed View (tokens with grammatical information, English glosses, and Bulgarian lemmas), Line Display (text line and English translation only), and Cyrillic Line Display (Bulgarian lines in Cyrillic).
Five types of search are available:
• Wordform search extracts tokens (in context) according to English gloss and/or Bulgarian standard lexeme and/or a combination of grammatical tags.
• Lexeme search extracts every phonetic rendition on the site of a given standard lexeme, as well as all lexemes characterized by specific lexical traits.
• Linguistic trait search extracts tokens characterized by one of a very large number of traits of interest to linguists.
• Thematic content search extracts all lines of text that concern particular ethnographically defined topics.
• Phrase search extracts grammatically or pragmatically significant chunks that are composed of more than one token.
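As an illustration of how the first of these, the wordform search, might combine its criteria, here is a minimal sketch. It assumes that any criteria supplied are applied together (a token must satisfy all of them) and that each hit is returned with its containing line as context. The function name `wordform_search`, the data classes, and the sample values are hypothetical and are not the site's actual interface.

```python
from dataclasses import dataclass


@dataclass
class Token:
    form: str
    gloss: str                      # English gloss
    lemma: str                      # Bulgarian standard lexeme
    tags: frozenset = frozenset()   # grammatical tags


@dataclass
class Line:
    tokens: list


def wordform_search(lines, gloss=None, lemma=None, tags=()):
    """Yield (line, token) pairs matching every criterion that was supplied.

    Criteria left at their defaults are ignored, so the search can use the
    gloss, the lemma, the grammatical tags, or any combination of them.
    """
    wanted = set(tags)
    for line in lines:
        for tok in line.tokens:
            if gloss is not None and tok.gloss != gloss:
                continue
            if lemma is not None and tok.lemma != lemma:
                continue
            if not wanted <= tok.tags:      # token must carry all wanted tags
                continue
            yield line, tok


# Placeholder data, invented for illustration only (not taken from the site).
lines = [
    Line([
        Token("form-a", "water", "LEMMA_A", frozenset({"noun", "sg"})),
        Token("form-b", "drink", "LEMMA_B", frozenset({"verb", "3sg"})),
    ]),
    Line([
        Token("form-c", "water", "LEMMA_A", frozenset({"noun", "pl"})),
    ]),
]

for line, tok in wordform_search(lines, gloss="water", tags={"pl"}):
    context = " ".join(t.form for t in line.tokens)
    print(tok.form, "in:", context)
```

Returning the whole line alongside each matching token is one simple way to supply the "in context" part of the description; a real implementation might instead return a window of neighbouring tokens.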