How to Use the Site | Bulgarian Dialectology as Living Tradition

Home Page

The map on this page shows pins for all of the 78 sites visited. Clicking the name of a site, either in the list above the map or on the label next to the pin on the map, leads to a separate page for the location.

Individual location pages include links to the currently available texts recorded in that site, and give the Bulgarian administrative region* of the locality and the date(s) on which recordings were made. The most valuable section of the page contains a description of the salient features of the dialect group to which that village’s dialect belongs, with examples taken from the texts on the site.
* The site follows current Bulgarian dialectological practice in assigning villages to administrative regions of the pre-World War II era.

Contents Page

This table provides basic information about each of the 203 texts in the database. It can be re-sorted by clicking the headers of the various columns.

Text provides a direct link to that text’s main page.

Dialect Group identifies the basic dialect group to which the location in which the text was recorded belongs.

Tokens gives a word count of the text. Only speech of informants is counted here; that of investigators is excluded.

Length states the duration of the audio clip that is the basis of each text. [This is the only header that does not allow sorting.]

Lines denotes the number of lines in any one text.

Synopsis of Thematic Content gives a brief description of the topics discussed in the text.

Text Pages

Pages for individual texts (e.g. Archar 1) contain:

Audio Player

Play and pause buttons start and stop the audio. Click within the bar to jump to a point in the text. Time codes at the beginning of each line of the text indicate the corresponding point in the audio.

Metadata

The sidebar contains metadata about the text, including a map showing the location in which it was recorded, the date of recording, the number of words in the text, and a synopsis of its content. It also contains a description of the physical context of the recording session, a summary of the thematic content of the text, and an identification of the participants in the conversation. Only the sex is identified for informants, but full names are given for investigators. For certain villages only, a "village image" (photograph) is included.

Text

Three different views can be selected at the top of each page:

Glossed View – Lines are broken up into individual tokens, each of which has interlinear glosses. (For a full list of the abbreviations, see Wordform Tags below.) Also given are standard Bulgarian forms or normalized dialectal forms (lemmas) of tokens, in Cyrillic. Click on any token or lexeme for a link to its own page.
Line View – A simple display with lines and their English translations enables distraction-free reading of the text for content.
Cyrillic Line View – Only the line itself is given, using the Cyrillic transcription conventions accepted in Bulgarian dialectology.

Token Pages

The token is the central piece of data around which almost all of the information in the corpus is organized.

The top portion of this page contains grammatical tags that have been assigned to the token and, for lexically meaningful words, the English gloss. Certain tokens also contain tags marked "Lexical variation" or "Linguistic trait"; the function of these is explained in the sections entitled Lexeme Search and Linguistic Trait Search, respectively. If desired, one can also click on the name of the associated lexeme to go to its page.

Below this is a list of all of the lines in the corpus containing the token (listed according to the texts in which it is found). One can click on the line number to go to its page.

Note: any individual token is a phonetic transcription. Most lexical items will be represented by many different tokens, all of which can be found by consulting the Lexeme search.

Line Pages

Line pages are accessed by clicking on the Bulgarian line in the Line View of the Text page. The Line page lists the full line in three forms: in Latin transcription, in Cyrillic transcription, and in English translation. It then lists each token in the line separately. If the line has been tagged for thematic content or phrases, these tags are also given.

One can click on any of the individual tokens the line contains to jump to their individual pages, on any of the phrases (if the line contains them) to jump to that phrase’s page, or on any of the thematic content tags (if the line contains them) to access all lines carrying the same tag.

Lexeme Pages

Lexeme pages are accessed by clicking on any lexeme in the bottom line of the Glossed View of a text. This leads to a display of all of the tokens corresponding to that lexeme. One lexeme may be associated with tokens of various grammatical forms, of which there may be various phonetic realizations. All such forms are provided, with glosses, on this page.

Each lexeme corresponds to a standard Bulgarian lexical item (lemma). A number of tokens fail to correspond to a lexeme in standard Bulgarian. For these, a new lemma is created, and called here a “dialectal lexeme.”

A number of lexemes are marked for etymological source, or other traits of interest, if such information seems relevant. In addition, lexemes that function as heteronyms carry "lexical variation" tags. The headwords for such lexemes are listed in the section explicating the Lexeme Search.
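
The relationship between tokens and lexemes described above can be pictured with a small Python sketch. It is only an illustration: the class name, field names and sample forms are invented for this example and do not reflect how the site actually stores its data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One phonetically transcribed word form (hypothetical structure)."""
    form: str    # Latin-script phonetic transcription
    lemma: str   # Cyrillic headword of the associated lexeme
    tags: tuple  # grammatical tags, e.g. ("adv",)
    gloss: str   # English gloss


def group_by_lexeme(tokens):
    """Map each lemma to the sorted list of distinct token forms found for it."""
    forms = defaultdict(set)
    for tok in tokens:
        forms[tok.lemma].add(tok.form)
    return {lemma: sorted(found) for lemma, found in forms.items()}


# Invented sample data: two phonetic realizations of one lexeme, plus a second lexeme.
sample = [
    Token("utre", "утре", ("adv",), "tomorrow"),
    Token("utri", "утре", ("adv",), "tomorrow"),
    Token("kotka", "котка", ("f", "sg"), "cat"),
]
lexeme_index = group_by_lexeme(sample)
```

In this sketch, looking up a lemma in `lexeme_index` plays the role of opening a Lexeme page: it gathers every distinct transcription attached to that headword.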

Wordform Search

This search allows the user to locate tokens according to their grammatical and pragmatic tags. Consult the list below for a full explanation of all tags used in the system.

To search for tokens, select the desired tags from any combination of categories and click "Apply." (Hold the Command key on a Mac or Control on Windows to select or deselect multiple tags.) One can also select operators for more complex searches above the boxes for each category. One can limit the search as desired by entering a lexeme (in Cyrillic) or an English translation in the relevant boxes. Choose "Reset" to deselect all tags.

Results will show all tokens grouped by the texts in which they are found. Each entry gives all tags for the token, its location by text and line number, and the context in which it occurs – the full line, both in Bulgarian (Latin transcription) and English translation. As in all pages, forms in blue function as links to the corresponding pages. The geographic distribution of all selected tokens is shown on a map at the top of the page.
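
The logic of combining tags from several categories can be sketched as follows. This is a minimal illustration, not the site's implementation: the operator names ("all", "any", "none"), the record layout, the text names and the tag assignments in the sample are all assumptions made for the example.

```python
from collections import defaultdict


def matches(token_tags, selected, operator="all"):
    """Check one category's selection against a token's tags (assumed operators)."""
    token_tags, selected = set(token_tags), set(selected)
    if operator == "all":
        return selected <= token_tags
    if operator == "any":
        return bool(selected & token_tags)
    if operator == "none":
        return not (selected & token_tags)
    raise ValueError(f"unknown operator: {operator}")


def wordform_search(tokens, criteria):
    """Return matching tokens grouped by text.

    criteria maps a category name to an (operator, tags) pair; a token
    must satisfy the selection made in every category.
    """
    results = defaultdict(list)
    for tok in tokens:
        if all(matches(tok["tags"], tags, op) for op, tags in criteria.values()):
            results[tok["text"]].append((tok["line"], tok["form"]))
    return dict(results)


# Illustrative records only; text names and line numbers are placeholders.
sample_tokens = [
    {"form": "dvama", "tags": {"an.num"}, "text": "Text A", "line": 3},
    {"form": "možel", "tags": {"L.part.impf", "m", "sg"}, "text": "Text A", "line": 7},
    {"form": "sedeškum", "tags": {"ger"}, "text": "Text B", "line": 2},
]
hits = wordform_search(
    sample_tokens,
    {"Verb form": ("any", {"L.part.impf", "ger"}), "Gender": ("none", {"f"})},
)
```

Here `hits` groups the two matching verb forms under their respective texts, mirroring the way the results page lists tokens text by text.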

Wordform Tags (Alphabetical)

The following tags appear in texts in Glossed View, and tokens marked by them can be found using the Wordform Search:

1pl – 1st-person plural (for verbs, pronominal subjects and objects)

1sg – 1st-person singular (for verbs, pronominal subjects and objects)

2pl – 2nd-person plural (for verbs, pronominal subjects and objects)

2sg – 2nd-person singular (for verbs, pronominal subjects and objects)

3pl – 3rd-person plural (for verbs, pronominal subjects and objects)

3sg – 3rd-person singular (for verbs, pronominal subjects and objects)

acc – accusative case (for pronominal objects, nouns and adjectives)

adj – adjective

adrs – pragmatic particles expressing address, such as ma, be, and the like

adv – adverb

an.num – animate numbers of the type dvama (regardless of gender of referent)

aor – aorist tense

aux – forms of sŭm, either present or past, functioning as verbal auxiliaries

bkch – pragmatic particles expressing “backchannelling,” the acknowledgement of communication

clt – clitic form (assigned to short-form pronoun objects, reflexive particles, present-tense forms of sŭm, and the interrogative particle li, regardless of whether they bear stress or not)

coll – collective form (e.g. list’e ‘foliage’)

comp – the complementizer da

cond – conditional forms

conj – subordinating conjunction

cop – copula forms of sŭm, either present or past

ct – the “count form” of masculine nouns appearing after numbers (e.g. dva leva ‘two levs’)

dat – dative case (for pronominal object forms and rare noun forms)

def – definite (for nouns and adjectives)

disc – discourse particles that structure narrative (including forms such as ‘this,’ ‘it,’ etc. that have grammatical meaning in other contexts)

dist – distal marking (on definite forms, e.g. decana ‘those children,’ and pronominal adjectives and adverbs, e.g. nakiva ‘that kind’). This marker is used only in texts from regions where a three-way distinction is made.

excl – pragmatic particles functioning as exclamations. (The determination of this meaning is subjective and was made mostly on the basis of intonational cues.)

exist – existential forms of the non-inflected ima ‘there is’ and njama ‘there isn’t’, in all tenses

f – feminine marking (on nouns and singular adjectives, pronouns and participles)

fut – the unchanging future particle, in any of its phonetic forms. (Inflected forms of *xotěti are marked as fut.pst if they have clear “future in the past” meaning.) Note that this tag identifies only the auxiliary: full future tense forms are periphrastic and can be found through the Phrase search.

fut.pst – auxiliary form marking the tense “future in the past” (for njamaše, and for inflected forms of *xotěti only when such a meaning is clearly intended). Note that the full tense forms are periphrastic and can be found through the Phrase search.

ger – gerund forms (e.g. sedeškum ‘while sitting’)

hes – pragmatic particles indicating hesitation

hort – pragmatic particles with hortative meaning, such as xajde, ja, nemoj.

I – imperfective aspect

I/P – biaspectual

impf – imperfect tense (including for auxiliary and copula forms)

imprs – impersonal (for verb forms not inflecting for person, other than the existential forms ima, njama)

imv – imperative

indcl – nondeclining noun or adjective, usually unadapted foreign borrowings which do not inflect for gender or number

inf – infinitive

interr – interrogative

interr.rel – interrogative form used in relative function

L.part – L-participle formed from aorist stem

L.part.impf – L-participle formed from imperfect stem (e.g. možel)

m – masculine marking (on nouns and singular adjectives, pronouns or participles)

med – medial marking (on definite forms, e.g. decata ‘the children,’ and pronominal adjectives and adverbs, e.g. takiva ‘that kind’). This marker is used only in texts from regions where a three-way distinction is made.

n – neuter marking (on nouns and singular adjectives, pronouns or participles)

name – proper name (used only on noun forms)

neg – verbal negation

nom – nominative case (on pronoun forms only)

ost – pragmatic particle used in ostensive function, or to point directly to something

P – perfective aspect

pl – plural marking (on nouns, adjectives, participles or imperative forms)

place – toponym

pl.t – pluralia tantum

P.part – passive participle

pres – present tense

prox – proximal marking (on definite forms, e.g. decasa ‘these children,’ and pronominal adjectives and adverbs, e.g. sakiva ‘this kind’). This marker is used only in texts from regions where a three-way distinction is made.

refl – reflexive pronouns

rel – relative (for pronouns and relative markers of various sorts)

sg – singular marking (on nouns, adjectives, participles and imperative forms)

vbl.n – verbal noun

voc – vocative marking (on nouns)

Wordform Tags (by Category)

Case

nom – nominative
acc – accusative
dat – dative
voc – vocative

Number

sg – singular
pl – plural
pl.t – pluralia tantum
ct – count form

Gender

m – masculine
f – feminine
n – neuter
indcl – nondeclining

Definiteness

def – definite

Person

1sg – 1st-person singular
1pl – 1st-person plural
2sg – 2nd-person singular
2pl – 2nd-person plural
3sg – 3rd-person singular
3pl – 3rd-person plural

Verb form

pres – present
aor – aorist
impf – imperfect
fut – future [auxiliary]
imv – imperative
inf – infinitive
cond – conditional [auxiliary]
fut.pst – future in the past [auxiliary]
L.part – L-participle
L.part.impf – imperfect L-participle
P.part – passive participle
ger – gerund
vbl.n – verbal noun

Aspect

I – imperfective
P – perfective
I/P – biaspectual

Function

aux – auxiliary
comp – complementizer
cop – copula
interr – interrogative
neg – negative
refl – reflexive
rel – relative
interr.rel – interrogative form used in relative function

Clitic

clt – clitic

Deictic

prox – proximal
med – medial
dist – distal

Other

adj – adjective
an.num – animate number
adv – adverb
coll – collective
conj – conjunction
exist – existential
imprs – impersonal
name – name
place – place

Pragmatic

adrs – address
bkch – backchanneling
excl – exclamation
disc – discourse
hes – hesitation
hort – hortative
ost – ostensive
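
For readers who process glosses outside the site, the abbreviations above can be expanded with a simple lookup. The sketch below is hypothetical and covers only a handful of tags; it treats each tag as an indivisible unit because several tags (such as L.part.impf, fut.pst, pl.t and I/P) themselves contain dots or slashes, so splitting on punctuation would give wrong results.

```python
# A partial table copied from the category listing above; extend as needed.
TAG_NAMES = {
    "1sg": "1st-person singular",
    "3pl": "3rd-person plural",
    "aor": "aorist",
    "impf": "imperfect",
    "fut.pst": "future in the past [auxiliary]",
    "L.part": "L-participle",
    "L.part.impf": "imperfect L-participle",
    "I": "imperfective",
    "P": "perfective",
    "I/P": "biaspectual",
    "pl.t": "pluralia tantum",
}


def expand_tags(tags):
    """Return the full name for each tag, flagging tags missing from the table."""
    return [TAG_NAMES.get(tag, f"[unknown tag: {tag}]") for tag in tags]


expanded = expand_tags(["L.part.impf", "I"])
```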

Lexeme Search

This search allows users to locate tokens associated with corresponding lexemes, either standard forms found in the 2012 edition of the Bŭlgarski tŭlkoven rečnik (ed. L. Andrejčin et al.) or “dialectal lexemes,” normalized forms that do not correspond (in form and meaning) to an entry in this dictionary.

To search for lexemes (and instances of their corresponding tokens), enter the Cyrillic form of the word in the box marked “Lexeme Search” and select “Apply.” One can limit the search, if desired, by adding an English translation in the relevant box. One can also search for prefixes, roots or suffixes by selecting the "Starts with...", “Contains…” or “Ends with…” options.
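
The three matching options can be thought of as follows. This is a sketch under assumptions: the function name and mode labels are invented, and treating a plain search as an exact match is a guess rather than documented behavior. The sample lemmas are taken from the Lexical Variation list on this page.

```python
def lexeme_matches(lemma, query, mode="exact"):
    """Compare a Cyrillic lemma with a query under one of four assumed modes."""
    lemma, query = lemma.casefold(), query.casefold()
    if mode == "exact":
        return lemma == query
    if mode == "starts_with":  # useful for prefixes
        return lemma.startswith(query)
    if mode == "contains":     # useful for roots
        return query in lemma
    if mode == "ends_with":    # useful for suffixes
        return lemma.endswith(query)
    raise ValueError(f"unknown mode: {mode}")


LEMMAS = ["довеждам", "започвам", "намирам", "отглеждам", "тръгвам", "коледа"]

by_suffix = [l for l in LEMMAS if lexeme_matches(l, "вам", "ends_with")]
by_prefix = [l for l in LEMMAS if lexeme_matches(l, "до", "starts_with")]
by_root = [l for l in LEMMAS if lexeme_matches(l, "глежд", "contains")]
```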

The Lexical Traits search allows one to locate lexemes that have a particular etymology, are characterized by different lexical traits, or are associated with tokens containing a particular English gloss. One can select any combination of these fields to form a complex search.

The Lexical Variation search allows one to search for heteronyms that happened to occur among the texts on the site. Only variant words which are used in whole dialect groups, or at least in one well-defined dialect, have been so tagged. The following lists the lexemes for which lexical variation tags exist. Imperfective verbs marked * list variants of both aspects.

баница "banitsa (pastry)"
бия.масло "churn [butter]"
боб "beans"
брадва "ax" [n]
броя "count" [v]
бръсна "shave" [v]
булка "bride"
було "veil" [n]
бурило "butterchurn"
бързо "quickly"
валя "precipitate" [v]
вътък "woof" [n]
говоря "speak"
горещ "hot"
гръб "back" [n]
гъска "goose" [n]
диня "watermelon"
*довеждам "bring [someone]"
есен "autumn"
ечемик "barley"
*започвам "begin"
зеле "cabbage"
землен "earthen"
*идвам "come"
искам "want"
картоф "potato"
коледа "Christmas"
коноп "hemp"
кора "sheet [of pastry]"
костенурка "turtle"
котел "cauldron"
котка "cat"
крак "foot / leg"
крия "hide"
кум "godfather"
кумица "godmother"
къс "short"
маса "table"
*намирам "find" [v]
нишка "thread" [n]
обичам "love" [v]
*отглеждам "raise [animal, child, plant]"
отсам "from here"
патица "duck" [n]
пикая "urinate"
подница "baking dish"
постя "fast" [v]
прежда "yarn [for knitting]"
риза "shirt"
*скачам "jump"
скубя "pluck"
*слагам "put"
срещу "against, across from"
стан "loom"
стая "room [in building]"
*стъпя "step" [v]
тичам "run" [v]
точа "roll" [v]
*тръгвам "depart"
трябва "must" [v]
търся "seek"
утре "tomorrow"
чувал "sack" [n]
харман "threshing floor"
хомот "yoke" [n]
хора "people"
царевица "corn"
цървули "sandals"
чекръг "spinning wheel"
чупя "break" [v]

Results of each search will appear below. Clicking a lexeme or token will take the user to the page of the specific lexeme or token.

Linguistic Trait Search

Linguistic trait tags are assigned to tokens and provide data about traits of interest at various levels of linguistic structure. Many of the tags are phrased in diachronic terms and index the traits normally catalogued in dialect descriptions and atlases.

To search for tokens marked with linguistic trait tags, select a major category from the box under “Linguistic Trait.” Any time a choice is followed by an arrow (→), one may make a subsidiary selection in the box that appears to the immediate right. Upon reaching the end of the hierarchical string, select “Apply.” The results, a list of tokens arranged according to texts, will appear below a map with tabs marking the location of all tokens marked with the trait in question.

Some traits allow (but do not require) the specification of various Conditions, or of a phonetic Realization. Conditions that can be specified for a particular trait are marked with encircled letters (e.g. ⓢ), and the possibility of selecting a Realization is indicated with a stylized letter “R” (ℝ). Brief identifications of these symbols appear at the top of the Linguistic Traits search page.

To specify a Condition, select “Conditions” in an “Additional Trait” box, and then specify the desired condition in the box that appears to the right, followed by the value for that condition. Further conditions can be specified in the remaining “Additional Trait” boxes.

To specify a desired phonetic Realization, select “Realizations” in an “Additional Trait” box, and then select the desired reflex from the list that appears in the box to the right. Finally select “Apply” to see results.
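
The way a trait, its optional Conditions and an optional Realization narrow a search can be sketched in Python. Everything about the data layout here is an assumption made for illustration; the two sample forms show the familiar reflexes of jat in the word for "milk" and are not quoted from any text on the site.

```python
# The full hierarchical string for one trait, as selected box by box.
JAT = ("PHONOLOGY", "Vowels", "Historical Slavic Vowels", "jat")


def trait_search(tokens, path, conditions=None, realization=None):
    """Return forms of tokens tagged with the trait at `path`.

    Conditions and realization are optional filters, as on the search page.
    """
    conditions = conditions or {}
    hits = []
    for tok in tokens:
        for trait in tok["traits"]:
            if trait["path"] != path:
                continue
            tagged = trait.get("conditions", {})
            if any(tagged.get(name) != value for name, value in conditions.items()):
                continue
            if realization is not None and trait.get("realization") != realization:
                continue
            hits.append(tok["form"])
            break  # one matching trait is enough for this token
    return hits


# Hypothetical records for illustration.
sample_tokens = [
    {"form": "mleko", "traits": [{"path": JAT, "realization": "/e/",
        "conditions": {"Stress": "stressed", "Morpheme Type": "root"}}]},
    {"form": "mljako", "traits": [{"path": JAT, "realization": "/'a/",
        "conditions": {"Stress": "stressed", "Morpheme Type": "root"}}]},
    {"form": "decata", "traits": []},
]

all_jat = trait_search(sample_tokens, JAT)
stressed_e = trait_search(sample_tokens, JAT, {"Stress": "stressed"}, "/e/")
```

Leaving out the conditions and realization returns every token carrying the trait, which corresponds to selecting “Apply” as soon as the end of the hierarchical string is reached.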

Linguistic Trait Tags

PHONOLOGY →
- Vowels →
  - Historical Slavic Vowels →
    - jat ⓜ ⓢ ⓟ ⓒ ℝ
    - analogical jat ⓜ ⓢ ⓟ ℝ
    - jotated /a/ ⓜ ⓢ ⓟ ℝ
    - back jer ⓜ ⓢ ℝ
    - borrowed schwa-like vowel ⓜ ⓢ ℝ
    - front jer ⓜ ⓢ ℝ
    - inserted jer ⓢ ℝ
    - back nasal ⓜ ⓢ ℝ
    - front nasal ⓜ ⓢ ⓙ ℝ
    - jery →
      - jery not fronted
      - jery > /i/
      - jery > /e/
  - Lengthening & Contraction →
    - compensatory lengthening after elision of /l/
    - compensatory lengthening after elision of /x/
    - contraction of vowel in verb stem
  - Elision →
    - elision of unstressed vowel
  - Reduction Phenomena →
    - non-reducing /a/ (unexpected)
    - non-reducing /o/ (unexpected)
  - Transformations of Specific Vowels →
    - stressed /a/ > /ɑ/
    - /e/ > /a/
    - stressed /e/ > /e̝/
    - stressed /e/ > /i/
    - /e/ > /o/ ⓜ
    - /e/ > /'u/
    - unstressed /e/ > /'ɤ/
    - /i/ > /ᵊi/
    - /i/ > /ɨ/
    - /i/ > /’u/
    - /i/ > /y/
    - /i/ > /ɤj/
    - /i/ > /ᵚi/
    - /e/ > /u/
    - /e/ > /ə/
    - /e/ > /ɤ/
    - /e/ > /’ə/
    - /i/ > /ə/
    - /i/ > /ɤ/
    - /i/ > /’ə/
    - /i/ > /’ɤ/
    - unstressed /o/ > /a/
    - unstressed /e/ > /’ə/
    - unstressed /e/ > /ɤ/
    - unexpected /o/ > /e/
    - /o/ > /wo/
    - stressed /o/ > /o̝/
    - stressed /o/ > /u/
    - /u/ > /i/
    - /u/ > /əu/
    - stressed /o/ > /u:/
    - /u/ > /ɤw/
    - /u/ > /əw/
    - /ɤ/ > /ɤ̟/
    - stressed /ɤ/ > /ʌ/
  - Other Vocalic Phenomena →
    - devoiced vowel
    - prothetic /i/ before initial consonant cluster
    - unadapted foreign sound
- Consonants →
  - Historical Slavic Consonants →
    - syllabic /l/ ⓢ ⓛ ℝ
    - syllabic /r/ ⓢ ℝ
    - analogical syllabic /r/ ℝ
    - proto-Slavic */tj/ */ktj/ */kt + front vowel/ ℝ
    - proto-Slavic */stj/ */skj/ */sk + front vowel/ ℝ
    - analogical syllabic /l/ ⓢ ⓛ ℝ
    - proto-Slavic */dj/ */gtj/ ℝ
    - morpheme-initial */črь/ ⓢ ℝ
  - Palatalization Phenomena →
    - palatalized word-final consonant
    - assimilatory palatalization of consonants
    - /j/ from anticipated palatalization
    - consonant + /j/
  - Elision & Lengthening →
    - elision of intervocalic /j/
    - elision of syllable-final /l/
    - elision of /v/ before rounded vowel
    - elision of /x/ ⓜ
    - elision of consonant between sonorants
    - elision of consonant in syllable-final cluster
    - consonant-vowel fusion
  - Epenthesis →
    - epenthetic /l/ preserved
    - epenthetic /n'/
  - Prothesis →
    - /j/ before /a/ (unexpected)
    - absence of /j/ before non-front vowel (unexpected)
    - /j/ before /u/ (unexpected)
    - /j/ before front vowel
    - prothetic consonant before back nasal ℝ
  - Voicing →
    - unexpected retention of voicing
  - Changes in Consonants & Sequences →
    - /bn/ > /mn/
    - /d/ > /dz/
    - /d’/ > /g’/
    - /dn/ > /n/
    - /dn/ > /nn/
    - /dz/ > /z/
    - /dž/ > /ž/
    - /f/ > /h/
    - /f/ > /v/
    - /f/ > /ɸ/
    - /g/ > /dž/
    - /j/ > /v/
    - /jk/ > /k'/
    - /k/ > /č/
    - /l’/ > /j/
    - /mn/ > /fn/
    - /mn/ > /bn/
    - /mn/ > /m/
    - /mn/ > /ml/
    - /mn/ > /n/
    - /mn/ > /vn/
    - /s/ > /c/
    - /š/ > /č/
    - /sr/ > /str/
    - /str/ > /sr/
    - /t/ > /c/
    - /t’/ > /k’/
    - /v/ > /β/
    - /v/ > /w/
    - /vn/ > /mn/
    - /vs/ > /sv/
    - /x/ > /f/ ⓜ
    - /x/ > /j/ ⓜ
    - /x/ > /v/ ⓜ
    - /x/ > /w/ ⓜ
    - /xv/ > /f/
    - /z/ > /dz/
    - /zdr/ > /zr/
    - /ž/ > /dž/
  - Other Consonantal Phenomena →
    - long consonant
    - retention of /dz/ from Common Slavic palatalization
    - metathesis of /r/ and unstressed vowel
    - other metathesis
    - prefix /ot/ > /od/
    - morpheme-initial /v/ > /u/
- Stress →
  - Nominal Patterns →
    - accent on adjective ending (unexpected)
    - accent on ending of masculine plural noun (unexpected)
    - accent on definite article of polysyllabic masculine noun (unexpected)
    - accent on possessive pronoun ending
    - accent retraction in bisyllabic neuter noun
    - accent retraction in bisyllabic plural feminine noun
    - accent retraction in bisyllabic singular feminine noun
    - non-mobile accent in monosyllabic masculine noun (unexpected)
  - Verbal Patterns →
    - accent retraction from theme vowel of II conjugation
    - accent retraction to initial syllable of 1sg present
    - accent retraction to initial syllable of 2sg-3sg aorist
    - accent on final syllable of 1-2pl present
    - accent retraction in imperative
    - accent on theme vowel in aorist and aorist L-participle (unexpected)
  - Clitics →
    - accent on preverbal negative particle
  - Polyaccentedness →
    - lexical double accent
    - secondary accent on plural article
  - Syllabic Phenomena →
    - elision of unaccented syllable
  - Lexical Phenomena →
    - lexicalized accent advancement
    - lexicalized accent retraction

MORPHOLOGY →
- Definite Article →
  - masculine singular nouns ⓢ ⓟ ℝ
  - masculine singular adjectives ℝ
  - plural →
    - plural article -/te/ (unexpected)
    - plural article -/ti/
    - plural article -/to/
    - elision of plural formant /t/ in definite forms of neuter nouns
- Nouns →
  - Plurals →
    - feminine plural -/e/
    - masculine plural -/e/ (unexpected)
    - masculine plural -/je/ (unexpected)
    - masculine plural -/ove/ (unexpected)
  - Other Noun Phenomena →
    - depalatalization of noun stem final consonant
    - feminine count form
    - stem unification in plural masculine nouns
- Pronouns →
  - nonstandard oblique personal pronoun form
- Adjectives →
  - definite article on bare adjective stem
  - long ending on masculine adjective
  - non-masculine gender marking in plural adjective
  - Common Slavic palatalization before plural adjective
- Verbs →
  - Present →
    - 1sg Present Conjugation →
      - 1sg -/m/ ending for I/II conjugation
      - 1sg lack of -/m/ ending for III conjugation
    - 1pl Present Conjugation →
      - 1pl -/me/ ending for I/II conjugation
      - 1pl -/mo/ ending
      - 1pl -/ne/ ending
    - 3pl Present Conjugation →
      - lack of final /t/
    - Theme Vowel Shifts →
      - theme vowel shift /e/ to /i/ in present conjugation
      - theme vowel shift /i/ to /e/ in present conjugation
    - Other Present Phenomena →
      - depalatalization of verb stem final consonant
  - Aorist & Imperfect →
    - 1pl aorist and imperfect -/mo/ ending
    - 1pl aorist and imperfect -/ne/ ending
    - 2pl aorist -/ste/ ending
    - 3pl aorist -/še/ ending
    - 3pl aorist and imperfect -/e/ ending
    - aorist theme vowel shift /o/ to /a/
    - lack of -/na/ suffix in aorist stem
    - word-final -/n/ on aorist and imperfect ending
    - imperfect stem extended by -/še/
  - L-Participles →
    - non-masculine gender marking in plural L-participle
    - plural -/e/ ending in L-participle
    - reduplication of -/l/ in L-participle
    - deletion of stem-final consonant in L-participle
    - lack of -/na/ suffix in L-participle
  - Passive Participles →
    - -/n/ suffix for expected -/t/
    - -/t/ suffix for expected -/n/
    - -/an/ suffix for expected /en/
  - Verbal Nouns →
    - verbal noun -/n'e/ ending
    - verbal noun -/te/ ending
  - Stem Unifications →
    - present stem unification
    - present stem in aorist
    - present stem in aorist L-participle
    - present stem in verbal noun
    - aorist stem in past participle
    - perfective stem in derived imperfective
    - masculine stem in other L-participle gender forms

SYNTAX →
- Gender →
  - gender shift
- Case →
  - feminine case markings in masculine nouns
  - historical feminine accusative used in nominative function
- Transitivity →
  - transitive use of intransitive verb

LEXEMES →
- diminutive
- nonstandard usage

PRAGMATICS →
- vocative usage
- vocative form with non-vocative usage

CONDITIONS →
- ⓜ Morpheme Type
  - affix
  - definite article
  - ending
  - preposition
  - root
- ⓢ Stress
  - stressed
  - unstressed
- ⓟ Palatalizing Environment
  - before palatalizing environment
  - not before palatalizing environment
- ⓒ Affricate /c/
  - before /c/
  - not before /c/
- ⓛ Labials
  - after labial
  - not after labial
- ⓙ Postalveolars & /j/
  - after postalveolar or /j/
  - not after postalveolar or /j/

REALIZATIONS →
- /a/
- /a:/
- /ar/
- /'a/
- /an/
- /č/
- /cɤr/
- /čer/
- /e/
- /e̝/
- /’e/
- /i/
- /it/
- /k’/
- /lə/
- /ḷ/
- /o/
- /ol/
- /or/
- /ot/
- /'o/
- /ṛ/
- /rɛ/
- /rɤ/
- /š/
- /šč/
- /šk/
- /št/
- /u/
- /'u/
- /v/
- /ə/
- /ər/
- /ət/
- /’ə/
- /ɛ/
- /ždž’/
- /cer/
- /cṛ/
- /cre/
- /čʌr/
- /cәr/
- /čәr/
- /č’er/
- /č’ɤr/
- /en/
- /es/
- /f/
- /in/
- /lɤ/
- /r/
- /ra/
- /rɤ/
- /rә/
- /š’č’/
- /ɔ/
- /ɔl/
- /ɔn/
- /ɔs/
- /ɔt/
- /əs/
- /ɛl/
- /ɛr/
- /ɤl/
- /ʌ/
- /ʌr/
- /ʌt/
- /әl/
- /әn/
- /’i/
- /’ɛ/
- /’ɑ/
- /’ʌ/
- /ɔr/
- /’ɔ/
- /ɤ/
- /ɤr/
- /ɤt/
- /’ɤ/
- /žd/
- [zero]
Home Page | Contents Page | Text Pages | Token Pages | Line Pages | Lexeme Pages | Wordform Search | Wordform Tags | Lexeme Search | Linguistic Trait Search | Linguistic Trait Tags | Thematic Content Search | Phrase Search | Phrase Tags

Thematic Content Search

This search allows the user to isolate sections of lines from texts which contain discussion of a particular theme or item.

The list of topics which can be searched is available in two forms. One is hierarchical, according to topic; and the other is alphabetical. Each entry functions as a link; in order to see all the material concerning that topic, simply click on the link.

Phrase Search

The purpose of the Phrase search is to allow the user to locate grammatically significant groups of tokens, the meaning of which is impossible to tag at the level of the individual token. Some phrases bear only a single identifying tag. Most, however, are defined by the interaction of a number of different grammatical features; consequently they are multiply tagged. The separation of these tags into the following fourteen separate categories allows the user to narrow the search by combining tags from these different categories.

1. Tense-mood.
Compound verbal tenses, comprising a main verb form plus auxiliary, are named in every instance (the perfect and future series, and the conditional mood). Other verb tenses or moods are named when they co-occur with component clitics or particles.

2. “Evidentials”.
Forms of the renarrated mood are not phrases as such since they frequently consist of a single token, the L-participle. But because the meaning is very different from the L-participle which forms part of a compound tense, they are best marked as a phrase (whose other components are usually null).

3. Reflexive. The reflexive particles se or si modify the meaning of many different words. Usually they combine with a verb form to create a so-called “reflexive verb”, but the particle si can also function as a possessive clitic or nominal clitic of other sorts. In sequence with verbal auxiliaries, they function technically as objects (with se fulfilling the accusative function and si the dative function) and are so marked (under Word order). When combined with pronoun objects they are marked as reflexives.

4. Clitic objects.
These occur attached to a headword. Usually they are objects within a verb phrase (and are so marked). If there are two objects only this fact is stated since there is never variation in word order (it is everywhere “dative - accusative”); if one or both of such clitic objects is reflexive this is marked. Clitics can also be attached to a noun in the function of ethical or possessive dative, or simply as a nominal clitic; exceptionally they can also occur as objects of prepositions.

5. Doubled phrases.
These contain two coreferential words functioning as object (more rarely, subject). Usually one is a clitic pronoun while the other is either a nominal form or a full pronoun; in the case of subjects one is a pronoun and the other a noun. Both the type of form and the sequence of their occurrence are identified. Sometimes only long pronoun objects are encountered instead of the more frequent doubled sequence or just short pronoun alone; this fact is marked.

6. Negation.
Negation of simplex verb forms occurring alone is not marked. All other negated verbs are marked. For negated future forms, only the fact of negation is noted; for all other forms, the position of the negative particle with respect to clitic forms (either copula, verbal auxiliary or pronoun object) is noted.

7. Word order.
Pronoun objects occurring in predicate phrases or the perfect tense can either precede or follow the copula or auxiliary, respectively. These word order possibilities are noted. When this word order differs from the standard, this is noted additionally.

8. Syntactic cohesion. For most of the phrases catalogued herein, the components occur sequentially. In some instances, however, one or more other words intervene. This fact is marked; the amount and grammatical characteristics of the intervening material are not noted, but can be viewed as part of the search results, which quote the entire line in which the phrase appears.

9. Category shift. A few phrases implement known grammatical categories in a non-standard, and linguistically interesting, manner.

10. Accentuation. Two types of phrasal accent patterns are possible. In the case of double accent, a second accent occurs either on the final syllable of the main word (when followed by a clitic or other particle), or on a clitic itself (when followed by another clitic or other particle). In the case of additional accent, a pre-verbal clitic following a phrase-initial trigger (conjunction or particle) is accented. When double accent occurs within the confines of a single word (such as kràstavìci or nèkugàš), it is technically not a phrase; nevertheless it is included here, as lexical double accent, for the sake of completeness. Such instances of double accent are also tagged at the token level and can be found via the Linguistic Trait Search.

11. Interrogative. The presence of the interrogative particle is noted if other clitic forms are also present; it is not noted otherwise. The absence of interrogative forms in expressions with clear interrogative meaning is also noted.

12. Verbal phrase. All instances of non-verbal predicates (such as sram me e) are noted; predicate phrases, including those where the predicate is a passive participle, are noted if other forms (such as negation or clitic objects) are present. This rubric also comprises phrases including an ostensive particle followed by a clitic (such as eto go), compound imperative phrases (such as nemoj uči or idi pitaj), dialectally interesting instances of lexical negation (such as nie sme n’amali) or coordinated idioms containing clitics.

13. Style. Any of the above phrases which occur in the context of a narrated folktale or quoted verse are so marked, since such a narrative style may be relevant to the frequency of occurrence of one or another phrase type.

14. Miscellaneous. A large number of interesting phrasal facts are grouped here. Some are sufficiently individual as to not fit under any of the above rubrics, while others are meant to be combined with those in other rubrics. See the Phrase Tags listing below for more detail.

Phrase Tags

The following list explicates all the phrase tags listed under each of the major categories on the Phrase Search page. Examples of each type are given, identified by the text name and line number where the example occurs on the site.
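Because every example is cited by text name and line number, a citation can be split mechanically. The following is a minimal sketch and not part of the site itself: the function name parse_citation is hypothetical, and the sketch assumes that a citation without a colon (such as Bansko 24) consists of the text name followed directly by the line number.

```python
def parse_citation(citation):
    """Split a citation such as '(Arčar 1:33)' into (text name, line number)."""
    body = citation.strip().strip("()").strip()
    if ":" in body:
        # 'Arčar 1:33' -> text name 'Arčar 1', line 33
        text_name, line = body.rsplit(":", 1)
    else:
        # Assumption: with no colon, the final number is the line number
        # and everything before it is the text name ('Bansko 24').
        text_name, line = body.rsplit(None, 1)
    return text_name.strip(), int(line)


for example in ["(Arčar 1:33)", "(Malevo/Xsk 1:68)", "(Bansko 24)"]:
    print(parse_citation(example))
```

Splitting on the last colon rather than on whitespace keeps multi-word text names such as Gorno Vŭršilo 1 intact.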

Tense-mood.

NOTE: for instances of renarration of verbal tenses see renarrated (under “Evidentials”)

• aorist – instance of aorist tense, tagged only when co-occurring with attendant clitics or when renarrated

Examples: mu sɤ ozɤbi (Arčar 1:33); srèšnəl gu (Drabišna 2:65)

• conditional – instance of conditional mood, comprising conditional particle and main verb form

Example: bi litnul (Repljana 1:118)

• future – instance of future tense (affirmative or negative), comprising future auxiliary and main verb form (and any attendant clitic forms). For dialectal variation in future tense forms, see compl.in.fut and conjug.future under Miscellaneous, and neg.fut.as.pres under Category shift.

Examples: šə sɤ rudɤj (Godeševo 2:2), nema da rabotat (Eremija 6:67)

• fut.in.past – instance of future in the past tense, comprising auxiliary and main verb (and any attendant clitic forms). For dialectal variation in this tense form, see non.conj.fut.past under Miscellaneous.

Examples: šteše da i dukara (Brŭšljan 1:21), če a izedeše (Eremija 4:8)

• fut.perf – instance of future perfect, comprising auxiliaries and main verb form (and any attendant clitic forms)

Example: če sme slagale (Bansko 24)

• imperative – instance of imperative mood, tagged only when co-occurring with attendant clitics, when renarrated, or when part of a compound imperative construction (see comp.imper under Verbal phrase)

Examples: posoli si go (Belica 3:78), da ne mɤ svarila (Iskrica 1:74)

• imperfect – instance of imperfect tense, tagged only when co-occurring with attendant clitics or when renarrated

Examples: dujahne gi (Brŭšljan 1:22), gi tɛrsili (Golica 5:77)

• infinitive – instance of infinitive occurring in a subordinate clause context or as part of compound imperative (see comp.imper under Verbal phrase)

Example: ne mogə ti kaza (Malevo/Xsk 1:68)

• present – instance of present tense, tagged only when co-occurring with attendant clitics or when renarrated

Examples: ti plaštam (Arčar 1:30), ne znaeli (Golica 3:32)

• perfect – instance of perfect tense, comprising auxiliary and main verb form (and any attendant clitic forms)

Example: sə im vikəli (Černovrŭh 39)

• pluperf. – instance of pluperfect tense, comprising auxiliary and main verb form (and any attendant clitic forms)

Example: b’ahte dušlili (Drabišna 2:41)

• Rom.perf – instance of the “Romance perfect”, in which the auxiliary is a form of imam (and not sŭm).

Example: n’eməm s’atu (Golica 3:175)

“Evidentials”

• dubitative – instance of the renarrated mood in specifically emotive, ironic meaning

Example: bila sam imala (Eremija 5:30)

• renarrated – instance of the renarrated mood (and any attendant clitic forms). The function of the renarrated form in the discourse is also noted. If it transmits a specific narrative, the renarrated tense is identified.

Examples: ste jadeli [present] (Rakovski 15), utiš’ɤl [aorist] (Oreše 26), imalo [imperfect] (Repljana 4:36), št’eli ə ut’uət [future] (Petrov Dol 2:18), tel da me ubie [future in the past] (Repljana 1:29), bil zakasal [pluperfect] (Eremija 5:31)

If the tag occurs alone, the form simply communicates the fact of evidence of completed action.

Examples: stojal (Oborište 1:25), se sɤbrali (Repljana 2:90)

Reflexive.

• reflexive – the reflexive particle together with the word it depends on. For the particle se, this word is always a verb, and the two together form a “reflexive verb”. The particle si can similarly be attached to verbs, but it can also be attached to nouns and even adverbs. The two clitics can also occur together (see clit.noun.phrase and 2.refl.clit under Clitic objects), and it is also possible for a non-reflexive pronoun to be used in reflexive meaning (see non.refl under Category shift).

Examples: se razbirame (Arčar 1:23), udrežiš si (Babjak 1:27), učite si (Golica 3:145), dɤlga si vɤlna (Graševo 100)

• 2.refl – instances of the compound reflexive form sebe si

Example: s’ebe si (Golica 4:10)

Clitic objects

• 1.obj.pron – verb (of any tense) plus a dative or accusative pronoun object; this object is usually a simple clitic but can also be reduplicated by a fuller form, and can exceptionally be a full form standing alone (see clt.pron-full.pron, clt.pron-nom, full.pron-clit.pron, nom-clt.pron, or long.only under Doubled phrases).

Examples: mi rekuə (Babjak 4:7), gu z’eə (Bangejci 1:30)

• 2.obj.pron – verb (of any tense) plus both dative and accusative pronoun objects, always in that order

Example: im go klaah (Huhla 1:58)

• 2.refl.clit – verb (of any tense) plus both dative and accusative reflexive pronoun objects (si se, always in that order)

Example: si se poznavame (Dolno Ujno 212)

• clit.noun.phrase – any dative clitic together with the noun, noun phrase, or other word to which it is attached

Example: dobər vi pɤt (Dolno Ujno 27)

• eth.dative – a clitic dative pronoun together with its head word, used in the meaning “ethical dative”

Example: ala ti si (Stakevci 2:23)

• poss.clitic – a clitic dative pronoun together with its head word, used to indicate possession

Example: tatku ti (Brŭšljan 4:44)

• prep+clit.obj – a preposition followed by a clitic pronoun object

Example: puder’ə mu (Srebŭrna 1:60)

• refl+obj.pron – verb (of any tense) with two pronoun objects (dative and accusative, always in that order), one of which is reflexive

Example: da mu se meni (Belica 3:88)

• refl+2.obj.pron – verb (of any tense) plus both dative and accusative pronoun objects, always in that order, preceded by the reflexive si

Example: də si mi gu dədɔt (Mogilica 3:100)

Doubled phrases

• clt.pron-full.pron – instance of verb plus doubled object, in which a full pronoun object is preceded by a reduplicative clitic object

Example: dɤ tɤ uženim t’ebe (Drjanovec 1:45)

• clt.pron-nom – instance of verb plus doubled object, in which a nominal form object is preceded by a reduplicative clitic object

Example: n’eməš’e gu tuva zəjre (Brŭšljan 1:90)

• full.pron-clit.pron – instance of verb plus doubled object, in which a full pronoun object is followed by a reduplicative clitic object; the full form can sometimes appear as subject rather than object (see subj.in.doubled under Category shift)

Example: mene me ne puštat (Eremija 1:12)

• clit.pron-full.pron-nom – instance of tripling, in which a clitic pronoun is followed first by the corresponding full pronoun and then by the noun to which both refer

Example: vi dade na nevu na nevestutu (Nasalevci 1:58)

• long.only – instance of a verb with a single pronoun object, in the full form

Example: tɛh gledame (Gela 3:51)

• nom-clt.pron – instance of verb plus doubled object, in which a nominal form object is followed by a reduplicative clitic object

Example: bulkətə jə pəd’ət (Garvan 1:47)

• nom-subj.pron – instance of a nominal subject form followed by a reduplicative subject pronoun

Example: vṛšačkata ona (Rajanovci 1:38)

• subj.pron-nom – instance of a verb whose nominal subject form is preceded by a reduplicative subject pronoun

Example: t’a mi putkazə ženičkətə (Kruševo 2:42)

Negation.

NOTE: in the following, the term “clitic” comprises short form pronoun objects, reflexive particles, verbal auxiliaries and copula forms.

• clit-neg – instance of a negated verb accompanied by clitic(s) in which the negative particle follows the clitic(s)

Example: mu ne kazvam (Gradec 1:55)

• clit-neg-clit – instance of a negated verb accompanied by two clitics in which the negative particle occurs between them

Example: im ne səm klalə (Huhla 1:53)

• clit-X-neg – instance of a negated verb preceded by its clitic object in which some other word separates the clitic from the negative

Example: da gi ne razduva [da gi vetɤr ne razduva] (Glavanovci 1:84)

• neg – instance of a negated compound verb form without attendant clitics

Example: ni sɤm slagələ (Bangejci 2:36)

• neg-clit – instance of a negated verb accompanied by clitic(s) in which the negative particle precedes the clitic(s)

Examples: ne sme se razbrale (Arčar 1:31), ne se peče (Belica 2:165)

• neg-verb-clit – instance of a negated verb accompanied by clitic(s) in which the verb is preceded by the negative particle and followed by the clitic(s)

Example: ne pada se (Gorno Vŭršilo 1:16)

• no.neg – instance of a verb phrase with clear negative meaning in which the negative particle is absent

Example: sme zəpovnili nɤjštu (Godeševo 3:5)

• verb-neg – instance of a negated predicate phrase in which the negation follows the verb

Example: to ko e ne (Šumnatica 1:97)

Word order.

NOTE: in the following, the term “object” applies both to clitic pronoun objects and reflexive particles. When the order in question departs from that of the standard language, this is noted (see non.std.word.order under Miscellaneous).

• aux-obj – instance of a perfect tense accompanied by object(s) in which the auxiliary precedes the object(s).

Examples: sa go donele (Arčar 2:20), ne e se setil (Gradec 1:35)

• cop-obj – instance of a predicate construction accompanied by object(s) in which the copula precedes the object(s).

Examples: kaktu sə si (Stančov Han 1:13), ne e me jet (Repljana 1:128)

• obj-aux – instance of a perfect tense accompanied by object(s) in which the auxiliary follows the object(s).

Examples: vi e nael (Arčar 1:9), jə sə dali (Kruševo 1:105)

• obj-aux-obj – instance of a perfect tense accompanied by two objects in which the auxiliary stands between the objects.

Example: nie si sme jə pravili (Graševo 83)

• obj-cop – instance of a predicate construction accompanied by object(s) in which the copula follows the object(s).

Examples: žal me e (Repljana 1:28), takiva ti sə (Belica 3:53)
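The word order tags above name the elements in the order in which they occur, and the non.std.word.order rule (listed under Miscellaneous) depends only on that order and on whether the auxiliary or copula is 3rd singular. The following is a rough sketch under stated assumptions, not the site's own tagging procedure: the role labels "aux", "cop" and "obj" are assigned by hand, and the function names are hypothetical.

```python
def word_order_tag(roles):
    """Collapse runs of identical roles and join them, e.g. obj, obj, aux -> 'obj-aux'."""
    collapsed = []
    for role in roles:
        if role not in ("aux", "cop", "obj"):
            continue  # ignore the verb, negation and any other material
        if not collapsed or collapsed[-1] != role:
            collapsed.append(role)
    return "-".join(collapsed)


def is_non_standard(roles, third_singular):
    """Apply the non.std.word.order rule as worded in this list.

    Non-standard: a 3rd singular auxiliary/copula that precedes an object,
    or any other auxiliary/copula form that follows an object.
    """
    parts = word_order_tag(roles).split("-")
    aux_positions = [i for i, part in enumerate(parts) if part in ("aux", "cop")]
    obj_positions = [i for i, part in enumerate(parts) if part == "obj"]
    if not aux_positions or not obj_positions:
        return False
    aux = aux_positions[0]
    precedes = any(aux < obj for obj in obj_positions)
    follows = any(aux > obj for obj in obj_positions)
    return precedes if third_singular else follows


# ne e se setil: 3rd singular auxiliary e before the object se
print(word_order_tag(["neg", "aux", "obj", "verb"]), is_non_standard(["neg", "aux", "obj", "verb"], True))
# vi e nael: object vi before the 3rd singular auxiliary e
print(word_order_tag(["obj", "aux", "verb"]), is_non_standard(["obj", "aux", "verb"], True))
```

Under this reading, a sequence tagged obj-aux-obj counts as non-standard whatever the person of the auxiliary, since the auxiliary both follows and precedes an object.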

Syntactic cohesion

• ellipsis – instance of an elliptical (truncated) construction following directly upon (and depending on) a fully formed one

Example: hodilə səm kupalə ž’ɔnələ vɔrhlə karələ (Momčilovci 6)

• non-sequential – instance where other, grammatically extraneous, material breaks up elements of the specified phrase. This material is not entered as part of the phrase, but it can be viewed in search results, which bring up the full line in which the phrase occurred.

Example: mene mi puč’inə [mene mi pɔrvət mɔž puč’inə] (Mogilica 1:79)

• aux-X-verb – instance of a perfect tense in which some other word separates the auxiliary and L-participle form

Example: ne səm zaletela [ne səm negde zaletela] (Repljana 1:95)

• clit-X-verb – instance of a sequence in which some other word separates the clitic from its headword.

Example: to se izbistri [to se odgore izbistri] (Belica 3:17)

• clit-verb-clit – instance of a verb surrounded by clitics

Example: momčeto go kṛstixme si (Glavanovci 2:24)

Category shift

• init.clit – instance of a verb phrase with clitic objects in which the clitics begin the utterance

Example: səm gu vərdzələ (Srebŭrna 1:23)

• neg.fut.as.pres. – instance of the negative future used with present tense meaning

Example: nema da ti meriše (Belica 3:86)

• no.count.masc – instance where a masculine noun following a quantifier is in the plural form rather than the expected count form

Example: četiri bivale (Petŭrnica 35)

• non.refl – instance of a non-reflexive pronoun used in reflexive meaning

Example: si odgovara za nego (Belica 1:116)

• pers.pron.demonstr. – instance where the personal pronoun is used in the function of a demonstrative pronoun

Example: neja staja (Srebŭrna 1:111)

• pos.as.comp – instances where a positive degree of the adjective expresses the meaning of comparative degree

Example: č’etr’i gudini malku ud mene (Drjanovec 1:23)

• subj.in.doubled – instances where a subject pronoun appears instead of the expected full-form object pronoun in doubled constructions

Example: toj sv’ekərə gu ubili (Markovo 11)

Accentuation.

NOTE: Trigger particles which are known to occasion addl.acc are included in the citation of any phrase where they occur, whether or not addl.acc is present.

• addl.acc – instance of “additional accent”, defined as a sequence of conjunction or other trigger particle followed by an accented clitic (and usually then by a verb).

Example: ku jà dadète (Oreše 47)

• dbl.acc – instance of “double accent”, defined as two word accents in a phonological word (a lexical word followed by one or more monosyllabic segments).

Example: dəšterìčkətà mi (Kruševo 2:10)

• lex.dbl.acc – instance of “double accent” in which the two accents occur within a single lexical word.

Example: kùmuvètu (Dolno Draglište 2:11)
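Since accent is written with a grave mark in the examples above, words carrying two accents can be picked out by counting those marks. The following is only a rough heuristic sketch, not the method used to tag the site: the function names are hypothetical, it assumes that every accent is transcribed as a grave accent, and it guesses dbl.acc when the doubly accented word is followed by further material and lex.dbl.acc when it stands alone. It does not handle addl.acc, nor double accent in which the second accent falls on a clitic.

```python
import unicodedata

GRAVE = "\u0300"  # combining grave accent, assumed to mark every accented vowel


def accent_count(word):
    """Count accent marks in a word, whether precomposed (à) or combining."""
    decomposed = unicodedata.normalize("NFD", word)
    return decomposed.count(GRAVE)


def double_accent_candidate(phrase):
    """Return a candidate tag for the first word carrying two or more accents."""
    words = phrase.split()
    for index, word in enumerate(words):
        if accent_count(word) >= 2:
            followed = index < len(words) - 1
            # Heuristic: a following clitic or particle suggests phrasal
            # double accent; a word standing alone suggests the lexical type.
            return "dbl.acc" if followed else "lex.dbl.acc"
    return None


for phrase in ["kùmuvètu", "dəšterìčkətà mi", "ku jà dadète"]:
    print(phrase, double_accent_candidate(phrase))
```

Normalizing to NFD first means that carons and other diacritics in the transcription (as in š or č) are separated out and not mistaken for accent marks.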

Interrogative.

• interr.clit – instance of a verb phrase containing the interrogative particle li, tagged only when object forms are also present

Example: znaeš li a (Belica 1:78)

• no.interr – instance of an utterance with interrogative meaning which lacks any formal mark of interrogation

Examples: ste ne videvɤli (Tihomir 2:15), jok praite (Šumnatica 3:65)

Verbal phrase.

• comp.imper – instance of a sequence of forms conveying a command, either negative (consisting of nemoj + verb form) or positive (consisting of two consecutive imperative forms)

Examples: nemoj pita (Čokmanovo 2:11), idi si grai (Nasalevci 1:180)

• coord.idiom – instance of two identical phrases connected by the negative particle, with the general meaning “regardless of X”

Example: istəkal’i ne istəkal’i (Malevo/Xsk 1:137)

• echo.question – instance of an interrogative phrase which repeats all or part of a question just asked by another, usually containing two interrogative markers

Example: kogi go bereme li (Vladimirovo 3:87)

• gen’l.neut.L.part – instance of the neuter L-participle used instead of an expected masculine or feminine one

Example: d’et si sm’elilu ž’itotu (Sŭrnica 4:38)

• lexical.neg – instance of negative meaning expressed not by the negative particle but through other, lexicalized means

Example: sə n’aməli (Iskrica 2:1)

• non-vbl.future – instance of the future particle with verbal meaning understood but not lexically expressed

Example: če teka (Stakevci 3:8)

• non-vbl.pred. – instance of verb phrase composed of copula, noun and accusative pronoun, with the general meaning of emotional condition

Example: sram me e (Babjak 4:5)

• ostensive – instance of ostensive particle followed by clitic object pronoun

Example: e gu (Stančov Han 1:68)

• pass.part – instance of phrase whose predicate is a passive participle, tagged only when other elements (such as negation or clitic object forms) are present

Example: ne mi e kladen (Kruševo 3:18)

• predicate – instance of predicate phrase, tagged only when other elements (such as negation or clitic object forms) are present

Example: da ni e l’eku (Bangejci 2:34)

Style.

• folktale – instance of any of the phrases defined herein that occurs within the narration of a folktale

Example: slənc’et gu n’amə (Drabišna 2:73)

• quoted.verse – instance of any of the phrases defined herein that occurs within a section of quoted verse

Example: sə svivə previvə (Čokmanovo 1:34)

Miscellaneous.

• approx.num – instance of one or more numerals spoken together conveying the idea of an approximate amount

Example: sto i pedese dvesta (Leštak 1:10)

• comp.pron – instance of a phrase containing interrogative or relative pronoun plus particles with a generally indefinite meaning

Example: edi si koga (Vŭrbovo 2:43)

• compl.in.fut – instance of the future tense in which the complementizer da follows the future particle

Example: še d idem (Oborište 2:62)

• conjug.future – instance of the future tense in which the future particle bears tense markings for 1st singular

Example: šta lisna (Belica 3:83)

• mult.determ – instance in which more than one element of the same noun phrase bears definite or determiner markings

Example: ženičkətə starətə (Kruševo 2:42)

• mult.L.part. – instance of repeated L-participle forms accompanying the same auxiliary

Example: gi e dərdžal dərdžal (Kolju Marinovo 5:50)

• no.compl. – instance where the complementizer da has been omitted

Example: nema vidim (Belica 2:50)

• no.copula – instance where the verbal copula has been omitted

Example: hmi gotovu (Malevo/Xsk 1:113)

• non.conj.fut.past – instances of the future in the past tense where the auxiliary is unchanging and the main verb is in the imperfect tense.

Example: če a izedeše (Eremija 4:13)

• non.std.word.order – instances of the perfect tense where the 3rd singular auxiliary (or copula) precedes object pronouns, or any other auxiliary (or copula) form follows object pronouns

Examples: ne e se setil (Gradec 1:35), jə sə dali (Kruševo 1:105)

• od+obj – instance where the preposition ot is spoken as od (other than immediately before a voiced obstruent)

Example: ud ədin (Kruševo 3:57)

• 2nd.accus – instance of the “second accusative” construction

Example: rəkojki gu kazvət (Leštak 3:183)

• v+obj – instance where the preposition v is spoken as v rather than f (other than immediately before a voiced consonant)

Example: v armana (Belica 2:30)
