School Head — Ekaterina Rakhilina
Deputy Head — Yana Akhapkina
Moscow, 21/4 Staraya Basmannaya Ulitsa
Phone: +7 (495) 772-95-90 *22734
The paper examines the properties of heavy as a perceptual concept, based on evidence from 11 languages. We demonstrate that the semantics of this concept is heterogeneous; lexemes of this field can be used in situations of at least three types: Lifting, Shifting and Weighing. These situations are either lexicalised as separate words or they converge in a single lexeme in various combinations following certain strategies. We also argue that different metaphorical extensions correspond to different situation types; this allows us to use analysis of metaphoric shifts as an additional instrument to establish the semantic structure of direct meanings.
Questionnaires constitute a crucial tool in linguistic typology and language description. By nature, a Questionnaire is both an instrument and a result of typological work: its purpose is to help the study of a particular phenomenon cross-linguistically or in a particular language, but the creation of a Questionnaire is in its turn based on the analysis of cross-linguistic data.
We attempt to ease the linguist’s work by constructing lexical Questionnaires automatically, prior to any manual analysis. A convenient Questionnaire format for revealing fine-grained semantic distinctions pairs words with diagnostic contexts that trigger different lexicalizations across languages. Our method for constructing this type of Questionnaire relies on distributional vector representations of words and phrases, which serve as input to a clustering algorithm. As output, our system produces a compact prototype Questionnaire for cross-linguistic exploration of contextual equivalents of lexical items, with groups of three homogeneous contexts illustrating each usage. We provide examples of automatically generated Questionnaires based on 100 frequent Russian adjectives, including veselyj ‘funny’, ploxoj ‘bad’, dobryj ‘kind’, bystryj ‘quick’, ogromnyj ‘huge’, krasnyj ‘red’, byvšij ‘former’, etc. Quantitative and qualitative evaluation of the Questionnaires confirms the viability of our method.
This paper surveys relative clause constructions in West Circassian (Adyghe) and Kabardian.
The paper presents a methodology for the automatic construction of a lexical typological questionnaire based on data from the monolingual Russian National Corpus. Using the domains ‘sharp’, ‘straight’, ‘thick’, and ‘smooth’ as a test dataset, we elaborate an algorithm that constructs a list of collocations for the corresponding Russian adjectives, computes a vector representation for every collocation, clusters the vector space into semantically homogeneous groups and extracts three central elements from every cluster. We compare the resulting questionnaires with manually prepared ones and conclude that the suggested methodology yields high-quality results and can be used in lexical typological research.
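The last step of this pipeline, extracting three central elements from every cluster, can be illustrated with a minimal sketch. It assumes that collocation vectors and cluster labels have already been computed by some upstream method; the function names, the toy two-dimensional vectors and the choice of cosine similarity to the cluster centroid as the measure of centrality are all assumptions made for illustration, not details taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def central_elements(vectors, labels, k=3):
    """Return the k collocations closest to the centroid of each cluster.

    vectors: dict mapping collocation -> list of floats (assumed precomputed)
    labels:  dict mapping collocation -> cluster id (assumed precomputed)
    """
    clusters = {}
    for phrase, cluster_id in labels.items():
        clusters.setdefault(cluster_id, []).append(phrase)

    result = {}
    for cluster_id, phrases in clusters.items():
        dim = len(vectors[phrases[0]])
        # Centroid = component-wise mean of the cluster's vectors.
        centroid = [sum(vectors[p][i] for p in phrases) / len(phrases)
                    for i in range(dim)]
        ranked = sorted(phrases,
                        key=lambda p: cosine(vectors[p], centroid),
                        reverse=True)
        result[cluster_id] = ranked[:k]
    return result

# Toy data (hypothetical vectors for collocations of ostryj 'sharp').
toy_vectors = {
    "ostryj nozh": [1.0, 0.0],
    "ostryj mech": [0.9, 0.1],
    "ostraja britva": [0.95, 0.05],
    "ostryj topor": [0.5, 0.5],
    "ostryj sous": [0.0, 1.0],
    "ostryj perec": [0.1, 0.9],
}
toy_labels = {
    "ostryj nozh": 0, "ostryj mech": 0, "ostraja britva": 0, "ostryj topor": 0,
    "ostryj sous": 1, "ostryj perec": 1,
}
questionnaire = central_elements(toy_vectors, toy_labels)
```

In this toy example the outlying collocation in the first cluster is dropped, and a cluster with fewer than three members simply contributes all of its members.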
This paper describes the semantic and morphosyntactic properties of general converb constructions in Andi, a language of the Avar-Andic group of the East Caucasian language family. There are two general converbs in Andi, both of which are homophonous with a finite verb form (the aorist and the perfect, respectively). Each converb has a particular contextual meaning (manner and cause for the perfect converb, and means in the case of the aorist converb), while both can be used interchangeably to indicate the first stage of a complex event. The two constructions seem to be diachronically related, the aorist converbial construction being secondary and morphosyntactically more constrained. The aim of this paper is to describe and compare these two partially competing constructions in view of how similar forms are used in closely related languages.
The article compares the qualities ‘sharp’ and ‘blunt’ in 20 languages. We show that they tend to be unequal, with bluntness being negatively defined through sharpness. The two main oppositions in the domain are 1) the type of sharp object, and 2) the sense through which the quality is primarily experienced. The first opposition divides objects into bladed (knives etc.) and pointed (needles etc.); the second deals with touch vs. vision and translates to function (sharp/blunt instruments etc.) vs. shape (pointed/rounded features etc.).
We also find that these oppositions determine the semantic shifts that a word of sharpness or bluntness can have, and that the metaphoric patterns are consistent across languages.
In polysynthetic West Caucasian languages, the morphological verbal complex amounts to a clause, with all kinds of participants cross-referenced by affixes. Relativization is performed by introducing a relative affix in the cross-reference slot which corresponds to the relativized participant. However, these languages display several cross-linguistically rare features of relativization. Firstly, while under the view of the verbal complex as a clause this affix appears to be a relative pronoun, it is an unusual relative pronoun because it remains in situ. Secondly, relative affixes may appear several times in the same clause. Thirdly, relative pronouns are not expected to occur in languages with prenominal relative clauses. Fourthly, in the Circassian branch, relative pronouns are identical to reflexive pronouns. These features are explained by considering relative prefixes to be resumptive pronouns. This interpretation finds a parallel in the neighboring East Caucasian languages, where reflexive pronouns also show resumptive usages. Finally, since in some West Caucasian languages the relative affix is a morpheme with a dedicated relative function but still shows properties of a resumptive pronoun, our data suggest that the distinction between relative pronouns and resumptive pronouns may not be so clear as is usually assumed.
The paper deals with the encoding of “right” and “left” in Katharevousa Greek, which provides valuable data on an intentionally archaizing, artificial language of the XIX–XX centuries. The research is based on the Corpus of Modern Greek and on the translations of two Classical Greek texts (“Anabasis” by Xenophon and “The History of the Peloponnesian War” by Thucydides) into Katharevousa.
Since Katharevousa is an archaizing language, one might suppose that it would copy the ancient means of marking “right” and “left”. On the other hand, although the language was artificial, it was based on the variety spoken by educated Greeks, so strategies of the spoken language of that time can also be expected. Such rules are not usually mentioned in grammar books, and in this domain we get an opportunity to analyze speakers’ intuitive choices.
According to the available data, the translators used strategies entirely different from those of the ancient writers. Katharevousa prefers dynamic projections and adverbs to static prepositions, which is evident not only from the translations but also from the quantitative distribution of the markers. The archaization in spatial strategies is quite selective and is influenced mostly by the Old and New Testament texts rather than by Classical Antiquity. Moreover, the choice of the spatial marker can depend on extralinguistic factors.
The paper proposes a corpus analysis of the Russian adjective slavnyj. Its semantic evolution is analyzed through its distribution in texts of the XVIII–XXI centuries, including the main types of its usages, its main meanings, and possible shifts from one meaning to another. It is shown that the initial semantics of ‘being famous’, which the adjective slavnyj expresses up to the beginning of the XIX century, gives rise to the idea of strong positive evaluation. Slavnyj is very frequent as a positive marker during the XIX century, and then it gradually loses its intrinsic expressiveness. Nowadays, this adjective has become much less frequent, having undergone a peculiar meaning shift: it marks a moderate compliment. While the grammaticalization pattern of slavnyj represents the well-known shift ‘famous’ => ‘good’ (a specific case of the more general pattern ‘differing from the others’ => ‘good’), which is widely attested cross-linguistically, the further stage of the semantic evolution of slavnyj appears to be more exotic.
The article describes the linguistic behavior of the introductory phrase stalo byt’ (literally ‘became to be’, meaning roughly ‘so’) in the Russian language of the XVIII–XXI centuries. The data from the Russian National Corpus show that this construction acquired additional senses over the centuries. At first it was used as a marker of reason and cause; it then also developed two discourse meanings: paraphrase and return to a previous topic. Our data show that this development correlates with the increasing use of stalo byt’ in dialogues. In the Russian language of the XVIII–XIX centuries, stalo byt’ had a variant, the single introductory word stalo (literally ‘became’), and the difference between the two constructions was stylistic rather than grammatical. While stalo was the preferred and most prestigious variant in the language of the XVIII century, by the beginning of the XIX century it had come to sound archaic and thus vernacular. Finally, by the end of the XIX century, parenthetical stalo had almost disappeared. We suppose that this kind of semantic development, from a circumstance role marker to a discourse unit, is typical of a whole class of lexical items, and we plan to investigate it further.
In this book you can find descriptions of the most popular authentic Russian games and recommendations on how to use them in RSL classes.
This book is for RSL teachers and foreign students interested in Russian games.
The Russian Constructicon project currently prioritizes multi-word constructions that are not represented in dictionaries and that are especially useful for learners of Russian. The immediate goal is to identify constructions and determine the semantic constraints on their slots. The Russian Constructicon is being built in parallel with the Swedish Constructicon and will ultimately model the entire Russian language in terms of constructions at all levels from morpheme to discourse. The contents of the Russian Constructicon will serve learners of the language, linguists researching both language-internal and typological phenomena, and will also serve language technology applications such as spell checkers and automated readability assessment tools.
The paper presents a supervised machine learning experiment with multiple features for identification of sentences containing verbal metaphors in raw Russian text. We introduce the custom-created training dataset, describe the feature engineering techniques, and discuss the results. The following set of features is applied: distributional semantic features, lexical and morphosyntactic co-occurrence frequencies, flag words, quotation marks, and sentence length. We combine these features into models of varying complexity; the results of the experiment demonstrate that fairly simple models based on lexical, morphosyntactic and semantic features are able to produce competitive results.
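The simplest of the features listed above (flag words, quotation marks and sentence length) can be computed directly from the raw sentence. The sketch below is only an illustration under stated assumptions: the flag-word list, the tokenization rule, the set of quotation characters and the function name are hypothetical and are not taken from the paper, and the distributional and co-occurrence features are not covered.

```python
import re

# Hypothetical flag words (transliterated); the paper's actual list is not given here.
FLAG_WORDS = {"bukval'no", "slovno", "budto"}

# Quotation characters treated as a signal (an assumption).
QUOTE_CHARS = set('"\u00ab\u00bb\u201e\u201c\u201d')

def surface_features(sentence, flag_words=FLAG_WORDS):
    """Compute three sentence-level features for a metaphor classifier."""
    # Tokens are runs of word characters, optionally joined by one apostrophe.
    tokens = re.findall(r"\w+(?:'\w+)?", sentence.lower())
    return {
        "length": len(tokens),
        "has_flag_word": int(any(t in flag_words for t in tokens)),
        "has_quotes": int(any(ch in QUOTE_CHARS for ch in sentence)),
    }

example = surface_features('On bukval\'no "vzorvalsja" ot zlosti.')
```

A feature dictionary of this kind could then be combined with the other feature groups and passed to any standard classifier.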
The Aphasia Rapid Test (ART; Azuar et al., 2013) is a bedside test for rating aphasia severity in the acute stroke period. The test is designed as a 26-point scale that estimates the severity of impairment in both speech comprehension and production in less than 5 minutes. Previously, ART was used in English and French clinical practice (Azuar et al., 2013). In Russian, there has been no analogous bedside screening scale for acute hospital units. The tests used before (Wasserman et al., 1987; Tsvetkova et al., 1981) are detailed, but time-consuming and effortful for patients in the first days post-stroke. ART is a reliable measure for identifying language and speech disorders (aphasia, dysarthria or apraxia).
Access to a learner corpus has been shown to increase the efficiency of L2 acquisition for learners as well as teaching efficiency for EFL instructors. This paper presents a computer tool for a learner corpus designed at the School of Linguistics of the Higher School of Economics for both categories of users. REALEC, the Russian Error-Annotated Learner English Corpus, set up at the School of Linguistics, is the first open-access collection of English texts written by Russian students learning English. All errors made by Russian students in their academic writing in English are pointed out to them by expert annotators (as a rule, EFL instructors) using special tags. The annotation process is supervised by the research team, which is responsible for consistency in tagging as well as for the development of the learner corpus. One direction of this development is the study of lexical features used in student essays. Our approach in this research was to identify lexical features of essays scored highly by experts that differ significantly from the features of essays given the lowest grades.
This paper concerns the converb forms expressing Simultaneity in Izhma Komi, Northern Khanty and Moksha belonging to the Finno-Ugric group of languages. The existing typological classifications of temporal relations and simultaneity relations in particular either are not detailed enough or lack rigorous criteria and thus appear not to be sufficient for understanding the usage and the distribution of temporal converbs. The study attempts to build a more detailed typological classification of Simultaneity relations which accounts for the data of the languages under consideration. The analyzed parameters of variation include the viewpoint aspect of the events, clause modification type, givenness of the conveyed information and the pragmatic type of the predicate. Special attention is paid to the discourse-pragmatic properties of the forms which bring new insights into the discussion.
The morphology of aspect in many East Caucasian languages is usually described in terms of two aspectual stems. One stem, called ‘perfective’, derives perfective forms, including perfective past (i.e. aorist), perfective converb, perfective participle and other forms. The other stem, called ‘imperfective’, derives imperfective forms, including e.g. imperfective past (i.e. imperfect) and imperfective present, imperfective converb, imperfective participle and some others. Some of the imperfective- vs. perfective-based forms may be formally identical in terms of inflection (e.g. aorist and imperfect may be produced by the same suffix), but this is a matter of variation. In addition to the forms with clear aspectual semantics (e.g. aorist vs. imperfect), there are a number of forms whose aspectual quality is not obvious. Thus, the prohibitive, expressed morphologically, is consistently derived from the imperfective stem. Imperative and infinitive, on the other hand, may be derived from both stems, thus distinguishing between perfective and imperfective, as in Dargwa (including Mehweb), or from separate secondary stems, as in Archi.
The parallels between East Caucasian languages are not absolute. The study of intra-family variation may focus on two different issues: the distribution of the forms lacking a clear aspectual meaning between the two stems (e.g. where do the prohibitive and the imperative or various types of special converbs go) or the formal correlation between the perfective and the imperfective stem. It is the latter issue that I consider below. I study the mutual relation between the two stems, the ways in which they are formally different, and whether and to what extent one of them may be considered the primary one and the other derived. I will address this issue in three languages belonging to three different branches of the family: Archi (Lezgic), Mehweb (Dargwa) and Khinalug (Khinalug). My main conclusion is that, notwithstanding a plethora of patterns that differ across and within languages, the general tendency is that the imperfective stem is, in various ways, the marked member of the opposition, either straightforwardly derived from the perfective stem (Khinalug) or being structurally marked in the sense of Croft (2002).
I use the same parameters to arrive at conclusions comparable across the three languages, including:
The languages considered in the paper show different degrees of such asymmetry, from clearly asymmetrical Archi, through Mehweb, whose system seems to be perfectly symmetrical but where the imperfective stem is somewhat more marked, to Khinalug, where the imperfective stem is almost unequivocally derived from the perfective stem. The data come from descriptions, including Kibrik (1977) and the dictionary by Chumakina et al. (2008) for Archi; Kibrik et al. (1972) for Khinalug; and Magometov (1982) and Daniel (in preparation) for Mehweb.
Sections 2, 3 and 4 treat Archi, Mehweb and Khinalug, respectively. Section 5 is a comparison of the three languages across the relevant parameters. Section 6 is a summary of the results.
The paper introduces a valuable tool that helps EFL instructors choose a direction for creating custom-made learning materials: a learner corpus with expert-annotated errors is used to administer to the target group of learners a custom-made test generated automatically from sentences containing student errors. The paper describes the stages of test-making and presents statistics from automatically generated tests administered to students of the School of Linguistics (HSE).
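One way to picture how a test item might be generated from an annotated sentence is sketched below. This is purely an illustrative assumption: the annotation format (character offsets plus a correction), the gap-fill item type and the function name are hypothetical and do not describe the actual corpus tooling or test format.

```python
def make_gap_item(sentence, start, end, correction):
    """Turn one annotated error span into a gap-fill test item.

    sentence:   the original student sentence
    start, end: character offsets of the erroneous span (hypothetical format)
    correction: the annotator's suggested replacement
    """
    erroneous = sentence[start:end]
    prompt = sentence[:start] + "_____" + sentence[end:]
    # The student's own erroneous form can serve as a plausible wrong option.
    return {"prompt": prompt, "answer": correction, "distractor": erroneous}

student_sentence = "The number of students have increased."
span_start = student_sentence.index("have")
item = make_gap_item(student_sentence, span_start, span_start + len("have"), "has")
```

Here the learner sees the sentence with the erroneous span blanked out and has to supply or choose the corrected form.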
The paper addresses the issue of automatic data collection for lexical typological studies within the Frame approach. Research in this framework is based on the analysis of distributional properties of the lexemes in question. Hence, questionnaires for such studies consist of typical contexts in which lexical items from a given semantic domain can potentially occur. We aim to fill in these questionnaires automatically, and this task can be split into two problems: translating the questionnaire and filling it with the relevant data. We suggest three methods for the first task (translation via bilingual dictionaries vs. online cloud translators vs. parallel corpora) and two algorithms for the second (filling in a questionnaire on the basis of monolingual corpora vs. online translators). We test our algorithms on data from four semantic domains of qualitative features (‘sharp’, ‘smooth’, ‘thick’, ‘thin’).