The volume is devoted to the typology of the category of number in the world's languages.
The chapter provides a detailed description of the expression of number in West Circassian.
Udi (East Caucasian) possesses several means of expressing the meaning ‘other’, namely (i) the combination of a (usually distal) demonstrative with a numeral (usually ‘one’), arguably calqued from Azerbaijani, (ii) the expression originating from a combination of a demonstrative with the noun ‘arm, side’ and (iii) borrowed adjectives. It is shown that the morphological properties of some of these expressions suggest a kind of grammaticalization. The semantic differences between the expressions mostly fit into the contrast between the types of ‘other’ expressions proposed by Cinque (2015), but also display additional remarkable contrasts.
The paper describes expressions with the meaning ‘other’ in East Caucasian (Nakh-Daghestanian) languages. It is shown that four main strategies can be distinguished: i) the ‘one’-based strategy: ‘other’ includes the numeral ‘one’; ii) the demonstrative-based strategy: ‘other’ includes a demonstrative pronoun; iii) the mixed demonstrative-based + ‘one’-based strategy: ‘other’ includes both a demonstrative and the numeral ‘one’; and iv) the lexical strategy: ‘other’ is a dedicated adjective (pronoun), not necessarily derived from any other clearly discernable source.
In this paper, we analyse case marking in Russian eventive nominalisations recently discussed in Pereltsvaig et al. (2018) with regard to two competing theories of case: the Inherent Case Theory (Woolford 2006; Woolford 2009) and the Dependent Case Theory (DCT; Marantz 1991). We contest the view that Russian eventive nominalisations display ergative alignment (Koptjevskaja-Tamm 1993) and argue that Russian is a nominative-accusative language across the board. We propose an analysis for the syntax of Russian eventive nominalisations and show that, contrary to Pereltsvaig et al. (2018), they are in principle incapable of disproving the DCT. The resulting analysis is trivially compatible with the DCT.
It has been acknowledged that the null subject of a converbial clause in Russian is canonically controlled by the Nominative subject of a main clause (Nominative subject control). Non-Nominative control has been considered ungrammatical. On the basis of two experiments (acceptability rating and speeded grammaticality judgement tasks), the paper shows that non-Nominative control (by u menya ‘PREP I.GEN’) with mental converbs is rated lower than grammatically correct sentences but higher than grammatically incorrect ones. Moreover, according to data from the RNC, the frequency of non-Nominative control has increased in more recent written texts (approximately since the second half of the 20th century). Furthermore, the paper reveals a new effect of the linear position of the converbial clause relative to the main clause (preposition vs. postposition). Preposed converbial clauses are judged as more acceptable than postposed converbial clauses. In more recent written corpus texts, there is also a tendency for non-Nominative control to occur in sentences with preposed converbial clauses. Last but not least, the paper demonstrates that sentences with 1SG pronoun controllers are more acceptable than sentences without an overt subject.
The paper presents the results of GramEval 2020, a shared task on Russian morphological and syntactic processing. The objective is to process Russian texts starting from provided tokens to parts of speech (pos), grammatical features, lemmas, and labeled dependency trees. To encourage multi-domain processing, five genres of Modern Russian are selected as test data: news, social media and electronic communication, wiki-texts, fiction, poetry; Middle Russian texts are used as the sixth test set. The data annotation follows the Universal Dependencies scheme. Unlike in many similar tasks, a collection of existing resources whose annotation is not perfectly harmonized is provided for training, so the variability in annotations is a further source of difficulty. The main metric is the average of the accuracy of pos, features, and lemma tagging, and LAS. In this report, the organizers of GramEval 2020 give an overview of the task, training and test data, evaluation methodology, submission routine, and participating systems. The approaches proposed by the participating systems and their results are reported and analyzed.
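The averaged metric mentioned in the GramEval 2020 abstract can be illustrated with a minimal sketch. It assumes that gold and predicted tokens are already aligned one to one and that each of the four components is a plain per-token accuracy; the `Token` class and the `overall_score` function are hypothetical names, and the official scoring script may differ in details such as normalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical container for one annotated token (not the official format)."""
    lemma: str
    pos: str
    feats: str
    head: int
    deprel: str


def overall_score(gold, pred):
    """Average of POS, feature and lemma accuracy and LAS over aligned tokens."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    n = len(gold)
    pairs = list(zip(gold, pred))
    scores = {
        "pos": sum(g.pos == p.pos for g, p in pairs) / n,
        "feats": sum(g.feats == p.feats for g, p in pairs) / n,
        "lemma": sum(g.lemma == p.lemma for g, p in pairs) / n,
        # LAS: both the head index and the dependency label must match.
        "las": sum(g.head == p.head and g.deprel == p.deprel for g, p in pairs) / n,
    }
    scores["overall"] = sum(scores.values()) / 4
    return scores
```

In this sketch every component contributes equally, so a system that is strong in tagging but weak in parsing is penalized in proportion to its LAS.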
Head/dependent marking is a typological parameter based on whether syntactic relations, or dependencies, are marked on the head of the relation, on the non-head, on both, on neither, or elsewhere in the constituent. It has been visible in description and comparison for some thirty years, during which time advances in analysis of phrase structure and descriptions of previously unnoticed patterns have revealed some imprecisions and gaps in the typology. That approach has figured in descriptive and theoretical work of various kinds and has proven quite useful as far as it goes, but the expansion of descriptive and theoretical work on morphosyntax in the subsequent decades has revealed some gaps and inconsistencies in the original formulation. These can be removed by allowing markers to be assigned not to words but to entire phrases, a move that also allows detached and neutral marking to be more comfortably accommodated in locus theory.
Artificial General Intelligence (AGI) is showing growing performance in numerous applications: beating human performance in Chess and Go, using knowledge bases and text sources to answer questions (SQuAD), and even passing human examinations (Aristo project). In this paper, we describe the results of AI Journey, a competition of AI systems aimed at improving AI performance on knowledge bases, reasoning and text generation. Competing systems take the final native language exam (in Russian), which includes versatile grammar tasks (test and open questions) and an essay, and achieve a high score of 69%, with 68% being the average human result. During the competition, a baseline for the task and essay parts was proposed, and 80+ systems were submitted, showing different approaches to task understanding and reasoning. All the data and solutions can be found on GitHub: https://github.com/sberbank-ai/combined_solution_aij2019
This article presents a survey of the morphology of highly polysynthetic Northwest Caucasian languages.
Northwest Caucasian languages display a high degree of polysynthesis (manifested in complex words which bear much information on arguments and the characteristics of a situation), prefixes and suffixes, with some morphemes being able to appear both as prefixes and suffixes, ergative-based cross-reference of core arguments and indirect objects introduced by applicatives, highly developed means of expressing locational semantics within the predicate, and intricate tense-modality-aspect systems. Although classical noun-to-verb incorporation does not occur, there are constructions akin to incorporation, especially in the nominal domain. Nouns constitute a subclass of a broad class of predicates (both morphologically and syntactically) and form word-like nominal complexes with their attributes. Morphemes demonstrate features which are not typical of morphemes in Standard Average European languages, including much autonomy reflected in affix order variation and the ability to attach to complex syntactic constituents.
This note presents two challenges for the analysis of promise-type verbs within the Movement Theory of Control. We show that the objects of these verbs in Russian are not prepositional and are incorrectly predicted to be legitimate controllers. We also argue against analysing oblique control as movement.
The category of person is a linguistic expression of reference to a role in a speech act, including the speaker, the addressee, or a combination thereof. The values of the person category commonly, if not universally, include the opposition of first person (reference to the speaker) versus second person (reference to the addressee). Reference to neither the speaker nor the addressee is commonly—though not always—considered to be the third value of the category, third person. This article is an overview of person indexation on the verb and in possessive constructions, interaction of the category of person with other categories such as number and moods, the issue of person hierarchies as reflected in the categories of clusivity and direct-inverse systems, and some topics in the pragmatics of person. The discussion includes some topics disregarded or less touched upon in other surveys of the category of person, such as a discussion of the person relationship to commands (imperative paradigms) or logophoricity. The main focus is on the morphology of person, and other aspects of personal reference are discussed with respect to how they are expressed or differentiated by morphological material. On the other hand, personal reference in grammar and lexicon show strong affinity, making it both difficult and unnecessary to separate independent personal pronouns from person affixes in a typological perspective. In this sense, person-related lexicon and inflectional morphology are treated together.
This paper gives an account of participial clauses in Agul (Lezgic, Nakh-Daghestanian), based on a sample of 858 headed noun-modifying clauses taken from two text corpora, one spoken and one written. Noun-modifying clauses in Agul do not show syntactic restrictions on what can be relativized, and hence they instantiate the type known as GNMCCs, or general noun-modifying clause constructions. As the text counts show, intransitive verbs are more frequent than transitives and experiencer verbs in participial clauses, and among intransitive verbs, locative statives with the roots ‘be’ and ‘stay, remain’ account for half of all the uses. The asymmetry between the different relativization targets is also significant. Among the core arguments, the intransitive subject (S) is the most frequent target, patient (P) occupies second place, and agent (A) is comparatively rare. The preference for S and, in general, for S and P over A also holds true for most other Nakh-Daghestanian languages for which comparable counts are available. At the same time, Agul stands apart from the other languages in its high ratio of non-core relativization, which accounts for 42% of all participial clauses. Addressees, arguments and adjuncts encoded with a locative case, as well as more general PLACE and TIME relativizations, show especially high frequency, outnumbering such arguments as experiencers, recipients, and predicative and adnominal possessors. Possible reasons for the high ratio of non-argument relativization are discussed in the paper.
This paper discusses two analyses of the Anaphor Agreement Effect (AAE, Rizzi 1990) in the light of novel data from Avar. By demonstrating that Avar anaphors trigger full, non-trivial agreement on the φ-probe, I argue that the Avar data instantiate a genuine exception to the AAE. I then compare two competing analyses of binding and the AAE: an account whereby anaphoric dependencies arise via the syntactic operation Agree (Murugesan 2018), and a theory deriving the inability of the anaphors’ φ-features to trigger full agreement from the presence of additional structural layers inside the anaphors that render the features inaccessible (Preminger 2019). I claim that the absence of the AAE in Avar supports the encapsulation analysis.
Background: The Aphasia Rapid Test (ART) is a screening test developed for fast speech/language assessment of people in the acute stroke period. This test was developed for French and English and was recently adapted for Portuguese and Italian. A standardised screening test of this kind is now greatly needed in clinics with Russian-speaking patients. To fill this gap, the ART was adapted for Russian.
Aims: The current study investigated whether the Russian ART meets all the psychometric standards, and whether it is suitable for detecting speech/language disorders and estimating their severity, as well as for the evaluation of improvement in the acute post-stroke period.
Methods & Procedure: First, we evaluated the validity, sensitivity, specificity, accuracy, test-retest reliability, inter-item consistency and inter-rater reliability of the test in a group of people with chronic speech/language disorders (N = 55) and in an age-matched control group of non-brain-damaged individuals (N = 50). Participants performed the Russian ART, and their linguistic status was confirmed by the Russian e-version of the Token Test. Second, to test the appropriateness of the Russian ART in the acute post-stroke period, a clinical group of such individuals (N = 43) performed the ART and the Token Test, as well as Vasserman’s scale, which is widely used in Russian clinics. Finally, 16 people in the acute stroke period performed the Russian ART twice to test whether it can detect early changes in an acute patient’s linguistic status.
Outcomes & Results: The results showed that the Russian ART can be considered a valid, sensitive, specific, and accurate screening tool with high test-retest reliability, inter-item consistency, and inter-rater reliability. In the acute post-stroke group, the correlation between the ART and the Token Test was high and significant; the correlation between Vasserman’s scale and the Russian ART was moderate, and no significant correlation was found between Vasserman’s scale and the Token Test. The Russian ART also allowed us to detect the improvement in speech/language status in the acute post-stroke period.
Conclusion: The study confirmed that the Russian ART meets all required standards to be suggested for usage in a Russian-speaking clinical population. The test proved suitable for detecting the presence and severity of speech/language disorders and for measuring improvement in the acute post-stroke period.
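For readers less familiar with the screening measures named in the abstract above, the standard definitions of sensitivity, specificity, and accuracy can be written out as a small sketch. The function name `screening_metrics` and the counts in the usage note are hypothetical illustrations and are not taken from the study.

```python
def screening_metrics(tp, fn, tn, fp):
    """Standard screening measures from confusion-matrix counts.

    tp: impaired participants flagged by the test (true positives)
    fn: impaired participants missed by the test (false negatives)
    tn: unimpaired participants passed by the test (true negatives)
    fp: unimpaired participants flagged by the test (false positives)
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both the clinical and the control group must be non-empty")
    return {
        # Share of impaired participants the test detects.
        "sensitivity": tp / (tp + fn),
        # Share of unimpaired participants the test correctly passes.
        "specificity": tn / (tn + fp),
        # Share of all participants classified correctly.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }
```

With invented counts such as `screening_metrics(tp=8, fn=2, tn=9, fp=1)`, sensitivity is 8 out of 10 and specificity is 9 out of 10; these numbers are for illustration only.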
Vossian Antonomasia is a prolific stylistic device, in use since antiquity. It can compress the introduction or description of a person or another named entity into a terse, poignant formulation and can best be explained by an example: When Norwegian world champion Magnus Carlsen is described as "the Mozart of chess", it is Vossian Antonomasia we are dealing with. The pattern is simple: A source (Mozart) is used to describe a target (Magnus Carlsen), the transfer of meaning is reached via a modifier ("of chess"). This phenomenon has been discussed before (as 'metaphorical antonomasia' or, with special focus on the source object, as 'paragons'), but no corpus-based approach has been undertaken as yet to explore its breadth and variety. We are looking into a full-text newspaper corpus (The New York Times, 1987–2007) and describe a new method for the automatic extraction of Vossian Antonomasia based on Wikidata entities. Our analysis offers new insights into the occurrence of popular paragons and their distribution.
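The source-modifier pattern described in the Vossian Antonomasia abstract can be sketched as a simple surface search. This is not the authors' extraction method: the regular expression, the function `find_candidates`, and the tiny hand-made `known_sources` set (standing in for a lookup against Wikidata entities) are all assumptions made for illustration.

```python
import re

# "the <Capitalized name> of <one-word modifier>", e.g. "the Mozart of chess".
PATTERN = re.compile(r"\b[Tt]he ([A-Z][\w'-]*(?: [A-Z][\w'-]*)*) of (\w+)")


def find_candidates(text, known_sources):
    """Return (source, modifier) pairs whose source is a known person name."""
    hits = []
    for match in PATTERN.finditer(text):
        source, modifier = match.group(1), match.group(2)
        # The entity filter discards look-alikes such as "the University of ...".
        if source in known_sources:
            hits.append((source, modifier))
    return hits


# Invented example sentence and a hypothetical stand-in for an entity list.
sentence = 'He was called "the Mozart of chess" in a talk at the University of Oslo.'
known_sources = {"Mozart", "Michael Jordan"}
candidates = find_candidates(sentence, known_sources)
```

The surface pattern alone over-generates, which is why some filter on the source slot is needed; here the set lookup plays that role.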
This chapter presents an overview of the Northwest Caucasian (West Caucasian, Abkhaz-Adyghe) family.
Orthographic and morphological heterogeneity of historical texts in pre-modern Slavic causes many difficulties in pos- and morphological tagging. Existing approaches to these tasks show state-of-the-art results without normalization, but they are still very sensitive to the properties of training data such as genre and origin. In this paper, we investigate to what extent the heterogeneity and size of the training corpus influence the quality of pos tagging and morphological analysis. We observe that UDPipe trained on different parts of the Middle Russian corpus demonstrates a boost in accuracy when using less training data. We resolve this paradox by analyzing the distribution of pos-tags and short words across subcorpora.