The School of Linguistics was founded in December 2014. Today, the School offers undergraduate and graduate programs in theoretical and computational linguistics. Linguistics as it is taught and researched at the School does not simply involve mastering foreign languages. Rather, it is the science of language and the methods of its modeling. Research groups in the School of Linguistics study typology, socio-linguistics and areal linguistics, corpus linguistics and lexicography, ancient languages and the history of languages. The School is also developing linguistic technologies and electronic resources: corpora, training simulators, dictionaries, thesauruses, and tools for digital storage and processing of written texts.
Afanasev I., Lyashevskaya O.
In bk.: Structuring Lexical Data and Digitising Dictionaries: Grammatical Theory, Language Processing and Databases in Historical Linguistics. Leiden; Boston: Brill, 2024. P. 13-35.
Abstract:
The Dictionary of the Russian Language of the 11th–17th Centuries, 29 volumes of which have been published since 1975 (the 30th volume is now in press), is based on all kinds of texts of the 11th–17th centuries, representing the medieval and early modern periods in the history of Russian culture. The Dictionary covers both original and translated Slavonic texts attested in Early East Slavic and Middle Russian manuscripts, as well as in early printed books of the 16th–17th centuries. Modern computational technologies make it possible to develop new forms of lexicographic resources. Importantly, such resources are open to continual additions and corrections, and multidirectional, user-defined search options enable users to quickly retrieve and analyze extensive linguistic content. The Database is built by members (students and teaching staff) of a workshop at the School of Linguistics at the Higher School of Economics and thus represents the results of both teaching and research activities. It provides scholars with a tool for complex queries, making it possible to search by (a) lemma, (b) grammatical information, (c) etymology, (d) chronological and historical period, (e) text or source, (f) phraseological unit, etc. The search directions are determined by the structure of the Dictionary entries and can be specified by users. The proposed structure of the Database meets the requirements of modern computational lexicography, as exemplified by a number of dictionaries and databases for modern languages. The digitized version of the Dictionary does not contain explicitly marked fields that could easily be converted into database fields. The project participants therefore had to write programs that process the Dictionary entries automatically and extract information from them. The Database is supplied with the online source index of the Dictionary, which contains the most comprehensive up-to-date list of sources on the history of the Old and Middle Russian language.
The online Index contains information on the origins of Slavonic translations in medieval and early modern Russia. The sources are classified according to their genre, origin (translated or original), and linguistic and pragmatic features. The Database opens further research perspectives in the chronological and genre stratification of the East Slavonic lexicon, the dynamics of lexical borrowing in different periods of the history of the Russian language, and the varieties of linguistic usage and lexical norms in East Slavonic written culture.
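The extraction and query steps described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the plain-text entry layout (headword, grammatical label, body with quotations followed by a source and a year in parentheses), the placeholder entries, and the names `parse_entry` and `search` are hypothetical and do not reproduce the Dictionary's actual format or the project's programs.

```python
import re
from dataclasses import dataclass, field

# Hypothetical entry layout (an assumption, not the Dictionary's real format):
#   HEADWORD, gram-label. Definition. Quotation (Source, YEAR). ...
ENTRY_RE = re.compile(
    r"^(?P<lemma>[^,]+),\s*(?P<gram>[^.]+)\.\s*(?P<body>.*)$", re.DOTALL
)
# A citation is assumed to look like "(Source name, 1499)".
CITATION_RE = re.compile(r"\((?P<source>[^,()]+),\s*(?P<year>\d{4})\)")


@dataclass
class Entry:
    lemma: str
    gram: str
    citations: list = field(default_factory=list)  # (source, year) pairs


def parse_entry(text):
    """Split one unmarked entry into fields; return None if it does not fit."""
    match = ENTRY_RE.match(text.strip())
    if match is None:
        return None
    citations = [
        (c.group("source").strip(), int(c.group("year")))
        for c in CITATION_RE.finditer(match.group("body"))
    ]
    return Entry(match.group("lemma").strip(), match.group("gram").strip(), citations)


def search(entries, gram=None, source=None, year_from=None, year_to=None):
    """Return entries matching every criterion the user actually specified."""
    wants_citation = any(v is not None for v in (source, year_from, year_to))
    results = []
    for entry in entries:
        if gram is not None and entry.gram != gram:
            continue
        # A single citation must satisfy all citation-level criteria at once.
        hits = [
            (s, y) for s, y in entry.citations
            if (source is None or s == source)
            and (year_from is None or y >= year_from)
            and (year_to is None or y <= year_to)
        ]
        if wants_citation and not hits:
            continue
        results.append(entry)
    return results


# Placeholder data, invented purely to exercise the sketch.
RAW = [
    "EXAMPLE-A, n. Placeholder definition. Placeholder quotation (Source A, 1076).",
    "EXAMPLE-B, v. Placeholder definition. Quotation one (Source B, 1499). "
    "Quotation two (Source A, 1650).",
]
ENTRIES = [e for e in (parse_entry(raw) for raw in RAW) if e is not None]

if __name__ == "__main__":
    for entry in search(ENTRIES, year_from=1600, year_to=1700):
        print(entry.lemma, entry.citations)
```

In this sketch the search criteria mirror the entry structure, so a query can combine a grammatical label with a source or a date range. A real implementation would need far more tolerant parsing than two regular expressions, since printed dictionary entries vary considerably.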