Eureka Engine

1

Language detection - API

Detects the language of the text. Supported languages: Russian, English, Dutch, Swedish, German, Norwegian, Danish, French, Spanish, Italian, Portuguese, Romanian, Ukrainian, Belarusian, Tatar, Serbian, Bulgarian, Kazakh, Polish, Czech, Croatian, Bosnian, Slovenian, Finnish, Turkish, Armenian, Azerbaijani, Slovak, Hungarian, Estonian, Latvian, Lithuanian, Kirghiz, Mongolian, Chinese, Japanese, Korean, Swahili, Arabic, Farsi, Hindi, Uzbek, Vietnamese, Thai, Laotian, Khmer, Tibetan, Burmese, Philippine (Buhid, Tagbanwa, Hanunoo, Baybayin, Cebuano, Waray), Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, Sinhala, Saurashtra, Hebrew, Syriac (Aramaic).

Request parameters
{ "text": "Мама мыла раму. Папа смотрел телевизор." }
Answer
[ { "l": "RU", "n": "Russian", "p": 87 } ]
Field values

Field data returns an array of objects, one for each detected language. Object fields:
- l – two-letter language code
- n – name of the language
- p – probability in percent (0-100)
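
As an illustration of how such a response might be consumed, the following minimal Python sketch parses the sample answer and picks the entry with the highest p. The helper name best_language is hypothetical, and the sketch assumes the response body is the bare JSON array shown in the example; no request is sent.

```python
import json

# Sample response copied from the example in this section.
raw = '[ { "l": "RU", "n": "Russian", "p": 87 } ]'


def best_language(response_text):
    """Return (code, name, probability) for the most probable language.

    Hypothetical helper: assumes the body is a JSON array of objects
    with the fields l, n and p described in this section.
    """
    candidates = json.loads(response_text)
    if not candidates:
        return None  # nothing was detected
    top = max(candidates, key=lambda item: item["p"])
    return top["l"], top["n"], top["p"]


print(best_language(raw))
```

Returning None for an empty array keeps the caller from having to guess a default language.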
2

Sentiment Analysis - API

This module automatically extracts the sentiment of a given text object.
It takes Russian, English or Armenian text as input and generates output in JSON format.

Request parameters
{ "text": "Обсуждается вопрос по линии Роскосмоса о привлечении Китая в качестве основного партнера по проекту создания лунной научной станции", "listWordsOT" : "Китай" }
Answer
[ { "result": [{ "ton":"neut", "pos":53, "origin":"Китая", "normal":"Китай", "tonweight":1, "len":5 }], "avgNegMsg": 0.0, "avgPosMsg": 1.0, "ver": "1.0.4.200" } ]
Field values

Field result returns an array of objects, where each object describes a detected sentiment object (a word) and its properties. Object fields:
- ton – polarity of the message (tonality in BA, polarity in the literature): pos – positive, neg – negative, neut – neutral
- pos – position of the detected sentiment object in the text
- origin – original form of the detected sentiment object in the text
- normal – normalized form of the detected sentiment object
- tonweight – strength of the sentiment of the document
- len – length of the detected sentiment object in the text
In the sample answer, the remaining fields appear next to result rather than inside its objects:
- avgNegMsg – average negative sentiment of the sentence
- avgPosMsg – average positive sentiment of the sentence
- ver – version of the sentiment detection service
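
The pos and len fields can be used to locate each sentiment object in the submitted text. The sketch below does this for the sample request and answer. It assumes pos is a zero-based character offset, which is consistent with the sample but not stated by the service; the function name summarize and the LABELS mapping are illustrative only.

```python
import json

# Request text and response copied from the example in this section.
text = ("Обсуждается вопрос по линии Роскосмоса о привлечении Китая "
        "в качестве основного партнера по проекту создания лунной научной станции")
raw = ('[ { "result": [{ "ton":"neut", "pos":53, "origin":"Китая", '
       '"normal":"Китай", "tonweight":1, "len":5 }], '
       '"avgNegMsg": 0.0, "avgPosMsg": 1.0, "ver": "1.0.4.200" } ]')

# Readable names for the ton codes listed in this section.
LABELS = {"pos": "positive", "neg": "negative", "neut": "neutral"}


def summarize(response_text, source_text):
    """Return (normalized form, polarity, matched span) for each object."""
    rows = []
    for message in json.loads(response_text):
        for obj in message["result"]:
            start = obj["pos"]  # assumed zero-based character offset
            span = source_text[start:start + obj["len"]]
            rows.append((obj["normal"], LABELS.get(obj["ton"], obj["ton"]), span))
    return rows


for row in summarize(raw, text):
    print(row)
```

Comparing the extracted span with origin is a cheap way to confirm that offsets are being interpreted the same way on both sides.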
3

Autoclassification - API

This module automatically assigns a category to a text. It computes the probability of the text belonging to a specific topic.
It takes Russian text as input.

Request parameters
{ "text": "Несколько сотен сотрудников компании Apple секретно работают над созданием электромобиля, напоминающего минивэн, сообщает The Wall Street Journal со ссылкой на собственные источники. " }
Answer
[ { "Classes": [ { "i":9, "n":"Наука и технологи", "p":"38.93" }, { "i":0, "n":"Авто", "p":"18.18" }, { "i":1, "n":"Экономика и бизнес", "p":"11.53" } ] } ]
Field values

Field Classes returns an array of objects, where each object describes one candidate category. Object fields:
- i – identifier of the category to which the text belongs
- n – name of the category to which the text belongs
- p – probability, in percent, that the text belongs to the category (returned as a string in the sample answer)
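
Because p arrives as a string in the sample answer, it has to be converted before categories can be compared or filtered. The following sketch shows one way to do that. The sample is written here as valid JSON (a single object inside the outer array), which is an assumption about the intended shape; top_categories and its min_percent threshold are hypothetical.

```python
import json

# Sample response from this section, written as valid JSON.
raw = ('[ { "Classes": [ '
       '{ "i": 9, "n": "Наука и технологи", "p": "38.93" }, '
       '{ "i": 0, "n": "Авто", "p": "18.18" }, '
       '{ "i": 1, "n": "Экономика и бизнес", "p": "11.53" } ] } ]')


def top_categories(response_text, min_percent=15.0):
    """Return (id, name, percent) tuples at or above min_percent, best first."""
    classes = json.loads(response_text)[0]["Classes"]
    # p is a string in the sample, so convert it to a number first.
    scored = [(c["i"], c["n"], float(c["p"])) for c in classes]
    kept = [row for row in scored if row[2] >= min_percent]
    return sorted(kept, key=lambda row: row[2], reverse=True)


for category in top_categories(raw):
    print(category)
```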
4

Named Entity Recognizer (NER) - API

This module automatically detects named entities.
It takes Russian and English text as input. For Russian, named entities are grouped into five classes (individuals, legal entities, geographical objects, names of products and brands, and named events); for English, into three classes (proper names, legal entities and geographical objects).

Request parameters
{ "text": "Генеральная ассамблея ООН приняла 27 марта резолюцию о территориальной целостности Украины. Об этом сообщает Agence France-Presse." }
Answer
[ { "i": 22, "l": 3, "ner": "ORG", "v": "ООН" }, { "i": 83, "l": 7, "ner": "GEO", "v": "Украины" }, { "i": 109, "l": 20, "ner": "ORG", "v": "Agence France-Presse" } ]
Field values

Field data returns an array of objects, one per named entity. Object fields:
- i – position of the entity in the text
- l – length of the entity in characters
- ner – type of the entity (see the list below; the sample answer returns the codes in upper case)
- v – text of the entity
Types of entities:
- name – Proper Name
- org – Organisation
- geo – Geography
- prod – Product
- entr – Event
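
The i and l fields are enough to cut each entity out of the submitted text. The sketch below groups the entities from the sample answer by type and checks every span against v. It assumes i is a zero-based character offset (the sample is consistent with this) and lower-cases the ner value to match the type list; group_entities is a hypothetical helper.

```python
import json
from collections import defaultdict

# Request text and response copied from the example in this section.
text = ("Генеральная ассамблея ООН приняла 27 марта резолюцию о "
        "территориальной целостности Украины. Об этом сообщает Agence France-Presse.")
raw = ('[ { "i": 22, "l": 3, "ner": "ORG", "v": "ООН" }, '
       '{ "i": 83, "l": 7, "ner": "GEO", "v": "Украины" }, '
       '{ "i": 109, "l": 20, "ner": "ORG", "v": "Agence France-Presse" } ]')


def group_entities(response_text, source_text):
    """Group entity strings by lower-cased type, verifying offsets against v."""
    groups = defaultdict(list)
    for ent in json.loads(response_text):
        start = ent["i"]  # assumed zero-based character offset
        span = source_text[start:start + ent["l"]]
        if span != ent["v"]:
            raise ValueError(f"offset mismatch: {span!r} != {ent['v']!r}")
        groups[ent["ner"].lower()].append(span)
    return dict(groups)


print(group_entities(raw, text))
```

Raising on a mismatch surfaces encoding or offset problems early, instead of silently highlighting the wrong characters.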
5

Word normalisation - API

This module returns the base form of each word and its morphological characteristics. It takes Russian text as input.

Request parameters
{ "text": "Мама мыла раму. " }
Answer
[ { "o": "Мама", "n": "мама", "c": "Nominative", "m": "Singular", "g": "Feminine", "p": "Undefined", "v": "Undefined", "t": "Undefined", "r": "Undefined", "pos": "Noun" }, { "o": "мыла", "n": "мыть", "c": "Undefined", "m": "Singular", "g": "Feminine", "p": "Undefined", "v": "Active", "t": "Past", "r": "Transitive", "pos": "Verb" }, { "o": "раму", "n": "рама", "c": "Accusative", "m": "Singular", "g": "Feminine", "p": "Undefined", "v": "Undefined", "t": "Undefined", "r": "Undefined", "pos": "Noun" }, { "o": ".", "n": null, "c": "Undefined", "m": "Undefined", "g": "Undefined", "p": "Undefined", "v": "Undefined", "t": "Undefined", "r": "Undefined", "pos": "Other" } ]
Field values

Field data returns an array of objects, where each object describes one word and its properties. In the sample answer, attributes that do not apply to a word have the value Undefined. Object fields:
- o – original word
- n – normalized word form
- c – case (Nominative, Genitive, Dative, Accusative, Locative, Instrumental, Prepositional)
- m – number (Plural, Singular)
- g – gender (Masculine, Feminine, Neuter)
- p – person (First, Second, Third)
- v – voice (Active, Passive)
- t – tense (Future, Present, Past, FutureInThePast)
- r – transitivity (Transitive, Intransitive)
- pos – part of speech (see the list below)
List of parts of speech
- Other - Other (not determined)
- Article - Article
- Adj - Adjective
- AdjPron - Adjective Pronoun
- Adv - Adverb
- AdvPart - Adverbial Participle
- AdvPron - Adverbial Pronoun
- AuxVerb - Auxiliary Verb
- Conj - Conjunction
- Inf - Infinitive
- Intr - Interjection
- Noun - Noun
- Num - Number
- Part - Participle
- Pr - Particle
- PosPron - Possessive Pronoun
- Pred - Predicate
- Prep - Preposition
- Pron - Pronoun
- Punct - Punctuation
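
A typical use of this answer is to collect lemmas and keep only the attributes that carry a value. The sketch below does both on an abbreviated copy of the sample answer (only some fields are kept for brevity). The helper names lemmas and defined_features are hypothetical.

```python
import json

# Abbreviated copy of the sample answer: only a few fields are kept per token.
raw = ('[ {"o": "Мама", "n": "мама", "c": "Nominative", "t": "Undefined", "pos": "Noun"}, '
       '{"o": "мыла", "n": "мыть", "c": "Undefined", "t": "Past", "pos": "Verb"}, '
       '{"o": "раму", "n": "рама", "c": "Accusative", "t": "Undefined", "pos": "Noun"}, '
       '{"o": ".", "n": null, "c": "Undefined", "t": "Undefined", "pos": "Other"} ]')


def lemmas(response_text):
    """Return normalized forms, skipping tokens whose n is null."""
    return [tok["n"] for tok in json.loads(response_text) if tok["n"] is not None]


def defined_features(token):
    """Drop attributes that are null or reported as Undefined."""
    return {k: v for k, v in token.items() if v is not None and v != "Undefined"}


tokens = json.loads(raw)
print(lemmas(raw))
print(defined_features(tokens[1]))
```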
6

Morphological analysis - API

This module determines the part of speech, word forms and morphological attributes of each input word.
It takes Russian text as input.

Request parameters
{ "text": "Мама мыла раму. " }
Answer
{ "r": [{ "o":"Мама", "n":"мама", "s":"мам", "c":"Nominative", "m":"Singular", "g":"Feminine", "p":"Undefined", "v":"Undefined", "t":"Undefined", "r":"Undefined", "pos":"Noun", "si":0, "wf":["мама","мам","мамам","мамами","мамах","маме","мамой","мамою","маму","мамы"] },{ "o":"мыла", "n":"мыть", "s":null, "c":"Undefined", "m":"Singular", "g":"Feminine", "p":"Undefined", "v":"Active", "t":"Past", "r":"Transitive", "pos":"Verb", "si":5, "wf":["мыть","моем","моет","моете","моешь","мой","мойте","мою","моют","моющая","моющего","моющее","моющей","моющем","моющему","моющею","моющие","моющий","моющим","моющими","моющих","моющую","моя","мыв","мывшая","мывшего","мывшее","мывшей","мывшем","мывшему","мывшею","мывши","мывшие","мывший","мывшим","мывшими","мывших","мывшую","мыл","мыла","мыли","мыло","мыт","мыта","мытая","мыто","мытого","мытое","мытой","мытом","мытому","мытою","мытую","мыты","мытые","мытый","мытым","мытыми","мытых","мыло","мыл","мыла","мылам","мылами","мылах","мыле","мылом","мылу","мыло","мыл","мыла","мылам","мылами","мылах","мыле","мылом","мылу","мыло","мыл","мыла","мылам","мылами","мылах","мыле","мылом","мылу"] },{ "o":"раму", "n":"рама", "s":"рам", "c":"Accusative", "m":"Singular", "g":"Feminine", "p":"Undefined", "v":"Undefined", "t":"Undefined", "r":"Undefined", "pos":"Noun", "si":10, "wf":["рама","рам","рамам","рамами","рамах","раме","рамой","рамою","раму","рамы"] },{ "o":".", "n":".", "s":null, "c":"Undefined", "m":"Undefined", "g":"Undefined", "p":"Undefined", "v":"Undefined", "t":"Undefined", "r":"Undefined", "pos":"Punctuation", "si":15, "wf":null }] }
Field values

Field r returns an array of objects, one for each word under analysis. Object fields:
- o – original word form in the text
- n – normalized word form
- s – stem
- c – case
- m – number
- g – gender
- p – person
- v – voice
- t – tense
- r – transitivity
- pos – part of speech
- si – word position in the text
- wf – list of possible word forms
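
In the sample answer the wf list contains repeated forms and is null for punctuation, so a client may want to normalize it before use. The sketch below works on an abbreviated copy of the sample (the wf list is shortened) and removes duplicates while keeping the original order. unique_forms is a hypothetical helper.

```python
import json

# Abbreviated copy of the sample answer: fewer fields and a shortened wf list.
raw = ('{ "r": [ '
       '{"o": "мыла", "n": "мыть", "pos": "Verb", "si": 5, '
       '"wf": ["мыть", "мыл", "мыла", "мыло", "мыл", "мыла"]}, '
       '{"o": ".", "n": ".", "pos": "Punctuation", "si": 15, "wf": null} ] }')


def unique_forms(response_text):
    """Map each original word to its word forms without duplicates."""
    result = {}
    for tok in json.loads(response_text)["r"]:
        forms = tok["wf"] or []  # wf is null for punctuation in the sample
        # dict.fromkeys keeps the first occurrence of each form, in order.
        result[tok["o"]] = list(dict.fromkeys(forms))
    return result


print(unique_forms(raw))
```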