
Automatic Detection of Emotions in Text

Wed 30 September, 2015

Our research is the first attempt to offer a solution for detecting emotions in Hungarian texts. Emotion analysis is most popular in the behavioral sciences and psychology, but in recent years it has also started to spread in the field of NLP (Natural Language Processing).

Plutchik wheel of human emotions

The background

It is important to distinguish emotion analysis from the widely used sentiment analysis. Sentiment analysis typically classifies a text as positive, negative or neutral, whereas emotion analysis aims to extract emotional states from a given text. Detecting emotions is extremely hard: they come and go quickly, and they are usually associated with extra-linguistic cues such as facial expressions and tone.

In the Internet era, analyzing and extracting emotions from texts is becoming more and more important, not just because it is a fascinating challenge for NLP experts but also because it is increasingly important in the economy, for example when we would like to measure customer satisfaction.

Our research group hypothesizes that words with emotional meaning or content should be the best markers of the speaker’s/writer’s emotional intent, so we have constructed a Hungarian Emotion Dictionary. The dictionary consists of sub-dictionaries, each based on Ekman's six basic emotions, namely sadness, anger, disgust, fear, surprise and joy. Our team manually annotated several blog posts and their comments to test the efficiency of using our dictionaries for emotion analysis.
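The dictionary-based approach can be illustrated with a minimal sketch. The lexicon below is a hypothetical English toy example with a handful of words per emotion; it is not the Hungarian Emotion Dictionary, and the function names are ours. The idea is simply to count how many tokens of a text fall into each of the six sub-dictionaries.

```python
import re

# Hypothetical toy lexicon for illustration only; the real Hungarian Emotion
# Dictionary is not reproduced here.
EMOTION_LEXICON = {
    "sadness": {"sad", "grief", "cry"},
    "anger": {"angry", "furious", "outrage"},
    "disgust": {"disgusting", "gross"},
    "fear": {"afraid", "scared", "panic"},
    "surprise": {"surprised", "unexpected"},
    "joy": {"happy", "glad", "delighted"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def count_emotions(text, lexicon=EMOTION_LEXICON):
    """Count, for each emotion, how many tokens appear in its sub-dictionary."""
    counts = {emotion: 0 for emotion in lexicon}
    for token in tokenize(text):
        for emotion, words in lexicon.items():
            if token in words:
                counts[emotion] += 1
    return counts
```

A real system for Hungarian would very likely need stemming or lemmatization before the lookup, since Hungarian is a heavily inflected language; the sketch above matches surface forms only.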

How can we use emotion analysis?

During the local elections in 2014, we analyzed Hungarian tweets related to the mayoral candidates in Budapest. We found that anger is the best predictor of winning! We were surprised, since most studies (like this classic from Bollen et al.) found the number of mentions and/or positive sentiment to be the best predictors of success. The number of Hungarian Twitter users is very small, and less fine-grained solutions like sentiment analysis or the frequency of mentions could give us a misleading picture, since most of the tweets were neutral and mentions of small-party candidates were very rare. So we analyzed the tweets with our emotion dictionaries and gave each candidate an emotion score that reflects the relative proportion of each emotion in the tweets mentioning him or her. Of the six basic emotions, it was anger whose scores, measured by mean squared error, were in accord with the results of the opinion polls and, later, with the final outcome of the election.
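The scoring idea can be sketched as follows. This is one possible reading, not a description of the actual pipeline: the text above does not say how the emotion scores were compared with the polls, so rescaling each emotion across candidates before computing the mean squared error is an assumption, and all function names are hypothetical.

```python
def emotion_shares(emotion_counts):
    """Turn raw per-emotion counts into relative proportions that sum to 1."""
    total = sum(emotion_counts.values())
    if total == 0:
        return {emotion: 0.0 for emotion in emotion_counts}
    return {emotion: n / total for emotion, n in emotion_counts.items()}


def mean_squared_error(predicted, observed):
    """Average of the squared differences between two equally long sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


def rank_emotions_by_mse(candidate_shares, poll_shares):
    """Rank emotions by how closely their distribution over candidates
    matches the poll shares (lowest mean squared error first).

    candidate_shares: {candidate: {emotion: share}}
    poll_shares:      {candidate: share of the vote in the polls}
    """
    candidates = sorted(poll_shares)
    emotions = sorted({e for shares in candidate_shares.values() for e in shares})
    observed = [poll_shares[c] for c in candidates]
    ranking = []
    for emotion in emotions:
        raw = [candidate_shares[c].get(emotion, 0.0) for c in candidates]
        total = sum(raw)
        # Assumption: rescale across candidates so the values are comparable
        # to poll shares, which also sum to 1.
        scaled = [r / total if total else 0.0 for r in raw]
        ranking.append((emotion, mean_squared_error(scaled, observed)))
    return sorted(ranking, key=lambda pair: pair[1])
```

Given per-candidate counts from the dictionaries, `emotion_shares` produces the emotion score for each candidate, and `rank_emotions_by_mse` shows which emotion tracks the polls most closely.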

The Economist’s R-word index is one of the best-known indicators of the economy. It is very simple, as it tracks the frequency of the term “recession” in the Wall Street Journal and in the Financial Times, yet it is mostly accurate. We created a corpus, that is, a collection of articles from various news sites and blogs. We found no correlation between GDP and the frequency of “recession” and its Hungarian synonyms. However, the levels of fear and anger usually increase before GDP starts to decline.
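A frequency index in the spirit of the R-word index can be sketched in a few lines. Counting the articles per period that mention a term, and the function name itself, are our assumptions for illustration; this is not The Economist's exact methodology.

```python
from collections import Counter


def term_frequency_by_period(articles, terms):
    """Count, per period, the articles that mention at least one of the terms.

    articles: iterable of (period, text) pairs, e.g. ("2008-Q3", "...")
    terms:    words to look for, matched case-insensitively as substrings
    """
    lowered_terms = [term.lower() for term in terms]
    counts = Counter()
    for period, text in articles:
        body = text.lower()
        # Adding 0 still registers the period, so quiet periods show up as 0.
        counts[period] += int(any(term in body for term in lowered_terms))
    return dict(counts)
```

The resulting series can then be compared with GDP figures for the same periods; the same loop works for emotion scores if the substring test is replaced by a dictionary lookup.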


A close-up from Silicon Valley

Mon 21 September, 2015

The Kaposvar HQ of the Precognox company hosted a new meetup on Wednesday afternoon. Our guest was Dénes Finduk, the editor of http://siliconvalleylife.blog.hu and Data Engineer Lead at Addepar, a company based in Silicon Valley. We learned from him how he completed his master's degree at the University of Edinburgh and how he later started his career in the US. During the interactive meetup Dénes was bombarded with questions, and he answered them in his friendly and open manner. His account gave us a glimpse of his everyday life in California, which mostly consists of work, work and more work. Still, we saw from his photos that there is sometimes room for fun activities in the lives of our colleagues overseas as well. A long, informal discussion and the obligatory pizza dinner closed the meetup in the evening.


We have a new NLP member

Tue 7 July, 2015

Our former trainee, Kitti Balogh, joins our research team as a full-time member.

Kitti has just graduated from Eötvös University with an MSc degree in Statistics, and her thesis, "The application of latent Dirichlet allocation for Social Sciences", received the prize for the best Survey Statistics Master's Thesis of the academic year 2014/15.

Congratulations, Kitti, we are so proud of you!

New KConnect search services give healthcare the very best in medical information

Wed 15 April, 2015

KConnect has launched its official website, www.kconnect.eu, and is beginning the commercialisation of new multilingual medical text analysis and search services. Precognox is a proud partner of the team.

The new state-of-the-art medical information search services have the ability to empower healthcare and life science professionals and the public alike. The search services can provide the fastest and most relevant medical support information available from which users can make the best-informed decisions. 

The intelligent (semantic) search services can incorporate both published medical literature and in-house medical information sources (such as electronic health records or health registries).

"The quality of the search performance can help clinicians and researchers remain at the forefront of their profession. By having the right knowledge about best practices and treatments at their fingertips, clinicians can ensure the very best in patient outcomes and a healthier community," says Professor Robert Stewart, Department of Psychological Medicine, King's College London.

Intelligent search for better user experience

The search services have been made 'intelligent' by understanding the meaning/context/intent of user queries. The very best in medical information is made more findable by the fact that the semantic search is not just based on query keywords but also on related concepts and contexts.

The user search box has the ability to understand keyword connotations, related concepts and their relationships within a medical context. Such machine comprehension is also employed in the 'reading' (indexing, classifying and annotating) of medical content so that the most relevant information can be found even if a user's chosen keyword happens to be absent within the text.
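The idea of finding a document even when the query keyword is absent from its text can be illustrated with a minimal sketch. The concept map, the concept identifiers and the function names below are hypothetical; they do not describe the KConnect implementation, which would rely on proper medical terminology resources rather than a hand-written table.

```python
# Hypothetical toy concept map: several surface forms point to one concept ID.
CONCEPTS = {
    "heart attack": "C_MI",
    "myocardial infarction": "C_MI",
    "high blood pressure": "C_HTN",
    "hypertension": "C_HTN",
}


def annotate(text, concepts=CONCEPTS):
    """Return the set of concept IDs whose surface forms occur in the text."""
    lowered = text.lower()
    return {cid for phrase, cid in concepts.items() if phrase in lowered}


def build_index(documents, concepts=CONCEPTS):
    """Map each concept ID to the IDs of the documents annotated with it."""
    index = {}
    for doc_id, text in documents.items():
        for cid in annotate(text, concepts):
            index.setdefault(cid, set()).add(doc_id)
    return index


def search(query, index, concepts=CONCEPTS):
    """Annotate the query with concepts and return the matching document IDs."""
    hits = set()
    for cid in annotate(query, concepts):
        hits |= index.get(cid, set())
    return sorted(hits)
```

Because both the documents and the query are mapped to the same concept IDs, a query for "heart attack" can retrieve a document that only mentions "myocardial infarction".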


Search global medical information in any language

The accurate language mapping of key medical concepts allows users to search in their own language (currently there are several European languages available with more to follow). The addition of machine translation means that information can be provided either in English or the source's original language.

Building blocks for tailored medical services

Individually created components and toolkits mean that an organisation can tailor its search-driven medical solutions to its own requirements. Several tailoring options are available, including information sources, access (cloud or local installation), language, security, functionality (alerts, recommendations and social search) and whether the created solution is standalone or embedded.

Partnership opportunities

Due to the expected demand for its services, KConnect is looking to extend its Professional Service Community with new partners who can help with the quicker and wider adoption of its services.

The members of the KConnect Consortium are:

- Vienna University of Technology (Austria);
- Findwise AB (Sweden);
- Precognox Kft (Hungary);
- Ontotext AD (Bulgaria);
- Trip Database Ltd (UK);
- Health on the Net Foundation (Switzerland);
- Qulturum, Region Jönköping County (Sweden);
- King's College London (UK);
- University of Sheffield (UK) - GATE;
- Charles University, Prague (Czech Republic).


Digiwhist project

Tue 14 April, 2015

As a partner of the Corruption Research Centre Budapest, Precognox is participating in the DIGIWHIST Horizon 2020 project. DIGIWHIST aims to build a platform that can analyze procurement data from 35 European countries in various languages. A whistleblower platform will also be implemented during the project.

The project is led by the University of Cambridge. The other members of the consortium are:

- ERCAS (Hertie School of Governance);
- Corruption Research Centre Budapest;
- Open Knowledge Foundation Deutschland.

