Intelligent search solution to support agricultural research work

Case study of the search solution developed for the MATE Kaposvár Campus and the Krizevci Agricultural College

About the project

The comprehensive project, running from 2020 to 2022 with the support of INTERREG, aims to improve the agility and resilience of existing and potential food chains in the Croatian-Hungarian border area by developing and operating a complex, multi-level training system. The planned training system covers the entire vertical, from short-cycle training to bachelor’s and master’s degree programs.

One of the main parts of the project is the development of an agricultural food chain search application. Its purpose is to support the research work at MATE’s Kaposvár Campus by collecting the content of numerous relevant online agriculture-related sources (websites) and making it searchable.

The text analytics product range developed by Precognox fully covers the tasks arising during the project, including data collection and validation and the implementation of an advanced, intelligent search.

the building of MATE Kaposvár Campus (source: LinkedIn – MATE Kaposvár Campus)

First step: data collection as the basis of the process

To carry out successful research, the text content of related websites containing relevant and reliable information must first be collected. Our TAS Data Collector solution currently collects and processes the content of 5 online sources, such as

In parallel, an education and training process is taking place, through which the researchers working at MATE’s Kaposvár Campus become able to extend data collection to another 60-70 sources. Precognox provides the continuously updated and expanded wiki for TAS Data Collector as well as technical support for users. Thus, although it requires a greater investment of time from the customer, the data collection process can be expanded and maintained at much lower cost.

How is data collection implemented?

During data collection, TAS Data Collector extracts the essence of each news item. This is followed by manual, low-level processing (regex-based “country extraction”, data cleaning based on samples). In the course of this, certain metadata and attributes (source, URL, date of collection, date of creation, category, title, etc.) are also collected.
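The low-level processing step can be illustrated with a minimal Python sketch. It is not the TAS Data Collector implementation: the country list, the `Article` record and the function names are hypothetical, chosen only to show how a regex-based country extraction and simple cleaning could sit next to the metadata fields listed above.

```python
import re
from dataclasses import dataclass, field
from datetime import date

# Hypothetical country list; a real setup would use a much fuller gazetteer.
COUNTRIES = ["Hungary", "Croatia", "Austria", "Slovenia"]
CANONICAL = {name.lower(): name for name in COUNTRIES}
COUNTRY_RE = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in COUNTRIES) + r")\b",
    re.IGNORECASE,
)


@dataclass
class Article:
    """Illustrative record holding the metadata named in the text."""
    source: str
    url: str
    title: str
    category: str
    created: date      # date of creation
    collected: date    # date of collection
    text: str
    countries: list = field(default_factory=list)


def clean_text(raw: str) -> str:
    """Collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", raw).strip()


def extract_countries(text: str) -> list:
    """Return distinct country names in order of first appearance."""
    found = []
    for match in COUNTRY_RE.finditer(text):
        name = CANONICAL[match.group(1).lower()]
        if name not in found:
            found.append(name)
    return found


def build_article(source, url, title, category, created, raw_text) -> Article:
    """Clean the text, extract countries and stamp the collection date."""
    text = clean_text(raw_text)
    return Article(source, url, title, category, created,
                   date.today(), text, extract_countries(text))
```

A call such as `build_article("example-source", "https://example.org/a", "Maize exports", "news", date(2021, 5, 3), raw_html_text)` would return one cleaned record ready for indexing. The URL and field values here are placeholders.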

data collection process can be continuously monitored on TAS Data Collector interface

Intelligent search

Of course, data collection is only the starting point of the project. The ultimate goal is for the customer to be able to search the collected content easily and efficiently. Users can run these queries in TAS Enterprise Search.
The user interface offers functions familiar from web search engines, such as advanced search, narrowing (filtering) and sorting options. By clicking on a result, the user can reach the original content in seconds.

Text analytics solutions in one place

TAS Platform contains all the solutions and services that customers need to solve a given text analytics task. The solutions applied in the MATE Kaposvár Campus project do not cover the entire product range available within TAS Platform, which may also include additional services such as TAS Tagger, TAS Alarmlist or TAS News Reader.

Advanced search process

TAS Thesaurus Manager is available to the customer within TAS Platform. It serves as a kind of enhanced thesaurus in which the following word relations can be set:

  • synonym
  • correct form
  • typo
  • narrower term
  • broader term
  • no connection (stop list)

With the help of these relations, the relevant content can be found easily even if a typo was made during the search.
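How such relations can help a search is sketched below in Python. This is an assumption-laden illustration, not the TAS Thesaurus Manager API: the `Thesaurus` class and its methods are hypothetical, queries are split on whitespace only, and the "no connection (stop list)" relation is interpreted here as a list of words to drop from the query.

```python
class Thesaurus:
    """Hypothetical in-memory thesaurus used to expand a search query."""

    def __init__(self):
        self.corrections = {}  # typo or variant -> correct form
        self.synonyms = {}     # term -> set of synonyms (symmetric)
        self.narrower = {}     # broader term -> set of narrower terms
        self.stop = set()      # words ignored in queries (assumed meaning)

    def add_typo(self, wrong, right):
        self.corrections[wrong.lower()] = right.lower()

    def add_synonym(self, a, b):
        a, b = a.lower(), b.lower()
        self.synonyms.setdefault(a, set()).add(b)
        self.synonyms.setdefault(b, set()).add(a)

    def add_narrower(self, broader, narrower):
        self.narrower.setdefault(broader.lower(), set()).add(narrower.lower())

    def add_stop(self, word):
        self.stop.add(word.lower())

    def expand(self, query):
        """Correct typos, drop stop words, then add synonyms and narrower terms."""
        terms = []
        for word in query.lower().split():
            word = self.corrections.get(word, word)
            if word in self.stop:
                continue
            related = sorted(self.synonyms.get(word, ())) \
                + sorted(self.narrower.get(word, ()))
            for term in [word] + related:
                if term not in terms:
                    terms.append(term)
        return terms


thesaurus = Thesaurus()
thesaurus.add_typo("maiz", "maize")
thesaurus.add_synonym("maize", "corn")
thesaurus.add_narrower("cereal", "maize")
thesaurus.add_stop("the")

print(thesaurus.expand("the maiz"))  # ['maize', 'corn']
print(thesaurus.expand("cereal"))    # ['cereal', 'maize']
```

The expansion is deliberately one level deep: a broader term pulls in its narrower terms, but not their synonyms in turn. A production thesaurus would also need to handle multi-word terms.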

the user interface of TAS Enterprise Search

Another service within TAS Platform is also available to the customer: TAS Search Log Analyzer, which can be used to examine the search terms previously used in TAS Enterprise Search. The following can be analyzed on its clear interface:

  • search terms: list of search terms
  • top searches: most popular search terms
  • the number of searches carried out in the given period: the list of terms in descending order of the number of searches
  • filtering options: based on words, phrases, number of searches and results, users, indexes, trends, date range (period)
  • frequent searches with no results
  • positive and negative search trends

the most important information about search terms can be reviewed on TAS Search Log Analyzer interface

word connections created in TAS Thesaurus Manager make the search process more efficient
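A few of the analyses listed above (top searches in a period, frequent searches with no results, positive and negative trends) can be sketched in Python as follows. The `SearchEvent` record and the function names are hypothetical and say nothing about how TAS Search Log Analyzer stores or computes its data.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class SearchEvent:
    """Hypothetical log entry: one search and the number of hits it returned."""
    term: str
    day: date
    result_count: int


def in_period(events, start, end):
    """Keep only events whose date falls within [start, end]."""
    return [e for e in events if start <= e.day <= end]


def top_searches(events, start, end, n=10):
    """Most frequent search terms in the period, most popular first."""
    counts = Counter(e.term.lower() for e in in_period(events, start, end))
    return counts.most_common(n)


def zero_result_searches(events, start, end):
    """Terms that returned no results in the period, most frequent first."""
    counts = Counter(
        e.term.lower()
        for e in in_period(events, start, end)
        if e.result_count == 0
    )
    return counts.most_common()


def trend(events, term, previous, current):
    """Change in search count between two (start, end) periods.

    A positive value means a rising trend, a negative value a falling one.
    """
    term = term.lower()

    def count(period):
        return sum(1 for e in in_period(events, *period) if e.term.lower() == term)

    return count(current) - count(previous)
```

Search terms are lower-cased before counting so that "Maize" and "maize" are treated as the same query. Whether that is desirable depends on the index settings.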

Search application for agriculture

To make the results of the project (the intelligent agricultural search solution) more widely accessible, the search interface will be available not only to the customer within the TAS framework, but also to the general public through a dedicated website. This is expected to be realized in the second half of this year.

In the service of research work

The agricultural sector is one of the most dynamically developing sectors, largely thanks to research in the field. The data collection and search solution implemented in this project provides substantial help to experts, so that they can work even more efficiently on agricultural R&D projects.