
CEU: Feedback from a satisfied customer

Fri 25 April, 2014

"CEU MicroData is a team of lecturers, PhD students and researchers at the Central European University. We analyse company and personal data to gain a better understanding of economic growth, international trade, company networks, political connections and corruption.

With our web application at kozbeszerzes.ceu.eu, we would like to bring more transparency to the spending of public money. Based on Open Data principles, we created an easy-to-search, browsable and downloadable interface for the announcements of the (also open) Közbeszerzési Értesítő [Public Procurement Bulletin].

The archive of Közbeszerzési Értesítő, consisting of over 140,000 text bulletins published between 1997 and 2013, contained the information we needed (e.g. the announcer, the winner and the amount of each procurement) in a semi-structured form only. Moreover, the format of the bulletins kept changing from year to year. Being only a small research group, we could not process all the data ourselves. We were looking for a company that could build a structured database from text files and meet all data-quality criteria within a short time span. That is why we chose Precognox.

Precognox was a pleasure to work with from the very start, during the specification of the task and the signing of the contract. Following an in-person needs analysis, we jointly designed the schema of the future database, the data-quality measure and the method for monitoring it. Once the documents had been converted into a uniform schema, we asked them to validate a number of data fields, i.e. amounts, dates, company names and addresses.

The product, delivered on deadline, exceeded our expectations. The accuracy of each data field was measured at between 89 and 95 per cent, i.e. the rate of agreement between the values entered by our researchers and those found and validated by Precognox's algorithm, in a random sample of a hundred items. We had never thought such accuracy was possible with automatic processing alone.

They answered our follow-up questions quickly and flexibly, like a truly agile team. We would be happy to rely on their services in the future, too."

Miklós Koren: Lecturer at the Central European University and researcher at the Centre for Economic and Regional Studies of the Hungarian Academy of Sciences. His research field is economic growth and international trade. His scientific findings have been published in leading international journals. His research on knowledge transfer is supported by the European Research Council. He co-edits defacto.io, an information site and blog on economics.
