"CEU MicroData is a team consisting of lecturers, PhD students and researchers of the Central European University. We analyse company and personal data for a better understanding of economical growth, international trade, company network, political connections and corruption.
Using our web application at kozbeszerzes.ceu.hu, we would like to add more transparency to the spending of public money. Based on Open Data principles, we created an easy-to-search, easy-to-browse and downloadable interface for the announcements of the - also open - Közbeszerzési Értesítő [Public Procurement Bulletin].
The archive of Közbeszerzési Értesítő - consisting of over 140,000 text bulletins published between 1997 and 2013 - contained the information we needed (e.g. the announcer, the winner and the amount of each procurement) only in a semi-structured form. Moreover, the format of the bulletins kept changing from year to year. Being only a small research group, we could not process all the data ourselves. We were looking for a company that could build a structured database from the text files and meet all data-quality criteria within a short time span. That is why we chose Precognox.
Precognox was pleasant to work with from the start, during the specification of the task and the signing of the contract. Following an in-person needs analysis, we jointly designed the schema of the future database and agreed on the target value for data quality and the method of monitoring it. After the documents had been converted into a uniform schema, we asked them to validate a number of data fields, i.e. amounts, dates, company names and addresses.
The product - delivered on deadline - exceeded our expectations. The accuracy of each data field was measured at between 89 and 95 per cent, i.e. the rate of agreement, in a random sample of a hundred items, between the values entered by our researchers and those found and validated by Precognox's algorithm. We had never thought such accuracy was possible with automatic processing alone.
They answered our follow-up questions quickly and flexibly, like a truly agile team. We would be happy to rely on their services in the future, too."
Miklós Koren: Teaches at the Central European University and is a researcher at the Centre for Economic and Regional Studies of the Hungarian Academy of Sciences. His research fields are economic growth and international trade. His findings have been published in leading international journals. His research on knowledge transfer is supported by the European Research Council. He co-edits defacto.io, an information site and blog on economics.
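The accuracy figure mentioned in the testimonial above is a per-field agreement rate between hand-entered and automatically extracted values on a sample of records. A minimal Python sketch of such a check could look like the following. The function names, the record layout (one dict per bulletin) and the whitespace and case normalisation are assumptions made for illustration; they are not taken from the actual validation procedure.

```python
from typing import Dict, List, Optional, Sequence

Record = Dict[str, Optional[str]]


def normalise(value: Optional[str]) -> str:
    """Collapse whitespace and ignore case; a missing value becomes ''."""
    if value is None:
        return ""
    return " ".join(str(value).split()).casefold()


def field_accuracy(
    manual: List[Record],
    extracted: List[Record],
    fields: Sequence[str],
) -> Dict[str, float]:
    """Share of sampled records on which both sources agree, per field.

    `manual` holds the values entered by hand, `extracted` the values
    produced automatically; both lists must describe the same records
    in the same order.
    """
    if len(manual) != len(extracted):
        raise ValueError("both samples must contain the same records")
    if not manual:
        raise ValueError("the sample must not be empty")

    accuracy = {}
    for field in fields:
        matches = sum(
            1
            for hand, auto in zip(manual, extracted)
            if normalise(hand.get(field)) == normalise(auto.get(field))
        )
        accuracy[field] = matches / len(manual)
    return accuracy
```

With a random sample of a hundred records, a returned value of 0.89 for a field would correspond to 89 per cent accuracy for that field.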
This week saw the launch of kozbeszerzes.ceu.hu, a portal that makes Hungarian procurement data searchable. The site has been developed by CEU Microdata, a research group at the Department of Economics of the Central European University, led by Miklos Koren and Adam Szeidl. The government releases procurement data as unstructured documents, so it is extremely hard to get useful information out of the texts. Precognox has developed a special text mining solution that extracts the relevant information from the text files and stores it in a structured database that researchers can analyze. Our company is very proud of the success of CEU Microdata. The site is simple and functional, and it is even robot friendly, so one can automatically harvest procurement data using kozbeszerzes.ceu.hu. CEU Microdata is a very active research group and they will produce other open datasets in the near future. Congratulations guys, we are eagerly waiting for your new results!
We are exhibiting at CeBIT this week together with other Hungarian companies in Hall 5, stand D06.
We had a pretty successful company get-together in November. We learnt that our team members are not only top-notch developers, but also natural-born laser tag fighters, world-class champions and go-kart drivers! We are devoted to the “work hard, play hard” philosophy, at least to its second part, so we started the day with laser tag. After the fights, the soldiers of Precognox visited a special restaurant that offers probably the biggest hamburgers in the country. The good soldiers won another fight, this time against the food, and they needed new challenges! Go-kart was the name of the new game, and we pushed hard during the race. Istvan won again, and we sadly concluded that the success of the company is now correlated with his winning streak…
Here are the results of the competition:
2. András 26,537
3. Karesz 26,692
4. Endre 26,792
5. Sz. Tamás 27,064
6. Gabi 28,022
7. Attila 28,066
8. T. Zoli 28,373
9. Péter 28,448
10. M. Tamás 28,696
11. V. Zoli 28,864
12. L. Tamás 30,995
opendata.hu launched last week. It is a community-driven data hub that aims to catalogue open datasets related to Hungary. Precognox provides the technical background to run the data hub, while K-Monitor operates and coordinates the community.
opendata.hu is an instance of CKAN - the world-leading open source data cataloguing software that has been used by international organizations (e.g. http://publicdata.eu/), national and local governments (e.g. http://data.gov.uk/) and community-driven open data portals (e.g. http://open-data.okfn.gr/). We do hope that open data brings benefits to business, society and government.
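Because CKAN ships with a JSON "Action API", a catalogue like this can in principle be queried from a script as well as from the browser. The Python sketch below builds a `package_search` request and reads dataset titles from the response. It assumes that the portal exposes the standard CKAN API under `/api/3/action/`, which we have not verified for any particular site; the helper names are ours, chosen for illustration.

```python
import json
from typing import Any, Dict, List
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url: str, query: str, rows: int = 10) -> str:
    """Build a CKAN package_search URL for a free-text dataset query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(payload: Dict[str, Any]) -> List[str]:
    """Extract dataset titles from a decoded package_search response.

    CKAN wraps results as {"success": ..., "result": {"results": [...]}};
    the dataset name is used when a dataset has no title.
    """
    if not payload.get("success"):
        raise RuntimeError("the CKAN API reported a failed request")
    datasets = payload["result"]["results"]
    return [d.get("title") or d.get("name", "") for d in datasets]


def search_datasets(base_url: str, query: str, rows: int = 10) -> List[str]:
    """Run the search over HTTP (needs network access to the portal)."""
    with urlopen(build_search_url(base_url, query, rows)) as response:
        return dataset_titles(json.load(response))
```

A call such as `search_datasets("http://opendata.hu", "budget")` would then return the titles of matching datasets, provided the assumption about the API path holds for that portal.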