And the winner is….!!!

DataScouting is happy to announce Alexander D’yakonov was the winner of our Kaggle competition on Greek Media Monitoring Multilabel Classification. Participants were asked to develop an automated ann...

Automated topics identification in media monitoring

A competition on Media Monitoring Multilabel Classification. The value of media monitoring lies in selecting, categorising and delivering the right content to clients in real time. As quality and time...

Aggregate knowledge: media intelligence platforms

The following is an interview with DataScouting’s General Manager, Stavros Vologiannidis (SV), conducted by the FIBEP Marketing and Communications Commission. The full interview can be found here. MarComm: ...

PaladorScheduler

PaladorScheduler: a job scheduler for large-scale data processing (more info at datascouting.com). DataScouting PaladorScheduler is a software ecosystem for scheduling, distributing, processing...

Easy server virtualization using VirtualBox

There are several commercial and open source virtualization solutions, suited to organizations ranging from small to large enterprises. In this post I will describe the process of installing a Debian-based headless server th...

Highlighting Annotations in SolrJ

Recently at work, we had to use the Solr indexer when creating a RESTful API in Java using the JAX-RS specification. Solr provides wrappers around its API calls for a variety of programming languages...

Amazon’s Mechanical Turk

In other words, how to rent an army of slaves on demand. Quoting from Amazon Web Services (emphasis mine): Amazon Mechanical Turk is a marketplace for work that requires human intelligence. The Mecha...

Trac and SVN quick howto on Debian Linux

In the following, I will present a mini guide to setting up Trac 0.10.3 and SVN services on Debian Linux stable. I needed per-project authentication in both Trac and SVN. I just finished it, seems ...

Text Analysis inside Lucene

Lucene (http://lucene.apache.org) is a well-known Information Retrieval (IR) library, implemented in Java, which allows you to add powerful indexing and searching capabilities to your application. ...