In multi-label classification, multiple target variables are modelled and predicted together for each instance, as opposed to the traditional learning problem where a single target variable (class) is predicted. The main challenge is detecting and modelling dependencies among labels while remaining computationally tractable on large problems.
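A minimal sketch of why label dependencies matter, using invented toy data (the label vectors and function names below are hypothetical, chosen only for illustration). Predicting each label separately by majority vote can produce a combination that is less likely than the most frequent joint combination:

```python
from collections import Counter

# Hypothetical label vectors (two labels each) observed for similar inputs.
observed = [(1, 0)] * 4 + [(0, 1)] * 3 + [(0, 0)] * 3


def independent_prediction(label_vectors):
    """Predict each label on its own by majority vote, ignoring dependencies."""
    n = len(label_vectors)
    n_labels = len(label_vectors[0])
    return tuple(
        int(2 * sum(v[j] for v in label_vectors) > n) for j in range(n_labels)
    )


def joint_prediction(label_vectors):
    """Predict the most frequent label combination, treating labels jointly."""
    return Counter(label_vectors).most_common(1)[0][0]


independent = independent_prediction(observed)  # (0, 0), seen in 3 of 10 cases
joint = joint_prediction(observed)              # (1, 0), seen in 4 of 10 cases
print(independent, joint)
```

Enumerating every label combination, as the joint prediction does here, grows exponentially with the number of labels, which is where the tractability concern comes from.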
Multi-label learning is relevant to many domains, for example text categorisation (a document belongs to multiple categories), scene classification (each image is associated with multiple concepts or objects) as well as video and other media, medical classification, and applications in microbiology.
The general case of multi-output prediction includes the regression case; it is a particular type of structured output prediction. It has close connections with other topics (which are also research interests), including:
probabilistic graphical models
time series forecasting
models for sequence learning
Data Stream Classification
Many real-world applications are found in the context of data streams, where data instances arrive continuously in a theoretically infinite stream, for example in sensor networks, online social media, news feeds, and large deployments of e-mail.
In this context, methods must be able to process large volumes of data quickly and learn and make predictions in real time, as well as detect and adapt to concept drift.
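The test-then-train loop with drift adaptation can be sketched as follows. This is only an illustration under simplifying assumptions: the learner is a majority-class predictor, the drift check is a plain fixed-window error-rate threshold (a stand-in, not a published drift detector), and the stream, window size, and threshold are invented for the example.

```python
from collections import deque


class MajorityClass:
    """Incremental learner that predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = {}

    def predict(self):
        if not self.counts:
            return None  # nothing learned yet
        return max(self.counts, key=self.counts.get)

    def learn(self, y):
        self.counts[y] = self.counts.get(y, 0) + 1


def prequential(stream, window=20, threshold=0.5):
    """Test on each instance first, then train on it; reset the model on drift."""
    model = MajorityClass()
    recent_errors = deque(maxlen=window)
    correct = 0
    drifts = []
    for t, y in enumerate(stream):
        error = int(model.predict() != y)  # test first
        correct += 1 - error
        recent_errors.append(error)
        # Assumed drift rule: recent error rate above the threshold.
        if len(recent_errors) == window and sum(recent_errors) / window > threshold:
            model = MajorityClass()  # adapt by discarding the old concept
            recent_errors.clear()
            drifts.append(t)
        model.learn(y)  # then train
    return correct / len(stream), drifts


# Hypothetical stream whose concept flips halfway through.
stream = [0] * 100 + [1] * 100
accuracy, drifts = prequential(stream)
print(accuracy, drifts)
```

Each instance is processed once and in constant time, so memory use does not grow with the length of the stream. Without the reset, this learner would keep predicting the old majority class for the whole second half.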
Some applications involving sensor data from real-world deployments that I have worked on:
Learning to predict a traveller’s route and destination
At Aalto University I was involved in the Traffic Sense - Energy Efficient Traffic with Crowdsensing project, working on route recognition and prediction. Given only a week or so of location data from a mobile phone, it was possible to make reasonably accurate predictions about the traveller’s route and future destination. See the Demo Animation (the captions explain what is going on).
Tracking on very low-power sensor motes
In the Comonsens project in Spain I worked on formulating and implementing a distributed particle filter on very low-power motes for target tracking. For more information, see the
Modelling tree growth in Scots pine
In the project MultiTree - Multi-scale modelling of tree growth, forest ecosystems, and their environmental control, I worked with forestry scientists to model intra-annual growth of pine trees in Finland and France using machine learning methods.