Multi-label classification

In machine learning, multi-label classification and the strongly related problem of multi-output classification are variants of the classification problem in which multiple target labels must be assigned to each instance. Multi-label classification should not be confused with multiclass classification, which is the problem of categorizing instances into precisely one of more than two classes. Formally, multi-label learning can be phrased as the problem of finding a model that maps inputs x to binary vectors y, rather than to scalar outputs as in the ordinary classification problem.

There are two main methods for tackling the multi-label classification problem:[1] problem transformation methods and algorithm adaptation methods. Problem transformation methods transform the multi-label problem into a set of binary classification problems, which can then be handled with ordinary binary classifiers. Algorithm adaptation methods adapt existing algorithms to perform multi-label classification directly. In other words, rather than converting the problem to a simpler one, they address it in its full form.


  • Problem transformation methods
  • Adapted algorithms for multi-label classification
  • Statistics and evaluation metrics
  • Implementations and datasets
  • References
  • Further reading

Problem transformation methods

Several problem transformation methods exist for multi-label classification; the baseline approach, called the binary relevance method,[2][1] amounts to independently training one binary classifier for each label. Given an unseen sample, the combined model then predicts all labels for this sample for which the respective classifiers predict a positive result. This method of dividing the task into multiple binary tasks has something in common with the one-vs.-all (OvA, or one-vs.-rest, OvR) method for multiclass classification. Note though that it is not the same method: in binary relevance we train one classifier for each label, not one classifier for each possible value for the label.
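
As a sketch of the idea (not part of the original article), binary relevance can be illustrated by training one independent classifier per label column; the trivial centroid-style learner below is a hypothetical stand-in for any off-the-shelf binary classifier:

```python
# Binary relevance sketch: one independent binary classifier per label.
# train_binary is a toy stand-in for any real binary learner.

def train_binary(X, y):
    """Threshold classifier: compare a sample's feature sum to the
    mean feature sums of the positive and negative training examples."""
    pos = [sum(x) for x, t in zip(X, y) if t == 1]
    neg = [sum(x) for x, t in zip(X, y) if t == 0]
    mp = sum(pos) / len(pos) if pos else 0.0
    mn = sum(neg) / len(neg) if neg else 0.0
    return lambda x: 1 if abs(sum(x) - mp) <= abs(sum(x) - mn) else 0

def binary_relevance_fit(X, Y):
    """Y is a list of binary label vectors; fit one classifier per label."""
    n_labels = len(Y[0])
    return [train_binary(X, [row[j] for row in Y]) for j in range(n_labels)]

def binary_relevance_predict(models, x):
    """The combined model predicts every label whose classifier fires."""
    return [clf(x) for clf in models]

X = [[0, 0], [1, 0], [0, 1], [1, 1]]
Y = [[0, 0], [1, 0], [0, 1], [1, 1]]   # label j means "feature j is on"
models = binary_relevance_fit(X, Y)
print(binary_relevance_predict(models, [1, 1]))   # → [1, 1]
```

In practice one would use a real learner per label (e.g. logistic regression); the structure of the wrapper is the same.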

Various other transformations exist. Of these, the label powerset (LP) transformation treats every label combination attested in the training set as a single class of one multiclass problem.[1] The random k-labelsets (RAKEL) algorithm uses multiple LP classifiers, each trained on a random subset of the actual labels; prediction with this ensemble proceeds by a voting scheme.[3]
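
The core of the label powerset transformation can be sketched as a mapping from label vectors to class ids (function names here are hypothetical):

```python
# Label powerset sketch: each distinct label combination seen in training
# becomes one class of a single multiclass problem.

def lp_transform(Y):
    """Map each label vector to a class id; also return the inverse table
    used to recover label vectors from multiclass predictions."""
    combo_to_class = {}
    classes = []
    for row in Y:
        key = tuple(row)
        if key not in combo_to_class:
            combo_to_class[key] = len(combo_to_class)
        classes.append(combo_to_class[key])
    class_to_combo = {c: list(k) for k, c in combo_to_class.items()}
    return classes, class_to_combo

Y = [[1, 0, 1], [0, 1, 0], [1, 0, 1], [1, 1, 0]]
classes, inverse = lp_transform(Y)
print(classes)      # → [0, 1, 0, 2]
print(inverse[0])   # → [1, 0, 1]
```

A limitation visible in the sketch: the model can only ever predict label combinations attested in the training set, which is what RAKEL's ensemble of small random labelsets mitigates.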

Classifier chains are an alternative ensemble method[2] in which the binary classifiers are linked in a chain, each one receiving the predictions for earlier labels as additional input features; they have been applied, for instance, in HIV drug resistance prediction.[4]
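
The chaining idea can be sketched as follows (a toy memorising learner stands in for any real binary classifier; names are illustrative):

```python
# Classifier chain sketch: label j's classifier sees the input features
# augmented with the labels (at training time) or predictions (at test
# time) for labels 0..j-1.

def train_rule(X, y):
    """Toy learner: memorise exact inputs, fall back to the majority label."""
    table = {tuple(x): t for x, t in zip(X, y)}
    majority = 1 if sum(y) * 2 >= len(y) else 0
    return lambda x: table.get(tuple(x), majority)

def chain_fit(X, Y):
    models = []
    for j in range(len(Y[0])):
        # Augment inputs with the true values of the earlier labels.
        Xj = [list(x) + row[:j] for x, row in zip(X, Y)]
        models.append(train_rule(Xj, [row[j] for row in Y]))
    return models

def chain_predict(models, x):
    preds = []
    for clf in models:
        preds.append(clf(list(x) + preds))   # feed earlier predictions forward
    return preds

models = chain_fit([[0], [1]], [[0, 1], [1, 0]])
print(chain_predict(models, [1]))   # → [1, 0]
```

Unlike binary relevance, the chain can exploit correlations between labels, at the cost of sensitivity to the label ordering (hence the ensembles of randomly ordered chains used in practice).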

Adapted algorithms for multi-label classification

Some classification algorithms/models have been adapted to the multi-label task without requiring problem transformations. Examples include ML-kNN, a multi-label extension of the k-nearest neighbors lazy-learning method,[5] and multi-label neural networks such as BP-MLL.[7]

Statistics and evaluation metrics

The extent to which a dataset is multi-label can be captured in two statistics:[1]

  • Label cardinality is the average number of labels per example in the set: \frac{1}{N} \sum_{i=1}^N |Y_i|;
  • label density is the number of labels per sample divided by the total number of labels, averaged over the samples: \frac{1}{N} \sum_{i=1}^N \frac{|Y_i|}{|L|}, where L = \bigcup_{i=1}^N Y_i.
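
These two statistics can be computed directly from their definitions; a small worked example on made-up data:

```python
# Label cardinality and label density for a toy multi-label dataset.
# Y_i is the set of labels of sample i; L is the union of all labels.

Y = [{"a"}, {"a", "b"}, {"b", "c", "d"}]

L = set().union(*Y)                               # all labels in the dataset
cardinality = sum(len(y) for y in Y) / len(Y)     # (1 + 2 + 3) / 3
density = sum(len(y) / len(L) for y in Y) / len(Y)

print(cardinality)   # → 2.0
print(density)       # → 0.5  (cardinality / |L| = 2.0 / 4)
```

Note that label density is simply label cardinality divided by |L|, so the two statistics carry the same information once the label set is fixed.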

Evaluation metrics for multi-label classification differ from those used in multi-class (or binary) classification, owing to the differences in the underlying problem. If T denotes the true set of labels for a given sample, and P the predicted set of labels, then the following metrics can be defined on that sample:

  • Hamming loss: the fraction of labels that are predicted incorrectly, i.e. \frac{1}{N} \sum_{i=1}^{N} \frac{1}{|L|} \sum_{j=1}^{|L|} \operatorname{xor}(y_{i,j}, z_{i,j}), where y_{i,j} is the target and z_{i,j} the prediction. This is a loss function, so the optimal value is zero.
  • The closely related Hamming score, also called accuracy in the multi-label setting, is defined as the number of correct labels divided by the size of the union of predicted and true labels, \frac{|T \cap P|}{|T \cup P|}.[8]
  • Precision, recall and F_1 score: precision is \frac{|T \cap P|}{|P|}, recall is \frac{|T \cap P|}{|T|}, and F_1 is their harmonic mean.[8]
  • Exact match: the strictest metric, the fraction of samples whose predicted label set matches the true label set exactly.
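
The per-sample metrics above translate directly into set operations; a worked example on one sample with label universe L = {a, b, c, d}:

```python
# Per-sample multi-label metrics for true label set T and predicted set P.

def hamming_loss(T, P, labels):
    """Fraction of labels on which truth and prediction disagree (xor)."""
    return sum(1 for l in labels if (l in T) != (l in P)) / len(labels)

def hamming_score(T, P):
    """Intersection over union of the true and predicted label sets."""
    return len(T & P) / len(T | P) if T | P else 1.0

def precision_recall_f1(T, P):
    p = len(T & P) / len(P) if P else 0.0
    r = len(T & P) / len(T) if T else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

labels = {"a", "b", "c", "d"}
T, P = {"a", "b"}, {"b", "c"}
print(hamming_loss(T, P, labels))    # a and c are wrong: 2/4 → 0.5
print(hamming_score(T, P))           # |{b}| / |{a, b, c}| = 1/3
print(precision_recall_f1(T, P))     # → (0.5, 0.5, 0.5)
```

Dataset-level figures are obtained by averaging these per-sample values, matching the \frac{1}{N} \sum_{i=1}^{N} form of the Hamming loss above.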

Cross-validation in multi-label settings is complicated by the fact that the ordinary (binary/multiclass) way of stratified sampling will not work; alternative ways of approximate stratified sampling have been suggested.[9]

Implementations and datasets

Java implementations of multi-label algorithms are available in the Mulan and Meka software packages, both based on Weka.

The scikit-learn Python package implements some multi-label algorithms and metrics.

A list of commonly used multi-label datasets is available at the Mulan website.

References

  1. ^ a b c d Tsoumakas, Grigorios; Katakis, Ioannis (2007). "Multi-label classification: an overview" (PDF). International Journal of Data Warehousing & Mining 3 (3): 1–13.  
  2. ^ a b Jesse Read, Bernhard Pfahringer, Geoff Holmes, Eibe Frank. Classifier Chains for Multi-label Classification. Machine Learning Journal. Springer. Vol. 85(3), (2011).
  3. ^ Tsoumakas, Grigorios; Vlahavas, Ioannis (2007). Random k-labelsets: An ensemble method for multilabel classification (PDF). ECML. 
  4. ^ Heider, D; Senge, R; Cheng, W; Hüllermeier, E (2013). "Multilabel classification for exploiting cross-resistance information in HIV-1 drug resistance prediction". Bioinformatics (Oxford, England) 29 (16): 1946–52.  
  5. ^ Zhang, M.L.; Zhou, Z.H. (2007). "ML-KNN: A lazy learning approach to multi-label learning". Pattern Recognition 40 (7): 2038–2048.  
  6. ^ Madjarov, Gjorgji; Kocev, Dragi; Gjorgjevikj, Dejan; Džeroski, Sašo (2012). "An extensive experimental comparison of methods for multi-label learning". Pattern Recognition 45 (9): 3084–3104.  
  7. ^ Zhang, M.L.; Zhou, Z.H. (2006). Multi-label neural networks with applications to functional genomics and text categorization (PDF). IEEE Transactions on Knowledge and Data Engineering 18. pp. 1338–1351. 
  8. ^ a b Godbole, Shantanu; Sarawagi, Sunita (2004). Discriminative methods for multi-labeled classification (PDF). Advances in Knowledge Discovery and Data Mining. pp. 22–30. 
  9. ^ Sechidis, Konstantinos; Tsoumakas, Grigorios; Vlahavas, Ioannis (2011). On the stratification of multi-label data (PDF).  

Further reading

  • Madjarov, Gjorgji; Kocev, Dragi; Gjorgjevikj, Dejan; Džeroski, Sašo (2012). "An extensive experimental comparison of methods for multi-label learning". Pattern Recognition 45 (9): 3084–3104.  