Performance Evaluation for Learning Algorithms: Techniques, Applications and Issues
 

Presenters:


Nathalie Japkowicz, University of Ottawa, Canada

Mohak Shah, GE Global Research, USA



Logistics:


Date and Time: June 26, 2012; 9:00am - 11:30am

Venue: Appleton Tower (LT1), University of Edinburgh, Scotland (refer to conference website for directions).

At: International Conference on Machine Learning, 2012, Edinburgh, Scotland

Duration: 2.5 hrs




Tutorial Slides: ICML2012-Tutorial.pdf (11 MB)




Tutorial Overview:


Machine learning has matured as a field, with many sophisticated learning approaches now being applied to practical problems. Due to its inherently interdisciplinary nature, the field draws researchers from varying backgrounds. It is therefore of critical importance that researchers are aware of both the proper evaluation methodologies and the issues that arise in evaluating novel learning approaches. This tutorial aims to educate the machine-learning community about, and stimulate discussion of, these critical issues in the performance evaluation of learning algorithms. The tutorial will cover the main aspects of the evaluation process, with a focus on classification algorithms, and highlight the choices available at each step along with the issues, assumptions and constraints involved in making them.


The tutorial will span four areas of classifier evaluation:


o Performance Measures (evaluation metrics and graphical methods)

o Error Estimation/Re-sampling Techniques

o Statistical Significance Testing

o Issues in data set selection and evaluation benchmark design


It will also briefly cover R and WEKA tools that can be used to apply these techniques.
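As a rough illustration of how these pieces fit together, the sketch below (in R, one of the tools the tutorial touches on) combines a re-sampling scheme (10-fold cross-validation), a simple performance measure (error rate), and a significance test (a paired t-test) on a synthetic dataset. The data, the two logistic-regression classifiers and the choice of test are illustrative assumptions rather than material from the tutorial itself.

# Minimal sketch in R (base R only): 10-fold cross-validation error estimation
# for two classifiers, followed by a paired t-test on per-fold error rates.
# Synthetic data and models are illustrative assumptions, not tutorial material.

set.seed(42)
n  <- 500
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- factor(ifelse(x1 + x2 + rnorm(n, sd = 1.5) > 0, "pos", "neg"))
d  <- data.frame(x1, x2, y)

k     <- 10
folds <- sample(rep(1:k, length.out = n))   # random fold assignment
err_a <- err_b <- numeric(k)

for (i in 1:k) {
  train <- d[folds != i, ]
  test  <- d[folds == i, ]

  # Classifier A: logistic regression on both features
  ma <- glm(y ~ x1 + x2, data = train, family = binomial)
  pa <- ifelse(predict(ma, test, type = "response") > 0.5, "pos", "neg")
  err_a[i] <- mean(pa != test$y)

  # Classifier B: logistic regression on x1 only (a weaker baseline)
  mb <- glm(y ~ x1, data = train, family = binomial)
  pb <- ifelse(predict(mb, test, type = "response") > 0.5, "pos", "neg")
  err_b[i] <- mean(pb != test$y)
}

cat("Mean CV error, A:", mean(err_a), " B:", mean(err_b), "\n")

# Paired t-test over matched per-fold errors; the folds overlap in training
# data, so the test's independence assumption is only approximate, one of
# the caveats a careful evaluation has to weigh.
print(t.test(err_a, err_b, paired = TRUE))

Even this tiny example surfaces the kinds of choices the tutorial examines: which measure to report, how to re-sample, and whether the chosen statistical test's assumptions actually hold.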


The purpose of the tutorial is to promote an appreciation of the need for rigorous and objective evaluation, and an understanding of the available alternatives along with their assumptions, constraints and contexts of application. Machine learning researchers and practitioners alike will benefit from its contents, which discuss the need for sound evaluation strategies as well as practical approaches and tools for evaluation, going well beyond those described in existing machine learning and data mining textbooks.


The tutorial will be, in part, based on a recently published book by the presenters on the subject (please follow the link for more information):


N. Japkowicz and M. Shah, Evaluating Learning Algorithms: A Classification Perspective, Cambridge University Press, 2011.





Prerequisites:


Familiarity with basic machine learning algorithms (especially supervised classification algorithms) and with basic concepts of probability and statistics is expected.



Additional Readings and Resources:


In addition to the above-mentioned book, the following resources may be of interest (available from the respective presenters' webpages):


M. Shah, Generalized Agreement Statistics over Fixed Group of Experts, in Proceedings of the European Conference on Machine Learning (ECML), Part III, LNAI 6913, pp. 191-206, Springer, 2011.

(Proposes two novel statistical performance metrics for agreement assessment from multiple raters)


W. Klement, P. Flach, N. Japkowicz and S. Matwin, Smooth Receiver Operating Characteristics (smROC) Curves, in Proceedings of the 2011 ECML-PKDD Conference.

(Introduces smROC Curves)


M. Shah et al., Evaluating Intensity Normalization on MRIs of Human Brain with Multiple Sclerosis, in Medical Image Analysis 15, pp. 267-282, 2011.

(An applied take on practical evaluation issues in medical imaging)


C. Drummond and N. Japkowicz, Warning: Statistical Benchmarking is Addictive. Kicking the Habit in Machine Learning, in Journal of Experimental and Theoretical Artificial Intelligence (JETAI), 22:67-80, 2010.


M. Shah and S. Shanian, Hold-out Risk Bounds for Classifier Performance Evaluation, in the 4th ICML Workshop on Evaluation Methods for Machine Learning, Montreal, Canada, 2009.

(Demonstrates utility of test set bounds over traditional confidence intervals)


R. Alaiz-Rodriguez, N. Japkowicz, and P. Tischer, Visualizing Classifier Performance, in the Proceedings of the 20th IEEE International Conference on Tools for Artificial Intelligence (ICTAI'2008).

(Introduces a visualization framework for evaluation. Refer to the book webpage to access the software)


M. Shah, Risk Bounds for Classifier Evaluation: Possibilities and Challenges, in the 3rd ICML Workshop on Evaluation Methods for Machine Learning, Helsinki, Finland, 2008.


N. Japkowicz, P. Sanghi, and P. Tischer, A Projection-Based Framework for Classifier Performance Evaluation, in the Proceedings of the ECML PKDD 2008 Conference.


R. Alaiz-Rodriguez and N. Japkowicz, Assessing the Impact of Changing Environments on Classifier Performance, in the Proceedings of the 21st Canadian Conference on Artificial Intelligence (AI'2008).


W. Elazmeh, N. Japkowicz, and S. Matwin, A Framework for Measuring Classification Difference with Imbalance, in the Proceedings of the 2006 European Conference on Machine Learning (ECML2006).


J. Souza, S. Matwin, and N. Japkowicz, Evaluating Data Mining Models: A Pattern Language, in the Proceedings of the 9th Conference on Pattern Languages of Programs (PLOP'2002).