Directly and Efficiently Optimizing Prediction Error and AUC of Linear Classifiers | Department of Industrial and Systems Engineering

Hiva Ghanbari
PhD Candidate, ISE
Lehigh University
Tuesday, March 6, 2018 9:30-10:30
JDT 500

ABSTRACT:

The predictive quality of most machine learning models is measured by the expected prediction error or the so-called Area Under the Curve (AUC). However, these functions are not used in empirical loss minimization, because their empirical approximations are nonconvex and nonsmooth and, more importantly, have zero derivative almost everywhere. Instead, other loss functions are used, such as the logistic loss. In this work, we show that in the case of linear predictors, and under the assumption that the data is normally distributed, the expected error and the expected AUC are not only smooth but also have well-defined derivatives, which can be computed given only the first and second moments of the normal distribution. We show that these derivatives can also be approximated and used in empirical risk minimization, which makes gradient-based methods for the direct optimization of prediction error and AUC possible. Moreover, the proposed algorithm does not depend on the size of the data set, unlike logistic regression and all other well-known empirical risk minimization techniques.
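As an illustration of why the normality assumption helps, the sketch below evaluates the expected error and expected AUC of a linear classifier from class moments alone. It is not taken from the talk: it assumes each class is Gaussian with its own mean and covariance, that a point is labeled positive when w^T x > 0, and that positive and negative samples are drawn independently. All function and parameter names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def quad_form(w, cov):
    """Quadratic form w^T cov w, i.e. the variance of the score w^T x."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))


def expected_error(w, mu_pos, cov_pos, mu_neg, cov_neg, p_pos):
    """Expected misclassification rate of sign(w^T x).

    Assumes Gaussian class-conditional distributions, so the score w^T x
    is a one-dimensional Gaussian within each class.
    """
    z_pos = dot(w, mu_pos) / math.sqrt(quad_form(w, cov_pos))
    z_neg = dot(w, mu_neg) / math.sqrt(quad_form(w, cov_neg))
    # A positive is misclassified when its score is <= 0,
    # a negative when its score is > 0.
    return p_pos * normal_cdf(-z_pos) + (1.0 - p_pos) * normal_cdf(z_neg)


def expected_auc(w, mu_pos, cov_pos, mu_neg, cov_neg):
    """Probability that a random positive scores above a random negative.

    Assumes the two samples are independent, so the score difference is
    Gaussian with the summed variances.
    """
    mean_diff = dot(w, [a - b for a, b in zip(mu_pos, mu_neg)])
    var_diff = quad_form(w, cov_pos) + quad_form(w, cov_neg)
    return normal_cdf(mean_diff / math.sqrt(var_diff))


# Toy example: two unit-variance classes centered at (+1, 0) and (-1, 0).
identity = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 0.0]
err = expected_error(w, [1.0, 0.0], identity, [-1.0, 0.0], identity, 0.5)
auc = expected_auc(w, [1.0, 0.0], identity, [-1.0, 0.0], identity)
print(err, auc)
```

Both quantities are smooth functions of w built from the normal CDF, so their gradients exist everywhere the score variance is nonzero, and evaluating them needs only the means and covariances rather than a pass over the data set.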

BIO:

Hiva Ghanbari is a PhD candidate in the Industrial and Systems Engineering Department at Lehigh University. She holds Bachelor and Master of Science degrees in Industrial Engineering from Sharif University of Technology in Tehran, Iran. Her research lies at the intersection of optimization, statistics, and computer science; more specifically, she is interested in designing nonlinear optimization algorithms for machine learning problems.