Prediction Intervals of Machine Learning Models for Taxi Trip Length

Link to Paper
Link to Code

Abstract: Predictions produced by machine learning models always contain some error. Producing a quantitative estimate of the uncertainty in a model’s output is crucial in many fields, especially those where predictive models drive important decisions. In this paper, we discuss two methods for producing confidence estimates for neural network, random forest, and gradient boosted tree models. We then evaluate the prediction intervals produced by each algorithm by predicting the expected ride length on a NYC taxi trip dataset. We show that inductive conformal prediction produces the most reliable intervals for all of the machine learning models investigated.
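
For readers unfamiliar with inductive conformal prediction, the sketch below illustrates the general idea for regression: fit a model on one split, compute absolute residuals on a held-out calibration split, and use a quantile of those residuals as the interval half-width. This is not the code from the paper. The synthetic data, the least-squares line standing in for the neural network and tree models, and all function names (fit_line, conformal_half_width, make_data) are assumptions made for illustration only.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit y = a + b*x; a stand-in for any regression model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def conformal_half_width(model, cal_x, cal_y, alpha):
    """Half-width q so that [pred - q, pred + q] targets 1 - alpha coverage."""
    # Nonconformity score: absolute residual on the calibration split.
    scores = sorted(abs(y - model(x)) for x, y in zip(cal_x, cal_y))
    n = len(scores)
    # Rank of the conformal quantile, with the usual (n + 1) correction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return scores[k - 1]


def make_data(n, rng):
    """Hypothetical synthetic data: a linear trend plus Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3.0 + 2.0 * x + rng.gauss(0, 1.5) for x in xs]
    return xs, ys


rng = random.Random(0)
train_x, train_y = make_data(500, rng)   # used only to fit the model
cal_x, cal_y = make_data(500, rng)       # used only to calibrate intervals
test_x, test_y = make_data(2000, rng)    # used only to check coverage

model = fit_line(train_x, train_y)
q = conformal_half_width(model, cal_x, cal_y, alpha=0.1)

# Fraction of test points whose true value falls inside [pred - q, pred + q].
covered = sum(abs(y - model(x)) <= q for x, y in zip(test_x, test_y))
coverage = covered / len(test_x)
print(f"half-width: {q:.2f}, empirical coverage: {coverage:.3f}")
```

Because the calibration split is never used to fit the model, the empirical coverage on new data should land close to the 90% target here, whichever underlying model is plugged in. This sketch yields constant-width intervals; variants that normalize the residuals produce intervals whose width varies with the input.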

This work was presented at AMMCS 2019 in Waterloo, ON, and published in the conference proceedings.