Random Forest on iOS

This chapter gives an overview of the random forest algorithm. We first look at the decision tree algorithm and, once we have a handle on it, build on it to understand random forests. We then use Core ML to create a machine learning program that uses a random forest to predict, from a given set of breast cancer patient data, the likelihood that a patient will be diagnosed with breast cancer.

As we saw in Chapter 1, Introduction to Machine Learning on Mobile, any machine learning program has four phases: define the machine learning problem, prepare the data, build/rebuild/test the model, and deploy it for usage. In this chapter, we relate these phases to random forests and solve the underlying machine learning problem.

Problem definition: Breast cancer data for a set of patients is provided, and we want to predict the likelihood of a breast cancer diagnosis for a new data item.

We will be covering the following topics:

  • Understanding decision trees and how to apply them to solve an ML problem
  • Understanding decision trees through a sample dataset and Excel
  • Understanding random forests
  • Solving the problem using a random forest in Core ML:
    • Technical requirements
    • Creating a model file using the scikit-learn and pandas libraries
    • Testing the model
    • Importing the scikit-learn model into the Core ML project
    • Writing an iOS mobile application and using the scikit-learn model in it to perform the breast cancer prediction
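
As a preview of the model-creation and testing steps listed above, the following is a minimal sketch, not the chapter's actual code. It assumes that scikit-learn's bundled Wisconsin breast cancer dataset can stand in for the patient data used later, and the function name train_and_evaluate, the 75/25 split, and the choice of 100 trees are illustrative assumptions. Loading the data with pandas and converting the model for Core ML are left out.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train_and_evaluate(random_state=0):
    """Train a random forest and report accuracy on held-out data.

    Assumption: the dataset bundled with scikit-learn is used here as a
    stand-in for the breast cancer patient data described in the chapter.
    """
    # Prepare the data: feature matrix and binary diagnosis labels.
    features, labels = load_breast_cancer(return_X_y=True)

    # Hold out a quarter of the rows so the model is tested on unseen data.
    x_train, x_test, y_train, y_test = train_test_split(
        features,
        labels,
        test_size=0.25,
        stratify=labels,
        random_state=random_state,
    )

    # Build the model: an ensemble of 100 decision trees (illustrative choice).
    model = RandomForestClassifier(n_estimators=100, random_state=random_state)
    model.fit(x_train, y_train)

    # Test the model on the held-out rows.
    accuracy = accuracy_score(y_test, model.predict(x_test))
    return model, accuracy


model, accuracy = train_and_evaluate()
print(f"Held-out accuracy: {accuracy:.3f}")
```

Fixing random_state makes the split and the forest reproducible, which helps when comparing the scikit-learn predictions against those of the model once it runs inside the iOS application.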