The curse of dimensionality
The natural question that follows is: how exactly do we build a model? Long story short, the features that we choose to collect while observing an outcome can all be plotted in a high-dimensional space. While this may sound complicated, it is just an extension of the Cartesian coordinate system that you may remember from high school mathematics. Recall how to represent a single point on a graph using the Cartesian coordinate system: for this task, we require two values, x and y. This is an example of a two-dimensional feature space, with the x and y axes each being a dimension of the representational space. Add a z axis, and we get a three-dimensional feature space. Essentially, we define ML problems in an n-dimensional feature space, where n refers to the number of features that we have on the phenomenon we are trying to predict. In our previous case of predicting viewer preference, if we solely use the Big Five personality test scores as input features, we will have a five-dimensional feature space, where each dimension corresponds to a person's score on one of the five personality traits.

In fact, modern ML problems can range from 100 to 100,000 dimensions (and sometimes even more). Since the number of possible configurations of features grows exponentially with the number of features, it becomes quite hard, even for computers, to represent and compute over spaces of such proportions. This problem in ML is generally referred to as the curse of dimensionality.
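To make that exponential growth concrete, here is a minimal sketch (illustrative only, not taken from the book) that counts how many distinct cells you would need to cover a feature space if every feature were discretized into just 10 bins. The bin count and the feature counts are arbitrary assumptions chosen for illustration:

```python
# Illustrative sketch: how the number of possible feature configurations
# explodes as dimensionality grows. Assume (arbitrarily) that each feature
# is discretized into 10 equal-width bins.

bins_per_feature = 10

for n_features in (2, 3, 5, 100):
    # Each configuration is one cell in the n-dimensional grid, so the
    # total count is bins_per_feature raised to the number of features.
    n_configurations = bins_per_feature ** n_features
    print(f"{n_features:>3} features -> {n_configurations:.3e} possible cells")
```

Even the modest five-dimensional Big Five example already yields 10^5 cells, and at 100 features the count is far larger than any dataset we could ever hope to collect, which is why working in such spaces quickly becomes intractable.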