Ridge regression

Let's begin by exploring what ridge regression is and what it can and cannot do for you. With ridge regression, the normalization term is the sum of the squared weights, referred to as an L2-norm. Our model is trying to minimize RSS + λ(sum Bj2). As lambda increases, the coefficients shrink toward zero but never become zero. The benefit may be an improved predictive accuracy, but as it does not zero out the weights for any of your features, it could lead to issues in the model's interpretation and communication. To help with this problem, we will turn to LASSO.