Business understanding

In keeping with the water conservation/prediction theme, let's look at another dataset in the alr3 package, appropriately named water. During the writing of the first edition of this book, the severe drought in Southern California caused much alarm. Even the Governor, Jerry Brown, took action, calling on citizens to reduce water usage by 20 percent. For this exercise, let's say we have been commissioned by the state of California to predict water availability. The data provided to us contains 43 years of snow precipitation, measured at six different sites in the Owens Valley. It also contains a response variable for water availability: the stream runoff volume near Bishop, California, which feeds into the Owens Valley aqueduct and eventually the Los Angeles aqueduct. Accurate predictions of the stream runoff will allow engineers, planners, and policy makers to plan conservation measures more effectively. The model we are looking to create takes the form Y = B0 + B1x1 + ... + Bnxn + e, where Y is the response, x1 through xn are the predictor variables (features), B0 is the intercept, B1 through Bn are the coefficients to be estimated, and e is the error term.
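
As a first orientation, the linear form described above can be sketched in a few lines of R. This is only a minimal sketch, not a finished analysis. It assumes the alr3 package is installed, that the year column is named Year, and that the runoff response is named BSAAM, as listed in the package documentation; check these names against the output of str() before relying on them.

```r
# Assumes the alr3 package is installed: install.packages("alr3")
data(water, package = "alr3")

# Inspect the structure: 43 yearly observations are expected
str(water)

# Drop the year column (assumed to be named Year) so that only the
# six site measurements and the response remain
socal.water <- water[, names(water) != "Year"]

# Fit Y = B0 + B1x1 + ... + Bnxn + e using every remaining column as a
# predictor; BSAAM is assumed to be the stream runoff response
fit <- lm(BSAAM ~ ., data = socal.water)

# Estimated intercept (B0) and coefficients (B1 through Bn)
coef(fit)
```

Using every site as a predictor is only a starting point. Whether all six belong in the final model is a separate question of feature selection.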