Iris_Hands-On Ensemble Learning with R-QQ阅读男频玄幻网

书名：Hands-On Ensemble Learning with R
作者名：Prabhanjan Narayanachar Tattar
本章字数：105字
更新时间：2025-04-04 16:30:55

Iris

Iris is probably the most famous classification dataset. The great statistician Sir R. A. Fisher popularized the dataset, which he used for classifying the three types of iris plants based on length and width measurements of their petals and sepals. Fisher used this dataset to pioneer the invention of the statistical classifier linear discriminant analysis. Since there are three species of iris, we converted this into a binary classification problem, separated the dataset, and created a formula as seen here:

> data("iris")
> ir2 <- iris
> ir2$Species <- ifelse(ir2$Species=="setosa","S","NS")
> ir2$Species <- as.factor(ir2$Species)
> set.seed(12345)
> Train_Test <- sample(c("Train","Test"),nrow(ir2),replace = TRUE,prob=c(0.7,0.3))
> head(Train_Test)
[1] "Test"  "Test"  "Test"  "Test"  "Train" "Train"
> ir2_Train <- ir2[Train_Test=="Train",]
> ir2_TestX <- within(ir2[Train_Test=="Test",],rm(Species))
> ir2_TestY <- ir2[Train_Test=="Test","Species"]
> ir2_Formula <- as.formula("Species~.")