Advanced Machine Learning with R
Cory Lesmeister, Dr. Sunil Kumar Chinnamgari
Updated: 2021-06-24 14:25:35
Latest chapter: Leave a review - let other readers know what you think
Cover page
Title Page
Copyright and Credits
Advanced Machine Learning with R
About Packt
Why subscribe?
Packt.com
Contributors
About the authors
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Conventions used
Get in touch
Reviews
Preparing and Understanding Data
Overview
Reading the data
Handling duplicate observations
Descriptive statistics
Exploring categorical variables
Handling missing values
Zero and near-zero variance features
Treating the data
Correlation and linearity
Summary
Linear Regression
Univariate linear regression
Building a univariate model
Reviewing model assumptions
Multivariate linear regression
Loading and preparing the data
Modeling and evaluation – stepwise regression
Modeling and evaluation – MARS
Reverse transformation of natural log predictions
Summary
Logistic Regression
Classification methods and linear regression
Logistic regression
Model training and evaluation
Training a logistic regression algorithm
Weight of evidence and information value
Feature selection
Cross-validation and logistic regression
Multivariate adaptive regression splines
Model comparison
Summary
Advanced Feature Selection in Linear Models
Regularization overview
Ridge regression
LASSO
Elastic net
Data creation
Modeling and evaluation
Ridge regression
LASSO
Elastic net
Summary
K-Nearest Neighbors and Support Vector Machines
K-nearest neighbors
Support vector machines
Manipulating data
Dataset creation
Data preparation
Modeling and evaluation
KNN modeling
Support vector machine
Summary
Tree-Based Classification
An overview of the techniques
Understanding a regression tree
Classification trees
Random forest
Gradient boosting
Datasets and modeling
Classification tree
Random forest
Extreme gradient boosting – classification
Feature selection with random forests
Summary
Neural Networks and Deep Learning
Introduction to neural networks
Deep learning – a not-so-deep overview
Deep learning resources and advanced methods
Creating a simple neural network
Data understanding and preparation
Modeling and evaluation
An example of deep learning
Keras and TensorFlow background
Loading the data
Creating the model function
Model training
Summary
Creating Ensembles and Multiclass Methods
Ensembles
Data understanding
Modeling and evaluation
Random forest model
Creating an ensemble
Summary
Cluster Analysis
Hierarchical clustering
Distance calculations
K-means clustering
Gower and PAM
Gower
PAM
Random forest
Dataset background
Data understanding and preparation
Modeling
Hierarchical clustering
K-means clustering
Gower and PAM
Random forest and PAM
Summary
Principal Component Analysis
An overview of the principal components
Rotation
Data
Data loading and review
Training and testing datasets
PCA modeling
Component extraction
Orthogonal rotation and interpretation
Creating scores from the components
Regression with MARS
Test data evaluation
Summary
Association Analysis
An overview of association analysis
Creating transactional data
Data understanding
Data preparation
Modeling and evaluation
Summary
Time Series and Causality
Univariate time series analysis
Understanding Granger causality
Time series data
Data exploration
Modeling and evaluation
Univariate time series forecasting
Examining the causality
Linear regression
Vector autoregression
Summary
Text Mining
Text mining framework and methods
Topic models
Other quantitative analysis
Data overview
Data frame creation
Word frequency
Word frequency in all addresses
Lincoln's word frequency
Sentiment analysis
N-grams
Topic models
Classifying text
Data preparation
LASSO model
Additional quantitative analysis
Summary
Exploring the Machine Learning Landscape
ML versus software engineering
Types of ML methods
Supervised learning
Unsupervised learning
Semi-supervised learning
Reinforcement learning
Transfer learning
ML terminology – a quick review
Deep learning
Big data
Natural language processing
Computer vision
Cost function
Model accuracy
Confusion matrix
Predictor variables
Response variable
Dimensionality reduction
Class imbalance problem
Model bias and variance
Underfitting and overfitting
Data preprocessing
Holdout sample
Hyperparameter tuning
Performance metrics
Feature engineering
Model interpretability
ML project pipeline
Business understanding
Understanding and sourcing the data
Preparing the data
Model building and evaluation
Model deployment
Learning paradigm
Datasets
Summary
Predicting Employee Attrition Using Ensemble Models
Philosophy behind ensembling
Getting started
Understanding the attrition problem and the dataset
K-nearest neighbors model for benchmarking the performance
Bagging
Bagged classification and regression trees (treeBag) implementation
Support vector machine bagging (SVMBag) implementation
Naive Bayes (nbBag) bagging implementation
Randomization with random forests
Implementing an attrition prediction model with random forests
Boosting
The GBM implementation
Building attrition prediction model with XGBoost
Stacking
Building attrition prediction model with stacking
Summary
Implementing a Jokes Recommendation Engine
Fundamental aspects of recommendation engines
Recommendation engine categories
Content-based filtering
Collaborative filtering
Hybrid filtering
Getting started
Understanding the Jokes recommendation problem and the dataset
Converting the DataFrame
Dividing the DataFrame
Building a recommendation system with an item-based collaborative filtering technique
Building a recommendation system with a user-based collaborative filtering technique
Building a recommendation system based on an association-rule mining technique
The Apriori algorithm
Content-based recommendation engine
Differentiating between ITCF and content-based recommendations
Building a hybrid recommendation system for Jokes recommendations
Summary
References
Sentiment Analysis of Amazon Reviews with NLP
The sentiment analysis problem
Getting started
Understanding the Amazon reviews dataset
Building a text sentiment classifier with the BoW approach
Pros and cons of the BoW approach
Understanding word embedding
Building a text sentiment classifier with pretrained word2vec word embedding based on Reuters news corpus
Building a text sentiment classifier with GloVe word embedding
Building a text sentiment classifier with fastText
Summary
Customer Segmentation Using Wholesale Data
Understanding customer segmentation
Understanding the wholesale customer dataset and the segmentation problem
Categories of clustering algorithms
Identifying the customer segments in wholesale customer data using k-means clustering
Working mechanics of the k-means algorithm
Identifying the customer segments in the wholesale customer data using DIANA
Identifying the customer segments in the wholesale customers data using AGNES
Summary
Image Recognition Using Deep Neural Networks
Technical requirements
Understanding computer vision
Achieving computer vision with deep learning
Convolutional Neural Networks
Layers of CNNs
Introduction to the MXNet framework
Understanding the MNIST dataset
Implementing a deep learning network for handwritten digit recognition
Implementing dropout to avoid overfitting
Implementing the LeNet architecture with the MXNet library
Implementing computer vision with pretrained models
Summary
Credit Card Fraud Detection Using Autoencoders
Machine learning in credit card fraud detection
Autoencoders explained
Types of AEs based on hidden layers
Types of AEs based on restrictions
Applications of AEs
The credit card fraud dataset
Building AEs with the H2O library in R
Autoencoder code implementation for credit card fraud detection
Summary
Automatic Prose Generation with Recurrent Neural Networks
Understanding language models
Exploring recurrent neural networks
Comparison of feedforward neural networks and RNNs
Backpropagation through time
Problems and solutions to gradients in RNN
Exploding gradients
Vanishing gradients
Building an automated prose generator with an RNN
Implementing the project
Summary
Winning the Casino Slot Machines with Reinforcement Learning
Understanding RL
Comparison of RL with other ML algorithms
Terminology of RL
The multi-arm bandit problem
Strategies for solving MABP
The epsilon-greedy algorithm
Boltzmann or softmax exploration
Decayed epsilon greedy
The upper confidence bound algorithm
Thompson sampling
Multi-arm bandit – real-world use cases
Solving the MABP with UCB and Thompson sampling algorithms
Summary
Creating a Package
Creating a new package
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think