An introduction to statistical learning with applications in R

An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

Bibliographic Details
Main Authors: James, Gareth; Witten, Daniela (author); Hastie, Trevor (author); Tibshirani, Robert (author)
Format: Text (library)
Language: English
Published: New York: Springer Science+Business Media, 2013
Subjects: Mathematical statistics, Statistical methods, R (Computer programming language)
id KOHA-OAI-ECOSUR:53755
record_format koha
institution ECOSUR
collection Koha
country Mexico
countrycode MX
component Bibliographic
access Online
Physical
databasecode cat-ecosur
tag library
region North America
libraryname Sistema de Información Bibliotecario de ECOSUR (SIBE)
language eng
spelling KOHA-OAI-ECOSUR:53755 2023-03-18T12:26:47Z
Includes index: pages 419-426
Contents:
Preface
1 Introduction
2 Statistical Learning.. 2.1 What Is Statistical Learning?.. 2.1.1 Why Estimate f?.. 2.1.2 How Do We Estimate f?.. 2.1.3 The Trade-Off Between Prediction Accuracy and Model Interpretability.. 2.1.4 Supervised Versus Unsupervised Learning.. 2.1.5 Regression Versus Classification Problems.. 2.2 Assessing Model Accuracy.. 2.2.1 Measuring the Quality of Fit.. 2.2.2 The Bias-Variance Trade-Off.. 2.2.3 The Classification Setting.. 2.3 Lab: Introduction to R.. 2.3.1 Basic Commands.. 2.3.2 Graphics.. 2.3.3 Indexing Data.. 2.3.4 Loading Data.. 2.3.5 Additional Graphical and Numerical Summaries.. 2.4 Exercises
3 Linear Regression.. 3.1 Simple Linear Regression.. 3.1.1 Estimating the Coefficients.. 3.1.2 Assessing the Accuracy of the Coefficient Estimates.. 3.1.3 Assessing the Accuracy of the Model.. 3.2 Multiple Linear Regression.. 3.2.1 Estimating the Regression Coefficients.. 3.2.2 Some Important Questions.. 3.3 Other Considerations in the Regression Model.. 3.3.1 Qualitative Predictors.. 3.3.2 Extensions of the Linear Model.. 3.3.3 Potential Problems.. 3.4 The Marketing Plan.. 3.5 Comparison of Linear Regression with K-Nearest Neighbors.. 3.6 Lab: Linear Regression.. 3.6.1 Libraries.. 3.6.2 Simple Linear Regression.. 3.6.3 Multiple Linear Regression.. 3.6.4 Interaction Terms.. 3.6.5 Non-linear Transformations of the Predictors.. 3.6.6 Qualitative Predictors.. 3.6.7 Writing Functions.. 3.7 Exercises
4 Classification.. 4.1 An Overview of Classification.. 4.2 Why Not Linear Regression?.. 4.3 Logistic Regression.. 4.3.1 The Logistic Model.. 4.3.2 Estimating the Regression Coefficients.. 4.3.3 Making Predictions.. 4.3.4 Multiple Logistic Regression.. 4.3.5 Logistic Regression for >2 Response Classes.. 4.4 Linear Discriminant Analysis.. 4.4.1 Using Bayes' Theorem for Classification.. 4.4.2 Linear Discriminant Analysis for p = 1.. 4.4.3 Linear Discriminant Analysis for p > 1.. 4.4.4 Quadratic Discriminant Analysis.. 4.5 A Comparison of Classification Methods.. 4.6 Lab: Logistic Regression, LDA, QDA, and KNN.. 4.6.1 The Stock Market Data.. 4.6.2 Logistic Regression.. 4.6.3 Linear Discriminant Analysis.. 4.6.4 Quadratic Discriminant Analysis.. 4.6.5 K-Nearest Neighbors.. 4.6.6 An Application to Caravan Insurance Data.. 4.7 Exercises
5 Resampling Methods.. 5.1 Cross-Validation.. 5.1.1 The Validation Set Approach.. 5.1.2 Leave-One-Out Cross-Validation.. 5.1.3 k-Fold Cross-Validation.. 5.1.4 Bias-Variance Trade-Off for k-Fold Cross-Validation.. 5.1.5 Cross-Validation on Classification Problems.. 5.2 The Bootstrap.. 5.3 Lab: Cross-Validation and the Bootstrap.. 5.3.1 The Validation Set Approach.. 5.3.2 Leave-One-Out Cross-Validation.. 5.3.3 k-Fold Cross-Validation.. 5.3.4 The Bootstrap.. 5.4 Exercises
6 Linear Model Selection and Regularization.. 6.1 Subset Selection.. 6.1.1 Best Subset Selection.. 6.1.2 Stepwise Selection.. 6.1.3 Choosing the Optimal Model.. 6.2 Shrinkage Methods.. 6.2.1 Ridge Regression.. 6.2.2 The Lasso.. 6.2.3 Selecting the Tuning Parameter.. 6.3 Dimension Reduction Methods.. 6.3.1 Principal Components Regression.. 6.3.2 Partial Least Squares.. 6.4 Considerations in High Dimensions.. 6.4.1 High-Dimensional Data.. 6.4.2 What Goes Wrong in High Dimensions?.. 6.4.3 Regression in High Dimensions.. 6.4.4 Interpreting Results in High Dimensions.. 6.5 Lab 1: Subset Selection Methods.. 6.5.1 Best Subset Selection.. 6.5.2 Forward and Backward Stepwise Selection.. 6.5.3 Choosing Among Models Using the Validation Set Approach and Cross-Validation.. 6.6 Lab 2: Ridge Regression and the Lasso.. 6.6.1 Ridge Regression.. 6.6.2 The Lasso.. 6.7 Lab 3: PCR and PLS Regression.. 6.7.1 Principal Components Regression.. 6.7.2 Partial Least Squares.. 6.8 Exercises
7 Moving Beyond Linearity.. 7.1 Polynomial Regression.. 7.2 Step Functions.. 7.3 Basis Functions.. 7.4 Regression Splines.. 7.4.1 Piecewise Polynomials.. 7.4.2 Constraints and Splines.. 7.4.3 The Spline Basis Representation.. 7.4.4 Choosing the Number and Locations of the Knots.. 7.4.5 Comparison to Polynomial Regression.. 7.5 Smoothing Splines.. 7.5.1 An Overview of Smoothing Splines.. 7.5.2 Choosing the Smoothing Parameter λ.. 7.6 Local Regression.. 7.7 Generalized Additive Models.. 7.7.1 GAMs for Regression Problems.. 7.7.2 GAMs for Classification Problems.. 7.8 Lab: Non-linear Modeling.. 7.8.1 Polynomial Regression and Step Functions.. 7.8.2 Splines.. 7.8.3 GAMs.. 7.9 Exercises
8 Tree-Based Methods.. 8.1 The Basics of Decision Trees.. 8.1.1 Regression Trees.. 8.1.2 Classification Trees.. 8.1.3 Trees Versus Linear Models.. 8.1.4 Advantages and Disadvantages of Trees.. 8.2 Bagging, Random Forests, Boosting.. 8.2.1 Bagging.. 8.2.2 Random Forests.. 8.2.3 Boosting.. 8.3 Lab: Decision Trees.. 8.3.1 Fitting Classification Trees.. 8.3.2 Fitting Regression Trees.. 8.3.3 Bagging and Random Forests.. 8.3.4 Boosting.. 8.4 Exercises
9 Support Vector Machines.. 9.1 Maximal Margin Classifier.. 9.1.1 What Is a Hyperplane?.. 9.1.2 Classification Using a Separating Hyperplane.. 9.1.3 The Maximal Margin Classifier.. 9.1.4 Construction of the Maximal Margin Classifier.. 9.1.5 The Non-separable Case.. 9.2 Support Vector Classifiers.. 9.2.1 Overview of the Support Vector Classifier.. 9.2.2 Details of the Support Vector Classifier.. 9.3 Support Vector Machines.. 9.3.1 Classification with Non-linear Decision Boundaries.. 9.3.2 The Support Vector Machine.. 9.3.3 An Application to the Heart Disease Data.. 9.4 SVMs with More than Two Classes.. 9.4.1 One-Versus-One Classification.. 9.4.2 One-Versus-All Classification.. 9.5 Relationship to Logistic Regression.. 9.6 Lab: Support Vector Machines.. 9.6.1 Support Vector Classifier.. 9.6.2 Support Vector Machine.. 9.6.3 ROC Curves.. 9.6.4 SVM with Multiple Classes.. 9.6.5 Application to Gene Expression Data.. 9.7 Exercises
10 Unsupervised Learning.. 10.1 The Challenge of Unsupervised Learning.. 10.2 Principal Components Analysis.. 10.2.1 What Are Principal Components?.. 10.2.2 Another Interpretation of Principal Components.. 10.2.3 More on PCA.. 10.2.4 Other Uses for Principal Components.. 10.3 Clustering Methods.. 10.3.1 K-Means Clustering.. 10.3.2 Hierarchical Clustering.. 10.3.3 Practical Issues in Clustering.. 10.4 Lab 1: Principal Components Analysis.. 10.5 Lab 2: Clustering.. 10.5.1 K-Means Clustering.. 10.5.2 Hierarchical Clustering.. 10.6 Lab 3: NCI60 Data Example.. 10.6.1 PCA on the NCI60 Data.. 10.6.2 Clustering the Observations of the NCI60 Data.. 10.7 Exercises
Index
URN:ISBN:1461471370
URN:ISBN:9781461471370