Chapter 2. Bootstrapping

As seen in the previous chapter, computational power greatly enhances statistical inference. We also looked at permutation tests, wherein the same test is applied repeatedly to resamples of the given data under the (null) hypothesis. The rationale behind resampling methods is similar: if the sample is truly random and the observations are generated from the same distribution, we have a valid reason to resample the same set of observations with replacement. This is because any observation might just as well have occurred multiple times rather than only once.
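Resampling with replacement can be sketched in a few lines of base R. This is only a minimal illustration, not the chapter's formal treatment: the data vector `x`, the seed, and the number of resamples `B` are assumptions made up for the example, and the sample mean is chosen as the statistic purely for simplicity.

```r
# Hypothetical sample; the values are made up for illustration only.
set.seed(123)
x <- c(4.1, 5.3, 2.8, 6.0, 3.7, 5.1, 4.4)
n <- length(x)

# One resample: draw n observations from x with replacement,
# so some observations may appear several times and others not at all.
x_star <- sample(x, size = n, replace = TRUE)

# Repeat the resampling B times (B is an arbitrary choice here)
# and record the mean of each resample.
B <- 1000
boot_means <- replicate(B, mean(sample(x, size = n, replace = TRUE)))

# The spread of the resampled means gives a rough estimate of the
# standard error of the sample mean.
se_boot <- sd(boot_means)
```

Printing `x_star` next to `x` shows which observations were drawn more than once, and `se_boot` can be compared informally with `sd(x) / sqrt(n)`.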

This chapter will begin with a formal definition of resampling, followed by a look at the jackknife technique. We will first define the pseudovalues and then apply the jackknife to several relatively simple problems. The bootstrap method, invented by Efron, is probably the most useful resampling method. We will study this concept thoroughly, with applications ranging from simple cases to regression models.

In this chapter, we will cover the following:

The jackknife technique: Our first resampling method that enables bias reduction
Bootstrap: A statistical method and generalization of the jackknife method
The boot package: The main R package for bootstrap methods
Bootstrap and hypothesis testing: Using the bootstrap method for hypothesis testing
Bootstrapping regression models: Applying the bootstrap method to the general regression model
Bootstrapping survival models: Applying the bootstrap method to survival data
Bootstrapping time series models: Applying the bootstrap method to time series data, where the observations are dependent