4.3.2 Confidence interval: using the empirical Bootstrap method
In general, inference about the population mean of an unknown distribution relies on the central limit theorem and the resulting normal approximation. However, when the sample size is small, there is little basis for assuming that the sample mean is approximately normally distributed, as the central limit theorem suggests for large samples. In that case, the confidence interval for the mean computed from the t distribution is not accurate enough. For small samples, we therefore introduce the Bootstrap method to obtain confidence intervals for the parameters. Bootstrap is a non-parametric Monte Carlo method: it resamples the observed data with replacement and uses the resamples to infer the distributional characteristics of the population. In other words, the Bootstrap method is a resampling technique in statistical learning. Here, we demonstrate the method at a confidence level of 95%.
First, in the traditional approach, let the sample mean be $\bar{x}$, the population mean be $\mu$, and $\delta = \bar{x} - \mu$. If the distribution of $\delta$ were known, with $\delta_{0.025}$ and $\delta_{0.975}$ denoting its 2.5th and 97.5th percentiles, the 95% confidence interval of $\mu$ would follow directly as $[\bar{x} - \delta_{0.975},\ \bar{x} - \delta_{0.025}]$. With small samples, however, the distribution of $\delta$ is hard to determine, so we use the Bootstrap principle to improve on the traditional method. We introduce $\delta^{*} = \bar{x}^{*} - \bar{x}$, the difference between the mean of a Bootstrap sample, $\bar{x}^{*}$, and the original sample mean $\bar{x}$. According to the law of large numbers, when the number of Bootstrap samples is large enough, the distribution of $\delta^{*}$ approximates the distribution of $\delta$. The percentiles $\delta^{*}_{0.025}$ and $\delta^{*}_{0.975}$ can then be used as estimates of $\delta_{0.025}$ and $\delta_{0.975}$, which gives the confidence interval of $\mu$ as $[\bar{x} - \delta^{*}_{0.975},\ \bar{x} - \delta^{*}_{0.025}]$. In this project, this method can be used in place of the classical method for computing confidence intervals in Part 4.2.
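The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual implementation: the function names `quantile` and `empirical_bootstrap_ci`, the choice of 10,000 resamples, the linear-interpolation percentile rule, and the sample values are all hypothetical and chosen only for demonstration. It uses the standard library alone.

```python
import random
import statistics


def quantile(sorted_values, q):
    """Linearly interpolated q-quantile (0 <= q <= 1) of an ascending list."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def empirical_bootstrap_ci(sample, n_resamples=10_000, confidence=0.95, seed=0):
    """Empirical bootstrap confidence interval for the population mean.

    The defaults (n_resamples, seed) are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    x_bar = statistics.fmean(sample)

    # delta* = (mean of a resample drawn with replacement) - (original sample mean)
    deltas = sorted(
        statistics.fmean(rng.choices(sample, k=n)) - x_bar
        for _ in range(n_resamples)
    )

    alpha = 1.0 - confidence
    delta_low = quantile(deltas, alpha / 2)       # e.g. 2.5th percentile of delta*
    delta_high = quantile(deltas, 1 - alpha / 2)  # e.g. 97.5th percentile of delta*

    # The interval is [x_bar - delta*_high, x_bar - delta*_low]
    return x_bar - delta_high, x_bar - delta_low


# Hypothetical small sample, used only to exercise the function
sample = [30, 37, 36, 43, 42, 43, 43, 46, 41, 42]
low, high = empirical_bootstrap_ci(sample)
print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"95% CI = [{low:.2f}, {high:.2f}]")
```

Note that the upper percentile of $\delta^{*}$ sets the lower end of the interval and the lower percentile sets the upper end. This reversal is what distinguishes the empirical Bootstrap interval from simply taking percentiles of the resampled means.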