Monte Carlo simulation, resampling


Purpose:

Why: what the tool can do

What it is: A computer simulation that generates a large number of simulated samples of data based on an assumed Data Generating Process (DGP) that characterizes the population from which the simulated samples are drawn.

Patterns in those simulated samples are then summarized and described.
Such patterns can be evaluated in terms of substantive theory or in terms of the statistical properties of some estimator.
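As a rough illustration of evaluating an estimator this way, the sketch below draws many samples from an assumed normal DGP and checks how the sample mean behaves across them. The DGP, the sample size, and the variable names (`true_mean`, `estimates`, `bias`, `spread`) are assumptions chosen for this example, not part of the original notes.

```r
set.seed(42)          # reproducible draws
sims <- 1000          # number of simulated samples
n <- 50               # observations per simulated sample
true_mean <- 3        # known only because we define the DGP ourselves

estimates <- numeric(sims)
for (i in seq_len(sims)) {
  y <- rnorm(n, mean = true_mean, sd = 2)  # one sample from the assumed DGP
  estimates[i] <- mean(y)                  # apply the estimator to that sample
}

# Summarize the pattern across simulated samples
bias <- mean(estimates) - true_mean  # should be close to 0 for an unbiased estimator
spread <- sd(estimates)              # should be close to 2 / sqrt(n)
```

Comparing `spread` with the theoretical standard error `2 / sqrt(n)` is one way to check a statistical property of the estimator against the simulation.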

What it is: Data Generating Process (DGP)

A DGP describes how the values of a variable of interest are produced in the population.
Most DGPs of interest include a systematic component and a stochastic component.
We use statistical analysis to infer characteristics of the DGP by analyzing observable data sampled from the population.
In applied statistical work, we never know the DGP – if we did, we wouldn’t need statistical estimates of it.
In Monte Carlo simulations, we do know the DGP because we create it.
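To make the two components concrete, here is a minimal sketch of a linear DGP written as an R function. The function name `simulate_dgp`, the default parameter values, and the normal error term are hypothetical choices for illustration only.

```r
# Hypothetical DGP: y = a + b * x + e, with e ~ N(0, sigma^2)
simulate_dgp <- function(n, a = 0.2, b = 0.5, sigma = 1) {
  x <- runif(n, -1, 1)                          # covariate values
  systematic <- a + b * x                       # systematic component
  stochastic <- rnorm(n, mean = 0, sd = sigma)  # stochastic component
  data.frame(x = x, y = systematic + stochastic)
}

set.seed(1)
sample_data <- simulate_dgp(n = 100)  # one simulated sample from the DGP
head(sample_data)
```

Because the function defines `a`, `b`, and `sigma` explicitly, any estimate computed from `sample_data` can be compared directly with the true values.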

What it is: Resampling

Like Monte Carlo simulations, resampling methods use a computer to generate a large number of simulated samples of data.
Also like Monte Carlo simulations, patterns in these simulated samples are then summarized, and the results used to evaluate substantive theory or statistical estimators.

What is different is that the simulated samples are generated by drawing new samples (with replacement) from the sample of data you have.
In resampling methods, the researcher DOES NOT know or control the DGP, but the goal of learning about the DGP remains the same.
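One common resampling method is the nonparametric bootstrap. The sketch below is a minimal version under stated assumptions: the vector `observed` stands in for the single sample a researcher actually has (it is simulated here only so the example runs), and the names `reps`, `boot_means`, `boot_se`, and `boot_ci` are illustrative.

```r
set.seed(2024)
observed <- rexp(200, rate = 1)  # stand-in for the one sample of data you have

reps <- 2000                     # number of bootstrap resamples
boot_means <- numeric(reps)
for (r in seq_len(reps)) {
  # Draw a new sample of the same size, with replacement, from the observed data
  resample <- sample(observed, size = length(observed), replace = TRUE)
  boot_means[r] <- mean(resample)
}

boot_se <- sd(boot_means)                                  # bootstrap standard error
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))   # percentile interval
```

Nothing in the loop uses the rate of the exponential distribution: the resampling step treats the observed sample as the only available information about the DGP.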

Monte Carlo Simulation of OLS

Know Your Assumptions:

set.seed(123456) # Set the seed for reproducible results
sims= 500 # Set the number of simulations at the top of the script
alpha.1 = numeric(sims) # Empty vector for storing the simulated intercepts
B.1 = numeric(sims) # Empty vector for storing the simulated slopes
a = .2 # True value for the intercept
b =.5 # True value for the slope
n = 1000 # sample size
X = runif(n, -1, 1) # Create a sample of n observations on the variable X.
# Note that this variable is outside the loop, because X
# should be fixed in repeated samples.
for(i in 1:sims){ # Start the loop
  Y = a + b*X + rnorm(n, 0, 1) # The true DGP, with N(0, 1) error
  model = lm(Y ~ X) # Estimate OLS Model
  alpha.1[i] = coef(model)[1] # Put the estimate for the intercept
                              # in the vector alpha.1
  B.1[i] = coef(model)[2] # Put the estimate for X in the vector B.1
} # End the loop
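Once the loop has filled `alpha.1` and `B.1`, the simulated estimates can be summarized and compared with the true values. The sketch below repeats the same simulation in self-contained form and then computes the mean and standard deviation of each set of estimates. The `results` vector and its element names are additions for illustration.

```r
set.seed(123456)
sims <- 500
a <- 0.2   # true intercept
b <- 0.5   # true slope
n <- 1000
X <- runif(n, -1, 1)  # fixed in repeated samples

alpha.1 <- numeric(sims)
B.1 <- numeric(sims)
for (i in seq_len(sims)) {
  Y <- a + b * X + rnorm(n, 0, 1)  # true DGP with N(0, 1) error
  fit <- lm(Y ~ X)
  alpha.1[i] <- coef(fit)[1]
  B.1[i] <- coef(fit)[2]
}

# Means should sit near the true values; the standard deviations
# approximate the sampling variability of each OLS estimate
results <- c(mean_intercept = mean(alpha.1), sd_intercept = sd(alpha.1),
             mean_slope = mean(B.1), sd_slope = sd(B.1))
print(round(results, 3))
```

If the OLS assumptions hold for the DGP, as they do here by construction, the means of the estimates should be close to `a` and `b`.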