Simple Random Sample (SRS)
A simple random sample is a sample drawn uniformly at random without replacement.
- “Uniformly” means every individual has the same chance of being selected.
- “Without replacement” means we won’t pick the same individual more than once.
To perform an SRS from a list or array options, we use np.random.choice(options, n, replace=False).
If we use replace=True, then we’re sampling uniformly at random with replacement; there’s no simpler term for this.
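As a minimal sketch of the difference between the two settings, the snippet below draws from a small made-up population. The names in options, the value of n, and the seed are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical population of eight individuals (names invented for this example).
options = np.array(["Ann", "Bo", "Cy", "Di", "Ed", "Flo", "Gus", "Hal"])
n = 5

np.random.seed(23)  # fixed seed only so that reruns give the same draws

# Simple random sample: uniform and without replacement, so no individual repeats.
srs = np.random.choice(options, n, replace=False)

# Uniform sampling with replacement: the same individual may appear more than once.
with_repl = np.random.choice(options, n, replace=True)

print("SRS:             ", srs)
print("With replacement:", with_repl)
```

With replace=False, n cannot exceed the number of individuals in options; NumPy raises a ValueError in that case.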
If we want to sample rows from a DataFrame, we can use the .sample method on a DataFrame. That is, df.sample(n) returns a random subset of n rows of df, drawn without replacement (i.e. the default is replace=False, unlike np.random.choice, whose default is replace=True).
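The following sketch shows .sample on a small DataFrame. The contents of df, the value of n, and the random_state argument are assumptions made for this example; random_state is passed only to make the draw repeatable.

```python
import pandas as pd

# Hypothetical DataFrame; the column names and values are invented.
df = pd.DataFrame({
    "name": ["Ann", "Bo", "Cy", "Di", "Ed", "Flo"],
    "age": [21, 34, 19, 45, 28, 52],
})
n = 3

# Default behavior: n distinct rows, drawn uniformly without replacement.
sample_rows = df.sample(n, random_state=0)

# Passing replace=True allows the same row to be drawn more than once.
rows_with_repl = df.sample(n, replace=True, random_state=0)

print(sample_rows)
print(rows_with_repl)
```

The sampled rows keep their original index labels, which makes it easy to see which rows of df were selected.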
The effect of sample size
- The law of large numbers states that when we repeat a chance experiment more and more times, the empirical distribution will look more and more like the true probability distribution.
- Similarly, if we take a large simple random sample, then the sample distribution is likely to be a good approximation of the true population distribution.
- In general, statistics computed on larger samples tend to be better estimates of the population parameters than statistics computed on smaller samples.
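One way to see the effect of sample size is a small simulation. The sketch below is an illustration under stated assumptions: the population is synthetic (exponentially distributed values), the helper mean_abs_error is hypothetical, and the sample sizes and repetition count are arbitrary choices.

```python
import numpy as np

np.random.seed(42)  # fixed seed only so that reruns give the same numbers

# Hypothetical population of 10,000 values; its mean is the population parameter.
population = np.random.exponential(scale=10, size=10_000)
pop_mean = population.mean()

def mean_abs_error(sample_size, repetitions=200):
    """Average distance between the sample mean and the population mean
    over many simple random samples of the given size."""
    errors = []
    for _ in range(repetitions):
        sample = np.random.choice(population, sample_size, replace=False)
        errors.append(abs(sample.mean() - pop_mean))
    return np.mean(errors)

small_error = mean_abs_error(20)
large_error = mean_abs_error(2000)

print("Average error with n = 20:  ", small_error)
print("Average error with n = 2000:", large_error)
```

On average, the sample means from the larger samples land much closer to the population mean, which is the pattern the bullets above describe.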