Understanding sampling distributions
1. Click the "Animated sample" button. Five scores from a normal distribution will be sampled and plotted in a histogram. The mean of the sample will be computed and plotted in a second histogram. Repeat this 3 or 4 times, or until you understand how the "Distribution of Means" is created. The red line extends from the mean one standard deviation in each direction. The colored vertical bars on the X-axis correspond to the statistic of the same color.
2. Click the "5 samples" button to draw 5 samples of 5 scores each. The five means will be plotted. Click the "500 samples" and/or "2000 samples" buttons until the distribution of means has stabilized. The sampling distribution of the mean is the distribution that is approached as the number of samples approaches infinity. With 5,000 to 10,000 samples you get a pretty good approximation.
3. The distribution plotted in (2) above is the sampling distribution of the mean for a sample size of 5. Approximate the sampling distribution of the mean for other sample sizes.
4. Any statistic you can compute in a sample has a sampling distribution. Approximate the sampling distribution of other statistics. The statistics available to compute are:
Mean
Median
Standard deviation (sd) (Using N in the denominator)
Variance (Using N in the denominator)
Mean absolute deviation from the mean (MAD)
Range
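
For readers who want to repeat steps 1 to 4 outside the demonstration, the procedure can be sketched in Python. This is only an illustration under stated assumptions: the population is taken to be normal with mean 0 and standard deviation 1 (the demonstration's own parameters are not given here), and the names `sampling_distribution`, `mean_abs_dev`, and `STATISTICS` are invented for this sketch.

```python
import random
import statistics


def mean_abs_dev(xs):
    """Mean absolute deviation from the sample mean (MAD)."""
    m = statistics.fmean(xs)
    return statistics.fmean([abs(x - m) for x in xs])


# The six statistics listed above; sd and variance use N in the denominator.
STATISTICS = {
    "mean": statistics.fmean,
    "median": statistics.median,
    "sd": statistics.pstdev,
    "variance": statistics.pvariance,
    "mad": mean_abs_dev,
    "range": lambda xs: max(xs) - min(xs),
}


def sampling_distribution(stat, n, num_samples, rng, mu=0.0, sigma=1.0):
    """Draw num_samples samples of size n and return the statistic of each.

    Assumption: a normal population with mean mu and standard deviation sigma.
    """
    return [
        stat([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(num_samples)
    ]


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
for name, stat in STATISTICS.items():
    values = sampling_distribution(stat, n=5, num_samples=2000, rng=rng)
    print(f"{name:9s} mean={statistics.fmean(values):.3f} "
          f"sd={statistics.pstdev(values):.3f}")
```

Each printed row summarizes one approximate sampling distribution. Raising `num_samples` plays the same role as clicking the sampling buttons more times.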
Understanding the Standard Error
1. The standard error is the standard deviation of the sampling distribution. Approximate
the sampling distribution of the mean for N=5. The standard deviation of the distribution
is the standard error of the mean. Find the standard error of the mean and the standard
error of the range for N=10 using the normal distribution.
2. Determine how the standard error is affected by sample size. Plot the standard error of the mean as a function of sample size for different standard deviations. Can you discover a formula relating the standard error of the mean to the sample size and the standard deviation? If so, see if it holds for distributions other than the normal distribution.
3. Redo #2 above for the median.
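
The exercises above can also be explored numerically. The sketch below estimates the standard error of the mean and of the median for several sample sizes and population standard deviations, and prints sigma / sqrt(N) alongside as one candidate formula to check. It assumes a normal population centered at 0. The function name `standard_error` and its parameters are hypothetical, chosen for this illustration.

```python
import math
import random
import statistics


def standard_error(stat, n, sigma, num_samples=4000, seed=0):
    """Approximate the standard error of `stat` for samples of size n.

    Assumption: the population is normal with mean 0 and sd sigma.
    The standard error is estimated as the standard deviation of the
    statistic across num_samples simulated samples.
    """
    rng = random.Random(seed)
    estimates = [
        stat([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(num_samples)
    ]
    return statistics.pstdev(estimates)


for sigma in (1.0, 2.0):
    for n in (2, 5, 10, 25):
        se_mean = standard_error(statistics.fmean, n, sigma)
        se_median = standard_error(statistics.median, n, sigma)
        print(f"sigma={sigma:.0f} N={n:2d} "
              f"SE(mean)={se_mean:.3f} SE(median)={se_median:.3f} "
              f"sigma/sqrt(N)={sigma / math.sqrt(n):.3f}")
```

Comparing the columns shows whether the candidate formula tracks the simulated standard error of the mean, and how the median behaves at the same sample sizes.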
Understanding Bias
1. A statistic is unbiased if the mean of the sampling distribution of the statistic equals the parameter being estimated. Test to see if the sample mean is an unbiased estimate of the population mean. Try out different sample sizes and distributions.
2. Find a distribution/sample size combination for which the sample median is a biased estimate of the population median.
3. Is the sample variance an unbiased estimate of the population variance? If not, see if you can find a correction based on sample size. Does the correction hold for distributions other than the normal distribution?
4. For what statistic is the mean of the sampling distribution dependent on sample size?
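
A bias check of this kind can be sketched in a few lines of Python. The example compares the mean of the simulated sampling distribution of the variance (computed with N in the denominator) with the population variance, and prints (N-1)/N next to the observed ratio as one quantity to compare against. It assumes a normal population with mean 0. The helper name `mean_of_sampling_distribution` is hypothetical.

```python
import random
import statistics


def mean_of_sampling_distribution(stat, n, num_samples=10000, seed=0, sigma=1.0):
    """Average value of `stat` over many simulated samples of size n.

    Assumption: a normal population with mean 0 and sd sigma.
    """
    rng = random.Random(seed)
    estimates = [
        stat([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(num_samples)
    ]
    return statistics.fmean(estimates)


population_variance = 1.0  # sigma ** 2 for the default sigma
for n in (2, 5, 10, 20):
    # pvariance divides by N, matching the variance defined in this exercise
    avg = mean_of_sampling_distribution(statistics.pvariance, n)
    print(f"N={n:2d} mean of sample variance={avg:.3f} "
          f"ratio to parameter={avg / population_variance:.3f} "
          f"(N-1)/N={(n - 1) / n:.3f}")
```

Passing `statistics.fmean` or `statistics.median` as `stat` gives the corresponding check for exercises 1 and 2, with the population mean or median (0 here) as the target.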
Understanding Efficiency
1. For a normal distribution, compare the size of the standard error of the median and the standard error of the mean. Find a relationship that holds (approximately) across sample sizes.
2. Does this relationship hold for a uniform distribution?
3. Find a distribution for which the standard error of the median is smaller than the standard error of the mean. (You may find this difficult, but don't give up.)
4. Compare the standard error of the standard deviation and the standard error of the mean absolute deviation from the mean (MAD). Does the relationship depend on the distribution?
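
The comparison in exercises 1 to 3 can be sketched as a ratio of standard errors, computed on the same simulated samples. The populations used here are assumptions of this illustration: a standard normal, a uniform on [0, 1], and a heavy-tailed Laplace (double exponential) distribution that is not necessarily offered by the demonstration. The names `se_ratio` and `DISTRIBUTIONS` are hypothetical.

```python
import random
import statistics


def se_ratio(draw, n, num_samples=4000, seed=0):
    """Return SE(median) / SE(mean) for samples of size n.

    `draw` is a function that takes a random.Random instance and
    returns one score from the population.
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(num_samples):
        sample = [draw(rng) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.pstdev(medians) / statistics.pstdev(means)


# Assumed populations for illustration only.
DISTRIBUTIONS = {
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "uniform": lambda rng: rng.uniform(0.0, 1.0),
    # The difference of two independent exponentials is Laplace distributed.
    "laplace": lambda rng: rng.expovariate(1.0) - rng.expovariate(1.0),
}

for name, draw in DISTRIBUTIONS.items():
    for n in (5, 25, 101):
        print(f"{name:8s} N={n:3d} SE(median)/SE(mean)={se_ratio(draw, n):.3f}")
```

A ratio above 1 means the mean is the more efficient statistic for that population and sample size. A ratio below 1 means the median is.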
Understanding the Central Limit Theorem
1. The central limit theorem states that the sampling distribution of the mean
approaches a normal distribution as the sample size increases. Sample from the uniform
distribution and determine how large a sample size is needed for the distribution
to be a very close approximation of the normal distribution.
2. Do the same thing sampling from the skewed distribution.
3. Determine whether the sampling distribution of the median approaches a normal distribution as sample size increases.
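
One rough way to judge closeness to a normal distribution, sketched below, is to track the skewness and excess kurtosis of the simulated sampling distribution of the mean. Both are 0 for a normal distribution. The assumptions are loud ones: the skewed population is stood in for by an exponential distribution, which may differ from the demonstration's skewed distribution, and the names `skewness`, `excess_kurtosis`, `sample_means`, and `POPULATIONS` are invented for this sketch.

```python
import random
import statistics


def skewness(xs):
    """Third standardized moment (0 for a symmetric distribution)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])


def excess_kurtosis(xs):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 4 for x in xs]) - 3.0


def sample_means(draw, n, num_samples=5000, seed=0):
    """Means of num_samples simulated samples of size n."""
    rng = random.Random(seed)
    return [
        statistics.fmean([draw(rng) for _ in range(n)])
        for _ in range(num_samples)
    ]


# Assumed populations; the exponential is only a stand-in for "skewed".
POPULATIONS = {
    "uniform": lambda rng: rng.uniform(0.0, 1.0),
    "skewed (exponential)": lambda rng: rng.expovariate(1.0),
}

for name, draw in POPULATIONS.items():
    for n in (1, 2, 5, 10, 30):
        means = sample_means(draw, n)
        print(f"{name:20s} N={n:2d} skewness={skewness(means):+.3f} "
              f"excess kurtosis={excess_kurtosis(means):+.3f}")
```

Replacing `statistics.fmean` with `statistics.median` inside `sample_means` gives a comparable check for exercise 3. These two moments are only a summary, so a histogram of the simulated means is still worth inspecting.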