Model: Hypothesis Tests for Means with StatKey

Hypothesis Test for a Population Mean

For a hypothesis test for a mean in StatKey we need the data, the null hypothesis, and the significance level.

Go to StatKey and select Randomization Hypothesis Tests: Test for Single Mean.
Choose the appropriate data set, or select Edit Data to paste your own data in. Each number must be on its own line.
Enter the null hypothesis value.
Generate at least 5000 samples.
Select the appropriate test: Left Tail, Two-Tail, or Right Tail.
1. Then enter the appropriate significance level: 0.025, 0.05, or another value.
2. Draw a picture of the distribution, marking the resulting critical value on your number line.
Make a decision:
1. Is your sample mean in the red region (the critical region)? Yes? Then reject the null hypothesis.
2. Is your sample mean in the black region (the expected region)? Yes? Then fail to reject the null hypothesis.
Calculate the p-value.
1. Replace the critical value (the number below the significance level on the x-axis) with your sample mean. The tail proportion that StatKey then displays is the p-value.
2. Add this to your picture.
Re-affirm your decision based on the p-value:
1. Is the p-value less than the significance level? Yes? Then reject the null hypothesis.
2. Is the p-value greater than the significance level? Yes? Then fail to reject the null hypothesis.
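
The steps above can also be sketched in Python for readers who want to see what the simulation is doing. This is only a rough sketch, not StatKey's own code: it assumes the randomization samples are built by shifting the data so that its mean equals the null value and then resampling with replacement. The function name, its parameters, and the simple quantile rule are all hypothetical choices made for illustration.

```python
import random


def randomization_test_mean(data, null_mean, alpha=0.05, tail="right",
                            reps=5000, seed=1):
    """Sketch of a randomization test for one mean (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(data)
    sample_mean = sum(data) / n

    # Assumption: build the "null world" by shifting the data so that its
    # mean equals the null value, then resample with replacement.
    shift = null_mean - sample_mean
    shifted = [x + shift for x in data]
    sims = sorted(sum(rng.choices(shifted, k=n)) / n for _ in range(reps))

    def quantile(q):
        # Simple empirical quantile of the simulated means.
        return sims[min(reps - 1, int(q * reps))]

    # Proportion of simulated means at least as extreme as the sample mean.
    right = sum(m >= sample_mean for m in sims) / reps
    left = sum(m <= sample_mean for m in sims) / reps

    if tail == "right":
        critical, p_value = quantile(1 - alpha), right
    elif tail == "left":
        critical, p_value = quantile(alpha), left
    elif tail == "two":
        critical = (quantile(alpha / 2), quantile(1 - alpha / 2))
        p_value = min(1.0, 2 * min(left, right))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")

    return {
        "sample_mean": sample_mean,
        "critical_value": critical,
        "p_value": p_value,
        "reject_null": p_value < alpha,  # same decision rule as the p-value step
    }
```

Calling `randomization_test_mean(my_data, null_mean, alpha, tail)` returns the critical value, the p-value, and the decision in one dictionary. Because the samples are random, the numbers will differ slightly from run to run (and from StatKey) unless the seed is fixed.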

Example:

A realtor in Pendleton told me that the average home price is about $200K. Based on discussions with some newcomers to Pendleton, I thought it might be higher.

`H_0` : `mu=200,000`
`H_a` : `mu>200,000`

I am going to choose a significance level of 5% since this is important, but not too important.

So I collect some data:

Home Prices, in thousands, Pendleton, OR, 2019
179.9	135.9	449	266	240
289.5	204.9	229	277.9	399
165	349.9	42	625	159.9
397	270	210	289	214.9
189	450	280	229	214.9
130	179.5	180	415	400
175	178.5	187	270	390
275	469.9	105	229.9	219
250	350	249.9	259.9	389.9
149.9	159	189	235	399

It is always a good idea to summarize the data before running a hypothesis test for a mean.

Summary and histogram of Pendleton Home Prices 2019

This is a reasonably large sample (n=50 > 30), and the histogram is a relatively symmetric bell shape with one large outlier at $625K. Using the rounded sample mean (264) and standard deviation (112), its `z`-score is `(z=(625-264)/112~~3.223)`.
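
The same summary can be reproduced outside StatKey. The sketch below uses only Python's standard `statistics` module on the 50 prices from the table above; the variable names are illustrative choices, and the `z`-score uses the unrounded mean and standard deviation, so it will differ slightly from the hand calculation.

```python
from statistics import mean, median, stdev

# Home prices in thousands of dollars (the 50 values from the table above).
prices = [
    179.9, 135.9, 449, 266, 240,
    289.5, 204.9, 229, 277.9, 399,
    165, 349.9, 42, 625, 159.9,
    397, 270, 210, 289, 214.9,
    189, 450, 280, 229, 214.9,
    130, 179.5, 180, 415, 400,
    175, 178.5, 187, 270, 390,
    275, 469.9, 105, 229.9, 219,
    250, 350, 249.9, 259.9, 389.9,
    149.9, 159, 189, 235, 399,
]

n = len(prices)
xbar = mean(prices)
med = median(prices)
s = stdev(prices)                  # sample standard deviation (n - 1 divisor)
z_max = (max(prices) - xbar) / s   # z-score of the largest price

print(f"n = {n}, mean = {xbar:.3f}, median = {med:.1f}, sd = {s:.1f}")
print(f"z-score of the largest price ({max(prices)}): {z_max:.2f}")
```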

This sample has a mean and median that are higher than $200,000, but is the mean significantly higher?

Enter the data into StatKey to get a sampling distribution to test the hypotheses:
Go to StatKey and select Randomization Hypothesis Tests: Test for Single Mean.
Edit Data to enter my data.
Enter my null hypothesis value.
Generate thousands of samples (at least 5000).
Select the Right Tail test.
Enter my significance level.

Sampling distribution for mean home price in Pendleton

The mean of my sample, $263.84K, is definitely in the red zone. The test is positive, i.e., I have evidence that the realtor's claim is wrong.
StatKey states that the critical value is $225.85K. My mean of $263.84K is definitely above that.
I can also find a `z`-score for the sample mean based on this sampling distribution and the realtor's claim, `(z=(barx-mu)/"std error"~~(263.84-200)/15.614~~4.089)`. Wow, this sample mean is over 4 standard errors above the claimed mean!

To get the p-value, I then replace the critical number that StatKey gave me, 225.850, with my sample mean, 263.842.
According to StatKey, the p-value is 0.000.
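
For comparison, the whole example can be approximated in a few lines of Python. This is a sketch under the assumption that the randomization samples come from shifting the data to have mean 200 and resampling with replacement; the seed and variable names are arbitrary. The standard error, critical value, and `z`-score it prints should land near StatKey's 15.614, 225.85, and 4.089, but they will not match exactly because the samples are random.

```python
import random

# Home prices in thousands of dollars (the 50 values from the data table).
prices = [
    179.9, 135.9, 449, 266, 240,
    289.5, 204.9, 229, 277.9, 399,
    165, 349.9, 42, 625, 159.9,
    397, 270, 210, 289, 214.9,
    189, 450, 280, 229, 214.9,
    130, 179.5, 180, 415, 400,
    175, 178.5, 187, 270, 390,
    275, 469.9, 105, 229.9, 219,
    250, 350, 249.9, 259.9, 389.9,
    149.9, 159, 189, 235, 399,
]

null_mean = 200.0           # the realtor's claim, in thousands
alpha = 0.05                # significance level
reps = 5000                 # number of randomization samples
rng = random.Random(2019)   # arbitrary seed so the run is repeatable

n = len(prices)
xbar = sum(prices) / n

# Assumption: shift the data so its mean equals the null value, then
# resample with replacement to simulate sample means under the null.
shifted = [x - xbar + null_mean for x in prices]
sims = sorted(sum(rng.choices(shifted, k=n)) / n for _ in range(reps))

# Standard error = standard deviation of the simulated means.
sim_mean = sum(sims) / reps
std_error = (sum((m - sim_mean) ** 2 for m in sims) / (reps - 1)) ** 0.5

# Right-tail test: the critical value cuts off the top 5% of simulated means.
critical_value = sims[int((1 - alpha) * reps)]
z = (xbar - null_mean) / std_error

# p-value: proportion of simulated means at or above the observed mean.
p_value = sum(m >= xbar for m in sims) / reps

print(f"sample mean    = {xbar:.3f}")
print(f"standard error = {std_error:.3f}")
print(f"critical value = {critical_value:.3f}")
print(f"z-score        = {z:.3f}")
print(f"p-value        = {p_value:.4f}")
```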

Interpretation of the p-value: The chance of getting a sample with a mean of $263.842K or higher, assuming that the true mean is $200K, is almost 0.

Decision: Reject the null hypothesis.

Conclusion: We have sufficient evidence that the average home price in Pendleton is greater than $200 thousand.
