Model: Hypothesis Tests for Means with StatKey

Hypothesis Test for a Population Mean

For a hypothesis test for a mean in StatKey we need the data, the null hypothesis, and the significance level.

  1. Go to StatKey and select Randomization Hypothesis Tests: Test for Single Mean.
  2. Choose the appropriate data set, or select Edit Data to paste your own data in. Each number must be on its own line.
  3. Enter the null hypothesis value.
  4. Generate at least 5000 samples.
  5. Select the appropriate test: `square` Left tail, `square` Two tail, `square` Right tail
    1. Then enter the appropriate significance level, for example 0.025 or 0.05.
    2. Draw a picture of the distribution, marking the resulting critical value on your number line.
  6. Make a decision:
    1. Is your sample mean in the red region (the critical region)? Yes? Then reject the null hypothesis.
    2. Is your sample mean in the black region (the expected region)? Yes? Then fail to reject the null hypothesis.
  7. Calculate the p-value.
    1. Replace the critical number (the number below the significance level on the x-axis) with your sample mean.
    2. Add this to your picture.
  8. Re-affirm your decision based on the p-value:
    1. Is the p-value less than the significance level? Yes? Then reject the null hypothesis.
    2. Is the p-value greater than the significance level? Yes? Then fail to reject the null hypothesis.
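
The steps above can also be sketched in code. The following is a minimal Python sketch, not StatKey's own implementation. It assumes the randomization distribution is built by shifting the sample so that its mean equals the null value and then resampling with replacement. The function name `randomization_test_mean`, its parameters, and the choice to double the smaller tail for a two-tail p-value are all assumptions made for illustration. Because the resamples are random, the numbers will differ slightly from what StatKey shows.

```python
import random
import statistics


def randomization_test_mean(data, null_mean, tail="right", alpha=0.05,
                            n_samples=5000, seed=0):
    """Sketch of a randomization test for a single mean (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(data)
    sample_mean = statistics.fmean(data)

    # Assumed method: shift the data so that its mean equals the null value.
    shifted = [x - sample_mean + null_mean for x in data]

    # Steps 3-4: build the randomization distribution of resampled means.
    means = sorted(
        statistics.fmean(rng.choices(shifted, k=n)) for _ in range(n_samples)
    )

    def quantile(q):
        # Simple empirical quantile: the value at position q in the sorted means.
        return means[min(n_samples - 1, int(q * n_samples))]

    # Step 7: proportion of resampled means at least as extreme as the sample mean.
    right = sum(m >= sample_mean for m in means) / n_samples
    left = sum(m <= sample_mean for m in means) / n_samples

    # Step 5: critical value(s) for the chosen tail and significance level.
    if tail == "right":
        critical, p_value = (quantile(1 - alpha),), right
    elif tail == "left":
        critical, p_value = (quantile(alpha),), left
    elif tail == "two":
        critical = (quantile(alpha / 2), quantile(1 - alpha / 2))
        # Assumed convention: double the smaller tail, capped at 1.
        p_value = min(1.0, 2 * min(left, right))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")

    # Steps 6 and 8: reject the null when the p-value is below alpha.
    return {
        "sample_mean": sample_mean,
        "critical": critical,
        "p_value": p_value,
        "reject_null": p_value < alpha,
    }
```

Calling it with a list of values, the null hypothesis value, and the tail returns the sample mean, the critical value(s), the p-value, and the decision in one dictionary.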

Example:

A realtor in Pendleton told me that the average home price is about $200K. Based on discussions with some newcomers to Pendleton, I thought it might be higher.

`H_0` : `mu=200,000` 
`H_a` : `mu>200,000`

I am going to choose a significance level of 5% since this is important, but not too important.

So I collect some data (prices in thousands of dollars):

179.9 135.9 449 266 240
289.5 204.9 229 277.9 399
165 349.9 42 625 159.9
397 270 210 289 214.9
189 450 280 229 214.9
130 179.5 180 415 400
175 178.5 187 270 390
275 469.9 105 229.9 219
250 350 249.9 259.9 389.9
149.9 159 189 235 399

It is always a good idea to summarize the data before running a hypothesis test for a mean.

Summary and histogram of Pendleton Home Prices 2019

This is a reasonably large sample (n=50 > 30) and the histogram is a relatively symmetric bell shape with one large outlier, `(z=(625-264)/112~~3.223)` .
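
The summary numbers can be reproduced with a short Python sketch using only the standard library. The variable names are my own. The outlier z-score in the text uses the rounded values 264 and 112, so the unrounded result here differs slightly in the later decimals.

```python
import statistics

# Pendleton home prices from the table above, in thousands of dollars.
prices = [
    179.9, 135.9, 449, 266, 240,
    289.5, 204.9, 229, 277.9, 399,
    165, 349.9, 42, 625, 159.9,
    397, 270, 210, 289, 214.9,
    189, 450, 280, 229, 214.9,
    130, 179.5, 180, 415, 400,
    175, 178.5, 187, 270, 390,
    275, 469.9, 105, 229.9, 219,
    250, 350, 249.9, 259.9, 389.9,
    149.9, 159, 189, 235, 399,
]

n = len(prices)
mean = statistics.fmean(prices)
median = statistics.median(prices)
sd = statistics.stdev(prices)  # sample standard deviation (n - 1 divisor)

# z-score of the largest price relative to the sample mean and SD.
z_max = (max(prices) - mean) / sd

print(f"n = {n}, mean = {mean:.3f}, median = {median:.2f}, sd = {sd:.1f}")
print(f"z-score of the largest price: {z_max:.3f}")
```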

This sample has a mean and median that are higher than $200,000, but is it significantly higher?

Enter the data into StatKey to get a sampling distribution to test the hypotheses:
    Go to StatKey and select Randomization Hypothesis Tests: Test for Single Mean.
    Select Edit Data and enter my data.
    Enter my null hypothesis value.
    Generate thousands of samples.
    Select the Right Tail test.
    Enter my significance level.

sampling distribution for mean home price in Pendleton

The mean of my sample, $263.84K, is definitely in the red zone. The result is significant, i.e., I have evidence that the realtor's claim is wrong.
StatKey states that the critical value is $225.85K. My mean of $263.84K is definitely above that.
I can also find a `z` -score for the sample mean based on this sampling distribution and the realtor's claim, `(z=(barx-mu)/"std error"~~(263.84-200)/15.614~~4.089)` . Wow, this sample mean is over 4 standard errors above the claimed mean!

To get the p-value, I then replace the critical number that StatKey gave me, 225.850, with my sample mean, 263.842.
According to StatKey, the p-value is 0.000. 

Interpretation of the p-value: The chance of getting a sample with a mean of $263.842K or higher, assuming that the true mean is $200K, is almost 0.
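
The numbers StatKey reports for this example (standard error, critical value, z-score, and p-value) can be approximated with the Python sketch below. It assumes the same shift-and-resample approach as a typical randomization test for a single mean. It is not StatKey's code, the variable names are my own, and since the resamples are random the results will be close to, but not exactly, the values quoted above.

```python
import random
import statistics

# Pendleton home prices, in thousands of dollars.
prices = [
    179.9, 135.9, 449, 266, 240,
    289.5, 204.9, 229, 277.9, 399,
    165, 349.9, 42, 625, 159.9,
    397, 270, 210, 289, 214.9,
    189, 450, 280, 229, 214.9,
    130, 179.5, 180, 415, 400,
    175, 178.5, 187, 270, 390,
    275, 469.9, 105, 229.9, 219,
    250, 350, 249.9, 259.9, 389.9,
    149.9, 159, 189, 235, 399,
]

null_mean = 200      # realtor's claim, in thousands of dollars
alpha = 0.05         # significance level
n_samples = 5000     # number of randomization samples
rng = random.Random(2019)  # fixed seed so the sketch is repeatable

sample_mean = statistics.fmean(prices)

# Assumed method: shift the data so its mean matches the null hypothesis.
shifted = [x - sample_mean + null_mean for x in prices]

# Resample with replacement and record each resampled mean.
means = sorted(
    statistics.fmean(rng.choices(shifted, k=len(prices)))
    for _ in range(n_samples)
)

# Standard error: the spread of the randomization distribution.
std_error = statistics.stdev(means)

# Right-tail critical value: the point with 5% of the means above it.
critical_value = means[int((1 - alpha) * n_samples)]

# z-score of the sample mean relative to the claimed mean.
z = (sample_mean - null_mean) / std_error

# Right-tail p-value: share of resampled means at or above the sample mean.
p_value = sum(m >= sample_mean for m in means) / n_samples

print(f"standard error = {std_error:.3f}")
print(f"critical value = {critical_value:.3f}")
print(f"z = {z:.3f}")
print(f"p-value = {p_value:.4f}")
print("Reject the null" if p_value < alpha else "Fail to reject the null")
```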

Decision: Reject the null hypothesis.

Conclusion: We have sufficient evidence that the average home price in Pendleton is greater than $200 thousand.