Practice Exam 5

1.

A medical researcher wants to test whether there is a connection between smoking and heart disease by checking whether the mean age at which heart disease is first detected is the same for smokers and nonsmokers. Twenty smokers were matched with 20 nonsmokers according to age, lifestyle, medical history, occupation, sex, and so on. The researcher wants a significance level of .05. The average difference (smokers minus nonsmokers) was -6 years. The standard deviation of the differences is 12 years.

  1. Formulate the null and alternate hypotheses

Null: Mean Smoker - Mean Nonsmoker = 0

Alternate: Mean Smoker - Mean Nonsmoker NE 0








  1. Select the test statistic and note which table you will look up and calculate all relevant variables necessary to compute the test statistic:

Paired difference formulae

t table





  1. Give your decision rule and draw a graph which shows the areas of acceptance and rejection.

Two tail alpha = .05 Each tail = .025

df= 20-1 = 19

Critical t= plus or minus 2.093






  1. Compare your calculations to your decision rule and show where it is on the graph.

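Standard error = 12/sqrt(20) = 2.68

t = -6/2.68 = -2.24

This is below the critical value of -2.093, so it falls in the lower rejection region.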


  1. What are your conclusions? (Elaborate in terms of the problem studied.)

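Reject null. There is a difference between smokers and nonsmokers in the mean age at which heart disease is first detected: on average it was detected 6 years earlier in smokers than in their matched nonsmokers.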
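The paired-difference calculation above can be checked with a short script. This is only a sketch: the variable names are illustrative, and the critical value is copied from the t table lookup in the answer rather than computed.

```python
import math

# Values stated in the problem
n = 20               # number of matched pairs
mean_diff = -6.0     # mean difference (smokers minus nonsmokers), in years
sd_diff = 12.0       # standard deviation of the differences, in years
t_critical = 2.093   # t table value: two-tailed, alpha = .05, df = 19

# Standard error of the mean difference, then the paired t statistic
std_error = sd_diff / math.sqrt(n)
t_stat = mean_diff / std_error

# Two-tailed decision rule: reject if t falls beyond either critical value
reject_null = abs(t_stat) > t_critical

print(f"standard error = {std_error:.2f}")
print(f"t = {t_stat:.2f}, reject null: {reject_null}")
```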






2.

One government official believes that for-profit nursing homes tend to discriminate between patients who pay for themselves and patients who are supported by Medicaid by forcing Medicaid patients to leave the nursing home sooner. Eighteen nursing homes were studied. Eleven were nonprofit, with mean Medicaid patient days of 75.4 and a standard deviation of 16.3 days. Seven were for-profit, with mean Medicaid patient days of 40.4 and a standard deviation of 30.8 days. Use a .01 significance level. Assume that the variances for all nonprofit and for-profit nursing homes are the same.

  1. Formulate the null and alternate hypotheses

Null: Mean nonprofit - Mean profit LE 0

Alternate: Mean nonprofit - Mean profit > 0








  1. Select the test statistic and note which table you will look up and calculate all relevant variables necessary to compute the test statistic:

Pooled variance - variance same

t - small sample

df=11+7-2=16




  1. Give your decision rule and draw a graph which shows the areas of acceptance and rejection.

Critical t = 2.583

One tail alpha = .01






  1. Compare your calculations to your decision rule and show where it is on the graph.

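Pooled variance = (10(16.3)^2 + 6(30.8)^2)/16 = 521.8

Standard error = sqrt(521.8(1/11 + 1/7)) = 11.04

t = (75.4 - 40.4)/11.04 = 3.17, which is greater than the critical value of 2.583 and falls in the rejection region.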



  1. What are your conclusions? (Elaborate in terms of the problem studied.)

Reject null. For-profit homes do tend to keep Medicaid patients for shorter periods than nonprofit homes.
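As a check on the pooled-variance arithmetic, the following sketch recomputes the test statistic from the summary figures in the problem. The variable names are illustrative, and the critical value is taken from the t table as in the answer above.

```python
import math

# Summary statistics stated in the problem
n1, mean1, sd1 = 11, 75.4, 16.3   # nonprofit homes
n2, mean2, sd2 = 7, 40.4, 30.8    # for-profit homes
t_critical = 2.583                # t table value: one-tailed, alpha = .01, df = 16

# Pooled variance weights each sample variance by its degrees of freedom
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df

# Standard error of the difference between the two means
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / std_error

# One-tailed (upper) decision rule
reject_null = t_stat > t_critical

print(f"df = {df}, pooled variance = {pooled_var:.1f}")
print(f"t = {t_stat:.2f}, reject null: {reject_null}")
```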








3.

NBC wants to know whether there is any difference between the proportions of males and females who favor Seinfeld. Two random samples of 80 males and 80 females were taken. 60% of the males were watching Seinfeld. 70% of the females were watching Seinfeld. NBC wants a significance level of .15.

  1. Formulate the null and alternate hypotheses

Null: Proportion Male - Proportion Female = 0

Alternate: Proportion Male - Proportion Female NE 0






  1. Select the test statistic and note which table you will look up and calculate all relevant variables necessary to compute the test statistic:

Difference between proportions

Standard error of difference = .075
z=(.6-.7)/.075=-1.33

  1. Give your decision rule and draw a graph which shows the areas of acceptance and rejection.

Two tail alpha=.15 Each tail = .075

Critical z = 1.44





  1. Compare your calculations to your decision rule and show where it is on the graph.

z = -1.33 lies between the critical values of -1.44 and 1.44, so it falls in the acceptance region.


  1. What are your conclusions? (Elaborate in terms of the problem studied.)

Do not reject null. There is no statistically significant difference between the proportions of males and females viewing Seinfeld.
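The test for a difference between proportions can be reproduced as sketched below. One assumption is made: the standard error uses the pooled proportion, which the answer does not state explicitly (the unpooled formula gives nearly the same value here). The critical z is computed with the standard library's `NormalDist` instead of a table lookup.

```python
import math
from statistics import NormalDist

# Sample sizes and sample proportions stated in the problem
n_m, p_m = 80, 0.60   # males
n_f, p_f = 80, 0.70   # females
alpha = 0.15          # two-tailed significance level

# Assumption: pool the two samples to estimate the common proportion
p_pooled = (n_m * p_m + n_f * p_f) / (n_m + n_f)
std_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_m + 1 / n_f))
z_stat = (p_m - p_f) / std_error

# Critical value leaving alpha/2 in each tail of the standard normal
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
reject_null = abs(z_stat) > z_critical

print(f"standard error = {std_error:.3f}, z = {z_stat:.2f}")
print(f"critical z = {z_critical:.2f}, reject null: {reject_null}")
```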






 

4.

The following is computer output from a linear regression of the number of votes received on campaign expenditures in the 1996 primaries:

Expenditures ($)    Votes
4000                60000
16000               60000
24000               180000
40000               660000
68000               540000
84000               420000
108000              900000
124000              600000
156000              660000
200000              720000

Campaign Spending and Votes

Regression Statistics
Standard Error of Prediction    204505.6
Correlation Coefficient         0.749405459
R Square                        0.561608542
Adjusted R Square               0.50680961
Standard Error                  204505.6114
Observations                    10

ANOVA
              df    SS            MS          F           Significance F
Regression    1     4.2862E+11    4.29E+11    10.24853    0.012587
Residual      8     3.3458E+11    4.18E+10
Total         9     7.632E+11

                Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept       201815.8611     108320.0399       1.863144    0.099449    -47970.8     451602.5
X Variable 1    3.376021103     1.054567416       3.201333    0.012587    0.944183     5.807859

What is the relationship between votes and campaign expenditures? Explain it in words that the general public will understand.  What is the dependent variable? What is the independent variable?



Votes = 3.376 (Expenditures) + 201816

There is a positive relationship between votes (dependent variable) and expenditures (independent variable).

The more you spend, the more votes you get.
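The fitted line can be reproduced from the ten data points with the ordinary least-squares formulas. The sketch below uses plain Python, with variable names chosen for illustration; the print-out itself came from a spreadsheet-style regression tool.

```python
# Data from the table of expenditures and votes
expenditures = [4000, 16000, 24000, 40000, 68000,
                84000, 108000, 124000, 156000, 200000]
votes = [60000, 60000, 180000, 660000, 540000,
         420000, 900000, 600000, 660000, 720000]

n = len(expenditures)
mean_x = sum(expenditures) / n
mean_y = sum(votes) / n

# Sums of squares and cross-products about the means
sxx = sum((x - mean_x) ** 2 for x in expenditures)
syy = sum((y - mean_y) ** 2 for y in votes)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expenditures, votes))

# Least-squares estimates: Votes = slope * Expenditures + intercept
slope = sxy / sxx
intercept = mean_y - slope * mean_x
r_squared = sxy ** 2 / (sxx * syy)

print(f"Votes = {slope:.3f} (Expenditures) + {intercept:.0f}")
print(f"R square = {r_squared:.4f}")
```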




What is the slope? What does it tell you about the relationship between the two variables? At the .05 significance level can we say b does not equal 0? Circle the relevant data on the print-out.

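The slope is 3.376. It tells you that each additional dollar spent is associated with about 3.4 additional votes.

At the .05 significance level we can say the slope does not equal zero. The p-value for X Variable 1 is .0126, which is less than .05, and the 95% interval (0.944 to 5.808) does not include zero.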




What is the intercept? What does it tell you about the relationship between expenditures and votes? Is it valid? Please cite evidence from the print-out.


The intercept is 201816. It says that if you spend nothing you will get 201,816 votes.

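At the .05 significance level the intercept is not significantly different from zero, so this interpretation should not be relied upon. The p-value is .099, which is greater than .05, and the 95% interval (-47,970.8 to 451,602.5) includes zero.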
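The t statistics on the print-out are each coefficient divided by its standard error, and the decision follows from comparing the printed p-value with .05. A brief sketch using the figures from the print-out (the dictionary layout is only for illustration):

```python
alpha = 0.05

# (coefficient, standard error, p-value) copied from the print-out
output = {
    "Intercept": (201815.8611, 108320.0399, 0.099449),
    "X Variable 1": (3.376021103, 1.054567416, 0.012587),
}

t_stats = {}
significant = {}
for name, (coef, std_err, p_value) in output.items():
    # t statistic for testing whether the coefficient equals zero
    t_stats[name] = coef / std_err
    # Significant at the chosen level when the p-value is below alpha
    significant[name] = p_value < alpha
    print(f"{name}: t = {t_stats[name]:.3f}, p = {p_value}, "
          f"significant: {significant[name]}")
```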





What is the coefficient of determination or R square? What does it tell you about this regression model?


R square is .5616. This tells us that about 56% of the variation in votes is explained by the regression model, that is, by campaign expenditures.
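R square can also be recovered from the ANOVA table as the regression sum of squares divided by the total sum of squares. The sketch below does this with the rounded figures shown on the print-out, so the results agree with the printed values only to about four decimal places.

```python
# Sums of squares and degrees of freedom from the ANOVA table
ss_regression = 4.2862e11
ss_residual = 3.3458e11
ss_total = 7.632e11
df_regression, df_residual, df_total = 1, 8, 9

# Share of total variation explained by the regression
r_squared = ss_regression / ss_total

# Adjusted R square penalizes for the number of predictors
adj_r_squared = 1 - (ss_residual / df_residual) / (ss_total / df_total)

# F statistic: mean square regression over mean square residual
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

print(f"R square = {r_squared:.4f}")
print(f"Adjusted R square = {adj_r_squared:.4f}")
print(f"F = {f_stat:.2f}")
```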







If a candidate calculates that she needs 1 million votes to win the State of Washington, how much should she spend on her campaign?

1000000 = (3.376 * Expenditures) + 201816

Expenditures = $236,428.9











  1. What judgments would you make about this prediction?

 

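The required spending of about $236,429 is outside the range of the data (expenditures of $4,000 to $200,000), so the prediction is an extrapolation and should not be relied upon.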
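The calculation and the range check can be combined in a few lines. This sketch solves the fitted equation for expenditures using the rounded coefficients from the answer and flags whether the result lies outside the observed spending range; the variable names are illustrative.

```python
# Rounded coefficients from the fitted equation
slope = 3.376
intercept = 201816
target_votes = 1_000_000

# Smallest and largest expenditures in the data
min_spend, max_spend = 4_000, 200_000

# Solve target_votes = slope * spend + intercept for spend
required_spend = (target_votes - intercept) / slope

# A prediction outside the observed range is an extrapolation
is_extrapolation = not (min_spend <= required_spend <= max_spend)

print(f"required spending = ${required_spend:,.1f}")
print(f"outside range of data: {is_extrapolation}")
```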

5.

A random sample of people were asked whether they preferred Brand A, B, or C. After the results were obtained, they were tabulated by gender. The market researchers want to determine if gender (being male or female) affects brand preference.








  1. Using the data, construct a table to calculate a Chi-square which shows observed frequencies versus expected frequencies:

Each cell shows observed/expected. Expected = (row total)(column total)/100.

          Brand A    Brand B    Brand C    Total
Male      18/30      25/18      17/12      60
Female    32/20      5/12       3/8        40
Total     50         30         20         100

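  1. Calculate the chi-square and test this statistic at the .01 level.

Chi-square = 24.01

Critical chi square = 9.21 df=(2-1)(3-1)=2 alpha=.01

Because 24.01 is greater than 9.21, reject the null hypothesis that gender and brand preference are independent.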





c) State your conclusions to the management of the company.

There is a relationship between gender and brand preference: males and females do not prefer the brands in the same proportions.
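The chi-square value can be verified with the sketch below. It assumes the observed counts are 18, 25, 17 for males and 32, 5, 3 for females, which is the reading of the table that agrees with the row and column totals. The critical value is taken from the chi-square table as in the answer.

```python
# Assumed observed counts: rows are Male, Female; columns are Brand A, B, C
observed = [
    [18, 25, 17],
    [32, 5, 3],
]
chi_critical = 9.21   # chi-square table value: alpha = .01, df = 2

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum (observed - expected)^2 / expected over every cell
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)
reject_null = chi_square > chi_critical

print(f"expected = {expected}")
print(f"chi-square = {chi_square:.2f}, df = {df}, reject null: {reject_null}")
```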