SURGERY RESIDENT BLOG

And review books to boot!

Biostatistics: Pearson’s Correlation

5. CHAPTER 17, PROBLEM 5
In a study conducted in Italy, 10 patients with hypertriglyceridemia were placed on a low-fat, high-carbohydrate diet. Before the start of the diet, cholesterol and triglyceride measurements were recorded for each subject.

Patient   Cholesterol level (mmol/l)   Triglyceride level (mmol/l)
1         5.12                         2.30
2         6.18                         2.54
3         6.77                         2.95
4         6.65                         3.77
5         6.36                         4.18
6         5.90                         5.31
7         5.48                         5.53
8         6.02                         8.83
9         10.34                        9.48
10        8.51                         14.20

a) Construct a two-way scatter plot for these data.

– See attachment

b) Does there appear to be any evidence of a linear relationship between cholesterol and triglyceride levels prior to the diet?

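– Yes. In these data, triglyceride level tends to rise with cholesterol level, which suggests a positive linear relationship. Note that patients 9 and 10 have much higher cholesterol values than the other eight, so they strongly influence that impression.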

c) Compute r, the Pearson correlation coefficient.

HOW TO APPROACH A PEARSON CORRELATION PROBLEM (lec 17-5)

– Create a table of your data
– Variable 1? Cholesterol level
– Variable 1 = X.  What is Xbar? (5.12+6.18+6.77+6.65+6.36+5.90+5.48+6.02+10.34+8.51)/10 = 6.733

– Variable 2? Triglyceride level
– Variable 2 = Y. What is Ybar? (2.30+2.54+2.95+3.77+4.18+5.31+5.53+8.83+9.48+14.20)/10 = 5.909

– n is the total number of (X,Y) pairs you have.
What is n? 10

– Sx = sample SD of X (n-1 in the denominator).  Sx = 1.56
– Sy = sample SD of Y (n-1 in the denominator).  Sy = 3.818
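
The means and standard deviations above can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are my own, and `statistics.stdev` is used because it divides by n-1 (the sample SD), which is what Sx and Sy are here.

```python
import statistics

# Data from the table in the problem statement (mmol/l)
cholesterol = [5.12, 6.18, 6.77, 6.65, 6.36, 5.90, 5.48, 6.02, 10.34, 8.51]
triglyceride = [2.30, 2.54, 2.95, 3.77, 4.18, 5.31, 5.53, 8.83, 9.48, 14.20]

n = len(cholesterol)                    # number of (X, Y) pairs
x_bar = statistics.mean(cholesterol)    # Xbar
y_bar = statistics.mean(triglyceride)   # Ybar
s_x = statistics.stdev(cholesterol)     # sample SD of X (n-1 denominator)
s_y = statistics.stdev(triglyceride)    # sample SD of Y (n-1 denominator)

print(f"n = {n}, Xbar = {x_bar:.3f}, Ybar = {y_bar:.3f}")
print(f"Sx = {s_x:.3f}, Sy = {s_y:.3f}")
```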

CALCULATE R
– r = [Summation (Xi - Xbar)(Yi - Ybar)] / [(n-1)(Sx*Sy)]

This is an equivalent form of r that is simpler to compute by hand
– r = [Summation (Xi*Yi) - n*Xbar*Ybar] / [(n-1)(Sx*Sy)]
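
The two numerators are the same quantity. A short derivation, using only the fact that the sum of the Xi equals n*Xbar (and likewise for Y), shows why:

```latex
\begin{aligned}
\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})
  &= \sum_{i=1}^{n} X_i Y_i - \bar{Y}\sum_{i=1}^{n} X_i - \bar{X}\sum_{i=1}^{n} Y_i + n\bar{X}\bar{Y} \\
  &= \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y} - n\bar{X}\bar{Y} + n\bar{X}\bar{Y} \\
  &= \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}
\end{aligned}
```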

– r = [(5.12*2.30) + (6.18*2.54) + (6.77*2.95) + (6.65*3.77) + (6.36*4.18) + (5.90*5.31) + (5.48*5.53) + (6.02*8.83) + (10.34*9.48) + (8.51*14.20) - 10(6.733)(5.909)] / [(9)(1.56)(3.818)]

= (432.755 - 397.853) / 53.604
= 34.902 / 53.604
= 0.651

r = 0.651
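
As a cross-check, here is a minimal Python sketch that evaluates both forms of the formula on the same data. The variable names are mine, not from the lecture. Because it carries unrounded values of Sx and Sy, it gives r closer to 0.650 than the hand calculation with rounded SDs, which is an ordinary rounding difference.

```python
import math

# Data from the table in the problem statement (mmol/l)
cholesterol = [5.12, 6.18, 6.77, 6.65, 6.36, 5.90, 5.48, 6.02, 10.34, 8.51]
triglyceride = [2.30, 2.54, 2.95, 3.77, 4.18, 5.31, 5.53, 8.83, 9.48, 14.20]

n = len(cholesterol)
x_bar = sum(cholesterol) / n
y_bar = sum(triglyceride) / n

# Sample standard deviations (n-1 in the denominator)
s_x = math.sqrt(sum((x - x_bar) ** 2 for x in cholesterol) / (n - 1))
s_y = math.sqrt(sum((y - y_bar) ** 2 for y in triglyceride) / (n - 1))

# Form 1: sum of products of deviations from the means
r_definition = sum(
    (x - x_bar) * (y - y_bar) for x, y in zip(cholesterol, triglyceride)
) / ((n - 1) * s_x * s_y)

# Form 2: sum of raw products minus n*Xbar*Ybar
r_shortcut = (
    sum(x * y for x, y in zip(cholesterol, triglyceride)) - n * x_bar * y_bar
) / ((n - 1) * s_x * s_y)

print(f"r (definition) = {r_definition:.4f}")
print(f"r (shortcut)   = {r_shortcut:.4f}")
```

Both forms agree to within floating-point error, which is the point of the check.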

d) At the 0.05 level of significance, test the null hypothesis that the population correlation ρ is equal to 0. What do you conclude?

HYPOTHESIS TEST
– Ho: ρ = 0  versus Ha: ρ ≠ 0.

SIGNIFICANCE LEVEL: Assume 0.05 significance level.

ASSUMPTIONS IN ORDER TO USE THE T-TEST
– We can use an approximate t-test if the following assumptions are met:
a) The pairs (Xi,Yi) are randomly selected from the population; and
b) both X and Y are normally distributed

CALCULATIONS
– The estimated standard error of r is approximately
sqrt[(1-r^2)/(n-2)]

– Use that to compute the test statistic

t = (r-0) / sqrt[(1-r^2)/(n-2)]

Another way to write t is this:
t = r*sqrt[(n-2)/(1-r^2)]

– Under Ho, this t statistic follows a t distribution with n-2 degrees of freedom.

t = 0.651*sqrt[(10-2)/(1-0.651^2)]
t = 0.651*sqrt[(8)/(0.576199)]
t = 0.651*3.726
t = 2.43

df = n-2 = 8.

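For a two-sided test at the 0.05 level with 8 degrees of freedom, the critical value from the t table is 2.306. The observed t is larger than 2.306, so:

p/2 < 0.025
p < 0.05

Because p < 0.05, I reject Ho and conclude that there is a significant positive linear relationship between cholesterol level and triglyceride level.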
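
The test can also be reproduced in code. The sketch below uses only the Python standard library: the helper names (`pearson_r`, `t_pdf`, `two_sided_p`) are my own, and the p-value is obtained by numerically integrating the t density with Simpson's rule, which is an assumption of this sketch rather than how a statistics package would do it. Since r is not rounded before use, t comes out near 2.42.

```python
import math

# Data from the table in the problem statement (mmol/l)
cholesterol = [5.12, 6.18, 6.77, 6.65, 6.36, 5.90, 5.48, 6.02, 10.34, 8.51]
triglyceride = [2.30, 2.54, 2.95, 3.77, 4.18, 5.31, 5.53, 8.83, 9.48, 14.20]


def pearson_r(xs, ys):
    """Pearson r; Sxy / sqrt(Sxx*Syy) equals the (n-1)*Sx*Sy form algebraically."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=10_000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3           # P(0 < T < |t|)
    return 2 * (0.5 - area)        # both tails beyond |t|


n = len(cholesterol)
r = pearson_r(cholesterol, triglyceride)
df = n - 2
t_stat = r * math.sqrt(df / (1 - r ** 2))
p_value = two_sided_p(t_stat, df)

print(f"r = {r:.3f}, t = {t_stat:.2f}, df = {df}, p = {p_value:.3f}")
print("Reject Ho at the 0.05 level" if p_value < 0.05 else "Fail to reject Ho")
```

The p-value lands a little below 0.05, in line with the table lookup above.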
