Problem 1
Simple Linear Regression (Formula Method)
In this problem we find the relation between exam score y and study hours x.
We then interpret the regression equation, the correlation coefficient R, and the coefficient of determination R squared.
This process is called a basic regression analysis.
Question

A teacher wants to check whether there is any connection between the number of hours a student prepares for an exam and the student's score. He records study hours and exam scores.
Fit the regression line of exam score y on study hours x. Interpret the regression equation, R, and R squared.

Independent variable
Study hours, x
Dependent variable
Exam score, y
Observed data
x (hours)   1   2   3   4   5   6   7   8
y (score)  50  54  57  61  62  67  69  73
Solution
Step 1 Calculate the sums (use a calculator in STAT mode)
Σx = 36 Σy = 493 Σx² = 204 Σxy = 2352 n = 8
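For readers without a calculator in STAT mode, the same sums can be checked with a few lines of Python. This is only a minimal sketch; the variable names (`hours`, `scores`, `sum_x`, and so on) are chosen here for illustration and do not come from the original problem.

```python
# Observed data from the problem statement.
hours = [1, 2, 3, 4, 5, 6, 7, 8]          # x, study hours
scores = [50, 54, 57, 61, 62, 67, 69, 73]  # y, exam scores

n = len(hours)
sum_x = sum(hours)
sum_y = sum(scores)
sum_x2 = sum(x * x for x in hours)
sum_xy = sum(x * y for x, y in zip(hours, scores))

print(n, sum_x, sum_y, sum_x2, sum_xy)  # 8 36 493 204 2352
```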
Step 2 Compute the regression coefficient byx (the slope of the regression line)
$$ b_{yx} = \frac{n\Sigma xy - (\Sigma x)(\Sigma y)}{n\Sigma x^{2} - (\Sigma x)^{2}} $$ $$ b_{yx} = \frac{8(2352) - 36(493)}{8(204) - 36^{2}} $$ $$ b_{yx} = \frac{18816 - 17748}{1632 - 1296} $$ $$ b_{yx} = \frac{1068}{336} = 3.179 $$
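The slope formula above translates directly into code. The sketch below is illustrative only; the function name `regression_slope` is an assumption made for this example.

```python
def regression_slope(x, y):
    """Slope b_yx of the regression line of y on x (raw-sums formula)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    # Numerator: 8(2352) - 36(493) = 1068; denominator: 8(204) - 36^2 = 336.
    return (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [50, 54, 57, 61, 62, 67, 69, 73]

b_yx = regression_slope(hours, scores)
print(round(b_yx, 3))  # 3.179
```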
Step 3 Find the mean values of x and y
$$ \bar{x} = \frac{\Sigma x}{n} = \frac{36}{8} = 4.500 $$ $$ \bar{y} = \frac{\Sigma y}{n} = \frac{493}{8} = 61.625 $$
Step 4 Substitute into the formula and simplify to get the regression equation
y − ȳ = byx(x − x̄) (centred form)
$$ y - 61.625 = 3.179(x - 4.500) $$ $$ y = 47.321 + 3.179x $$
Final regression equation (rounded to 3 decimals)
ŷ = 47.321 + 3.179x
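The intercept 47.321 comes from carrying the unrounded slope 1068/336 through the arithmetic: ȳ − byx·x̄ = 61.625 − (1068/336)(4.5) ≈ 47.3214. Substituting the already rounded slope 3.179 would give about 47.3195 instead. A short Python sketch of the whole fit is given below; the helper names `fit_line` and `predict` are assumptions made for this example, and the prediction at 5 hours is only an illustration of how the fitted line is used.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # The line passes through the point of means: intercept = y_bar - slope * x_bar.
    intercept = sum_y / n - slope * (sum_x / n)
    return intercept, slope

hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [50, 54, 57, 61, 62, 67, 69, 73]

intercept, slope = fit_line(hours, scores)
print(f"y_hat = {intercept:.3f} + {slope:.3f}x")  # y_hat = 47.321 + 3.179x

def predict(x_new):
    """Predicted exam score for x_new study hours."""
    return intercept + slope * x_new

print(round(predict(5), 3))  # 63.214 (the observed score at 5 hours was 62)
```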
Correlation coefficient and coefficient of determination
Karl Pearson’s formula
$$ R = \frac{n\Sigma xy - (\Sigma x)(\Sigma y)} {\sqrt{\left[n\Sigma x^{2} - (\Sigma x)^{2}\right]\left[n\Sigma y^{2} - (\Sigma y)^{2}\right]}} $$
First find Σy² = 30809 (50² + 54² + 57² + 61² + 62² + 67² + 69² + 73²)
$$ R = \frac{8(2352) - 36(493)} {\sqrt{\left[8(204) - 36^{2}\right]\left[8(30809) - 493^{2}\right]}} $$ $$ R = \frac{18816 - 17748} {\sqrt{\left[1632 - 1296\right]\left[246472 - 243049\right]}} $$ $$ R = \frac{1068}{\sqrt{(336)(3423)}} = 0.99586 $$
R² = (0.99586)² = 0.99174 which is 99.174 percent
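Pearson's formula can be checked numerically in the same way. The sketch below uses only the standard `math` module; the function name `pearson_r` is an assumption made for this example.

```python
import math

def pearson_r(x, y):
    """Karl Pearson's correlation coefficient from raw sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y                 # 1068
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2)      # 336
                            * (n * sum_y2 - sum_y ** 2))   # 3423
    return numerator / denominator

hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [50, 54, 57, 61, 62, 67, 69, 73]

r = pearson_r(hours, scores)
print(round(r, 5), round(r * r, 5))  # 0.99586 0.99174
```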
Interpretation

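The slope (regression coefficient) gives the average change in score for a one-hour increase in study time. Here it is about 3.179 marks per hour. That means that, for every extra hour a student studies, the predicted score increases by about 3.179 marks on average. The intercept, 47.321, is the predicted score for zero study hours; since this lies outside the observed range of 1 to 8 hours, it should be read with caution.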

The value of R is positive and very close to 1, so the relationship is strongly positive and close to linear.

The value of R squared gives the proportion of the variation in exam scores that is explained by the linear model on study time. Here it is about 99.174 percent.
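The "proportion of variation explained" reading of R squared can be made concrete by computing it as 1 − (residual sum of squares)/(total sum of squares). The sketch below is illustrative; it uses the deviation form of the slope formula, which is algebraically equivalent to the raw-sums formula used in the solution, and all variable names are assumptions made for this example.

```python
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [50, 54, 57, 61, 62, 67, 69, 73]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Deviation form of the least-squares slope and intercept.
s_xx = sum((x - mean_x) ** 2 for x in hours)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * x for x in hours]

# Total variation in y, and the part the fitted line leaves unexplained.
ss_total = sum((y - mean_y) ** 2 for y in scores)
ss_resid = sum((y - p) ** 2 for y, p in zip(scores, predicted))

r_squared = 1 - ss_resid / ss_total
print(round(r_squared, 5))  # 0.99174
```

For simple linear regression with an intercept, this value agrees with the square of Pearson's R.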
