CORRELATION
Correlation Coefficient R
Real data is paired, scattered, and not perfectly predictable — so we need a number to measure the relationship.
In many real-life situations, we collect paired data, where two quantities are observed together
for the same individual or object.
For example, in medical studies, researchers often record Age and Blood Pressure of patients.
From experience and prior knowledge, medical researchers know that these two variables are connected: as age increases,
blood pressure tends to increase.
However, this knowledge alone does not tell them how strongly the two variables are connected.
In such studies, the variable or parameter whose value we want to explain or predict is called the
dependent variable.
The variable that is believed to influence it is called the independent variable.
In this example:
- Age is the independent variable.
- Blood Pressure is the dependent variable, since its value is expected to change with age.
Identifying the dependent and independent variables helps us correctly interpret the relationship between the variables
and apply statistical tools such as Karl Pearson’s correlation coefficient in a meaningful way.
Consider the following paired data:
| Age (years) | Blood Pressure (mmHg) |
|---|---|
| 25 | 112 |
| 30 | 118 |
| 35 | 121 |
| 40 | 128 |
| 45 | 132 |
| 50 | 138 |
| 55 | 142 |
| 60 | 150 |
Here, Age is one variable and Blood Pressure is the other.
The data suggests a relationship, but simply looking at the table or even a graph is not enough
to measure the strength of this relationship accurately.
Why Correlation Is Needed
In such situations, we use Karl Pearson’s Correlation Coefficient.
Karl Pearson’s correlation coefficient, denoted by \(R\), measures
both the degree and the direction of the linear relationship between two variables.
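For reference, one common way of writing the coefficient for \(n\) paired observations is sketched here. The symbols \(x_i\) and \(y_i\) (the paired values, for example Age and Blood Pressure), \(\bar{x}\) and \(\bar{y}\) (their means), and \(n\) are notation assumed for this sketch and are not defined elsewhere in this section.

\[
R = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]

The numerator is positive when the two variables tend to sit above or below their means together, and negative when one tends to be above its mean while the other is below. The denominator rescales the result so that it does not depend on the units of measurement.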
Meaning of Karl Pearson’s Correlation Coefficient
The value of \(R\) always lies between \(-1\) and \(+1\), and it tells us:
Direction of relationship (given by the sign of \(R\))
- Positive relationship (\(R > 0\))
- Negative relationship (\(R < 0\))
- No linear relationship (\(R = 0\))
Strength of relationship (given by the magnitude of \(R\))
- Weak
- Moderate
- Strong
- Very strong
- Perfect (\(R = +1\) or \(R = -1\))
By looking at the numerical value of \(R\), we can clearly understand:
- whether the variables move in the same direction or opposite directions, and
- how closely they are related.
Thus, Karl Pearson’s correlation coefficient provides a clear, quantitative measure of how two variables are related in real-world data.
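As a minimal sketch of how \(R\) could be computed for the Age and Blood Pressure data, the Python below applies the usual deviation-from-the-mean form of Pearson's formula and reads the direction from the sign of the result. The function names `pearson_r` and `direction` are illustrative choices and do not come from the original text.

```python
from math import sqrt

# Paired data copied from the Age / Blood Pressure table.
ages = [25, 30, 35, 40, 45, 50, 55, 60]
blood_pressures = [112, 118, 121, 128, 132, 138, 142, 150]


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (numerator).
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (combined in the denominator).
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("R is undefined when one variable is constant")
    return s_xy / sqrt(s_xx * s_yy)


def direction(r):
    """Read the direction of the relationship from the sign of R."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none (no linear relationship)"


r = pearson_r(ages, blood_pressures)
print(f"R = {r:.3f}, direction: {direction(r)}")
```

For this table the sketch gives a value of about 0.997: the sign is positive, so blood pressure tends to rise with age, and the magnitude is close to 1, so the points lie very nearly on a straight line.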