CORRELATION

Correlation Coefficient R

Real data is paired, scattered, and not perfectly predictable — so we need a number to measure the relationship.

In many real-life situations, we collect paired data, where two quantities are observed together for the same individual or object.

For example, in medical studies, researchers often record the Age and Blood Pressure of patients. From experience and prior knowledge, they know that these two variables are connected: as age increases, blood pressure tends to increase.

However, they do not know how strongly the two variables are connected.

In such studies, the variable or parameter whose value we want to explain or predict is called the dependent variable. The variable that is believed to influence it is called the independent variable.

In this example:

  • Age is the independent variable.
  • Blood Pressure is the dependent variable, since its value is expected to change with age.

Identifying the dependent and independent variables helps us correctly interpret the relationship between the variables and apply statistical tools such as Karl Pearson’s correlation coefficient in a meaningful way.

Consider the following paired data:

Age (years)    Blood Pressure (mmHg)
25             112
30             118
35             121
40             128
45             132
50             138
55             142
60             150

Here, Age is one variable and Blood Pressure is the other. The data suggests a relationship, but simply looking at the table or even a graph is not enough to measure the strength of this relationship accurately.

Why Correlation Is Needed

In such situations, we use Karl Pearson’s Correlation Coefficient.

Karl Pearson’s correlation coefficient, denoted by \(R\), helps us measure both the degree and the direction of the linear relationship between two variables.

Meaning of Karl Pearson’s Correlation Coefficient

The value of \(R\) tells us:

Direction of relationship

  • Positive relationship
  • Negative relationship
  • No relationship

Strength of relationship

  • Weak
  • Moderate
  • Strong
  • Very strong
  • Perfect

By looking at the numerical value of \(R\), we can clearly understand:

  • whether the variables move in the same direction or opposite directions, and
  • how closely they are related.

Thus, Karl Pearson’s correlation coefficient provides a clear, quantitative measure of how two variables are related in real-world data.

Karl Pearson’s correlation coefficient (deviation-from-mean form), where \(\bar{x}\) and \(\bar{y}\) are the means of the \(x\) and \(y\) values:

\[ R = \frac{\sum (x - \bar{x})(y - \bar{y})} {\sqrt{\sum (x - \bar{x})^2 \; \sum (y - \bar{y})^2}} \]

Simplified (Computational) Formula

\[ R = \frac{n\sum xy - (\sum x)(\sum y)} {\sqrt{\left[n\sum x^2 - (\sum x)^2\right] \left[n\sum y^2 - (\sum y)^2\right]}} \]
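
As a minimal sketch, the two formulas above can be checked against the Age and Blood Pressure table in Python. The function names `pearson_deviation` and `pearson_computational` and the list names `ages` and `pressures` are illustrative assumptions, not part of the original material. Both functions should return the same value up to floating-point rounding.

```python
import math

# Paired data from the Age / Blood Pressure table.
ages = [25, 30, 35, 40, 45, 50, 55, 60]
pressures = [112, 118, 121, 128, 132, 138, 142, 150]


def pearson_deviation(xs, ys):
    """R from the deviation-from-mean form."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def pearson_computational(xs, ys):
    """R from the simplified (computational) form using raw sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r_deviation = pearson_deviation(ages, pressures)
r_computational = pearson_computational(ages, pressures)
print(round(r_deviation, 3), round(r_computational, 3))
```

For this table both forms give R ≈ 0.997, which indicates a very strong positive relationship between Age and Blood Pressure. Neither function guards against a variable with no variation (all values equal), where the denominator would be zero.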

CORRELATION SCALE

Understanding R Value Segments

The full range of R from −1 to +1 can be divided into meaningful segments that show both direction and strength clearly.

  • R = −1 to −0.7: Very Strong Negative
  • R = −0.7 to −0.5: Strong Negative
  • R = −0.5 to −0.3: Moderate Negative
  • R = −0.3 to 0: Weak Negative
  • R = 0 to +0.3: Weak Positive
  • R = +0.3 to +0.5: Moderate Positive
  • R = +0.5 to +0.7: Strong Positive
  • R = +0.7 to +1: Very Strong Positive
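
The segments above can be turned into a small lookup, sketched here in Python. The function name `classify_r` is hypothetical. The segments as listed share their end points, so this sketch makes two assumptions that the original does not state: a boundary value such as 0.7 is assigned to the stronger segment, and the exact values 0 and ±1 are labelled "No correlation" and "Perfect".

```python
def classify_r(r):
    """Map a correlation coefficient to a strength-and-direction label.

    Assumptions (not stated in the source scale): boundary values go to
    the stronger segment, r == 0 is "No correlation", |r| == 1 is "Perfect".
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("R must lie between -1 and +1")
    if r == 0:
        return "No correlation"

    direction = "Positive" if r > 0 else "Negative"
    size = abs(r)

    if size == 1.0:
        strength = "Perfect"
    elif size >= 0.7:
        strength = "Very Strong"
    elif size >= 0.5:
        strength = "Strong"
    elif size >= 0.3:
        strength = "Moderate"
    else:
        strength = "Weak"
    return f"{strength} {direction}"


print(classify_r(0.997))  # Very Strong Positive
print(classify_r(-0.4))   # Moderate Negative
```

A different convention for the shared end points is equally defensible; only the comparison operators would change.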

HEAT SCALE OF R

Strength & Direction Heat Map

[Figure: heat map of R with scale marks at −1.0, −0.7, −0.5, −0.3, 0.0, +0.3, +0.5, +0.7, +1.0]