Scatter Plot and Karl Pearson's Correlation Coefficient

Subject: Business Statistics

Overview

Correlation analysis is a statistical tool used to determine the relationship between variables. It is one of the most widely used methods in applied statistics.

Scatter Plot

A scatter plot is a graph of paired observations on two variables. The values of one variable are plotted on the X-axis and the values of the other on the Y-axis. The diagram produced by plotting these pairs of X and Y values is called a scatter diagram. The two variables are said to be correlated if the plotted dots show an upward or downward trend. The relationship between the variables is strong if the dots lie close together while following a rising or falling trend.
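As a rough illustration, the sketch below draws a text-only scatter diagram for the five (X, Y) pairs used in the worked example of this document. The function name `ascii_scatter` and the grid size are assumptions made for this sketch; in practice a plotting library would normally be used, and the code assumes that each variable takes at least two distinct values.

```python
def ascii_scatter(xs, ys, width=30, height=10):
    """Return a rough text scatter diagram of the (x, y) pairs.

    Assumes xs and ys each contain at least two distinct values.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column (X-axis) and a row (Y-axis).
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        # Row 0 of the grid is the top line, so flip the vertical axis.
        grid[height - 1 - row][col] = "*"
    return "\n".join("".join(line).rstrip() for line in grid)


X = [2, 3, 4, 5, 6]
Y = [7, 9, 10, 14, 15]
print(ascii_scatter(X, Y))
```

The stars run from the bottom-left to the top-right of the grid, which is the upward trend that suggests a positive relationship.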

Correlation:

A statistical tool called correlation is used to assess how closely two or more variables are related.

Types of Correlation

  • Positive correlation
  • Negative correlation
  • Linear correlation
  • Non-linear correlation
  • Partial correlation
  • Multiple correlation

Measures of Correlation

  • Graphic method or scatter diagram method
  • Karl Pearson's correlation coefficient
  • Spearman's rank correlation coefficient

Graphic method or scatter diagram method:

It is the simplest technique for examining the correlation between two variables. The values of one variable are plotted on the X-axis and those of the other on the Y-axis. The diagram created by plotting these pairs of X and Y values is called a scatter diagram. The two variables are said to be correlated if the plotted dots show an upward or downward trend. The relationship between the variables is strong if the dots are closely spaced while following a rising or falling trend.

Karl Pearson's Correlation Coefficient

Karl Pearson's correlation coefficient measures the degree of linear relationship between two variables. The coefficient between X and Y is usually denoted by rxy, r(X,Y), or simply r. It is also known as the simple correlation coefficient or the product moment correlation coefficient. It is defined as:

\( r = \frac{\mathrm{COV}(X,Y)}{\sqrt{\mathrm{var}(X)}\,\sqrt{\mathrm{var}(Y)}} \)

where COV(X,Y) is read as the covariance between X and Y. It measures the simultaneous change in the two variables.

and \( \mathrm{COV}(X,Y) = \frac{1}{n}\sum (X - \overline{X})(Y - \overline{Y}) \)

\( = \frac{1}{n}\sum XY - \overline{X}\,\overline{Y} \)
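The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the function names are chosen for this sketch, the variance is taken as the covariance of a variable with itself (divisor n), and the data are the five pairs from the worked example in this document.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Deviation form: (1/n) * sum((X - mean_X) * (Y - mean_Y))."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def covariance_shortcut(xs, ys):
    """Equivalent form: (1/n) * sum(X * Y) - mean_X * mean_Y."""
    return sum(x * y for x, y in zip(xs, ys)) / len(xs) - mean(xs) * mean(ys)


def pearson_r(xs, ys):
    """r = COV(X, Y) / (sqrt(var(X)) * sqrt(var(Y))), with var(X) = COV(X, X)."""
    return covariance(xs, ys) / (sqrt(covariance(xs, xs)) * sqrt(covariance(ys, ys)))


X = [2, 3, 4, 5, 6]
Y = [7, 9, 10, 14, 15]

# Both forms of the covariance agree up to floating-point rounding.
print(covariance(X, Y), covariance_shortcut(X, Y))
print(round(pearson_r(X, Y), 2))  # 0.98
```

Both covariance forms give about 4.2 for this data, and the resulting r matches the value obtained by hand in the worked example.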

Properties of Karl Pearson's Correlation Coefficient

  • The correlation coefficient (r) lies between -1 and +1.
  • The correlation coefficient (r) is the geometric mean of the two regression coefficients, i.e. \( r = \pm\sqrt{b_{yx} \cdot b_{xy}} \), where byx = regression coefficient of the regression line of Y on X and bxy = regression coefficient of the regression line of X on Y. r takes the same sign as the regression coefficients.
  • The correlation coefficient is independent of change of origin as well as scale.
  • The correlation coefficient is a relative statistical measure, i.e. it has no unit.
  • Two independent variables are uncorrelated but the converse may not be true i.e. uncorrelated variables may not be independent.
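The properties above can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names are invented for this example, the change of scale uses positive factors, and the pair Z, W (with W equal to the square of Z) is an illustrative data set that does not come from the source.

```python
from math import sqrt, isclose


def deviation_sums(xs, ys):
    """Return (sum of xy, sum of x^2, sum of y^2) for deviations from the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs, ys):
    sxy, sxx, syy = deviation_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def regression_slopes(xs, ys):
    """Return (byx, bxy): slopes of the regression lines of Y on X and of X on Y."""
    sxy, sxx, syy = deviation_sums(xs, ys)
    return sxy / sxx, sxy / syy


X = [2, 3, 4, 5, 6]
Y = [7, 9, 10, 14, 15]
r = pearson_r(X, Y)

# Property: r lies between -1 and +1.
print(-1 <= r <= 1)

# Property: r is the geometric mean of byx and bxy, carrying their common sign.
byx, bxy = regression_slopes(X, Y)
sign = 1 if byx >= 0 else -1
print(isclose(r, sign * sqrt(byx * bxy)))

# Property: r is unchanged by a change of origin and (positive) scale.
U = [(x - 4) / 2 for x in X]
V = [(y - 10) / 5 for y in Y]
print(isclose(r, pearson_r(U, V)))

# Property: uncorrelated does not imply independent.
# W is completely determined by Z, yet r is zero (illustrative data).
Z = [-2, -1, 0, 1, 2]
W = [z * z for z in Z]
print(pearson_r(Z, W))
```

The last case shows why r only measures linear association: the relationship between Z and W is exact but not linear, so the coefficient does not detect it.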

Example:

Calculate the coefficient of correlation for the following data:

X 2 3 4 5 6
Y 7 9 10 14 15

Solution:

Taking deviations from the means, \( x = X - \overline{X} \) with \( \overline{X} = 4 \) and \( y = Y - \overline{Y} \) with \( \overline{Y} = 11 \):

X      Y      x      x²     y      y²     xy
2      7      -2     4      -4     16     8
3      9      -1     1      -2     4      2
4      10     0      0      -1     1      0
5      14     1      1      3      9      3
6      15     2      4      4      16     8

Totals: \( \sum X = 20 \), \( \sum Y = 55 \), \( \sum x = 0 \), \( \sum x^2 = 10 \), \( \sum y = 0 \), \( \sum y^2 = 46 \), \( \sum xy = 21 \)

We have, \( \overline{X} = \frac{\sum X}{n} = \frac{20}{5} = 4 \)

\( \overline{Y} = \frac{\sum Y}{n} = \frac{55}{5} = 11 \)

Now, since the factors of 1/n in the covariance and the variances cancel, the correlation coefficient is \( r = \frac{\sum xy}{\sqrt{\sum x^2}\,\sqrt{\sum y^2}} = \frac{21}{\sqrt{10}\,\sqrt{46}} = 0.98 \)

Therefore, r = 0.98. This shows that there is an almost perfect positive correlation between X and Y.
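The hand calculation can be cross-checked with a short script. This is a minimal sketch that rebuilds the deviation columns of the solution; the variable names are chosen for this illustration.

```python
from math import sqrt

X = [2, 3, 4, 5, 6]
Y = [7, 9, 10, 14, 15]
n = len(X)

mean_x = sum(X) / n  # 4.0
mean_y = sum(Y) / n  # 11.0

# Deviations from the means: x = X - mean_x, y = Y - mean_y.
x = [xi - mean_x for xi in X]
y = [yi - mean_y for yi in Y]

sum_x2 = sum(d * d for d in x)
sum_y2 = sum(d * d for d in y)
sum_xy = sum(dx * dy for dx, dy in zip(x, y))

r = sum_xy / (sqrt(sum_x2) * sqrt(sum_y2))

print(sum_x2, sum_y2, sum_xy)  # 10.0 46.0 21.0
print(round(r, 2))             # 0.98
```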

Things to remember
  • Understand both the value and the limitations of correlation analysis.
  • Understand when it is appropriate to apply Pearson's coefficient of correlation.
  • Identify the relationship between two variables that a scatter diagram suggests.

© 2021 Saralmind. All Rights Reserved.