Spearman's correlation coefficient is a nonparametric measure of association. It ranks the values of each of two variables, and then uses the differences between the paired ranks to measure how strongly the two variables are correlated.
Spearman Correlation Coefficient
Spearman's correlation coefficient (ρ) measures the strength and direction of the monotonic relationship between two variables, X and Y. The coefficient is symmetric, so it does not matter which variable is treated as the independent one. Its value is interpreted as follows:
- 0 < ρ <= 1: When X increases, Y tends to increase.
- -1 <= ρ < 0: When X increases, Y tends to decrease.
- ρ = 0: When X increases, Y shows no monotonic trend.
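To make the sign of ρ concrete, here is a minimal sketch using SciPy's spearmanr(). The toy data (x, y_up, y_down) are made up for illustration and are not part of the dataset used later. Because ρ depends only on ranks, a strictly increasing but non-linear relationship should still give a value very close to 1, and a strictly decreasing one a value very close to -1.

```python
from scipy.stats import spearmanr

# Illustrative toy data (an assumption, not from the article's dataset)
x = [1, 2, 3, 4, 5]
y_up = [v ** 3 for v in x]   # strictly increasing, but not linear
y_down = [-v for v in x]     # strictly decreasing

rho_up, _ = spearmanr(x, y_up)
rho_down, _ = spearmanr(x, y_down)

print('Increasing relationship:', rho_up)
print('Decreasing relationship:', rho_down)
```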
We define the following hypotheses:
- Null hypothesis (H0): ρ is equal to 0.
- Alternative hypothesis (H1): ρ is not equal to 0.
Next, we calculate the correlation (ρ) and its p-value, and choose a significance level (α), which is commonly 0.05. When the p-value is greater than or equal to α, we fail to reject H0. Otherwise, we reject H0 in favor of H1. Note that failing to reject H0 does not prove that H0 is true.
Ranking
The following is part of the data obtained from Women Entrepreneurship and Labor Force.
No. | Country | Women Entrepreneurship Index (X) | Entrepreneurship Index (Y) |
---|---|---|---|
1 | Germany | 63.6 | 67.4 |
2 | Greece | 43.0 | 42.0 |
3 | Ireland | 64.3 | 65.3 |
4 | Italy | 51.4 | 41.3 |
5 | Latvia | 56.6 | 54.5 |
6 | Lithuania | 58.5 | 54.6 |
7 | Netherlands | 69.3 | 66.5 |
8 | Slovakia | 54.8 | 45.4 |
9 | Slovenia | 55.9 | 53.1 |
10 | Spain | 52.5 | 49.6 |
First, sort the data by the Women Entrepreneurship Index and record each country's position as Rank(X), with 1 assigned to the smallest value. Then sort the data by the Entrepreneurship Index and record each country's position as Rank(Y). After that, compute the rank difference for each country, di = Rank(Xi) - Rank(Yi). Finally, sum up all di^2.
Country | Index (X) | Index (Y) | Rank (X) | Rank (Y) | di | di^2 |
---|---|---|---|---|---|---|
Germany | 63.6 | 67.4 | 8 | 10 | -2 | 4 |
Greece | 43.0 | 42.0 | 1 | 2 | -1 | 1 |
Ireland | 64.3 | 65.3 | 9 | 8 | 1 | 1 |
Italy | 51.4 | 41.3 | 2 | 1 | 1 | 1 |
Latvia | 56.6 | 54.5 | 6 | 6 | 0 | 0 |
Lithuania | 58.5 | 54.6 | 7 | 7 | 0 | 0 |
Netherlands | 69.3 | 66.5 | 10 | 9 | 1 | 1 |
Slovakia | 54.8 | 45.4 | 4 | 3 | 1 | 1 |
Slovenia | 55.9 | 53.1 | 5 | 5 | 0 | 0 |
Spain | 52.5 | 49.6 | 3 | 4 | -1 | 1 |
Sum | | | | | | 10 |
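The ranking step in the table above can be reproduced with a short sketch. It assumes scipy.stats.rankdata, which the article itself does not use; by default it assigns rank 1 to the smallest value and averages the ranks of tied values (there are no ties in this data). The names rank_x, rank_y, d, and sum_d2 are illustrative.

```python
from scipy.stats import rankdata

# Women Entrepreneurship Index (X) and Entrepreneurship Index (Y), in table order
x = [63.6, 43.0, 64.3, 51.4, 56.6, 58.5, 69.3, 54.8, 55.9, 52.5]
y = [67.4, 42.0, 65.3, 41.3, 54.5, 54.6, 66.5, 45.4, 53.1, 49.6]

rank_x = rankdata(x)   # rank 1 = smallest value
rank_y = rankdata(y)

d = rank_x - rank_y             # rank difference for each country
sum_d2 = float((d ** 2).sum())  # sum of squared rank differences

print('Rank(X):', rank_x.astype(int).tolist())
print('Rank(Y):', rank_y.astype(int).tolist())
print('Sum of di^2:', sum_d2)
```

The printed ranks should match the Rank (X) and Rank (Y) columns, and the sum should match the last row of the table.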
Calculate Correlation ( ρ )
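Use the following formula, which applies when there are no tied ranks, to calculate the correlation (ρ):

$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$

Here n is the number of observations and di is the rank difference of the i-th pair. With n = 10 and a sum of di^2 equal to 10 from the above table, we obtain ρ = 1 - (6 x 10) / (10 x 99) ≈ 0.9394.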
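As a quick check of the arithmetic, the standard no-ties form of Spearman's formula can be evaluated directly from the sum of squared rank differences. This is only a sketch: the helper name spearman_rho_from_d2 is hypothetical, and the shortcut formula is valid only when no values are tied.

```python
def spearman_rho_from_d2(sum_d2, n):
    """Spearman's rho from the sum of squared rank differences (no ties).

    Implements rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)).
    """
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Values taken from the ranking table: sum of di^2 = 10, n = 10 countries
rho = spearman_rho_from_d2(10, 10)
print(round(rho, 4))  # 0.9394
```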
Python SciPy
We can use SciPy’s spearmanr() to calculate the correlation (ρ) and the p-value.
correlation, p = spearmanr(x, y)
- x, y: Two samples. The type is array_like.
- correlation: correlation ρ. The type is float.
- p: p-value. The type is float.
Example
    from scipy.stats import spearmanr

    # Women Entrepreneurship Index (X) and Entrepreneurship Index (Y)
    x = [63.6, 43.0, 64.3, 51.4, 56.6, 58.5, 69.3, 54.8, 55.9, 52.5]
    y = [67.4, 42.0, 65.3, 41.3, 54.5, 54.6, 66.5, 45.4, 53.1, 49.6]

    # Spearman's rho and its p-value
    correlation, p = spearmanr(x, y)
    print('Correlation:', correlation)
    print('p-value:', p)

    # Compare the p-value with the 0.05 significance level
    if p >= 0.05:
        print('Fail to reject H0')
    else:
        print('H0 is rejected')
The output is as follows.
    Correlation: 0.9393939393939393
    p-value: 5.484052998513666e-05
    H0 is rejected