>>Return to Tell Me About Statistics!

The single correlation coefficient is explained in vol. 46. Let us consider how to numerically express the degree of correlation from the data.
As a specific example, Table 1 shows the height and weight data of ten students.

[Table 1]

The average height and weight are 150 cm and 50 kg, respectively. Figure 1 is a correlation diagram (scatter plot) of these data, with the average height drawn as a horizontal line and the average weight as a vertical line.

[Figure 1]

*The distribution of points exhibits a linear tendency upward to the right.

The four areas divided by the average lines were designated as I–IV. If the variables x and y are unrelated, the points are spread roughly evenly over areas I–IV. If there is a correlation between x and y and y tends to increase as x increases, then there will be more points in areas I and III and fewer points in areas II and IV. Conversely, if y tends to decrease as x increases, there will be more points in areas II and IV and fewer in areas I and III.

In this case, there are many points in areas I and III and only one point each in areas II and IV; therefore, it can be inferred that there is a strong positive correlation between height and weight.
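The count of points per area can also be checked with a short script. The following is only a minimal sketch in Python: the ten (weight, height) pairs are hypothetical values chosen so that the averages are 50 kg and 150 cm, and they are not the data in Table 1. The function name quadrant_counts is likewise an illustrative assumption.

```python
from collections import Counter

# Hypothetical (weight in kg, height in cm) pairs for students A-J.
# These are NOT the values in Table 1; they only share the same averages.
DATA = [(42, 141), (45, 146), (47, 147), (49, 149), (48, 152),
        (51, 148), (52, 153), (53, 151), (55, 155), (58, 158)]


def quadrant_counts(points):
    """Count the points falling in areas I-IV around the two average lines."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    counts = Counter()
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y  # deviations from the averages
        if dx == 0 or dy == 0:
            counts["on an average line"] += 1
        elif dx > 0 and dy > 0:
            counts["I"] += 1    # upper right
        elif dx < 0 and dy > 0:
            counts["II"] += 1   # upper left
        elif dx < 0 and dy < 0:
            counts["III"] += 1  # lower left
        else:
            counts["IV"] += 1   # lower right
    return counts


counts = quadrant_counts(DATA)
for area in ("I", "II", "III", "IV"):
    print(area, counts[area])
```

With these made-up values, areas I and III hold four points each and areas II and IV hold one point each, which mirrors the pattern described for Figure 1.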

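A deviation (the measured value minus the average) indicates whether a data point lies above or below the average, or to the right or left of it. In areas I and III the two deviations of a point have the same sign, so their product is positive; in areas II and IV they have opposite signs, so their product is negative. The correlation coefficient is therefore calculated from these deviations.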

How to calculate a single correlation coefficient

Now, let us actually calculate the single correlation coefficient.

[Table 2]

【Calculation procedure】
[1] Find the deviations (the measured value minus the average value) for the height and weight data and enter them in columns (3) and (4) of the table.
[2] Square the numbers in column (3) and write them in column (5).
[3] Similarly, square the numbers in column (4) and enter them in column (6).
[4] Find the sum of the numbers in column (5). (This is called the sum of squared deviations of height y and is expressed as “Syy.”)
[5] Similarly, find the sum of the numbers in column (6). (This is called the sum of squared deviations of weight x and is expressed as “Sxx.”)
[6] Multiply the numbers in columns (3) and (4) and write the products in column (7). (The sum of the numbers in column (7) is called the sum of products and is expressed as “Sxy.”)
[7] Find the single correlation coefficient r using the following formula:

\[ r = \frac{S_{xy}}{\sqrt{S_{xx}\, S_{yy}}} \]

In this way, we find that the single correlation coefficient is 0.916.
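The calculation procedure above can be sketched in Python as follows. This is a minimal illustration, assuming that the formula in step [7] is the usual one, r = Sxy / sqrt(Sxx × Syy). The data are hypothetical values, not those of Table 2, so the result will not be 0.916; the function name single_correlation is also an assumption made for this sketch.

```python
import math

# Hypothetical data for students A-J (NOT the values in Table 2).
WEIGHTS = [42, 45, 47, 49, 48, 51, 52, 53, 55, 58]            # x, in kg
HEIGHTS = [141, 146, 147, 149, 152, 148, 153, 151, 155, 158]  # y, in cm


def single_correlation(xs, ys):
    """Follow steps [1]-[7]: deviations, squares, sums, then the ratio."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Step [1]: deviations (measured value minus average value).
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    # Steps [2]-[5]: sums of squared deviations.
    s_xx = sum(d * d for d in dev_x)
    s_yy = sum(d * d for d in dev_y)
    # Step [6]: sum of products of the paired deviations.
    s_xy = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    # Step [7]: the single correlation coefficient.
    return s_xy / math.sqrt(s_xx * s_yy)


r = single_correlation(WEIGHTS, HEIGHTS)
print(round(r, 3))
```

For these hypothetical values, Sxx = 206, Syy = 214, and Sxy = 194, so r is about 0.924.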

Points to note about single correlation coefficients

In the correlation diagram shown in Figure 1, the data from two people, E and F, are located in areas II and IV, and those from the other eight people are in areas I and III. As shown in Table 2, the products of deviations for E and F are negative, whereas those for the other eight people are positive. The sum of products Sxy is thus a mixture of negative and positive values; the more these cancel each other out and the closer Sxy is to 0, the closer the correlation coefficient is to 0.
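How the negative products pull the sum of products toward 0 can be seen in a small sketch. The Python code below again uses hypothetical data (not the values in the tables), arranged so that students E and F fall in areas II and IV; the labels and the function name deviation_products are assumptions made for illustration.

```python
LABELS = "ABCDEFGHIJ"
# Hypothetical (weight in kg, height in cm) pairs, NOT the tabulated values.
DATA = [(42, 141), (45, 146), (47, 147), (49, 149), (48, 152),
        (51, 148), (52, 153), (53, 151), (55, 155), (58, 158)]


def deviation_products(points):
    """Return the product of the two deviations for each point."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return [(x - mean_x) * (y - mean_y) for x, y in points]


products = deviation_products(DATA)
# Points in areas II and IV have a negative product.
negative = {label: p for label, p in zip(LABELS, products) if p < 0}
positive_sum = sum(p for p in products if p > 0)
negative_sum = sum(p for p in products if p < 0)

print("negative products:", negative)
print("sum of positive products:", positive_sum)
print("sum of negative products:", negative_sum)
print("Sxy:", positive_sum + negative_sum)
```

Here only E and F contribute negative products, so Sxy stays close to the sum of the positive products. If more points lay in areas II and IV, the negative part would grow and Sxy, and with it the correlation coefficient, would move toward 0.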

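Note that if every value of one variable is the same, such as the “weight” numbers in Table 3, the single correlation coefficient cannot be determined: all deviations of that variable are 0, so its sum of squared deviations is 0 and the formula would require division by zero.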

[Table 3]
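The constant-value case can be guarded against in code. The sketch below is a hypothetical Python helper (the name single_correlation_or_none and the data are assumptions, not the values in Table 3) that returns None instead of dividing by zero when either sum of squared deviations is 0.

```python
import math


def single_correlation_or_none(xs, ys):
    """Return r, or None when one variable has no variation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if s_xx == 0 or s_yy == 0:
        # The denominator sqrt(Sxx * Syy) would be 0: r is undefined.
        return None
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data in which every weight is the same (NOT Table 3 itself).
weights = [50] * 10
heights = [141, 146, 147, 149, 152, 148, 153, 151, 155, 158]

print(single_correlation_or_none(weights, heights))
```

The exact comparison with 0 is adequate for integer data like these; for non-integer measurements, rounding error may make a small tolerance a safer choice.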
