# 4. Reliability

The reliability of a questionnaire provides an indication of the precision of an instrument. The concept concerns the replicability of the measured results: to what extent do we find the same results when we use an instrument for a second (or third) time, or to what extent are the results of a comparable set of items the same?

## 4.1 Internal consistency

The reliability of a questionnaire can be determined in several ways. The same questionnaire can be administered to the same respondents twice, after which the two measurements are compared (test-retest reliability). Alternatively, scores on one half of the questionnaire can be compared with scores on the other half (split-half reliability). The most widely used method, and one that suits a questionnaire like the WPI, is Cronbach’s alpha (α-coefficient), a measure of internal consistency (Nunnally, 1978). With α greater than 0.85, one can speak of a reasonably homogeneous group of items (Green, Salkind, & Akey, 2002).
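As an illustration of the internal-consistency measure described above, the following is a minimal sketch of Cronbach’s alpha in its standard textbook form, α = k/(k − 1) · (1 − Σ item variances / variance of the total score). The function name and the small response matrix are hypothetical and serve only as an example; they are not WPI data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of the total (summed) score per respondent.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents x 3 items (illustration only).
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

The sketch assumes complete data (no missing item responses) and at least two items and two respondents.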

To assess the reliability of the WPI, the reliability and the generalizability of the factors were calculated, as well as the internal consistency (Cronbach’s alpha) of every scale. This was done separately for the Advice and Selection groups. The reliability of the factor scores was calculated with the formula for the stratified alpha (see Nunnally, p. 248). The generalizability was calculated with the formula for alpha, taking the scale scores rather than the items as the units of analysis (see Snijders, Tellegen & Laros, 1988). The generalizability indicates the expected correlation with a factor score based on a different, equally large sample of scale scores from the domain of that specific factor.
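The source refers to Nunnally for the stratified alpha and does not reproduce the formula. A minimal sketch of the form usually given in textbooks is shown below: one minus the summed error variance of the component scales, divided by the variance of the composite. The function name, argument names, and example numbers are hypothetical, and the manual's own computation (for instance, any weighting of scales within a factor) may differ.

```python
def stratified_alpha(scale_variances, scale_alphas, composite_variance):
    """Stratified alpha of a composite built from several scales.

    scale_variances:    variance of each component scale
    scale_alphas:       Cronbach's alpha of each component scale
    composite_variance: variance of the summed composite score
    """
    # Error variance of each scale is variance * (1 - reliability).
    error_variance = sum(
        v * (1 - a) for v, a in zip(scale_variances, scale_alphas)
    )
    return 1 - error_variance / composite_variance


# Hypothetical example: two scales, each with variance 4 and alpha .90,
# correlating .50, so the composite variance is 4 + 4 + 2 * 2 = 12.
example = stratified_alpha([4.0, 4.0], [0.90, 0.90], 12.0)
print(round(example, 2))
```

Because the composite variance includes the covariances between the scales, a composite of positively correlated scales can be more reliable than any of its components, which is in line with the factor reliabilities exceeding the scale alphas.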

Table 4.1. Reliability and generalizability of factors for both norm groups

| Factor | Advice group (N = 712): Reliability / Generalizability | Advice group (N = 712): SEM / SEE | Selection group (N = 369): Reliability / Generalizability | Selection group (N = 369): SEM / SEE |
| --- | --- | --- | --- | --- |
| Influence | .96 / .80 | 1.00 / .89 | .95 / .80 | 1.00 / .89 |
| Sociability | .97 / .83 | .91 / .82 | .96 / .81 | .97 / .87 |
| Exuberance | .97 / .82 | .94 / .85 | .96 / .81 | .97 / .87 |
| Structure | .96 / .68 | 1.37 / 1.13 | .95 / .70 | 1.13 / 1.10 |
| Stability | .96 / .82 | .94 / .85 | .96 / .81 | .97 / .87 |
| Mean | .96 / .79 | 1.03 / .91 | .96 / .79 | 1.01 / .92 |

Note. Reliability is the stratified alpha; generalizability is Cronbach’s alpha. SEM = standard error of measurement; SEE = standard error of estimate.

The reliability and generalizability of the factors in both groups are shown in Table 4.1. In addition, the standard error of measurement and standard error of estimate are shown.
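The section does not state how the standard errors were derived. As a plausibility check only, the Advice-group entries in Table 4.1 are numerically consistent with a scale standard deviation of 2 (the sten convention) and the two expressions in the sketch below, applied to the generalizability coefficient. Both expressions and the function names are assumptions inferred from the tabulated numbers, not formulas taken from the source.

```python
import math

STEN_SD = 2.0  # assumption: scores are expressed on a sten scale (SD = 2)


def sem_column(r, sd=STEN_SD):
    """Value consistent with the first (SEM) entry: sd * sqrt((1 - r) / r)."""
    return sd * math.sqrt((1 - r) / r)


def see_column(r, sd=STEN_SD):
    """Value consistent with the second (SEE) entry: sd * sqrt(1 - r)."""
    return sd * math.sqrt(1 - r)


# Advice-group generalizability coefficients as listed in Table 4.1.
for factor, g in [("Influence", 0.80), ("Sociability", 0.83), ("Structure", 0.68)]:
    print(factor, round(sem_column(g), 2), round(see_column(g), 2))
```

Comparing the printed pairs with the SEM / SEE column for the same factors gives a quick way to spot transcription errors in the table.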

The reliability of the scales for both groups is shown in Table 4.2. The questionnaire consists predominantly of homogeneous, reliable and stable scales. At the factor level, the reliability is very high (0.95 or higher), and there are virtually no differences between the factors in this respect. The Advice group tends to have slightly higher alphas than the Selection group, at the scale level as well as at the factor level, but the differences are negligible.

Table 4.2. Reliability of the scales for both norm groups

| Scale | Number of items | Advice group (N = 712): Alpha | Advice group (N = 712): SEM / SEE | Selection group (N = 369): Alpha | Selection group (N = 369): SEM / SEE |
| --- | --- | --- | --- | --- | --- |
| Status | 10 | .90 | .67 / .63 | .88 | .74 / .69 |
| Dominance | 12 | .92 | .59 / .57 | .88 | .74 / .69 |
| Competition | 9 | .88 | .74 / .69 | .89 | .70 / .66 |
| Self-presentation | 8 | .90 | .67 / .63 | .87 | .77 / .72 |
| Need for contact | 10 | .87 | .77 / .72 | .86 | .81 / .75 |
| Leisure contact | 12 | .91 | .63 / .60 | .89 | .70 / .66 |
| Self-disclosure | 10 | .90 | .67 / .63 | .88 | .74 / .69 |
| Trust | 10 | .88 | .74 / .69 | .85 | .84 / .77 |
| Friendliness | 12 | .91 | .63 / .60 | .89 | .70 / .66 |
| Attentiveness | 19 | .94 | .51 / .49 | .93 | .55 / .53 |
| Energy | 12 | .89 | .70 / .66 | .86 | .81 / .75 |
| Personal growth | 10 | .82 | .94 / .85 | .78 | 1.06 / .94 |
| Perseverance | 13 | .88 | .74 / .69 | .86 | .81 / .75 |
| Adaptability | 7 | .86 | .81 / .75 | .86 | .81 / .75 |
| Originality | 10 | .93 | .55 / .53 | .92 | .59 / .57 |
| Independence | 11 | .87 | .77 / .72 | .83 | .91 / .82 |
| Orderliness | 10 | .90 | .67 / .63 | .88 | .74 / .69 |
| Precision | 10 | .91 | .63 / .60 | .89 | .70 / .66 |
| Regularity | 10 | .92 | .59 / .57 | .89 | .70 / .66 |
| Conformity | 12 | .91 | .63 / .60 | .89 | .70 / .66 |
| Decisiveness | 11 | .90 | .67 / .63 | .89 | .70 / .66 |
| Self-confidence | 12 | .92 | .59 / .57 | .89 | .70 / .66 |
| Optimism | 14 | .90 | .67 / .63 | .87 | .77 / .72 |
| Frustration-tolerance | 11 | .90 | .67 / .63 | .89 | .70 / .66 |
| Resilience | 11 | .91 | .63 / .60 | .90 | .67 / .63 |
| Mean | | .90 | .68 / .64 | .88 | .75 / .70 |

Note. Alpha is the internal consistency (Cronbach’s alpha). SEM = standard error of measurement; SEE = standard error of estimate.

## 4.2 Stability of the scales and factors over time

A retest was performed to assess the stability of the scales and factors over time. In total, 53 persons took the retest. This group of respondents consisted of 13 women (24.5%) and 40 men (75.5%). The average age was 36.25 years, ranging from 23 to 59 years (the age of one respondent was not known). The questionnaire was first administered in February 2007 and again nine months later. Both administrations took place in an advice context.

The mean reliability (Cronbach’s alpha) of the scales in the test-retest study was 0.91 in both the first and the second administration. The reliability of the factors (stratified alpha) was 0.96 in both administrations as well. In calculating the reliability of the factors, it was assumed that the total error variance was the same as for the Advice norm group. The variance at the factor level in both administrations is roughly the same as in the norm group.

Correlations between the scale scores of the first administration and the second administration were calculated to assess the stability of the scales. These are represented in Table 4.3.

Table 4.3. Correlations with the second administration (N = 53)

| Factor | Scale | Mean sten score, first administration | Mean sten score, second administration | Correlation | t-value | p-value |
| --- | --- | --- | --- | --- | --- | --- |
| Influence | Status | 5.0 (2.1) | 5.1 (2.2) | .85 | | |
| Influence | Dominance | 4.9 (2.0) | 5.1 (2.0) | .87 | | |
| Influence | Competition | 5.9 (2.3) | 6.2 (2.1) | .87 | | |
| Influence | Self-presentation | 4.8 (1.8) | 4.9 (1.8) | .78 | | |
| Sociability | Need for contact | 4.5 (2.1) | 4.5 (2.0) | .82 | | |
| Sociability | Leisure contact | 4.4 (2.1) | 4.4 (2.0) | .88 | | |
| Sociability | Self-disclosure | 5.2 (2.4) | 5.0 (2.1) | .88 | | |
| Sociability | Trust | 6.2 (2.2) | 5.9 (2.4) | .78 | | |
| Sociability | Friendliness | 5.0 (2.2) | 4.7 (2.0) | .89 | | |
| Sociability | Attentiveness | 4.9 (1.9) | 4.4 (1.7) | .78 | -2.96 | .005 |
| Exuberance | Energy | 4.8 (1.8) | 4.7 (1.9) | .77 | | |
| Exuberance | Personal growth | 5.1 (2.3) | 5.0 (2.2) | .82 | | |
| Exuberance | Perseverance | 5.4 (2.0) | 5.2 (1.5) | .65 | | |
| Exuberance | Adaptability | 5.3 (1.9) | 5.1 (1.8) | .79 | | |
| Exuberance | Originality | 5.1 (2.2) | 4.7 (2.4) | .89 | -2.17 | .034 |
| Exuberance | Independence | 6.2 (1.6) | 6.4 (1.6) | .64 | | |
| Structure | Orderliness | 4.9 (2.2) | 4.9 (2.0) | .88 | | |
| Structure | Precision | 5.5 (2.1) | 5.0 (1.9) | .87 | -3.63 | .001 |
| Structure | Regularity | 5.5 (1.9) | 5.5 (1.9) | .88 | | |
| Structure | Conformity | 5.5 (1.7) | 5.1 (2.0) | .85 | | |
| Structure | Decisiveness | 4.8 (1.9) | 4.7 (2.1) | .82 | | |
| Stability | Self-confidence | 5.5 (1.9) | 5.2 (1.9) | .86 | | |
| Stability | Optimism | 5.6 (1.9) | 5.1 (1.9) | .77 | -3.31 | .002 |
| Stability | Frustration-tolerance | 6.0 (2.1) | 5.8 (2.2) | .87 | | |
| Stability | Resilience | 5.7 (2.0) | 5.6 (2.0) | .78 | | |

Note. Standard deviations are given in parentheses. All correlations were tested at the .01 level. t- and p-values are shown only for significant differences.

The mean standardized factor scores in the first and second administration, together with the correlations between the two administrations, are presented in Table 4.4.

Table 4.4. Mean standardized factor scores and corresponding t- and p-values (N = 53)

| Factor | Mean sten score, first administration | Mean sten score, second administration | Correlation | t-value | p-value |
| --- | --- | --- | --- | --- | --- |
| Influence | 5.0 (2.0) | 5.2 (2.0) | .90 | 1.81 | .08 |
| Sociability | 4.8 (2.3) | 4.5 (2.1) | .87 | -1.95 | .06 |
| Exuberance | 5.3 (2.0) | 5.1 (2.0) | .79 | -0.99 | .33 |
| Structure | 5.0 (2.3) | 4.7 (2.3) | .93 | -2.73 | .01 |
| Stability | 5.8 (2.1) | 5.4 (2.3) | .87 | -2.78 | .01 |
| Mean | 5.2 (2.1) | 5.0 (2.1) | .87 | | |

Note. Standard deviations are given in parentheses. All correlations were tested at the .01 level.

T-tests were used to check whether the means of the two administrations differed significantly. They showed significant differences for four of the scales (*Attentiveness*, *Precision*, *Optimism* and *Originality*) and two of the factors (*Structure* and *Stability*). The t-values and corresponding p-values for the scales on which the first and second administration differed significantly are presented in Table 4.3; those for all factors are presented in Table 4.4.
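To make the comparison of means concrete, the sketch below computes a t-value from summary statistics, assuming the tests were paired (dependent-samples) t-tests, which the retest design implies but the text does not state explicitly. The function name is hypothetical. The inputs are the rounded *Structure* values from Table 4.4, so the result only approximates the reported t-value of -2.73.

```python
import math


def paired_t_from_summary(mean1, sd1, mean2, sd2, r, n):
    """Paired t-value (second minus first) from means, SDs, and their correlation."""
    # Variance of the difference scores, using the test-retest correlation r.
    var_diff = sd1 ** 2 + sd2 ** 2 - 2 * r * sd1 * sd2
    # Standard error of the mean difference.
    se_diff = math.sqrt(var_diff / n)
    return (mean2 - mean1) / se_diff


# Rounded Structure values from Table 4.4 (N = 53); degrees of freedom = n - 1.
t_structure = paired_t_from_summary(5.0, 2.3, 4.7, 2.3, 0.93, 53)
print(round(t_structure, 2))
```

The formula shows why a small mean difference can still be significant here: the higher the correlation between the two administrations, the smaller the variance of the difference scores and the larger the resulting t-value.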

The correlations between the first and second administration are about as high as the reliabilities of the respective scales and factors. Test-retest correlations are in general lower than the reliabilities because there are real changes (in personality) in the research group. The stability of the scores on the WPI with an intervening period of nine months appears to be very high.