Three key properties of psychometric tests, which we will now explain, are:

⦁ Validity
⦁ Reliability
⦁ Normative Measures

What does Validity mean…

This means that a psychometric test must prove that it measures what it purports to measure.
For example, a tool that claims to measure temperature but actually measures weight is not a valid measure of temperature.

In the same way, a psychometric test that claims to measure, say, emotional resilience but in fact measures something else cannot be considered valid. Because psychometric tests must prove that they have validity, they can be used with confidence: we know they are measuring what they claim to measure.

What is Reliability…

Psychometric tests must prove that they are reliable, i.e. that they will provide a similar result on a later measurement.

For example, if you put a brick on a scale and it weighs 2kg today, and you put the same brick on the same scale tomorrow but it now weighs 3kg, then that scale cannot be said to be reliable.
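One common way to put a number on this kind of stability is test-retest reliability: the same people take the test twice, and the two sets of scores are correlated. The sketch below is only an illustration. The function name pearson_r and the five pairs of scores are made up, and real reliability studies use far larger samples.

```python
from math import sqrt

def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products and sums of squares around each mean.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    return cov / sqrt(var_a * var_b)

# Hypothetical scores for the same five people, tested twice some months apart.
time_1 = [12, 18, 25, 9, 21]
time_2 = [13, 17, 26, 10, 20]

retest_r = pearson_r(time_1, time_2)
print(f"test-retest correlation: {retest_r:.2f}")
```

A correlation close to 1 means people kept roughly the same standing on both occasions, like the brick that weighs the same on both days. A value near 0 would correspond to the unreliable scale.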

Because test developers must prove that their tests are reliable, we can once again use them with confidence, knowing that they will provide similar results for the same person over time. If a test claims to measure something like emotional resilience, and the test is both valid and reliable, we know with a high degree of confidence that the test is in fact measuring resilience and not something else. We also know that if a candidate repeats the test at some time in the future, we will find a similar measurement to the original. The test is stable and does not fluctuate wildly in its measurements.

Compare this with, say, a panel interview that tries to assess a candidate’s resilience. The interview question may be: “How well do you cope with pressure?” We have no science, statistics, or research to prove to us that the question in fact assesses resilience. We also do not know what the candidate understands by the question or by the word “pressure”. We do not know if the candidate will provide a similar answer if asked the same question in a few months’ time. We do not know if different candidates will understand the question in the same way.
We also don’t know if the panel interviewers have the same understanding of the question, a shared understanding of the construct called “resilience”, or whether they will give the candidate similar or wildly divergent ratings on the answer, or even whether they will understand the answer in the same way. Under these circumstances, the assessment is highly subjective, resting on divergent judgements rather than on measurement, and we cannot have confidence in the final rating given by the panel as a valid and reliable assessment of the candidate’s resilience.

All of these pitfalls are eliminated by psychometric tests. Since the test is valid, we know the test has in fact measured resilience. And since the test is reliable, we know that it gave us a stable and robust measurement that is not subject to fluctuation. Consequently, we can have confidence in the result and draw firm conclusions from it.

What are Normative Measures…

Psychometric tests usually compare a person’s performance on the test against a large group of people who did the test during the process of test development. This “large group of people” is called the norm group. By comparing a candidate’s performance on a test to the norm group, we can say with a high level of confidence that his performance is below average, average, or above average compared to the norm group.
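As a rough illustration of comparing a candidate to a norm group, the sketch below computes a percentile rank: the percentage of the norm group that scored lower than the candidate. Everything here is an assumption for the sake of the example. The ten norm scores are invented (a real norm group is far larger), and the 25th and 75th percentile cut-offs for "below average" and "above average" are arbitrary choices, not a standard.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring below `score` (ties count as half)."""
    if not norm_scores:
        raise ValueError("norm group is empty")
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)

def band(rank, low=25.0, high=75.0):
    """Label a percentile rank; the cut-offs are illustrative assumptions."""
    if rank < low:
        return "below average"
    if rank > high:
        return "above average"
    return "average"

# Hypothetical norm group of ten raw scores (out of 30).
norm_group = [11, 14, 15, 16, 17, 18, 19, 20, 22, 25]

rank = percentile_rank(22, norm_group)
print(rank, band(rank))
```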

To understand how this works, we need to explain the difference between what are termed “Raw Scores” and “Standard Scores”.

Raw Scores and Standard Scores

A raw score is the score that a person achieved out of the number of items that the test consisted of. For example, if a person achieved 20 correct answers out of a total of 30 test items, his score of 20 would be his raw score. On its own, this gives us no way of knowing whether 20 is a below average, average or above average score. However, when we compare it to the norm group, this becomes possible.

When compared to a norm group, the RAW SCORE is converted into a STANDARD SCORE, using a NORM TABLE provided by the test developer. The standard score works either on a scale from 1 – 9 (called a stanine) or 1 – 10 (called a sten).
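In practice the conversion is read off the developer's norm table, but the two scales themselves have a simple definition: stens have a mean of 5.5 and stanines a mean of 5, both with a standard deviation of 2. The sketch below uses that definition as a linear approximation of what a norm table does. It is not the developer's table, and the norm mean of 16 and standard deviation of 5 are hypothetical values chosen for illustration.

```python
from math import floor

def _clip_round(value, lowest, highest):
    """Round half up, then keep the result inside the scale's range."""
    rounded = floor(value + 0.5)
    return max(lowest, min(highest, rounded))

def to_sten(raw, norm_mean, norm_sd):
    """Approximate sten (1 to 10): scale mean 5.5, standard deviation 2."""
    z = (raw - norm_mean) / norm_sd
    return _clip_round(2 * z + 5.5, 1, 10)

def to_stanine(raw, norm_mean, norm_sd):
    """Approximate stanine (1 to 9): scale mean 5, standard deviation 2."""
    z = (raw - norm_mean) / norm_sd
    return _clip_round(2 * z + 5, 1, 9)

# Hypothetical norm group statistics for a 30-item test.
NORM_MEAN = 16
NORM_SD = 5

print(to_sten(20, NORM_MEAN, NORM_SD), to_stanine(20, NORM_MEAN, NORM_SD))
```

With these assumed norms, the raw score of 20 from the example above lands somewhat above the middle of both scales, which is information the raw score alone could not give us.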

Where does the Norm Table come from?

The Test Developer. Test developers will often research their tests on large samples of many different groups sharing common characteristics. For instance, they could all be managers, or people with a tertiary education, or people with matric, or technical workers, or sales staff, or MBA students and so on. The test developer will therefore provide the psychologist with many different norm tables, and the psychologist would select the most appropriate one to use in each selection scenario.
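To see why the choice of norm table matters, consider the following sketch, in which the same raw score is looked up in two different tables. The group names, the cut-off values, and the table layout (the lowest raw score that earns each sten from 2 to 10) are all invented for illustration. Real norm tables come from the test developer and may be laid out differently.

```python
from bisect import bisect_right

# Hypothetical norm tables for a 30-item test. Each list holds the lowest
# raw score that earns sten 2, 3, ..., 10 within that norm group.
NORM_TABLES = {
    "managers":    [8, 11, 14, 17, 19, 21, 23, 25, 27],
    "sales staff": [5, 8, 11, 13, 16, 18, 20, 23, 26],
}

def raw_to_sten(raw, group):
    """Look up the sten for a raw score in the chosen group's norm table."""
    cutoffs = NORM_TABLES[group]
    # Count how many cut-offs the raw score has reached, starting from sten 1.
    return 1 + bisect_right(cutoffs, raw)

print(raw_to_sten(20, "managers"))     # 6
print(raw_to_sten(20, "sales staff"))  # 8
```

In this made-up example, a raw score of 20 is only slightly above the middle of the scale among managers but clearly high among sales staff, which is why the psychologist must pick the norm table that best matches the selection scenario.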