It's worth repeating: Right tool, right job.
Welcome once again to the AHA Report. No, it's not your imagination, I'm talking about validation again, in a piece from my archives. Why? Because to make pre-hire testing successful, you need to do more than define a list of job-related competencies and then test for proficiency in them. (If you even choose to go that far - which some people in this industry, amazingly, still do not.) Ideally, you should make sure any test you use really does predict performance in the actual job in question. In practice, you should try to get as close as possible to this ideal of validation.
I just can't stress this enough. General personality tests such as Myers-Briggs may be useful in general situations, like helping managers get to know themselves and each other at a leadership retreat. But then, so would whitewater rafting - and Myers-Briggs doesn't produce action photos as a takeaway. Generic tests for job-related competencies may be useful for generic jobs. But have you ever hired someone for a "generic" job, besides perhaps the CEO's nephew?
The only thing that can predict success in the specific job you're hiring for is a test designed for that job, or pretty close to it. Anything else risks being worse than useless, by casting doubt on the value of testing itself. And we don't need any more of that, thank you very much.
Is this test validated for your industry?
Make sure you understand validity generalization before you answer.
The test vendor says, "This test was developed specifically for the banking industry."
You say, "Sounds good. I work for a bank. I'll buy it!"
Mr. Politically Incorrect says, "Bzztt! Nope."
Now, some vendors probably believe their own press and think their tests are validated for a given industry. But that does not make them correct. To understand why, we need to review the concepts of "validation" and "validity generalization."
Validation
Think about it. What do the following statements really mean about a hiring test?
- "Our test is validated for use in the XYZ industry." What, for every job? Are all companies in this industry identical?
- "Our test contains industry norms for the XYZ industry." What, everyone in the "industry norm" base is a high performer?
- "This is the 'average' score for people with this job title." So, all people with the same title do identical tasks and are high performers?
Suppose you took an employment test yourself. Would you really care whether it was used widely in the industry, or would you want to know whether its scores actually predicted your performance?
Validation means that someone, somewhere, at some time, did a formal study to see whether test scores predicted job performance in a specific job.
Test Choice
What does validation have to do with test choice? As I discussed in a past article, few tests are designed for hiring; that is, their content is not based on job performance, and their scores don't predict it either.
Why should this be a big deal? Because improperly used tests have a real financial impact on 1) qualified people who get screened out, and 2) organizations that hire unqualified people. Job-qualified minorities, for example, have a history of being excluded based on unsupported job requirements and inappropriate tests. Yes, that includes interviews.
A true hiring test is different. Valid hiring tests are based on a generally accepted theory of job performance; scores are supported by studies that show they predict on-the-job performance; scores are shown to be stable over time; and test developers follow guidelines intended to make their tests rock solid. This is a good thing.
Assuming we're only shopping among true hiring tests, how do we choose which one to use? We look at job analysis data. Job analysis identifies the critical competencies required to perform the job. People cannot just "believe" an XYZ test will work for all jobs.
The next challenge is to make sure scores predict job performance. This is something a reasonable person would want to do, right? After all, if the test content is critical to the job, logic states we should make sure it works.
Deja Validation
Once upon a time people believed tests should be re-validated every time they were implemented. Then someone asked, "Why do we have to re-validate a test every time it is implemented when someone else might have already done all the work?"
This started a series of investigations and studies that concluded, "If two jobs are essentially the same, then the validity data can be 'transported' from one job to another. Sweet!" (assuming Ph.D.s would actually say something like that).
So is it okay to use an "industry validated" XYZ test without doing any further work? Nope. Sorry. We can save some time, but only if we know the two jobs are essentially the same.
This is done by comparing the parameters from the first validation study (what was measured and so forth) to the parameters from the second study. If the jobs are similar, if the performance criteria are similar, and if the first study followed professional practices, then and only then can we "transport" the data from one job to the next.
Validation Guidelines
Aside from legalese and validity generalization meta-analyses, the bottom line is:
1. Responsible people need to know that a specific test score predicts performance.
2. Validity generalization is not an excuse to use a specific test just because a vendor claims the test was "validated." To point out a few best practices here (excerpts from the 1978 Uniform Guidelines):
"Under no circumstances will the general reputation of a test or other selection procedures, its author or its publisher, or casual reports of its validity be accepted in lieu of evidence of validity. Specifically ruled out are: assumptions of validity based on a procedure's name or descriptive labels; all forms of promotional literature; data bearing on the frequency of a procedure's usage; testimonial statements and credentials of sellers, users, or consultants; and other nonempirical or anecdotal accounts of selection practices or selection outcomes.
...Enforcement agencies will take into account the fact that a thorough job analysis was conducted and that careful development and use of a selection procedure in accordance with professional standards enhance the probability that the selection procedure is valid for the job."
Are these guidelines just a bureaucrat's dream? No. Are they the "law of the land"? No. But what reasonable person can argue against using a test that "fits" the job or against knowing that scores actually predict job performance?
Sticky Issues
Research study findings are reported in terms of trends and correlations. They are not "perfect proof"; they just represent a high probability the results were not due to chance. Take, for example, the concept of meta-analysis. This technique statistically combines results from many similar studies while "mathematically" controlling for sample size, test error, and so forth. Meta-analysis is supposed to "minimize" the experimental error between one study and another.
The results of a meta-analysis or other statistical report read something like this: "The data had a correlation of +.30 with a probability of chance less than or equal to 5 percent." In other words (human ones): The numbers from Source A were roughly in alignment 9% of the time with numbers from Source B. That 9% comes from squaring the correlation (.30 x .30 = .09), which is the share of variation the two sets of numbers have in common. ("In alignment 9% of the time" is not "technically" correct in statistical terms, but it will do for the purposes of discussion.) Of course, a +.30 correlation coefficient still leaves us wondering about the 91% of data that were out of alignment.
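For readers who want to check the arithmetic, here is a minimal sketch of where the 9% and 91% figures come from. It assumes only the standard rule that squaring a correlation coefficient gives the share of variance the two measures have in common; the function name is made up for illustration.

```python
def shared_variance(r: float) -> float:
    """Return r squared: the share of variance two measures have in common."""
    return r ** 2


r = 0.30  # the correlation quoted in the example report
explained = shared_variance(r)
unexplained = 1 - explained

print(f"Correlation of {r:+.2f}: {explained:.0%} shared, {unexplained:.0%} unaccounted for")
```

Try it with other values: even a correlation of +.50 leaves three quarters of the variation unaccounted for.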
Is meta-analysis data as exact as reading a thermometer? No. Is it an indication that every study was tightly controlled? No. Can you "take results to the bank"? Only if you are prepared to argue that an average of averages is a precision measurement. It's not.
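To see why an "average of averages" can hide a lot, here is a rough sketch of the simplest step in a meta-analysis: a sample-size-weighted mean of study correlations. The four studies and their numbers are invented purely for illustration, and a real meta-analysis would also correct for test error and other artifacts, which this sketch does not attempt.

```python
# Hypothetical studies as (sample size, observed correlation). Made-up numbers.
studies = [(50, 0.10), (120, 0.35), (400, 0.30), (80, 0.45)]


def weighted_mean_r(studies):
    """Average the correlations, weighting each study by its sample size."""
    total_n = sum(n for n, _ in studies)
    return sum(n * r for n, r in studies) / total_n


mean_r = weighted_mean_r(studies)
low = min(r for _, r in studies)
high = max(r for _, r in studies)

print(f"Weighted mean r = {mean_r:.2f}")
print(f"Individual studies ranged from {low:.2f} to {high:.2f}")
```

The single summary number lands near +.31, yet the individual studies behind it run from +.10 to +.45. The summary is useful, but it is not a thermometer reading.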
How about a uniform definition of job "performance"? For example, what happens when a mentally dull but politically skilled employee scores "low" on a test but is rated "high" by his or her manager? Is the test incorrect? Or is the rating incorrect? (I'd bet on the rating.)
The bottom line is, it is okay to use tests:
- Designed for hiring
- As long as test content is supported by job analysis
- As long as test scores predict job performance
- As long as professional test protocol is followed
It is not okay to use employment tests:
- That are not designed for hiring
- That are not supported by job analysis
- If test scores do not predict job performance
Other considerations:
- An interview is a test.
- All tests should be examined and reviewed to reduce adverse impact.
- Validity generalization is only acceptable when the two jobs are essentially the same (and supporting data shows they are the same).
Think about it: Isn't that just what a hiring manager wants to know about a pre-hire test?