Issue 1/95

Published on April 19th, 1995
Copyright 1995

|| Home page || AeroNews Index ||


The Secret of Aptitude Testing

Recognizing the complexity of what we are trying to predict.


by Dr. Stanley N. Roscoe
V.-P. Research & Development



Despite the investment of huge sums by the military in the development and validation of selection batteries, their tests account for no more than about 25 percent of the variance in training success and have no evident correlation with operational performance.

The need for valid tests of complex operational aptitude is increasing as the explosion in information technology and associated automation makes more complex operations possible and the cost of placing the wrong person in charge greater than ever. Increasing the information available gives the operator more to attend to, and automation makes it all the more important and difficult to keep track of everything that is going on and decide when some intervention is critical. Pilots used to call it "staying ahead of the airplane," "good judgment," and "airmanship." Now it's called "situational awareness," and in the case of crew operations, we call it "resource management."

The costs of haphazard personnel selection are not limited to those resulting from bad judgment and mismanagement of critical operations. It is also costly to invest in the training of individuals who fail to reach criterion performance levels after training or, worse yet, pass all training tests but then are unable to stand up under operational stress. As so often happens with paramedic trainees, the individual may have all of the skills and knowledge normally required but be unable to put them together in the confusion of a multicar accident scene or a subway fire.

The Difficulties


The failure to develop tests of high predictive validity for complex operational aptitude has been caused by several factors, the first of which is the usual clouding of operational performance criteria against which to validate any such test. If measures of complex job performance are unreliable, as they typically are, there is no way that the high predictive validity of a test can be shown statistically. The pass-fail criterion would be of value if approximately equal numbers of trainees passed and failed, but when the ratio is four or five to one, as in pilot training programs, for example, it is almost worthless. Rating scales are no better when almost all trainees are given the same grade.
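The criterion problem can be illustrated with two textbook psychometric relations. The sketch below is not taken from any particular validation study. It assumes the classical attenuation formula (observed validity equals true validity times the square root of the product of the test and criterion reliabilities) and, for the pass-fail case, a normally distributed underlying criterion that is cut at the pass rate. The function names and the example figures in the usage note are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def attenuated_validity(true_validity, test_reliability, criterion_reliability):
    """Observed validity under the classical attenuation formula."""
    return true_validity * sqrt(test_reliability * criterion_reliability)


def pass_fail_validity(latent_validity, pass_rate):
    """Point-biserial validity after a normal criterion is reduced to pass/fail.

    Assumes the test score and the underlying criterion are bivariate normal.
    """
    std_normal = NormalDist()
    # Threshold on the latent criterion above which a trainee passes.
    cut = std_normal.inv_cdf(1.0 - pass_rate)
    shrink = std_normal.pdf(cut) / sqrt(pass_rate * (1.0 - pass_rate))
    return latent_validity * shrink


if __name__ == "__main__":
    # Hypothetical true validity of 0.7 and test reliability of 0.9.
    for criterion_reliability in (0.9, 0.5, 0.3):
        observed = attenuated_validity(0.7, 0.9, criterion_reliability)
        print(f"criterion reliability {criterion_reliability:.1f}: {observed:.2f}")
    # Even split, four-to-one, and five-to-one pass-fail ratios.
    for pass_rate in (0.5, 0.8, 5 / 6):
        observed = pass_fail_validity(0.7, pass_rate)
        print(f"pass rate {pass_rate:.2f}: {observed:.2f}")
```

Under these assumptions the two effects compound: an unreliable criterion lowers the ceiling on any observable validity, and a lopsided pass-fail split lowers it further.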

Aside from the criterion problem, development of effective aptitude tests has been crippled by the notion that performance of complex operations depends on a collection of individually simple abilities. Consistent with this idea, batteries have been developed to test reaction time, manual dexterity, short- and long-term memory, spatial orientation, and the like. That such batteries account for only about 25 percent of the variance in training success (a multiple correlation of roughly 0.5) stems in part from the correlations among the so-called factors measured by the individual tests. Any one or two of the tests provides almost as much predictive power as the entire battery. Administering the rest of the battery is a waste.
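Why correlated tests add so little can be seen in a simplified case. The sketch below assumes a battery in which every test has the same validity and every pair of tests has the same intercorrelation; under that assumption the squared multiple correlation has the closed form n·r² / (1 + (n − 1)·ρ). The validity of 0.45 and intercorrelation of 0.7 in the usage lines are hypothetical values chosen only for illustration, not figures from any military battery.

```python
def battery_r_squared(n_tests, validity, intercorrelation):
    """Squared multiple correlation for n equally valid, equally intercorrelated tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return n_tests * validity ** 2 / (1.0 + (n_tests - 1) * intercorrelation)


if __name__ == "__main__":
    # Hypothetical battery: each test correlates 0.45 with training success
    # and 0.7 with every other test.
    for n_tests in (1, 2, 5, 10):
        share = battery_r_squared(n_tests, 0.45, 0.7)
        print(f"{n_tests:2d} tests: {share:.3f} of criterion variance")
```

In this simplified model the variance accounted for can never exceed r²/ρ, however many tests are added, which is why the later tests in a highly intercorrelated battery contribute almost nothing.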

 

The Secret

The secret of operational aptitude testing is to recognize the complexity of what we are trying to predict and construct a measuring instrument of similar complexity. The fact that expanding a test battery adds little predictive validity does not mean that a selection test should be short to be cost effective. It is wishful thinking to expect situational awareness and stress tolerance to be revealed reliably in a short test. If a day or even part of two days is required by most candidates to approach a terminal performance level on an aptitude test, its application would still be cost effective if only candidates of high aptitude were selected and the potential failures were rejected before large sums had been invested in their training.
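The cost argument can be made concrete with simple arithmetic. The sketch below assumes, as a simplification, that every candidate admitted to training consumes the full training cost whether or not he or she succeeds. All names and dollar figures are hypothetical and serve only to show the shape of the trade-off.

```python
def cost_per_graduate(test_cost, training_cost, success_rate, selection_ratio):
    """Expected spend per successful trainee.

    selection_ratio is the fraction of tested candidates admitted to training;
    success_rate is the fraction of admitted candidates who succeed.
    """
    if not 0.0 < selection_ratio <= 1.0:
        raise ValueError("selection_ratio must be in (0, 1]")
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    candidates_tested_per_trainee = 1.0 / selection_ratio
    spend_per_trainee = test_cost * candidates_tested_per_trainee + training_cost
    return spend_per_trainee / success_rate


if __name__ == "__main__":
    # Hypothetical figures: training costs 100,000 per trainee.
    unscreened = cost_per_graduate(0.0, 100_000.0, 0.80, 1.0)
    # A lengthy test costing 500 per candidate that admits half of those tested
    # and, by assumption, raises the success rate to 95 percent.
    screened = cost_per_graduate(500.0, 100_000.0, 0.95, 0.5)
    print(f"unscreened: {unscreened:,.0f} per graduate")
    print(f"screened:   {screened:,.0f} per graduate")
```

Whether a long test pays for itself therefore depends on how much it raises the success rate relative to what it costs to administer, not on its length as such.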

While situational complexity is necessary to test situational awareness, it is not sufficient. To avoid confounding basic aptitude with the effect of prior training in specific tasks, the elements that comprise the test must be unlike any real-world activities such as operating computers or controlling specific vehicles. Furthermore, the individual subtasks must be sufficiently simple to allow their mastery in a short practice period before combining them in the test situation. Sufficient situational complexity can be achieved by the manner in which the individually simple subtasks are combined in an adaptive scenario involving multiple sources of information and multiple response alternatives.

An operator of complex systems or director of complex operations must search for, evaluate, and integrate information about all relevant events, conditions, and resources, quickly assess changes in situational priorities, and allocate attention accordingly. To determine an individual's aptitude for meeting these demands requires a complex test in which high scores depend on:

Finding out what's important now and in the long run and allocating priorities accordingly;

Perceiving a situation correctly by avoiding preconceived assumptions and subjective biases and being vigilant;

Discovering rules that are not explicit through induction and deduction;

Recognizing serendipitous opportunities quickly and seizing them before they pass;

Ignoring irrelevant distractions and tolerating frustration when things are going badly;

Coping with the stress of high workload periods and poor performance indications; and finally

Coping with the boredom of routine tasks and resisting complacency during periods of low workload.

The WOMBAT

The PC-based WOMBAT Situational Awareness and Stress Tolerance Tests are designed to embody all these demands and constraints. The individual tasks involve pursuit tracking, pattern recognition, and short-term memory, and on each a testee can reach his or her asymptotic performance level after a short practice period. The three-dimensional tracking task is unlike anything called for in real-world vehicle control. In a quadrant-location task, as each pattern of numbers is learned, it is replaced by a more difficult pattern of greater scoring worth. A two-back serial digit-cancelling task, with no real-world counterpart, is both tediously boring and frustrating. The solid-figure rotation and matching task requires spatial orientation and rapid diagnosis from the candidate.
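For readers unfamiliar with the term, a two-back task in its generic form can be sketched as follows. The article does not spell out the mechanics of the WOMBAT version, so this assumes the conventional rule that the correct response to each digit is the digit shown two positions earlier. The function name and the simple count of correct responses are illustrative assumptions and do not represent the WOMBAT scoring scheme.

```python
def score_two_back(stimuli, responses):
    """Count responses that match the digit presented two positions earlier.

    responses[i] is the testee's answer when stimuli[i] appears; the first
    two positions have no two-back target and are not scored.
    """
    if len(stimuli) != len(responses):
        raise ValueError("stimuli and responses must have the same length")
    correct = 0
    for i in range(2, len(stimuli)):
        if responses[i] == stimuli[i - 2]:
            correct += 1
    return correct


if __name__ == "__main__":
    digits = [3, 7, 1, 4, 7]
    answers = [None, None, 3, 7, 2]  # the last answer should have been 1
    print(score_two_back(digits, answers))
```

The task is trivial to explain and needs no prior skill, yet it demands continuous attention, which is consistent with its description as both boring and frustrating.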

These three tasks comprise the menu of scoring alternatives available to the testee on request. Each is relatively culture-free in that it has no real-world counterpart, and each can be learned quickly by the apt testee. The attention demands of the WOMBAT game are expanded by the ever changing information presented by peripheral indicators. To score well the testee must monitor the peripheral indicators vigilantly to follow the shifting priorities of the various activities as indicated by their potential scoring worths and current scoring rates and to detect indications of failure modes that may require immediate termination of one activity in favor of another.

 

DuoWOMBAT-CS

The WOMBAT is a test of the operational aptitude of the individual working in complex situations without regard to interactions with other individuals in a team or crew relationship. The latter situation calls for additional personal attributes, primarily social in nature, that have gained the attention of training and operations managers and government regulators, most notably in the civil aviation community. Training in "cockpit resource management" (CRM) has been instituted by most of the world's airlines, despite the fact that the evaluation of its effectiveness and worth has been almost entirely subjective. By consensus CRM has high "face validity."

Although certain so-called personality tests are believed by some to reflect traits conducive to effective and harmonious interactions with other team or crew members, until recently there has been no test specifically designed to call for the working exercise of those traits. To address this need, the solo WOMBAT-CS has been expanded to the DuoWOMBAT-CS Crew Resource Management Test. Two testees, sitting side-by-side at two networked WOMBAT-CSs, work out their joint strategy for trading off duties to maximize their combined scores. Other modifications have been made to the scenario to facilitate teamwork and adjust scoring weights appropriately.


