2019 I/ITSEC

Ensuring Psychometric Validity Within an Automated Performance Measurement Standard (Room 320C)

Previous research and development work by the Navy has focused on developing an industry standard for system-generated performance measurement that facilitates the mining of individual operator and aircrew performance data from simulators, as part of an effort to continue advancing training systems. However, a standard alone does not provide the implementation guidance needed to ensure that this measurement medium adheres to psychometric principles. A major focus of science is ensuring reliability and validity when measuring psychological constructs. That is, when we measure a psychological construct (e.g., intelligence), is there consistency (i.e., reliability) within and between measures (e.g., observers, computer systems), and are we measuring what we intend to measure (i.e., validity)? While this latter distinction may seem obvious, people often make the mistake of measuring something unintended (i.e., construct contamination) or not fully measuring what they intend to measure (i.e., construct deficiency), and then use that information to inform feedback and decisions. The consequences of these measurement mistakes can be critical. As we work toward a standard that guides engineers in developing system-generated measures of performance, this standard must also incorporate important psychometric concepts and analytical techniques such as reliability, validity, local data norming, regression weighting, and structural equation modeling, to name a few. One of the more robust findings in psychology is that ratings made by humans inherently contain some degree of error. This error reduces the usefulness and legitimacy of findings. While system-generated measures of performance will be free of these common human errors, they will still need to adhere to psychometric principles to ensure their utility in applied and academic settings.
This paper reviews the science of human performance measurement and offers specific recommendations for policies and approaches that support the automated performance measurement standard.