top of page

Towards Multi-Fidelity Test and Evaluation of Artificial Intelligence and Machine Learning-Based Systems

With the broad range of Machine Learning (ML) and Artificial Intelligence (AI)-enabled systems being deployed across industries, there is a need for systematic ways to test these systems. While there is a significant effort from ML and AI researchers to develop new methods for testing ML/AI models, many of these approaches are inadequate for DoD acquisition programs because of the unique challenges introduced by the organizational separation between the entities that develop AI-enabled systems and the government organizations that test them. To address these challenges, the emerging paradigm is to rigorously test these systems throughout the development process and after their deployment. Therefore, to implement testing across the acquisition life cycle, there is a need to use the data and models at various levels of fidelity to develop a cohesive body of evidence that can be used to support the design and execution of efficient test programs for systems acquisition. This paper provides a perspective on what is needed to implement efficient and effective testing for an AI-enabled system based on a literature review and walk-through of an illustrative computer vision example.

bottom of page