Developing a methodology for verification and validation of AI-components.

Problem Context

The Dutch Ministry of Defence is looking to incorporate AI-based robotic systems into its operations, which creates a need for methods that can ensure the safe and effective use of such systems. Traditional methods are not capable of addressing AI-components in a system, as these generally behave as black boxes and can be updated or change their behavior over time.


The objective of Predict.AI is to develop a methodology for Verification and Validation (V&V) of data-driven AI-components using a scenario-based approach. Because no scenario data is available for a regular performance evaluation, this approach instead compares a scenario description with the training data of the AI-component, to assess whether the component will perform adequately when operating in that scenario.


We have developed a methodology in the new and challenging field of assessing AI-performance without the large amounts of data traditionally used for evaluation. The type of AI considered is neural network-based image classification, a core component that robotic systems use to perceive their environment. First, an analysis of changing image conditions between the training data and the foreseen scenario resulted in prediction models for the classification accuracy in the scenario. Second, we studied changes in object characteristics by analysing classifier margins, which resulted in a new method for computing margins in embedding space. Here, an embedding space is a representation layer of a neural network, typically one of the last layers, in which distance metrics such as Euclidean distance relate to inter-class differences. Third, the changing conditions were explored using Large Language Models, resulting in a method to model domain shift based on textual descriptions. Finally, the changes were analysed using small-sample statistics, which resulted in bounds on the predicted classification accuracy in the scenario.
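To make two of these ideas more concrete, the following is a minimal Python sketch. It does not reproduce the project's own methods, which are not detailed here. As a stand-in for a margin in embedding space it uses a simple nearest-centroid margin based on Euclidean distance, and as a stand-in for small-sample bounds it uses the standard Wilson score interval for an accuracy estimated from a few labelled samples. All function names (centroid, embedding_margin, wilson_interval) and the toy data are hypothetical and chosen for illustration only.

```python
import math


def centroid(vectors):
    """Mean vector of a list of equally sized embedding vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def embedding_margin(sample, label, centroids):
    """Nearest-centroid margin (an illustrative assumption, not the project's method).

    Returns the Euclidean distance to the closest centroid of any other class
    minus the distance to the centroid of the sample's own class.
    A positive value means the sample lies closer to its own class.
    """
    d_own = math.dist(sample, centroids[label])
    d_other = min(
        math.dist(sample, c) for name, c in centroids.items() if name != label
    )
    return d_other - d_own


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for an accuracy measured on a small sample.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    p = correct / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Toy 2-D embeddings for two hypothetical classes.
    train = {
        "vehicle": [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
        "person": [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]],
    }
    cents = {name: centroid(vecs) for name, vecs in train.items()}
    print("margin:", embedding_margin([1.0, 1.0], "vehicle", cents))

    # 18 correct classifications out of 20 scenario-like samples.
    print("accuracy bounds:", wilson_interval(18, 20))
```

In this sketch, a margin that shrinks for objects described in the scenario would hint at a higher risk of misclassification, while the width of the interval shows how little certainty a handful of samples provides about the true accuracy.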


  • Klamer Schutte, Scientist, e-mail:
  • Richard den Hollander, Scientist, e-mail: