CareFree - AI-based System Health Assessment Scaled for Industrial Use
Integrating human knowledge and AI techniques for better diagnosis and effective predictive maintenance of high tech systems.
Effective Maintenance of High-tech Systems
Dutch industry is at the forefront of innovation in high-tech systems such as those for the professional printing industry, where increasing throughput and product quality are offered at ever lower costs. To meet their clients’ demands, the companies that build these systems need to guarantee carefree, high-performance workflows with no unexpected downtime and with short maintenance and repair times.
With the CareFree project, TNO ESI and the industrial partner Canon Production Printing (CPP, formerly Océ) want to demonstrate the potential of AI to do just that. The goal of the project is to develop AI, fueled with engineering knowledge and data insights, that provides novel and more efficient tools to manage the maintenance of professional printers. Given the errors and messages produced by the machine, AI techniques are developed to help identify the failing parts or processes (diagnosis) and to predict future problems (prognosis). This makes it possible to take action before long and expensive downtimes occur (predictive maintenance).
Hybrid AI as Foundation for Diagnosis and Prognosis
The diagnosis and prognosis of such high-tech systems is complex, because the task is to ensure the lasting performance of the overall system, not merely to prevent the breakdown of parts:
We face control loops that interweave on various timescales to optimize system-level behavior. As these loops respond to differences in tasks and circumstances, but also to increasing wear and tear, they realize adaptive behavior that can mask indicators of failures or future breakdowns. Technically, we face cyber-physical systems that rely on information flows and complex event processing, which introduces indirect or reciprocal effects that render simple root-cause analysis techniques moot.
All this led us to realize that we need to establish diagnosis and prognosis with techniques that
- allow the inference of system and component states from incomplete and imprecise observations, that is, under uncertainty
- allow the execution of diagnosis and prognosis based on machine-generated data and within human-centered processes in one approach
- allow us to realize all necessary computational models from both past observations and domain expertise, in order to capture causality
At TNO ESI, we concluded that we require Hybrid AI for this, specifically in the form of probabilistic and causal reasoning, as offered, for example, by Bayes nets.
CareFree Core Approach
Bayesian networks (also known as Bayesian belief networks) provide a graphical representation of a set of variables of interest and their joint probability distribution. In other words, they model how different elements in a complex system are connected through cause-effect relationships, whilst taking uncertainty into account. Given an event that occurred and a list of possible causes, a Bayesian network can compute the probability that each of these causes is the actual contributing factor.
Bayesian networks therefore offer a suitable setting in which to frame our diagnosis and prognosis problem.
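To make the idea of computing the probability of each cause concrete, here is a minimal sketch in plain Python. It is not the CareFree model: the two causes (`worn_roller`, `dirty_sensor`), the observed event (a jam error), and all probabilities are hypothetical values chosen for illustration. The posterior is computed by brute-force enumeration of the joint distribution, which is only practical for very small networks.

```python
from itertools import product

# Hypothetical prior probabilities that each component is faulty.
PRIORS = {"worn_roller": 0.05, "dirty_sensor": 0.10}

# Hypothetical conditional probability table:
# P(jam error | worn_roller, dirty_sensor).
P_JAM = {
    (False, False): 0.01,
    (False, True): 0.30,
    (True, False): 0.60,
    (True, True): 0.80,
}


def joint(worn_roller, dirty_sensor, jam):
    """Joint probability of one full assignment of the three variables."""
    p = PRIORS["worn_roller"] if worn_roller else 1 - PRIORS["worn_roller"]
    p *= PRIORS["dirty_sensor"] if dirty_sensor else 1 - PRIORS["dirty_sensor"]
    p_jam = P_JAM[(worn_roller, dirty_sensor)]
    return p * (p_jam if jam else 1 - p_jam)


def posterior_given_jam():
    """P(cause | jam error observed) for each cause, by enumeration."""
    states = [False, True]
    evidence = sum(joint(w, d, True) for w, d in product(states, repeat=2))
    p_worn = sum(joint(True, d, True) for d in states) / evidence
    p_dirty = sum(joint(w, True, True) for w in states) / evidence
    return {"worn_roller": p_worn, "dirty_sensor": p_dirty}


if __name__ == "__main__":
    for cause, prob in posterior_given_jam().items():
        print(f"P({cause} | jam) = {prob:.3f}")
```

With these assumed numbers, observing the jam raises the probability of both candidate causes well above their priors, which is the kind of ranking a diagnostic engineer would use to decide what to inspect first.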
The Bayesian networks are generated in a multi-step process that compiles complete nets from network fragments. These fragments represent diagnostic models for components and their relations, which we extract from system specifications and other sources such as FMECAs (failure mode, effects and criticality analyses). While some of the probabilistic parameters of the Bayes nets are bound to functions, others, like failure rates, are learned from data.
With this hybrid approach we were able to reach valid diagnoses for several test scenarios. This confirmed our choice and motivates us to carry on with our research.
Our current research focuses on the following topics:
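The text does not say how failure rates are learned. One common option, shown here purely as an assumption, is a Beta-Bernoulli update: an engineering estimate is encoded as pseudo-counts and then combined with observed counts from the field. The function name and all numbers below are hypothetical.

```python
def updated_failure_rate(prior_failures, prior_survivals,
                         observed_failures, observed_survivals):
    """Posterior mean failure probability under a Beta prior.

    The prior is Beta(prior_failures, prior_survivals); the observations
    are counts of failed and non-failed units over some usage interval.
    """
    alpha = prior_failures + observed_failures
    beta = prior_survivals + observed_survivals
    return alpha / (alpha + beta)


if __name__ == "__main__":
    # Hypothetical: expert prior of roughly 5% (1 failure in 20 units),
    # then 3 failures observed among 100 units in the field.
    rate = updated_failure_rate(1, 19, 3, 97)
    print(f"estimated failure rate: {rate:.4f}")
```

The pseudo-counts control how strongly the domain knowledge resists the data: a prior of (1, 19) is quickly overridden by field observations, whereas (10, 190) encodes the same 5% estimate with ten times the weight.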
- Modeling control loops – our current reasoning engine is a Bayes net, which cannot contain loops, so we are working on a means to deal with these
- Modeling performance – we want to diagnose operational degradation, not just full system failures
- Interventions – when doing an “active test”, you effectively make a change to the system; how can the information gathered through a series of tests be combined effectively?
- Use of test strategies – which test to perform to obtain maximally discriminative diagnostic evidence at the lowest cost
- Data-driven optimization – customization of models based on printer usage
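The test-strategy question in the list above can be illustrated with a simple greedy criterion: pick the test with the highest expected information gain per unit cost. This is one possible heuristic, not necessarily the one pursued in CareFree. The fault names, test names, probabilities, and costs are all hypothetical, and the sketch assumes exactly one fault is present and each test has a binary outcome.

```python
import math


def entropy(dist):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def expected_information_gain(belief, p_positive):
    """Expected entropy reduction over fault candidates from one test.

    belief: P(fault) per fault candidate (sums to 1).
    p_positive: P(test is positive | fault) per fault candidate.
    """
    gain = entropy(belief)
    for positive in (True, False):
        likelihood = {f: (p if positive else 1 - p)
                      for f, p in p_positive.items()}
        p_outcome = sum(belief[f] * likelihood[f] for f in belief)
        if p_outcome > 0:
            posterior = {f: belief[f] * likelihood[f] / p_outcome
                         for f in belief}
            gain -= p_outcome * entropy(posterior)
    return gain


def best_test(belief, tests):
    """Name of the test with the highest expected gain per unit cost."""
    return max(
        tests,
        key=lambda name: expected_information_gain(
            belief, tests[name]["p_positive"]) / tests[name]["cost"],
    )


# Hypothetical current belief and candidate tests.
BELIEF = {"roller": 0.5, "sensor": 0.3, "fuser": 0.2}
TESTS = {
    "inspect_roller": {
        "p_positive": {"roller": 0.95, "sensor": 0.05, "fuser": 0.05},
        "cost": 2.0,
    },
    "check_sensor": {
        "p_positive": {"roller": 0.10, "sensor": 0.90, "fuser": 0.10},
        "cost": 1.0,
    },
    "reboot": {
        "p_positive": {"roller": 0.50, "sensor": 0.50, "fuser": 0.50},
        "cost": 0.5,
    },
}

if __name__ == "__main__":
    print("next test:", best_test(BELIEF, TESTS))
```

Note that the cheap `reboot` test is never chosen here: its outcome is equally likely under every fault, so it carries no discriminative evidence regardless of its low cost.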
Contact: Jos Hegge, Sr Project Manager, TNO, e-mail: firstname.lastname@example.org