Description

Developing insights on the impact and potential of Large Language Models.

Problem Context

It is currently unclear what impact Large Language Models (LLMs) may have on the innovation landscape of the Netherlands. LLMs could fundamentally affect a wide variety of fields, application areas, businesses and societal developments, and the technology is receiving massive attention from (US) big tech, China and the EU. It is therefore important to answer this question from the perspective of TNO, our clients and stakeholders.

Solution

The aim of this project is to:

  1. Develop insights and a vision on the phenomenon of LLMs, assess their potential deployment in society at large, and identify their potential and risks.
  2. Identify and develop a set of metrics and quantitative assessment tools that can meaningfully determine the impact that LLMs may have on Dutch society, ranging from implicit biases and norms to the scope and quality of embedded knowledge. Because much has already been published that may include metrics and tools, this objective also involves a literature search for relevant tools and metrics. Where needed, we may develop new metrics and assessment tools to address gaps.
  3. Explore the application potential of LLMs in a series of selected domains/use cases, using the tools, metrics and insights resulting from objectives 1 and 2.

Results

Within the EVAL project we investigated how to meaningfully assess the societal impact of LLMs. Specifically, using the seven requirements for Trustworthy AI from the EU High-Level Expert Group on AI (HLEG) as a basis, we explored possible metrics for assessing how well LLMs align with EU values. A general scan of available metrics was performed, along with use cases in the logistics and health/nutrition domains, and a case study on the bias and consistency of a number of popular LLMs in the context of the 2023 Dutch elections. The project results have formed the basis for a more extensive research program on Responsible GenAI (GRAIL).

Contact

  • Lizette Maljaars, Senior Project Manager, e-mail: lizette.maljaars@tno.nl
  • Joachim de Greeff, Senior Consultant, e-mail: joachim.degreeff@tno.nl