AI and pharmaceutical medicine development: A new standard is in the making

AI: How does this technological revolution fit with the pharmaceutical regulators who oversee the pharmaceutical sector at national and supranational levels?

A future for medtech? — Image by © Tim Sandle

AI in pharmaceuticals is set to transform drug discovery, clinical trials, manufacturing, and marketing by analysing vast datasets to speed up processes, reduce costs, and enable personalised medicine.

Applications in development range from identifying drug candidates and predicting protein structures to optimising supply chains and automating regulatory tasks, though challenges such as data quality and transparency remain. There are examples where AI has helped to find new targets, design molecules faster, recruit trial patients more effectively, and create tailored treatments, making drug development more efficient and precise.

Yet how does this technological revolution fit with the pharmaceutical regulators who oversee the pharmaceutical sector at national and supranational levels?

The European Medicines Agency (EMA) has become the first pharmaceutical regulator to produce a draft guidance on the use of artificial intelligence as applied to the development and manufacture of medicinal products. This comes at an important juncture, as both the benefits of AI and the errors it can introduce are becoming more apparent.

Termed Annex 22, the draft document represents a new regulatory annex focused on the governance, validation, and oversight of AI/ML systems used in Good Manufacturing Practice (GMP) environments. The draft text closely complements Annex 11 (which is in place for computerised systems); together, the two documents are designed to prevent the unsafe use of adaptive or opaque models in critical GxP processes.

As to what Annex 22 contains, my assessment is:

Scope – very strict boundaries

Annex 22 applies only to static, deterministic AI/ML models used in critical GMP processes. This means that what is permitted is: static machine learning models; deterministic models (same input → same output); and critical applications only where strict controls are in place.

Explicitly excluded are dynamic / self‑learning models; probabilistic models; and generative AI, including Large Language Models (LLMs). The Annex specifically states that generative AI/LLM use is only acceptable for non‑critical GMP tasks with Human-in-the-Loop (HITL) oversight. HITL is an AI and machine learning approach where human interaction and intelligence are integrated into the system’s training, testing, and operational cycles.

This is a very high bar and one that disqualifies many commercial AI tools, unless they are configured to be heavily constrained.
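The "same input → same output" property can be checked mechanically. The following is a minimal sketch only, not a method taken from the Annex: the function `is_deterministic` and both example models are hypothetical names introduced here for illustration.

```python
from typing import Callable, Sequence


def is_deterministic(model: Callable[[float], int],
                     inputs: Sequence[float],
                     repeats: int = 3) -> bool:
    """Return True if each input gives the same output on every repeated call."""
    for x in inputs:
        first = model(x)
        for _ in range(repeats - 1):
            if model(x) != first:
                return False
    return True


def static_model(x: float) -> int:
    """Hypothetical frozen classifier: fixed threshold, no internal state."""
    return 1 if x >= 0.5 else 0


class DriftingModel:
    """Hypothetical self-adjusting classifier: its threshold moves on every call."""

    def __init__(self) -> None:
        self.threshold = 0.5

    def __call__(self, x: float) -> int:
        result = 1 if x >= self.threshold else 0
        self.threshold += 0.1  # internal state changes after each prediction
        return result


static_ok = is_deterministic(static_model, [0.1, 0.55, 0.9])
drifting_ok = is_deterministic(DriftingModel(), [0.55])
```

Here the frozen model passes and the drifting model fails, because its second answer for the same input differs from its first. A repeat-call check of this kind can only reveal non-determinism; it cannot prove that a model is static.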

Leaping forward with quantum technology. — Image by © Tim Sandle.

Heavy emphasis on cross‑functional accountability

The Annex mandates that subject matter experts, data scientists, Quality Assurance (QA), IT, and vendors must all collaborate from algorithm selection through to operation. To chart this process, clear documentation is required regardless of whether the model is built in‑house or by suppliers. Quality risk management needs to underpin all decisions.

Furthermore, each pharmaceutical organisation using AI needs to develop and put in place a strong governance framework for AI.

Intended use – must be extremely well defined

Acceptance testing in pharmaceuticals consists of formal, documented, and GMP-compliant validation of equipment, specifically through Factory Acceptance Testing (FAT) and Site Acceptance Testing (SAT). FAT verifies equipment at the vendor’s site before shipping, while SAT confirms functional, integrated performance in the final operating environment. 

In relation to this, the Annex indicates that before acceptance testing commences there needs to be a full characterisation of the input sample space, including the identification of rare variations. To achieve this, subgroups must be identified (e.g., site, equipment, defect type) and HITL responsibilities must be explicitly defined and monitored.
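As an illustration of subgroup characterisation, the sketch below counts samples per subgroup and flags those that are thinly represented. It is an assumption-laden example: the subgroup key (site, equipment, defect type) follows the examples in the text, while the function name `under_represented`, the sample data, and the minimum count of 30 are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical subgroup key: (site, equipment, defect_type)
Subgroup = Tuple[str, str, str]


def under_represented(samples: Iterable[Subgroup], min_count: int) -> List[Subgroup]:
    """Return the subgroups that appear fewer than min_count times, sorted."""
    counts = Counter(samples)
    return sorted(group for group, n in counts.items() if n < min_count)


# Invented example data, not taken from any real site.
samples = (
    [("site_A", "line_1", "crack")] * 40
    + [("site_A", "line_1", "particle")] * 35
    + [("site_B", "line_2", "crack")] * 3
)

rare = under_represented(samples, min_count=30)
```

Note that a subgroup with no samples at all never appears in the counts, so the list of expected subgroups would still need to be defined by subject matter experts up front.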

Acceptance criteria – statistical expectations

To assess the success of AI, the Annex requires:

  • Clear test metrics (accuracy, sensitivity, etc.).
  • Acceptance criteria set by experts before testing begins.
  • AI model performance that is superior to the process it replaces.

This presupposes that the current manual or automated process that the AI is intended to replace has known, documented performance metrics.
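The comparison against a documented baseline can be sketched as follows. This is an illustration under stated assumptions: the `Metrics` class, the function names, the label convention (1 = defect, 0 = acceptable), and all figures are hypothetical, and whether a tie with the baseline would pass is something to confirm against the Annex text itself.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    accuracy: float
    sensitivity: float
    specificity: float


def compute_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Metrics:
    """Compute binary classification metrics (1 = defect, 0 = acceptable)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("test data must contain both classes")
    return Metrics(
        accuracy=(tp + tn) / len(pairs),
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
    )


def meets_criteria(model: Metrics, baseline: Metrics) -> bool:
    """Assumed rule: the model must beat the baseline on every metric."""
    return (model.accuracy > baseline.accuracy
            and model.sensitivity > baseline.sensitivity
            and model.specificity > baseline.specificity)


# Invented figures: baseline stands in for the documented existing process.
baseline = Metrics(accuracy=0.70, sensitivity=0.70, specificity=0.80)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

model_metrics = compute_metrics(y_true, y_pred)
accepted = meets_criteria(model_metrics, baseline)
```

The baseline values are fixed before the model's predictions are examined, mirroring the requirement that acceptance criteria are set before testing begins.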

Data – Copyright AFP Mohd RASFAN

Test data – high statistical and procedural rigor

Test data used to assess the AI must represent the entire input space (including rare edge cases). The Annex also calls for the data set to be large enough for statistical significance and to be labelled with extremely high accuracy.
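One way to reason about whether a test set is "large enough" is to look at the confidence interval around a measured metric. The sketch below uses the Wilson score interval for a proportion; this technique is chosen here for illustration and is not named in the Annex, and the sample figures are invented.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Same observed accuracy (95%), two different test set sizes.
small = wilson_interval(95, 100)
large = wilson_interval(950, 1000)
```

With the same observed accuracy, the larger test set gives a narrower interval and a higher lower bound, which is what makes a pre-defined acceptance threshold easier to defend.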

Interestingly, the Annex also states that users must avoid generative AI‑created test data when assessing an AI model.

Test data independency – strong separation-of-duties

To ensure the AI development process remains free from bias, the Annex has put in place a series of controls. These include:

  • No shared use of training and test data (to ensure that data remains free from contamination).
  • Access‑controlled, audited repositories.
  • Developers must never access test data.
  • Staff who have seen test data cannot train the same model unless under 4‑eyes control.
  • Physical objects used for testing cannot be reused for training.

Together, these requirements enforce strict data segregation.
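A simple automated check for shared records between training and test sets can support these controls. The following sketch is illustrative only: the function names and the byte-string records are hypothetical, and hashing detects only exact duplicates, not near-duplicates or re-imaged physical objects.

```python
import hashlib
from typing import Iterable, List


def fingerprint(record: bytes) -> str:
    """SHA-256 digest used as a content fingerprint for one record."""
    return hashlib.sha256(record).hexdigest()


def find_leakage(training: Iterable[bytes], test: Iterable[bytes]) -> List[bytes]:
    """Return test records whose exact content also appears in the training set."""
    train_hashes = {fingerprint(r) for r in training}
    return [r for r in test if fingerprint(r) in train_hashes]


# Invented records standing in for image files or data rows.
training_set = [b"image-001", b"image-002", b"image-003"]
test_set = [b"image-003", b"image-104", b"image-105"]

leaked = find_leakage(training_set, test_set)
```

An empty result is evidence of separation at the file level only; access controls and audit trails on the repositories remain necessary.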

Test execution

To test the suitability of the AI, the Annex requires the following:

  • Demonstrating generalisation (no over/underfitting).
  • A fully predefined test plan with metrics, test scripts, and data references.
  • Deviation handling identical to standard GMP deviation processes.
  • Retention of all test artefacts including audit trails and physical test objects.

Explainability – mandatory in critical applications

Each AI model must provide feature attributions. These are explainable AI (sometimes abbreviated to XAI) techniques that assign importance scores to input features, quantifying their influence on a machine learning model’s prediction. These methods help determine how specific inputs, like pharmaceutical product yield, drive model behaviour when making predictions. The intention is to offer insights into model transparency and decision-making.

Director Christopher Wray the FBI believes the Covid-19 pandemic was “most likely” caused by an incident in a laboratory in Wuhan, China – Copyright POOL/AFP Yuki IWAMURA

SHAP and LIME are popular, model-agnostic techniques for demonstrating ‘explainability’. Both are used to understand machine learning model predictions, differing mainly in their approach:

  • LIME (Local Interpretable Model-agnostic Explanations) builds simple, local linear models around specific predictions.
  • SHAP (SHapley Additive exPlanations) uses game theory (Shapley values) to produce more robust, mathematically grounded feature attributions, offering both local and global insights.

AI “black boxes” are called out as being unacceptable in GMP environments.
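To make the idea of Shapley-value attribution concrete, the sketch below computes exact Shapley values for a tiny model by averaging each feature's marginal contribution over every feature ordering. It is a from-scratch illustration, not the SHAP library's API; the model `toy_model`, its weights, and the input values are all hypothetical, and the brute-force approach is only practical for a handful of features.

```python
from itertools import permutations
from typing import Callable, List, Sequence


def shapley_values(model: Callable[[Sequence[float]], float],
                   x: Sequence[float],
                   baseline: Sequence[float]) -> List[float]:
    """Exact Shapley values: average marginal contribution over all orderings."""
    n = len(x)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)        # start from the reference input
        previous = model(current)
        for i in order:
            current[i] = x[i]           # switch feature i to its actual value
            value = model(current)
            totals[i] += value - previous
            previous = value
    return [t / len(orderings) for t in totals]


def toy_model(features: Sequence[float]) -> float:
    """Hypothetical linear model with three inputs and invented weights."""
    a, b, c = features
    return 2.0 * a - 1.0 * b + 0.5 * c


x = [3.0, 1.0, 4.0]
baseline = [0.0, 0.0, 0.0]
attributions = shapley_values(toy_model, x, baseline)
```

For a linear model each attribution is simply the weight multiplied by the feature's distance from the baseline, and the attributions sum to the difference between the prediction and the baseline prediction. That additive property is what makes the scores auditable.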

Confidence – controls against uncertain predictions

To build confidence that the AI model is doing what it is intended to do, each model must:

  • Log confidence scores.
  • Employ thresholds to avoid unreliable outputs.
  • Output “undecided” where confidence is low.

These features are seen as preventing inappropriate automated decisions from occurring.
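The three controls above can be sketched together in a few lines. This is a minimal example under assumptions: the function `decide`, the 0.9 threshold, the logger name, and the "accept"/"reject" labels are hypothetical and would be defined by the organisation's own validated procedure.

```python
import logging
from typing import Tuple

logger = logging.getLogger("ai_decisions")  # hypothetical logger name


def decide(p_defect: float, threshold: float = 0.9) -> Tuple[str, float]:
    """Map a predicted defect probability to a decision, or 'undecided'."""
    if not 0.0 <= p_defect <= 1.0:
        raise ValueError("p_defect must be between 0 and 1")
    # Confidence is the probability of whichever class is more likely.
    confidence = max(p_defect, 1.0 - p_defect)
    if confidence < threshold:
        label = "undecided"             # route to human review
    elif p_defect >= 0.5:
        label = "reject"
    else:
        label = "accept"
    # Every prediction is logged with its confidence score.
    logger.info("p_defect=%.3f confidence=%.3f decision=%s",
                p_defect, confidence, label)
    return label, confidence


decisions = [decide(p)[0] for p in (0.97, 0.60, 0.02)]
```

The middle case falls below the threshold and is returned as "undecided", so it never results in an automated accept or reject.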

Operation – strict life‑cycle governance

To ensure that the AI model operates as intended across its lifecycle, the Annex requires that each change is documented and assessed and that configuration controls are in place to detect unauthorised changes.
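One common way to detect unauthorised changes is to compare a cryptographic fingerprint of the current configuration against the fingerprint recorded at approval. The sketch below illustrates that idea only; the configuration fields, the function names, and the use of JSON plus SHA-256 are assumptions, not requirements stated in the Annex.

```python
import hashlib
import json
from typing import Any, Dict


def config_fingerprint(config: Dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON form, so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_unchanged(config: Dict[str, Any], approved_hash: str) -> bool:
    """True if the current configuration matches the approved fingerprint."""
    return config_fingerprint(config) == approved_hash


# Hypothetical approved configuration, recorded at change-control sign-off.
approved = {"model_version": "1.0", "threshold": 0.9}
approved_hash = config_fingerprint(approved)

same_content = {"threshold": 0.9, "model_version": "1.0"}   # reordered only
tampered = {"model_version": "1.0", "threshold": 0.8}       # value changed

reordered_ok = is_unchanged(same_content, approved_hash)
tampered_ok = is_unchanged(tampered, approved_hash)
```

A mismatch shows that something changed but not what or why, so it would still feed into the documented change assessment.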

The use of AI is likely to grow in pace and scope within pharmaceuticals. The draft Annex provides some clarity as to what will be expected by pharmaceutical regulators within the European Union. The Annex text has recently closed for public comment and a finalised version is expected to be issued later in 2026.

Written By

Dr. Tim Sandle is Digital Journal's Editor-at-Large for science news. Tim specializes in science, technology, environmental, business, and health journalism. He is additionally a practising microbiologist and an author. He is also interested in history, politics and current affairs.
