Misleading Notions of "Regulatory-Grade" Data in Real-World Evidence Research

The concept of "regulatory-grade" data has been used to describe data that meets the requirements for regulatory decision-making in real-world evidence (RWE) research. The term suggests that the data is of high quality, reliability, and integrity, and is therefore suitable for supporting regulatory submissions and decisions regarding the safety, efficacy, and quality of medical products. However, the notion of "regulatory-grade" data is neither recognized nor defined by regulatory agencies, and it is misleading. Whether a particular data source is fit for purpose must be evaluated in the context of the specific intended use, and the quality and reliability of the data source should be assessed on a project-specific basis and according to regulatory requirements.



Health authorities' RWE guidance focuses on how a data source may be fit for purpose for a specific research question, based on principles such as relevance, reliability, accuracy, completeness, data origin, and duration of follow-up. This guidance neither defines nor recognizes a specific category of data as "regulatory-grade," nor does it provide defined, measurable metrics for assessing the "goodness" of a data source.



The selection of a data source is not an isolated activity; it requires careful consideration of various factors such as the type of drug under investigation, the study design, the outcomes of interest, and how these outcomes are defined.
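
To make the point concrete, the following is a minimal sketch, not a regulatory method, of how the same data source can be adequate for one research question and inadequate for another. Every name, variable, and threshold here is a hypothetical assumption chosen for illustration: the guidance discussed above does not prescribe such metrics, so the minimums stand in for judgments a study team would have to justify for its own project.

```python
from dataclasses import dataclass


@dataclass
class DataSourceProfile:
    """Hypothetical summary of one data source (illustrative values only)."""
    name: str
    completeness: dict[str, float]  # variable -> fraction of records with a value
    median_follow_up_months: float


@dataclass
class StudyRequirements:
    """Hypothetical, project-specific needs defined by the study team."""
    question: str
    min_completeness: dict[str, float]  # variable -> minimum acceptable fraction
    min_follow_up_months: float


def unmet_requirements(source: DataSourceProfile, study: StudyRequirements) -> list[str]:
    """List the study-specific requirements this source does not meet."""
    gaps = []
    for variable, minimum in study.min_completeness.items():
        # A variable the source does not capture counts as 0% complete.
        observed = source.completeness.get(variable, 0.0)
        if observed < minimum:
            gaps.append(f"{variable}: completeness {observed:.2f} < {minimum:.2f}")
    if source.median_follow_up_months < study.min_follow_up_months:
        gaps.append(
            f"follow-up: {source.median_follow_up_months} months "
            f"< {study.min_follow_up_months} months"
        )
    return gaps


# Assumed example: one claims database evaluated against two study questions.
claims = DataSourceProfile(
    name="example claims database",
    completeness={"dispensing": 0.98, "hospitalization": 0.95, "tumor_stage": 0.10},
    median_follow_up_months=18,
)
utilization_study = StudyRequirements(
    question="Hospitalization after treatment initiation",
    min_completeness={"dispensing": 0.90, "hospitalization": 0.90},
    min_follow_up_months=12,
)
oncology_study = StudyRequirements(
    question="Stage-specific long-term outcomes",
    min_completeness={"tumor_stage": 0.80},
    min_follow_up_months=36,
)

for study in (utilization_study, oncology_study):
    gaps = unmet_requirements(claims, study)
    print(study.question, "->", "no gaps found" if not gaps else gaps)
```

In this sketch the source shows no gaps for the first question and two gaps for the second, which is why a single label attached to the data source itself carries little information.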



There are several reasons why the concept of “regulatory-grade” data is inappropriate and why we should stop using it:



  1. Context-Specific Requirements:
    • Regulatory agencies like the FDA and EMA assess data based on its relevance and reliability for the specific regulatory decision at hand. The standards for data quality vary depending on the context, such as drug approval, post-market surveillance, or health technology assessments.
  2. Quality and Reliability Standards:
    • Data used in regulatory submissions must meet rigorous standards of accuracy, completeness, and traceability. However, these standards are not uniformly codified into a single category called "regulatory-grade." Instead, they are embedded within various guidelines and requirements specific to different types of data and analyses.
  3. Diverse Data Sources:
    • RWE research utilizes a wide range of data sources, including electronic health records (EHRs), claims data, patient registries, and more. Each data source has its own set of quality and reliability criteria, making it impractical to label all high-quality data under a single "regulatory-grade" umbrella.
  4. Evolving Standards:
    • The standards for data quality and reliability are continually evolving as new technologies and methodologies emerge. Regulatory agencies update their guidelines regularly to reflect these advancements, which means that the criteria for acceptable data are not static and cannot be encapsulated by a fixed term like "regulatory-grade."
  5. Regulatory Flexibility:
    • Regulatory bodies often exercise flexibility and discretion in evaluating data, considering the context and the specific needs of the regulatory decision. This adaptive approach ensures that the most relevant and reliable data are used, regardless of whether they fit a predefined "grade".



In summary, the term "regulatory-grade" data is misleading because regulatory agencies apply context-specific standards, consider diverse data sources, revise their guidelines as methods evolve, and exercise flexibility in their evaluations. They focus on the quality and appropriateness of data for its intended use rather than adhering to a fixed category.