Federated Real-World Evidence: The Hidden Consistency Problem

What federated real-world evidence is and why it matters

Federated real-world evidence studies distribute analysis across multiple healthcare sites rather than centralizing patient data. Each site analyzes its own data locally. Results or statistical summaries are aggregated centrally. This architecture enables studies that would otherwise be impossible: datasets large enough to detect rare outcomes, diverse enough to support subgroup analysis, and assembled without the legal and logistical barriers of a central data repository.

For FDA submissions, federated RWE offers a path to generating evidence at scale from routine clinical data. For sponsors, it reduces data transfer risk and accelerates study timelines. For networks like OHDSI, it is the core methodology enabling international collaboration.

The benefits are real. So is the risk that most federated study designs do not address.

The core assumption that often goes unexamined

Every federated study rests on an assumption: that the same concept ID at Site A represents the same clinical population as at Site B. Call it the equivalence assumption. It is the foundation on which the statistical combination of sites is built.

If Site A and Site B are measuring different patients under the same label, combining their data does not increase statistical power on a shared population. It combines two different populations under a single analysis while treating them as one.

Most federated study designs document that sites passed structural data quality checks. Almost none document whether sites were semantically equivalent. These are different questions with different answers.

How distributed clinical data analysis creates hidden inconsistency

The problem is structural. When data is collected and coded independently at each site, local coding practice governs what ends up in the dataset. Those practices vary in ways that are systematic, institution-specific, and largely invisible to researchers working with aggregated results.

The variation is not random noise. It reflects real differences in clinical culture, EHR configuration, documentation workflows, and the judgment calls clinicians make at the point of care. A site with a strong cardiology program may document and code cardiovascular concepts more completely and specifically than a generalist community hospital. A site with a particular EHR vendor may have different default coding pathways than a site on a different system.

These differences do not show up in structural data quality checks because those checks evaluate whether data is complete, conformant, and plausible within a site. They do not evaluate whether one site's complete, conformant, plausible dataset is measuring the same clinical phenomenon as another site's.

What silent failure looks like in practice

A federated RWE study that fails due to cross-site semantic inconsistency does not fail obviously. It produces results. Those results may be statistically significant, clinically plausible, and internally consistent. They may pass peer review. They may be submitted to FDA.

The failure mode is that the effect estimate reflects a mixture of populations rather than a single defined one. If one site's patients are systematically sicker, or younger, or coded under a stricter diagnostic threshold, the combined result reflects that compositional difference rather than the clinical reality the study intended to measure.
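A small arithmetic sketch makes the mixture effect concrete. The site labels and counts below are invented for illustration and are not drawn from any study: both sites report the same concept, but Site A is assumed to apply a stricter diagnostic threshold and so captures sicker patients.

```python
from dataclasses import dataclass


@dataclass
class SiteSummary:
    name: str
    n_patients: int  # patients carrying the study concept at this site
    n_events: int    # of those, patients who experienced the outcome


def event_rate(site: SiteSummary) -> float:
    """Outcome rate among patients coded with the concept at one site."""
    return site.n_events / site.n_patients


def pooled_rate(sites: list[SiteSummary]) -> float:
    """Naive pooled rate: treats all sites as one population."""
    total_events = sum(s.n_events for s in sites)
    total_patients = sum(s.n_patients for s in sites)
    return total_events / total_patients


# Hypothetical sites: same concept ID, different underlying populations.
site_a = SiteSummary("A (strict threshold)", n_patients=1000, n_events=200)  # 20%
site_b = SiteSummary("B (loose threshold)", n_patients=1000, n_events=50)    # 5%

pooled_equal = pooled_rate([site_a, site_b])  # 250 / 2000 = 0.125

# Same per-site rates, but Site B now contributes four times as many patients.
site_b_large = SiteSummary("B (loose threshold)", n_patients=4000, n_events=200)
pooled_skewed = pooled_rate([site_a, site_b_large])  # 400 / 5000 = 0.08
```

Neither pooled figure describes either site. The pooled rate moves from 12.5% to 8% purely because the site mix changed, while each site's own rate stayed fixed.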

The most dangerous version is when cross-site variation partially cancels out, producing an effect estimate that appears stable across sensitivity analyses while masking the underlying inconsistency. The study looks robust. The problem is invisible.

What federated RWE needs beyond structural data quality

Documenting that sites passed the OHDSI Data Quality Dashboard checks is necessary, but it is not sufficient for a credible federated study. Also needed is a concept-level evaluation of whether sites are semantically equivalent on the specific concepts driving the study.

That evaluation needs to operate across several dimensions for each concept:

Source code distribution: how patients are distributed across the specific billing and clinical codes that map to this concept at each site
Co-occurrence patterns: which other conditions co-occur with this concept and whether that pattern is consistent
Measurement availability: whether relevant measurement data is present and comparable
Demographic profile: whether the age, gender, and other demographic characteristics of coded patients are consistent
Drug co-prescription patterns: whether treatment patterns reflect a consistent patient population
Specialty mix: whether the clinical specialties involved in diagnosis and treatment are comparable

Vercori is designed to evaluate all six dimensions for each concept in a study, generate a divergence score, and produce a documented report of findings with reviewer decisions recorded in a tamper-evident audit log.
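To illustrate what a divergence score on a single dimension could look like, the sketch below compares source code distributions between two sites using Jensen-Shannon divergence. This is an assumption chosen for illustration: the text does not say how Vercori computes its score, and the function names, code labels, and counts here are hypothetical.

```python
import math


def to_distribution(code_counts: dict[str, int]) -> dict[str, float]:
    """Convert per-code patient counts into proportions that sum to 1."""
    total = sum(code_counts.values())
    return {code: n / total for code, n in code_counts.items()}


def jensen_shannon_divergence(p: dict[str, float], q: dict[str, float]) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical distributions,
    1 for distributions with no codes in common."""
    codes = set(p) | set(q)
    # Midpoint distribution between the two sites.
    m = {c: 0.5 * (p.get(c, 0.0) + q.get(c, 0.0)) for c in codes}

    def kl_to_midpoint(a: dict[str, float]) -> float:
        # Terms with zero probability contribute nothing to the sum.
        return sum(
            a[c] * math.log2(a[c] / m[c]) for c in codes if a.get(c, 0.0) > 0
        )

    return 0.5 * kl_to_midpoint(p) + 0.5 * kl_to_midpoint(q)


# Hypothetical counts of patients per source code mapping to one concept.
site_a_counts = {"CODE_UNSPECIFIED": 700, "CODE_WITH_COMPLICATION": 300}
site_b_counts = {"CODE_UNSPECIFIED": 300, "CODE_WITH_COMPLICATION": 700}

score = jensen_shannon_divergence(
    to_distribution(site_a_counts), to_distribution(site_b_counts)
)
```

A bounded, symmetric measure is convenient here because scores can be compared across concepts, but any threshold for "divergent" would still need to be set and justified by the study team.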

Regulatory context for federated RWE consistency

FDA guidance on real-world evidence has progressively emphasized that data reliability requires more than structural correctness. The agency expects sponsors to demonstrate that data from multiple sources is fit for the specific purpose of the study, which includes demonstrating consistency in how key concepts are defined and applied across sites.

Vercori is built to produce the documentation needed to address that expectation directly: a site-by-site, concept-by-concept consistency assessment designed to be attached to a submission package or referenced in a study methods section.

Common Questions

Federated RWE consistency: frequently asked questions

Does this problem affect all federated studies or only certain types?

It affects any study that combines data from multiple independently coded sites. The magnitude of risk varies by concept, therapeutic area, and how much coding practice varies across the specific sites in a network. A concept-level evaluation identifies which concepts in a specific study carry meaningful cross-site risk rather than applying a blanket assessment.

If sites are in the same OHDSI network, does that reduce the risk?

OHDSI network membership means sites have mapped their data to OMOP and meet certain data quality standards. It does not mean sites code concepts identically. Cross-site semantic variation has been documented within OHDSI networks in published research. Shared network membership reduces structural heterogeneity, not semantic heterogeneity.

Can statistical methods correct for cross-site semantic inconsistency?

Standard methods like site stratification and mixed-effects models can account for some forms of cross-site heterogeneity. They cannot correct for the underlying problem if the source of heterogeneity is unknown or unmeasured. Knowing which concepts diverge, and by how much, is a prerequisite for making informed methodological choices.
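The limits of the statistical approach can be seen in a minimal sketch of fixed-effect inverse-variance pooling with Cochran's Q and the I² statistic, both standard meta-analysis quantities. The site-level estimates below are invented for illustration. The statistics can signal that sites disagree, but nothing in them says which concept or coding practice is responsible.

```python
import math


def fixed_effect_pool(estimates: list[float], std_errors: list[float]) -> tuple[float, float]:
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


def cochran_q(estimates: list[float], std_errors: list[float]) -> float:
    """Cochran's Q: weighted squared deviations of site estimates from the pooled one."""
    pooled, _ = fixed_effect_pool(estimates, std_errors)
    return sum(((e - pooled) / se) ** 2 for e, se in zip(estimates, std_errors))


def i_squared(q: float, n_sites: int) -> float:
    """I^2: share of total variation attributed to between-site heterogeneity."""
    df = n_sites - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q)


# Hypothetical log hazard ratios and standard errors from three sites.
log_hrs = [0.10, 0.12, 0.60]
ses = [0.10, 0.10, 0.10]

pooled, pooled_se = fixed_effect_pool(log_hrs, ses)
q = cochran_q(log_hrs, ses)
i2 = i_squared(q, len(log_hrs))
```

With these made-up inputs, Q is roughly 16 on 2 degrees of freedom and I² is about 0.88, which flags the third site as inconsistent with the other two. Whether that reflects a true clinical difference or a coding difference is exactly what the statistic cannot tell you.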

How is Vercori designed to fit into an existing federated study workflow?

Vercori is designed to run before the study analysis, during the data characterization phase. Each site would generate a semantic fingerprint of its local OMOP data for the study concepts. Fingerprints are compared centrally. The resulting consistency report is intended to inform protocol decisions before data is locked and analysis begins.
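The workflow above can be sketched as a local step and a central step. Everything in this example is a hypothetical illustration rather than Vercori's actual fingerprint format: the field names, the small-cell suppression threshold, the placeholder concept ID, and the share-gap rule are all assumptions. The point it shows is that only aggregate summaries, never patient-level rows, leave a site.

```python
MIN_CELL_COUNT = 5  # hypothetical small-cell suppression threshold


def build_fingerprint(site_id, concept_id, rows, min_cell_count=MIN_CELL_COUNT):
    """Runs locally at a site. `rows` is an iterable of (person_id, source_code)
    pairs for one concept; only the aggregate summary is returned."""
    patients_by_code = {}
    for person_id, source_code in rows:
        patients_by_code.setdefault(source_code, set()).add(person_id)

    counts = {code: len(people) for code, people in patients_by_code.items()}
    # Drop codes with too few patients before anything is shared.
    kept = {code: n for code, n in counts.items() if n >= min_cell_count}
    total = sum(kept.values())
    shares = {code: n / total for code, n in kept.items()} if total else {}

    return {
        "site_id": site_id,
        "concept_id": concept_id,
        "source_code_share": shares,
        "n_suppressed_codes": len(counts) - len(kept),
    }


def compare_fingerprints(fp_a, fp_b, max_share_gap=0.10):
    """Runs centrally. Returns the source codes whose share differs between
    two sites by more than `max_share_gap`, with the size of the gap."""
    a, b = fp_a["source_code_share"], fp_b["source_code_share"]
    gaps = {c: abs(a.get(c, 0.0) - b.get(c, 0.0)) for c in set(a) | set(b)}
    return {c: g for c, g in sorted(gaps.items()) if g > max_share_gap}


# Hypothetical local data: 12345 is a placeholder, not a real concept ID.
rows_a = [(i, "CODE_A") for i in range(8)] + [(i, "CODE_B") for i in range(8, 10)]
rows_b = [(i, "CODE_A") for i in range(5)] + [(i, "CODE_B") for i in range(5, 10)]

fp_a = build_fingerprint("site_a", 12345, rows_a)
fp_b = build_fingerprint("site_b", 12345, rows_b)
flagged = compare_fingerprints(fp_a, fp_b)
```

Running the comparison before data lock means a flagged concept can still change the protocol, for example by tightening the concept definition or excluding a site from a specific analysis.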

What happens when Vercori identifies a divergent concept?

The report is designed to classify each concept and document the nature and magnitude of the divergence. Qualified reviewers record a decision: whether the divergence is explainable and acceptable, requires a protocol adjustment, or needs additional clinical review. Those decisions and their rationale are recorded in the audit log and included in the final report.
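One common way to make a decision log tamper-evident is a hash chain, where each entry's hash covers both its own content and the previous entry's hash. The sketch below shows that general technique only. It is not a description of how Vercori implements its audit log, and the field names and decision labels are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_decision(log: list, concept_id: int, reviewer: str,
                    decision: str, rationale: str) -> dict:
    """Append a reviewer decision, chaining it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    payload = {
        "concept_id": concept_id,
        "reviewer": reviewer,
        "decision": decision,
        "rationale": rationale,
    }
    entry = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, payload),
    }
    log.append(entry)
    return entry


def verify_log(log: list) -> bool:
    """Recompute every hash in order; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical decisions mirroring the three outcomes described above.
audit_log = []
append_decision(audit_log, 12345, "reviewer_1", "acceptable",
                "Divergence explained by specialty mix.")
append_decision(audit_log, 67890, "reviewer_2", "protocol_adjustment",
                "Restrict concept definition to confirmed diagnoses.")
log_is_intact = verify_log(audit_log)
```

A hash chain detects after-the-fact edits but does not prevent them. In practice the latest hash would also need to be stored somewhere the log's editors cannot rewrite.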

Federated RWE fails silently.

Built for the way federated networks actually operate.

Looking for pilot partners.

Run your federated study on data you have actually verified.
