Federated Real-World Evidence

Federated RWE fails silently.

Distributed clinical data analysis builds statistical power by combining sites. The assumption underneath that combination is that sites are equivalent. When that assumption is wrong, the study does not fail visibly. It produces a wrong number.

What federated real-world evidence is and why it matters

Federated real-world evidence studies distribute analysis across multiple healthcare sites rather than centralizing patient data. Each site analyzes its own data locally. Results or statistical summaries are aggregated centrally. This architecture enables studies that would otherwise be impossible: datasets large enough to detect rare outcomes, diverse enough to support subgroup analysis, and assembled without the legal and logistical barriers of a central data repository.

For FDA submissions, federated RWE offers a path to generating evidence at scale from routine clinical data. For sponsors, it reduces data transfer risk and accelerates study timelines. For networks like OHDSI, it is the core methodology enabling international collaboration.

The benefits are real. So is the risk that most federated study designs do not address.

The core assumption that often goes unexamined

Every federated study rests on an assumption: that the same concept ID at Site A represents the same clinical population as at Site B. Call it the equivalence assumption. It is the foundation on which the statistical combination of sites is built.

If Site A and Site B are measuring different patients under the same label, combining their data does not increase statistical power on a shared population. It combines two different populations under a single analysis while treating them as one.

Most federated study designs document that sites passed structural data quality checks. Almost none document whether sites were semantically equivalent. These are different questions with different answers.

How distributed clinical data analysis creates hidden inconsistency

The problem is structural. When data is collected and coded independently at each site, local coding practice governs what ends up in the dataset. Those practices vary in ways that are systematic, institution-specific, and largely invisible to researchers working with aggregated results.

The variation is not random noise. It reflects real differences in clinical culture, EHR configuration, documentation workflows, and the judgment calls clinicians make at the point of care. A site with a strong cardiology program may document and code cardiovascular concepts more completely and specifically than a generalist community hospital. A site with a particular EHR vendor may have different default coding pathways than a site on a different system.

These differences do not show up in structural data quality checks because those checks evaluate whether data is complete, conformant, and plausible within a site. They do not evaluate whether one site's complete, conformant, plausible dataset is measuring the same clinical phenomenon as another site's.

What silent failure looks like in practice

A federated RWE study that fails due to cross-site semantic inconsistency does not fail obviously. It produces results. Those results may be statistically significant, clinically plausible, and internally consistent. They may pass peer review. They may be submitted to FDA.

The failure mode is that the effect estimate reflects a mixture of populations rather than a single defined one. If one site's patients are systematically sicker, or younger, or coded under a stricter diagnostic threshold, the combined result reflects that compositional difference rather than the clinical reality the study intended to measure.
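The arithmetic behind that compositional effect can be shown with a small worked example. This is a minimal sketch using invented counts, not data from any study: the two sites, their arm sizes, and their event counts are all assumptions chosen to make the mechanism visible. Within each site the treated arm has lower risk than the control arm, yet naively pooling the counts reverses the sign, because Site A's sicker population dominates the treated arm.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    n: int        # patients in the arm
    events: int   # patients with the outcome

    @property
    def risk(self) -> float:
        return self.events / self.n


# Hypothetical counts. Site A applies a stricter diagnostic threshold, so its
# patients are sicker (higher baseline risk) under the same concept label.
sites = {
    "A": {"treated": Arm(n=800, events=240), "control": Arm(n=200, events=80)},
    "B": {"treated": Arm(n=200, events=20), "control": Arm(n=800, events=120)},
}


def risk_difference(treated: Arm, control: Arm) -> float:
    return treated.risk - control.risk


def pooled_arm(arm_name: str) -> Arm:
    """Add raw counts across sites, as if they were one population."""
    return Arm(
        n=sum(site[arm_name].n for site in sites.values()),
        events=sum(site[arm_name].events for site in sites.values()),
    )


per_site_rd = {
    name: risk_difference(site["treated"], site["control"])
    for name, site in sites.items()
}
naive_rd = risk_difference(pooled_arm("treated"), pooled_arm("control"))

# Both per-site differences are negative; the naive pooled one is positive.
print({name: round(rd, 3) for name, rd in per_site_rd.items()})
print(round(naive_rd, 3))
```

Real federated analyses usually combine per-site estimates rather than raw counts, which avoids this particular reversal. The example only illustrates why a difference in who is captured under a label can drive the combined number.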

The most dangerous version is when cross-site variation partially cancels out, producing an effect estimate that appears stable across sensitivity analyses while masking the underlying inconsistency. The study looks robust. The problem is invisible.

What federated RWE needs beyond structural data quality

Documenting that sites passed the Data Quality Dashboard is necessary. It is not sufficient for a credible federated study. What is also needed is a concept-level evaluation of whether sites are semantically equivalent on the specific concepts driving the study.

That evaluation needs to operate across several dimensions for each concept.

Vercori is designed to evaluate all six dimensions for each concept in a study, generate a divergence score, and produce a documented report of findings with reviewer decisions recorded in a tamper-evident audit log.

Regulatory context for federated RWE consistency

FDA guidance on real-world evidence has progressively emphasized that data reliability requires more than structural correctness. The agency expects sponsors to demonstrate that data from multiple sources is fit for the specific purpose of the study, which includes demonstrating consistency in how key concepts are defined and applied across sites.

Vercori is built to produce the documentation needed to meet that expectation directly: a site-by-site, concept-by-concept consistency assessment designed to attach to a submission package or reference in a study methods section.

How Vercori Works

Built for the way federated networks actually operate.

Each institution runs its own local analysis. Vercori receives per-site, per-concept fingerprints, never patient data, never source codes. The platform is designed to:

1. Score every concept by comparing how it is actually recorded across sites and quantifying divergence at the individual concept level.

2. Flag every concept with unresolved divergence before it reaches your analysis, holding results until a qualified reviewer has documented a resolution.

3. Record the full chain, including classifications, reviewer decisions, resolution rationales, and gating actions, in a tamper-evident log packaged for regulatory submission.
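One way to picture the scoring and gating steps is the sketch below. It is not Vercori's algorithm, which this page does not specify. It assumes a hypothetical fingerprint format (aggregate counts of one concept by care setting, with no patient-level data), uses Jensen-Shannon divergence as a stand-in divergence measure, and applies an arbitrary threshold. The names `fingerprints`, `DIVERGENCE_THRESHOLD`, and `max_pairwise_divergence` are illustrative only.

```python
import math
from itertools import combinations

# Hypothetical per-site fingerprints for a single concept: aggregate counts
# by care setting. No patient rows and no source codes are involved.
fingerprints = {
    "site_a": {"inpatient": 700, "outpatient": 250, "emergency": 50},
    "site_b": {"inpatient": 680, "outpatient": 270, "emergency": 50},
    "site_c": {"inpatient": 100, "outpatient": 850, "emergency": 50},
}

DIVERGENCE_THRESHOLD = 0.1  # arbitrary value chosen for illustration


def normalize(counts: dict) -> dict:
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("fingerprint has no observations")
    return {key: value / total for key, value in counts.items()}


def kl_divergence(p: dict, q: dict) -> float:
    """KL(p || q) in bits; q must be positive wherever p is."""
    return sum(pk * math.log2(pk / q[key]) for key, pk in p.items() if pk > 0)


def js_divergence(counts_p: dict, counts_q: dict) -> float:
    """Jensen-Shannon divergence (base 2), bounded between 0 and 1."""
    p, q = normalize(counts_p), normalize(counts_q)
    keys = set(p) | set(q)
    mixture = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


def max_pairwise_divergence(site_counts: dict) -> float:
    return max(
        js_divergence(site_counts[a], site_counts[b])
        for a, b in combinations(sorted(site_counts), 2)
    )


score = max_pairwise_divergence(fingerprints)
held_for_review = score > DIVERGENCE_THRESHOLD  # gate: hold until resolved
print(round(score, 3), held_for_review)
```

Taking the maximum pairwise score means a single outlying site is enough to hold a concept for review, which matches the gating behavior described above. A production design would also need to decide which summaries go into a fingerprint and how small cell counts are handled.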

Common Questions

Federated RWE consistency: frequently asked questions

Does this problem affect all federated studies or only certain types?

It affects any study that combines data from multiple independently coded sites. The magnitude of risk varies by concept, therapeutic area, and how much coding practice varies across the specific sites in a network. A concept-level evaluation identifies which concepts in a specific study carry meaningful cross-site risk rather than applying a blanket assessment.

If sites are in the same OHDSI network, does that reduce the risk?

OHDSI network membership means sites have mapped their data to OMOP and meet certain data quality standards. It does not mean sites code concepts identically. Cross-site semantic variation has been documented within OHDSI networks in published research. Shared network membership reduces structural heterogeneity, not semantic heterogeneity.

Can statistical methods correct for cross-site semantic inconsistency?

Standard methods like site stratification and mixed-effects models can account for some forms of cross-site heterogeneity. They cannot correct for the underlying problem if the source of heterogeneity is unknown or unmeasured. Knowing which concepts diverge, and by how much, is a prerequisite for making informed methodological choices.
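To make the limitation concrete, the sketch below computes a fixed-effect inverse-variance pooled estimate together with Cochran's Q and the I² statistic, two standard heterogeneity measures from meta-analysis. The per-site log hazard ratios and standard errors are invented for illustration. The statistics can signal that sites disagree, but nothing in them identifies which concept definition is responsible.

```python
def pooled_and_heterogeneity(estimates, std_errors):
    """Fixed-effect inverse-variance pooling with Cochran's Q and I-squared."""
    if len(estimates) != len(std_errors) or len(estimates) < 2:
        raise ValueError("need matching estimates and standard errors for 2+ sites")
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    # Cochran's Q: weighted squared deviations of site estimates from the pool.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared: share of total variation attributed to between-site differences.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, i_squared


# Hypothetical per-site log hazard ratios and their standard errors.
site_log_hr = [-0.30, -0.25, 0.10]
site_se = [0.08, 0.10, 0.09]

pooled_lhr, q_stat, i_squared = pooled_and_heterogeneity(site_log_hr, site_se)
print(round(pooled_lhr, 3), round(q_stat, 2), round(i_squared, 2))
```

A high I² here would prompt a random-effects model or a sensitivity analysis that excludes the third site. Neither step explains why that site differs, which is the question a concept-level comparison is meant to answer.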

How is Vercori designed to fit into an existing federated study workflow?

Vercori is designed to run before the study analysis, during the data characterization phase. Each site would generate a semantic fingerprint of its local OMOP data for the study concepts. Fingerprints are compared centrally. The resulting consistency report is intended to inform protocol decisions before data is locked and analysis begins.

What happens when Vercori identifies a divergent concept?

The report is designed to classify each concept and document the nature and magnitude of the divergence. Qualified reviewers record a decision: whether the divergence is explainable and acceptable, requires a protocol adjustment, or needs additional clinical review. Those decisions and their rationale are recorded in the audit log and included in the final report.
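A common way to make a log of this kind tamper-evident is a hash chain, where each entry stores a hash that covers both its own content and the previous entry's hash. The sketch below illustrates that general technique only. The page does not describe how Vercori's audit log is implemented, and the field names (`concept_id`, `decision`, `rationale`) and their values are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the previous hash and payload."""
    body = json.dumps(
        {"prev": prev_hash, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log: list, payload: dict) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append(
        {"prev": prev_hash, "payload": payload, "hash": entry_hash(prev_hash, payload)}
    )


def verify(log: list) -> bool:
    """Recompute every hash in order; any edited or reordered entry fails."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, {
    "concept_id": 12345,  # hypothetical identifier
    "decision": "requires protocol adjustment",
    "rationale": "Site C records this concept mostly in outpatient settings.",
})
append_entry(audit_log, {
    "concept_id": 67890,  # hypothetical identifier
    "decision": "explainable and acceptable",
    "rationale": "Difference traced to a documented formulary change.",
})
print(verify(audit_log))
```

Editing an earlier decision after the fact changes its hash and breaks every later link, so the change is detectable on verification. A chain like this shows that the log was altered, not who altered it. Signing entries or anchoring the latest hash with a third party would be separate design choices.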

Pilot Program

Looking for pilot partners.

Vercori is in active pilot development. We are working with a small number of founding partners to build and validate the platform against real OMOP network use cases. If you run multi-site OMOP studies, operate a network site, or advise pharma sponsors on real-world evidence, we want to hear from you. Pilot studies are scoped individually based on network size and use case.

Book a demo  →

Run your federated study on data you have actually verified.

Vercori is designed to evaluate cross-site semantic consistency before your study runs. Timeline is scoped individually with each pilot partner.

Get in touch  →