Data Quality Dashboard Limitations for Federated Studies

What the Data Quality Dashboard actually checks

The OHDSI Data Quality Dashboard is the standard tool for evaluating OMOP data quality at a site. It runs a systematic set of checks against a single OMOP instance and evaluates data across three categories:

Completeness: whether expected data fields are populated and whether expected records are present for the patient population
Conformance: whether data values follow OMOP conventions, including correct concept ID usage, value domains, and relational integrity
Plausibility: whether values are clinically plausible given what else is known about a patient, including temporal plausibility and atemporal distributional checks

These checks are rigorous and valuable. A site that passes DQD has data that is structurally sound, correctly mapped, and internally plausible. That is a meaningful quality standard.

That standard is also entirely within-site. The DQD was designed to evaluate one OMOP instance, and it has no mechanism for comparing that site's data with any other site's data.
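As a rough illustration of the three categories described above, the sketch below runs one toy check per category against a handful of made-up rows. It is not the DQD's implementation: the table shapes, the sample rows, the check names, and the 5% failure threshold are all assumptions chosen for the example. Note that every check reads data from one site only.

```python
from datetime import date

# Hypothetical person table: person_id -> birth date.
persons = {1: date(1960, 5, 2), 2: date(1975, 1, 15), 3: date(1980, 6, 30)}

# Hypothetical rows shaped loosely like condition records at ONE site.
rows = [
    {"person_id": 1, "condition_concept_id": 201826, "start": date(2020, 3, 1)},
    {"person_id": 2, "condition_concept_id": None, "start": date(2021, 7, 9)},
    {"person_id": 99, "condition_concept_id": 201826, "start": date(2019, 2, 2)},
    {"person_id": 3, "condition_concept_id": 201826, "start": date(1950, 1, 1)},
]


def starts_before_birth(row):
    """Temporal plausibility: a condition should not start before birth."""
    birth = persons.get(row["person_id"])
    return birth is not None and row["start"] < birth


# Each rule returns True when a row VIOLATES the check.
violation_rules = {
    "completeness: concept field populated":
        lambda r: r["condition_concept_id"] is None,
    "conformance: person_id exists in person table":
        lambda r: r["person_id"] not in persons,
    "plausibility: condition starts on or after birth":
        starts_before_birth,
}


def pct_violating(rows, is_violation):
    """Percentage of rows that violate a rule."""
    return 100.0 * sum(1 for r in rows if is_violation(r)) / len(rows)


THRESHOLD_PCT = 5.0  # assumed failure threshold, for illustration only

report = {name: pct_violating(rows, rule) for name, rule in violation_rules.items()}
for name, pct in report.items():
    status = "FAIL" if pct > THRESHOLD_PCT else "PASS"
    print(f"{name}: {pct:.1f}% violating rows -> {status}")
```

Nothing in this report says anything about how another site populates the same fields, which is the limitation the next section describes.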

What DQD does not check

The DQD does not evaluate whether two sites that both pass their individual DQD checks are semantically equivalent. It cannot, because it only sees one site at a time. The questions it does not answer include:

Are the patients coded under concept X at Site A clinically similar to those coded under concept X at Site B?
Do sites apply the same diagnostic confirmation thresholds before assigning a diagnosis code?
Are the source codes that map to a given OMOP concept distributed similarly across sites, or do different sites rely on different underlying codes?
Do co-occurrence patterns suggest that sites are capturing the same clinical population?
Are treatment patterns and specialty involvement consistent, or do they reflect systematically different patient populations?

None of these questions can be answered from a single-site DQD report. For a study that operates within a single site, that is fine. For a federated study that combines data across sites, these are the questions that determine whether the combination is valid.

Saying "all sites passed DQD" answers the within-site quality question. It does not answer the cross-site equivalence question. For federated studies, both answers are required.
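To make the source-code distribution question concrete, here is a minimal sketch of one way such a comparison could be quantified. It uses Jensen-Shannon divergence, which is an assumption made for illustration and not a description of how Vercori or any OHDSI tool measures it. The source-code labels and counts are hypothetical.

```python
import math
from collections import Counter


def normalize(counts):
    """Turn raw source-code counts into a probability distribution."""
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 = identical, 1 = no overlap."""
    codes = set(p) | set(q)
    mixture = {c: 0.5 * (p.get(c, 0.0) + q.get(c, 0.0)) for c in codes}

    def kl_to_mixture(dist):
        return sum(
            prob * math.log2(prob / mixture[c])
            for c, prob in dist.items()
            if prob > 0
        )

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Hypothetical counts of source codes that map to the same OMOP concept.
site_a = normalize(Counter({"SRC-001": 800, "SRC-002": 150, "SRC-003": 50}))
site_b = normalize(Counter({"SRC-001": 300, "SRC-002": 500, "SRC-003": 200}))

score = js_divergence(site_a, site_b)
print(f"Source-code divergence between Site A and Site B: {score:.3f} bits")
```

Both hypothetical sites could pass every within-site check while still producing a non-zero divergence here. Deciding how large a divergence matters for a given study is a judgment the metric alone does not make.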

DQD vs Vercori: what each tool addresses

Check	DQD	Vercori
Structural completeness within a site	✓ Yes	✓ Inherits
Value conformance to OMOP conventions	✓ Yes	✓ Inherits
Temporal and atemporal plausibility	✓ Yes	✓ Inherits
Cross-site semantic consistency	— Out of scope	★ Core function
Diagnostic confirmation variation	— Out of scope	★ Detected
Source code distribution comparison	— Out of scope	★ Measured
Cross-site co-occurrence analysis	— Out of scope	★ Measured
Submission-ready consistency documentation	— Out of scope	★ Produced
Tamper-evident reviewer audit log	— Out of scope	★ Produced

Why this distinction matters for federated study design

In a single-site study, DQD is sufficient as a data quality evaluation. The study operates within one institution's data, and the within-site quality questions are the right ones to ask.

In a federated study, data from multiple sites is combined. The validity of that combination depends on whether sites are semantically equivalent on the concepts being analyzed. That is not a question the DQD was designed to answer, and running DQD at each site and reporting that all sites passed does not address it.

The gap matters more as the number of sites and the clinical complexity of the study increase. A two-site study with well-characterized sites on a simple concept may have manageable cross-site risk. A ten-site study across diverse institution types on complex chronic conditions has substantial cross-site semantic risk that requires explicit evaluation and documentation.

Using both tools together

DQD and Vercori are designed to be complementary. DQD should run first, establishing that each site's data is structurally sound before cross-site comparison begins. Vercori is designed to then evaluate whether that structurally sound data is semantically consistent across sites.

Together, they are built to answer the full data quality question for a federated study: is each site's data correct within itself, and are the sites measuring the same clinical reality across the network?

The output of both evaluations should be documented and available for regulatory review. For submissions where real-world data quality is a material question, the combination of within-site structural quality documentation and cross-site semantic consistency documentation is the complete answer.
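The ordering described in this section can be sketched as a simple gate: only sites that have passed their within-site evaluation move on to pairwise cross-site comparison. The site names, the boolean pass/fail summary, and both function names below are hypothetical. A real DQD run produces a detailed per-check report, not a single flag.

```python
from itertools import combinations


def sites_ready_for_comparison(dqd_status):
    """Return the sites whose within-site evaluation passed, sorted by name."""
    return sorted(site for site, passed in dqd_status.items() if passed)


def plan_cross_site_pairs(ready_sites):
    """List every unordered pair of ready sites for cross-site comparison."""
    return list(combinations(ready_sites, 2))


# Hypothetical summary of each site's within-site result.
dqd_status = {"site_a": True, "site_b": False, "site_c": True, "site_d": True}

ready = sites_ready_for_comparison(dqd_status)
pairs = plan_cross_site_pairs(ready)

print("Ready for cross-site comparison:", ready)
print("Pairs to compare:", pairs)
```

In this sketch a site that fails the first step is held back, not compared, so structural problems are not mistaken for semantic differences.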

Common Questions

DQD and Vercori: frequently asked questions

Does Vercori replace the Data Quality Dashboard?

No. The DQD evaluates within-site structural data quality. Vercori is designed to evaluate cross-site semantic consistency. They address different questions, and both are needed for a complete data quality evaluation of a federated study. The right approach is to run DQD first at each site, then run Vercori across sites.

Are there other OHDSI tools that address this gap?

ARES provides network-level data characterization and can surface distributional differences across sites. CohortDiagnostics evaluates cohort definitions across sites. Neither tool is designed to close the specific gap Vercori addresses: a concept-level cross-site semantic consistency evaluation with reviewer decisions recorded in a tamper-evident audit log, producing documentation designed to attach to a study submission.

What would Vercori require from sites that have already run DQD?

Sites that have completed DQD have already established the structural quality of their OMOP data. Vercori is designed to then run a local semantic fingerprint analysis at each site. The technical requirements for that additional step are intended to be modest and would not require re-running DQD or modifying existing data.

Can a Vercori evaluation surface problems that DQD missed?

It is designed to surface a different category of problem. DQD checks structural correctness. Vercori is built to check semantic consistency across sites. A concept can be structurally perfect at every site while being defined differently enough across sites to materially affect study results. Vercori is designed to identify those differences. DQD is not built to do that.

How is Vercori designed to document its findings for regulatory purposes?

Each finding is intended to be assessed by a qualified reviewer whose identity, decision, and rationale are recorded in a tamper-evident audit log. The final report is designed to be signed and time-stamped so that it can be attached to a regulatory submission package or referenced in a study methods section.
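For readers unfamiliar with the term, one common way to make a log tamper-evident is a hash chain, in which each entry's hash covers the previous entry's hash. The sketch below shows that general technique only. It is an assumption for illustration, not a description of Vercori's audit log, and the field names and sample entries are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(body, prev_hash):
    """SHA-256 over the entry body plus the previous entry's hash."""
    payload = json.dumps({"body": body, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, reviewer, decision, rationale, timestamp):
    """Append a reviewer decision, chaining it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {
        "reviewer": reviewer,
        "decision": decision,
        "rationale": rationale,
        "timestamp": timestamp,
    }
    log.append({"body": body, "prev": prev_hash, "hash": entry_hash(body, prev_hash)})


def verify(log):
    """Return True only if every entry still matches its recorded hash."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["body"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, "reviewer_1", "accept", "Distributions comparable", "2024-01-01T10:00:00Z")
append_entry(log, "reviewer_2", "flag", "Source codes differ by site", "2024-01-02T09:30:00Z")
print("Log intact:", verify(log))

log[0]["body"]["decision"] = "flag"  # simulate an after-the-fact edit
print("Log intact after edit:", verify(log))
```

A chain like this only reveals edits to someone who holds a trusted copy of the latest hash, which is one reason such logs are usually paired with signing and time-stamping.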
