
Real World Data in Clinical Trials: Practical Guide

Real world data in clinical trials explained: sources, regulatory context, use cases, and a practical guide to capturing RWD in your next study.
[Infographic: real-world data sources in clinical trials, from wearables and EHR to patient-reported outcomes]

Regulators now accept real world data as evidence. But the harder question for most research teams is more practical. How do we actually capture it inside a live trial, alongside our existing endpoints, without building a second data pipeline from scratch?

Real world data (RWD) now sits alongside traditional randomised controlled trial data in more submissions every year. The US Food and Drug Administration (FDA) has issued a growing stack of guidance since the 21st Century Cures Act. The European Medicines Agency (EMA) is scaling its DARWIN EU network. Australia's Therapeutic Goods Administration (TGA) accepts real world evidence (RWE) for specific regulatory decisions.

The direction of travel is clear. What's less clear is how a principal investigator, study manager, or sponsor actually brings RWD into a protocol.

This guide is for the people doing that work. We'll cover what RWD is, where it comes from, how it gets used across the trial lifecycle, what regulators expect, and how to operationalise it without blowing up your timeline.

Along the way we'll share practical examples from studies we've supported. Those include a rare disease registry with 15 years of longitudinal data, a respiratory observational study running on wearable devices, and a 6,000 participant decentralised vaccine trial across five countries. If you want the short version, our real world evidence platform is built to make prospective RWD capture part of the same participant experience as the rest of the trial.

What is real world data in clinical trials?

Real world data in clinical trials refers to patient health information collected outside the traditional controlled trial setting. Typical sources include electronic health records, administrative claims, disease registries, patient reported outcomes, wearable devices, and mobile health apps. In a clinical trial context, RWD can supplement the data collected through protocol driven site visits, and in specific cases, support regulatory decisions on its own.

The FDA's working definition is aligned with that. RWD is data relating to patient health status or the delivery of health care routinely collected from a variety of sources. Think of RWD as the raw material. Real world evidence is what you build from it through well designed analysis.

Real world data vs real world evidence (RWD vs RWE)

In practice, RWD and RWE are often used interchangeably, which causes problems in submissions. They are different things.

  • Real world data (RWD) is the raw patient data collected outside a clinical trial setting. These are the inputs.
  • Real world evidence (RWE) is the clinical evidence generated from analysing RWD. This is the output.
| Aspect | RWD | RWE |
| --- | --- | --- |
| What it is | Raw patient data | Clinical evidence |
| Sources | EHR, claims, registries, wearables, PROMs | Analyses and studies built on RWD |
| Purpose | Inputs | Outputs |
| Example | A year of heart rate readings from participants' wearables | A published analysis showing those patterns predict hospital readmission |

The TGA, FDA, and EMA all draw this line in their guidance. When you're writing a protocol, you're planning what RWD you'll capture. When you're writing a submission, you're presenting RWE.

Where real world data comes from

Good RWE starts with choosing the right real world data sources for your research question. Most trials that use RWD draw from a mix of the following.

Electronic health records (EHR) and medical records

Clinician entered information about diagnoses, procedures, medications, and lab results. Rich clinical detail, but variable quality across sites and often stuck in systems that don't talk to each other.

Administrative and claims data

Payer and provider billing information. Good for utilisation, cost, and population scale analyses. However, it lacks clinical detail like symptoms and lab values.

Disease and patient registries

Longitudinal, condition specific datasets built around a defined population. Registries are a strong fit for rare diseases, long term safety surveillance, and natural history studies.

When we supported FSHD Global in building their rare disease patient registry, the data ended up covering 15 years of records. That included participant reported outcomes, clinical assessments, and genetic information across a globally distributed population. That kind of longitudinal record is almost impossible to reconstruct retrospectively.

Patient reported outcomes (PROMs and PREMs)

Symptom, function, quality of life, and experience measures captured directly from participants. Validated instruments from our PROMs and PREMs library, including EQ-5D, PROMIS, and condition specific scales, do most of the work here.

Wearables, sensors, and digital biomarkers

Continuous physiological data from consumer wearables, clinical grade sensors, and connected medical devices. Cardiac rhythm, activity, sleep, and respiratory data are now part of mainstream research. The Diag-NOSE respiratory research study combined home spirometry, wearable activity data, and symptom PROMs to build a richer picture than clinic visits alone could produce.

Participant generated health data (PGHD) from mobile apps

App captured symptom diaries, medication adherence, reminder response, and uploaded content like photos, voice notes, and short videos. Our Integration Engine lets you link this back to your study database and, where relevant, to EHR systems.

Labs, imaging, and genomic data

Results from local pathology and imaging providers, plus genomic datasets. The size of these inputs has grown quickly. Projects like OurDNA, which we supported across diverse Australian populations, show that genomic data can become a usable RWD source at scale when the capture pipeline is thought through in advance.

How real world data is used across the trial lifecycle

Real world data in clinical trials isn't only a post approval tool. It shows up at every stage of the study lifecycle.

Study feasibility and protocol design

Use EHR and registry data to pressure test eligibility thresholds before you finalise the protocol. If only 200 patients a year meet your screening criteria across your network, you'll find out before you over commit to a recruitment timeline.
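That pre-enrolment check can be as simple as filtering a registry extract against the draft criteria. A minimal sketch in Python, where the records and field names (`age`, `egfr`, `on_comparator`) are hypothetical examples, not a real dataset:

```python
# Sketch of a feasibility check: count how many registry records meet
# draft eligibility criteria before committing to a recruitment timeline.
# Field names (age, egfr, on_comparator) are illustrative only.

def eligible(record, min_age=18, max_age=75, min_egfr=60):
    """Return True if a de-identified record meets the draft criteria."""
    return (
        min_age <= record["age"] <= max_age
        and record["egfr"] >= min_egfr
        and not record["on_comparator"]
    )

def feasibility_count(records):
    """How many candidates does this dataset actually yield?"""
    return sum(eligible(r) for r in records)

registry = [
    {"age": 54, "egfr": 72, "on_comparator": False},
    {"age": 81, "egfr": 65, "on_comparator": False},  # fails age ceiling
    {"age": 49, "egfr": 41, "on_comparator": False},  # fails eGFR floor
    {"age": 62, "egfr": 88, "on_comparator": True},   # fails comparator rule
]

print(feasibility_count(registry))  # only 1 of 4 records qualifies
```

Running this against a real extract before the protocol is locked is exactly the kind of cheap check that prevents an over-committed recruitment timeline.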

Patient recruitment and site selection

Site level RWD tells you which clinics actually see the population you need. It beats guessing from historic enrolment rates at sites your team has worked with before.

External or synthetic control arms

An external control arm replaces a concurrent comparator group with data from patients outside the trial. Regulators now accept RWD based external control arms in specific contexts, especially for rare diseases and oncology where a placebo arm would be unethical or impractical. The FDA's 2023 guidance goes into detail on the conditions that make this acceptable.
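One condition regulators look at is whether the external control population actually resembles the trial arm at baseline. A common screening statistic is the standardized mean difference (SMD) for each covariate, with absolute values above roughly 0.1 typically flagged for adjustment. A stdlib-only sketch, with illustrative data and threshold:

```python
import statistics

def standardized_mean_difference(trial_values, control_values):
    """SMD for one baseline covariate: difference in means divided by
    the pooled standard deviation. |SMD| > ~0.1 is commonly flagged."""
    mean_t = statistics.mean(trial_values)
    mean_c = statistics.mean(control_values)
    var_t = statistics.variance(trial_values)    # sample variance
    var_c = statistics.variance(control_values)
    pooled_sd = ((var_t + var_c) / 2) ** 0.5
    return (mean_t - mean_c) / pooled_sd

# Illustrative baseline ages: external controls skew older than the trial arm.
trial_age   = [54, 61, 58, 49, 63]
control_age = [66, 71, 59, 68, 72]

smd = standardized_mean_difference(trial_age, control_age)
print(abs(smd) > 0.1)  # imbalanced: adjust (e.g. weighting) before comparing
```

This is a screening step, not the full analysis; propensity weighting or matching would follow for any flagged covariate.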

Pragmatic and hybrid trials

A pragmatic trial sits closer to routine care than a classical RCT. A hybrid trial blends randomised interventional elements with real world follow up. Our decentralised clinical trial platform was designed for that mixed model.

Safety surveillance and label expansion

RWD supports ongoing safety monitoring after approval and provides the evidence base for label expansions. For that reason, many sponsors now build ongoing RWD capture into the original trial design, so they aren't starting from scratch after launch.

Health economics and outcomes research (HEOR)

Payers and health technology assessment bodies want evidence a therapy works in actual use, not only in ideal trial conditions. RWD is where that evidence lives.

Want to see how this works for your study? Our real world evidence platform supports prospective RWD capture across wearables, PROMs, app reported symptoms, and connected devices in a single participant experience.

Regulatory context for RWD in clinical trials

If you're planning to use RWD to support a regulatory decision, the agencies' positions matter. The short version is that all three major regulators accept RWE in specific circumstances, and all three expect the data to be fit for purpose.

FDA RWE framework and 21st Century Cures Act

The 21st Century Cures Act (2016) directed the FDA to create a framework for evaluating RWE in regulatory decisions. The FDA has since released multiple RWE guidance documents covering data standards, study design, submission format, and the use of external control arms. Notable approvals using RWE include label expansions, post approval requirements, and rare disease indications.

EMA and DARWIN EU

The EMA launched the Data Analysis and Real World Interrogation Network (DARWIN EU) in 2022. It connects national data sources across member states to support regulatory decisions with RWE, and the network is scaling through to 2026. Expect EMA submissions to increasingly draw on DARWIN EU benchmarks.

TGA guidance in Australia

The TGA publishes specific guidance on real world evidence and when it can support regulatory decisions in Australia. For Australian sponsors, that's the first reference to read. For global sponsors, TGA's position matters because Australian registration often follows a different evidence pathway than the US or EU.

ICH E6(R3) and data quality expectations

The International Council for Harmonisation's E6(R3) good clinical practice (GCP) update moves toward modern, risk based trial management that accommodates RWD. Data provenance, traceability, and audit trails apply to RWD as strictly as they do to traditional trial data.

Still, regulators are careful with RWD. "Supports" is the right verb. RWE rarely replaces an RCT on its own.

Data quality and fit for purpose assessment

Not all real world data in clinical trials is research grade. The FDA's framework centres on two questions: is the data relevant to your research question, and is it reliable enough to support regulatory decisions?

A fit for purpose assessment usually covers:

  1. Relevance: Does the source population match your target population? Does it capture the outcomes you care about?
  2. Reliability: Are the data elements well defined, consistently collected, and validated? What's the missingness rate across sites?
  3. Provenance: Where did the data come from? How was it captured, transformed, and stored?
  4. Linkage: Can you connect records across sources, for example EHR to claims to PROMs, without introducing bias?
  5. Consent and privacy: Do participants understand what's being collected and how it's being used? Can you support re-consent if your analysis plan changes?
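The reliability item above ("what's the missingness rate across sites?") is a concrete calculation worth automating. A minimal sketch, with hypothetical site and field names; a real assessment would cover every protocol-critical element:

```python
# Sketch of one reliability check from the list above: per-site
# missingness for a required data element.

def missingness_by_site(records, field):
    """Fraction of records at each site where `field` is absent or None."""
    totals, missing = {}, {}
    for r in records:
        site = r["site"]
        totals[site] = totals.get(site, 0) + 1
        if r.get(field) is None:
            missing[site] = missing.get(site, 0) + 1
    return {s: missing.get(s, 0) / n for s, n in totals.items()}

records = [
    {"site": "A", "hba1c": 6.1},
    {"site": "A", "hba1c": None},   # missing at site A
    {"site": "B", "hba1c": 7.0},
    {"site": "B", "hba1c": 6.4},
]

print(missingness_by_site(records, "hba1c"))  # {'A': 0.5, 'B': 0.0}
```

A site with an outlying missingness rate is usually the first sign that a data element isn't being captured consistently there.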

The consent piece is where many retrospective RWD projects run into trouble. Prospective capture through dynamic eConsent sidesteps that. Participants explicitly agree to each data use at enrolment, and they give additional consent for new sub-studies as those emerge.

Benefits and limitations of real world data

Benefits of RWD in clinical trials

  • Broader population coverage: RWD reaches patients who never make it into traditional trials, including older adults, participants with comorbidities, and people in remote communities.
  • Faster feasibility checks: Protocol assumptions get tested against real populations before enrolment opens.
  • Long follow up: Registry and EHR data can extend follow up by years beyond the formal trial endpoint.
  • Lower participant burden: Much of the data is collected passively through routine care or wearables.
  • Regulatory flexibility: FDA, EMA, and TGA all accept RWE for specific decisions where an RCT isn't feasible.
  • Commercial readiness: Payers want real world evidence of effectiveness, not only efficacy.

Limitations and common pitfalls

  • Data heterogeneity: EHR data varies across sites, vendors, and clinicians.
  • Missing outcomes: Important endpoints like quality of life, symptoms, and side effects often aren't captured in routine care data.
  • Selection bias: Who ends up in a given dataset isn't random, and the analysis has to account for that.
  • Linkage risk: Connecting datasets can introduce privacy issues and re-identification risk if not handled carefully.
  • Regulatory uncertainty: RWE acceptance varies by indication, agency, and decision type.
  • Analysis complexity: Propensity scores, causal inference methods, and sensitivity analyses are demanding work.

So if your protocol assumes RWD will fill a specific evidence gap, pressure test that assumption early with a small feasibility analysis. It's far cheaper to rethink data sourcing at month two than at month twelve.

How to operationalise RWD in your clinical trial

Most guides stop at the theory. For research teams building a study, here's the practical path we walk through with partners.

Step 1: Map evidence needs to fit for purpose sources

Start with the question the study has to answer. List the endpoints, the populations, the follow up period, and the regulatory users of the final evidence. Then map each piece of evidence to the RWD source most likely to produce it well. Avoid the "let's collect everything" trap.

Step 2: Decide prospective vs retrospective capture

Retrospective RWD from existing EHR, claims, and registry data is faster to start but constrained by what's already been collected. Prospective RWD from participant apps, wearables, and ePRO takes more setup, but is built to answer your specific question. In practice, most rigorous trials use a mix of the two.

Step 3: Design consent and governance upfront

Build consent for each RWD source into the initial protocol. Dynamic eConsent lets participants agree to specific uses at enrolment and add consent later for new sub-studies. This is the single biggest operational difference between trials that can expand their RWD use over time and trials that get stuck at database lock.
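The "add consent later" pattern is easier to reason about when each consent is stored as its own timestamped record rather than a single boolean. A minimal sketch of that data shape (field names are illustrative, not WeGuide's actual schema):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ConsentRecord:
    scope: str                 # e.g. "wearable_sync", "substudy_genomics"
    granted_at: datetime
    withdrawn_at: Optional[datetime] = None

@dataclass
class Participant:
    participant_id: str
    consents: list = field(default_factory=list)

    def grant(self, scope):
        """Record a new, separately timestamped consent for one data use."""
        self.consents.append(ConsentRecord(scope, datetime.now(timezone.utc)))

    def has_consent(self, scope):
        """True if an active (not withdrawn) consent exists for this scope."""
        return any(c.scope == scope and c.withdrawn_at is None
                   for c in self.consents)

p = Participant("P-001")
p.grant("wearable_sync")                   # agreed at enrolment
print(p.has_consent("wearable_sync"))      # True
print(p.has_consent("substudy_genomics"))  # False until separately granted
p.grant("substudy_genomics")               # added later for a new sub-study
print(p.has_consent("substudy_genomics"))  # True
```

Because each scope is a separate record, a new sub-study only needs one new grant, and the audit trail of who agreed to what, and when, falls out of the data model for free.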

Step 4: Build capture into the participant experience

A clean participant app is the difference between 90% adherence and 40%. Digital diaries, reminder logic, wearable data sync, and PROMs collection should feel like one experience, not six. There's more detail on why this matters in our post on mobile data collection for research studies.

When we supported the BRACE Trial with Murdoch Children's Research Institute, the decentralised platform reached over 6,000 participants across five countries during the pandemic. Adherence held above 90%. The same participant app carried eConsent, symptom diaries, and wearable data, so the RWD flowed in the background while the participant focused on the study itself.

Step 5: Plan analytics and reporting

Decide how RWD flows into your analysis before enrolment starts. An analytics dashboard with real time views of data quality, missingness, and adherence saves you from finding problems at database lock, when fixing them is most expensive.
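The adherence metric such a dashboard tracks can be sketched simply: diary entries completed versus expected, per participant, with a threshold that flags problems while they're still fixable. Thresholds and IDs below are illustrative:

```python
# Sketch of an adherence monitor: flag participants whose diary
# completion drops below a threshold, during the study rather than
# at database lock. Data and threshold are illustrative.

def adherence(expected_days, completed_days):
    """Share of expected diary days with a completed entry."""
    expected = set(expected_days)
    return len(set(completed_days) & expected) / len(expected)

def flag_low_adherence(participants, threshold=0.8):
    """Return IDs of participants below the adherence threshold."""
    return [pid for pid, (exp, done) in participants.items()
            if adherence(exp, done) < threshold]

participants = {
    "P-001": (range(1, 11), [1, 2, 3, 4, 5, 6, 7, 8, 9]),  # 90% adherent
    "P-002": (range(1, 11), [1, 2, 4]),                     # 30% adherent
}

print(flag_low_adherence(participants))  # ['P-002']
```

Surfacing that list weekly is what turns a data-quality problem from a database-lock surprise into a mid-study phone call.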

Ready to plan your RWD capture strategy? Explore our clinical trial solutions to see how WeGuide supports prospective real world data collection across clinical trials, registries, and observational studies.

Frequently asked questions about RWD in clinical trials

Can real world data replace a clinical trial?

Usually no, not on its own. For most drug approvals, regulators still expect RCT data as the primary efficacy evidence. RWD can support specific decisions like label expansions, rare disease indications where an RCT isn't feasible, and ongoing safety monitoring. RWE and RCTs are complementary, not competing.

What's the difference between RWD and an observational study?

RWD is a data type. An observational study is a study design that may or may not use RWD. Our observational studies platform supports collecting RWD prospectively from participants through PROMs, wearables, and app activity, rather than relying only on retrospective EHR data.

Is RWD accepted by the FDA for drug approvals?

The FDA accepts RWE to support specific regulatory decisions under its post-2016 framework. Acceptance depends on the indication, the decision type, and the quality of the underlying data. The FDA's published RWE guidance is the place to start if you're planning a submission that includes RWE.

How do you protect participant privacy with real world data?

Through layered controls. Dynamic eConsent for explicit participant consent to each data use. De-identification standards appropriate to the data type. Secure data storage with role based access.

Clear data governance documentation matters too. WeGuide is a TGA certified Class I medical device software platform and supports GCP aligned data management across all of these controls.

Bringing it all together

Real world data in clinical trials has moved from a side project to a core piece of modern research. Today, regulators accept it, payers want it, and participants generate more of it every year through the devices and apps already part of their daily lives.

The practical work lives in the protocol. Map your evidence needs to the right sources. Build consent and governance in from day one.

Design the participant experience so the data collects itself. Plan your analytics before enrolment opens.

Our team has supported rare disease registries, decentralised trials, population cohorts, and observational research running entirely on digital capture. On balance, the studies that get the most value from RWD are the ones that plan for it upfront, not the ones that bolt it on at the end.

If you're thinking about how to bring real world data into your next clinical trial, we'd like to help. Take a look at our observational studies platform, or book a conversation to talk through your study design.
