
TTE per-protocol methods: manuscript and supplemental text
Source: vignettes/tte-methods.Rmd
This vignette provides drop-in text describing the target trial
emulation (TTE) per-protocol methodology implemented in the
swereg-TTE family of functions (TTEEnrollment,
TTEPlan, and friends). It is provided in two flavours so
that you can paste the appropriate level of detail into your own
manuscript.
- Manuscript methods — prose-only, suitable for the main body of a journal article. No equations.
- Supplemental methods — detailed, with formulas and per-step model specifications, suitable for a supplement or methods appendix.
Both versions describe the same workflow:
- Build a sequential-trials enrollment panel from the registry skeleton.
- Adjust for baseline confounding of treatment with s2_ipw().
- Censor at protocol deviation and compute stabilised inverse-probability-of-censoring weights with s4_prepare_for_analysis() (which internally calls s5_prepare_outcome() and s6_ipcw_pp()).
- Fit a weighted Poisson marginal structural model for the incidence rate ratio with $irr(), using person-level cluster-robust standard errors.
The implementation follows the canonical sequential-trial-emulation
literature (Hernán and Robins 2008; Danaei et al. 2013; Hernán and
Robins 2016; Caniglia et al. 2023; Cashin et al. 2025) and produces
results that agree with the reference R implementation
TrialEmulation (Su et al. 2024) on simulated data with a
known true effect.
1. Manuscript methods
We applied target trial emulation, a framework for analysing
observational data under explicit protocols that mirror a hypothetical
randomized trial, to estimate the per-protocol effect of
treatment on outcome in the Swedish national
health registries (Hernán and Robins 2008, 2016). The framework requires
that the analyst specify eligibility criteria, treatment strategies, the
time of treatment assignment (“time zero”), follow-up rules, and the
causal estimand before any data analysis, and that each element of this
protocol be emulated using the observational data (Cashin et
al. 2025).
Sequential trials design
Because eligible individuals can become exposed at many different calendar times in registry data, we did not define a single time zero. Instead we used the sequential-trials (nested-trials) design of Hernán et al. (2008), as formalised for pharmacoepidemiology by Danaei et al. (2013) and adapted to time-varying eligibility windows by Caniglia et al. (2023). At every eligible time unit a new target trial is opened. Each person who is eligible at that time enters the trial as either an initiator (the unit in which they first received the treatment of interest) or a non-initiator (one or more units in which they were eligible and untreated). Individuals can therefore contribute to several sequential trials before either initiating treatment or leaving the eligible pool. This design uses each person’s treatment-naive person-time efficiently and, by anchoring time zero on eligibility rather than on the eventual exposure, prevents the immortal time bias that arises when person-time before treatment initiation is misclassified as exposed (Hernán and Robins 2016; Caniglia et al. 2023).
Estimand: per-protocol effect
Our estimand is the per-protocol effect: the effect that would have been observed if every enrolled participant had remained on the treatment strategy assigned at baseline of the trial in which they enrolled (Hernán and Robins 2008; Danaei et al. 2013).
Censoring at protocol deviation and adjustment for confounding
In every emulated trial, follow-up was artificially censored when a participant’s current treatment status diverged from the strategy assigned at that trial’s baseline (Danaei et al. 2013). Two sources of bias then need to be addressed. First, baseline treatment assignment is not random, so we adjusted for baseline confounding by inverse probability of treatment weighting based on covariates measured at each trial’s time zero (Hernán and Robins 2008; Danaei et al. 2013). Second, artificial censoring at protocol deviation is informative whenever the same time-varying factors that predict continued adherence also predict the outcome. We adjusted for this using inverse probability of censoring weighting, with separate models fit by assigned treatment arm (Hernán and Robins 2008; Danaei et al. 2013; Su et al. 2024). The censoring-weight models had numerators conditional on baseline covariates only, which stabilises the weights and improves precision, and denominators conditional on the same baseline covariates plus the most recent values of the time-varying covariates that predict adherence (Hernán and Robins 2008; Danaei et al. 2013). The final analysis weight for each person-time observation is the product of the baseline treatment weight and the cumulative censoring weight up to that observation.
Outcome model and inference
We dropped rows corresponding to censoring events from the analysis
dataset and fit a weighted Poisson regression of the binary event
indicator on assigned baseline treatment, using log person-time as the
offset, with the product of the baseline and cumulative censoring
weights as the analysis weight. The exponent of the treatment
coefficient estimates the marginal per-protocol incidence rate ratio
under sustained treatment. Because each person can contribute repeated
observations within and across emulated trials, we computed
cluster-robust standard errors clustered on the person identifier
(Hernán and Robins 2008; Danaei et al. 2013). This is the same general
estimation strategy implemented in the TrialEmulation R
package (Su et al. 2024); our implementation differs mainly in using a
weighted Poisson rate model on the collapsed analysis data rather than a
pooled logistic discrete-time hazard.
Software
All analyses were performed in R (R Core Team). Sequential-trial
enrollment, baseline inverse probability of treatment weighting,
censoring at protocol deviation, stabilised inverse probability of
censoring weighting, weight truncation, and the final weighted Poisson
incidence-rate-ratio model were implemented in the swereg
package, which orchestrates the per-protocol target trial emulation
pipeline described above. The censoring-weight denominator and numerator
models were fit as generalised additive models using mgcv.
The final weighted incidence-rate-ratio regression and its
cluster-robust standard errors were computed by
survey::svyglm() on a design clustered on the person
identifier. All data manipulation was performed in
data.table.
2. Supplemental methods
Notation
Index individuals by $i$, sequential trials by their baseline time $m$, and follow-up time within a trial by $k$. Let $A_{i,m,k}$ be the indicator of being on the protocol-defined treatment at trial-time $k$, with $A_{i,m,0}$ the assigned baseline treatment. Let $L_{i,m,k}$ be the vector of time-varying covariates measured at $k$, with $L_{i,m,0}$ their baseline values; let $V_i$ be time-fixed (e.g., sex, birth year) covariates. Let $Y_{i,m,k}$ be the outcome indicator at trial-time $k$ and $C_{i,m,k}$ the indicator of artificial censoring (the participant has deviated from the baseline-assigned strategy by trial-time $k$).
Sequential trials and the analysis panel
Following Hernán et al. (2008), Danaei et al. (2013), and Caniglia et al. (2023), we expand the registry skeleton into a panel of (person, trial, follow-up time) rows. Person $i$ contributes a row to trial $m$ whenever they are eligible at trial-time zero of trial $m$ and remain uncensored and event-free through trial-time $k$. As in Su et al. (2024), this expansion is what allows a single weighted regression to recover the marginal per-protocol contrast pooled across all sequential trials.
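The expansion described above can be sketched in base R. This is a minimal illustration, not the swereg implementation: the toy `skeleton` data frame and its column names (`person_id`, `time`, `eligible`, `treated`) are assumptions made for the example, and follow-up simply runs to each person's last observed row rather than stopping at an event or at censoring.

```r
# Toy person-period skeleton (hypothetical column names, not the swereg schema).
skeleton <- data.frame(
  person_id = c(1, 1, 1, 1, 2, 2, 2),
  time      = c(0, 1, 2, 3, 0, 1, 2),
  eligible  = c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE),
  treated   = c(0, 1, 1, 1, 0, 0, 0)
)

# One emulated trial opens at every row where the person is eligible;
# treatment status on that row is the assigned baseline strategy.
baselines <- skeleton[skeleton$eligible, c("person_id", "time", "treated")]
names(baselines) <- c("person_id", "trial", "assigned")

# Attach every same-time or later skeleton row of the same person to each trial.
panel <- merge(baselines, skeleton[, c("person_id", "time", "treated")],
               by = "person_id")
panel <- panel[panel$time >= panel$trial, ]
panel$followup <- panel$time - panel$trial
panel <- panel[order(panel$person_id, panel$trial, panel$followup), ]
rownames(panel) <- NULL
```

Person 1 enters the trial opened at time 0 as a non-initiator and the trial opened at time 1 as an initiator, so the same person-time appears in more than one emulated trial.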
Estimand
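The per-protocol estimand is the marginal contrast in the discrete-time hazard of $Y$ between the potential outcomes under sustained treatment ($\bar{a} = \bar{1}$) and sustained non-treatment ($\bar{a} = \bar{0}$):
$$
\frac{\Pr\left[Y_{m,k}^{\bar{a}=\bar{1}} = 1 \mid Y_{m,k-1}^{\bar{a}=\bar{1}} = 0\right]}{\Pr\left[Y_{m,k}^{\bar{a}=\bar{0}} = 1 \mid Y_{m,k-1}^{\bar{a}=\bar{0}} = 0\right]}
$$
This is the observational analogue of the per-protocol effect of a trial in which everyone adhered to their baseline assignment throughout follow-up (Hernán and Robins 2008; Danaei et al. 2013).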
Baseline IPW for non-random assignment
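At each trial baseline $m$, treatment is not randomly assigned, so we fit a logistic model for the propensity of initiating the assigned strategy at $m$ conditional on $(L_{i,m,0}, V_i)$:
$$
\operatorname{logit} \Pr\left[A_{i,m,0} = 1 \mid L_{i,m,0}, V_i\right] = \alpha_0 + \alpha_1^\top L_{i,m,0} + \alpha_2^\top V_i
$$
The stabilised baseline weight (Hernán and Robins 2008) is
$$
W^{A}_{i,m} = \frac{\Pr\left[A_{i,m,0} = a_{i,m,0}\right]}{\Pr\left[A_{i,m,0} = a_{i,m,0} \mid L_{i,m,0}, V_i\right]}
$$
These weights are stored on each (person, trial) baseline row by
`swereg::TTEEnrollment$s2_ipw()`. By construction $W^{A}_{i,m}$
is constant within a (person, trial) across follow-up times $k$.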
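A stabilised baseline weight of this kind can be sketched with base R's `glm()`. The simulated data, the covariates (`age`, `sex`) and the column names are assumptions for illustration only; this is not a description of what `s2_ipw()` does internally.

```r
set.seed(1)
n <- 2000
# Simulated baseline rows, one per (person, trial); all names are hypothetical.
base <- data.frame(
  age = rnorm(n, 60, 10),
  sex = rbinom(n, 1, 0.5)
)
base$assigned <- rbinom(n, 1, plogis(-1 + 0.03 * (base$age - 60) + 0.4 * base$sex))

# Denominator: propensity of initiating treatment given baseline covariates.
ps_model <- glm(assigned ~ age + sex, family = binomial(), data = base)
ps <- predict(ps_model, type = "response")

# Numerator: marginal probability of treatment, which stabilises the weight.
p_marg <- mean(base$assigned)

# Each row gets Pr[A = a] / Pr[A = a | covariates] for its own observed arm a.
base$w_baseline <- ifelse(base$assigned == 1,
                          p_marg / ps,
                          (1 - p_marg) / (1 - ps))
```

A quick diagnostic is that stabilised weights should average close to 1; a mean far from 1 usually points to a misspecified propensity model or near-violations of positivity.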
Censoring at protocol deviation
For the per-protocol contrast, we artificially censor rows at the
first $k$ at which $A_{i,m,k} \neq A_{i,m,0}$
(Danaei et al. 2013; Su et al. 2024). Define $C_{i,m,k}$
to be 1 at the row in which deviation first occurs, and 0 at all prior
rows. The censoring-event row is itself removed from the final analysis
dataset by `swereg::TTEEnrollment$s4_prepare_for_analysis()`;
only rows with $C_{i,m,k} = 0$ enter the outcome model.
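The censoring rule can be sketched in base R as follows. The toy panel and its column names are hypothetical, and rows are assumed to be sorted by follow-up time within each (person, trial); the sketch only illustrates the rule, not the internals of `s4_prepare_for_analysis()`.

```r
# Toy (person, trial, follow-up) panel; column names are hypothetical.
panel <- data.frame(
  person_id = c(1, 1, 1, 1, 2, 2, 2),
  trial     = c(0, 0, 0, 0, 0, 0, 0),
  followup  = c(0, 1, 2, 3, 0, 1, 2),
  assigned  = c(0, 0, 0, 0, 1, 1, 1),
  treated   = c(0, 0, 1, 1, 1, 1, 1)
)

grp <- interaction(panel$person_id, panel$trial, drop = TRUE)
deviated <- as.integer(panel$treated != panel$assigned)

# Running count of deviating rows within each (person, trial), in time order.
n_dev <- ave(deviated, grp, FUN = cumsum)

# censor_event is 1 only at the first deviating row; later rows are discarded.
panel$censor_event <- as.integer(deviated == 1 & n_dev == 1)
at_risk  <- panel[n_dev == 0 | panel$censor_event == 1, ]  # rows for the censoring models
analysis <- panel[n_dev == 0, ]                            # rows with censor_event == 0
```

Here person 1 deviates at follow-up time 2: that row is kept in `at_risk` as the censoring event, is excluded from `analysis`, and the row at follow-up time 3 is dropped from both.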
Stabilised IPCW for adherence
To unbias the per-protocol estimand against informative artificial censoring, each uncensored row is weighted by the inverse of its estimated probability of remaining uncensored to that visit, stabilised by the corresponding probability conditional only on baseline information (Hernán and Robins 2008; Danaei et al. 2013 Appendix; Su et al. 2024). The stabilised IPCW for individual $i$ at trial-time $k$ in arm $a$ is
$$
W^{C}_{i,m,k} = \prod_{j=0}^{k-1} \frac{\Pr\left[C_{i,m,j+1} = 0 \mid C_{i,m,j} = 0,\ A_{i,m,0} = a,\ L_{i,m,0},\ V_i\right]}{\Pr\left[C_{i,m,j+1} = 0 \mid C_{i,m,j} = 0,\ A_{i,m,0} = a,\ L_{i,m,0},\ L_{i,m,j},\ V_i\right]}
$$
This is exactly equation (3) of Su et al. (2024) and matches the stabilised-weight construction in Hernán and Robins (2008) and the Appendix of Danaei et al. (2013).
swereg::TTEEnrollment$s6_ipcw_pp() estimates the two
probabilities by two separate pooled logistic regressions, fit by
treatment arm:
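Numerator (baseline covariates only): the model whose role is to stabilise the weight, not to remove confounding (Hernán and Robins 2008; Danaei et al. 2013):
$$
\operatorname{logit} \Pr\left[C_{i,m,k+1} = 0 \mid C_{i,m,k} = 0,\ A_{i,m,0} = a,\ L_{i,m,0},\ V_i\right] = f_a^{\mathrm{num}}(k) + \gamma_a^\top L_{i,m,0} + \delta_a^\top V_i
$$
Denominator (baseline plus time-varying covariates): the model that does the causal work; correct specification is required for consistent estimation of the per-protocol effect (Hernán and Robins 2008; Su et al. 2024):
$$
\operatorname{logit} \Pr\left[C_{i,m,k+1} = 0 \mid C_{i,m,k} = 0,\ A_{i,m,0} = a,\ L_{i,m,0},\ L_{i,m,k},\ V_i\right] = f_a^{\mathrm{den}}(k) + \theta_a^\top L_{i,m,0} + \eta_a^\top L_{i,m,k} + \zeta_a^\top V_i
$$
Here $f_a^{\mathrm{num}}(k)$ and $f_a^{\mathrm{den}}(k)$
are smooth functions of follow-up time (we use a quadratic in $k$,
following Danaei et al. 2013;
`swereg::TTEEnrollment$s6_ipcw_pp()` also supports a
generalized additive model on the time index via
`estimate_ipcw_pp_with_gam = TRUE`). Both models are fit
separately for $a = 1$ and $a = 0$
so that adherence dynamics are allowed to differ by arm.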
The per-row stabilised censoring weight is the cumulative product
over follow-up times up to (but not including) the current row, exactly
as in the Danaei et al. (2013) appendix construction of
the stabilised weight $W^{C}_{i,m,k}$.
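The two arm-specific models and the lagged cumulative product can be sketched with base R's `glm()`. Everything here is an illustrative assumption: the simulated data, the column names, and the coding of the response as `stay` (1 if the person is still on the assigned strategy at the next row). It is not a description of how `s6_ipcw_pp()` is implemented.

```r
set.seed(2)
n <- 500; K <- 5
# Simulated at-risk rows for one emulated trial; every name here is hypothetical.
d <- expand.grid(followup = 0:(K - 1), person_id = seq_len(n))
d$assigned <- rep(rbinom(n, 1, 0.5), each = K)
d$age0 <- rep(rnorm(n), each = K)                          # time-fixed covariate
d$sev  <- rnorm(nrow(d))                                   # time-varying covariate
d$sev0 <- ave(d$sev, d$person_id, FUN = function(x) x[1])  # its baseline value
d$stay <- rbinom(nrow(d), 1, plogis(2 - 0.5 * d$sev + 0.3 * d$age0))
# Keep each person's rows up to and including the first row with stay == 0.
prior_dev <- ave(1 - d$stay, d$person_id, FUN = function(x) cumsum(x) - x)
d <- d[prior_dev == 0, ]

d$ratio <- NA_real_
for (a in 0:1) {
  idx <- d$assigned == a
  # Numerator: quadratic in follow-up time plus baseline covariates only.
  num <- glm(stay ~ followup + I(followup^2) + age0 + sev0,
             family = binomial(), data = d[idx, ])
  # Denominator: adds the current value of the time-varying covariate.
  den <- glm(stay ~ followup + I(followup^2) + age0 + sev0 + sev,
             family = binomial(), data = d[idx, ])
  d$ratio[idx] <- predict(num, type = "response") / predict(den, type = "response")
}

# Cumulative product over earlier rows only: row k uses the ratios of rows
# 0..k-1, so every person's baseline row has weight 1.
d$w_censor <- ave(d$ratio, d$person_id,
                  FUN = function(r) cumprod(c(1, r[-length(r)])))
```

Lagging the product by one row is what makes the weight on a row reflect the probability of having reached that row uncensored, rather than the probability of surviving beyond it.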
To prevent undue influence of extreme weights we truncate the stabilised
weights at configurable percentiles via
swereg::TTEEnrollment$s3_truncate_weights() (Danaei et
al. 2013 used a cap of 10).
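Percentile truncation can be sketched in a few lines of base R. The 1st and 99th percentile cut-offs and the simulated weights are assumptions chosen for the example; the defaults used by `s3_truncate_weights()` are not stated here.

```r
set.seed(3)
w <- exp(rnorm(1000, 0, 0.8))   # simulated stabilised weights (hypothetical)

# Truncate (winsorise) at the chosen lower and upper percentiles.
bounds  <- unname(quantile(w, probs = c(0.01, 0.99)))
w_trunc <- pmin(pmax(w, bounds[1]), bounds[2])

# An absolute cap, as in the fixed cap of 10 mentioned above, is the same idea.
w_capped <- pmin(w, 10)
```

Truncation trades a small amount of bias for lower variance, so it is worth reporting results at more than one cut-off as a sensitivity analysis.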
Final analysis weight
Each non-censoring-event row carries the analysis weight
$$
W_{i,m,k} = W^{A}_{i,m} \times W^{C}_{i,m,k}
$$
The baseline factor $W^{A}_{i,m}$ adjusts for confounding of treatment assignment, the cumulative factor $W^{C}_{i,m,k}$ adjusts for informative censoring at protocol deviation, and their product yields a pseudo-population in which the marginal per-protocol contrast is identified under the standard sequential-exchangeability, positivity, consistency, and no-interference assumptions (Hernán and Robins 2008; Su et al. 2024).
Outcome model: weighted Poisson MSM for the IRR
`swereg::TTEEnrollment$irr()` fits a weighted Poisson
marginal structural model for the rate of $Y_{i,m,k}$
as a function of the assigned baseline treatment $A_{i,m,0}$:
$$
\log E\left[Y_{i,m,k} \mid A_{i,m,0}\right] = \beta_0 + \beta_1 A_{i,m,0} + \log T_{i,m,k}
$$
where $\log T_{i,m,k}$ is the log person-time offset of the row, estimated by
`survey::svyglm(..., family = quasipoisson())`. The estimand $\exp(\beta_1)$
is the marginal per-protocol incidence rate ratio. This is operationally
equivalent, for a rare outcome and a sufficiently fine time grid, to
the pooled logistic discrete-time hazard MSM used in
TrialEmulation (Su et al. 2024, eq. 4); we use the Poisson
rate parameterisation because the swereg skeleton is already organised
as person-time intervals.
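The point estimate from a weighted Poisson rate model of this form can be sketched with base R's `glm()`. The simulated data and the column names (`treat`, `pt`, `w`, `event`) are assumptions for illustration, and `glm()` stands in for `survey::svyglm()` because only the coefficient is of interest here.

```r
set.seed(4)
n <- 4000
# Simulated analysis rows; all names are hypothetical.
d <- data.frame(
  treat = rbinom(n, 1, 0.4),
  pt    = runif(n, 0.5, 2),     # person-time contributed by the row
  w     = runif(n, 0.5, 1.5)    # final analysis weight
)
d$event <- rbinom(n, 1, pmin(1, 0.05 * d$pt * exp(log(0.7) * d$treat)))

# Weighted Poisson rate model with log person-time as the offset.
fit <- glm(event ~ treat + offset(log(pt)),
           family = quasipoisson(), weights = w, data = d)
irr <- exp(coef(fit)[["treat"]])

# With a single binary covariate, the fit reproduces the ratio of weighted
# event rates (weighted events divided by weighted person-time) in each arm.
rate <- function(a) with(d[d$treat == a, ], sum(w * event) / sum(w * pt))
irr_closed_form <- rate(1) / rate(0)
```

The closed-form check makes the interpretation concrete: the estimated rate ratio is the ratio of two weighted incidence rates in the pseudo-population. The model-based standard errors from this `glm()` fit should not be reported.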
Validation against the canonical reference implementation
We verified that the swereg implementation agrees with the canonical
reference implementation in TrialEmulation (Su et al. 2024)
on simulated data with a known true effect: point estimates, standard
errors, and confidence interval widths agree to within finite-sample
noise across sample sizes from 2,000 to 50,000 individuals, with the
only systematic difference being the well-known non-equivalence of the
rate ratio and odds ratio for events that are not extremely rare. The
validation simulations and reproducible scripts are bundled in the
package’s dev/ directory.
Inference: cluster-robust variance
Because (1) each person contributes correlated rows across follow-up
times within a trial, (2) each person can appear in multiple sequential
trials, and (3) the weights are estimated rather than known, the
model-based standard errors from glm() are
anti-conservative. Following the recommendation of Hernán and Robins
(2008), Danaei et al. (2013), and Su et al. (2024), we report
cluster-robust (Huber–White sandwich) standard errors with clustering on
the person identifier. In the swereg implementation this is computed via
`survey::svydesign(ids = ~person_id)` and propagated to the
confidence interval for $\exp(\beta_1)$
by `swereg::TTEEnrollment$irr()`.
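To make the cluster-robust variance concrete, the sandwich estimator can be written out by hand in base R. This is a sketch under assumptions: the data are simulated, the column names are hypothetical, and no finite-sample correction is applied, so the numbers may differ slightly from those returned by the survey package. The final lines show the corresponding survey calls and run only if that package is installed.

```r
set.seed(5)
n_id <- 800; reps <- 3
# Simulated analysis rows with several rows per person; names are hypothetical.
d <- data.frame(person_id = rep(seq_len(n_id), each = reps))
d$treat <- rep(rbinom(n_id, 1, 0.5), each = reps)
d$pt    <- 1
d$w     <- runif(nrow(d), 0.5, 1.5)
frailty <- rep(rgamma(n_id, shape = 2, rate = 2), each = reps)  # within-person correlation
d$event <- rbinom(nrow(d), 1, pmin(1, 0.1 * frailty * exp(log(0.7) * d$treat)))

fit <- glm(event ~ treat + offset(log(pt)),
           family = quasipoisson(), weights = w, data = d)

X  <- model.matrix(fit)
mu <- fitted(fit)
# Weighted score contributions, summed within person before forming the "meat".
scores  <- X * (d$w * (d$event - mu))
by_id   <- rowsum(scores, group = d$person_id)
bread   <- solve(crossprod(X, X * (d$w * mu)))
vcov_cl <- bread %*% crossprod(by_id) %*% bread
se_treat <- sqrt(vcov_cl[2, 2])

# Wald interval for the incidence rate ratio on the log scale.
beta <- coef(fit)[["treat"]]
ci   <- exp(beta + c(-1, 1) * qnorm(0.975) * se_treat)

# Equivalent design-based fit, if the survey package is available.
if (requireNamespace("survey", quietly = TRUE)) {
  des <- survey::svydesign(ids = ~person_id, weights = ~w, data = d)
  fit_svy <- survey::svyglm(event ~ treat + offset(log(pt)),
                            design = des, family = quasipoisson())
}
```

Summing the scores within person before taking the outer product is what allows arbitrary correlation among a person's rows, whether they come from the same emulated trial or from different ones.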
References
- Hernán MA, Robins JM. Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease. Epidemiology 2008;19(6):766–779.
- Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol 2016;183(8):758–764.
- Danaei G, García Rodríguez LA, Cantero OF, Logan R, Hernán MA. Observational data for comparative effectiveness research: an emulation of randomised trials of statins and primary prevention of coronary heart disease. Stat Methods Med Res 2013;22(1):70–96.
- Caniglia EC, et al. Emulating a sequence of target trials to avoid immortal time bias: an application in pregnancy. Am J Epidemiol 2023.
- Cashin AG, et al. Emulating a target trial — the TARGET statement. JAMA 2025.
- Su L, Rezvani R, Seaman SR, Bartlett JW. TrialEmulation: An R package to emulate target trials for time-to-event data from electronic health records. arXiv:2402.12083, 2024.