Bundles the ETT grid, skeleton file paths, and design column names into a single object using a builder pattern. Create an empty plan with [TTEPlan$new()], then add ETTs one at a time with `$add_one_ett()`. Supports `plan[[i]]` to extract the i-th enrollment spec for interactive testing.
Design parameters (confounder_vars, person_id_var, treatment_var, etc.) are stored per-ETT in the `ett` data.table, allowing different ETTs to use different confounders or design columns. Within an enrollment_id (same follow_up + age_group), design params must match.
Computed properties
- max_follow_up
(read-only) The maximum `follow_up` across all ETTs. Used by `$enrollment_spec()` to set `design$follow_up_time` so that enrollment covers the longest follow-up per enrollment group. Returns `NA` when no ETTs have been added.
Methods
- `$add_one_ett(...)`
Add one ETT row to the plan. Returns `invisible(self)`.
- `$save(dir)`
Save the plan to disk as `.qs2`. Returns `invisible(path)`.
- `$enrollment_spec(i)`
Extract the i-th enrollment spec as a list with design, age_range, etc.
- `$s1_generate_enrollments_and_ipw(...)`
Run Loop 1: skeleton files to trial panels + IPW.
- `$s2_generate_analysis_files_and_ipcw_pp(...)`
Run Loop 2: per-ETT IPCW-PP + analysis file generation.
See also
[qs2_read()] to load from disk
Other tte_classes:
TTEDesign,
TTEEnrollment
Public fields
project_prefixCharacter, string used for file naming.
ettNULL or a data.table with per-ETT columns.
skeleton_filesCharacter vector of skeleton file paths.
global_max_isoyearweekAdmin censoring boundary.
specParsed study spec (from [tteplan_read_spec()]), or NULL.
expected_skeleton_file_countExpected number of skeleton files, or NULL.
code_registrydata.table from [RegistryStudy]`$summary_table()`, or NULL.
expected_n_idsTotal number of individuals across all batches, or NULL.
created_atPOSIXct. When this plan was created.
registry_study_created_atPOSIXct or NULL. When the source RegistryStudy was created.
skeleton_created_atPOSIXct or NULL. When skeleton files were created (from first file's attribute).
period_widthInteger, band width in weeks for enrollment (default: 4L).
enrollment_countsNamed list of per-enrollment TARGET Item 8 data. Each element is a list with:
- attrition
Long-format data.table (trial_id, criterion, n_persons, n_person_trials, n_intervention, n_comparator) showing cumulative attrition at each eligibility step. Includes a
"before_exclusions"row with pre-filtering counts.- matching
data.table (trial_id, n_intervention_total, n_comparator_total, n_intervention_enrolled, n_comparator_enrolled).
output_dirCharacter. Directory where enrollment/analysis files are stored.
results_enrollmentNamed list of per-enrollment analysis results (keyed by enrollment_id).
results_ettNamed list of per-ETT analysis results (keyed by ett_id).
spec_reloaded_atPOSIXct or NULL. When `$reload_spec()` was last called to refresh cosmetic labels.
spec_reload_skipped_diffsCharacter vector of structural spec differences that `$reload_spec()` chose not to apply, or NULL.
spec_versionCharacter. Spec version tag (e.g. `"v003"`) that selects the YAML filename and the results sub-directory.
dir_tteplan_cp[CandidatePath] for the directory where `tteplan.qs2` and its companion enrollment/analysis files live.
dir_spec_cp[CandidatePath] for the directory containing the spec YAML (`spec_vXXX.yaml`).
dir_results_cp[CandidatePath] for the results base directory. `dir_results` (active binding) appends `spec_version` to this.
registrystudyEmbedded [RegistryStudy] R6 object. Owns the rawbatch and skeleton directory candidates; accessed via `plan$data_skeleton` and `plan$data_rawbatch`.
n_skeleton_files_limitOptional integer. When non-NULL, `tteplan_load()` caps `self$skeleton_files` to this many entries after refreshing them from `self$registrystudy`. Used for dev configs that only want a subset of skeletons.
Active bindings
max_follow_up(read-only) Maximum follow_up across all ETTs.
dir_tteplan(read-only) Directory where `tteplan.qs2` is saved, resolved from `self$dir_tteplan_cp` on the current host.
dir_spec(read-only) Directory containing the spec YAML, resolved from `self$dir_spec_cp`.
dir_results_base(read-only) Results base directory, resolved from `self$dir_results_cp`. `dir_results` appends `spec_version`.
dir_results(read-only) Results directory with version suffix: `file.path(self$dir_results_base, self$spec_version)`.
tteplan(read-only) Full path to `tteplan.qs2`.
spec_path(read-only) Full path to the spec YAML (`spec_vXXX.yaml`) selected by `self$spec_version`.
spec_xlsx(read-only) Full path to `spec_<version>.xlsx` inside `self$dir_results`, where `<version>` is `self$spec_version`.
tables_xlsx(read-only) Full path to `tables.xlsx` inside `self$dir_results`.
data_skeleton(read-only) Delegates to `self$registrystudy$data_skeleton_dir`.
data_rawbatch(read-only) Delegates to `self$registrystudy$data_rawbatch_dir`.
Methods
TTEPlan$new()
Create a new TTEPlan object.
Usage
TTEPlan$new(project_prefix, skeleton_files, global_max_isoyearweek, ett = NULL)TTEPlan$check_version()
Check if this object's schema version matches the current class version. Errors if the object was saved with an older schema.
TTEPlan$print_spec_summary()
Print a target trial specification summary. Console-friendly summary derived from the study specification stored on this plan. When `$code_registry` is available, variable names are shown in red and matched code details in blue (ANSI colors).
TTEPlan$print_target_checklist()
Print a TARGET-aligned reporting checklist.
Generates a self-contained document following the TARGET Statement (Cashin et al., JAMA 2025) 21-item checklist for transparent reporting of target trial emulations. Each item includes the full TARGET description, auto-filled content from the swereg spec where available, and `[FILL IN]` placeholders for PI completion.
TTEPlan$add_one_ett()
Add one ETT to the plan.
An ETT (Emulated Target Trial) is one outcome x follow_up x age_group combination. ETTs sharing an enrollment_id use the same trial panels (same matching, same age group, same confounders). They differ only in outcome and/or follow-up duration. This avoids redundant re-enrollment for each outcome/follow-up combo.
Usage
TTEPlan$add_one_ett(
enrollment_id,
outcome_var,
outcome_name,
follow_up,
confounder_vars,
time_treatment_var,
eligible_var,
argset = list()
)Arguments
enrollment_idCharacter, enrollment group identifier (e.g., "01").
outcome_varCharacter, name of the outcome column.
outcome_nameCharacter, short human-readable outcome label (used in forest plot rows and Table S10).
follow_upInteger, follow-up duration in weeks.
confounder_varsCharacter vector of confounder column names.
time_treatment_varCharacter or NULL, time-varying treatment column.
eligible_varCharacter or NULL, eligibility column.
argsetNamed list with age_group, age_min, age_max (and optional person_id_var, outcome_description).
TTEPlan$save()
Save the plan to disk as `tteplan.qs2`.
Writes to `self$tteplan` by default – that is, `tteplan.qs2` inside the directory resolved from `self$dir_tteplan_cp`. Supply `dir` to override the destination (deprecated; used only by in-flight scripts that don't yet have a `dir_tteplan_cp`).
Captures the destination path FIRST, then invalidates every [CandidatePath] on the plan (and on its embedded [RegistryStudy]) so the on-disk file never carries the saving host's resolved paths. Reload with [tteplan_load()].
TTEPlan$enrollment_spec()
Extract enrollment spec for the i-th enrollment_id group.
Returns
A list with:
- design
A [TTEDesign] object with column mappings
- enrollment_id
Character, the enrollment group ID
- age_range
Numeric vector of length 2: c(min, max)
- n_threads
Integer, number of data.table threads to use
- treatment_impl
List with variable, intervention_value, comparator_value (present when plan was built from a spec)
- matching_ratio
Numeric, e.g. 2 for 1:2 matching (present when plan was built from a spec)
- seed
Integer for reproducible matching (present when plan was built from a spec)
TTEPlan$s1_generate_enrollments_and_ipw()
Loop 1: Create trial panels from skeleton files and compute IPW.
Uses a two-pass pipeline to fix cross-batch matching ratio imbalance. Requires `self$spec` to be set (e.g., via [tteplan_from_spec_and_registrystudy()]).
**Pass 1a (scout)**: Lightweight parallel pass that reads each skeleton file, applies exclusions and treatment, and returns eligible `(person_id, trial_id, intervention)` tuples. No confounders or enrollment.
**Centralized matching**: Combines all tuples from all batches, then per `trial_id` keeps all intervention and samples `ratio * n_intervention` comparator globally. Stores counts on `self$enrollment_counts` for TARGET Item 8 reporting.
**Pass 1b (full enrollment)**: Parallel pass that re-reads each skeleton file with full processing (exclusions + confounders + treatment), then enrolls using pre-matched IDs (skipping per-batch matching). Produces panel-expanded TTEEnrollment objects.
Usage
TTEPlan$s1_generate_enrollments_and_ipw(
output_dir = NULL,
impute_fn = tteenrollment_impute_confounders,
stabilize = TRUE,
n_workers = 3L,
swereg_dev_path = NULL,
resume = FALSE
)Arguments
output_dirOptional directory override for output files. If `NULL` (default), uses `self$dir_tteplan`.
impute_fnImputation callback or NULL (default: [tteenrollment_impute_confounders]).
stabilizeLogical, stabilize IPW (default: TRUE).
n_workersInteger, concurrent subprocesses (default: 3L).
swereg_dev_pathPath to local swereg dev copy, or NULL.
resumeLogical. If `TRUE`, skip enrollments whose `_imp_` file already exists in `output_dir` (default: FALSE).
TTEPlan$s2_generate_analysis_files_and_ipcw_pp()
Loop 2: Per-ETT IPCW-PP calculation and analysis file generation. For each ETT, loads the imputed enrollment file, calls `$s4_prepare_for_analysis()` (outcome + IPCW-PP + weight combination + truncation), and saves the analysis-ready file.
Usage
TTEPlan$s2_generate_analysis_files_and_ipcw_pp(
output_dir = NULL,
estimate_ipcw_pp_separately_by_treatment = TRUE,
estimate_ipcw_pp_with_gam = TRUE,
n_workers = 1L,
swereg_dev_path = NULL,
resume = FALSE
)Arguments
output_dirOptional directory override containing imp files and where analysis files are saved. If `NULL` (default), uses `self$dir_tteplan`.
estimate_ipcw_pp_separately_by_treatmentLogical, estimate IPCW-PP separately by treatment group (default: TRUE).
estimate_ipcw_pp_with_gamLogical, use GAM for IPCW-PP estimation (default: TRUE).
n_workersInteger, concurrent subprocesses (default: 1L).
swereg_dev_pathPath to local swereg dev copy, or NULL.
resumeLogical. If `TRUE`, skip ETTs whose analysis file already exists in `output_dir` (default: FALSE).
TTEPlan$s3_analyze()
Loop 3: Compute all analysis results and store on the plan.
For each enrollment: loads one analysis file and the raw file, computes baseline characteristics (raw, unweighted, IPW, IPW truncated). For each ETT: loads the analysis file, computes rates, IRR, and heterogeneity test with both truncated and untruncated weights.
Results are stored in `self$results_enrollment` and `self$results_ett`. Existing results are skipped (resume-safe). Use `plan$save()` to persist.
Usage
TTEPlan$s3_analyze(
enrollment_ids = NULL,
ett_ids = NULL,
output_dir = NULL,
swereg_dev_path = NULL,
force = FALSE,
n_workers = 1L
)Arguments
enrollment_idsCharacter vector of enrollment IDs to analyze, or `NULL` (default) for all.
ett_idsCharacter vector of ETT IDs to analyze, or `NULL` (default) for all.
output_dirOptional directory override. If `NULL` (default), uses `self$dir_tteplan` (falls back to the legacy `self$output_dir` for plans created before the CandidatePath migration).
swereg_dev_pathPath to local swereg dev copy, or NULL.
forceLogical (default `FALSE`). When `TRUE`, drops cached results in the targeted scope before recomputing. Scope follows `enrollment_ids` and `ett_ids`: if both are `NULL`, all cached results are cleared; otherwise only the matching entries are dropped (untargeted enrollments / ETTs stay cached). Useful when prior results were produced under a broken environment (e.g. missing `survey` package -> 135 silent `skipped = TRUE` entries) and you want to recompute without manually mutating `self$results_ett` / `self$results_enrollment`.
n_workersInteger >= 1 (default `1L`). Number of concurrent worker subprocesses for both the enrollment loop and the per-ETT loop. Each worker reads its own analysis file fresh, so peak RAM scales linearly with `n_workers`; on machines with multi-GB analysis files, set this conservatively. CPU threads per worker are auto-partitioned as `floor(detectCores() / n_workers)`.
TTEPlan$results_summary()
Print a diagnostic summary of stored results.
Shows one row per ETT with enrollment, event count, and whether IRR/rates computed successfully.
TTEPlan$excel_spec_summary()
Export the study specification to a standalone Excel file.
Writes a formatted summary of the spec (design, criteria, confounders, outcomes, enrollments) with ICD-10/ATC code annotations from the code registry. No analysis results required.
TTEPlan$reload_spec()
Refresh cosmetic spec fields (enrollment names, treatment arm labels, outcome names, ETT descriptions) on a cached plan without re-running the upstream pipeline.
Structural fields (confounders, exclusion criteria, follow-up windows, matching parameters, etc.) are *not* applied - they would invalidate the cached results. The differences are surfaced via a loud warning and recorded in `self$spec_reload_skipped_diffs`.
TTEPlan$recompute_baselines()
Recompute baseline characteristic tables in-process.
Reads each enrollment's smallest analysis file (and the raw file when present) from disk and re-runs the new `swereg_table1` engine. Used to refresh stale results after upgrading swereg, without re-running the full `$s3_analyze()` pipeline.
TTEPlan$export_tables()
Export analysis results to an Excel workbook.
Requires `self$results_enrollment` and `self$results_ett` to be populated (run `$s3_analyze()` first).
If the cached baseline tables were produced by an older version of `swereg` (when Table 1 was a `tableone` object), they are automatically refreshed in-process via `$recompute_baselines()` using the analysis files in `output_dir`.
Usage
TTEPlan$export_tables(
path = NULL,
table1_enrollment = NULL,
featured_etts = NULL,
output_dir = NULL,
forest_label_format = NULL,
forest_desc_header = NULL
)Arguments
pathFile path for the output `.xlsx` file.
table1_enrollmentEnrollment ID for Table 1 (main baseline table). Default: the enrollment with the most baseline observations.
featured_ettsOptional, either a flat character vector of ETT ids to feature in the Forest plot, or a **named list** of such vectors. When supplied as a named list, each name becomes a bold group header in the Forest plot (e.g. one group per treatment contrast). Order follows the list (and the vectors inside). When `NULL` (default), all ETTs are shown in the Forest plot with no grouping.
output_dirOptional directory holding the cached `.qs2` files. Used by the lazy `recompute_baselines()` refresh. Defaults to `self$output_dir`.
forest_label_formatOptional character(1) format string for the Forest plot row description. Supports `placeholder` tokens: `outcome_name`, `outcome_description`, `enrollment_name`, `enrollment_id`, `intervention_name`, `comparator_name`, `follow_up`, `ett_id`. When `NULL` (default), uses `"outcome_name (follow_upw)"` for grouped featured ETTs and `"enrollment_name - outcome_name (follow_upw)"` otherwise.
forest_desc_headerOptional character(1) header label for the description column of the Forest plot left text panel. Defaults to `"ETT"`.
Examples
if (FALSE) { # \dontrun{
plan <- TTEPlan$new(
project_prefix = "project002",
skeleton_files = skeleton_files,
global_max_isoyearweek = "2023-52"
)
plan$add_one_ett(
outcome_var = "death",
outcome_name = "Death",
follow_up = 52,
confounder_vars = c("age", "education"),
time_treatment_var = "rd_intervention",
eligible_var = "eligible",
argset = list(age_group = "50_60", age_min = 50, age_max = 60)
)
# Extract first enrollment spec for interactive testing
enrollment_spec <- plan[[1]]
enrollment_spec$design
enrollment_spec$age_range
} # }
