
Searches for specific ICD diagnosis codes in Swedish hospital registry data and adds corresponding boolean variables to the skeleton. Can search the main diagnosis only, or the main diagnosis together with secondary diagnoses, external cause codes, and historical ICD-7/ICD-9 columns.

Usage

add_diagnoses(
  skeleton,
  dataset,
  id_name,
  diag_type = "both",
  codes = list(
    icd10_F64_0 = c("F640"),
    icd10_F64_89 = c("F6489"),
    icd10_F64_089 = c("F640", "F648", "F649")
  ),
  diags = NULL
)

Arguments

skeleton

A data.table containing the main skeleton structure created by create_skeleton

dataset

A data.table containing hospital registry data with diagnosis codes. Must have columns for person ID, admission date (indatum), and at least one diagnosis code column. Expected diagnosis columns after make_lowercase_names(): hdia (main), dia1/dia2/etc (secondary), ekod1/ekod2/etc (external causes), icd7*, icd9*

id_name

Character string specifying the name of the ID variable in the dataset

diag_type

Character string specifying which diagnosis types to search:

  • "both" (default) - Search in main (hdia), secondary (dia*), external cause (ekod*), and historical ICD version columns (icd7*, icd9*)

  • "main" - Search only in main diagnosis column (hdia)

codes

Named list of ICD code patterns. Names become column names in the skeleton; values are character vectors of code prefixes.

Matching is **prefix-only** via startsWith(). A pattern like "F32" matches "F32", "F320", "F321", etc. This is not regex – characters such as ^, $, *, [A-Z] are taken literally and will not match anything.
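The prefix behaviour can be illustrated with base R alone. This is a minimal sketch of what startsWith() does with such patterns, not the package's internal code; the vector observed and the helper starts_with_any are hypothetical names introduced for the example.

```r
# Hypothetical vector of observed diagnosis codes
observed <- c("F32", "F320", "F321", "F33", "G40")

# Prefix match: TRUE wherever the code begins with "F32"
f32_hit <- startsWith(observed, "F32")

# Regex metacharacters are taken literally, so "^F32" matches nothing
regex_hit <- startsWith(observed, "^F32")

# Hypothetical helper: TRUE if a code starts with any of several prefixes
starts_with_any <- function(x, prefixes) {
  Reduce(`|`, lapply(prefixes, function(p) startsWith(x, p)))
}
depression_hit <- starts_with_any(observed, c("F32", "F33"))
```

Here f32_hit is TRUE for the first three codes only, regex_hit is FALSE everywhere, and depression_hit is TRUE for every code except "G40".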

Prefixing a pattern with "!" turns it into a *row-level veto*: any source row in which a code matches the (un-prefixed) pattern is masked out and contributes nothing to that code name. The veto is evaluated per source row across all scanned columns (hdia + dia* + ekod* + ...) and is reset between code names. Important: the veto operates on the raw source row, not on the (id, isoyearweek) bucket. If a person has a vetoed row and a separate, non-vetoed matching row in the same week, the non-vetoed row still triggers TRUE for that week.

Examples:

  • list(depression = c("F32", "F33")) – any depression code.

  • list(f64_minus_640 = c("F64", "!F640")) – any F64* code, except in source rows where a code matches F640.
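The row-level veto described above can be sketched in base R. This is an illustration under assumptions, not the package's implementation: the toy data frame visits, its week column, and the helper row_has_prefix are all hypothetical, and only two diagnosis columns are scanned.

```r
# Hypothetical toy data: one row per hospital visit
visits <- data.frame(
  id   = c(1, 1, 2, 3),
  week = c("2020-01", "2020-01", "2020-01", "2020-02"),
  hdia = c("F640", "F648", "F640", "F649"),
  dia1 = c("F649", NA, NA, "F32"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: does any scanned column in a row start with any prefix?
row_has_prefix <- function(df, cols, prefixes) {
  hit <- rep(FALSE, nrow(df))
  for (col in cols) {
    for (p in prefixes) {
      hit <- hit | (!is.na(df[[col]]) & startsWith(df[[col]], p))
    }
  }
  hit
}

# Split the pattern vector into include prefixes and "!" veto prefixes
patterns <- c("F64", "!F640")
is_veto  <- startsWith(patterns, "!")
include  <- patterns[!is_veto]
veto     <- substring(patterns[is_veto], 2)

# A row counts only if it matches an include prefix and no veto prefix
cols    <- c("hdia", "dia1")
row_hit <- row_has_prefix(visits, cols, include) &
  !row_has_prefix(visits, cols, veto)

# Collapse rows to one flag per (id, week): TRUE if any surviving row matched
flags <- aggregate(row_hit, by = list(id = visits$id, week = visits$week), FUN = any)
```

In this toy data, person 1 has one vetoed row (F640 alongside F649) and one separate F648 row in the same week, so the week is still flagged. Person 2 has only a vetoed row and is not flagged.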

diags

Deprecated. Use codes instead.

Value

The skeleton data.table is modified by reference. One boolean column is added per named entry in codes; a value is TRUE when a matching diagnosis is present for that skeleton row.

Details

The function searches across different diagnosis code column types based on the diag_type parameter:

  • When diag_type = "both": Searches in hdia (main diagnosis), dia1, dia2, ... (secondary diagnoses), ekod1, ekod2, ... (external cause codes), icd7* (ICD-7 codes), and icd9* (ICD-9 codes)

  • When diag_type = "main": Searches only in hdia (main diagnosis)
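The column selection described in the list above could look roughly like the following. This is a sketch assuming columns are picked by name pattern; select_diag_cols and the example column names are hypothetical and are not part of swereg.

```r
# Hypothetical helper: choose which columns to scan for a given diag_type
select_diag_cols <- function(col_names, diag_type = c("both", "main")) {
  diag_type <- match.arg(diag_type)
  if (diag_type == "main") {
    return(intersect("hdia", col_names))
  }
  # hdia, dia1/dia2/..., ekod1/ekod2/..., plus anything starting with icd7 or icd9
  grep("^(hdia$|dia[0-9]+$|ekod[0-9]+$|icd7|icd9)", col_names, value = TRUE)
}

# Hypothetical column names after make_lowercase_names()
col_names <- c("lopnr", "indatum", "hdia", "dia1", "dia2", "ekod1", "icd9_1", "source")

main_cols <- select_diag_cols(col_names, "main")
both_cols <- select_diag_cols(col_names, "both")
```

With these names, main_cols contains only "hdia", while both_cols contains "hdia", "dia1", "dia2", "ekod1", and "icd9_1".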

## Filtering by source (inpatient/outpatient/cancer)

The diagnosis dataset must contain a source column with valid values ("inpatient", "outpatient", or "cancer"). To track diagnoses separately by source, filter the dataset before calling this function:


# Inpatient diagnoses only
inpatient_data <- diagnoses[source == "inpatient"]
add_diagnoses(skeleton, inpatient_data, "lopnr", codes = list(
  "depression_inpatient" = c("F32", "F33")
))

# Outpatient diagnoses only
outpatient_data <- diagnoses[source == "outpatient"]
add_diagnoses(skeleton, outpatient_data, "lopnr", codes = list(
  "depression_outpatient" = c("F32", "F33")
))

# Combined (any source) - default behavior
add_diagnoses(skeleton, diagnoses, "lopnr", codes = list(
  "depression_any" = c("F32", "F33")
))
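
Before filtering, it can help to confirm that the source column exists and holds only the expected values. The check below is a minimal base R sketch; check_source is a hypothetical helper, and whether add_diagnoses performs an equivalent check itself is not stated here.

```r
# Values the documentation lists as valid for the source column
valid_sources <- c("inpatient", "outpatient", "cancer")

# Hypothetical helper: stop early if source is missing or has unexpected values
check_source <- function(dataset) {
  if (!"source" %in% names(dataset)) {
    stop("dataset has no 'source' column")
  }
  bad <- setdiff(unique(dataset$source), valid_sources)
  if (length(bad) > 0) {
    stop("unexpected source values: ", paste(bad, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy input with valid values only; passes silently
ok <- check_source(data.frame(source = c("inpatient", "cancer")))
```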

See also

create_skeleton for creating the skeleton structure, add_operations for surgical procedures, add_rx for prescription data, make_lowercase_names for data preprocessing

Other data_integration: add_annual(), add_cods(), add_icdo3s(), add_onetime(), add_operations(), add_quality_registry(), add_rx(), add_snomed3s(), add_snomedo10s()

Examples

# Load fake data
data("fake_person_ids", package = "swereg")
data("fake_diagnoses", package = "swereg")
swereg::make_lowercase_names(fake_diagnoses, date_columns = "indatum")
#> Found additional date columns not in date_columns: utdatum. Consider adding them for automatic date parsing.

# Create skeleton
skeleton <- create_skeleton(fake_person_ids[1:10], "2020-01-01", "2020-12-31")

# Add diagnoses
diag_patterns <- list(
  "depression" = c("F32", "F33"),
  "anxiety" = c("F40", "F41")
)
add_diagnoses(skeleton, fake_diagnoses, "lopnr", diag_type = "both", codes = diag_patterns)