Skip to contents

Searches for specific ICD diagnosis codes in Swedish hospital registry data and adds corresponding boolean variables to the skeleton. Can search in main diagnoses only or both main and secondary diagnoses.

Usage

add_diagnoses(
  skeleton,
  dataset,
  id_name,
  diag_type = "both",
  codes = list(icd10_F64_0 = c("F640"), icd10_F64_89 = c("F6489"), icd10_F64_089 =
    c("F640", "F648", "F649")),
  diags = NULL
)

Arguments

skeleton

A data.table containing the main skeleton structure created by create_skeleton

dataset

A data.table containing hospital registry data with diagnosis codes. Must have columns for person ID, admission date (indatum), and at least one diagnosis code column. Expected diagnosis columns after make_lowercase_names(): hdia (main), dia1/dia2/etc (secondary), ekod1/ekod2/etc (external causes), icd7*, icd9*

id_name

Character string specifying the name of the ID variable in the dataset

diag_type

Character string specifying which diagnosis types to search:

  • "both" (default) - Search in main (hdia), secondary (dia*), external cause (ekod*), and historical ICD version columns (icd7*, icd9*)

  • "main" - Search only in main diagnosis column (hdia)

codes

Named list of ICD code patterns to search for. Names become variable names in skeleton. Patterns should NOT include "^" prefix (automatically added). Use exclusions with "!" prefix. Example: list("depression" = c("F32", "F33"), "anxiety" = c("F40", "F41"))

diags

Deprecated. Use codes instead.

Value

The skeleton data.table is modified by reference with diagnosis variables added. New boolean variables are created for each diagnosis pattern, TRUE when diagnosis is present.

Details

The function searches across different diagnosis code column types based on the diag_type parameter:

  • When diag_type = "both": Searches in hdia (main diagnosis), dia1, dia2, ... (secondary diagnoses), ekod1, ekod2, ... (external cause codes), icd7* (ICD-7 codes), and icd9* (ICD-9 codes)

  • When diag_type = "main": Searches only in hdia (main diagnosis)

## Filtering by source (inpatient/outpatient/cancer)

The diagnosis dataset must contain a source column with valid values ("inpatient", "outpatient", or "cancer"). To track diagnoses separately by source, filter the dataset before calling this function:


# Inpatient diagnoses only
inpatient_data <- diagnoses[source == "inpatient"]
add_diagnoses(skeleton, inpatient_data, "lopnr", diags = list(
  "depression_inpatient" = c("F32", "F33")
))

# Outpatient diagnoses only
outpatient_data <- diagnoses[source == "outpatient"]
add_diagnoses(skeleton, outpatient_data, "lopnr", diags = list(
  "depression_outpatient" = c("F32", "F33")
))

# Combined (any source) - default behavior
add_diagnoses(skeleton, diagnoses, "lopnr", diags = list(
  "depression_any" = c("F32", "F33")
))

See also

create_skeleton for creating the skeleton structure, add_operations for surgical procedures, add_rx for prescription data, make_lowercase_names for data preprocessing

Other data_integration: add_annual(), add_cods(), add_icdo3s(), add_onetime(), add_operations(), add_rx(), add_snomed3s(), add_snomedo10s()

Examples

# Load fake data
data("fake_person_ids", package = "swereg")
data("fake_diagnoses", package = "swereg")
swereg::make_lowercase_names(fake_diagnoses, date_columns = "indatum")
#> Found additional date columns not in date_columns: utdatum. Consider adding them for automatic date parsing.

# Create skeleton
skeleton <- create_skeleton(fake_person_ids[1:10], "2020-01-01", "2020-12-31")

# Add diagnoses
diag_patterns <- list(
  "depression" = c("F32", "F33"),
  "anxiety" = c("F40", "F41")
)
add_diagnoses(skeleton, fake_diagnoses, "lopnr", "both", diag_patterns)