Searches for specific ICD diagnosis codes in Swedish hospital registry data and adds corresponding boolean variables to the skeleton. Can search in main diagnoses only or both main and secondary diagnoses.
Arguments
- skeleton
A data.table containing the main skeleton structure created by create_skeleton.
- dataset
A data.table containing hospital registry data with diagnosis codes. Must have columns for person ID, admission date (indatum), and at least one diagnosis code column. Expected diagnosis columns after make_lowercase_names(): hdia (main), dia1/dia2/etc. (secondary), ekod1/ekod2/etc. (external causes), icd7*, icd9*.
- id_name
Character string specifying the name of the ID variable in the dataset
- diag_type
Character string specifying which diagnosis types to search:
"both" (default): search in main (hdia), secondary (dia*), external cause (ekod*), and historical ICD version columns (icd7*, icd9*).
"main": search only in the main diagnosis column (hdia).
- codes
Named list of ICD code patterns to search for. Names become variable names in the skeleton. Patterns should NOT include the "^" prefix (it is added automatically). Use the "!" prefix for exclusions. Example:
list("depression" = c("F32", "F33"), "anxiety" = c("F40", "F41"))
- diags
Deprecated. Use codes instead.
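The matching rules described for codes (implicit "^" anchoring and "!" exclusions) can be illustrated with a small base R sketch. The helper match_icd() is hypothetical and not part of swereg. It also assumes that an exclusion removes any code starting with the excluded prefix, which this page does not spell out.

```r
# Hypothetical helper, NOT the swereg implementation: prefix matching
# with "!" exclusions, under the assumptions stated above.
match_icd <- function(codes, patterns) {
  is_exclusion <- startsWith(patterns, "!")
  include <- patterns[!is_exclusion]
  exclude <- substring(patterns[is_exclusion], 2)  # drop the leading "!"

  # Anchor every prefix at the start of the code, as a "^" prefix would.
  starts_with_any <- function(x, prefixes) {
    if (length(prefixes) == 0) return(rep(FALSE, length(x)))
    grepl(paste0("^(", paste(prefixes, collapse = "|"), ")"), x)
  }

  starts_with_any(codes, include) & !starts_with_any(codes, exclude)
}

observed <- c("F320", "F331", "F410", "XF32", NA)

# "XF32" contains F32 but does not start with it; NA never matches.
match_icd(observed, c("F32", "F33"))

# Same patterns, but codes starting with F331 are excluded.
match_icd(observed, c("F32", "F33", "!F331"))
```

Because matching is anchored at the start of the code, a pattern such as "F32" covers every subcode beneath it (F320, F321, and so on) without listing each one.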
Value
The skeleton data.table is modified by reference with diagnosis variables added. New boolean variables are created for each diagnosis pattern, TRUE when diagnosis is present.
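Modification by reference means the caller does not have to assign the return value back to skeleton. The following is a minimal sketch of that data.table behavior, using a hypothetical add_flag() function and a toy table rather than add_diagnoses() itself.

```r
library(data.table)

# Hypothetical function: adds a boolean column in place via `:=`.
add_flag <- function(dt) {
  dt[, depression := FALSE]  # `:=` updates the caller's table by reference
  invisible(dt)
}

skeleton_demo <- data.table(id = 1:3)
add_flag(skeleton_demo)      # no assignment needed

# The new column is visible on the original object.
names(skeleton_demo)
```

If an untouched copy of the skeleton is needed, data.table::copy() can be called before adding variables.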
Details
The function searches different diagnosis code columns depending on the diag_type parameter:
- When diag_type = "both": searches in hdia (main diagnosis), dia1, dia2, ... (secondary diagnoses), ekod1, ekod2, ... (external cause codes), icd7* (ICD-7 codes), and icd9* (ICD-9 codes).
- When diag_type = "main": searches only in hdia (main diagnosis).
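To see which columns each setting would cover, the selection rule can be sketched in base R. The helper select_diag_columns() and the example column names (such as icd9_1) are hypothetical. The regular expression is one plausible reading of the dia*, ekod*, icd7*, and icd9* patterns, not the package's actual logic.

```r
# Hypothetical helper, NOT part of swereg: lists the columns that would
# be searched for a given diag_type, under the assumptions stated above.
select_diag_columns <- function(column_names, diag_type = c("both", "main")) {
  diag_type <- match.arg(diag_type)
  if (diag_type == "main") {
    return(intersect(column_names, "hdia"))
  }
  # hdia, dia<n>, ekod<n>, and anything starting with icd7 or icd9
  grep("^(hdia$|dia[0-9]+$|ekod[0-9]+$|icd7|icd9)", column_names, value = TRUE)
}

cols <- c("lopnr", "indatum", "hdia", "dia1", "dia2", "ekod1", "icd9_1", "source")

select_diag_columns(cols, "main")
select_diag_columns(cols, "both")
```

Note that non-diagnosis columns such as indatum are not selected, because the dia pattern is anchored at the start of the column name.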
## Filtering by source (inpatient/outpatient/cancer)
The diagnosis dataset must contain a source column with valid values
("inpatient", "outpatient", or "cancer"). To track diagnoses separately by source,
filter the dataset before calling this function:
# Inpatient diagnoses only
inpatient_data <- diagnoses[source == "inpatient"]
add_diagnoses(skeleton, inpatient_data, "lopnr", codes = list(
  "depression_inpatient" = c("F32", "F33")
))
# Outpatient diagnoses only
outpatient_data <- diagnoses[source == "outpatient"]
add_diagnoses(skeleton, outpatient_data, "lopnr", codes = list(
  "depression_outpatient" = c("F32", "F33")
))
# Combined (any source) - default behavior
add_diagnoses(skeleton, diagnoses, "lopnr", codes = list(
  "depression_any" = c("F32", "F33")
))
See also
create_skeleton for creating the skeleton structure,
add_operations for surgical procedures,
add_rx for prescription data,
make_lowercase_names for data preprocessing
Other data_integration:
add_annual(),
add_cods(),
add_icdo3s(),
add_onetime(),
add_operations(),
add_rx(),
add_snomed3s(),
add_snomedo10s()
Examples
# Load fake data
data("fake_person_ids", package = "swereg")
data("fake_diagnoses", package = "swereg")
swereg::make_lowercase_names(fake_diagnoses, date_columns = "indatum")
#> Found additional date columns not in date_columns: utdatum. Consider adding them for automatic date parsing.
# Create skeleton
skeleton <- create_skeleton(fake_person_ids[1:10], "2020-01-01", "2020-12-31")
# Add diagnoses
diag_patterns <- list(
  "depression" = c("F32", "F33"),
  "anxiety" = c("F40", "F41")
)
add_diagnoses(skeleton, fake_diagnoses, "lopnr", "both", diag_patterns)
