Harmonizing HemOnc concepts entails taking a combination of Regimens and Components, reducing the combination to the Component level, and finding the Regimens, or the Regimen and Component combinations, that map to the reduced list.

For example, the Regimen FOLFOX along with Irinotecan would harmonize to FOLFIRINOX.

FOLFOX <- get_concept(concept_id = 35806596)
Irinotecan <- get_concept(concept_id = 35803130)

To check for this, the FOLFOX plus Irinotecan combination is first reduced to Components.

new_components <- 
  ho_reduce_to_components(FOLFOX,
                        Irinotecan)
new_components
#>   concept_id concept_name domain_id vocabulary_id concept_class_id
#> 1   35803077 Fluorouracil      Drug        HemOnc        Component
#> 2   35803081 Folinic acid      Drug        HemOnc        Component
#> 3   35803227  Oxaliplatin      Drug        HemOnc        Component
#> 4   35803130   Irinotecan      Drug        HemOnc        Component
#>   standard_concept concept_code valid_start_date valid_end_date invalid_reason
#> 1             <NA>          225       2019-05-27     2099-12-31           <NA>
#> 2             <NA>          229       2019-05-27     2099-12-31           <NA>
#> 3             <NA>          377       2019-05-27     2099-12-31           <NA>
#> 4             <NA>          280       2019-05-27     2099-12-31           <NA>
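
The reduction step can be pictured as expanding each Regimen into its Components and passing Components through unchanged. The following is a minimal base R sketch of that idea, not the package's implementation: the toy table and the helper toy_reduce_to_components() are hypothetical, and the FOLFOX Component ids are the ones shown in the output above.

```r
# Toy Regimen-to-Component table (illustrative only; the real relationships
# come from the HemOnc vocabulary, not from this data frame).
toy_regimen_components <- data.frame(
  regimen_concept_id   = c(35806596, 35806596, 35806596),  # FOLFOX
  component_concept_id = c(35803077, 35803081, 35803227)
)

# Hypothetical helper: replace each Regimen id with its Component ids and
# keep ids that are not Regimens in the toy table as they are.
toy_reduce_to_components <- function(concept_ids, map = toy_regimen_components) {
  reduced <- lapply(concept_ids, function(id) {
    hits <- map$component_concept_id[map$regimen_concept_id == id]
    if (length(hits) > 0) hits else id
  })
  unique(unlist(reduced))
}

# FOLFOX (a Regimen) plus Irinotecan (already a Component)
toy_component_ids <- toy_reduce_to_components(c(35806596, 35803130))
toy_component_ids
```

Under these assumptions the result is the same four Component ids as in new_components, in the same order.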

The Components can then be submitted to look up potential Regimen matches.

new_regimen <- 
  ho_lookup_regimen(new_components$concept_id)
new_regimen
#>    regimen_concept_id                                  regimen_concept_name
#> 1            35804771                                            FOLFIRINOX
#> 2            35804771                                            FOLFIRINOX
#> 3            35804771                                            FOLFIRINOX
#> 4            35804771                                            FOLFIRINOX
#> 5            35806584                                           mFOLFIRINOX
#> 6            35806584                                           mFOLFIRINOX
#> 7            35806584                                           mFOLFIRINOX
#> 8            35806584                                           mFOLFIRINOX
#> 9            35807526 FOLFIRINOX/modified FOLFIRINOX plus /- Chemoradiation
#> 10           35807526 FOLFIRINOX/modified FOLFIRINOX plus /- Chemoradiation
#> 11           35807526 FOLFIRINOX/modified FOLFIRINOX plus /- Chemoradiation
#> 12           35807526 FOLFIRINOX/modified FOLFIRINOX plus /- Chemoradiation
#>    has_antineoplastic_concept_id has_antineoplastic_concept_name
#> 1                       35803077                    Fluorouracil
#> 2                       35803081                    Folinic acid
#> 3                       35803130                      Irinotecan
#> 4                       35803227                     Oxaliplatin
#> 5                       35803077                    Fluorouracil
#> 6                       35803081                    Folinic acid
#> 7                       35803130                      Irinotecan
#> 8                       35803227                     Oxaliplatin
#> 9                       35803077                    Fluorouracil
#> 10                      35803081                    Folinic acid
#> 11                      35803130                      Irinotecan
#> 12                      35803227                     Oxaliplatin
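
One way to read this result is that a Regimen matches when its full set of Components equals the submitted set. The sketch below illustrates that reading in base R. It is an assumption about the matching rule rather than a description of ho_lookup_regimen() itself, and the toy table and toy_lookup_regimen() are hypothetical.

```r
# Toy Regimen definitions in long format (illustrative only).
toy_regimens <- data.frame(
  regimen_concept_id = c(rep(35804771, 4), rep(35806596, 3)),
  regimen_concept_name = c(rep("FOLFIRINOX", 4), rep("FOLFOX", 3)),
  has_antineoplastic_concept_id = c(35803077, 35803081, 35803130, 35803227,
                                    35803077, 35803081, 35803227),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the rows of every toy Regimen whose Component
# set is exactly the submitted set (order does not matter).
toy_lookup_regimen <- function(component_ids, regimens = toy_regimens) {
  by_regimen <- split(regimens$has_antineoplastic_concept_id,
                      regimens$regimen_concept_id)
  is_match <- vapply(by_regimen,
                     function(ids) setequal(ids, component_ids),
                     logical(1))
  matched_ids <- as.numeric(names(by_regimen)[is_match])
  regimens[regimens$regimen_concept_id %in% matched_ids, ]
}

# The four reduced Components match FOLFIRINOX but not FOLFOX.
toy_hit <- toy_lookup_regimen(c(35803077, 35803081, 35803227, 35803130))
# Adding a fifth Component (Trastuzumab) leaves no exact match.
toy_miss <- toy_lookup_regimen(c(35803077, 35803081, 35803227, 35803130, 35803361))
nrow(toy_hit)
nrow(toy_miss)
```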

In this case, three Regimens matched. However, what if a combination does not map cleanly to a HemOnc Regimen? For example, a drug is sometimes added to an established Regimen for experimental reasons.

For demonstration purposes, suppose the combination reduces to Fluorouracil, Folinic acid, Oxaliplatin, Irinotecan, and Trastuzumab.

Trastuzumab <- get_concept(concept_id = 35803361)
Trastuzumab
new_component_ids2 <- c(new_components$concept_id, Trastuzumab@concept_id)
new_component_ids2
#> [1] 35803077 35803081 35803227 35803130 35803361
new_regimen2 <- ho_lookup_regimen(new_component_ids2)

In this case, no Regimen was found, as indicated by the zero-row result.

new_regimen2
#> # A tibble: 0 x 4
#> # … with 4 variables: regimen_concept_id <???>, regimen_concept_name <???>,
#> #   has_antineoplastic_concept_id <???>, has_antineoplastic_concept_name <???>

To find possible Regimen and Component combinations instead, subsets of the Components can be submitted to find matches using ho_grep_regimens(), starting with a component_count that is one less than the total Component count. In this case, that value is 4.
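
The idea of starting one below the total Component count can be sketched in base R with combn(). This is only an illustration under stated assumptions: the toy Regimen list, toy_match(), and toy_subset_search() are hypothetical and do not show how ho_grep_regimens() works internally.

```r
# Toy Regimen definitions (illustrative; not the HemOnc vocabulary itself).
toy_regimens <- list(
  FOLFIRINOX = c(35803077, 35803081, 35803130, 35803227),
  FOLFOX     = c(35803077, 35803081, 35803227)
)

# Hypothetical helper: names of toy Regimens whose Component set equals `ids`.
toy_match <- function(ids) {
  hits <- vapply(toy_regimens, function(r) setequal(r, ids), logical(1))
  names(toy_regimens)[hits]
}

# Hypothetical helper: try every subset of size `component_count`, then
# shrink the subset size by one until some subset matches a toy Regimen.
toy_subset_search <- function(ids, component_count = length(ids) - 1) {
  if (component_count < 1) return(NULL)
  for (k in seq(component_count, 1)) {
    subsets <- combn(ids, k, simplify = FALSE)
    matched <- lapply(subsets, toy_match)
    keep <- lengths(matched) > 0
    if (any(keep)) {
      return(list(component_count = k,
                  regimens = unique(unlist(matched[keep]))))
    }
  }
  NULL
}

# Five Components: the search starts with subsets of size 4.
toy_result <- toy_subset_search(
  c(35803077, 35803081, 35803227, 35803130, 35803361)
)
toy_result$component_count
toy_result$regimens
```

With the toy data, the search stops at the first subset size (4) because the subset without Trastuzumab matches FOLFIRINOX.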

A concept object is retrieved for each of the Components.

new_component_objs2 <- lapply(new_component_ids2,
                              get_concept)
# Name each object by its concept_name slot (vapply() returns a character vector).
names(new_component_objs2) <- vapply(new_component_objs2,
                                     function(x) slot(x, "concept_name"),
                                     character(1))
new_component_objs2
#> $Fluorouracil
#> An object of class "concept"
#> Slot "concept_id":
#> [1] 35803077
#> 
#> Slot "concept_name":
#> [1] "Fluorouracil"
#> 
#> Slot "concept_synonym_names":
#> [1] "5 FU|5 Fluorouracil|5-fluoracilo|5-fluorouracilo|5-fluorouracyl|FU|Ro-2-9757"
#> 
#> Slot "maps_to_concept_names":
#> [1] "fluorouracil"
#> 
#> Slot "domain_id":
#> [1] "Drug"
#> 
#> Slot "vocabulary_id":
#> [1] "HemOnc"
#> 
#> Slot "concept_class_id":
#> [1] "Component"
#> 
#> Slot "standard_concept":
#> [1] NA
#> 
#> Slot "concept_code":
#> [1] "225"
#> 
#> Slot "valid_start_date":
#> [1] "2019-05-27"
#> 
#> Slot "valid_end_date":
#> [1] "2099-12-31"
#> 
#> Slot "invalid_reason":
#> [1] NA
#> 
#> 
#> $`Folinic acid`
#> An object of class "concept"
#> Slot "concept_id":
#> [1] 35803081
#> 
#> Slot "concept_name":
#> [1] "Folinic acid"
#> 
#> Slot "concept_synonym_names":
#> [1] "LV|citrovorum factor|folinate calcium|folinato de calcio|leucovorin calcium|sodium folinate"
#> 
#> Slot "maps_to_concept_names":
#> [1] "leucovorin"
#> 
#> Slot "domain_id":
#> [1] "Drug"
#> 
#> Slot "vocabulary_id":
#> [1] "HemOnc"
#> 
#> Slot "concept_class_id":
#> [1] "Component"
#> 
#> Slot "standard_concept":
#> [1] NA
#> 
#> Slot "concept_code":
#> [1] "229"
#> 
#> Slot "valid_start_date":
#> [1] "2019-05-27"
#> 
#> Slot "valid_end_date":
#> [1] "2099-12-31"
#> 
#> Slot "invalid_reason":
#> [1] NA
#> 
#> 
#> $Oxaliplatin
#> An object of class "concept"
#> Slot "concept_id":
#> [1] 35803227
#> 
#> Slot "concept_name":
#> [1] "Oxaliplatin"
#> 
#> Slot "concept_synonym_names":
#> [1] "JM-83|RP-54780|SR-96669"
#> 
#> Slot "maps_to_concept_names":
#> [1] "oxaliplatin"
#> 
#> Slot "domain_id":
#> [1] "Drug"
#> 
#> Slot "vocabulary_id":
#> [1] "HemOnc"
#> 
#> Slot "concept_class_id":
#> [1] "Component"
#> 
#> Slot "standard_concept":
#> [1] NA
#> 
#> Slot "concept_code":
#> [1] "377"
#> 
#> Slot "valid_start_date":
#> [1] "2019-05-27"
#> 
#> Slot "valid_end_date":
#> [1] "2099-12-31"
#> 
#> Slot "invalid_reason":
#> [1] NA
#> 
#> 
#> $Irinotecan
#> An object of class "concept"
#> Slot "concept_id":
#> [1] 35803130
#> 
#> Slot "concept_name":
#> [1] "Irinotecan"
#> 
#> Slot "concept_synonym_names":
#> [1] "CPT-11|Camptothecin-11|U-101440E"
#> 
#> Slot "maps_to_concept_names":
#> [1] "irinotecan"
#> 
#> Slot "domain_id":
#> [1] "Drug"
#> 
#> Slot "vocabulary_id":
#> [1] "HemOnc"
#> 
#> Slot "concept_class_id":
#> [1] "Component"
#> 
#> Slot "standard_concept":
#> [1] NA
#> 
#> Slot "concept_code":
#> [1] "280"
#> 
#> Slot "valid_start_date":
#> [1] "2019-05-27"
#> 
#> Slot "valid_end_date":
#> [1] "2099-12-31"
#> 
#> Slot "invalid_reason":
#> [1] NA
#> 
#> 
#> $Trastuzumab
#> An object of class "concept"
#> Slot "concept_id":
#> [1] 35803361
#> 
#> Slot "concept_name":
#> [1] "Trastuzumab"
#> 
#> Slot "concept_synonym_names":
#> [1] ""
#> 
#> Slot "maps_to_concept_names":
#> [1] "trastuzumab"
#> 
#> Slot "domain_id":
#> [1] "Drug"
#> 
#> Slot "vocabulary_id":
#> [1] "HemOnc"
#> 
#> Slot "concept_class_id":
#> [1] "Component"
#> 
#> Slot "standard_concept":
#> [1] NA
#> 
#> Slot "concept_code":
#> [1] "512"
#> 
#> Slot "valid_start_date":
#> [1] "2019-05-27"
#> 
#> Slot "valid_end_date":
#> [1] "2099-12-31"
#> 
#> Slot "invalid_reason":
#> [1] NA

All possible Component permutations are made by removing one Component at a time.

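# Build one permutation per Component by dropping that Component from the list.
new_component_permutations <- list()
for (i in seq_along(new_component_objs2)) {
  new_component_permutations[[i]] <- new_component_objs2[-i]
  # Name each permutation after the Component that was removed.
  names(new_component_permutations)[i] <- new_component_objs2[[i]]@concept_name
}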

The resulting list has the same length as the source Components (5), and each element is composed of 4 (5 minus 1) Components. Each element is named after the Component that was removed from that permutation.

The permutations are mapped to Regimens.

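library(dplyr)  # assumed to be attached for %>%, bind_rows(), select(), and distinct()

# Look up Regimens for each leave-one-out permutation.
new_regimens2 <- list()
for (i in seq_along(new_component_permutations)) {
  new_regimens2[[i]] <-
    ho_lookup_regimen(new_component_permutations[[i]])
}
names(new_regimens2) <- names(new_component_permutations)
# Keep only the permutations that returned at least one row.
new_regimens2 <-
  new_regimens2 %>%
  purrr::keep(~ nrow(.) > 0)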
new_regimens2
#> $Trastuzumab
#>    regimen_concept_id                                  regimen_concept_name
#> 1            35804771                                            FOLFIRINOX
#> 2            35804771                                            FOLFIRINOX
#> 3            35804771                                            FOLFIRINOX
#> 4            35804771                                            FOLFIRINOX
#> 5            35806584                                           mFOLFIRINOX
#> 6            35806584                                           mFOLFIRINOX
#> 7            35806584                                           mFOLFIRINOX
#> 8            35806584                                           mFOLFIRINOX
#> 9            35807526 FOLFIRINOX/modified FOLFIRINOX plus /- Chemoradiation
#> 10           35807526 FOLFIRINOX/modified FOLFIRINOX plus /- Chemoradiation
#> 11           35807526 FOLFIRINOX/modified FOLFIRINOX plus /- Chemoradiation
#> 12           35807526 FOLFIRINOX/modified FOLFIRINOX plus /- Chemoradiation
#>    has_antineoplastic_concept_id has_antineoplastic_concept_name
#> 1                       35803077                    Fluorouracil
#> 2                       35803081                    Folinic acid
#> 3                       35803130                      Irinotecan
#> 4                       35803227                     Oxaliplatin
#> 5                       35803077                    Fluorouracil
#> 6                       35803081                    Folinic acid
#> 7                       35803130                      Irinotecan
#> 8                       35803227                     Oxaliplatin
#> 9                       35803077                    Fluorouracil
#> 10                      35803081                    Folinic acid
#> 11                      35803130                      Irinotecan
#> 12                      35803227                     Oxaliplatin

After keeping only the permutations that returned at least one row, there are 3 potential Regimens that a subset of 4 Components maps to:

new_regimens2 %>% 
  bind_rows(.id = "add_on_component_name") %>%
  select(add_on_component_name, regimen_concept_id, regimen_concept_name) %>%
  distinct()
#>   add_on_component_name regimen_concept_id
#> 1           Trastuzumab           35804771
#> 2           Trastuzumab           35806584
#> 3           Trastuzumab           35807526
#>                                    regimen_concept_name
#> 1                                            FOLFIRINOX
#> 2                                           mFOLFIRINOX
#> 3 FOLFIRINOX/modified FOLFIRINOX plus /- Chemoradiation

Therefore, the possible combinations are FOLFIRINOX plus Trastuzumab, mFOLFIRINOX plus Trastuzumab, or FOLFIRINOX/modified FOLFIRINOX plus /- Chemoradiation plus Trastuzumab.
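
For readers who want to see what the bind_rows(.id = ...) and distinct() steps do, the following base R sketch tags each result with the name of the removed Component, stacks the results, and builds one label per Regimen and add-on pair. The toy results are a shortened, illustrative stand-in for new_regimens2, and all toy_ names are hypothetical.

```r
# Toy lookup results keyed by the removed (add-on) Component (illustrative).
toy_results <- list(
  Trastuzumab = data.frame(
    regimen_concept_id   = c(35804771, 35804771, 35806584, 35806584),
    regimen_concept_name = c("FOLFIRINOX", "FOLFIRINOX",
                             "mFOLFIRINOX", "mFOLFIRINOX"),
    stringsAsFactors = FALSE
  )
)

# Tag each data frame with its list name, stack them, and drop duplicates.
tagged <- Map(function(df, add_on) {
  cbind(add_on_component_name = add_on, df, stringsAsFactors = FALSE)
}, toy_results, names(toy_results))
toy_summary <- unique(do.call(rbind, tagged))
rownames(toy_summary) <- NULL

# One readable label per Regimen and add-on pair.
toy_labels <- paste(toy_summary$regimen_concept_name, "plus",
                    toy_summary$add_on_component_name)
toy_labels
```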

It is possible that none of the permutations returns any rows in this scenario. If so, it is likely more feasible to look up the mapping manually than to find matches programmatically.