Using vimpact for estimating vaccine impact - internal

This vignette describes how to use vimpact to calculate impact as a member of VIMC. This requires a connection to the montagu database so can only be used internally. Note that this is all in development and the interface is likely to change.

Impact by calendar year & impact by birth year

Function interface

impact <- vimpact::calculate_impact(
  con, method = "calendar_year", touchstone = "201910gavi-5",
  modelling_group = "CDA-Razavi",  disease = "HepB",
  focal_scenario_type = "default", focal_vaccine_delivery = list(
    list(vaccine = "HepB_BD", activity_type = "routine"),
    list(vaccine = "HepB", activity_type = "routine")
  ),
  baseline_scenario_type = "novac",
  burden_outcomes = c("hepb_deaths_acute", "hepb_deaths_dec_cirrh",
                      "hepb_deaths_hcc"))
str(impact)
#> tibble [9,898 × 4] (S3: tbl_df/tbl/data.frame)
#>  $ country       : chr [1:9898] "AFG" "AFG" "AFG" "AFG" ...
#>  $ burden_outcome: chr [1:9898] "deaths" "deaths" "deaths" "deaths" ...
#>  $ year          : int [1:9898] 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ...
#>  $ impact        : num [1:9898] 0 0 0 0 0 0 0 53 81 97 ...

Or for impact by birth year, use method = "birth_year". This will get burden_estimate_set ids for the baseline and focal scenarios for this particular touchstone, modelling group, disease. Then uses those to pull the burden_estimate data for specified burden_outcomes. Optionally filtering on country via countries arg, year via vaccination_years and under 5 age groups if is_under5 = TRUE. We then use raw impact from burden_estimate table and call relevant public facing impact method. For method = "calendar_year" impact_by_calendar_year. For method = "birth_year" impact_by_birth_year.

Recipe interface

Define a recipe either as a csv or using recipe_template

recipe <- data.frame(
  touchstone = "201910gavi-5",
  modelling_group = "CDA-Razavi", disease = "HepB",
  focal = "default:HepB_BD-routine;HepB-routine",
  baseline = "novac",
  burden_outcome = "hepb_deaths_acute,hepb_deaths_dec_cirrh,hepb_deaths_hcc;hepb_cases_acute_severe,hepb_cases_dec_cirrh,hepb_cases_hcc;dalys")
t <- tempfile(fileext = ".csv")
write.csv(recipe, t, row.names = FALSE)

This is a set of properties defining what data we want to extract from the db e.g. touchstone, modelling_group, disease, focal and baseline scenarios and burden_outcome. It captures the same info as the args to calculate_impact above. Then use the recipe to define meta data frame

meta <- vimpact:::get_meta_from_recipe(default_recipe = FALSE, recipe = t, con = con)

And use this to calculate impact

old_impact <- vimpact:::get_raw_impact_details(con, meta, "deaths")
str(old_impact)
#> 'data.frame':    9898 obs. of  7 variables:
#>  $ country       : int  104 104 104 104 104 104 104 104 104 104 ...
#>  $ time          : int  2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 ...
#>  $ baseline_value: num  5636 5765 5901 6034 6172 ...
#>  $ focal_value   : num  5636 5765 5901 6015 6075 ...
#>  $ value         : num  0 0 0 19 97 180 252 312 344 381 ...
#>  $ index         : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ burden_outcome: chr  "deaths" "deaths" "deaths" "deaths" ...

Impact by year of vaccination

Function interface

Very similar to examples for calendar year and birth year, to calculate impact by year of vaccination stratified by activity type run

impact <- vimpact::calculate_impact(
  con, method = "yov_activity_type", touchstone = "201910gavi-5",
  modelling_group = "CDA-Razavi",  disease = "HepB",
  focal_scenario_type = "default", focal_vaccine_delivery = list(
    list(vaccine = "HepB_BD", activity_type = "routine"),
    list(vaccine = "HepB", activity_type = "routine")
  ),
  baseline_scenario_type = "novac",
  burden_outcomes = "dalys")
str(impact)
#> tibble [4,279 × 8] (S3: tbl_df/tbl/data.frame)
#>  $ country       : chr [1:4279] "AFG" "AFG" "AFG" "AFG" ...
#>  $ vaccine       : chr [1:4279] "HepB" "HepB" "HepB" "HepB" ...
#>  $ activity_type : chr [1:4279] "routine" "routine" "routine" "routine" ...
#>  $ year          : int [1:4279] 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 ...
#>  $ burden_outcome: chr [1:4279] "dalys" "dalys" "dalys" "dalys" ...
#>  $ impact        : num [1:4279] 103859 106575 105852 111953 115165 ...
#>  $ impact_ratio  : num [1:4279] 0.154 0.154 0.154 0.154 0.154 ...
#>  $ fvps          : num [1:4279] 676396 694088 689380 729111 750028 ...

or use method = "yov_birth_cohort for impact by year of vaccination stratified by birth cohort. This uses the same logic as other impact methods but before calling the public facing impact function it will also extract FVP data from the data base for this touchstone and vaccine delivery. It then calls either impact_by_year_of_vaccination_activity_type or impact_by_year_of_vaccination_birth_cohort to get impact.

Recipe interface

We can re-use the recipe from above and get meta table for this method

meta <- vimpact:::get_meta_from_recipe(default_recipe = FALSE, recipe = t,
                                      method = "method2a", con = con)

Then use this to get raw impact

old_raw_impact <- vimpact:::get_raw_impact_details(con, meta, "dalys")

Extract the fvp data and map output to required interface

fvps <- vimpact::extract_vaccination_history(con, "201910gavi-5", year_min = 2000,
                                             year_max = 2030,
                                             disease_to_extract = "HepB")
#> User defined touchstone version is used.
#> Converting input coverage data......
#> Extracted interpolated population.
#> Extracted raw coverage data...
#> Transformed coverage data.
fvps$fvps <- fvps$fvps_adjusted
fvps$country <- fvps$country_nid

Then can use meta, raw_impact and fvps to calculate impact by year of vaccination

old_impact <- vimpact:::impact_by_year_of_vaccination(
  meta, old_raw_impact, fvps, vaccination_years = 2000:2030)
str(old_impact)
#> 'data.frame':    4964 obs. of  27 variables:
#>  $ country             : int  4 4 4 4 4 4 4 4 4 4 ...
#>  $ vaccine             : chr  "HepB_BD" "HepB" "HepB_BD" "HepB_BD" ...
#>  $ activity_type       : chr  "routine" "routine" "routine" "routine" ...
#>  $ scenario_description: chr  "best-estimates" "best-estimates" "best-estimates" "best-estimates" ...
#>  $ coverage_set        : int  1750 1751 1750 1750 1751 1751 1751 1751 1751 1751 ...
#>  $ delivery_id         : int  487 3093 1169 1175 3376 4029 4272 3375 4564 3096 ...
#>  $ disease             : chr  "HepB" "HepB" "HepB" "HepB" ...
#>  $ scenario_type       : chr  "default" "default" "default" "default" ...
#>  $ gavi_support_level  : chr  "with" "with" "with" "with" ...
#>  $ year                : int  2016 2017 2014 2025 2030 2015 2019 2027 2029 2020 ...
#>  $ gavi_support        : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
#>  $ gender              : chr  "Both" "Both" "Both" "Both" ...
#>  $ age                 : num  0 0 0 0 0 0 0 0 0 0 ...
#>  $ target_source       : num  1118424 1131834 1103729 1180725 1188620 ...
#>  $ coverage_source     : num  0.18 0.66 0.04 0.473 0.749 ...
#>  $ cohort_size         : num  1118424 1131834 1103729 1180725 1188620 ...
#>  $ delivery_population : num  1118424 1131834 1103729 1180725 1188620 ...
#>  $ fvps_source         : num  201316 747010 44149 558483 890514 ...
#>  $ fvps_adjusted       : num  201316 747010 44149 558483 890514 ...
#>  $ coverage_adjusted   : num  0.18 0.66 0.04 0.473 0.749 ...
#>  $ country_nid         : int  4 4 4 4 4 4 4 4 4 4 ...
#>  $ fvps                : num  201316 747010 44149 558483 890514 ...
#>  $ time                : num  2016 2017 2014 2025 2030 ...
#>  $ burden_outcome      : chr  "dalys" "dalys" "dalys" "dalys" ...
#>  $ impact_ratio        : num  0.154 0.154 0.154 0.154 0.154 ...
#>  $ impact              : num  30912 114701 6779 85754 136736 ...
#>  $ index               : int  1 1 1 1 1 1 1 1 1 1 ...

Comparison

calculate_impact has a very similar interface as a single row in an impact recipe. The output is slightly different format but output from previous method can be transformed into same format.

country <- dplyr::tbl(con, "country")
old_impact <- old_impact %>%
  dplyr::left_join(country, by = c("country" = "nid"), copy = TRUE) %>%
  dplyr::select(country = id, vaccine, activity_type, year = time,
                burden_outcome, impact) %>%
  dplyr::filter(!is.na(impact)) %>%
  dplyr::arrange(activity_type, country, year, vaccine)
str(old_impact)
#> 'data.frame':    4279 obs. of  6 variables:
#>  $ country       : chr  "AFG" "AFG" "AFG" "AFG" ...
#>  $ vaccine       : chr  "HepB" "HepB" "HepB" "HepB" ...
#>  $ activity_type : chr  "routine" "routine" "routine" "routine" ...
#>  $ year          : num  2007 2008 2009 2010 2011 ...
#>  $ burden_outcome: chr  "dalys" "dalys" "dalys" "dalys" ...
#>  $ impact        : num  103859 106575 105852 111953 115165 ...

and we can see from a plot that the output is very similar

HepB

htmltools::tags$iframe( 
  src = "impact_comparisons_hepb.html", 
  width = "100%", 
  height = "400", 
  scrolling = "no", 
  seamless = "seamless", 
  frameBorder = "0", 
  `data-external` = "1" 
)

Measles

htmltools::tags$iframe( 
  src = "impact_comparisons_measles.html", 
  width = "100%", 
  height = "400", 
  scrolling = "no", 
  seamless = "seamless", 
  frameBorder = "0", 
  `data-external` = "1" 
)

Comparisons of HepB, Measles & YF

This shows bars of mean difference between old impact & new impact method for each country for HepB, Measles and YF. Tooltips show the number of observations and the min and max order of the values. i.e. an impact of 23432.342 has order 5.

htmltools::tags$iframe( 
  src = "impact_comparisons.html", 
  width = "100%", 
  height = "400", 
  scrolling = "no", 
  seamless = "seamless", 
  frameBorder = "0", 
  `data-external` = "1" 
)

There are small differences in impact because of differences in precision. calculate_impact tries to do large parts of the aggregation on the database where burden_estimate values are stored as Postgres real type which has 4 bytes of storage size and 6 decimal digits of precision. Whereas if we pull this before aggregation the values are stored in R as doubles which are much higher precision leading to small differences when aggregated.

calculate_impact at the moment won’t be able to generate impact for you for more than 1 set of scenario comparisons. Next steps will be to add a wrapper which can take some data like the impact recipe and call calculate_imapct for multiple scenarios.