Devs Fixed

Resolved: Conditional sum in R w NA values?

By Isaac Tonny on 16/06/2022 Issue

Question:

I am trying to calculate the number of assays a given patient has within pre-specified time periods. E.g., between 14 and 45 days after a patient receives a third dose of vaccine, how many assays were taken? However, I don’t want to include any assays taken after the patient receives a fourth dose of vaccine.
My dataset is in long format and contains a variable indicating each date that an assay was completed, and variables for the date of each vaccination. Below is a contrived example of my data frame.
I’m unsure how I can sum the cases where the date of the assay falls in my pre-specified date range, while at the same time ensuring that I’m not including assays taken after a fourth vaccine dose. The challenge is that most of the patients in my dataset have not received a fourth dose and therefore have a missing value for dose_4_date.
My first thought was to use case_when to make a flag for the cases in which the assay_date is between 14 and 45 days after the dose_3_date, but not after the dose_4_date, and then sum the flags somehow. Below is what I’ve written so far:

df %>%
  mutate(
    post  = case_when(assay_date >= dose_3_date+14 & assay_date <= dose_3_date+45 &
                        assay_date <= dose_4_date & !is.na(dose_4_date) ~ 1),
    post3 = case_when(assay_date >= dose_3_date+60 & assay_date <= dose_3_date+120 &
                        assay_date <= dose_4_date & !is.na(dose_4_date) ~ 1),
    post6 = case_when(assay_date >= dose_3_date+135 & assay_date <= dose_3_date+210 &
                        assay_date <= dose_4_date & !is.na(dose_4_date) ~ 1)
  )

The above code works well for patients with a dose_4_date, but results in NA values for those with a “missing” dose_4_date. I’m unsure how I can ignore the NAs for patients with a missing dose_4_date.
I’m also unsure how to sum the flags afterward.
Any advice would be greatly appreciated!

Answer:

library(data.table)

# dummy data
df <- data.table(id = rep(c(1,2), times=c(4,3))
                 , assay_date = c('20mar2021', '06jun2021', '24sep2021', '19nov2021',
                                  '29apr2021', '23may2021', '15jun2021')
                 , dose_3_date = rep(c('22feb2021', '02apr2021'), times=c(4,3))
                 , dose_4_date = c(rep(c('17aug2021', NA), times=c(4,3)))
                 ); df

# set as data.table if yours isn't one already
setDT(df)

# as.Date
x <- c("assay_date", "dose_3_date", "dose_4_date")
df[, (x) := lapply(.SD, \(i) as.Date(i, format="%d%b%Y")), .SDcols=x
   ][, date_diff := assay_date - dose_3_date # calculate date diff
     ]

# flag rows which fit criteria
df[date_diff %between% c(14, 45)
   & (assay_date <= dose_4_date | is.na(dose_4_date) )
   , fits_criteria := 1
   ]

# count per patient
df[, .(assays_in_period = sum(fits_criteria, na.rm=T)), id]

# Output:
#    id assays_in_period
# 1:  1                1
# 2:  2                1
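
For readers who would rather stay with dplyr, as in the question, the same idea can be sketched as follows. The key change from the question's code is replacing `assay_date <= dose_4_date & !is.na(dose_4_date)` with `is.na(dose_4_date) | assay_date <= dose_4_date`: in R, `TRUE | NA` evaluates to `TRUE`, so a missing fourth dose no longer turns the whole condition into `NA` or excludes the row. This is only a sketch, not the answerer's method; the helper columns `days_since_dose_3` and `before_dose_4` are names made up for illustration, and the three windows are taken from the question.

```r
library(dplyr)

# Same dummy data as in the answer above, with the dates already parsed.
df <- data.frame(
  id = rep(c(1, 2), times = c(4, 3)),
  assay_date = as.Date(c("2021-03-20", "2021-06-06", "2021-09-24", "2021-11-19",
                         "2021-04-29", "2021-05-23", "2021-06-15")),
  dose_3_date = as.Date(rep(c("2021-02-22", "2021-04-02"), times = c(4, 3))),
  dose_4_date = as.Date(rep(c("2021-08-17", NA), times = c(4, 3)))
)

counts <- df %>%
  mutate(
    # hypothetical helper column: whole days between dose 3 and the assay
    days_since_dose_3 = as.integer(assay_date - dose_3_date),
    # hypothetical helper column: TRUE when there is no fourth dose,
    # or when the assay was taken on or before the fourth dose
    before_dose_4 = is.na(dose_4_date) | assay_date <= dose_4_date
  ) %>%
  group_by(id) %>%
  summarise(
    # summing a logical vector counts its TRUE values
    post  = sum(dplyr::between(days_since_dose_3, 14, 45) & before_dose_4),
    post3 = sum(dplyr::between(days_since_dose_3, 60, 120) & before_dose_4),
    post6 = sum(dplyr::between(days_since_dose_3, 135, 210) & before_dose_4)
  )

counts
```

Because the flags are logical rather than `1`/`NA`, `sum()` counts them directly and no `na.rm` is needed, provided `assay_date` and `dose_3_date` themselves are never missing. `dplyr::between` is written with its namespace because data.table also exports a `between` function.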


conditional-statements r sum