In this post, we will see how to subset rows of data that contain a specific string in any of several columns.
Question:
I have a very large dataset and I need to subset it to keep only those IDs that contain the word “Paracetamol” in any of the medication columns, e.g. medication1, medication2, medication3, and so on up to medication50. Please help <3
Best Answer:
In R you can compare the entire data frame for equality with the word “paracetamol”, which gives you a boolean matrix. Since TRUE == 1 and FALSE == 0, you can calculate rowSums on that matrix and keep the rows where the sum is greater than zero. If there are NA values in your data, use rowSums(., na.rm=TRUE).
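The approach described above can be sketched as follows. This is only a minimal illustration, not the original answer's code: the data frame dat, its values, and the helper names med_cols, hits, keep and result are made up for the example, and only three medication columns are shown instead of fifty.

```r
# Hypothetical toy data; column names and values are invented for illustration.
dat <- data.frame(
  id = 1:4,
  medication1 = c("paracetamol", "ibuprofen", NA, "aspirin"),
  medication2 = c("aspirin", "paracetamol", "ibuprofen", NA),
  medication3 = c(NA, "aspirin", "codeine", "ibuprofen"),
  stringsAsFactors = FALSE
)

# Restrict the comparison to the medication columns (assumes this naming scheme).
med_cols <- grep("^medication", names(dat), value = TRUE)

# Element-wise comparison yields a logical matrix; cells that are NA stay NA.
hits <- dat[med_cols] == "paracetamol"

# TRUE counts as 1 and FALSE as 0; na.rm = TRUE skips the NA cells.
keep <- rowSums(hits, na.rm = TRUE) > 0

# Keep only the rows with at least one match.
result <- dat[keep, ]
print(result)
```

Note that == is an exact, case-sensitive comparison, so a cell containing “Paracetamol” or “paracetamol 500mg” would not match “paracetamol” here. If the real data vary in case or contain extra text, the comparison step would need to be adapted.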
Source: Stackoverflow.com