In this post, we will see how to subset rows of data that contain a specific string in any of several columns.
Question: I have a very large dataset and I need to subset it to keep only those IDs that contain the word “Paracetamol” in any of the medication columns, e.g. medication1, medication2, medication3, and so on up to medication50.
Please help <3
Best Answer: In R you can compare the entire data frame with the word “paracetamol” using ==, which gives you a logical matrix. Since TRUE == 1 and FALSE == 0, you can calculate rowSums on that matrix, and then subset the rows where the sum is greater than zero.
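A minimal sketch of this approach, using a small made-up data frame. The column names (id, medication1 to medication3) and the values are assumptions for illustration, not the asker's real data:

```r
# Toy data; the IDs, column names and drug names are invented for illustration.
df <- data.frame(
  id = 1:4,
  medication1 = c("paracetamol", "ibuprofen", "aspirin", "ibuprofen"),
  medication2 = c("aspirin", "paracetamol", "ibuprofen", "aspirin"),
  medication3 = c("ibuprofen", "aspirin", "codeine", "paracetamol"),
  stringsAsFactors = FALSE
)

# df == "paracetamol" is a logical matrix; rowSums counts the TRUEs in each row.
hits <- rowSums(df == "paracetamol")

# Keep only the rows with at least one match.
result <- df[hits > 0, ]
```

Note that == is case-sensitive and matches whole cell values only, so a cell containing "Paracetamol" or "paracetamol 500mg" would not be counted by this comparison.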
If there are NA values in your data, use na.rm = TRUE in rowSums.
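For data with missing values, a hedged sketch along the same lines is shown here. It assumes rowSums is called with its na.rm = TRUE argument, and it also restricts the comparison to columns whose names start with "medication" (that naming pattern and the toy data are assumptions for illustration):

```r
# Toy data with missing values; names and values are invented for illustration.
df <- data.frame(
  id = 1:3,
  medication1 = c("paracetamol", NA, "aspirin"),
  medication2 = c(NA, "ibuprofen", "paracetamol"),
  stringsAsFactors = FALSE
)

# Select only the medication columns by name (assumes a "medication" prefix).
med_cols <- grep("^medication", names(df))

# Without na.rm = TRUE, any row containing NA would get a row sum of NA.
hits <- rowSums(df[med_cols] == "paracetamol", na.rm = TRUE)

# Keep only the rows with at least one match.
result <- df[hits > 0, ]
```

Restricting to the medication columns avoids accidental matches in unrelated columns and keeps the comparison smaller on a very large dataset.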