Question:
Say I have a pandas dataframe like this:
Doctor | Patient | Days |
---|---|---|
Aaron | Jeff | 23 |
Aaron | Josh | 46 |
Aaron | Josh | 71 |
Jess | Manny | 55 |
Jess | Manny | 85 |
Jess | Manny | 46 |
I want to extract dataframes where a combination of a doctor and a patient occurs more than once. I will be doing further work on the procured dataframes.
So, for instance, in this example, dataframe
Doctor | Patient | Days |
---|---|---|
Aaron | Josh | 46 |
Aaron | Josh | 71 |
would be extracted AND dataframe
Doctor | Patient | Days |
---|---|---|
Jess | Manny | 55 |
Jess | Manny | 85 |
Jess | Manny | 46 |
would be extracted.
In accordance with my condition, dataframe
Doctor | Patient | Days |
---|---|---|
Aaron | Jeff | 23 |
will not be extracted because the combination of Aaron and Jeff occurs only once.
Now, I have a dataframe that has 400000 rows and the code I have written so far is, I think, inefficient in procuring the dataframes that I want. Here is the code:
Thanks!
Umesh
Answer:
You may check with groupby.
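Here is a minimal sketch of one way to do it with groupby, assuming the column names from the question. The names `keys`, `repeated` and `frames` are just illustrative, and this may not be the only or the fastest way for 400000 rows.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Doctor": ["Aaron", "Aaron", "Aaron", "Jess", "Jess", "Jess"],
    "Patient": ["Jeff", "Josh", "Josh", "Manny", "Manny", "Manny"],
    "Days": [23, 46, 71, 55, 85, 46],
})

keys = ["Doctor", "Patient"]

# For every row, compute the size of its (Doctor, Patient) group,
# then keep only the rows whose group has more than one row.
group_sizes = df.groupby(keys)["Days"].transform("size")
repeated = df[group_sizes > 1]

# Split the remaining rows into one dataframe per (Doctor, Patient) pair.
# The dictionary keys are tuples such as ("Aaron", "Josh").
frames = {pair: group for pair, group in repeated.groupby(keys)}

for pair, group in frames.items():
    print(pair)
    print(group)
```

The filtering step is vectorized, so the only Python-level loop is over the repeated pairs themselves. If you only need the filtered rows and not separate dataframes, `repeated` alone should be enough.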