Question:
I have a big dataframe with millions of rows and many columns, and I need to group by one column and count the values of several other columns. I need help writing this efficiently: minimal lines of code, and code that runs very fast.
I’m giving a simpler example below about my problem.
Below is my input CSV.

I expect the output to be as below. The output should show:
- the CONTINENT column as the main groupby column
- each unique value of the AGE_GROUP and APPROVAL_STATUS columns as a separate output column, holding the count of that value for each CONTINENT.
Output:-

Below is how I’m achieving it currently, but this is NOT an efficient way. I’ve also seen that this could be achieved with a pandas pivot table, but I’m not too sure about it.
Answer:
Easy solution
Let us use `pd.crosstab` to calculate a frequency table for each column, then `pd.concat` the tables along the columns axis.
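The original code for this answer is not shown, so here is a minimal sketch of the crosstab-then-concat approach. The sample rows and the `value_cols` name are assumptions made up for illustration; only the column names CONTINENT, AGE_GROUP and APPROVAL_STATUS come from the question.

```python
import pandas as pd

# Hypothetical sample data; the question's real CSV is not shown.
df = pd.DataFrame({
    "CONTINENT": ["Asia", "Asia", "Europe", "Europe", "Europe", "Africa"],
    "AGE_GROUP": ["18-30", "31-50", "18-30", "18-30", "31-50", "31-50"],
    "APPROVAL_STATUS": ["YES", "NO", "YES", "NO", "YES", "YES"],
})

# Columns whose values should be counted per continent (assumed name).
value_cols = ["AGE_GROUP", "APPROVAL_STATUS"]

# One frequency table per column: rows are continents, columns are the
# unique values of that column, cells are counts.
tables = [pd.crosstab(df["CONTINENT"], df[col]) for col in value_cols]

# Glue the tables side by side; they share the same CONTINENT index.
result = pd.concat(tables, axis=1)
print(result)
```

If two of the counted columns can contain the same label, the output column names would collide. Passing `keys=value_cols` to `pd.concat` is one way to keep them apart, at the cost of a two-level column index.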