Question:
I have a dataframe which contains a client code, the contract number, and the products on the contract. Something like this:
client_code | contract_number | product |
---|---|---|
AAAA | 1000 | Water |
AAAA | 1000 | Soda |
AAAA | 1000 | Food |
BACD | 1001 | Water |
BACD | 1001 | Soda |
DAMR | 1002 | Food |
And I want to add a column that numbers the products on each contract from 1 to n. Something like this:
client_code | contract_number | product | count |
---|---|---|---|
AAAA | 1000 | Water | 1 |
AAAA | 1000 | Soda | 2 |
AAAA | 1000 | Food | 3 |
BACD | 1001 | Water | 1 |
BACD | 1001 | Soda | 2 |
DAMR | 1002 | Food | 1 |
I’ve tried with a for loop but it’s too slow (about an hour).
PS: My dataframe contains 500,000 rows.
Thank you !
Answer:
IIUC, you want a cumulative count within each `client_code` (or probably `contract_number`). You can do that with the `cumcount` function on a groupby.
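A minimal sketch of that approach, assuming the dataframe is named `df` (a name chosen here for illustration) and that `contract_number` is the grouping key; swap in `client_code` if that is the key you need. Note that `cumcount` numbers rows from 0, so 1 is added to get a count running from 1 to n:

```python
import pandas as pd

# Sample data mirroring the table in the question
df = pd.DataFrame({
    "client_code": ["AAAA", "AAAA", "AAAA", "BACD", "BACD", "DAMR"],
    "contract_number": [1000, 1000, 1000, 1001, 1001, 1002],
    "product": ["Water", "Soda", "Food", "Water", "Soda", "Food"],
})

# cumcount() numbers the rows inside each group starting at 0,
# so add 1 to get a running count from 1 to n per contract
df["count"] = df.groupby("contract_number").cumcount() + 1

print(df)
```

Because this is a single vectorized groupby operation rather than a Python-level loop, it should run far faster than the for loop on a dataframe of this size.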