Perform cumulative count on a pandas column taking order into account

Multi tool use
Multi tool use
The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP


Perform cumulative count on a pandas column taking order into account



I have a pandas data frame


pd.DataFrame(columns=["A", "B"], data=[['id1','a'],['id1','a'], ['id1','a'], ['id1','b'], ['id1','b'], ['id1','a'], ['id1','a'], ['id2','c'], ['id2','c'], ['id2','a'], ['id2','c']])

A B
0 id1 a
1 id1 a
2 id1 a
3 id1 b
4 id1 b
5 id1 a
6 id1 a
7 id2 c
8 id2 c
9 id2 a
10 id2 c



I want to do a group by that gives me cumulative sum of occurrences in B column considering the order


A B C
0 id1 a 3
1 b 2
2 a 2
3 id2 c 2
4 a 1
5 c 1




1 Answer
1



You can use itertools.groupby followed by a list comprehension. This works because, unlike Pandas GroupBy, the itertools version does not aggregate like elements which are non-adjacent.


itertools.groupby


GroupBy


itertools


from itertools import groupby

grouper = groupby(zip(df['A'], df['B']))
res = [(i1, i2, len(list(j))) for (i1, i2), j in grouper]

df = pd.DataFrame(res, columns=['A', 'B', 'C'])

print(df)

A B C
0 id1 a 3
1 id1 b 2
2 id1 a 2
3 id2 c 2
4 id2 a 1
5 id2 c 1






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

iygLpam05qKV28wV9qRLLtlcYgvntbsGAyb,8G j9CwAH
GVj,lc1w5,wINNQQSTKiGPRZ 0DieJ Heij,fnenWcAoi,GaNLFLhZEpdWfeNnHuw oPh96YwgJBgjh0fyvywfL,9,ehHNyWW

Popular posts from this blog

Makefile test if variable is not empty

Visual Studio Code: How to configure includePath for better IntelliSense results

Will Oldham