A frequent need in data analysis is to understand the order that a given event ocurred in for a given user or account. An example question that lends itself to this is "What is the avg 1st order value of people who have purchased 2 items from us?"
When I first started in analytics, I typically did this with SQL, however I thought I'd share how to do this in Pandas because I've been really enjoying how brief and concise the code is.
df['order_rank'] = df.sort_values('order_date', ascending=True).groupby(by=['user_id'])['order_date'].transform(lambda x: x.rank())
Above we create a new column, order_rank in our dataframe. We sort our dataframe of orders by date and group the dataframe by user_id and apply the transform method to the date series to come up with a sequential rank from low to high.
The above screenshots give you an example of the use of this. Now you can filter the dataframe by order number 2, create a new dataframe of user_ids that have more than one order and countless other things.