Sorting and filtering of grouped subsets
Task: Find the sales clerks whose sales rank in the top 8 in every month of 1995.
Python
import pandas as pd

sale_file = 'E:\\txt\\SalesRecord.txt'
sale_info = pd.read_csv(sale_file, sep='\t')
# Derive the month from the sale date
sale_info['month'] = pd.to_datetime(sale_info['sale_date']).dt.month
# Total sales per clerk per month (sum only the sale_amt column)
sale_group = sale_info.groupby(by=['clerk_name', 'month'], as_index=False)['sale_amt'].sum()
# Regroup the monthly totals by month
sale_group_month = sale_group.groupby(by='month')
# Start from all clerks and keep only those in every month's top 8
set_name = set(sale_info['clerk_name'])
for index, sale_g_m in sale_group_month:
    sale_g_m = sale_g_m.sort_values(by='sale_amt', ascending=False)
    sale_g_max_8 = sale_g_m.iloc[:8]
    sale_g_max_8_name = sale_g_max_8['clerk_name']
    set_name = set_name.intersection(set(sale_g_max_8_name))
print(set_name)
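The script above does not restrict the data to 1995, so it presumably relies on the file containing only that year. As a minimal sketch, assuming the same column names (`clerk_name`, `sale_date`, `sale_amt`), the hypothetical helper `clerks_in_top_n_every_month` below adds an explicit year filter and replaces the loop with a per-month rank. The sample data is made up for illustration and uses the top 2 instead of the top 8.

```python
import pandas as pd


def clerks_in_top_n_every_month(sales, year, n):
    """Return the clerks ranked in the top n by monthly sales in every month of `year`.

    "Every month" means every month that actually appears in the data for that year.
    """
    sales = sales.assign(sale_date=pd.to_datetime(sales['sale_date']))
    in_year = sales[sales['sale_date'].dt.year == year]
    in_year = in_year.assign(month=in_year['sale_date'].dt.month)
    # Total sales per clerk per month
    monthly = in_year.groupby(['clerk_name', 'month'], as_index=False)['sale_amt'].sum()
    # Rank clerks within each month, largest total first; ties are broken by row order
    monthly['rank'] = monthly.groupby('month')['sale_amt'].rank(method='first', ascending=False)
    top = monthly[monthly['rank'] <= n]
    # A clerk qualifies if they appear in the top n in as many months as exist
    months_in_top = top.groupby('clerk_name')['month'].nunique()
    n_months = monthly['month'].nunique()
    return set(months_in_top[months_in_top == n_months].index)


# Made-up sample data; the 1996 row should be ignored by the year filter
sample = pd.DataFrame({
    'clerk_name': ['Ann', 'Bob', 'Cid', 'Ann', 'Bob', 'Cid', 'Cid'],
    'sale_date': ['1995-01-05', '1995-01-09', '1995-01-20',
                  '1995-02-03', '1995-02-11', '1995-02-15', '1996-01-02'],
    'sale_amt': [300, 200, 100, 150, 50, 400, 999],
})
result = clerks_in_top_n_every_month(sample, 1995, 2)
print(result)
```

In the sample, Ann and Bob lead January while Cid and Ann lead February, so only Ann is in the top 2 in both months.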
esProc
  | A |
1 | E:\\txt\\SalesRecord.txt |
2 | =file(A1).import@t() |
3 | =A2.groups(clerk_name:name,month(sale_date):month;sum(sale_amt):amount) |
4 | =A3.group(month) |
5 | =A4.(~.sort(-amount).to(8)) |
6 | =A5.isect(~.(name)) |
esProc retains the grouped subsets after grouping (A4), so each subset can be sorted and filtered directly inside a loop function (A5), and the resulting per-month name lists are intersected in a single step (A6).
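For comparison, pandas can also hold on to the grouped subsets if they are collected into a list. The following is a rough sketch of how steps A3 to A6 might be mirrored in Python; the mapping to the esProc cells is an interpretation, the sample data is made up, and the top 2 is used instead of the top 8.

```python
import pandas as pd

# Made-up sample data with the same column names as the sales file
sales = pd.DataFrame({
    'clerk_name': ['Ann', 'Bob', 'Cid', 'Ann', 'Bob', 'Cid'],
    'sale_date': ['1995-01-05', '1995-01-09', '1995-01-20',
                  '1995-02-03', '1995-02-11', '1995-02-15'],
    'sale_amt': [300, 200, 100, 150, 50, 400],
})
sales['month'] = pd.to_datetime(sales['sale_date']).dt.month

# Roughly A3: total sales per clerk per month
monthly = sales.groupby(['clerk_name', 'month'], as_index=False)['sale_amt'].sum()
# Roughly A4: keep one sub-frame per month instead of aggregating right away
subsets = [group for _, group in monthly.groupby('month')]
# Roughly A5: sort each subset by amount and keep its first 2 rows
top_subsets = [group.nlargest(2, 'sale_amt') for group in subsets]
# Roughly A6: intersect the clerk names across all months
# (set.intersection needs at least one set, so the data must not be empty)
common = set.intersection(*(set(group['clerk_name']) for group in top_subsets))
print(common)
```

Keeping the subsets in a list makes each step inspectable on its own, at the cost of materializing one small DataFrame per month.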