python - pandas: replacing categorical values with counts of a multi-class label


Let's assume I have this data frame:

    import pandas as pd

    df = pd.DataFrame({'label': [0, 1, 2, 0, 1, 2], 'cat_col': [1, 1, 2, 2, 3, 3]})

       cat_col  label
    0        1      0
    1        1      1
    2        2      2
    3        2      0
    4        3      1
    5        3      2

I want to transform the data frame into the following:

    cat_col, label, count_when_label_is_0, count_when_label_is_1, count_when_label_is_2
    1         0           1,               1,          0
    1         1           1,               1,          0
    ...

So I add one column for each label value (the label is multinomial), and in each row I put how many times that label value occurs among the rows that share the row's cat_col. What I have works but is slow:

    # Number of rows for each (cat_col, label) pair
    size = df[['cat_col', 'label']].groupby(['cat_col', 'label']).size()

    def get_size(cat_val, label_val):
        # Pairs that never occur are missing from `size`, so fall back to 0
        if label_val in size[cat_val]:
            return size[cat_val][label_val]
        return 0

    for label_val in range(9):  # 9 classes in the multinomial label
        df['new_col_' + str(label_val)] = df['cat_col'].apply(
            lambda cat_val: get_size(cat_val, label_val))

You can use pivot_table to build the table of counts for each cat_col and label pair:

    In [11]: df.pivot_table(index="cat_col", columns="label", aggfunc=len, fill_value=0)
    Out[11]:
    label    0  1  2
    cat_col
    1        1  1  0
    2        1  0  1
    3        0  1  1
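
The count table has one row per cat_col value, while the question asks for the counts on every original row. A minimal sketch of that last step follows. It is an illustration, not the answer's exact method: it builds the count table with pd.crosstab instead of the pivot_table call, and it takes the count_when_label_is_N column names from the desired output in the question.

```python
import pandas as pd

df = pd.DataFrame({'label': [0, 1, 2, 0, 1, 2],
                   'cat_col': [1, 1, 2, 2, 3, 3]})

# Count table: one row per cat_col value, one column per label value.
# pd.crosstab is swapped in here for the pivot_table call (an assumption
# made for this sketch); missing pairs are filled with 0.
counts = pd.crosstab(df['cat_col'], df['label'])

# Rename the label columns to match the names used in the question.
counts.columns = ['count_when_label_is_' + str(c) for c in counts.columns]

# Look up each row's cat_col in the count table's index and attach the counts.
result = df.join(counts, on='cat_col')
print(result)
```

Because the counts are computed once and attached with a single join, this avoids calling a Python function for every row and every label value, which is what makes the apply-based loop slow.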