sveitser/pandas_pivot_table_drop_other_columns.org

## pandas_pivot_table_drop_other_columns.org

      
    Raw
  

              pandas_pivot_table_drop_other_columns.org
            
          
    Drop other columns with pandas pivot_table

Problem: would like to use pivot_table [1] to pivot and ignore irrelevant columns
  in the DataFrame.
Given the DataFrame
df = pd.DataFrame(
    {
        "A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
        "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
        "C": ["small", "large", "large", "small", "small", "large", "small", "small", "large"],
        "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
        "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
    }
)
     A    B      C  D  E
0  foo  one  small  1  2
1  foo  one  large  2  4
2  foo  one  large  2  5
3  foo  two  small  3  5
4  foo  two  small  3  6
5  bar  one  large  4  6
6  bar  one  small  5  8
7  bar  two  small  6  9
8  bar  two  large  7  9

We can count the number of each category in C with
df.pivot_table(index=["A", "B"], columns=["C"], aggfunc="count", fill_value=0)
# , or aggfunc=len also works
but the result includes all the remaining columns:
            D           E
C       large small large small
A   B
bar one     1     1     1     1
    two     1     1     1     1
foo one     2     1     2     1
    two     0     2     0     2

Passing values=[] removes them
df.pivot_table(index=["A", "B"], columns=["C"], values=[], aggfunc=len, fill_value=0)
C        large  small
A   B
bar one      1      1
    two      1      1
foo one      2      1
    two      0      2

however the same functional call with aggfunc set to ~”count”~ returns an empty DataFrame. 🤷
[1] It’s difficult enough to remember how to use one function.