- dd.Series is linearly partitioned into smaller pd.Series objects.
- dd.Series.apply and dd.Series.map behave alike: each executes a function on every element.
- dd.Series.map_partitions maps a function over the partitions of the series.
- dd.Series.reduction applies chunk to each partition, then applies aggregate to the combined chunk results.
- pd.Series is backed by a NumPy array with additional metadata.
When chunk returns a pd.Series, aggregate will receive a pd.DataFrame.
This DataFrame has one row per partition, and its columns are the index labels of the Series that chunk returned.
Visualize this as transposing each Series from a column vector into a row vector of the DataFrame.
Remember that this still happens if your chunk function is just the identity function.