CMCDragonkai/dask_ideas.md

## dask_ideas.md

      
Dask Ideas


dd.Series is linearly partitioned into smaller pd.Series
dd.Series.apply and dd.Series.map behave similarly: both execute a function on each element
dd.Series.map_partitions maps a function over the partitions of the series, passing each partition as a pd.Series
dd.Series.reduction applies chunk to each partition, then applies aggregate to the concatenated chunk results
pd.Series is typically backed by a NumPy array with additional metadata such as the index and name

Series Reduction

When chunk returns a pd.Series, aggregate receives a pd.DataFrame with one row per partition.
The columns of this DataFrame are the index labels of the Series that chunk returned.
Visualize this as transposing each Series from a column vector into a row vector
of the DataFrame. Remember that this still happens if your chunk function is just the
identity function.
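
The transposition can be emulated with plain pandas. The sketch below is not dask's internal code: the two partitions, the chunk and aggregate functions, and the "total" and "count" labels are all hypothetical stand-ins chosen to show the shape of what aggregate would receive.

```python
import pandas as pd

# Two hypothetical partitions of one Series, standing in for what dask would hold.
partitions = [
    pd.Series([1, 2, 3], index=[0, 1, 2]),
    pd.Series([4, 5, 6], index=[3, 4, 5]),
]


def chunk(part):
    # Returns a pd.Series of summary values, indexed by label.
    return pd.Series({"total": part.sum(), "count": part.count()})


# Each Series-valued chunk result becomes one row; its index labels become columns.
rows = [chunk(part).to_frame().T for part in partitions]
collected = pd.concat(rows, ignore_index=True)


def aggregate(frame):
    # frame is a DataFrame with one row per partition.
    return frame["total"].sum() / frame["count"].sum()


mean = aggregate(collected)

# With the identity chunk, the original index labels become the columns,
# and labels missing from a given partition show up as NaN in its row.
identity_rows = pd.concat(
    [part.to_frame().T for part in partitions], ignore_index=True
)

print(collected)
print(mean)
print(identity_rows)
```

In the identity case the resulting frame has one column per element of the whole Series, so it can get very wide; returning a small summary Series from chunk keeps the frame passed to aggregate compact.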