@detrout
Created July 21, 2017 22:39
Dask doesn't support the following argument(s).
* buf
* columns
* col_space
* header
* index
* na_rep
* formatters
* float_format
* sparsify
* index_names
* justify
* line_width
* max_cols
* show_dimensions
.. py:method:: DataFrame.to_timestamp(freq=None, how='start', axis=0)
:module: dask.dataframe
Cast to DatetimeIndex of timestamps, at *beginning* of period
:Parameters:
**freq** : string, default frequency of PeriodIndex
Desired frequency
**how** : {'s', 'e', 'start', 'end'}
Convention for converting period to timestamp; start of period
vs. end
**axis** : {0 or 'index', 1 or 'columns'}, default 0
The axis to convert (the index by default)
**copy** : boolean, default True
If False, the underlying input data is not copied
:Returns:
**df** : DataFrame with DatetimeIndex
.. rubric:: Notes
Dask doesn't support the following argument(s).
* copy
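As a minimal sketch of the ``how`` convention, the snippet below uses plain pandas, from which this docstring is derived. The frame ``pdf`` and its ``sales`` column are made up for illustration, and the Dask method is assumed to behave the same way apart from the unsupported ``copy`` argument.

```python
import pandas as pd

# Hypothetical monthly data indexed by a PeriodIndex.
periods = pd.period_range("2017-01", periods=3, freq="M")
pdf = pd.DataFrame({"sales": [10, 20, 30]}, index=periods)

# how='start' maps each period to its first timestamp.
starts = pdf.to_timestamp(how="start")

# how='end' maps each period to a timestamp on its last day.
ends = pdf.to_timestamp(how="end")

print(starts.index[0])
print(ends.index[0].normalize())
```

The exact time-of-day component returned for ``how='end'`` has varied between pandas versions, which is why the sketch normalizes it to midnight before printing.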
.. py:method:: DataFrame.truediv(other, axis='columns', level=None, fill_value=None)
:module: dask.dataframe
Floating division of dataframe and other, element-wise (binary operator `truediv`).
Equivalent to ``dataframe / other``, but with support to substitute a fill_value for
missing data in one of the inputs.
:Parameters:
**other** : Series, DataFrame, or constant
**axis** : {0, 1, 'index', 'columns'}
For Series input, axis to match Series index on
**fill_value** : None or float value, default None
Fill missing (NaN) values with this value. If both DataFrame
locations are missing, the result will be missing
**level** : int or name
Broadcast across a level, matching Index values on the
passed MultiIndex level
:Returns:
**result** : DataFrame
.. seealso::
:obj:`DataFrame.rtruediv`
.. rubric:: Notes
Mismatched indices will be unioned together
.. py:attribute:: DataFrame.values
:module: dask.dataframe
Return a dask.array of the values of this dataframe
Warning: This creates a dask.array without precise shape information.
Operations that depend on shape information, like slicing or reshaping,
will not work.
.. py:method:: DataFrame.var(axis=None, skipna=True, ddof=1, split_every=False)
:module: dask.dataframe
Return unbiased variance over requested axis.
Normalized by N-1 by default. This can be changed using the ``ddof`` argument.
:Parameters:
**axis** : {index (0), columns (1)}
**skipna** : boolean, default True
Exclude NA/null values. If an entire row/column is NA, the result
will be NA
**level** : int or level name, default None
If the axis is a MultiIndex (hierarchical), count along a
particular level, collapsing into a Series
**ddof** : int, default 1
Delta degrees of freedom. The divisor used in the calculation is
N - ddof, where N is the number of elements.
**numeric_only** : boolean, default None
Include only float, int, boolean columns. If None, will attempt to use
everything, then use only numeric data. Not implemented for Series.
:Returns:
**var** : Series or DataFrame (if level specified)
.. rubric:: Notes
Dask doesn't support the following argument(s).
* level
* numeric_only
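The effect of ``ddof`` and ``skipna`` can be illustrated with plain pandas; the frame ``pdf`` is a made-up example, and the Dask method is assumed to produce the same values once computed.

```python
import numpy as np
import pandas as pd

pdf = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "y": [1.0, np.nan, 3.0, 5.0]})

sample_var = pdf.var()            # ddof=1: divide by N - 1
population_var = pdf.var(ddof=0)  # ddof=0: divide by N
with_na = pdf.var(skipna=False)   # a column containing NaN yields NaN

print(sample_var)
print(population_var)
print(with_na)
```

For column ``x`` the squared deviations from the mean 2.5 sum to 5.0, so the sample variance is 5.0 / 3 and the population variance is 5.0 / 4.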
.. py:method:: DataFrame.visualize(filename='mydask', format=None, optimize_graph=False, **kwargs)
:module: dask.dataframe
Render the computation of this object's task graph using graphviz.
Requires ``graphviz`` to be installed.
:Parameters:
**filename** : str or None, optional
The name (without an extension) of the file to write to disk. If
`filename` is None, no file will be written, and we communicate
with dot using only pipes.
**format** : {'png', 'pdf', 'dot', 'svg', 'jpeg', 'jpg'}, optional
Format in which to write output file. Default is 'png'.
**optimize_graph** : bool, optional
If True, the graph is optimized before rendering. Otherwise,
the graph is displayed as is. Default is False.
**\*\*kwargs**
Additional keyword arguments to forward to ``to_graphviz``.
:Returns:
**result** : IPython.display.Image, IPython.display.SVG, or None
See dask.dot.dot_graph for more information.
.. seealso::
:obj:`dask.base.visualize`, :obj:`dask.dot.dot_graph`
.. rubric:: Notes
For more information on optimization see here:
http://dask.pydata.org/en/latest/optimize.html
.. py:method:: DataFrame.where(cond, other=nan)
:module: dask.dataframe
Return an object of same shape as self and whose corresponding
entries are from self where cond is True and otherwise are from
other.
:Parameters:
**cond** : boolean NDFrame, array-like, or callable
If cond is callable, it is computed on the NDFrame and
should return boolean NDFrame or array. The callable must
not change input NDFrame (though pandas doesn't check it).
.. versionadded:: 0.18.1
A callable can be used as cond.
**other** : scalar, NDFrame, or callable
If other is callable, it is computed on the NDFrame and
should return scalar or NDFrame. The callable must not
change input NDFrame (though pandas doesn't check it).
.. versionadded:: 0.18.1
A callable can be used as other.
**inplace** : boolean, default False
Whether to perform the operation in place on the data
**axis** : alignment axis if needed, default None
**level** : alignment level if needed, default None
**try_cast** : boolean, default False
Try to cast the result back to the input type (if possible).
**raise_on_error** : boolean, default True
Whether to raise on invalid data types (e.g. calling ``where`` on
strings)
:Returns:
**wh** : same type as caller
.. seealso::
:func:`DataFrame.mask`
.. rubric:: Notes
The where method is an application of the if-then idiom. For each
element in the calling DataFrame, if ``cond`` is ``True`` the
element is used; otherwise the corresponding element from the DataFrame
``other`` is used.
The signature for :func:`DataFrame.where` differs from
:func:`numpy.where`. Roughly ``df1.where(m, df2)`` is equivalent to
``np.where(m, df1, df2)``.
For further details and examples see the ``where`` documentation in
:ref:`indexing <indexing.where_mask>`.
.. rubric:: Examples
>>> s = pd.Series(range(5))  # doctest: +SKIP
>>> s.where(s > 0)  # doctest: +SKIP
0    NaN
1    1.0
2    2.0
3    3.0
4    4.0
>>> df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=['A', 'B'])  # doctest: +SKIP
>>> m = df % 3 == 0  # doctest: +SKIP
>>> df.where(m, -df)  # doctest: +SKIP
   A  B
0  0 -1
1 -2  3
2 -4 -5
3  6 -7
4 -8  9
>>> df.where(m, -df) == np.where(m, df, -df)  # doctest: +SKIP
      A     B
0  True  True
1  True  True
2  True  True
3  True  True
4  True  True
>>> df.where(m, -df) == df.mask(~m, -df)  # doctest: +SKIP
      A     B
0  True  True
1  True  True
2  True  True
3  True  True
4  True  True
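The callable forms of ``cond`` and ``other`` described in the parameters are not covered by the examples above. A minimal sketch with a plain pandas Series follows; the data is illustrative, and whether the Dask method accepts callables in the same way is an assumption not confirmed here.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# Both callables receive the Series itself: keep values greater than 2
# and replace the rest with their negation.
result = s.where(lambda x: x > 2, lambda x: -x)

print(result.tolist())
```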