{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"___\n",
"\n",
"<a href='http://www.pieriandata.com'><img src='../Pierian_Data_Logo.png'/></a>\n",
"___\n",
"<center><em>Copyright by Pierian Data Inc.</em></center>\n",
"<center><em>For more information, visit us at <a href='http://www.pieriandata.com'>www.pieriandata.com</a></em></center>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Series"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first main data type we will learn about in pandas is the Series. Let's import pandas and explore the Series object.\n",
"\n",
"A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label instead of just a number location. It also isn't limited to numeric data; it can hold any arbitrary Python object.\n",
"\n",
"Let's explore this concept through some examples:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Creating a Series from Python Objects"
]
},
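{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, a Series can be created from a Python list, a NumPy array, or a dictionary (a quick sketch with illustrative values, using the `np` and `pd` imports above):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative data (not from the lecture itself)\n",
"labels = ['a', 'b', 'c']\n",
"my_list = [10, 20, 30]\n",
"arr = np.array([10, 20, 30])\n",
"d = {'a': 10, 'b': 20, 'c': 30}\n",
"\n",
"# From a list: default RangeIndex, or custom labels via index=\n",
"print(pd.Series(data=my_list))\n",
"print(pd.Series(data=my_list, index=labels))\n",
"\n",
"# From a NumPy array, and from a dict (keys become the index)\n",
"print(pd.Series(arr, labels))\n",
"print(pd.Series(d))"
]
},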
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Help on class Series in module pandas.core.series:\n",
"\n",
"class Series(pandas.core.base.IndexOpsMixin, pandas.core.generic.NDFrame)\n",
" | One-dimensional ndarray with axis labels (including time series).\n",
" | \n",
" | Labels need not be unique but must be a hashable type. The object\n",
" | supports both integer- and label-based indexing and provides a host of\n",
" | methods for performing operations involving the index. Statistical\n",
" | methods from ndarray have been overridden to automatically exclude\n",
" | missing data (currently represented as NaN).\n",
" | \n",
" | Operations between Series (+, -, /, *, **) align values based on their\n",
" | associated index values-- they need not be the same length. The result\n",
" | index will be the sorted union of the two indexes.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | data : array-like, Iterable, dict, or scalar value\n",
" | Contains data stored in Series.\n",
" | \n",
" | .. versionchanged :: 0.23.0\n",
" | If data is a dict, argument order is maintained for Python 3.6\n",
" | and later.\n",
" | \n",
" | index : array-like or Index (1d)\n",
" | Values must be hashable and have the same length as `data`.\n",
" | Non-unique index values are allowed. Will default to\n",
" | RangeIndex (0, 1, 2, ..., n) if not provided. If both a dict and index\n",
" | sequence are used, the index will override the keys found in the\n",
" | dict.\n",
" | dtype : str, numpy.dtype, or ExtensionDtype, optional\n",
" | dtype for the output Series. If not specified, this will be\n",
" | inferred from `data`.\n",
" | See the :ref:`user guide <basics.dtypes>` for more usages.\n",
" | copy : bool, default False\n",
" | Copy input data.\n",
" | \n",
" | Method resolution order:\n",
" | Series\n",
" | pandas.core.base.IndexOpsMixin\n",
" | pandas.core.generic.NDFrame\n",
" | pandas.core.base.PandasObject\n",
" | pandas.core.base.StringMixin\n",
" | pandas.core.accessor.DirNamesMixin\n",
" | pandas.core.base.SelectionMixin\n",
" | builtins.object\n",
" | \n",
" | Methods defined here:\n",
" | \n",
" | __add__(left, right)\n",
" | \n",
" | __and__(self, other)\n",
" | \n",
" | __array__(self, dtype=None)\n",
" | Return the values as a NumPy array.\n",
" | \n",
" | Users should not call this directly. Rather, it is invoked by\n",
" | :func:`numpy.array` and :func:`numpy.asarray`.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | dtype : str or numpy.dtype, optional\n",
" | The dtype to use for the resulting NumPy array. By default,\n",
" | the dtype is inferred from the data.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | numpy.ndarray\n",
" | The values in the series converted to a :class:`numpy.ndarary`\n",
" | with the specified `dtype`.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | pandas.array : Create a new array from data.\n",
" | Series.array : Zero-copy view to the array backing the Series.\n",
" | Series.to_numpy : Series method for similar behavior.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> ser = pd.Series([1, 2, 3])\n",
" | >>> np.asarray(ser)\n",
" | array([1, 2, 3])\n",
" | \n",
" | For timezone-aware data, the timezones may be retained with\n",
" | ``dtype='object'``\n",
" | \n",
" | >>> tzser = pd.Series(pd.date_range('2000', periods=2, tz=\"CET\"))\n",
" | >>> np.asarray(tzser, dtype=\"object\")\n",
" | array([Timestamp('2000-01-01 00:00:00+0100', tz='CET', freq='D'),\n",
" | Timestamp('2000-01-02 00:00:00+0100', tz='CET', freq='D')],\n",
" | dtype=object)\n",
" | \n",
" | Or the values may be localized to UTC and the tzinfo discared with\n",
" | ``dtype='datetime64[ns]'``\n",
" | \n",
" | >>> np.asarray(tzser, dtype=\"datetime64[ns]\") # doctest: +ELLIPSIS\n",
" | array(['1999-12-31T23:00:00.000000000', ...],\n",
" | dtype='datetime64[ns]')\n",
" | \n",
" | __array_prepare__(self, result, context=None)\n",
" | Gets called prior to a ufunc.\n",
" | \n",
" | __array_wrap__(self, result, context=None)\n",
" | Gets called after a ufunc.\n",
" | \n",
" | __div__ = __truediv__(left, right)\n",
" | \n",
" | __divmod__(left, right)\n",
" | \n",
" | __eq__(self, other, axis=None)\n",
" | \n",
" | __float__(self)\n",
" | \n",
" | __floordiv__(left, right)\n",
" | \n",
" | __ge__(self, other, axis=None)\n",
" | \n",
" | __getitem__(self, key)\n",
" | \n",
" | __gt__(self, other, axis=None)\n",
" | \n",
" | __iadd__(self, other)\n",
" | \n",
" | __iand__(self, other)\n",
" | \n",
" | __ifloordiv__(self, other)\n",
" | \n",
" | __imod__(self, other)\n",
" | \n",
" | __imul__(self, other)\n",
" | \n",
" | __init__(self, data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)\n",
" | Initialize self. See help(type(self)) for accurate signature.\n",
" | \n",
" | __int__(self)\n",
" | \n",
" | __ior__(self, other)\n",
" | \n",
" | __ipow__(self, other)\n",
" | \n",
" | __isub__(self, other)\n",
" | \n",
" | __itruediv__(self, other)\n",
" | \n",
" | __ixor__(self, other)\n",
" | \n",
" | __le__(self, other, axis=None)\n",
" | \n",
" | __len__(self)\n",
" | Return the length of the Series.\n",
" | \n",
" | __long__ = __int__(self)\n",
" | \n",
" | __lt__(self, other, axis=None)\n",
" | \n",
" | __matmul__(self, other)\n",
" | Matrix multiplication using binary `@` operator in Python>=3.5.\n",
" | \n",
" | __mod__(left, right)\n",
" | \n",
" | __mul__(left, right)\n",
" | \n",
" | __ne__(self, other, axis=None)\n",
" | \n",
" | __or__(self, other)\n",
" | \n",
" | __pow__(left, right)\n",
" | \n",
" | __radd__(left, right)\n",
" | \n",
" | __rand__(self, other)\n",
" | \n",
" | __rdiv__ = __rtruediv__(left, right)\n",
" | \n",
" | __rdivmod__(left, right)\n",
" | \n",
" | __rfloordiv__(left, right)\n",
" | \n",
" | __rmatmul__(self, other)\n",
" | Matrix multiplication using binary `@` operator in Python>=3.5.\n",
" | \n",
" | __rmod__(left, right)\n",
" | \n",
" | __rmul__(left, right)\n",
" | \n",
" | __ror__(self, other)\n",
" | \n",
" | __rpow__(left, right)\n",
" | \n",
" | __rsub__(left, right)\n",
" | \n",
" | __rtruediv__(left, right)\n",
" | \n",
" | __rxor__(self, other)\n",
" | \n",
" | __setitem__(self, key, value)\n",
" | \n",
" | __sub__(left, right)\n",
" | \n",
" | __truediv__(left, right)\n",
" | \n",
" | __unicode__(self)\n",
" | Return a string representation for a particular DataFrame.\n",
" | \n",
" | Invoked by unicode(df) in py2 only. Yields a Unicode String in both\n",
" | py2/py3.\n",
" | \n",
" | __xor__(self, other)\n",
" | \n",
" | add(self, other, level=None, fill_value=None, axis=0)\n",
" | Addition of series and other, element-wise (binary operator `add`).\n",
" | \n",
" | Equivalent to ``series + other``, but with support to substitute a fill_value for\n",
" | missing data in one of the inputs.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series or scalar value\n",
" | fill_value : None or float value, default None (NaN)\n",
" | Fill existing missing (NaN) values, and any new element needed for\n",
" | successful Series alignment, with this value before computation.\n",
" | If data in both corresponding Series locations is missing\n",
" | the result will be missing\n",
" | level : int or name\n",
" | Broadcast across a level, matching Index values on the\n",
" | passed MultiIndex level\n",
" | \n",
" | Returns\n",
" | -------\n",
" | result : Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.radd\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
" | >>> a\n",
" | a 1.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d NaN\n",
" | dtype: float64\n",
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
" | >>> b\n",
" | a 1.0\n",
" | b NaN\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | >>> a.add(b, fill_value=0)\n",
" | a 2.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | \n",
" | agg = aggregate(self, func, axis=0, *args, **kwargs)\n",
" | \n",
" | aggregate(self, func, axis=0, *args, **kwargs)\n",
" | Aggregate using one or more operations over the specified axis.\n",
" | \n",
" | .. versionadded:: 0.20.0\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | func : function, str, list or dict\n",
" | Function to use for aggregating the data. If a function, must either\n",
" | work when passed a Series or when passed to Series.apply.\n",
" | \n",
" | Accepted combinations are:\n",
" | \n",
" | - function\n",
" | - string function name\n",
" | - list of functions and/or function names, e.g. ``[np.sum, 'mean']``\n",
" | - dict of axis labels -> functions, function names or list of such.\n",
" | axis : {0 or 'index'}\n",
" | Parameter needed for compatibility with DataFrame.\n",
" | *args\n",
" | Positional arguments to pass to `func`.\n",
" | **kwargs\n",
" | Keyword arguments to pass to `func`.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | DataFrame, Series or scalar\n",
" | if DataFrame.agg is called with a single function, returns a Series\n",
" | if DataFrame.agg is called with several functions, returns a DataFrame\n",
" | if Series.agg is called with single function, returns a scalar\n",
" | if Series.agg is called with several functions, returns a Series\n",
" | \n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.apply : Invoke function on a Series.\n",
" | Series.transform : Transform function producing a Series with like indexes.\n",
" | \n",
" | \n",
" | Notes\n",
" | -----\n",
" | `agg` is an alias for `aggregate`. Use the alias.\n",
" | \n",
" | A passed user-defined-function will be passed a Series for evaluation.\n",
" | \n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s = pd.Series([1, 2, 3, 4])\n",
" | >>> s\n",
" | 0 1\n",
" | 1 2\n",
" | 2 3\n",
" | 3 4\n",
" | dtype: int64\n",
" | \n",
" | >>> s.agg('min')\n",
" | 1\n",
" | \n",
" | >>> s.agg(['min', 'max'])\n",
" | min 1\n",
" | max 4\n",
" | dtype: int64\n",
" | \n",
" | align(self, other, join='outer', axis=None, level=None, copy=True, fill_value=None, method=None, limit=None, fill_axis=0, broadcast_axis=None)\n",
" | Align two objects on their axes with the\n",
" | specified join method for each axis Index.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : DataFrame or Series\n",
" | join : {'outer', 'inner', 'left', 'right'}, default 'outer'\n",
" | axis : allowed axis of the other object, default None\n",
" | Align on index (0), columns (1), or both (None)\n",
" | level : int or level name, default None\n",
" | Broadcast across a level, matching Index values on the\n",
" | passed MultiIndex level\n",
" | copy : boolean, default True\n",
" | Always returns new objects. If copy=False and no reindexing is\n",
" | required then original objects are returned.\n",
" | fill_value : scalar, default np.NaN\n",
" | Value to use for missing values. Defaults to NaN, but can be any\n",
" | \"compatible\" value\n",
" | method : {'backfill', 'bfill', 'pad', 'ffill', None}, default None\n",
" | Method to use for filling holes in reindexed Series\n",
" | pad / ffill: propagate last valid observation forward to next valid\n",
" | backfill / bfill: use NEXT valid observation to fill gap\n",
" | limit : int, default None\n",
" | If method is specified, this is the maximum number of consecutive\n",
" | NaN values to forward/backward fill. In other words, if there is\n",
" | a gap with more than this number of consecutive NaNs, it will only\n",
" | be partially filled. If method is not specified, this is the\n",
" | maximum number of entries along the entire axis where NaNs will be\n",
" | filled. Must be greater than 0 if not None.\n",
" | fill_axis : {0 or 'index'}, default 0\n",
" | Filling axis, method and limit\n",
" | broadcast_axis : {0 or 'index'}, default None\n",
" | Broadcast values along this axis, if aligning two objects of\n",
" | different dimensions\n",
" | \n",
" | Returns\n",
" | -------\n",
" | (left, right) : (Series, type of other)\n",
" | Aligned objects\n",
" | \n",
" | all(self, axis=0, bool_only=None, skipna=True, level=None, **kwargs)\n",
" | Return whether all elements are True, potentially over an axis.\n",
" | \n",
" | Returns True unless there at least one element within a series or\n",
" | along a Dataframe axis that is False or equivalent (e.g. zero or\n",
" | empty).\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | axis : {0 or 'index', 1 or 'columns', None}, default 0\n",
" | Indicate which axis or axes should be reduced.\n",
" | \n",
" | * 0 / 'index' : reduce the index, return a Series whose index is the\n",
" | original column labels.\n",
" | * 1 / 'columns' : reduce the columns, return a Series whose index is the\n",
" | original index.\n",
" | * None : reduce all axes, return a scalar.\n",
" | \n",
" | bool_only : bool, default None\n",
" | Include only boolean columns. If None, will attempt to use everything,\n",
" | then use only boolean data. Not implemented for Series.\n",
" | skipna : bool, default True\n",
" | Exclude NA/null values. If the entire row/column is NA and skipna is\n",
" | True, then the result will be True, as for an empty row/column.\n",
" | If skipna is False, then NA are treated as True, because these are not\n",
" | equal to zero.\n",
" | level : int or level name, default None\n",
" | If the axis is a MultiIndex (hierarchical), count along a\n",
" | particular level, collapsing into a scalar.\n",
" | **kwargs : any, default None\n",
" | Additional keywords have no effect but might be accepted for\n",
" | compatibility with NumPy.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | scalar or Series\n",
" | If level is specified, then, Series is returned; otherwise, scalar\n",
" | is returned.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.all : Return True if all elements are True.\n",
" | DataFrame.any : Return True if one (or more) elements are True.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | **Series**\n",
" | \n",
" | >>> pd.Series([True, True]).all()\n",
" | True\n",
" | >>> pd.Series([True, False]).all()\n",
" | False\n",
" | >>> pd.Series([]).all()\n",
" | True\n",
" | >>> pd.Series([np.nan]).all()\n",
" | True\n",
" | >>> pd.Series([np.nan]).all(skipna=False)\n",
" | True\n",
" | \n",
" | **DataFrames**\n",
" | \n",
" | Create a dataframe from a dictionary.\n",
" | \n",
" | >>> df = pd.DataFrame({'col1': [True, True], 'col2': [True, False]})\n",
" | >>> df\n",
" | col1 col2\n",
" | 0 True True\n",
" | 1 True False\n",
" | \n",
" | Default behaviour checks if column-wise values all return True.\n",
" | \n",
" | >>> df.all()\n",
" | col1 True\n",
" | col2 False\n",
" | dtype: bool\n",
" | \n",
" | Specify ``axis='columns'`` to check if row-wise values all return True.\n",
" | \n",
" | >>> df.all(axis='columns')\n",
" | 0 True\n",
" | 1 False\n",
" | dtype: bool\n",
" | \n",
" | Or ``axis=None`` for whether every value is True.\n",
" | \n",
" | >>> df.all(axis=None)\n",
" | False\n",
" | \n",
" | any(self, axis=0, bool_only=None, skipna=True, level=None, **kwargs)\n",
" | Return whether any element is True, potentially over an axis.\n",
" | \n",
" | Returns False unless there at least one element within a series or\n",
" | along a Dataframe axis that is True or equivalent (e.g. non-zero or\n",
" | non-empty).\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | axis : {0 or 'index', 1 or 'columns', None}, default 0\n",
" | Indicate which axis or axes should be reduced.\n",
" | \n",
" | * 0 / 'index' : reduce the index, return a Series whose index is the\n",
" | original column labels.\n",
" | * 1 / 'columns' : reduce the columns, return a Series whose index is the\n",
" | original index.\n",
" | * None : reduce all axes, return a scalar.\n",
" | \n",
" | bool_only : bool, default None\n",
" | Include only boolean columns. If None, will attempt to use everything,\n",
" | then use only boolean data. Not implemented for Series.\n",
" | skipna : bool, default True\n",
" | Exclude NA/null values. If the entire row/column is NA and skipna is\n",
" | True, then the result will be False, as for an empty row/column.\n",
" | If skipna is False, then NA are treated as True, because these are not\n",
" | equal to zero.\n",
" | level : int or level name, default None\n",
" | If the axis is a MultiIndex (hierarchical), count along a\n",
" | particular level, collapsing into a scalar.\n",
" | **kwargs : any, default None\n",
" | Additional keywords have no effect but might be accepted for\n",
" | compatibility with NumPy.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | scalar or Series\n",
" | If level is specified, then, Series is returned; otherwise, scalar\n",
" | is returned.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | numpy.any : Numpy version of this method.\n",
" | Series.any : Return whether any element is True.\n",
" | Series.all : Return whether all elements are True.\n",
" | DataFrame.any : Return whether any element is True over requested axis.\n",
" | DataFrame.all : Return whether all elements are True over requested axis.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | **Series**\n",
" | \n",
" | For Series input, the output is a scalar indicating whether any element\n",
" | is True.\n",
" | \n",
" | >>> pd.Series([False, False]).any()\n",
" | False\n",
" | >>> pd.Series([True, False]).any()\n",
" | True\n",
" | >>> pd.Series([]).any()\n",
" | False\n",
" | >>> pd.Series([np.nan]).any()\n",
" | False\n",
" | >>> pd.Series([np.nan]).any(skipna=False)\n",
" | True\n",
" | \n",
" | **DataFrame**\n",
" | \n",
" | Whether each column contains at least one True element (the default).\n",
" | \n",
" | >>> df = pd.DataFrame({\"A\": [1, 2], \"B\": [0, 2], \"C\": [0, 0]})\n",
" | >>> df\n",
" | A B C\n",
" | 0 1 0 0\n",
" | 1 2 2 0\n",
" | \n",
" | >>> df.any()\n",
" | A True\n",
" | B True\n",
" | C False\n",
" | dtype: bool\n",
" | \n",
" | Aggregating over the columns.\n",
" | \n",
" | >>> df = pd.DataFrame({\"A\": [True, False], \"B\": [1, 2]})\n",
" | >>> df\n",
" | A B\n",
" | 0 True 1\n",
" | 1 False 2\n",
" | \n",
" | >>> df.any(axis='columns')\n",
" | 0 True\n",
" | 1 True\n",
" | dtype: bool\n",
" | \n",
" | >>> df = pd.DataFrame({\"A\": [True, False], \"B\": [1, 0]})\n",
" | >>> df\n",
" | A B\n",
" | 0 True 1\n",
" | 1 False 0\n",
" | \n",
" | >>> df.any(axis='columns')\n",
" | 0 True\n",
" | 1 False\n",
" | dtype: bool\n",
" | \n",
" | Aggregating over the entire DataFrame with ``axis=None``.\n",
" | \n",
" | >>> df.any(axis=None)\n",
" | True\n",
" | \n",
" | `any` for an empty DataFrame is an empty Series.\n",
" | \n",
" | >>> pd.DataFrame([]).any()\n",
" | Series([], dtype: bool)\n",
" | \n",
" | append(self, to_append, ignore_index=False, verify_integrity=False)\n",
" | Concatenate two or more Series.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | to_append : Series or list/tuple of Series\n",
" | ignore_index : boolean, default False\n",
" | If True, do not use the index labels.\n",
" | \n",
" | .. versionadded:: 0.19.0\n",
" | \n",
" | verify_integrity : boolean, default False\n",
" | If True, raise Exception on creating index with duplicates\n",
" | \n",
" | Returns\n",
" | -------\n",
" | appended : Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | concat : General function to concatenate DataFrame, Series\n",
" | or Panel objects.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | Iteratively appending to a Series can be more computationally intensive\n",
" | than a single concatenate. A better solution is to append values to a\n",
" | list and then concatenate the list with the original Series all at\n",
" | once.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s1 = pd.Series([1, 2, 3])\n",
" | >>> s2 = pd.Series([4, 5, 6])\n",
" | >>> s3 = pd.Series([4, 5, 6], index=[3,4,5])\n",
" | >>> s1.append(s2)\n",
" | 0 1\n",
" | 1 2\n",
" | 2 3\n",
" | 0 4\n",
" | 1 5\n",
" | 2 6\n",
" | dtype: int64\n",
" | \n",
" | >>> s1.append(s3)\n",
" | 0 1\n",
" | 1 2\n",
" | 2 3\n",
" | 3 4\n",
" | 4 5\n",
" | 5 6\n",
" | dtype: int64\n",
" | \n",
" | With `ignore_index` set to True:\n",
" | \n",
" | >>> s1.append(s2, ignore_index=True)\n",
" | 0 1\n",
" | 1 2\n",
" | 2 3\n",
" | 3 4\n",
" | 4 5\n",
" | 5 6\n",
" | dtype: int64\n",
" | \n",
" | With `verify_integrity` set to True:\n",
" | \n",
" | >>> s1.append(s2, verify_integrity=True)\n",
" | Traceback (most recent call last):\n",
" | ...\n",
" | ValueError: Indexes have overlapping values: [0, 1, 2]\n",
" | \n",
" | apply(self, func, convert_dtype=True, args=(), **kwds)\n",
" | Invoke function on values of Series.\n",
" | \n",
" | Can be ufunc (a NumPy function that applies to the entire Series)\n",
" | or a Python function that only works on single values.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | func : function\n",
" | Python function or NumPy ufunc to apply.\n",
" | convert_dtype : bool, default True\n",
" | Try to find better dtype for elementwise function results. If\n",
" | False, leave as dtype=object.\n",
" | args : tuple\n",
" | Positional arguments passed to func after the series value.\n",
" | **kwds\n",
" | Additional keyword arguments passed to func.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series or DataFrame\n",
" | If func returns a Series object the result will be a DataFrame.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.map: For element-wise operations.\n",
" | Series.agg: Only perform aggregating type operations.\n",
" | Series.transform: Only perform transforming type operations.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | Create a series with typical summer temperatures for each city.\n",
" | \n",
" | >>> s = pd.Series([20, 21, 12],\n",
" | ... index=['London', 'New York', 'Helsinki'])\n",
" | >>> s\n",
" | London 20\n",
" | New York 21\n",
" | Helsinki 12\n",
" | dtype: int64\n",
" | \n",
" | Square the values by defining a function and passing it as an\n",
" | argument to ``apply()``.\n",
" | \n",
" | >>> def square(x):\n",
" | ... return x ** 2\n",
" | >>> s.apply(square)\n",
" | London 400\n",
" | New York 441\n",
" | Helsinki 144\n",
" | dtype: int64\n",
" | \n",
" | Square the values by passing an anonymous function as an\n",
" | argument to ``apply()``.\n",
" | \n",
" | >>> s.apply(lambda x: x ** 2)\n",
" | London 400\n",
" | New York 441\n",
" | Helsinki 144\n",
" | dtype: int64\n",
" | \n",
" | Define a custom function that needs additional positional\n",
" | arguments and pass these additional arguments using the\n",
" | ``args`` keyword.\n",
" | \n",
" | >>> def subtract_custom_value(x, custom_value):\n",
" | ... return x - custom_value\n",
" | \n",
" | >>> s.apply(subtract_custom_value, args=(5,))\n",
" | London 15\n",
" | New York 16\n",
" | Helsinki 7\n",
" | dtype: int64\n",
" | \n",
" | Define a custom function that takes keyword arguments\n",
" | and pass these arguments to ``apply``.\n",
" | \n",
" | >>> def add_custom_values(x, **kwargs):\n",
" | ... for month in kwargs:\n",
" | ... x += kwargs[month]\n",
" | ... return x\n",
" | \n",
" | >>> s.apply(add_custom_values, june=30, july=20, august=25)\n",
" | London 95\n",
" | New York 96\n",
" | Helsinki 87\n",
" | dtype: int64\n",
" | \n",
" | Use a function from the Numpy library.\n",
" | \n",
" | >>> s.apply(np.log)\n",
" | London 2.995732\n",
" | New York 3.044522\n",
" | Helsinki 2.484907\n",
" | dtype: float64\n",
" | \n",
" | argmax = idxmax(self, axis=0, skipna=True, *args, **kwargs)\n",
" | Return the row label of the maximum value.\n",
" | \n",
" | .. deprecated:: 0.21.0\n",
" | \n",
" | The current behaviour of 'Series.argmax' is deprecated, use 'idxmax'\n",
" | instead.\n",
" | The behavior of 'argmax' will be corrected to return the positional\n",
" | maximum in the future. For now, use 'series.values.argmax' or\n",
" | 'np.argmax(np.array(values))' to get the position of the maximum\n",
" | row.\n",
" | \n",
" | If multiple values equal the maximum, the first row label with that\n",
" | value is returned.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | skipna : boolean, default True\n",
" | Exclude NA/null values. If the entire Series is NA, the result\n",
" | will be NA.\n",
" | axis : int, default 0\n",
" | For compatibility with DataFrame.idxmax. Redundant for application\n",
" | on Series.\n",
" | *args, **kwargs\n",
" | Additional keywords have no effect but might be accepted\n",
" | for compatibility with NumPy.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | idxmax : Index of maximum of values.\n",
" | \n",
" | Raises\n",
" | ------\n",
" | ValueError\n",
" | If the Series is empty.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | numpy.argmax : Return indices of the maximum values\n",
" | along the given axis.\n",
" | DataFrame.idxmax : Return index of first occurrence of maximum\n",
" | over requested axis.\n",
" | Series.idxmin : Return index *label* of the first occurrence\n",
" | of minimum of values.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | This method is the Series version of ``ndarray.argmax``. This method\n",
" | returns the label of the maximum, while ``ndarray.argmax`` returns\n",
" | the position. To get the position, use ``series.values.argmax()``.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s = pd.Series(data=[1, None, 4, 3, 4],\n",
" | ... index=['A', 'B', 'C', 'D', 'E'])\n",
" | >>> s\n",
" | A 1.0\n",
" | B NaN\n",
" | C 4.0\n",
" | D 3.0\n",
" | E 4.0\n",
" | dtype: float64\n",
" | \n",
" | >>> s.idxmax()\n",
" | 'C'\n",
" | \n",
" | If `skipna` is False and there is an NA value in the data,\n",
" | the function returns ``nan``.\n",
" | \n",
" | >>> s.idxmax(skipna=False)\n",
" | nan\n",
" | \n",
" | argmin = idxmin(self, axis=0, skipna=True, *args, **kwargs)\n",
" | Return the row label of the minimum value.\n",
" | \n",
" | .. deprecated:: 0.21.0\n",
" | \n",
" | The current behaviour of 'Series.argmin' is deprecated, use 'idxmin'\n",
" | instead.\n",
" | The behavior of 'argmin' will be corrected to return the positional\n",
" | minimum in the future. For now, use 'series.values.argmin' or\n",
" | 'np.argmin(np.array(values))' to get the position of the minimum\n",
|
" | row.\n",
|
||
|
" | \n",
|
||
|
" | If multiple values equal the minimum, the first row label with that\n",
|
||
|
" | value is returned.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | skipna : boolean, default True\n",
|
||
|
" | Exclude NA/null values. If the entire Series is NA, the result\n",
|
||
|
" | will be NA.\n",
|
||
|
" | axis : int, default 0\n",
|
||
|
" | For compatibility with DataFrame.idxmin. Redundant for application\n",
|
||
|
" | on Series.\n",
|
||
|
" | *args, **kwargs\n",
|
||
|
" | Additional keywords have no effect but might be accepted\n",
|
||
|
" | for compatibility with NumPy.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | idxmin : Index of minimum of values.\n",
|
||
|
" | \n",
|
||
|
" | Raises\n",
|
||
|
" | ------\n",
|
||
|
" | ValueError\n",
|
||
|
" | If the Series is empty.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | numpy.argmin : Return indices of the minimum values\n",
|
||
|
" | along the given axis.\n",
|
||
|
" | DataFrame.idxmin : Return index of first occurrence of minimum\n",
|
||
|
" | over requested axis.\n",
|
||
|
" | Series.idxmax : Return index *label* of the first occurrence\n",
|
||
|
" | of maximum of values.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | This method is the Series version of ``ndarray.argmin``. This method\n",
|
||
|
" | returns the label of the minimum, while ``ndarray.argmin`` returns\n",
|
||
|
" | the position. To get the position, use ``series.values.argmin()``.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series(data=[1, None, 4, 1],\n",
|
||
|
" | ... index=['A' ,'B' ,'C' ,'D'])\n",
|
||
|
" | >>> s\n",
|
||
|
" | A 1.0\n",
|
||
|
" | B NaN\n",
|
||
|
" | C 4.0\n",
|
||
|
" | D 1.0\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | >>> s.idxmin()\n",
|
||
|
" | 'A'\n",
|
||
|
" | \n",
|
||
|
" | If `skipna` is False and there is an NA value in the data,\n",
|
||
|
" | the function returns ``nan``.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.idxmin(skipna=False)\n",
|
||
|
" | nan\n",
|
||
|
" | \n",
|
||
|
" | argsort(self, axis=0, kind='quicksort', order=None)\n",
|
||
|
" | Overrides ndarray.argsort. Argsorts the value, omitting NA/null values,\n",
|
||
|
" | and places the result in the same locations as the non-NA values.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : int\n",
|
||
|
" | Has no effect but is accepted for compatibility with numpy.\n",
|
||
|
" | kind : {'mergesort', 'quicksort', 'heapsort'}, default 'quicksort'\n",
|
||
|
" | Choice of sorting algorithm. See np.sort for more\n",
|
||
|
" | information. 'mergesort' is the only stable algorithm\n",
|
||
|
" | order : None\n",
|
||
|
" | Has no effect but is accepted for compatibility with numpy.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | argsorted : Series, with -1 indicated where nan values are present\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | numpy.ndarray.argsort\n",
|
||
|
" | \n",
|
||
|
" | autocorr(self, lag=1)\n",
|
||
|
" | Compute the lag-N autocorrelation.\n",
|
||
|
" | \n",
|
||
|
" | This method computes the Pearson correlation between\n",
|
||
|
" | the Series and its shifted self.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | lag : int, default 1\n",
|
||
|
" | Number of lags to apply before performing autocorrelation.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | float\n",
|
||
|
" | The Pearson correlation between self and self.shift(lag).\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.corr : Compute the correlation between two Series.\n",
" | Series.shift : Shift index by desired number of periods.\n",
" | DataFrame.corr : Compute pairwise correlation of columns.\n",
" | DataFrame.corrwith : Compute pairwise correlation between rows or\n",
" | columns of two DataFrame objects.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | If the Pearson correlation is not well defined return 'NaN'.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s = pd.Series([0.25, 0.5, 0.2, -0.05])\n",
" | >>> s.autocorr() # doctest: +ELLIPSIS\n",
" | 0.10355...\n",
" | >>> s.autocorr(lag=2) # doctest: +ELLIPSIS\n",
" | -0.99999...\n",
" | \n",
" | If the Pearson correlation is not well defined, then 'NaN' is returned.\n",
" | \n",
" | >>> s = pd.Series([1, 0, 0, 0])\n",
" | >>> s.autocorr()\n",
" | nan\n",
" | \n",
" | between(self, left, right, inclusive=True)\n",
" | Return boolean Series equivalent to left <= series <= right.\n",
" | \n",
" | This function returns a boolean vector containing `True` wherever the\n",
" | corresponding Series element is between the boundary values `left` and\n",
" | `right`. NA values are treated as `False`.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | left : scalar\n",
" | Left boundary.\n",
" | right : scalar\n",
" | Right boundary.\n",
" | inclusive : bool, default True\n",
" | Include boundaries.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series\n",
" | Each element will be a boolean.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.gt : Greater than of series and other.\n",
" | Series.lt : Less than of series and other.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | This function is equivalent to ``(left <= ser) & (ser <= right)``\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s = pd.Series([2, 0, 4, 8, np.nan])\n",
" | \n",
" | Boundary values are included by default:\n",
" | \n",
" | >>> s.between(1, 4)\n",
" | 0 True\n",
" | 1 False\n",
" | 2 True\n",
" | 3 False\n",
" | 4 False\n",
" | dtype: bool\n",
" | \n",
" | With `inclusive` set to ``False`` boundary values are excluded:\n",
" | \n",
" | >>> s.between(1, 4, inclusive=False)\n",
" | 0 True\n",
" | 1 False\n",
" | 2 False\n",
" | 3 False\n",
" | 4 False\n",
" | dtype: bool\n",
" | \n",
" | `left` and `right` can be any scalar value:\n",
" | \n",
" | >>> s = pd.Series(['Alice', 'Bob', 'Carol', 'Eve'])\n",
" | >>> s.between('Anna', 'Daniel')\n",
" | 0 False\n",
" | 1 True\n",
" | 2 True\n",
" | 3 False\n",
" | dtype: bool\n",
" | \n",
" | combine(self, other, func, fill_value=None)\n",
" | Combine the Series with a Series or scalar according to `func`.\n",
" | \n",
" | Combine the Series and `other` using `func` to perform elementwise\n",
" | selection for combined Series.\n",
" | `fill_value` is assumed when value is missing at some index\n",
" | from one of the two objects being combined.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series or scalar\n",
" | The value(s) to be combined with the `Series`.\n",
" | func : function\n",
" | Function that takes two scalars as inputs and returns an element.\n",
" | fill_value : scalar, optional\n",
" | The value to assume when an index is missing from\n",
" | one Series or the other. The default specifies to use the\n",
" | appropriate NaN value for the underlying dtype of the Series.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series\n",
" | The result of combining the Series with the other object.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.combine_first : Combine Series values, choosing the calling\n",
" | Series' values first.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | Consider 2 Datasets ``s1`` and ``s2`` containing\n",
" | highest clocked speeds of different birds.\n",
" | \n",
" | >>> s1 = pd.Series({'falcon': 330.0, 'eagle': 160.0})\n",
" | >>> s1\n",
" | falcon 330.0\n",
" | eagle 160.0\n",
" | dtype: float64\n",
" | >>> s2 = pd.Series({'falcon': 345.0, 'eagle': 200.0, 'duck': 30.0})\n",
" | >>> s2\n",
" | falcon 345.0\n",
" | eagle 200.0\n",
" | duck 30.0\n",
" | dtype: float64\n",
" | \n",
" | Now, to combine the two datasets and view the highest speeds\n",
" | of the birds across the two datasets\n",
" | \n",
" | >>> s1.combine(s2, max)\n",
" | duck NaN\n",
" | eagle 200.0\n",
" | falcon 345.0\n",
" | dtype: float64\n",
" | \n",
" | In the previous example, the resulting value for duck is missing,\n",
" | because the maximum of a NaN and a float is a NaN.\n",
" | So, in the example, we set ``fill_value=0``,\n",
" | so the maximum value returned will be the value from some dataset.\n",
" | \n",
" | >>> s1.combine(s2, max, fill_value=0)\n",
" | duck 30.0\n",
" | eagle 200.0\n",
" | falcon 345.0\n",
" | dtype: float64\n",
" | \n",
" | combine_first(self, other)\n",
" | Combine Series values, choosing the calling Series's values first.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series\n",
" | The value(s) to be combined with the `Series`.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series\n",
" | The result of combining the Series with the other object.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.combine : Perform elementwise operation on two Series\n",
" | using a given function.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | Result index will be the union of the two indexes.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s1 = pd.Series([1, np.nan])\n",
" | >>> s2 = pd.Series([3, 4])\n",
" | >>> s1.combine_first(s2)\n",
" | 0 1.0\n",
" | 1 4.0\n",
" | dtype: float64\n",
" | \n",
" | compound(self, axis=None, skipna=None, level=None)\n",
" | Return the compound percentage of the values for the requested axis.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | axis : {index (0)}\n",
" | Axis for the function to be applied on.\n",
" | skipna : bool, default True\n",
" | Exclude NA/null values when computing the result.\n",
" | level : int or level name, default None\n",
" | If the axis is a MultiIndex (hierarchical), count along a\n",
" | particular level, collapsing into a scalar.\n",
" | numeric_only : bool, default None\n",
" | Include only float, int, boolean columns. If None, will attempt to use\n",
" | everything, then use only numeric data. Not implemented for Series.\n",
" | **kwargs\n",
" | Additional keyword arguments to be passed to the function.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | compounded : scalar or Series (if level specified)\n",
" | \n",
" | compress(self, condition, *args, **kwargs)\n",
" | Return selected slices of an array along given axis as a Series.\n",
" | \n",
" | .. deprecated:: 0.24.0\n",
" | \n",
" | See Also\n",
" | --------\n",
" | numpy.ndarray.compress\n",
" | \n",
" | corr(self, other, method='pearson', min_periods=None)\n",
" | Compute correlation with `other` Series, excluding missing values.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series\n",
" | method : {'pearson', 'kendall', 'spearman'} or callable\n",
" | * pearson : standard correlation coefficient\n",
" | * kendall : Kendall Tau correlation coefficient\n",
" | * spearman : Spearman rank correlation\n",
" | * callable: callable with input two 1d ndarray\n",
" | and returning a float\n",
" | .. versionadded:: 0.24.0\n",
" | \n",
" | min_periods : int, optional\n",
" | Minimum number of observations needed to have a valid result\n",
" | \n",
" | Returns\n",
" | -------\n",
" | correlation : float\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> histogram_intersection = lambda a, b: np.minimum(a, b\n",
" | ... ).sum().round(decimals=1)\n",
" | >>> s1 = pd.Series([.2, .0, .6, .2])\n",
" | >>> s2 = pd.Series([.3, .6, .0, .1])\n",
" | >>> s1.corr(s2, method=histogram_intersection)\n",
" | 0.3\n",
" | \n",
" | count(self, level=None)\n",
" | Return number of non-NA/null observations in the Series.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | level : int or level name, default None\n",
" | If the axis is a MultiIndex (hierarchical), count along a\n",
" | particular level, collapsing into a smaller Series\n",
" | \n",
" | Returns\n",
" | -------\n",
" | nobs : int or Series (if level specified)\n",
" | \n",
" | cov(self, other, min_periods=None)\n",
" | Compute covariance with Series, excluding missing values.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series\n",
" | min_periods : int, optional\n",
" | Minimum number of observations needed to have a valid result\n",
" | \n",
" | Returns\n",
" | -------\n",
" | covariance : float\n",
" | \n",
" | Normalized by N-1 (unbiased estimator).\n",
" | \n",
" | cummax(self, axis=None, skipna=True, *args, **kwargs)\n",
" | Return cumulative maximum over a DataFrame or Series axis.\n",
" | \n",
" | Returns a DataFrame or Series of the same size containing the cumulative\n",
" | maximum.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
" | The index or the name of the axis. 0 is equivalent to None or 'index'.\n",
" | skipna : boolean, default True\n",
" | Exclude NA/null values. If an entire row/column is NA, the result\n",
" | will be NA.\n",
" | *args, **kwargs :\n",
" | Additional keywords have no effect but might be accepted for\n",
" | compatibility with NumPy.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | cummax : scalar or Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | core.window.Expanding.max : Similar functionality\n",
" | but ignores ``NaN`` values.\n",
" | Series.max : Return the maximum over\n",
" | Series axis.\n",
" | Series.cummax : Return cumulative maximum over Series axis.\n",
" | Series.cummin : Return cumulative minimum over Series axis.\n",
" | Series.cumsum : Return cumulative sum over Series axis.\n",
" | Series.cumprod : Return cumulative product over Series axis.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | **Series**\n",
" | \n",
" | >>> s = pd.Series([2, np.nan, 5, -1, 0])\n",
" | >>> s\n",
" | 0 2.0\n",
" | 1 NaN\n",
" | 2 5.0\n",
" | 3 -1.0\n",
" | 4 0.0\n",
" | dtype: float64\n",
" | \n",
" | By default, NA values are ignored.\n",
" | \n",
" | >>> s.cummax()\n",
" | 0 2.0\n",
" | 1 NaN\n",
" | 2 5.0\n",
" | 3 5.0\n",
" | 4 5.0\n",
" | dtype: float64\n",
" | \n",
" | To include NA values in the operation, use ``skipna=False``\n",
" | \n",
" | >>> s.cummax(skipna=False)\n",
" | 0 2.0\n",
" | 1 NaN\n",
" | 2 NaN\n",
" | 3 NaN\n",
" | 4 NaN\n",
" | dtype: float64\n",
" | \n",
" | **DataFrame**\n",
" | \n",
" | >>> df = pd.DataFrame([[2.0, 1.0],\n",
" | ... [3.0, np.nan],\n",
" | ... [1.0, 0.0]],\n",
" | ... columns=list('AB'))\n",
" | >>> df\n",
" | A B\n",
" | 0 2.0 1.0\n",
" | 1 3.0 NaN\n",
" | 2 1.0 0.0\n",
" | \n",
" | By default, iterates over rows and finds the maximum\n",
" | in each column. This is equivalent to ``axis=None`` or ``axis='index'``.\n",
" | \n",
" | >>> df.cummax()\n",
" | A B\n",
" | 0 2.0 1.0\n",
" | 1 3.0 NaN\n",
" | 2 3.0 1.0\n",
" | \n",
" | To iterate over columns and find the maximum in each row,\n",
" | use ``axis=1``\n",
" | \n",
" | >>> df.cummax(axis=1)\n",
" | A B\n",
" | 0 2.0 2.0\n",
" | 1 3.0 NaN\n",
" | 2 1.0 1.0\n",
" | \n",
" | cummin(self, axis=None, skipna=True, *args, **kwargs)\n",
" | Return cumulative minimum over a DataFrame or Series axis.\n",
" | \n",
" | Returns a DataFrame or Series of the same size containing the cumulative\n",
" | minimum.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
" | The index or the name of the axis. 0 is equivalent to None or 'index'.\n",
" | skipna : boolean, default True\n",
" | Exclude NA/null values. If an entire row/column is NA, the result\n",
" | will be NA.\n",
" | *args, **kwargs :\n",
" | Additional keywords have no effect but might be accepted for\n",
" | compatibility with NumPy.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | cummin : scalar or Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | core.window.Expanding.min : Similar functionality\n",
" | but ignores ``NaN`` values.\n",
" | Series.min : Return the minimum over\n",
" | Series axis.\n",
" | Series.cummax : Return cumulative maximum over Series axis.\n",
" | Series.cummin : Return cumulative minimum over Series axis.\n",
" | Series.cumsum : Return cumulative sum over Series axis.\n",
" | Series.cumprod : Return cumulative product over Series axis.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | **Series**\n",
" | \n",
" | >>> s = pd.Series([2, np.nan, 5, -1, 0])\n",
" | >>> s\n",
" | 0 2.0\n",
" | 1 NaN\n",
" | 2 5.0\n",
" | 3 -1.0\n",
" | 4 0.0\n",
" | dtype: float64\n",
" | \n",
" | By default, NA values are ignored.\n",
" | \n",
" | >>> s.cummin()\n",
" | 0 2.0\n",
" | 1 NaN\n",
" | 2 2.0\n",
" | 3 -1.0\n",
" | 4 -1.0\n",
" | dtype: float64\n",
" | \n",
" | To include NA values in the operation, use ``skipna=False``\n",
" | \n",
" | >>> s.cummin(skipna=False)\n",
" | 0 2.0\n",
" | 1 NaN\n",
" | 2 NaN\n",
" | 3 NaN\n",
" | 4 NaN\n",
" | dtype: float64\n",
" | \n",
" | **DataFrame**\n",
" | \n",
" | >>> df = pd.DataFrame([[2.0, 1.0],\n",
" | ... [3.0, np.nan],\n",
" | ... [1.0, 0.0]],\n",
" | ... columns=list('AB'))\n",
" | >>> df\n",
" | A B\n",
" | 0 2.0 1.0\n",
" | 1 3.0 NaN\n",
" | 2 1.0 0.0\n",
" | \n",
" | By default, iterates over rows and finds the minimum\n",
" | in each column. This is equivalent to ``axis=None`` or ``axis='index'``.\n",
" | \n",
" | >>> df.cummin()\n",
" | A B\n",
" | 0 2.0 1.0\n",
" | 1 2.0 NaN\n",
" | 2 1.0 0.0\n",
" | \n",
" | To iterate over columns and find the minimum in each row,\n",
" | use ``axis=1``\n",
" | \n",
" | >>> df.cummin(axis=1)\n",
" | A B\n",
" | 0 2.0 1.0\n",
" | 1 3.0 NaN\n",
" | 2 1.0 0.0\n",
" | \n",
" | cumprod(self, axis=None, skipna=True, *args, **kwargs)\n",
" | Return cumulative product over a DataFrame or Series axis.\n",
" | \n",
" | Returns a DataFrame or Series of the same size containing the cumulative\n",
" | product.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
" | The index or the name of the axis. 0 is equivalent to None or 'index'.\n",
" | skipna : boolean, default True\n",
" | Exclude NA/null values. If an entire row/column is NA, the result\n",
" | will be NA.\n",
" | *args, **kwargs :\n",
" | Additional keywords have no effect but might be accepted for\n",
" | compatibility with NumPy.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | cumprod : scalar or Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | core.window.Expanding.prod : Similar functionality\n",
" | but ignores ``NaN`` values.\n",
" | Series.prod : Return the product over\n",
" | Series axis.\n",
" | Series.cummax : Return cumulative maximum over Series axis.\n",
" | Series.cummin : Return cumulative minimum over Series axis.\n",
" | Series.cumsum : Return cumulative sum over Series axis.\n",
" | Series.cumprod : Return cumulative product over Series axis.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | **Series**\n",
" | \n",
" | >>> s = pd.Series([2, np.nan, 5, -1, 0])\n",
" | >>> s\n",
" | 0 2.0\n",
" | 1 NaN\n",
" | 2 5.0\n",
" | 3 -1.0\n",
" | 4 0.0\n",
" | dtype: float64\n",
" | \n",
" | By default, NA values are ignored.\n",
" | \n",
" | >>> s.cumprod()\n",
" | 0 2.0\n",
" | 1 NaN\n",
" | 2 10.0\n",
" | 3 -10.0\n",
" | 4 -0.0\n",
" | dtype: float64\n",
" | \n",
" | To include NA values in the operation, use ``skipna=False``\n",
" | \n",
" | >>> s.cumprod(skipna=False)\n",
" | 0 2.0\n",
" | 1 NaN\n",
" | 2 NaN\n",
" | 3 NaN\n",
" | 4 NaN\n",
" | dtype: float64\n",
" | \n",
" | **DataFrame**\n",
" | \n",
" | >>> df = pd.DataFrame([[2.0, 1.0],\n",
" | ... [3.0, np.nan],\n",
" | ... [1.0, 0.0]],\n",
" | ... columns=list('AB'))\n",
" | >>> df\n",
" | A B\n",
" | 0 2.0 1.0\n",
" | 1 3.0 NaN\n",
" | 2 1.0 0.0\n",
" | \n",
" | By default, iterates over rows and finds the product\n",
" | in each column. This is equivalent to ``axis=None`` or ``axis='index'``.\n",
" | \n",
" | >>> df.cumprod()\n",
" | A B\n",
" | 0 2.0 1.0\n",
" | 1 6.0 NaN\n",
" | 2 6.0 0.0\n",
" | \n",
" | To iterate over columns and find the product in each row,\n",
" | use ``axis=1``\n",
" | \n",
" | >>> df.cumprod(axis=1)\n",
" | A B\n",
" | 0 2.0 2.0\n",
" | 1 3.0 NaN\n",
" | 2 1.0 0.0\n",
" | \n",
" | cumsum(self, axis=None, skipna=True, *args, **kwargs)\n",
" | Return cumulative sum over a DataFrame or Series axis.\n",
" | \n",
" | Returns a DataFrame or Series of the same size containing the cumulative\n",
" | sum.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
" | The index or the name of the axis. 0 is equivalent to None or 'index'.\n",
" | skipna : boolean, default True\n",
" | Exclude NA/null values. If an entire row/column is NA, the result\n",
" | will be NA.\n",
" | *args, **kwargs :\n",
" | Additional keywords have no effect but might be accepted for\n",
" | compatibility with NumPy.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | cumsum : scalar or Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | core.window.Expanding.sum : Similar functionality\n",
" | but ignores ``NaN`` values.\n",
" | Series.sum : Return the sum over\n",
" | Series axis.\n",
" | Series.cummax : Return cumulative maximum over Series axis.\n",
" | Series.cummin : Return cumulative minimum over Series axis.\n",
" | Series.cumsum : Return cumulative sum over Series axis.\n",
" | Series.cumprod : Return cumulative product over Series axis.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | **Series**\n",
" | \n",
" | >>> s = pd.Series([2, np.nan, 5, -1, 0])\n",
" | >>> s\n",
" | 0 2.0\n",
" | 1 NaN\n",
" | 2 5.0\n",
" | 3 -1.0\n",
" | 4 0.0\n",
" | dtype: float64\n",
" | \n",
" | By default, NA values are ignored.\n",
" | \n",
" | >>> s.cumsum()\n",
" | 0 2.0\n",
" | 1 NaN\n",
" | 2 7.0\n",
" | 3 6.0\n",
" | 4 6.0\n",
" | dtype: float64\n",
" | \n",
" | To include NA values in the operation, use ``skipna=False``\n",
" | \n",
" | >>> s.cumsum(skipna=False)\n",
" | 0 2.0\n",
" | 1 NaN\n",
" | 2 NaN\n",
" | 3 NaN\n",
" | 4 NaN\n",
" | dtype: float64\n",
" | \n",
" | **DataFrame**\n",
" | \n",
" | >>> df = pd.DataFrame([[2.0, 1.0],\n",
" | ... [3.0, np.nan],\n",
" | ... [1.0, 0.0]],\n",
" | ... columns=list('AB'))\n",
" | >>> df\n",
" | A B\n",
" | 0 2.0 1.0\n",
" | 1 3.0 NaN\n",
" | 2 1.0 0.0\n",
" | \n",
" | By default, iterates over rows and finds the sum\n",
" | in each column. This is equivalent to ``axis=None`` or ``axis='index'``.\n",
" | \n",
" | >>> df.cumsum()\n",
" | A B\n",
" | 0 2.0 1.0\n",
" | 1 5.0 NaN\n",
" | 2 6.0 1.0\n",
" | \n",
" | To iterate over columns and find the sum in each row,\n",
" | use ``axis=1``\n",
" | \n",
" | >>> df.cumsum(axis=1)\n",
" | A B\n",
" | 0 2.0 3.0\n",
" | 1 3.0 NaN\n",
" | 2 1.0 1.0\n",
" | \n",
" | diff(self, periods=1)\n",
" | First discrete difference of element.\n",
" | \n",
" | Calculates the difference of a Series element compared with another\n",
" | element in the Series (default is element in previous row).\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | periods : int, default 1\n",
" | Periods to shift for calculating difference, accepts negative\n",
" | values.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | diffed : Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.pct_change: Percent change over given number of periods.\n",
" | Series.shift: Shift index by desired number of periods with an\n",
" | optional time freq.\n",
" | DataFrame.diff: First discrete difference of object.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | Difference with previous row\n",
" | \n",
" | >>> s = pd.Series([1, 1, 2, 3, 5, 8])\n",
" | >>> s.diff()\n",
" | 0 NaN\n",
" | 1 0.0\n",
" | 2 1.0\n",
" | 3 1.0\n",
" | 4 2.0\n",
" | 5 3.0\n",
" | dtype: float64\n",
" | \n",
" | Difference with 3rd previous row\n",
" | \n",
" | >>> s.diff(periods=3)\n",
" | 0 NaN\n",
" | 1 NaN\n",
" | 2 NaN\n",
" | 3 2.0\n",
" | 4 4.0\n",
" | 5 6.0\n",
" | dtype: float64\n",
" | \n",
" | Difference with following row\n",
" | \n",
" | >>> s.diff(periods=-1)\n",
" | 0 0.0\n",
" | 1 -1.0\n",
" | 2 -1.0\n",
" | 3 -2.0\n",
" | 4 -3.0\n",
" | 5 NaN\n",
" | dtype: float64\n",
" | \n",
" | div = truediv(self, other, level=None, fill_value=None, axis=0)\n",
" | \n",
" | divide = truediv(self, other, level=None, fill_value=None, axis=0)\n",
" | \n",
" | divmod(self, other, level=None, fill_value=None, axis=0)\n",
" | Integer division and modulo of series and other, element-wise (binary operator `divmod`).\n",
" | \n",
" | Equivalent to ``series divmod other``, but with support to substitute a fill_value for\n",
" | missing data in one of the inputs.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series or scalar value\n",
" | fill_value : None or float value, default None (NaN)\n",
" | Fill existing missing (NaN) values, and any new element needed for\n",
" | successful Series alignment, with this value before computation.\n",
" | If data in both corresponding Series locations is missing\n",
" | the result will be missing\n",
" | level : int or name\n",
" | Broadcast across a level, matching Index values on the\n",
" | passed MultiIndex level\n",
" | \n",
" | Returns\n",
" | -------\n",
" | result : Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.rdivmod\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
" | >>> a\n",
" | a 1.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d NaN\n",
" | dtype: float64\n",
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
" | >>> b\n",
" | a 1.0\n",
" | b NaN\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | >>> a.add(b, fill_value=0)\n",
" | a 2.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | \n",
" | dot(self, other)\n",
" | Compute the dot product between the Series and the columns of other.\n",
" | \n",
" | This method computes the dot product between the Series and another\n",
" | one, or the Series and each columns of a DataFrame, or the Series and\n",
" | each columns of an array.\n",
" | \n",
" | It can also be called using `self @ other` in Python >= 3.5.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series, DataFrame or array-like\n",
" | The other object to compute the dot product with its columns.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | scalar, Series or numpy.ndarray\n",
" | Return the dot product of the Series and other if other is a\n",
" | Series, the Series of the dot product of Series and each rows of\n",
" | and each columns of the numpy array.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | DataFrame.dot: Compute the matrix product with the DataFrame.\n",
|
||
|
" | Series.mul: Multiplication of series and other, element-wise.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | The Series and other has to share the same index if other is a Series\n",
|
||
|
" | or a DataFrame.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series([0, 1, 2, 3])\n",
|
||
|
" | >>> other = pd.Series([-1, 2, -3, 4])\n",
|
||
|
" | >>> s.dot(other)\n",
|
||
|
" | 8\n",
|
||
|
" | >>> s @ other\n",
|
||
|
" | 8\n",
|
||
|
" | >>> df = pd.DataFrame([[0 ,1], [-2, 3], [4, -5], [6, 7]])\n",
|
||
|
" | >>> s.dot(df)\n",
|
||
|
" | 0 24\n",
|
||
|
" | 1 14\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | >>> arr = np.array([[0, 1], [-2, 3], [4, -5], [6, 7]])\n",
|
||
|
" | >>> s.dot(arr)\n",
|
||
|
" | array([24, 14])\n",
|
||
|
" | \n",
|
||
|
" | drop(self, labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')\n",
|
||
|
" | Return Series with specified index labels removed.\n",
|
||
|
" | \n",
|
||
|
" | Remove elements of a Series based on specifying the index labels.\n",
|
||
|
" | When using a multi-index, labels on different levels can be removed\n",
|
||
|
" | by specifying the level.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | labels : single label or list-like\n",
|
||
|
" | Index labels to drop.\n",
|
||
|
" | axis : 0, default 0\n",
|
||
|
" | Redundant for application on Series.\n",
|
||
|
" | index, columns : None\n",
|
||
|
" | Redundant for application on Series, but index can be used instead\n",
|
||
|
" | of labels.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.21.0\n",
|
||
|
" | level : int or level name, optional\n",
|
||
|
" | For MultiIndex, level for which the labels will be removed.\n",
|
||
|
" | inplace : bool, default False\n",
|
||
|
" | If True, do operation inplace and return None.\n",
|
||
|
" | errors : {'ignore', 'raise'}, default 'raise'\n",
|
||
|
" | If 'ignore', suppress error and only existing labels are dropped.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | dropped : pandas.Series\n",
|
||
|
" | \n",
|
||
|
" | Raises\n",
|
||
|
" | ------\n",
|
||
|
" | KeyError\n",
|
||
|
" | If none of the labels are found in the index.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.reindex : Return only specified index labels of Series.\n",
|
||
|
" | Series.dropna : Return series without null values.\n",
|
||
|
" | Series.drop_duplicates : Return Series with duplicate values removed.\n",
|
||
|
" | DataFrame.drop : Drop specified labels from rows or columns.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series(data=np.arange(3), index=['A','B','C'])\n",
|
||
|
" | >>> s\n",
|
||
|
" | A 0\n",
|
||
|
" | B 1\n",
|
||
|
" | C 2\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Drop labels B en C\n",
|
||
|
" | \n",
|
||
|
" | >>> s.drop(labels=['B','C'])\n",
|
||
|
" | A 0\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Drop 2nd level label in MultiIndex Series\n",
|
||
|
" | \n",
|
||
|
" | >>> midx = pd.MultiIndex(levels=[['lama', 'cow', 'falcon'],\n",
|
||
|
" | ... ['speed', 'weight', 'length']],\n",
|
||
|
" | ... codes=[[0, 0, 0, 1, 1, 1, 2, 2, 2],\n",
|
||
|
" | ... [0, 1, 2, 0, 1, 2, 0, 1, 2]])\n",
|
||
|
" | >>> s = pd.Series([45, 200, 1.2, 30, 250, 1.5, 320, 1, 0.3],\n",
|
||
|
" | ... index=midx)\n",
|
||
|
" | >>> s\n",
|
||
|
" | lama speed 45.0\n",
|
||
|
" | weight 200.0\n",
|
||
|
" | length 1.2\n",
|
||
|
" | cow speed 30.0\n",
|
||
|
" | weight 250.0\n",
|
||
|
" | length 1.5\n",
|
||
|
" | falcon speed 320.0\n",
|
||
|
" | weight 1.0\n",
|
||
|
" | length 0.3\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | >>> s.drop(labels='weight', level=1)\n",
|
||
|
" | lama speed 45.0\n",
|
||
|
" | length 1.2\n",
|
||
|
" | cow speed 30.0\n",
|
||
|
" | length 1.5\n",
|
||
|
" | falcon speed 320.0\n",
|
||
|
" | length 0.3\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | drop_duplicates(self, keep='first', inplace=False)\n",
|
||
|
" | Return Series with duplicate values removed.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | keep : {'first', 'last', ``False``}, default 'first'\n",
|
||
|
" | - 'first' : Drop duplicates except for the first occurrence.\n",
|
||
|
" | - 'last' : Drop duplicates except for the last occurrence.\n",
|
||
|
" | - ``False`` : Drop all duplicates.\n",
|
||
|
" | inplace : boolean, default ``False``\n",
|
||
|
" | If ``True``, performs operation inplace and returns None.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | deduplicated : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Index.drop_duplicates : Equivalent method on Index.\n",
|
||
|
" | DataFrame.drop_duplicates : Equivalent method on DataFrame.\n",
|
||
|
" | Series.duplicated : Related method on Series, indicating duplicate\n",
|
||
|
" | Series values.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | Generate an Series with duplicated entries.\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series(['lama', 'cow', 'lama', 'beetle', 'lama', 'hippo'],\n",
|
||
|
" | ... name='animal')\n",
|
||
|
" | >>> s\n",
|
||
|
" | 0 lama\n",
|
||
|
" | 1 cow\n",
|
||
|
" | 2 lama\n",
|
||
|
" | 3 beetle\n",
|
||
|
" | 4 lama\n",
|
||
|
" | 5 hippo\n",
|
||
|
" | Name: animal, dtype: object\n",
|
||
|
" | \n",
|
||
|
" | With the 'keep' parameter, the selection behaviour of duplicated values\n",
|
||
|
" | can be changed. The value 'first' keeps the first occurrence for each\n",
|
||
|
" | set of duplicated entries. The default value of keep is 'first'.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.drop_duplicates()\n",
|
||
|
" | 0 lama\n",
|
||
|
" | 1 cow\n",
|
||
|
" | 3 beetle\n",
|
||
|
" | 5 hippo\n",
|
||
|
" | Name: animal, dtype: object\n",
|
||
|
" | \n",
|
||
|
" | The value 'last' for parameter 'keep' keeps the last occurrence for\n",
|
||
|
" | each set of duplicated entries.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.drop_duplicates(keep='last')\n",
|
||
|
" | 1 cow\n",
|
||
|
" | 3 beetle\n",
|
||
|
" | 4 lama\n",
|
||
|
" | 5 hippo\n",
|
||
|
" | Name: animal, dtype: object\n",
|
||
|
" | \n",
|
||
|
" | The value ``False`` for parameter 'keep' discards all sets of\n",
|
||
|
" | duplicated entries. Setting the value of 'inplace' to ``True`` performs\n",
|
||
|
" | the operation inplace and returns ``None``.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.drop_duplicates(keep=False, inplace=True)\n",
|
||
|
" | >>> s\n",
|
||
|
" | 1 cow\n",
|
||
|
" | 3 beetle\n",
|
||
|
" | 5 hippo\n",
|
||
|
" | Name: animal, dtype: object\n",
|
||
|
" | \n",
|
||
|
" | dropna(self, axis=0, inplace=False, **kwargs)\n",
|
||
|
" | Return a new Series with missing values removed.\n",
|
||
|
" | \n",
|
||
|
" | See the :ref:`User Guide <missing_data>` for more on which values are\n",
|
||
|
" | considered missing, and how to work with missing data.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : {0 or 'index'}, default 0\n",
|
||
|
" | There is only one axis to drop values from.\n",
|
||
|
" | inplace : bool, default False\n",
|
||
|
" | If True, do operation inplace and return None.\n",
|
||
|
" | **kwargs\n",
|
||
|
" | Not in use.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series\n",
|
||
|
" | Series with NA entries dropped from it.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.isna: Indicate missing values.\n",
|
||
|
" | Series.notna : Indicate existing (non-missing) values.\n",
|
||
|
" | Series.fillna : Replace missing values.\n",
|
||
|
" | DataFrame.dropna : Drop rows or columns which contain NA values.\n",
|
||
|
" | Index.dropna : Drop missing indices.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> ser = pd.Series([1., 2., np.nan])\n",
|
||
|
" | >>> ser\n",
|
||
|
" | 0 1.0\n",
|
||
|
" | 1 2.0\n",
|
||
|
" | 2 NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Drop NA values from a Series.\n",
|
||
|
" | \n",
|
||
|
" | >>> ser.dropna()\n",
|
||
|
" | 0 1.0\n",
|
||
|
" | 1 2.0\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Keep the Series with valid entries in the same variable.\n",
|
||
|
" | \n",
|
||
|
" | >>> ser.dropna(inplace=True)\n",
|
||
|
" | >>> ser\n",
|
||
|
" | 0 1.0\n",
|
||
|
" | 1 2.0\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Empty strings are not considered NA values. ``None`` is considered an\n",
|
||
|
" | NA value.\n",
|
||
|
" | \n",
|
||
|
" | >>> ser = pd.Series([np.NaN, 2, pd.NaT, '', None, 'I stay'])\n",
|
||
|
" | >>> ser\n",
|
||
|
" | 0 NaN\n",
|
||
|
" | 1 2\n",
|
||
|
" | 2 NaT\n",
|
||
|
" | 3\n",
|
||
|
" | 4 None\n",
|
||
|
" | 5 I stay\n",
|
||
|
" | dtype: object\n",
|
||
|
" | >>> ser.dropna()\n",
|
||
|
" | 1 2\n",
|
||
|
" | 3\n",
|
||
|
" | 5 I stay\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | duplicated(self, keep='first')\n",
|
||
|
" | Indicate duplicate Series values.\n",
|
||
|
" | \n",
|
||
|
" | Duplicated values are indicated as ``True`` values in the resulting\n",
|
||
|
" | Series. Either all duplicates, all except the first or all except the\n",
|
||
|
" | last occurrence of duplicates can be indicated.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | keep : {'first', 'last', False}, default 'first'\n",
|
||
|
" | - 'first' : Mark duplicates as ``True`` except for the first\n",
|
||
|
" | occurrence.\n",
|
||
|
" | - 'last' : Mark duplicates as ``True`` except for the last\n",
|
||
|
" | occurrence.\n",
|
||
|
" | - ``False`` : Mark all duplicates as ``True``.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | pandas.core.series.Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Index.duplicated : Equivalent method on pandas.Index.\n",
|
||
|
" | DataFrame.duplicated : Equivalent method on pandas.DataFrame.\n",
|
||
|
" | Series.drop_duplicates : Remove duplicate values from Series.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | By default, for each set of duplicated values, the first occurrence is\n",
|
||
|
" | set on False and all others on True:\n",
|
||
|
" | \n",
|
||
|
" | >>> animals = pd.Series(['lama', 'cow', 'lama', 'beetle', 'lama'])\n",
|
||
|
" | >>> animals.duplicated()\n",
|
||
|
" | 0 False\n",
|
||
|
" | 1 False\n",
|
||
|
" | 2 True\n",
|
||
|
" | 3 False\n",
|
||
|
" | 4 True\n",
|
||
|
" | dtype: bool\n",
|
||
|
" | \n",
|
||
|
" | which is equivalent to\n",
|
||
|
" | \n",
|
||
|
" | >>> animals.duplicated(keep='first')\n",
|
||
|
" | 0 False\n",
|
||
|
" | 1 False\n",
|
||
|
" | 2 True\n",
|
||
|
" | 3 False\n",
|
||
|
" | 4 True\n",
|
||
|
" | dtype: bool\n",
|
||
|
" | \n",
|
||
|
" | By using 'last', the last occurrence of each set of duplicated values\n",
|
||
|
" | is set on False and all others on True:\n",
|
||
|
" | \n",
|
||
|
" | >>> animals.duplicated(keep='last')\n",
|
||
|
" | 0 True\n",
|
||
|
" | 1 False\n",
|
||
|
" | 2 True\n",
|
||
|
" | 3 False\n",
|
||
|
" | 4 False\n",
|
||
|
" | dtype: bool\n",
|
||
|
" | \n",
|
||
|
" | By setting keep on ``False``, all duplicates are True:\n",
|
||
|
" | \n",
|
||
|
" | >>> animals.duplicated(keep=False)\n",
|
||
|
" | 0 True\n",
|
||
|
" | 1 False\n",
|
||
|
" | 2 True\n",
|
||
|
" | 3 False\n",
|
||
|
" | 4 True\n",
|
||
|
" | dtype: bool\n",
|
||
|
" | \n",
|
||
|
" | eq(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Equal to of series and other, element-wise (binary operator `eq`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``series == other``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.None\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | ewm(self, com=None, span=None, halflife=None, alpha=None, min_periods=0, adjust=True, ignore_na=False, axis=0)\n",
|
||
|
" | Provides exponential weighted functions.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.18.0\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | com : float, optional\n",
|
||
|
" | Specify decay in terms of center of mass,\n",
|
||
|
" | :math:`\\alpha = 1 / (1 + com),\\text{ for } com \\geq 0`\n",
|
||
|
" | span : float, optional\n",
|
||
|
" | Specify decay in terms of span,\n",
|
||
|
" | :math:`\\alpha = 2 / (span + 1),\\text{ for } span \\geq 1`\n",
|
||
|
" | halflife : float, optional\n",
|
||
|
" | Specify decay in terms of half-life,\n",
|
||
|
" | :math:`\\alpha = 1 - exp(log(0.5) / halflife),\\text{ for } halflife > 0`\n",
|
||
|
" | alpha : float, optional\n",
|
||
|
" | Specify smoothing factor :math:`\\alpha` directly,\n",
|
||
|
" | :math:`0 < \\alpha \\leq 1`\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.18.0\n",
|
||
|
" | \n",
|
||
|
" | min_periods : int, default 0\n",
|
||
|
" | Minimum number of observations in window required to have a value\n",
|
||
|
" | (otherwise result is NA).\n",
|
||
|
" | adjust : bool, default True\n",
|
||
|
" | Divide by decaying adjustment factor in beginning periods to account\n",
|
||
|
" | for imbalance in relative weightings (viewing EWMA as a moving average)\n",
|
||
|
" | ignore_na : bool, default False\n",
|
||
|
" | Ignore missing values when calculating weights;\n",
|
||
|
" | specify True to reproduce pre-0.15.0 behavior\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | a Window sub-classed for the particular operation\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | rolling : Provides rolling window calculations.\n",
|
||
|
" | expanding : Provides expanding transformations.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | Exactly one of center of mass, span, half-life, and alpha must be provided.\n",
|
||
|
" | Allowed values and relationship between the parameters are specified in the\n",
|
||
|
" | parameter descriptions above; see the link at the end of this section for\n",
|
||
|
" | a detailed explanation.\n",
|
||
|
" | \n",
|
||
|
" | When adjust is True (default), weighted averages are calculated using\n",
|
||
|
" | weights (1-alpha)**(n-1), (1-alpha)**(n-2), ..., 1-alpha, 1.\n",
|
||
|
" | \n",
|
||
|
" | When adjust is False, weighted averages are calculated recursively as:\n",
|
||
|
" | weighted_average[0] = arg[0];\n",
|
||
|
" | weighted_average[i] = (1-alpha)*weighted_average[i-1] + alpha*arg[i].\n",
|
||
|
" | \n",
|
||
|
" | When ignore_na is False (default), weights are based on absolute positions.\n",
|
||
|
" | For example, the weights of x and y used in calculating the final weighted\n",
|
||
|
" | average of [x, None, y] are (1-alpha)**2 and 1 (if adjust is True), and\n",
|
||
|
" | (1-alpha)**2 and alpha (if adjust is False).\n",
|
||
|
" | \n",
|
||
|
" | When ignore_na is True (reproducing pre-0.15.0 behavior), weights are based\n",
|
||
|
" | on relative positions. For example, the weights of x and y used in\n",
|
||
|
" | calculating the final weighted average of [x, None, y] are 1-alpha and 1\n",
|
||
|
" | (if adjust is True), and 1-alpha and alpha (if adjust is False).\n",
|
||
|
" | \n",
|
||
|
" | More details can be found at\n",
|
||
|
" | http://pandas.pydata.org/pandas-docs/stable/computation.html#exponentially-weighted-windows\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]})\n",
|
||
|
" | B\n",
|
||
|
" | 0 0.0\n",
|
||
|
" | 1 1.0\n",
|
||
|
" | 2 2.0\n",
|
||
|
" | 3 NaN\n",
|
||
|
" | 4 4.0\n",
|
||
|
" | \n",
|
||
|
" | >>> df.ewm(com=0.5).mean()\n",
|
||
|
" | B\n",
|
||
|
" | 0 0.000000\n",
|
||
|
" | 1 0.750000\n",
|
||
|
" | 2 1.615385\n",
|
||
|
" | 3 1.615385\n",
|
||
|
" | 4 3.670213\n",
|
||
|
" | \n",
|
||
|
" | expanding(self, min_periods=1, center=False, axis=0)\n",
|
||
|
" | Provides expanding transformations.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.18.0\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | min_periods : int, default 1\n",
|
||
|
" | Minimum number of observations in window required to have a value\n",
|
||
|
" | (otherwise result is NA).\n",
|
||
|
" | center : bool, default False\n",
|
||
|
" | Set the labels at the center of the window.\n",
|
||
|
" | axis : int or str, default 0\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | a Window sub-classed for the particular operation\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | rolling : Provides rolling window calculations.\n",
|
||
|
" | ewm : Provides exponential weighted functions.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | By default, the result is set to the right edge of the window. This can be\n",
|
||
|
" | changed to the center of the window by setting ``center=True``.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]})\n",
|
||
|
" | B\n",
|
||
|
" | 0 0.0\n",
|
||
|
" | 1 1.0\n",
|
||
|
" | 2 2.0\n",
|
||
|
" | 3 NaN\n",
|
||
|
" | 4 4.0\n",
|
||
|
" | \n",
|
||
|
" | >>> df.expanding(2).sum()\n",
|
||
|
" | B\n",
|
||
|
" | 0 NaN\n",
|
||
|
" | 1 1.0\n",
|
||
|
" | 2 3.0\n",
|
||
|
" | 3 3.0\n",
|
||
|
" | 4 7.0\n",
|
||
|
" | \n",
|
||
|
" | fillna(self, value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)\n",
|
||
|
" | Fill NA/NaN values using the specified method.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | value : scalar, dict, Series, or DataFrame\n",
|
||
|
" | Value to use to fill holes (e.g. 0), alternately a\n",
|
||
|
" | dict/Series/DataFrame of values specifying which value to use for\n",
|
||
|
" | each index (for a Series) or column (for a DataFrame). (values not\n",
|
||
|
" | in the dict/Series/DataFrame will not be filled). This value cannot\n",
|
||
|
" | be a list.\n",
|
||
|
" | method : {'backfill', 'bfill', 'pad', 'ffill', None}, default None\n",
|
||
|
" | Method to use for filling holes in reindexed Series\n",
|
||
|
" | pad / ffill: propagate last valid observation forward to next valid\n",
|
||
|
" | backfill / bfill: use NEXT valid observation to fill gap\n",
|
||
|
" | axis : {0 or 'index'}\n",
|
||
|
" | inplace : boolean, default False\n",
|
||
|
" | If True, fill in place. Note: this will modify any\n",
|
||
|
" | other views on this object, (e.g. a no-copy slice for a column in a\n",
|
||
|
" | DataFrame).\n",
|
||
|
" | limit : int, default None\n",
|
||
|
" | If method is specified, this is the maximum number of consecutive\n",
|
||
|
" | NaN values to forward/backward fill. In other words, if there is\n",
|
||
|
" | a gap with more than this number of consecutive NaNs, it will only\n",
|
||
|
" | be partially filled. If method is not specified, this is the\n",
|
||
|
" | maximum number of entries along the entire axis where NaNs will be\n",
|
||
|
" | filled. Must be greater than 0 if not None.\n",
|
||
|
" | downcast : dict, default is None\n",
|
||
|
" | a dict of item->dtype of what to downcast if possible,\n",
|
||
|
" | or the string 'infer' which will try to downcast to an appropriate\n",
|
||
|
" | equal type (e.g. float64 to int64 if possible)\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | filled : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | interpolate : Fill NaN values using interpolation.\n",
|
||
|
" | reindex, asfreq\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> df = pd.DataFrame([[np.nan, 2, np.nan, 0],\n",
|
||
|
" | ... [3, 4, np.nan, 1],\n",
|
||
|
" | ... [np.nan, np.nan, np.nan, 5],\n",
|
||
|
" | ... [np.nan, 3, np.nan, 4]],\n",
|
||
|
" | ... columns=list('ABCD'))\n",
|
||
|
" | >>> df\n",
|
||
|
" | A B C D\n",
|
||
|
" | 0 NaN 2.0 NaN 0\n",
|
||
|
" | 1 3.0 4.0 NaN 1\n",
|
||
|
" | 2 NaN NaN NaN 5\n",
|
||
|
" | 3 NaN 3.0 NaN 4\n",
|
||
|
" | \n",
|
||
|
" | Replace all NaN elements with 0s.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.fillna(0)\n",
|
||
|
" | A B C D\n",
|
||
|
" | 0 0.0 2.0 0.0 0\n",
|
||
|
" | 1 3.0 4.0 0.0 1\n",
|
||
|
" | 2 0.0 0.0 0.0 5\n",
|
||
|
" | 3 0.0 3.0 0.0 4\n",
|
||
|
" | \n",
|
||
|
" | We can also propagate non-null values forward or backward.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.fillna(method='ffill')\n",
|
||
|
" | A B C D\n",
|
||
|
" | 0 NaN 2.0 NaN 0\n",
|
||
|
" | 1 3.0 4.0 NaN 1\n",
|
||
|
" | 2 3.0 4.0 NaN 5\n",
|
||
|
" | 3 3.0 3.0 NaN 4\n",
|
||
|
" | \n",
|
||
|
" | Replace all NaN elements in column 'A', 'B', 'C', and 'D', with 0, 1,\n",
|
||
|
" | 2, and 3 respectively.\n",
|
||
|
" | \n",
|
||
|
" | >>> values = {'A': 0, 'B': 1, 'C': 2, 'D': 3}\n",
|
||
|
" | >>> df.fillna(value=values)\n",
|
||
|
" | A B C D\n",
|
||
|
" | 0 0.0 2.0 2.0 0\n",
|
||
|
" | 1 3.0 4.0 2.0 1\n",
|
||
|
" | 2 0.0 1.0 2.0 5\n",
|
||
|
" | 3 0.0 3.0 2.0 4\n",
|
||
|
" | \n",
|
||
|
" | Only replace the first NaN element.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.fillna(value=values, limit=1)\n",
|
||
|
" | A B C D\n",
|
||
|
" | 0 0.0 2.0 2.0 0\n",
|
||
|
" | 1 3.0 4.0 NaN 1\n",
|
||
|
" | 2 NaN 1.0 NaN 5\n",
|
||
|
" | 3 NaN 3.0 NaN 4\n",
|
||
|
" | \n",
|
||
|
" | floordiv(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Integer division of series and other, element-wise (binary operator `floordiv`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``series // other``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.rfloordiv\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | ge(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Greater than or equal to of series and other, element-wise (binary operator `ge`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``series >= other``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.None\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | get_value(self, label, takeable=False)\n",
|
||
|
" | Quickly retrieve single value at passed index label.\n",
|
||
|
" | \n",
|
||
|
" | .. deprecated:: 0.21.0\n",
|
||
|
" | Please use .at[] or .iat[] accessors.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | label : object\n",
|
||
|
" | takeable : interpret the index as indexers, default False\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | value : scalar value\n",
|
||
|
" | \n",
|
||
|
" | get_values(self)\n",
|
||
|
" | Same as values (but handles sparseness conversions); is a view.\n",
|
||
|
" | \n",
|
||
|
" | gt(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Greater than of series and other, element-wise (binary operator `gt`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``series > other``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.None\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | hist = hist_series(self, by=None, ax=None, grid=True, xlabelsize=None, xrot=None, ylabelsize=None, yrot=None, figsize=None, bins=10, **kwds)\n",
|
||
|
" | Draw histogram of the input series using matplotlib.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | by : object, optional\n",
|
||
|
" | If passed, then used to form histograms for separate groups\n",
|
||
|
" | ax : matplotlib axis object\n",
|
||
|
" | If not passed, uses gca()\n",
|
||
|
" | grid : boolean, default True\n",
|
||
|
" | Whether to show axis grid lines\n",
|
||
|
" | xlabelsize : int, default None\n",
|
||
|
" | If specified changes the x-axis label size\n",
|
||
|
" | xrot : float, default None\n",
|
||
|
" | rotation of x axis labels\n",
|
||
|
" | ylabelsize : int, default None\n",
|
||
|
" | If specified changes the y-axis label size\n",
|
||
|
" | yrot : float, default None\n",
|
||
|
" | rotation of y axis labels\n",
|
||
|
" | figsize : tuple, default None\n",
|
||
|
" | figure size in inches by default\n",
|
||
|
" | bins : integer or sequence, default 10\n",
|
||
|
" | Number of histogram bins to be used. If an integer is given, bins + 1\n",
|
||
|
" | bin edges are calculated and returned. If bins is a sequence, gives\n",
|
||
|
" | bin edges, including left edge of first bin and right edge of last\n",
|
||
|
" | bin. In this case, bins is returned unmodified.\n",
|
||
|
" | bins : integer, default 10\n",
|
||
|
" | Number of histogram bins to be used\n",
|
||
|
" | `**kwds` : keywords\n",
|
||
|
" | To be passed to the actual plotting function\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | matplotlib.axes.Axes.hist : Plot a histogram using matplotlib.\n",
|
||
|
" | \n",
|
||
|
" | idxmax(self, axis=0, skipna=True, *args, **kwargs)\n",
|
||
|
" | Return the row label of the maximum value.\n",
|
||
|
" | \n",
|
||
|
" | If multiple values equal the maximum, the first row label with that\n",
|
||
|
" | value is returned.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | skipna : boolean, default True\n",
|
||
|
" | Exclude NA/null values. If the entire Series is NA, the result\n",
|
||
|
" | will be NA.\n",
|
||
|
" | axis : int, default 0\n",
|
||
|
" | For compatibility with DataFrame.idxmax. Redundant for application\n",
|
||
|
" | on Series.\n",
|
||
|
" | *args, **kwargs\n",
|
||
|
" | Additional keywords have no effect but might be accepted\n",
|
||
|
" | for compatibility with NumPy.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | idxmax : Index of maximum of values.\n",
|
||
|
" | \n",
|
||
|
" | Raises\n",
|
||
|
" | ------\n",
|
||
|
" | ValueError\n",
|
||
|
" | If the Series is empty.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | numpy.argmax : Return indices of the maximum values\n",
|
||
|
" | along the given axis.\n",
|
||
|
" | DataFrame.idxmax : Return index of first occurrence of maximum\n",
|
||
|
" | over requested axis.\n",
|
||
|
" | Series.idxmin : Return index *label* of the first occurrence\n",
|
||
|
" | of minimum of values.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | This method is the Series version of ``ndarray.argmax``. This method\n",
|
||
|
" | returns the label of the maximum, while ``ndarray.argmax`` returns\n",
|
||
|
" | the position. To get the position, use ``series.values.argmax()``.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series(data=[1, None, 4, 3, 4],\n",
|
||
|
" | ... index=['A', 'B', 'C', 'D', 'E'])\n",
|
||
|
" | >>> s\n",
|
||
|
" | A 1.0\n",
|
||
|
" | B NaN\n",
|
||
|
" | C 4.0\n",
|
||
|
" | D 3.0\n",
|
||
|
" | E 4.0\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | >>> s.idxmax()\n",
|
||
|
" | 'C'\n",
|
||
|
" | \n",
|
||
|
" | If `skipna` is False and there is an NA value in the data,\n",
|
||
|
" | the function returns ``nan``.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.idxmax(skipna=False)\n",
|
||
|
" | nan\n",
|
||
|
" | \n",
|
||
|
" | idxmin(self, axis=0, skipna=True, *args, **kwargs)\n",
|
||
|
" | Return the row label of the minimum value.\n",
|
||
|
" | \n",
|
||
|
" | If multiple values equal the minimum, the first row label with that\n",
|
||
|
" | value is returned.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | skipna : boolean, default True\n",
|
||
|
" | Exclude NA/null values. If the entire Series is NA, the result\n",
|
||
|
" | will be NA.\n",
|
||
|
" | axis : int, default 0\n",
|
||
|
" | For compatibility with DataFrame.idxmin. Redundant for application\n",
|
||
|
" | on Series.\n",
|
||
|
" | *args, **kwargs\n",
|
||
|
" | Additional keywords have no effect but might be accepted\n",
|
||
|
" | for compatibility with NumPy.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | idxmin : Index of minimum of values.\n",
|
||
|
" | \n",
|
||
|
" | Raises\n",
|
||
|
" | ------\n",
|
||
|
" | ValueError\n",
|
||
|
" | If the Series is empty.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | numpy.argmin : Return indices of the minimum values\n",
|
||
|
" | along the given axis.\n",
|
||
|
" | DataFrame.idxmin : Return index of first occurrence of minimum\n",
|
||
|
" | over requested axis.\n",
|
||
|
" | Series.idxmax : Return index *label* of the first occurrence\n",
|
||
|
" | of maximum of values.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | This method is the Series version of ``ndarray.argmin``. This method\n",
|
||
|
" | returns the label of the minimum, while ``ndarray.argmin`` returns\n",
|
||
|
" | the position. To get the position, use ``series.values.argmin()``.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series(data=[1, None, 4, 1],\n",
|
||
|
" | ... index=['A' ,'B' ,'C' ,'D'])\n",
|
||
|
" | >>> s\n",
|
||
|
" | A 1.0\n",
|
||
|
" | B NaN\n",
|
||
|
" | C 4.0\n",
|
||
|
" | D 1.0\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | >>> s.idxmin()\n",
|
||
|
" | 'A'\n",
|
||
|
" | \n",
|
||
|
" | If `skipna` is False and there is an NA value in the data,\n",
|
||
|
" | the function returns ``nan``.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.idxmin(skipna=False)\n",
|
||
|
" | nan\n",
|
||
|
" | \n",
|
||
|
" | isin(self, values)\n",
|
||
|
" | Check whether `values` are contained in Series.\n",
|
||
|
" | \n",
|
||
|
" | Return a boolean Series showing whether each element in the Series\n",
|
||
|
" | matches an element in the passed sequence of `values` exactly.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | values : set or list-like\n",
|
||
|
" | The sequence of values to test. Passing in a single string will\n",
|
||
|
" | raise a ``TypeError``. Instead, turn a single string into a\n",
|
||
|
" | list of one element.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.18.1\n",
|
||
|
" | \n",
|
||
|
" | Support for values as a set.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | isin : Series (bool dtype)\n",
|
||
|
" | \n",
|
||
|
" | Raises\n",
|
||
|
" | ------\n",
|
||
|
" | TypeError\n",
|
||
|
" | * If `values` is a string\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | DataFrame.isin : Equivalent method on DataFrame.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series(['lama', 'cow', 'lama', 'beetle', 'lama',\n",
|
||
|
" | ... 'hippo'], name='animal')\n",
|
||
|
" | >>> s.isin(['cow', 'lama'])\n",
|
||
|
" | 0 True\n",
|
||
|
" | 1 True\n",
|
||
|
" | 2 True\n",
|
||
|
" | 3 False\n",
|
||
|
" | 4 True\n",
|
||
|
" | 5 False\n",
|
||
|
" | Name: animal, dtype: bool\n",
|
||
|
" | \n",
|
||
|
" | Passing a single string as ``s.isin('lama')`` will raise an error. Use\n",
|
||
|
" | a list of one element instead:\n",
|
||
|
" | \n",
|
||
|
" | >>> s.isin(['lama'])\n",
|
||
|
" | 0 True\n",
|
||
|
" | 1 False\n",
|
||
|
" | 2 True\n",
|
||
|
" | 3 False\n",
|
||
|
" | 4 True\n",
|
||
|
" | 5 False\n",
|
||
|
" | Name: animal, dtype: bool\n",
|
||
|
" | \n",
|
||
|
" | isna(self)\n",
|
||
|
" | Detect missing values.\n",
|
||
|
" | \n",
|
||
|
" | Return a boolean same-sized object indicating if the values are NA.\n",
|
||
|
" | NA values, such as None or :attr:`numpy.NaN`, gets mapped to True\n",
|
||
|
" | values.\n",
|
||
|
" | Everything else gets mapped to False values. Characters such as empty\n",
|
||
|
" | strings ``''`` or :attr:`numpy.inf` are not considered NA values\n",
|
||
|
" | (unless you set ``pandas.options.mode.use_inf_as_na = True``).\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series\n",
|
||
|
" | Mask of bool values for each element in Series that\n",
|
||
|
" | indicates whether an element is not an NA value.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.isnull : Alias of isna.\n",
|
||
|
" | Series.notna : Boolean inverse of isna.\n",
|
||
|
" | Series.dropna : Omit axes labels with missing values.\n",
|
||
|
" | isna : Top-level isna.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | Show which entries in a DataFrame are NA.\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({'age': [5, 6, np.NaN],\n",
|
||
|
" | ... 'born': [pd.NaT, pd.Timestamp('1939-05-27'),\n",
|
||
|
" | ... pd.Timestamp('1940-04-25')],\n",
|
||
|
" | ... 'name': ['Alfred', 'Batman', ''],\n",
|
||
|
" | ... 'toy': [None, 'Batmobile', 'Joker']})\n",
|
||
|
" | >>> df\n",
|
||
|
" | age born name toy\n",
|
||
|
" | 0 5.0 NaT Alfred None\n",
|
||
|
" | 1 6.0 1939-05-27 Batman Batmobile\n",
|
||
|
" | 2 NaN 1940-04-25 Joker\n",
|
||
|
" | \n",
|
||
|
" | >>> df.isna()\n",
|
||
|
" | age born name toy\n",
|
||
|
" | 0 False True False True\n",
|
||
|
" | 1 False False False False\n",
|
||
|
" | 2 True False False False\n",
|
||
|
" | \n",
|
||
|
" | Show which entries in a Series are NA.\n",
|
||
|
" | \n",
|
||
|
" | >>> ser = pd.Series([5, 6, np.NaN])\n",
|
||
|
" | >>> ser\n",
|
||
|
" | 0 5.0\n",
|
||
|
" | 1 6.0\n",
|
||
|
" | 2 NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | >>> ser.isna()\n",
|
||
|
" | 0 False\n",
|
||
|
" | 1 False\n",
|
||
|
" | 2 True\n",
|
||
|
" | dtype: bool\n",
|
||
|
" | \n",
|
||
|
" | isnull(self)\n",
|
||
|
" | Detect missing values.\n",
|
||
|
" | \n",
|
||
|
" | Return a boolean same-sized object indicating if the values are NA.\n",
|
||
|
" | NA values, such as None or :attr:`numpy.NaN`, gets mapped to True\n",
|
||
|
" | values.\n",
|
||
|
" | Everything else gets mapped to False values. Characters such as empty\n",
|
||
|
" | strings ``''`` or :attr:`numpy.inf` are not considered NA values\n",
|
||
|
" | (unless you set ``pandas.options.mode.use_inf_as_na = True``).\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series\n",
|
||
|
" | Mask of bool values for each element in Series that\n",
|
||
|
" | indicates whether an element is not an NA value.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.isnull : Alias of isna.\n",
|
||
|
" | Series.notna : Boolean inverse of isna.\n",
|
||
|
" | Series.dropna : Omit axes labels with missing values.\n",
|
||
|
" | isna : Top-level isna.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | Show which entries in a DataFrame are NA.\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({'age': [5, 6, np.NaN],\n",
|
||
|
" | ... 'born': [pd.NaT, pd.Timestamp('1939-05-27'),\n",
|
||
|
" | ... pd.Timestamp('1940-04-25')],\n",
|
||
|
" | ... 'name': ['Alfred', 'Batman', ''],\n",
|
||
|
" | ... 'toy': [None, 'Batmobile', 'Joker']})\n",
|
||
|
" | >>> df\n",
|
||
|
" | age born name toy\n",
|
||
|
" | 0 5.0 NaT Alfred None\n",
|
||
|
" | 1 6.0 1939-05-27 Batman Batmobile\n",
|
||
|
" | 2 NaN 1940-04-25 Joker\n",
|
||
|
" | \n",
|
||
|
" | >>> df.isna()\n",
|
||
|
" | age born name toy\n",
|
||
|
" | 0 False True False True\n",
|
||
|
" | 1 False False False False\n",
|
||
|
" | 2 True False False False\n",
|
||
|
" | \n",
|
||
|
" | Show which entries in a Series are NA.\n",
|
||
|
" | \n",
|
||
|
" | >>> ser = pd.Series([5, 6, np.NaN])\n",
|
||
|
" | >>> ser\n",
|
||
|
" | 0 5.0\n",
|
||
|
" | 1 6.0\n",
|
||
|
" | 2 NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | >>> ser.isna()\n",
|
||
|
" | 0 False\n",
|
||
|
" | 1 False\n",
|
||
|
" | 2 True\n",
|
||
|
" | dtype: bool\n",
|
||
|
" | \n",
|
||
|
" | items = iteritems(self)\n",
|
||
|
" | \n",
|
||
|
" | iteritems(self)\n",
|
||
|
" | Lazily iterate over (index, value) tuples.\n",
|
||
|
" | \n",
|
||
|
" | keys(self)\n",
|
||
|
" | Alias for index.\n",
|
||
|
" | \n",
|
||
|
" | kurt(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)\n",
|
||
|
" | Return unbiased kurtosis over requested axis using Fisher's definition of\n",
|
||
|
" | kurtosis (kurtosis of normal == 0.0). Normalized by N-1.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : {index (0)}\n",
|
||
|
" | Axis for the function to be applied on.\n",
|
||
|
" | skipna : bool, default True\n",
|
||
|
" | Exclude NA/null values when computing the result.\n",
|
||
|
" | level : int or level name, default None\n",
|
||
|
" | If the axis is a MultiIndex (hierarchical), count along a\n",
|
||
|
" | particular level, collapsing into a scalar.\n",
|
||
|
" | numeric_only : bool, default None\n",
|
||
|
" | Include only float, int, boolean columns. If None, will attempt to use\n",
|
||
|
" | everything, then use only numeric data. Not implemented for Series.\n",
|
||
|
" | **kwargs\n",
|
||
|
" | Additional keyword arguments to be passed to the function.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | kurt : scalar or Series (if level specified)\n",
|
||
|
" | \n",
|
||
|
" | kurtosis = kurt(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)\n",
|
||
|
" | \n",
|
||
|
" | le(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Less than or equal to of series and other, element-wise (binary operator `le`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``series <= other``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.None\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | lt(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Less than of series and other, element-wise (binary operator `lt`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``series < other``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.None\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | mad(self, axis=None, skipna=None, level=None)\n",
|
||
|
" | Return the mean absolute deviation of the values for the requested axis.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : {index (0)}\n",
|
||
|
" | Axis for the function to be applied on.\n",
|
||
|
" | skipna : bool, default True\n",
|
||
|
" | Exclude NA/null values when computing the result.\n",
|
||
|
" | level : int or level name, default None\n",
|
||
|
" | If the axis is a MultiIndex (hierarchical), count along a\n",
|
||
|
" | particular level, collapsing into a scalar.\n",
|
||
|
" | numeric_only : bool, default None\n",
|
||
|
" | Include only float, int, boolean columns. If None, will attempt to use\n",
|
||
|
" | everything, then use only numeric data. Not implemented for Series.\n",
|
||
|
" | **kwargs\n",
|
||
|
" | Additional keyword arguments to be passed to the function.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | mad : scalar or Series (if level specified)\n",
|
||
|
" | \n",
|
||
|
" | map(self, arg, na_action=None)\n",
|
||
|
" | Map values of Series according to input correspondence.\n",
|
||
|
" | \n",
|
||
|
" | Used for substituting each value in a Series with another value,\n",
|
||
|
" | that may be derived from a function, a ``dict`` or\n",
|
||
|
" | a :class:`Series`.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | arg : function, dict, or Series\n",
|
||
|
" | Mapping correspondence.\n",
|
||
|
" | na_action : {None, 'ignore'}, default None\n",
|
||
|
" | If 'ignore', propagate NaN values, without passing them to the\n",
|
||
|
" | mapping correspondence.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series\n",
|
||
|
" | Same index as caller.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.apply : For applying more complex functions on a Series.\n",
|
||
|
" | DataFrame.apply : Apply a function row-/column-wise.\n",
|
||
|
" | DataFrame.applymap : Apply a function elementwise on a whole DataFrame.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | When ``arg`` is a dictionary, values in Series that are not in the\n",
|
||
|
" | dictionary (as keys) are converted to ``NaN``. However, if the\n",
|
||
|
" | dictionary is a ``dict`` subclass that defines ``__missing__`` (i.e.\n",
|
||
|
" | provides a method for default values), then this default is used\n",
|
||
|
" | rather than ``NaN``.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series(['cat', 'dog', np.nan, 'rabbit'])\n",
|
||
|
" | >>> s\n",
|
||
|
" | 0 cat\n",
|
||
|
" | 1 dog\n",
|
||
|
" | 2 NaN\n",
|
||
|
" | 3 rabbit\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | ``map`` accepts a ``dict`` or a ``Series``. Values that are not found\n",
|
||
|
" | in the ``dict`` are converted to ``NaN``, unless the dict has a default\n",
|
||
|
" | value (e.g. ``defaultdict``):\n",
|
||
|
" | \n",
|
||
|
" | >>> s.map({'cat': 'kitten', 'dog': 'puppy'})\n",
|
||
|
" | 0 kitten\n",
|
||
|
" | 1 puppy\n",
|
||
|
" | 2 NaN\n",
|
||
|
" | 3 NaN\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | It also accepts a function:\n",
|
||
|
" | \n",
|
||
|
" | >>> s.map('I am a {}'.format)\n",
|
||
|
" | 0 I am a cat\n",
|
||
|
" | 1 I am a dog\n",
|
||
|
" | 2 I am a nan\n",
|
||
|
" | 3 I am a rabbit\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | To avoid applying the function to missing values (and keep them as\n",
|
||
|
" | ``NaN``) ``na_action='ignore'`` can be used:\n",
|
||
|
" | \n",
|
||
|
" | >>> s.map('I am a {}'.format, na_action='ignore')\n",
|
||
|
" | 0 I am a cat\n",
|
||
|
" | 1 I am a dog\n",
|
||
|
" | 2 NaN\n",
|
||
|
" | 3 I am a rabbit\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | max(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)\n",
|
||
|
" | Return the maximum of the values for the requested axis.\n",
|
||
|
" | \n",
|
||
|
" | If you want the *index* of the maximum, use ``idxmax``. This is\n",
|
||
|
" | the equivalent of the ``numpy.ndarray`` method ``argmax``.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : {index (0)}\n",
|
||
|
" | Axis for the function to be applied on.\n",
|
||
|
" | skipna : bool, default True\n",
|
||
|
" | Exclude NA/null values when computing the result.\n",
|
||
|
" | level : int or level name, default None\n",
|
||
|
" | If the axis is a MultiIndex (hierarchical), count along a\n",
|
||
|
" | particular level, collapsing into a scalar.\n",
|
||
|
" | numeric_only : bool, default None\n",
|
||
|
" | Include only float, int, boolean columns. If None, will attempt to use\n",
|
||
|
" | everything, then use only numeric data. Not implemented for Series.\n",
|
||
|
" | **kwargs\n",
|
||
|
" | Additional keyword arguments to be passed to the function.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | max : scalar or Series (if level specified)\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.sum : Return the sum.\n",
|
||
|
" | Series.min : Return the minimum.\n",
|
||
|
" | Series.max : Return the maximum.\n",
|
||
|
" | Series.idxmin : Return the index of the minimum.\n",
|
||
|
" | Series.idxmax : Return the index of the maximum.\n",
|
||
|
" | DataFrame.min : Return the sum over the requested axis.\n",
|
||
|
" | DataFrame.min : Return the minimum over the requested axis.\n",
|
||
|
" | DataFrame.max : Return the maximum over the requested axis.\n",
|
||
|
" | DataFrame.idxmin : Return the index of the minimum over the requested axis.\n",
|
||
|
" | DataFrame.idxmax : Return the index of the maximum over the requested axis.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | \n",
|
||
|
" | >>> idx = pd.MultiIndex.from_arrays([\n",
|
||
|
" | ... ['warm', 'warm', 'cold', 'cold'],\n",
|
||
|
" | ... ['dog', 'falcon', 'fish', 'spider']],\n",
|
||
|
" | ... names=['blooded', 'animal'])\n",
|
||
|
" | >>> s = pd.Series([4, 2, 0, 8], name='legs', index=idx)\n",
|
||
|
" | >>> s\n",
|
||
|
" | blooded animal\n",
|
||
|
" | warm dog 4\n",
|
||
|
" | falcon 2\n",
|
||
|
" | cold fish 0\n",
|
||
|
" | spider 8\n",
|
||
|
" | Name: legs, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | >>> s.max()\n",
|
||
|
" | 8\n",
|
||
|
" | \n",
|
||
|
" | Max using level names, as well as indices.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.max(level='blooded')\n",
|
||
|
" | blooded\n",
|
||
|
" | warm 4\n",
|
||
|
" | cold 8\n",
|
||
|
" | Name: legs, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | >>> s.max(level=0)\n",
|
||
|
" | blooded\n",
|
||
|
" | warm 4\n",
|
||
|
" | cold 8\n",
|
||
|
" | Name: legs, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | mean(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)\n",
|
||
|
" | Return the mean of the values for the requested axis.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : {index (0)}\n",
|
||
|
" | Axis for the function to be applied on.\n",
|
||
|
" | skipna : bool, default True\n",
|
||
|
" | Exclude NA/null values when computing the result.\n",
|
||
|
" | level : int or level name, default None\n",
|
||
|
" | If the axis is a MultiIndex (hierarchical), count along a\n",
|
||
|
" | particular level, collapsing into a scalar.\n",
|
||
|
" | numeric_only : bool, default None\n",
|
||
|
" | Include only float, int, boolean columns. If None, will attempt to use\n",
|
||
|
" | everything, then use only numeric data. Not implemented for Series.\n",
|
||
|
" | **kwargs\n",
|
||
|
" | Additional keyword arguments to be passed to the function.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | mean : scalar or Series (if level specified)\n",
|
||
|
" | \n",
|
||
|
" | median(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)\n",
|
||
|
" | Return the median of the values for the requested axis.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : {index (0)}\n",
|
||
|
" | Axis for the function to be applied on.\n",
|
||
|
" | skipna : bool, default True\n",
|
||
|
" | Exclude NA/null values when computing the result.\n",
|
||
|
" | level : int or level name, default None\n",
|
||
|
" | If the axis is a MultiIndex (hierarchical), count along a\n",
|
||
|
" | particular level, collapsing into a scalar.\n",
|
||
|
" | numeric_only : bool, default None\n",
|
||
|
" | Include only float, int, boolean columns. If None, will attempt to use\n",
|
||
|
" | everything, then use only numeric data. Not implemented for Series.\n",
|
||
|
" | **kwargs\n",
|
||
|
" | Additional keyword arguments to be passed to the function.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | median : scalar or Series (if level specified)\n",
|
||
|
" | \n",
|
||
|
" | memory_usage(self, index=True, deep=False)\n",
|
||
|
" | Return the memory usage of the Series.\n",
|
||
|
" | \n",
|
||
|
" | The memory usage can optionally include the contribution of\n",
|
||
|
" | the index and of elements of `object` dtype.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | index : bool, default True\n",
|
||
|
" | Specifies whether to include the memory usage of the Series index.\n",
|
||
|
" | deep : bool, default False\n",
|
||
|
" | If True, introspect the data deeply by interrogating\n",
|
||
|
" | `object` dtypes for system-level memory consumption, and include\n",
|
||
|
" | it in the returned value.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | int\n",
|
||
|
" | Bytes of memory consumed.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | numpy.ndarray.nbytes : Total bytes consumed by the elements of the\n",
|
||
|
" | array.\n",
|
||
|
" | DataFrame.memory_usage : Bytes consumed by a DataFrame.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series(range(3))\n",
|
||
|
" | >>> s.memory_usage()\n",
|
||
|
" | 104\n",
|
||
|
" | \n",
|
||
|
" | Not including the index gives the size of the rest of the data, which\n",
|
||
|
" | is necessarily smaller:\n",
|
||
|
" | \n",
|
||
|
" | >>> s.memory_usage(index=False)\n",
|
||
|
" | 24\n",
|
||
|
" | \n",
|
||
|
" | The memory footprint of `object` values is ignored by default:\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series([\"a\", \"b\"])\n",
|
||
|
" | >>> s.values\n",
|
||
|
" | array(['a', 'b'], dtype=object)\n",
|
||
|
" | >>> s.memory_usage()\n",
|
||
|
" | 96\n",
|
||
|
" | >>> s.memory_usage(deep=True)\n",
|
||
|
" | 212\n",
|
||
|
" | \n",
|
||
|
" | min(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)\n",
|
||
|
" | Return the minimum of the values for the requested axis.\n",
|
||
|
" | \n",
|
||
|
" | If you want the *index* of the minimum, use ``idxmin``. This is\n",
|
||
|
" | the equivalent of the ``numpy.ndarray`` method ``argmin``.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : {index (0)}\n",
|
||
|
" | Axis for the function to be applied on.\n",
|
||
|
" | skipna : bool, default True\n",
|
||
|
" | Exclude NA/null values when computing the result.\n",
|
||
|
" | level : int or level name, default None\n",
|
||
|
" | If the axis is a MultiIndex (hierarchical), count along a\n",
|
||
|
" | particular level, collapsing into a scalar.\n",
|
||
|
" | numeric_only : bool, default None\n",
|
||
|
" | Include only float, int, boolean columns. If None, will attempt to use\n",
|
||
|
" | everything, then use only numeric data. Not implemented for Series.\n",
|
||
|
" | **kwargs\n",
|
||
|
" | Additional keyword arguments to be passed to the function.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | min : scalar or Series (if level specified)\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.sum : Return the sum.\n",
|
||
|
" | Series.min : Return the minimum.\n",
|
||
|
" | Series.max : Return the maximum.\n",
|
||
|
" | Series.idxmin : Return the index of the minimum.\n",
|
||
|
" | Series.idxmax : Return the index of the maximum.\n",
|
||
|
" | DataFrame.min : Return the sum over the requested axis.\n",
|
||
|
" | DataFrame.min : Return the minimum over the requested axis.\n",
|
||
|
" | DataFrame.max : Return the maximum over the requested axis.\n",
|
||
|
" | DataFrame.idxmin : Return the index of the minimum over the requested axis.\n",
|
||
|
" | DataFrame.idxmax : Return the index of the maximum over the requested axis.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | \n",
|
||
|
" | >>> idx = pd.MultiIndex.from_arrays([\n",
|
||
|
" | ... ['warm', 'warm', 'cold', 'cold'],\n",
|
||
|
" | ... ['dog', 'falcon', 'fish', 'spider']],\n",
|
||
|
" | ... names=['blooded', 'animal'])\n",
" | >>> s = pd.Series([4, 2, 0, 8], name='legs', index=idx)\n",
" | >>> s\n",
" | blooded animal\n",
" | warm dog 4\n",
" | falcon 2\n",
" | cold fish 0\n",
" | spider 8\n",
" | Name: legs, dtype: int64\n",
" | \n",
" | >>> s.min()\n",
" | 0\n",
" | \n",
" | Min using level names, as well as indices.\n",
" | \n",
" | >>> s.min(level='blooded')\n",
" | blooded\n",
" | warm 2\n",
" | cold 0\n",
" | Name: legs, dtype: int64\n",
" | \n",
" | >>> s.min(level=0)\n",
" | blooded\n",
" | warm 2\n",
" | cold 0\n",
" | Name: legs, dtype: int64\n",
" | \n",
" | mod(self, other, level=None, fill_value=None, axis=0)\n",
" | Modulo of series and other, element-wise (binary operator `mod`).\n",
" | \n",
" | Equivalent to ``series % other``, but with support to substitute a fill_value for\n",
" | missing data in one of the inputs.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series or scalar value\n",
" | fill_value : None or float value, default None (NaN)\n",
" | Fill existing missing (NaN) values, and any new element needed for\n",
" | successful Series alignment, with this value before computation.\n",
" | If data in both corresponding Series locations is missing\n",
" | the result will be missing\n",
" | level : int or name\n",
" | Broadcast across a level, matching Index values on the\n",
" | passed MultiIndex level\n",
" | \n",
" | Returns\n",
" | -------\n",
" | result : Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.rmod\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
" | >>> a\n",
" | a 1.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d NaN\n",
" | dtype: float64\n",
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
" | >>> b\n",
" | a 1.0\n",
" | b NaN\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | >>> a.add(b, fill_value=0)\n",
" | a 2.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | \n",
" | mode(self, dropna=True)\n",
" | Return the mode(s) of the dataset.\n",
" | \n",
" | Always returns Series even if only one value is returned.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | dropna : boolean, default True\n",
" | Don't consider counts of NaN/NaT.\n",
" | \n",
" | .. versionadded:: 0.24.0\n",
" | \n",
" | Returns\n",
" | -------\n",
" | modes : Series (sorted)\n",
" | \n",
" | mul(self, other, level=None, fill_value=None, axis=0)\n",
" | Multiplication of series and other, element-wise (binary operator `mul`).\n",
" | \n",
" | Equivalent to ``series * other``, but with support to substitute a fill_value for\n",
" | missing data in one of the inputs.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series or scalar value\n",
" | fill_value : None or float value, default None (NaN)\n",
" | Fill existing missing (NaN) values, and any new element needed for\n",
" | successful Series alignment, with this value before computation.\n",
" | If data in both corresponding Series locations is missing\n",
" | the result will be missing\n",
" | level : int or name\n",
" | Broadcast across a level, matching Index values on the\n",
" | passed MultiIndex level\n",
" | \n",
" | Returns\n",
" | -------\n",
" | result : Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.rmul\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
" | >>> a\n",
" | a 1.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d NaN\n",
" | dtype: float64\n",
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
" | >>> b\n",
" | a 1.0\n",
" | b NaN\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | >>> a.add(b, fill_value=0)\n",
" | a 2.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | \n",
" | multiply = mul(self, other, level=None, fill_value=None, axis=0)\n",
" | \n",
" | ne(self, other, level=None, fill_value=None, axis=0)\n",
" | Not equal to of series and other, element-wise (binary operator `ne`).\n",
" | \n",
" | Equivalent to ``series != other``, but with support to substitute a fill_value for\n",
" | missing data in one of the inputs.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series or scalar value\n",
" | fill_value : None or float value, default None (NaN)\n",
" | Fill existing missing (NaN) values, and any new element needed for\n",
" | successful Series alignment, with this value before computation.\n",
" | If data in both corresponding Series locations is missing\n",
" | the result will be missing\n",
" | level : int or name\n",
" | Broadcast across a level, matching Index values on the\n",
" | passed MultiIndex level\n",
" | \n",
" | Returns\n",
" | -------\n",
" | result : Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.None\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
" | >>> a\n",
" | a 1.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d NaN\n",
" | dtype: float64\n",
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
" | >>> b\n",
" | a 1.0\n",
" | b NaN\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | >>> a.add(b, fill_value=0)\n",
" | a 2.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | \n",
" | nlargest(self, n=5, keep='first')\n",
" | Return the largest `n` elements.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | n : int, default 5\n",
" | Return this many descending sorted values.\n",
" | keep : {'first', 'last', 'all'}, default 'first'\n",
" | When there are duplicate values that cannot all fit in a\n",
" | Series of `n` elements:\n",
" | \n",
" | - ``first`` : take the first occurrences based on the index order\n",
" | - ``last`` : take the last occurrences based on the index order\n",
" | - ``all`` : keep all occurrences. This can result in a Series of\n",
" | size larger than `n`.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series\n",
" | The `n` largest values in the Series, sorted in decreasing order.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.nsmallest: Get the `n` smallest elements.\n",
" | Series.sort_values: Sort Series by values.\n",
" | Series.head: Return the first `n` rows.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | Faster than ``.sort_values(ascending=False).head(n)`` for small `n`\n",
" | relative to the size of the ``Series`` object.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> countries_population = {\"Italy\": 59000000, \"France\": 65000000,\n",
" | ... \"Malta\": 434000, \"Maldives\": 434000,\n",
" | ... \"Brunei\": 434000, \"Iceland\": 337000,\n",
" | ... \"Nauru\": 11300, \"Tuvalu\": 11300,\n",
" | ... \"Anguilla\": 11300, \"Monserat\": 5200}\n",
" | >>> s = pd.Series(countries_population)\n",
" | >>> s\n",
" | Italy 59000000\n",
" | France 65000000\n",
" | Malta 434000\n",
" | Maldives 434000\n",
" | Brunei 434000\n",
" | Iceland 337000\n",
" | Nauru 11300\n",
" | Tuvalu 11300\n",
" | Anguilla 11300\n",
" | Monserat 5200\n",
" | dtype: int64\n",
" | \n",
" | The `n` largest elements where ``n=5`` by default.\n",
" | \n",
" | >>> s.nlargest()\n",
" | France 65000000\n",
" | Italy 59000000\n",
" | Malta 434000\n",
" | Maldives 434000\n",
" | Brunei 434000\n",
" | dtype: int64\n",
" | \n",
" | The `n` largest elements where ``n=3``. Default `keep` value is 'first'\n",
" | so Malta will be kept.\n",
" | \n",
" | >>> s.nlargest(3)\n",
" | France 65000000\n",
" | Italy 59000000\n",
" | Malta 434000\n",
" | dtype: int64\n",
" | \n",
" | The `n` largest elements where ``n=3`` and keeping the last duplicates.\n",
" | Brunei will be kept since it is the last with value 434000 based on\n",
" | the index order.\n",
" | \n",
" | >>> s.nlargest(3, keep='last')\n",
" | France 65000000\n",
" | Italy 59000000\n",
" | Brunei 434000\n",
" | dtype: int64\n",
" | \n",
" | The `n` largest elements where ``n=3`` with all duplicates kept. Note\n",
" | that the returned Series has five elements due to the three duplicates.\n",
" | \n",
" | >>> s.nlargest(3, keep='all')\n",
" | France 65000000\n",
" | Italy 59000000\n",
" | Malta 434000\n",
" | Maldives 434000\n",
" | Brunei 434000\n",
" | dtype: int64\n",
" | \n",
" | nonzero(self)\n",
" | Return the *integer* indices of the elements that are non-zero.\n",
" | \n",
" | .. deprecated:: 0.24.0\n",
" | Please use .to_numpy().nonzero() as a replacement.\n",
" | \n",
" | This method is equivalent to calling `numpy.nonzero` on the\n",
" | series data. For compatibility with NumPy, the return value is\n",
" | the same (a tuple with an array of indices for each dimension),\n",
" | but it will always be a one-item tuple because series only have\n",
" | one dimension.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | numpy.nonzero\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s = pd.Series([0, 3, 0, 4])\n",
" | >>> s.nonzero()\n",
" | (array([1, 3]),)\n",
" | >>> s.iloc[s.nonzero()[0]]\n",
" | 1 3\n",
" | 3 4\n",
" | dtype: int64\n",
" | \n",
" | >>> s = pd.Series([0, 3, 0, 4], index=['a', 'b', 'c', 'd'])\n",
" | # same return although index of s is different\n",
" | >>> s.nonzero()\n",
" | (array([1, 3]),)\n",
" | >>> s.iloc[s.nonzero()[0]]\n",
" | b 3\n",
" | d 4\n",
" | dtype: int64\n",
" | \n",
" | notna(self)\n",
" | Detect existing (non-missing) values.\n",
" | \n",
" | Return a boolean same-sized object indicating if the values are not NA.\n",
" | Non-missing values get mapped to True. Characters such as empty\n",
" | strings ``''`` or :attr:`numpy.inf` are not considered NA values\n",
" | (unless you set ``pandas.options.mode.use_inf_as_na = True``).\n",
" | NA values, such as None or :attr:`numpy.NaN`, get mapped to False\n",
" | values.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series\n",
" | Mask of bool values for each element in Series that\n",
" | indicates whether an element is not an NA value.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.notnull : Alias of notna.\n",
" | Series.isna : Boolean inverse of notna.\n",
" | Series.dropna : Omit axes labels with missing values.\n",
" | notna : Top-level notna.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | Show which entries in a DataFrame are not NA.\n",
" | \n",
" | >>> df = pd.DataFrame({'age': [5, 6, np.NaN],\n",
" | ... 'born': [pd.NaT, pd.Timestamp('1939-05-27'),\n",
" | ... pd.Timestamp('1940-04-25')],\n",
" | ... 'name': ['Alfred', 'Batman', ''],\n",
" | ... 'toy': [None, 'Batmobile', 'Joker']})\n",
" | >>> df\n",
" | age born name toy\n",
" | 0 5.0 NaT Alfred None\n",
" | 1 6.0 1939-05-27 Batman Batmobile\n",
" | 2 NaN 1940-04-25 Joker\n",
" | \n",
" | >>> df.notna()\n",
" | age born name toy\n",
" | 0 True False True False\n",
" | 1 True True True True\n",
" | 2 False True True True\n",
" | \n",
" | Show which entries in a Series are not NA.\n",
" | \n",
" | >>> ser = pd.Series([5, 6, np.NaN])\n",
" | >>> ser\n",
" | 0 5.0\n",
" | 1 6.0\n",
" | 2 NaN\n",
" | dtype: float64\n",
" | \n",
" | >>> ser.notna()\n",
" | 0 True\n",
" | 1 True\n",
" | 2 False\n",
" | dtype: bool\n",
" | \n",
" | notnull(self)\n",
" | Detect existing (non-missing) values.\n",
" | \n",
" | Return a boolean same-sized object indicating if the values are not NA.\n",
" | Non-missing values get mapped to True. Characters such as empty\n",
" | strings ``''`` or :attr:`numpy.inf` are not considered NA values\n",
" | (unless you set ``pandas.options.mode.use_inf_as_na = True``).\n",
" | NA values, such as None or :attr:`numpy.NaN`, get mapped to False\n",
" | values.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series\n",
" | Mask of bool values for each element in Series that\n",
" | indicates whether an element is not an NA value.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.notnull : Alias of notna.\n",
" | Series.isna : Boolean inverse of notna.\n",
" | Series.dropna : Omit axes labels with missing values.\n",
" | notna : Top-level notna.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | Show which entries in a DataFrame are not NA.\n",
" | \n",
" | >>> df = pd.DataFrame({'age': [5, 6, np.NaN],\n",
" | ... 'born': [pd.NaT, pd.Timestamp('1939-05-27'),\n",
" | ... pd.Timestamp('1940-04-25')],\n",
" | ... 'name': ['Alfred', 'Batman', ''],\n",
" | ... 'toy': [None, 'Batmobile', 'Joker']})\n",
" | >>> df\n",
" | age born name toy\n",
" | 0 5.0 NaT Alfred None\n",
" | 1 6.0 1939-05-27 Batman Batmobile\n",
" | 2 NaN 1940-04-25 Joker\n",
" | \n",
" | >>> df.notna()\n",
" | age born name toy\n",
" | 0 True False True False\n",
" | 1 True True True True\n",
" | 2 False True True True\n",
" | \n",
" | Show which entries in a Series are not NA.\n",
" | \n",
" | >>> ser = pd.Series([5, 6, np.NaN])\n",
" | >>> ser\n",
" | 0 5.0\n",
" | 1 6.0\n",
" | 2 NaN\n",
" | dtype: float64\n",
" | \n",
" | >>> ser.notna()\n",
" | 0 True\n",
" | 1 True\n",
" | 2 False\n",
" | dtype: bool\n",
" | \n",
" | nsmallest(self, n=5, keep='first')\n",
" | Return the smallest `n` elements.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | n : int, default 5\n",
" | Return this many ascending sorted values.\n",
" | keep : {'first', 'last', 'all'}, default 'first'\n",
" | When there are duplicate values that cannot all fit in a\n",
" | Series of `n` elements:\n",
" | \n",
" | - ``first`` : take the first occurrences based on the index order\n",
" | - ``last`` : take the last occurrences based on the index order\n",
" | - ``all`` : keep all occurrences. This can result in a Series of\n",
" | size larger than `n`.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series\n",
" | The `n` smallest values in the Series, sorted in increasing order.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.nlargest: Get the `n` largest elements.\n",
" | Series.sort_values: Sort Series by values.\n",
" | Series.head: Return the first `n` rows.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | Faster than ``.sort_values().head(n)`` for small `n` relative to\n",
" | the size of the ``Series`` object.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> countries_population = {\"Italy\": 59000000, \"France\": 65000000,\n",
" | ... \"Brunei\": 434000, \"Malta\": 434000,\n",
" | ... \"Maldives\": 434000, \"Iceland\": 337000,\n",
" | ... \"Nauru\": 11300, \"Tuvalu\": 11300,\n",
" | ... \"Anguilla\": 11300, \"Monserat\": 5200}\n",
" | >>> s = pd.Series(countries_population)\n",
" | >>> s\n",
" | Italy 59000000\n",
" | France 65000000\n",
" | Brunei 434000\n",
" | Malta 434000\n",
" | Maldives 434000\n",
" | Iceland 337000\n",
" | Nauru 11300\n",
" | Tuvalu 11300\n",
" | Anguilla 11300\n",
" | Monserat 5200\n",
" | dtype: int64\n",
" | \n",
" | The `n` smallest elements where ``n=5`` by default.\n",
" | \n",
" | >>> s.nsmallest()\n",
" | Monserat 5200\n",
" | Nauru 11300\n",
" | Tuvalu 11300\n",
" | Anguilla 11300\n",
" | Iceland 337000\n",
" | dtype: int64\n",
" | \n",
" | The `n` smallest elements where ``n=3``. Default `keep` value is\n",
" | 'first' so Nauru and Tuvalu will be kept.\n",
" | \n",
" | >>> s.nsmallest(3)\n",
" | Monserat 5200\n",
" | Nauru 11300\n",
" | Tuvalu 11300\n",
" | dtype: int64\n",
" | \n",
" | The `n` smallest elements where ``n=3`` and keeping the last\n",
" | duplicates. Anguilla and Tuvalu will be kept since they are the last\n",
" | with value 11300 based on the index order.\n",
" | \n",
" | >>> s.nsmallest(3, keep='last')\n",
" | Monserat 5200\n",
" | Anguilla 11300\n",
" | Tuvalu 11300\n",
" | dtype: int64\n",
" | \n",
" | The `n` smallest elements where ``n=3`` with all duplicates kept. Note\n",
" | that the returned Series has four elements due to the three duplicates.\n",
" | \n",
" | >>> s.nsmallest(3, keep='all')\n",
" | Monserat 5200\n",
" | Nauru 11300\n",
" | Tuvalu 11300\n",
" | Anguilla 11300\n",
" | dtype: int64\n",
" | \n",
" | pow(self, other, level=None, fill_value=None, axis=0)\n",
" | Exponential power of series and other, element-wise (binary operator `pow`).\n",
" | \n",
" | Equivalent to ``series ** other``, but with support to substitute a fill_value for\n",
" | missing data in one of the inputs.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series or scalar value\n",
" | fill_value : None or float value, default None (NaN)\n",
" | Fill existing missing (NaN) values, and any new element needed for\n",
" | successful Series alignment, with this value before computation.\n",
" | If data in both corresponding Series locations is missing\n",
" | the result will be missing\n",
" | level : int or name\n",
" | Broadcast across a level, matching Index values on the\n",
" | passed MultiIndex level\n",
" | \n",
" | Returns\n",
" | -------\n",
" | result : Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.rpow\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
" | >>> a\n",
" | a 1.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d NaN\n",
" | dtype: float64\n",
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
" | >>> b\n",
" | a 1.0\n",
" | b NaN\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | >>> a.add(b, fill_value=0)\n",
" | a 2.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | \n",
" | prod(self, axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)\n",
" | Return the product of the values for the requested axis.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | axis : {index (0)}\n",
" | Axis for the function to be applied on.\n",
" | skipna : bool, default True\n",
" | Exclude NA/null values when computing the result.\n",
" | level : int or level name, default None\n",
" | If the axis is a MultiIndex (hierarchical), count along a\n",
" | particular level, collapsing into a scalar.\n",
" | numeric_only : bool, default None\n",
" | Include only float, int, boolean columns. If None, will attempt to use\n",
" | everything, then use only numeric data. Not implemented for Series.\n",
" | min_count : int, default 0\n",
" | The required number of valid values to perform the operation. If fewer than\n",
" | ``min_count`` non-NA values are present the result will be NA.\n",
" | \n",
" | .. versionadded :: 0.22.0\n",
" | \n",
" | Added with the default being 0. This means the sum of an all-NA\n",
" | or empty Series is 0, and the product of an all-NA or empty\n",
" | Series is 1.\n",
" | **kwargs\n",
" | Additional keyword arguments to be passed to the function.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | prod : scalar or Series (if level specified)\n",
" | \n",
" | Examples\n",
" | --------\n",
" | By default, the product of an empty or all-NA Series is ``1``\n",
" | \n",
" | >>> pd.Series([]).prod()\n",
" | 1.0\n",
" | \n",
" | This can be controlled with the ``min_count`` parameter\n",
" | \n",
" | >>> pd.Series([]).prod(min_count=1)\n",
" | nan\n",
" | \n",
" | Thanks to the ``skipna`` parameter, ``min_count`` handles all-NA and\n",
" | empty series identically.\n",
" | \n",
" | >>> pd.Series([np.nan]).prod()\n",
" | 1.0\n",
" | \n",
" | >>> pd.Series([np.nan]).prod(min_count=1)\n",
" | nan\n",
" | \n",
" | product = prod(self, axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)\n",
" | \n",
" | ptp(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)\n",
" | Returns the difference between the maximum value and the\n",
" | minimum value in the object. This is the equivalent of the\n",
" | ``numpy.ndarray`` method ``ptp``.\n",
" | \n",
" | .. deprecated:: 0.24.0\n",
" | Use numpy.ptp instead\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | axis : {index (0)}\n",
" | Axis for the function to be applied on.\n",
" | skipna : bool, default True\n",
" | Exclude NA/null values when computing the result.\n",
" | level : int or level name, default None\n",
" | If the axis is a MultiIndex (hierarchical), count along a\n",
" | particular level, collapsing into a scalar.\n",
" | numeric_only : bool, default None\n",
" | Include only float, int, boolean columns. If None, will attempt to use\n",
" | everything, then use only numeric data. Not implemented for Series.\n",
" | **kwargs\n",
" | Additional keyword arguments to be passed to the function.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | ptp : scalar or Series (if level specified)\n",
" | \n",
" | put(self, *args, **kwargs)\n",
" | Applies the `put` method to its `values` attribute if it has one.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | numpy.ndarray.put\n",
" | \n",
" | quantile(self, q=0.5, interpolation='linear')\n",
" | Return value at the given quantile.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | q : float or array-like, default 0.5 (50% quantile)\n",
" | 0 <= q <= 1, the quantile(s) to compute\n",
" | interpolation : {'linear', 'lower', 'higher', 'midpoint', 'nearest'}\n",
" | .. versionadded:: 0.18.0\n",
" | \n",
" | This optional parameter specifies the interpolation method to use,\n",
" | when the desired quantile lies between two data points `i` and `j`:\n",
" | \n",
" | * linear: `i + (j - i) * fraction`, where `fraction` is the\n",
" | fractional part of the index surrounded by `i` and `j`.\n",
" | * lower: `i`.\n",
" | * higher: `j`.\n",
" | * nearest: `i` or `j` whichever is nearest.\n",
" | * midpoint: (`i` + `j`) / 2.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | quantile : float or Series\n",
" | if ``q`` is an array, a Series will be returned where the\n",
" | index is ``q`` and the values are the quantiles.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | core.window.Rolling.quantile\n",
" | numpy.percentile\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s = pd.Series([1, 2, 3, 4])\n",
" | >>> s.quantile(.5)\n",
" | 2.5\n",
" | >>> s.quantile([.25, .5, .75])\n",
" | 0.25 1.75\n",
" | 0.50 2.50\n",
" | 0.75 3.25\n",
" | dtype: float64\n",
" | \n",
" | radd(self, other, level=None, fill_value=None, axis=0)\n",
" | Addition of series and other, element-wise (binary operator `radd`).\n",
" | \n",
" | Equivalent to ``other + series``, but with support to substitute a fill_value for\n",
" | missing data in one of the inputs.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series or scalar value\n",
" | fill_value : None or float value, default None (NaN)\n",
" | Fill existing missing (NaN) values, and any new element needed for\n",
" | successful Series alignment, with this value before computation.\n",
" | If data in both corresponding Series locations is missing\n",
" | the result will be missing\n",
" | level : int or name\n",
" | Broadcast across a level, matching Index values on the\n",
" | passed MultiIndex level\n",
" | \n",
" | Returns\n",
" | -------\n",
" | result : Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.add\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
" | >>> a\n",
" | a 1.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d NaN\n",
" | dtype: float64\n",
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
" | >>> b\n",
" | a 1.0\n",
" | b NaN\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | >>> a.add(b, fill_value=0)\n",
" | a 2.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | \n",
" | ravel(self, order='C')\n",
" | Return the flattened underlying data as an ndarray.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | numpy.ndarray.ravel\n",
" | \n",
" | rdiv = rtruediv(self, other, level=None, fill_value=None, axis=0)\n",
" | \n",
" | rdivmod(self, other, level=None, fill_value=None, axis=0)\n",
" | Integer division and modulo of series and other, element-wise (binary operator `rdivmod`).\n",
" | \n",
" | Equivalent to ``other divmod series``, but with support to substitute a fill_value for\n",
" | missing data in one of the inputs.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Series or scalar value\n",
" | fill_value : None or float value, default None (NaN)\n",
" | Fill existing missing (NaN) values, and any new element needed for\n",
" | successful Series alignment, with this value before computation.\n",
" | If data in both corresponding Series locations is missing\n",
" | the result will be missing\n",
" | level : int or name\n",
" | Broadcast across a level, matching Index values on the\n",
" | passed MultiIndex level\n",
" | \n",
" | Returns\n",
" | -------\n",
" | result : Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.divmod\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
" | >>> a\n",
" | a 1.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d NaN\n",
" | dtype: float64\n",
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
" | >>> b\n",
" | a 1.0\n",
" | b NaN\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | >>> a.add(b, fill_value=0)\n",
" | a 2.0\n",
" | b 1.0\n",
" | c 1.0\n",
" | d 1.0\n",
" | e NaN\n",
" | dtype: float64\n",
" | \n",
" | reindex(self, index=None, **kwargs)\n",
" | Conform Series to new index with optional filling logic, placing\n",
" | NA/NaN in locations having no value in the previous index. A new object\n",
" | is produced unless the new index is equivalent to the current one and\n",
" | ``copy=False``.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | \n",
" | index : array-like, optional\n",
" | New labels / index to conform to, should be specified using\n",
" | keywords. Preferably an Index object to avoid duplicating data\n",
" | \n",
" | method : {None, 'backfill'/'bfill', 'pad'/'ffill', 'nearest'}\n",
" | Method to use for filling holes in reindexed DataFrame.\n",
" | Please note: this is only applicable to DataFrames/Series with a\n",
" | monotonically increasing/decreasing index.\n",
" | \n",
" | * None (default): don't fill gaps\n",
" | * pad / ffill: propagate last valid observation forward to next\n",
" | valid\n",
" | * backfill / bfill: use next valid observation to fill gap\n",
" | * nearest: use nearest valid observations to fill gap\n",
" | \n",
" | copy : bool, default True\n",
" | Return a new object, even if the passed indexes are the same.\n",
" | level : int or name\n",
" | Broadcast across a level, matching Index values on the\n",
" | passed MultiIndex level.\n",
" | fill_value : scalar, default np.NaN\n",
" | Value to use for missing values. Defaults to NaN, but can be any\n",
" | \"compatible\" value.\n",
" | limit : int, default None\n",
" | Maximum number of consecutive elements to forward or backward fill.\n",
" | tolerance : optional\n",
" | Maximum distance between original and new labels for inexact\n",
" | matches. The values of the index at the matching locations must\n",
" | satisfy the equation ``abs(index[indexer] - target) <= tolerance``.\n",
" | \n",
" | Tolerance may be a scalar value, which applies the same tolerance\n",
" | to all values, or list-like, which applies variable tolerance per\n",
" | element. List-like includes list, tuple, array, Series, and must be\n",
" | the same size as the index and its dtype must exactly match the\n",
" | index's type.\n",
" | \n",
" | .. versionadded:: 0.21.0 (list-like tolerance)\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series with changed index.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | DataFrame.set_index : Set row labels.\n",
" | DataFrame.reset_index : Remove row labels or move them to new columns.\n",
" | DataFrame.reindex_like : Change to same indices as other DataFrame.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | \n",
" | ``DataFrame.reindex`` supports two calling conventions\n",
" | \n",
" | * ``(index=index_labels, columns=column_labels, ...)``\n",
" | * ``(labels, axis={'index', 'columns'}, ...)``\n",
" | \n",
" | We *highly* recommend using keyword arguments to clarify your\n",
" | intent.\n",
" | \n",
" | Create a dataframe with some fictional data.\n",
" | \n",
" | >>> index = ['Firefox', 'Chrome', 'Safari', 'IE10', 'Konqueror']\n",
" | >>> df = pd.DataFrame({\n",
" | ... 'http_status': [200,200,404,404,301],\n",
" | ... 'response_time': [0.04, 0.02, 0.07, 0.08, 1.0]},\n",
" | ... index=index)\n",
" | >>> df\n",
" | http_status response_time\n",
" | Firefox 200 0.04\n",
" | Chrome 200 0.02\n",
" | Safari 404 0.07\n",
" | IE10 404 0.08\n",
" | Konqueror 301 1.00\n",
" | \n",
" | Create a new index and reindex the dataframe. By default\n",
" | values in the new index that do not have corresponding\n",
" | records in the dataframe are assigned ``NaN``.\n",
" | \n",
" | >>> new_index= ['Safari', 'Iceweasel', 'Comodo Dragon', 'IE10',\n",
" | ... 'Chrome']\n",
" | >>> df.reindex(new_index)\n",
" | http_status response_time\n",
" | Safari 404.0 0.07\n",
" | Iceweasel NaN NaN\n",
" | Comodo Dragon NaN NaN\n",
" | IE10 404.0 0.08\n",
" | Chrome 200.0 0.02\n",
" | \n",
" | We can fill in the missing values by passing a value to\n",
" | the keyword ``fill_value``. Because the index is not monotonically\n",
" | increasing or decreasing, we cannot use arguments to the keyword\n",
" | ``method`` to fill the ``NaN`` values.\n",
" | \n",
" | >>> df.reindex(new_index, fill_value=0)\n",
" | http_status response_time\n",
" | Safari 404 0.07\n",
" | Iceweasel 0 0.00\n",
" | Comodo Dragon 0 0.00\n",
" | IE10 404 0.08\n",
" | Chrome 200 0.02\n",
" | \n",
" | >>> df.reindex(new_index, fill_value='missing')\n",
|
||
|
" | http_status response_time\n",
|
||
|
" | Safari 404 0.07\n",
|
||
|
" | Iceweasel missing missing\n",
|
||
|
" | Comodo Dragon missing missing\n",
|
||
|
" | IE10 404 0.08\n",
|
||
|
" | Chrome 200 0.02\n",
|
||
|
" | \n",
|
||
|
" | We can also reindex the columns.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.reindex(columns=['http_status', 'user_agent'])\n",
|
||
|
" | http_status user_agent\n",
|
||
|
" | Firefox 200 NaN\n",
|
||
|
" | Chrome 200 NaN\n",
|
||
|
" | Safari 404 NaN\n",
|
||
|
" | IE10 404 NaN\n",
|
||
|
" | Konqueror 301 NaN\n",
|
||
|
" | \n",
|
||
|
" | Or we can use \"axis-style\" keyword arguments\n",
|
||
|
" | \n",
|
||
|
" | >>> df.reindex(['http_status', 'user_agent'], axis=\"columns\")\n",
|
||
|
" | http_status user_agent\n",
|
||
|
" | Firefox 200 NaN\n",
|
||
|
" | Chrome 200 NaN\n",
|
||
|
" | Safari 404 NaN\n",
|
||
|
" | IE10 404 NaN\n",
|
||
|
" | Konqueror 301 NaN\n",
|
||
|
" | \n",
|
||
|
" | To further illustrate the filling functionality in\n",
|
||
|
" | ``reindex``, we will create a dataframe with a\n",
|
||
|
" | monotonically increasing index (for example, a sequence\n",
|
||
|
" | of dates).\n",
|
||
|
" | \n",
|
||
|
" | >>> date_index = pd.date_range('1/1/2010', periods=6, freq='D')\n",
|
||
|
" | >>> df2 = pd.DataFrame({\"prices\": [100, 101, np.nan, 100, 89, 88]},\n",
|
||
|
" | ... index=date_index)\n",
|
||
|
" | >>> df2\n",
|
||
|
" | prices\n",
|
||
|
" | 2010-01-01 100.0\n",
|
||
|
" | 2010-01-02 101.0\n",
|
||
|
" | 2010-01-03 NaN\n",
|
||
|
" | 2010-01-04 100.0\n",
|
||
|
" | 2010-01-05 89.0\n",
|
||
|
" | 2010-01-06 88.0\n",
|
||
|
" | \n",
|
||
|
" | Suppose we decide to expand the dataframe to cover a wider\n",
|
||
|
" | date range.\n",
|
||
|
" | \n",
|
||
|
" | >>> date_index2 = pd.date_range('12/29/2009', periods=10, freq='D')\n",
|
||
|
" | >>> df2.reindex(date_index2)\n",
|
||
|
" | prices\n",
|
||
|
" | 2009-12-29 NaN\n",
|
||
|
" | 2009-12-30 NaN\n",
|
||
|
" | 2009-12-31 NaN\n",
|
||
|
" | 2010-01-01 100.0\n",
|
||
|
" | 2010-01-02 101.0\n",
|
||
|
" | 2010-01-03 NaN\n",
|
||
|
" | 2010-01-04 100.0\n",
|
||
|
" | 2010-01-05 89.0\n",
|
||
|
" | 2010-01-06 88.0\n",
|
||
|
" | 2010-01-07 NaN\n",
|
||
|
" | \n",
|
||
|
" | The index entries that did not have a value in the original data frame\n",
|
||
|
" | (for example, '2009-12-29') are by default filled with ``NaN``.\n",
|
||
|
" | If desired, we can fill in the missing values using one of several\n",
|
||
|
" | options.\n",
|
||
|
" | \n",
|
||
|
" | For example, to back-propagate the last valid value to fill the ``NaN``\n",
|
||
|
" | values, pass ``bfill`` as an argument to the ``method`` keyword.\n",
|
||
|
" | \n",
|
||
|
" | >>> df2.reindex(date_index2, method='bfill')\n",
|
||
|
" | prices\n",
|
||
|
" | 2009-12-29 100.0\n",
|
||
|
" | 2009-12-30 100.0\n",
|
||
|
" | 2009-12-31 100.0\n",
|
||
|
" | 2010-01-01 100.0\n",
|
||
|
" | 2010-01-02 101.0\n",
|
||
|
" | 2010-01-03 NaN\n",
|
||
|
" | 2010-01-04 100.0\n",
|
||
|
" | 2010-01-05 89.0\n",
|
||
|
" | 2010-01-06 88.0\n",
|
||
|
" | 2010-01-07 NaN\n",
|
||
|
" | \n",
|
||
|
" | Please note that the ``NaN`` value present in the original dataframe\n",
|
||
|
" | (at index value 2010-01-03) will not be filled by any of the\n",
|
||
|
" | value propagation schemes. This is because filling while reindexing\n",
|
||
|
" | does not look at dataframe values, but only compares the original and\n",
|
||
|
" | desired indexes. If you do want to fill in the ``NaN`` values present\n",
|
||
|
" | in the original dataframe, use the ``fillna()`` method.\n",
|
||
|
" | \n",
|
||
|
" | See the :ref:`user guide <basics.reindexing>` for more.\n",
|
||
|
" | \n",
|
||
|
" | reindex_axis(self, labels, axis=0, **kwargs)\n",
|
||
|
" | Conform Series to new index with optional filling logic.\n",
|
||
|
" | \n",
|
||
|
" | .. deprecated:: 0.21.0\n",
|
||
|
" | Use ``Series.reindex`` instead.\n",
|
||
|
" | \n",
|
||
|
" | rename(self, index=None, **kwargs)\n",
|
||
|
" | Alter Series index labels or name.\n",
|
||
|
" | \n",
|
||
|
" | Function / dict values must be unique (1-to-1). Labels not contained in\n",
|
||
|
" | a dict / Series will be left as-is. Extra labels listed don't throw an\n",
|
||
|
" | error.\n",
|
||
|
" | \n",
|
||
|
" | Alternatively, change ``Series.name`` with a scalar value.\n",
|
||
|
" | \n",
|
||
|
" | See the :ref:`user guide <basics.rename>` for more.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | index : scalar, hashable sequence, dict-like or function, optional\n",
|
||
|
" | dict-like or functions are transformations to apply to\n",
|
||
|
" | the index.\n",
|
||
|
" | Scalar or hashable sequence-like will alter the ``Series.name``\n",
|
||
|
" | attribute.\n",
|
||
|
" | copy : bool, default True\n",
|
||
|
" | Also copy underlying data\n",
|
||
|
" | inplace : bool, default False\n",
|
||
|
" | Whether to return a new Series. If True then value of copy is\n",
|
||
|
" | ignored.\n",
|
||
|
" | level : int or level name, default None\n",
|
||
|
" | In case of a MultiIndex, only rename labels in the specified\n",
|
||
|
" | level.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | renamed : Series (new object)\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.rename_axis\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series([1, 2, 3])\n",
|
||
|
" | >>> s\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | 2 3\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | >>> s.rename(\"my_name\") # scalar, changes Series.name\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | 2 3\n",
|
||
|
" | Name: my_name, dtype: int64\n",
|
||
|
" | >>> s.rename(lambda x: x ** 2) # function, changes labels\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | 4 3\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | >>> s.rename({1: 3, 2: 5}) # mapping, changes labels\n",
|
||
|
" | 0 1\n",
|
||
|
" | 3 2\n",
|
||
|
" | 5 3\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | reorder_levels(self, order)\n",
|
||
|
" | Rearrange index levels using input order.\n",
|
||
|
" | \n",
|
||
|
" | May not drop or duplicate levels.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | order : list of int representing new level order\n",
|
||
|
" | (reference level by number or key)\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | type of caller (new object)\n",
|
||
|
" | \n",
|
||
|
" | repeat(self, repeats, axis=None)\n",
|
||
|
" | Repeat elements of a Series.\n",
|
||
|
" | \n",
|
||
|
" | Returns a new Series where each element of the current Series\n",
|
||
|
" | is repeated consecutively a given number of times.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | repeats : int or array of ints\n",
|
||
|
" | The number of repetitions for each element. This should be a\n",
|
||
|
" | non-negative integer. Repeating 0 times will return an empty\n",
|
||
|
" | Series.\n",
|
||
|
" | axis : None\n",
|
||
|
" | Must be ``None``. Has no effect but is accepted for compatibility\n",
|
||
|
" | with numpy.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | repeated_series : Series\n",
|
||
|
" | Newly created Series with repeated elements.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Index.repeat : Equivalent function for Index.\n",
|
||
|
" | numpy.repeat : Similar method for :class:`numpy.ndarray`.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series(['a', 'b', 'c'])\n",
|
||
|
" | >>> s\n",
|
||
|
" | 0 a\n",
|
||
|
" | 1 b\n",
|
||
|
" | 2 c\n",
|
||
|
" | dtype: object\n",
|
||
|
" | >>> s.repeat(2)\n",
|
||
|
" | 0 a\n",
|
||
|
" | 0 a\n",
|
||
|
" | 1 b\n",
|
||
|
" | 1 b\n",
|
||
|
" | 2 c\n",
|
||
|
" | 2 c\n",
|
||
|
" | dtype: object\n",
|
||
|
" | >>> s.repeat([1, 2, 3])\n",
|
||
|
" | 0 a\n",
|
||
|
" | 1 b\n",
|
||
|
" | 1 b\n",
|
||
|
" | 2 c\n",
|
||
|
" | 2 c\n",
|
||
|
" | 2 c\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | replace(self, to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')\n",
|
||
|
" | Replace values given in `to_replace` with `value`.\n",
|
||
|
" | \n",
|
||
|
" | Values of the Series are replaced with other values dynamically.\n",
|
||
|
" | This differs from updating with ``.loc`` or ``.iloc``, which require\n",
|
||
|
" | you to specify a location to update with some value.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | to_replace : str, regex, list, dict, Series, int, float, or None\n",
|
||
|
" | How to find the values that will be replaced.\n",
|
||
|
" | \n",
|
||
|
" | * numeric, str or regex:\n",
|
||
|
" | \n",
|
||
|
" | - numeric: numeric values equal to `to_replace` will be\n",
|
||
|
" | replaced with `value`\n",
|
||
|
" | - str: string exactly matching `to_replace` will be replaced\n",
|
||
|
" | with `value`\n",
|
||
|
" | - regex: regexs matching `to_replace` will be replaced with\n",
|
||
|
" | `value`\n",
|
||
|
" | \n",
|
||
|
" | * list of str, regex, or numeric:\n",
|
||
|
" | \n",
|
||
|
" | - First, if `to_replace` and `value` are both lists, they\n",
|
||
|
" | **must** be the same length.\n",
|
||
|
" | - Second, if ``regex=True`` then all of the strings in **both**\n",
|
||
|
" | lists will be interpreted as regexs otherwise they will match\n",
|
||
|
" | directly. This doesn't matter much for `value` since there\n",
|
||
|
" | are only a few possible substitution regexes you can use.\n",
|
||
|
" | - str, regex and numeric rules apply as above.\n",
|
||
|
" | \n",
|
||
|
" | * dict:\n",
|
||
|
" | \n",
|
||
|
" | - Dicts can be used to specify different replacement values\n",
|
||
|
" | for different existing values. For example,\n",
|
||
|
" | ``{'a': 'b', 'y': 'z'}`` replaces the value 'a' with 'b' and\n",
|
||
|
" | 'y' with 'z'. To use a dict in this way the `value`\n",
|
||
|
" | parameter should be `None`.\n",
|
||
|
" | - For a DataFrame a dict can specify that different values\n",
|
||
|
" | should be replaced in different columns. For example,\n",
|
||
|
" | ``{'a': 1, 'b': 'z'}`` looks for the value 1 in column 'a'\n",
|
||
|
" | and the value 'z' in column 'b' and replaces these values\n",
|
||
|
" | with whatever is specified in `value`. The `value` parameter\n",
|
||
|
" | should not be ``None`` in this case. You can treat this as a\n",
|
||
|
" | special case of passing two lists except that you are\n",
|
||
|
" | specifying the column to search in.\n",
|
||
|
" | - For a DataFrame nested dictionaries, e.g.,\n",
|
||
|
" | ``{'a': {'b': np.nan}}``, are read as follows: look in column\n",
|
||
|
" | 'a' for the value 'b' and replace it with NaN. The `value`\n",
|
||
|
" | parameter should be ``None`` to use a nested dict in this\n",
|
||
|
" | way. You can nest regular expressions as well. Note that\n",
|
||
|
" | column names (the top-level dictionary keys in a nested\n",
|
||
|
" | dictionary) **cannot** be regular expressions.\n",
|
||
|
" | \n",
|
||
|
" | * None:\n",
|
||
|
" | \n",
|
||
|
" | - This means that the `regex` argument must be a string,\n",
|
||
|
" | compiled regular expression, or list, dict, ndarray or\n",
|
||
|
" | Series of such elements. If `value` is also ``None`` then\n",
|
||
|
" | this **must** be a nested dictionary or Series.\n",
|
||
|
" | \n",
|
||
|
" | See the examples section for examples of each of these.\n",
|
||
|
" | value : scalar, dict, list, str, regex, default None\n",
|
||
|
" | Value to replace any values matching `to_replace` with.\n",
|
||
|
" | For a DataFrame a dict of values can be used to specify which\n",
|
||
|
" | value to use for each column (columns not in the dict will not be\n",
|
||
|
" | filled). Regular expressions, strings and lists or dicts of such\n",
|
||
|
" | objects are also allowed.\n",
|
||
|
" | inplace : bool, default False\n",
|
||
|
" | If True, in place. Note: this will modify any\n",
|
||
|
" | other views on this object (e.g. a column from a DataFrame).\n",
|
||
|
" | Returns the caller if this is True.\n",
|
||
|
" | limit : int, default None\n",
|
||
|
" | Maximum size gap to forward or backward fill.\n",
|
||
|
" | regex : bool or same types as `to_replace`, default False\n",
|
||
|
" | Whether to interpret `to_replace` and/or `value` as regular\n",
|
||
|
" | expressions. If this is ``True`` then `to_replace` *must* be a\n",
|
||
|
" | string. Alternatively, this could be a regular expression or a\n",
|
||
|
" | list, dict, or array of regular expressions in which case\n",
|
||
|
" | `to_replace` must be ``None``.\n",
|
||
|
" | method : {'pad', 'ffill', 'bfill', `None`}\n",
|
||
|
" | The method to use when for replacement, when `to_replace` is a\n",
|
||
|
" | scalar, list or tuple and `value` is ``None``.\n",
|
||
|
" | \n",
|
||
|
" | .. versionchanged:: 0.23.0\n",
|
||
|
" | Added to DataFrame.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series\n",
|
||
|
" | Object after replacement.\n",
|
||
|
" | \n",
|
||
|
" | Raises\n",
|
||
|
" | ------\n",
|
||
|
" | AssertionError\n",
|
||
|
" | * If `regex` is not a ``bool`` and `to_replace` is not\n",
|
||
|
" | ``None``.\n",
|
||
|
" | TypeError\n",
|
||
|
" | * If `to_replace` is a ``dict`` and `value` is not a ``list``,\n",
|
||
|
" | ``dict``, ``ndarray``, or ``Series``\n",
|
||
|
" | * If `to_replace` is ``None`` and `regex` is not compilable\n",
|
||
|
" | into a regular expression or is a list, dict, ndarray, or\n",
|
||
|
" | Series.\n",
|
||
|
" | * When replacing multiple ``bool`` or ``datetime64`` objects and\n",
|
||
|
" | the arguments to `to_replace` does not match the type of the\n",
|
||
|
" | value being replaced\n",
|
||
|
" | ValueError\n",
|
||
|
" | * If a ``list`` or an ``ndarray`` is passed to `to_replace` and\n",
|
||
|
" | `value` but they are not the same length.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.fillna : Fill NA values.\n",
|
||
|
" | Series.where : Replace values based on boolean condition.\n",
|
||
|
" | Series.str.replace : Simple string replacement.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | * Regex substitution is performed under the hood with ``re.sub``. The\n",
|
||
|
" | rules for substitution for ``re.sub`` are the same.\n",
|
||
|
" | * Regular expressions will only substitute on strings, meaning you\n",
|
||
|
" | cannot provide, for example, a regular expression matching floating\n",
|
||
|
" | point numbers and expect the columns in your frame that have a\n",
|
||
|
" | numeric dtype to be matched. However, if those floating point\n",
|
||
|
" | numbers *are* strings, then you can do this.\n",
|
||
|
" | * This method has *a lot* of options. You are encouraged to experiment\n",
|
||
|
" | and play with this method to gain intuition about how it works.\n",
|
||
|
" | * When dict is used as the `to_replace` value, it is like\n",
|
||
|
" | key(s) in the dict are the to_replace part and\n",
|
||
|
" | value(s) in the dict are the value parameter.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | \n",
|
||
|
" | **Scalar `to_replace` and `value`**\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series([0, 1, 2, 3, 4])\n",
|
||
|
" | >>> s.replace(0, 5)\n",
|
||
|
" | 0 5\n",
|
||
|
" | 1 1\n",
|
||
|
" | 2 2\n",
|
||
|
" | 3 3\n",
|
||
|
" | 4 4\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({'A': [0, 1, 2, 3, 4],\n",
|
||
|
" | ... 'B': [5, 6, 7, 8, 9],\n",
|
||
|
" | ... 'C': ['a', 'b', 'c', 'd', 'e']})\n",
|
||
|
" | >>> df.replace(0, 5)\n",
|
||
|
" | A B C\n",
|
||
|
" | 0 5 5 a\n",
|
||
|
" | 1 1 6 b\n",
|
||
|
" | 2 2 7 c\n",
|
||
|
" | 3 3 8 d\n",
|
||
|
" | 4 4 9 e\n",
|
||
|
" | \n",
|
||
|
" | **List-like `to_replace`**\n",
|
||
|
" | \n",
|
||
|
" | >>> df.replace([0, 1, 2, 3], 4)\n",
|
||
|
" | A B C\n",
|
||
|
" | 0 4 5 a\n",
|
||
|
" | 1 4 6 b\n",
|
||
|
" | 2 4 7 c\n",
|
||
|
" | 3 4 8 d\n",
|
||
|
" | 4 4 9 e\n",
|
||
|
" | \n",
|
||
|
" | >>> df.replace([0, 1, 2, 3], [4, 3, 2, 1])\n",
|
||
|
" | A B C\n",
|
||
|
" | 0 4 5 a\n",
|
||
|
" | 1 3 6 b\n",
|
||
|
" | 2 2 7 c\n",
|
||
|
" | 3 1 8 d\n",
|
||
|
" | 4 4 9 e\n",
|
||
|
" | \n",
|
||
|
" | >>> s.replace([1, 2], method='bfill')\n",
|
||
|
" | 0 0\n",
|
||
|
" | 1 3\n",
|
||
|
" | 2 3\n",
|
||
|
" | 3 3\n",
|
||
|
" | 4 4\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | **dict-like `to_replace`**\n",
|
||
|
" | \n",
|
||
|
" | >>> df.replace({0: 10, 1: 100})\n",
|
||
|
" | A B C\n",
|
||
|
" | 0 10 5 a\n",
|
||
|
" | 1 100 6 b\n",
|
||
|
" | 2 2 7 c\n",
|
||
|
" | 3 3 8 d\n",
|
||
|
" | 4 4 9 e\n",
|
||
|
" | \n",
|
||
|
" | >>> df.replace({'A': 0, 'B': 5}, 100)\n",
|
||
|
" | A B C\n",
|
||
|
" | 0 100 100 a\n",
|
||
|
" | 1 1 6 b\n",
|
||
|
" | 2 2 7 c\n",
|
||
|
" | 3 3 8 d\n",
|
||
|
" | 4 4 9 e\n",
|
||
|
" | \n",
|
||
|
" | >>> df.replace({'A': {0: 100, 4: 400}})\n",
|
||
|
" | A B C\n",
|
||
|
" | 0 100 5 a\n",
|
||
|
" | 1 1 6 b\n",
|
||
|
" | 2 2 7 c\n",
|
||
|
" | 3 3 8 d\n",
|
||
|
" | 4 400 9 e\n",
|
||
|
" | \n",
|
||
|
" | **Regular expression `to_replace`**\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({'A': ['bat', 'foo', 'bait'],\n",
|
||
|
" | ... 'B': ['abc', 'bar', 'xyz']})\n",
|
||
|
" | >>> df.replace(to_replace=r'^ba.$', value='new', regex=True)\n",
|
||
|
" | A B\n",
|
||
|
" | 0 new abc\n",
|
||
|
" | 1 foo new\n",
|
||
|
" | 2 bait xyz\n",
|
||
|
" | \n",
|
||
|
" | >>> df.replace({'A': r'^ba.$'}, {'A': 'new'}, regex=True)\n",
|
||
|
" | A B\n",
|
||
|
" | 0 new abc\n",
|
||
|
" | 1 foo bar\n",
|
||
|
" | 2 bait xyz\n",
|
||
|
" | \n",
|
||
|
" | >>> df.replace(regex=r'^ba.$', value='new')\n",
|
||
|
" | A B\n",
|
||
|
" | 0 new abc\n",
|
||
|
" | 1 foo new\n",
|
||
|
" | 2 bait xyz\n",
|
||
|
" | \n",
|
||
|
" | >>> df.replace(regex={r'^ba.$': 'new', 'foo': 'xyz'})\n",
|
||
|
" | A B\n",
|
||
|
" | 0 new abc\n",
|
||
|
" | 1 xyz new\n",
|
||
|
" | 2 bait xyz\n",
|
||
|
" | \n",
|
||
|
" | >>> df.replace(regex=[r'^ba.$', 'foo'], value='new')\n",
|
||
|
" | A B\n",
|
||
|
" | 0 new abc\n",
|
||
|
" | 1 new new\n",
|
||
|
" | 2 bait xyz\n",
|
||
|
" | \n",
|
||
|
" | Note that when replacing multiple ``bool`` or ``datetime64`` objects,\n",
|
||
|
" | the data types in the `to_replace` parameter must match the data\n",
|
||
|
" | type of the value being replaced:\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({'A': [True, False, True],\n",
|
||
|
" | ... 'B': [False, True, False]})\n",
|
||
|
" | >>> df.replace({'a string': 'new value', True: False}) # raises\n",
|
||
|
" | Traceback (most recent call last):\n",
|
||
|
" | ...\n",
|
||
|
" | TypeError: Cannot compare types 'ndarray(dtype=bool)' and 'str'\n",
|
||
|
" | \n",
|
||
|
" | This raises a ``TypeError`` because one of the ``dict`` keys is not of\n",
|
||
|
" | the correct type for replacement.\n",
|
||
|
" | \n",
|
||
|
" | Compare the behavior of ``s.replace({'a': None})`` and\n",
|
||
|
" | ``s.replace('a', None)`` to understand the peculiarities\n",
|
||
|
" | of the `to_replace` parameter:\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series([10, 'a', 'a', 'b', 'a'])\n",
|
||
|
" | \n",
|
||
|
" | When one uses a dict as the `to_replace` value, it is like the\n",
|
||
|
" | value(s) in the dict are equal to the `value` parameter.\n",
|
||
|
" | ``s.replace({'a': None})`` is equivalent to\n",
|
||
|
" | ``s.replace(to_replace={'a': None}, value=None, method=None)``:\n",
|
||
|
" | \n",
|
||
|
" | >>> s.replace({'a': None})\n",
|
||
|
" | 0 10\n",
|
||
|
" | 1 None\n",
|
||
|
" | 2 None\n",
|
||
|
" | 3 b\n",
|
||
|
" | 4 None\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | When ``value=None`` and `to_replace` is a scalar, list or\n",
|
||
|
" | tuple, `replace` uses the method parameter (default 'pad') to do the\n",
|
||
|
" | replacement. So this is why the 'a' values are being replaced by 10\n",
|
||
|
" | in rows 1 and 2 and 'b' in row 4 in this case.\n",
|
||
|
" | The command ``s.replace('a', None)`` is actually equivalent to\n",
|
||
|
" | ``s.replace(to_replace='a', value=None, method='pad')``:\n",
|
||
|
" | \n",
|
||
|
" | >>> s.replace('a', None)\n",
|
||
|
" | 0 10\n",
|
||
|
" | 1 10\n",
|
||
|
" | 2 10\n",
|
||
|
" | 3 b\n",
|
||
|
" | 4 b\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | reset_index(self, level=None, drop=False, name=None, inplace=False)\n",
|
||
|
" | Generate a new DataFrame or Series with the index reset.\n",
|
||
|
" | \n",
|
||
|
" | This is useful when the index needs to be treated as a column, or\n",
|
||
|
" | when the index is meaningless and needs to be reset to the default\n",
|
||
|
" | before another operation.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | level : int, str, tuple, or list, default optional\n",
|
||
|
" | For a Series with a MultiIndex, only remove the specified levels\n",
|
||
|
" | from the index. Removes all levels by default.\n",
|
||
|
" | drop : bool, default False\n",
|
||
|
" | Just reset the index, without inserting it as a column in\n",
|
||
|
" | the new DataFrame.\n",
|
||
|
" | name : object, optional\n",
|
||
|
" | The name to use for the column containing the original Series\n",
|
||
|
" | values. Uses ``self.name`` by default. This argument is ignored\n",
|
||
|
" | when `drop` is True.\n",
|
||
|
" | inplace : bool, default False\n",
|
||
|
" | Modify the Series in place (do not create a new object).\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series or DataFrame\n",
|
||
|
" | When `drop` is False (the default), a DataFrame is returned.\n",
|
||
|
" | The newly created columns will come first in the DataFrame,\n",
|
||
|
" | followed by the original Series values.\n",
|
||
|
" | When `drop` is True, a `Series` is returned.\n",
|
||
|
" | In either case, if ``inplace=True``, no value is returned.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | DataFrame.reset_index: Analogous function for DataFrame.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series([1, 2, 3, 4], name='foo',\n",
|
||
|
" | ... index=pd.Index(['a', 'b', 'c', 'd'], name='idx'))\n",
|
||
|
" | \n",
|
||
|
" | Generate a DataFrame with default index.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.reset_index()\n",
|
||
|
" | idx foo\n",
|
||
|
" | 0 a 1\n",
|
||
|
" | 1 b 2\n",
|
||
|
" | 2 c 3\n",
|
||
|
" | 3 d 4\n",
|
||
|
" | \n",
|
||
|
" | To specify the name of the new column use `name`.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.reset_index(name='values')\n",
|
||
|
" | idx values\n",
|
||
|
" | 0 a 1\n",
|
||
|
" | 1 b 2\n",
|
||
|
" | 2 c 3\n",
|
||
|
" | 3 d 4\n",
|
||
|
" | \n",
|
||
|
" | To generate a new Series with the default set `drop` to True.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.reset_index(drop=True)\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | 2 3\n",
|
||
|
" | 3 4\n",
|
||
|
" | Name: foo, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | To update the Series in place, without generating a new one\n",
|
||
|
" | set `inplace` to True. Note that it also requires ``drop=True``.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.reset_index(inplace=True, drop=True)\n",
|
||
|
" | >>> s\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | 2 3\n",
|
||
|
" | 3 4\n",
|
||
|
" | Name: foo, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | The `level` parameter is interesting for Series with a multi-level\n",
|
||
|
" | index.\n",
|
||
|
" | \n",
|
||
|
" | >>> arrays = [np.array(['bar', 'bar', 'baz', 'baz']),\n",
|
||
|
" | ... np.array(['one', 'two', 'one', 'two'])]\n",
|
||
|
" | >>> s2 = pd.Series(\n",
|
||
|
" | ... range(4), name='foo',\n",
|
||
|
" | ... index=pd.MultiIndex.from_arrays(arrays,\n",
|
||
|
" | ... names=['a', 'b']))\n",
|
||
|
" | \n",
|
||
|
" | To remove a specific level from the Index, use `level`.\n",
|
||
|
" | \n",
|
||
|
" | >>> s2.reset_index(level='a')\n",
|
||
|
" | a foo\n",
|
||
|
" | b\n",
|
||
|
" | one bar 0\n",
|
||
|
" | two bar 1\n",
|
||
|
" | one baz 2\n",
|
||
|
" | two baz 3\n",
|
||
|
" | \n",
|
||
|
" | If `level` is not set, all levels are removed from the Index.\n",
|
||
|
" | \n",
|
||
|
" | >>> s2.reset_index()\n",
|
||
|
" | a b foo\n",
|
||
|
" | 0 bar one 0\n",
|
||
|
" | 1 bar two 1\n",
|
||
|
" | 2 baz one 2\n",
|
||
|
" | 3 baz two 3\n",
|
||
|
" | \n",
|
||
|
" | rfloordiv(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Integer division of series and other, element-wise (binary operator `rfloordiv`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``other // series``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.floordiv\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | rmod(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Modulo of series and other, element-wise (binary operator `rmod`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``other % series``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.mod\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | rmul(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Multiplication of series and other, element-wise (binary operator `rmul`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``other * series``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.mul\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | rolling(self, window, min_periods=None, center=False, win_type=None, on=None, axis=0, closed=None)\n",
|
||
|
" | Provides rolling window calculations.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.18.0\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | window : int, or offset\n",
|
||
|
" | Size of the moving window. This is the number of observations used for\n",
|
||
|
" | calculating the statistic. Each window will be a fixed size.\n",
|
||
|
" | \n",
|
||
|
" | If its an offset then this will be the time period of each window. Each\n",
|
||
|
" | window will be a variable sized based on the observations included in\n",
|
||
|
" | the time-period. This is only valid for datetimelike indexes. This is\n",
|
||
|
" | new in 0.19.0\n",
|
||
|
" | min_periods : int, default None\n",
|
||
|
" | Minimum number of observations in window required to have a value\n",
|
||
|
" | (otherwise result is NA). For a window that is specified by an offset,\n",
|
||
|
" | `min_periods` will default to 1. Otherwise, `min_periods` will default\n",
|
||
|
" | to the size of the window.\n",
|
||
|
" | center : bool, default False\n",
|
||
|
" | Set the labels at the center of the window.\n",
|
||
|
" | win_type : str, default None\n",
|
||
|
" | Provide a window type. If ``None``, all points are evenly weighted.\n",
|
||
|
" | See the notes below for further information.\n",
|
||
|
" | on : str, optional\n",
|
||
|
" | For a DataFrame, column on which to calculate\n",
|
||
|
" | the rolling window, rather than the index\n",
|
||
|
" | axis : int or str, default 0\n",
|
||
|
" | closed : str, default None\n",
|
||
|
" | Make the interval closed on the 'right', 'left', 'both' or\n",
|
||
|
" | 'neither' endpoints.\n",
|
||
|
" | For offset-based windows, it defaults to 'right'.\n",
|
||
|
" | For fixed windows, defaults to 'both'. Remaining cases not implemented\n",
|
||
|
" | for fixed windows.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.20.0\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | a Window or Rolling sub-classed for the particular operation\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | expanding : Provides expanding transformations.\n",
|
||
|
" | ewm : Provides exponential weighted functions.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | By default, the result is set to the right edge of the window. This can be\n",
|
||
|
" | changed to the center of the window by setting ``center=True``.\n",
|
||
|
" | \n",
|
||
|
" | To learn more about the offsets & frequency strings, please see `this link\n",
|
||
|
" | <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.\n",
|
||
|
" | \n",
|
||
|
" | The recognized win_types are:\n",
|
||
|
" | \n",
|
||
|
" | * ``boxcar``\n",
|
||
|
" | * ``triang``\n",
|
||
|
" | * ``blackman``\n",
|
||
|
" | * ``hamming``\n",
|
||
|
" | * ``bartlett``\n",
|
||
|
" | * ``parzen``\n",
|
||
|
" | * ``bohman``\n",
|
||
|
" | * ``blackmanharris``\n",
|
||
|
" | * ``nuttall``\n",
|
||
|
" | * ``barthann``\n",
|
||
|
" | * ``kaiser`` (needs beta)\n",
|
||
|
" | * ``gaussian`` (needs std)\n",
|
||
|
" | * ``general_gaussian`` (needs power, width)\n",
|
||
|
" | * ``slepian`` (needs width).\n",
|
||
|
" | \n",
|
||
|
" | If ``win_type=None`` all points are evenly weighted. To learn more about\n",
|
||
|
" | different window types see `scipy.signal window functions\n",
|
||
|
" | <https://docs.scipy.org/doc/scipy/reference/signal.html#window-functions>`__.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]})\n",
|
||
|
" | >>> df\n",
|
||
|
" | B\n",
|
||
|
" | 0 0.0\n",
|
||
|
" | 1 1.0\n",
|
||
|
" | 2 2.0\n",
|
||
|
" | 3 NaN\n",
|
||
|
" | 4 4.0\n",
|
||
|
" | \n",
|
||
|
" | Rolling sum with a window length of 2, using the 'triang'\n",
|
||
|
" | window type.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.rolling(2, win_type='triang').sum()\n",
|
||
|
" | B\n",
|
||
|
" | 0 NaN\n",
|
||
|
" | 1 1.0\n",
|
||
|
" | 2 2.5\n",
|
||
|
" | 3 NaN\n",
|
||
|
" | 4 NaN\n",
|
||
|
" | \n",
|
||
|
" | Rolling sum with a window length of 2, min_periods defaults\n",
|
||
|
" | to the window length.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.rolling(2).sum()\n",
|
||
|
" | B\n",
|
||
|
" | 0 NaN\n",
|
||
|
" | 1 1.0\n",
|
||
|
" | 2 3.0\n",
|
||
|
" | 3 NaN\n",
|
||
|
" | 4 NaN\n",
|
||
|
" | \n",
|
||
|
" | Same as above, but explicitly set the min_periods\n",
|
||
|
" | \n",
|
||
|
" | >>> df.rolling(2, min_periods=1).sum()\n",
|
||
|
" | B\n",
|
||
|
" | 0 0.0\n",
|
||
|
" | 1 1.0\n",
|
||
|
" | 2 3.0\n",
|
||
|
" | 3 2.0\n",
|
||
|
" | 4 4.0\n",
|
||
|
" | \n",
|
||
|
" | A ragged (meaning not-a-regular frequency), time-indexed DataFrame\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},\n",
|
||
|
" | ... index = [pd.Timestamp('20130101 09:00:00'),\n",
|
||
|
" | ... pd.Timestamp('20130101 09:00:02'),\n",
|
||
|
" | ... pd.Timestamp('20130101 09:00:03'),\n",
|
||
|
" | ... pd.Timestamp('20130101 09:00:05'),\n",
|
||
|
" | ... pd.Timestamp('20130101 09:00:06')])\n",
|
||
|
" | \n",
|
||
|
" | >>> df\n",
|
||
|
" | B\n",
|
||
|
" | 2013-01-01 09:00:00 0.0\n",
|
||
|
" | 2013-01-01 09:00:02 1.0\n",
|
||
|
" | 2013-01-01 09:00:03 2.0\n",
|
||
|
" | 2013-01-01 09:00:05 NaN\n",
|
||
|
" | 2013-01-01 09:00:06 4.0\n",
|
||
|
" | \n",
|
||
|
" | Contrasting to an integer rolling window, this will roll a variable\n",
|
||
|
" | length window corresponding to the time period.\n",
|
||
|
" | The default for min_periods is 1.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.rolling('2s').sum()\n",
|
||
|
" | B\n",
|
||
|
" | 2013-01-01 09:00:00 0.0\n",
|
||
|
" | 2013-01-01 09:00:02 1.0\n",
|
||
|
" | 2013-01-01 09:00:03 3.0\n",
|
||
|
" | 2013-01-01 09:00:05 NaN\n",
|
||
|
" | 2013-01-01 09:00:06 4.0\n",
|
||
|
" | \n",
|
||
|
" | round(self, decimals=0, *args, **kwargs)\n",
|
||
|
" | Round each value in a Series to the given number of decimals.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | decimals : int\n",
|
||
|
" | Number of decimal places to round to (default: 0).\n",
|
||
|
" | If decimals is negative, it specifies the number of\n",
|
||
|
" | positions to the left of the decimal point.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series object\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | numpy.around\n",
|
||
|
" | DataFrame.round\n",
|
||
|
" | \n",
|
||
|
" | rpow(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Exponential power of series and other, element-wise (binary operator `rpow`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``other ** series``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.pow\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | rsub(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Subtraction of series and other, element-wise (binary operator `rsub`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``other - series``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.sub\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | rtruediv(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Floating division of series and other, element-wise (binary operator `rtruediv`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``other / series``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.truediv\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | searchsorted(self, value, side='left', sorter=None)\n",
|
||
|
" | Find indices where elements should be inserted to maintain order.\n",
|
||
|
" | \n",
|
||
|
" | Find the indices into a sorted Series `self` such that, if the\n",
|
||
|
" | corresponding elements in `value` were inserted before the indices,\n",
|
||
|
" | the order of `self` would be preserved.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | value : array_like\n",
|
||
|
" | Values to insert into `self`.\n",
|
||
|
" | side : {'left', 'right'}, optional\n",
|
||
|
" | If 'left', the index of the first suitable location found is given.\n",
|
||
|
" | If 'right', return the last such index. If there is no suitable\n",
|
||
|
" | index, return either 0 or N (where N is the length of `self`).\n",
|
||
|
" | sorter : 1-D array_like, optional\n",
|
||
|
" | Optional array of integer indices that sort `self` into ascending\n",
|
||
|
" | order. They are typically the result of ``np.argsort``.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | int or array of int\n",
|
||
|
" | A scalar or array of insertion points with the\n",
|
||
|
" | same shape as `value`.\n",
|
||
|
" | \n",
|
||
|
" | .. versionchanged :: 0.24.0\n",
|
||
|
" | If `value` is a scalar, an int is now always returned.\n",
|
||
|
" | Previously, scalar inputs returned an 1-item array for\n",
|
||
|
" | :class:`Series` and :class:`Categorical`.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | numpy.searchsorted\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | Binary search is used to find the required insertion points.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | \n",
|
||
|
" | >>> x = pd.Series([1, 2, 3])\n",
|
||
|
" | >>> x\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | 2 3\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | >>> x.searchsorted(4)\n",
|
||
|
" | 3\n",
|
||
|
" | \n",
|
||
|
" | >>> x.searchsorted([0, 4])\n",
|
||
|
" | array([0, 3])\n",
|
||
|
" | \n",
|
||
|
" | >>> x.searchsorted([1, 3], side='left')\n",
|
||
|
" | array([0, 2])\n",
|
||
|
" | \n",
|
||
|
" | >>> x.searchsorted([1, 3], side='right')\n",
|
||
|
" | array([1, 3])\n",
|
||
|
" | \n",
|
||
|
" | >>> x = pd.Categorical(['apple', 'bread', 'bread',\n",
|
||
|
" | 'cheese', 'milk'], ordered=True)\n",
|
||
|
" | [apple, bread, bread, cheese, milk]\n",
|
||
|
" | Categories (4, object): [apple < bread < cheese < milk]\n",
|
||
|
" | \n",
|
||
|
" | >>> x.searchsorted('bread')\n",
|
||
|
" | 1\n",
|
||
|
" | \n",
|
||
|
" | >>> x.searchsorted(['bread'], side='right')\n",
|
||
|
" | array([3])\n",
|
||
|
" | \n",
|
||
|
" | sem(self, axis=None, skipna=None, level=None, ddof=1, numeric_only=None, **kwargs)\n",
|
||
|
" | Return unbiased standard error of the mean over requested axis.\n",
|
||
|
" | \n",
|
||
|
" | Normalized by N-1 by default. This can be changed using the ddof argument\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : {index (0)}\n",
|
||
|
" | skipna : boolean, default True\n",
|
||
|
" | Exclude NA/null values. If an entire row/column is NA, the result\n",
|
||
|
" | will be NA\n",
|
||
|
" | level : int or level name, default None\n",
|
||
|
" | If the axis is a MultiIndex (hierarchical), count along a\n",
|
||
|
" | particular level, collapsing into a scalar\n",
|
||
|
" | ddof : int, default 1\n",
|
||
|
" | Delta Degrees of Freedom. The divisor used in calculations is N - ddof,\n",
|
||
|
" | where N represents the number of elements.\n",
|
||
|
" | numeric_only : boolean, default None\n",
|
||
|
" | Include only float, int, boolean columns. If None, will attempt to use\n",
|
||
|
" | everything, then use only numeric data. Not implemented for Series.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | sem : scalar or Series (if level specified)\n",
|
||
|
" | \n",
|
||
|
" | set_value(self, label, value, takeable=False)\n",
|
||
|
" | Quickly set single value at passed label.\n",
|
||
|
" | \n",
|
||
|
" | .. deprecated:: 0.21.0\n",
|
||
|
" | Please use .at[] or .iat[] accessors.\n",
|
||
|
" | \n",
|
||
|
" | If label is not contained, a new object is created with the label\n",
|
||
|
" | placed at the end of the result index.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | label : object\n",
|
||
|
" | Partial indexing with MultiIndex not allowed\n",
|
||
|
" | value : object\n",
|
||
|
" | Scalar value\n",
|
||
|
" | takeable : interpret the index as indexers, default False\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | series : Series\n",
|
||
|
" | If label is contained, will be reference to calling Series,\n",
|
||
|
" | otherwise a new object\n",
|
||
|
" | \n",
|
||
|
" | shift(self, periods=1, freq=None, axis=0, fill_value=None)\n",
|
||
|
" | Shift index by desired number of periods with an optional time `freq`.\n",
|
||
|
" | \n",
|
||
|
" | When `freq` is not passed, shift the index without realigning the data.\n",
|
||
|
" | If `freq` is passed (in this case, the index must be date or datetime,\n",
|
||
|
" | or it will raise a `NotImplementedError`), the index will be\n",
|
||
|
" | increased using the periods and the `freq`.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | periods : int\n",
|
||
|
" | Number of periods to shift. Can be positive or negative.\n",
|
||
|
" | freq : DateOffset, tseries.offsets, timedelta, or str, optional\n",
|
||
|
" | Offset to use from the tseries module or time rule (e.g. 'EOM').\n",
|
||
|
" | If `freq` is specified then the index values are shifted but the\n",
|
||
|
" | data is not realigned. That is, use `freq` if you would like to\n",
|
||
|
" | extend the index when shifting and preserve the original data.\n",
|
||
|
" | axis : {0 or 'index', 1 or 'columns', None}, default None\n",
|
||
|
" | Shift direction.\n",
|
||
|
" | fill_value : object, optional\n",
|
||
|
" | The scalar value to use for newly introduced missing values.\n",
|
||
|
" | the default depends on the dtype of `self`.\n",
|
||
|
" | For numeric data, ``np.nan`` is used.\n",
|
||
|
" | For datetime, timedelta, or period data, etc. :attr:`NaT` is used.\n",
|
||
|
" | For extension dtypes, ``self.dtype.na_value`` is used.\n",
|
||
|
" | \n",
|
||
|
" | .. versionchanged:: 0.24.0\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series\n",
|
||
|
" | Copy of input object, shifted.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Index.shift : Shift values of Index.\n",
|
||
|
" | DatetimeIndex.shift : Shift values of DatetimeIndex.\n",
|
||
|
" | PeriodIndex.shift : Shift values of PeriodIndex.\n",
|
||
|
" | tshift : Shift the time index, using the index's frequency if\n",
|
||
|
" | available.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> df = pd.DataFrame({'Col1': [10, 20, 15, 30, 45],\n",
|
||
|
" | ... 'Col2': [13, 23, 18, 33, 48],\n",
|
||
|
" | ... 'Col3': [17, 27, 22, 37, 52]})\n",
|
||
|
" | \n",
|
||
|
" | >>> df.shift(periods=3)\n",
|
||
|
" | Col1 Col2 Col3\n",
|
||
|
" | 0 NaN NaN NaN\n",
|
||
|
" | 1 NaN NaN NaN\n",
|
||
|
" | 2 NaN NaN NaN\n",
|
||
|
" | 3 10.0 13.0 17.0\n",
|
||
|
" | 4 20.0 23.0 27.0\n",
|
||
|
" | \n",
|
||
|
" | >>> df.shift(periods=1, axis='columns')\n",
|
||
|
" | Col1 Col2 Col3\n",
|
||
|
" | 0 NaN 10.0 13.0\n",
|
||
|
" | 1 NaN 20.0 23.0\n",
|
||
|
" | 2 NaN 15.0 18.0\n",
|
||
|
" | 3 NaN 30.0 33.0\n",
|
||
|
" | 4 NaN 45.0 48.0\n",
|
||
|
" | \n",
|
||
|
" | >>> df.shift(periods=3, fill_value=0)\n",
|
||
|
" | Col1 Col2 Col3\n",
|
||
|
" | 0 0 0 0\n",
|
||
|
" | 1 0 0 0\n",
|
||
|
" | 2 0 0 0\n",
|
||
|
" | 3 10 13 17\n",
|
||
|
" | 4 20 23 27\n",
|
||
|
" | \n",
|
||
|
" | skew(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)\n",
|
||
|
" | Return unbiased skew over requested axis\n",
|
||
|
" | Normalized by N-1.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : {index (0)}\n",
|
||
|
" | Axis for the function to be applied on.\n",
|
||
|
" | skipna : bool, default True\n",
|
||
|
" | Exclude NA/null values when computing the result.\n",
|
||
|
" | level : int or level name, default None\n",
|
||
|
" | If the axis is a MultiIndex (hierarchical), count along a\n",
|
||
|
" | particular level, collapsing into a scalar.\n",
|
||
|
" | numeric_only : bool, default None\n",
|
||
|
" | Include only float, int, boolean columns. If None, will attempt to use\n",
|
||
|
" | everything, then use only numeric data. Not implemented for Series.\n",
|
||
|
" | **kwargs\n",
|
||
|
" | Additional keyword arguments to be passed to the function.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | skew : scalar or Series (if level specified)\n",
|
||
|
" | \n",
|
||
|
" | sort_index(self, axis=0, level=None, ascending=True, inplace=False, kind='quicksort', na_position='last', sort_remaining=True)\n",
|
||
|
" | Sort Series by index labels.\n",
|
||
|
" | \n",
|
||
|
" | Returns a new Series sorted by label if `inplace` argument is\n",
|
||
|
" | ``False``, otherwise updates the original series and returns None.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : int, default 0\n",
|
||
|
" | Axis to direct sorting. This can only be 0 for Series.\n",
|
||
|
" | level : int, optional\n",
|
||
|
" | If not None, sort on values in specified index level(s).\n",
|
||
|
" | ascending : bool, default true\n",
|
||
|
" | Sort ascending vs. descending.\n",
|
||
|
" | inplace : bool, default False\n",
|
||
|
" | If True, perform operation in-place.\n",
|
||
|
" | kind : {'quicksort', 'mergesort', 'heapsort'}, default 'quicksort'\n",
|
||
|
" | Choice of sorting algorithm. See also :func:`numpy.sort` for more\n",
|
||
|
" | information. 'mergesort' is the only stable algorithm. For\n",
|
||
|
" | DataFrames, this option is only applied when sorting on a single\n",
|
||
|
" | column or label.\n",
|
||
|
" | na_position : {'first', 'last'}, default 'last'\n",
|
||
|
" | If 'first' puts NaNs at the beginning, 'last' puts NaNs at the end.\n",
|
||
|
" | Not implemented for MultiIndex.\n",
|
||
|
" | sort_remaining : bool, default True\n",
|
||
|
" | If true and sorting by level and index is multilevel, sort by other\n",
|
||
|
" | levels too (in order) after sorting by specified level.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | pandas.Series\n",
|
||
|
" | The original Series sorted by the labels\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | DataFrame.sort_index: Sort DataFrame by the index.\n",
|
||
|
" | DataFrame.sort_values: Sort DataFrame by the value.\n",
|
||
|
" | Series.sort_values : Sort Series by the value.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series(['a', 'b', 'c', 'd'], index=[3, 2, 1, 4])\n",
|
||
|
" | >>> s.sort_index()\n",
|
||
|
" | 1 c\n",
|
||
|
" | 2 b\n",
|
||
|
" | 3 a\n",
|
||
|
" | 4 d\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | Sort Descending\n",
|
||
|
" | \n",
|
||
|
" | >>> s.sort_index(ascending=False)\n",
|
||
|
" | 4 d\n",
|
||
|
" | 3 a\n",
|
||
|
" | 2 b\n",
|
||
|
" | 1 c\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | Sort Inplace\n",
|
||
|
" | \n",
|
||
|
" | >>> s.sort_index(inplace=True)\n",
|
||
|
" | >>> s\n",
|
||
|
" | 1 c\n",
|
||
|
" | 2 b\n",
|
||
|
" | 3 a\n",
|
||
|
" | 4 d\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | By default NaNs are put at the end, but use `na_position` to place\n",
|
||
|
" | them at the beginning\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series(['a', 'b', 'c', 'd'], index=[3, 2, 1, np.nan])\n",
|
||
|
" | >>> s.sort_index(na_position='first')\n",
|
||
|
" | NaN d\n",
|
||
|
" | 1.0 c\n",
|
||
|
" | 2.0 b\n",
|
||
|
" | 3.0 a\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | Specify index level to sort\n",
|
||
|
" | \n",
|
||
|
" | >>> arrays = [np.array(['qux', 'qux', 'foo', 'foo',\n",
|
||
|
" | ... 'baz', 'baz', 'bar', 'bar']),\n",
|
||
|
" | ... np.array(['two', 'one', 'two', 'one',\n",
|
||
|
" | ... 'two', 'one', 'two', 'one'])]\n",
|
||
|
" | >>> s = pd.Series([1, 2, 3, 4, 5, 6, 7, 8], index=arrays)\n",
|
||
|
" | >>> s.sort_index(level=1)\n",
|
||
|
" | bar one 8\n",
|
||
|
" | baz one 6\n",
|
||
|
" | foo one 4\n",
|
||
|
" | qux one 2\n",
|
||
|
" | bar two 7\n",
|
||
|
" | baz two 5\n",
|
||
|
" | foo two 3\n",
|
||
|
" | qux two 1\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Does not sort by remaining levels when sorting by levels\n",
|
||
|
" | \n",
|
||
|
" | >>> s.sort_index(level=1, sort_remaining=False)\n",
|
||
|
" | qux one 2\n",
|
||
|
" | foo one 4\n",
|
||
|
" | baz one 6\n",
|
||
|
" | bar one 8\n",
|
||
|
" | qux two 1\n",
|
||
|
" | foo two 3\n",
|
||
|
" | baz two 5\n",
|
||
|
" | bar two 7\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | sort_values(self, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')\n",
|
||
|
" | Sort by the values.\n",
|
||
|
" | \n",
|
||
|
" | Sort a Series in ascending or descending order by some\n",
|
||
|
" | criterion.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : {0 or 'index'}, default 0\n",
|
||
|
" | Axis to direct sorting. The value 'index' is accepted for\n",
|
||
|
" | compatibility with DataFrame.sort_values.\n",
|
||
|
" | ascending : bool, default True\n",
|
||
|
" | If True, sort values in ascending order, otherwise descending.\n",
|
||
|
" | inplace : bool, default False\n",
|
||
|
" | If True, perform operation in-place.\n",
|
||
|
" | kind : {'quicksort', 'mergesort' or 'heapsort'}, default 'quicksort'\n",
|
||
|
" | Choice of sorting algorithm. See also :func:`numpy.sort` for more\n",
|
||
|
" | information. 'mergesort' is the only stable algorithm.\n",
|
||
|
" | na_position : {'first' or 'last'}, default 'last'\n",
|
||
|
" | Argument 'first' puts NaNs at the beginning, 'last' puts NaNs at\n",
|
||
|
" | the end.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series\n",
|
||
|
" | Series ordered by values.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.sort_index : Sort by the Series indices.\n",
|
||
|
" | DataFrame.sort_values : Sort DataFrame by the values along either axis.\n",
|
||
|
" | DataFrame.sort_index : Sort DataFrame by indices.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series([np.nan, 1, 3, 10, 5])\n",
|
||
|
" | >>> s\n",
|
||
|
" | 0 NaN\n",
|
||
|
" | 1 1.0\n",
|
||
|
" | 2 3.0\n",
|
||
|
" | 3 10.0\n",
|
||
|
" | 4 5.0\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Sort values ascending order (default behaviour)\n",
|
||
|
" | \n",
|
||
|
" | >>> s.sort_values(ascending=True)\n",
|
||
|
" | 1 1.0\n",
|
||
|
" | 2 3.0\n",
|
||
|
" | 4 5.0\n",
|
||
|
" | 3 10.0\n",
|
||
|
" | 0 NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Sort values descending order\n",
|
||
|
" | \n",
|
||
|
" | >>> s.sort_values(ascending=False)\n",
|
||
|
" | 3 10.0\n",
|
||
|
" | 4 5.0\n",
|
||
|
" | 2 3.0\n",
|
||
|
" | 1 1.0\n",
|
||
|
" | 0 NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Sort values inplace\n",
|
||
|
" | \n",
|
||
|
" | >>> s.sort_values(ascending=False, inplace=True)\n",
|
||
|
" | >>> s\n",
|
||
|
" | 3 10.0\n",
|
||
|
" | 4 5.0\n",
|
||
|
" | 2 3.0\n",
|
||
|
" | 1 1.0\n",
|
||
|
" | 0 NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Sort values putting NAs first\n",
|
||
|
" | \n",
|
||
|
" | >>> s.sort_values(na_position='first')\n",
|
||
|
" | 0 NaN\n",
|
||
|
" | 1 1.0\n",
|
||
|
" | 2 3.0\n",
|
||
|
" | 4 5.0\n",
|
||
|
" | 3 10.0\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Sort a series of strings\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series(['z', 'b', 'd', 'a', 'c'])\n",
|
||
|
" | >>> s\n",
|
||
|
" | 0 z\n",
|
||
|
" | 1 b\n",
|
||
|
" | 2 d\n",
|
||
|
" | 3 a\n",
|
||
|
" | 4 c\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | >>> s.sort_values()\n",
|
||
|
" | 3 a\n",
|
||
|
" | 1 b\n",
|
||
|
" | 4 c\n",
|
||
|
" | 2 d\n",
|
||
|
" | 0 z\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | std(self, axis=None, skipna=None, level=None, ddof=1, numeric_only=None, **kwargs)\n",
|
||
|
" | Return sample standard deviation over requested axis.\n",
|
||
|
" | \n",
|
||
|
" | Normalized by N-1 by default. This can be changed using the ddof argument\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : {index (0)}\n",
|
||
|
" | skipna : boolean, default True\n",
|
||
|
" | Exclude NA/null values. If an entire row/column is NA, the result\n",
|
||
|
" | will be NA\n",
|
||
|
" | level : int or level name, default None\n",
|
||
|
" | If the axis is a MultiIndex (hierarchical), count along a\n",
|
||
|
" | particular level, collapsing into a scalar\n",
|
||
|
" | ddof : int, default 1\n",
|
||
|
" | Delta Degrees of Freedom. The divisor used in calculations is N - ddof,\n",
|
||
|
" | where N represents the number of elements.\n",
|
||
|
" | numeric_only : boolean, default None\n",
|
||
|
" | Include only float, int, boolean columns. If None, will attempt to use\n",
|
||
|
" | everything, then use only numeric data. Not implemented for Series.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | std : scalar or Series (if level specified)\n",
|
||
|
" | \n",
|
||
|
" | sub(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | Subtraction of series and other, element-wise (binary operator `sub`).\n",
|
||
|
" | \n",
|
||
|
" | Equivalent to ``series - other``, but with support to substitute a fill_value for\n",
|
||
|
" | missing data in one of the inputs.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or scalar value\n",
|
||
|
" | fill_value : None or float value, default None (NaN)\n",
|
||
|
" | Fill existing missing (NaN) values, and any new element needed for\n",
|
||
|
" | successful Series alignment, with this value before computation.\n",
|
||
|
" | If data in both corresponding Series locations is missing\n",
|
||
|
" | the result will be missing\n",
|
||
|
" | level : int or name\n",
|
||
|
" | Broadcast across a level, matching Index values on the\n",
|
||
|
" | passed MultiIndex level\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | result : Series\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.rsub\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
|
||
|
" | >>> a\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
|
||
|
" | >>> b\n",
|
||
|
" | a 1.0\n",
|
||
|
" | b NaN\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | >>> a.add(b, fill_value=0)\n",
|
||
|
" | a 2.0\n",
|
||
|
" | b 1.0\n",
|
||
|
" | c 1.0\n",
|
||
|
" | d 1.0\n",
|
||
|
" | e NaN\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | subtract = sub(self, other, level=None, fill_value=None, axis=0)\n",
|
||
|
" | \n",
" |  sum(self, axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)\n",
" |      Return the sum of the values for the requested axis.\n",
" |      \n",
" |      This is equivalent to the method ``numpy.sum``.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      axis : {index (0)}\n",
" |          Axis for the function to be applied on.\n",
" |      skipna : bool, default True\n",
" |          Exclude NA/null values when computing the result.\n",
" |      level : int or level name, default None\n",
" |          If the axis is a MultiIndex (hierarchical), count along a\n",
" |          particular level, collapsing into a scalar.\n",
" |      numeric_only : bool, default None\n",
" |          Include only float, int, boolean columns. If None, will attempt to use\n",
" |          everything, then use only numeric data. Not implemented for Series.\n",
" |      min_count : int, default 0\n",
" |          The required number of valid values to perform the operation. If fewer than\n",
" |          ``min_count`` non-NA values are present the result will be NA.\n",
" |      \n",
" |          .. versionadded :: 0.22.0\n",
" |      \n",
" |             Added with the default being 0. This means the sum of an all-NA\n",
" |             or empty Series is 0, and the product of an all-NA or empty\n",
" |             Series is 1.\n",
" |      **kwargs\n",
" |          Additional keyword arguments to be passed to the function.\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      sum : scalar or Series (if level specified)\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      Series.sum : Return the sum.\n",
" |      Series.min : Return the minimum.\n",
" |      Series.max : Return the maximum.\n",
" |      Series.idxmin : Return the index of the minimum.\n",
" |      Series.idxmax : Return the index of the maximum.\n",
" |      DataFrame.sum : Return the sum over the requested axis.\n",
" |      DataFrame.min : Return the minimum over the requested axis.\n",
" |      DataFrame.max : Return the maximum over the requested axis.\n",
" |      DataFrame.idxmin : Return the index of the minimum over the requested axis.\n",
" |      DataFrame.idxmax : Return the index of the maximum over the requested axis.\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      \n",
" | >>> idx = pd.MultiIndex.from_arrays([\n",
" |      ...     ['warm', 'warm', 'cold', 'cold'],\n",
" |      ...     ['dog', 'falcon', 'fish', 'spider']],\n",
" |      ...     names=['blooded', 'animal'])\n",
" |      >>> s = pd.Series([4, 2, 0, 8], name='legs', index=idx)\n",
" |      >>> s\n",
" |      blooded  animal\n",
" |      warm     dog        4\n",
" |               falcon     2\n",
" |      cold     fish       0\n",
" |               spider     8\n",
" |      Name: legs, dtype: int64\n",
" |      \n",
" |      >>> s.sum()\n",
" |      14\n",
" |      \n",
" |      Sum using level names, as well as indices.\n",
" |      \n",
" |      >>> s.sum(level='blooded')\n",
" |      blooded\n",
" |      warm    6\n",
" |      cold    8\n",
" |      Name: legs, dtype: int64\n",
" |      \n",
" |      >>> s.sum(level=0)\n",
" |      blooded\n",
" |      warm    6\n",
" |      cold    8\n",
" |      Name: legs, dtype: int64\n",
" |      \n",
" |      By default, the sum of an empty or all-NA Series is ``0``.\n",
" |      \n",
" |      >>> pd.Series([]).sum()  # min_count=0 is the default\n",
" |      0.0\n",
" |      \n",
" |      This can be controlled with the ``min_count`` parameter. For example, if\n",
" |      you'd like the sum of an empty series to be NaN, pass ``min_count=1``.\n",
" |      \n",
" |      >>> pd.Series([]).sum(min_count=1)\n",
" |      nan\n",
" |      \n",
" |      Thanks to the ``skipna`` parameter, ``min_count`` handles all-NA and\n",
" |      empty series identically.\n",
" |      \n",
" |      >>> pd.Series([np.nan]).sum()\n",
" |      0.0\n",
" |      \n",
" |      >>> pd.Series([np.nan]).sum(min_count=1)\n",
" |      nan\n",
" |  \n",
" | swaplevel(self, i=-2, j=-1, copy=True)\n",
" |      Swap levels i and j in a MultiIndex.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      i, j : int, string (can be mixed)\n",
" |          Level of index to be swapped. Can pass level name as string.\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      swapped : Series\n",
" |      \n",
" |      .. versionchanged:: 0.18.1\n",
" |      \n",
" |         The indexes ``i`` and ``j`` are now optional, and default to\n",
" |         the two innermost levels of the index.\n",
" |  \n",
" |  to_csv(self, *args, **kwargs)\n",
" |      Write object to a comma-separated values (csv) file.\n",
" |      \n",
" |      .. versionchanged:: 0.24.0\n",
" |          The order of arguments for Series was changed.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      path_or_buf : str or file handle, default None\n",
" |          File path or object, if None is provided the result is returned as\n",
" |          a string.  If a file object is passed it should be opened with\n",
" |          `newline=''`, disabling universal newlines.\n",
" |      \n",
" |          .. versionchanged:: 0.24.0\n",
" |      \n",
" |             Was previously named \"path\" for Series.\n",
" |      \n",
" |      sep : str, default ','\n",
" |          String of length 1. Field delimiter for the output file.\n",
" |      na_rep : str, default ''\n",
" |          Missing data representation.\n",
" |      float_format : str, default None\n",
" |          Format string for floating point numbers.\n",
" |      columns : sequence, optional\n",
" |          Columns to write.\n",
" |      header : bool or list of str, default True\n",
" |          Write out the column names. If a list of strings is given it is\n",
" |          assumed to be aliases for the column names.\n",
" |      \n",
" |          .. versionchanged:: 0.24.0\n",
" |      \n",
" |             Previously defaulted to False for Series.\n",
" |      \n",
" | index : bool, default True\n",
" |          Write row names (index).\n",
" |      index_label : str or sequence, or False, default None\n",
" |          Column label for index column(s) if desired. If None is given, and\n",
" |          `header` and `index` are True, then the index names are used. A\n",
" |          sequence should be given if the object uses MultiIndex. If\n",
" |          False do not print fields for index names. Use index_label=False\n",
" |          for easier importing in R.\n",
" |      mode : str\n",
" |          Python write mode, default 'w'.\n",
" |      encoding : str, optional\n",
" |          A string representing the encoding to use in the output file,\n",
" |          defaults to 'ascii' on Python 2 and 'utf-8' on Python 3.\n",
" |      compression : str, default 'infer'\n",
" |          Compression mode among the following possible values: {'infer',\n",
" |          'gzip', 'bz2', 'zip', 'xz', None}. If 'infer' and `path_or_buf`\n",
" |          is path-like, then detect compression from the following\n",
" |          extensions: '.gz', '.bz2', '.zip' or '.xz'. (otherwise no\n",
" |          compression).\n",
" |      \n",
" |          .. versionchanged:: 0.24.0\n",
" |      \n",
" |             'infer' option added and set to default.\n",
" |      \n",
" |      quoting : optional constant from csv module\n",
" |          Defaults to csv.QUOTE_MINIMAL. If you have set a `float_format`\n",
" |          then floats are converted to strings and thus csv.QUOTE_NONNUMERIC\n",
" |          will treat them as non-numeric.\n",
" |      quotechar : str, default '\\\"'\n",
" |          String of length 1. Character used to quote fields.\n",
" |      line_terminator : string, optional\n",
" |          The newline character or character sequence to use in the output\n",
" |          file. Defaults to `os.linesep`, which depends on the OS in which\n",
" |          this method is called ('\\n' for linux, '\\r\\n' for Windows, i.e.).\n",
" |      \n",
" |          .. versionchanged:: 0.24.0\n",
" |      chunksize : int or None\n",
" |          Rows to write at a time.\n",
" |      tupleize_cols : bool, default False\n",
" |          Write MultiIndex columns as a list of tuples (if True) or in\n",
" |          the new, expanded format, where each MultiIndex column is a row\n",
" |          in the CSV (if False).\n",
" |      \n",
" |          .. deprecated:: 0.21.0\n",
" |             This argument will be removed and will always write each row\n",
" |             of the multi-index as a separate row in the CSV file.\n",
" |      date_format : str, default None\n",
" |          Format string for datetime objects.\n",
" |      doublequote : bool, default True\n",
" |          Control quoting of `quotechar` inside a field.\n",
" | escapechar : str, default None\n",
" |          String of length 1. Character used to escape `sep` and `quotechar`\n",
" |          when appropriate.\n",
" |      decimal : str, default '.'\n",
" |          Character recognized as decimal separator. E.g. use ',' for\n",
" |          European data.\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      None or str\n",
" |          If path_or_buf is None, returns the resulting csv format as a\n",
" |          string. Otherwise returns None.\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      read_csv : Load a CSV file into a DataFrame.\n",
" |      to_excel : Write DataFrame to an Excel file.\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      >>> df = pd.DataFrame({'name': ['Raphael', 'Donatello'],\n",
" |      ...                    'mask': ['red', 'purple'],\n",
" |      ...                    'weapon': ['sai', 'bo staff']})\n",
" |      >>> df.to_csv(index=False)\n",
" |      'name,mask,weapon\\nRaphael,red,sai\\nDonatello,purple,bo staff\\n'\n",
" |  \n",
" |  to_dict(self, into=<class 'dict'>)\n",
" |      Convert Series to {label -> value} dict or dict-like object.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      into : class, default dict\n",
" |          The collections.Mapping subclass to use as the return\n",
" |          object. Can be the actual class or an empty\n",
" |          instance of the mapping type you want. If you want a\n",
" |          collections.defaultdict, you must pass it initialized.\n",
" |      \n",
" |          .. versionadded:: 0.21.0\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      value_dict : collections.Mapping\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      >>> s = pd.Series([1, 2, 3, 4])\n",
" |      >>> s.to_dict()\n",
" |      {0: 1, 1: 2, 2: 3, 3: 4}\n",
" |      >>> from collections import OrderedDict, defaultdict\n",
" |      >>> s.to_dict(OrderedDict)\n",
" | OrderedDict([(0, 1), (1, 2), (2, 3), (3, 4)])\n",
" |      >>> dd = defaultdict(list)\n",
" |      >>> s.to_dict(dd)\n",
" |      defaultdict(<type 'list'>, {0: 1, 1: 2, 2: 3, 3: 4})\n",
" |  \n",
" |  to_frame(self, name=None)\n",
" |      Convert Series to DataFrame.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      name : object, default None\n",
" |          The passed name should substitute for the series name (if it has\n",
" |          one).\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      data_frame : DataFrame\n",
" |  \n",
" |  to_period(self, freq=None, copy=True)\n",
" |      Convert Series from DatetimeIndex to PeriodIndex with desired\n",
" |      frequency (inferred from index if not passed).\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      freq : string, default\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      ts : Series with PeriodIndex\n",
" |  \n",
" |  to_sparse(self, kind='block', fill_value=None)\n",
" |      Convert Series to SparseSeries.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      kind : {'block', 'integer'}\n",
" |      fill_value : float, defaults to NaN (missing)\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      sp : SparseSeries\n",
" |  \n",
" |  to_string(self, buf=None, na_rep='NaN', float_format=None, header=True, index=True, length=False, dtype=False, name=False, max_rows=None)\n",
" |      Render a string representation of the Series.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      buf : StringIO-like, optional\n",
" |          buffer to write to\n",
" |      na_rep : string, optional\n",
" | string representation of NAN to use, default 'NaN'\n",
" |      float_format : one-parameter function, optional\n",
" |          formatter function to apply to columns' elements if they are floats\n",
" |          default None\n",
" |      header : boolean, default True\n",
" |          Add the Series header (index name)\n",
" |      index : bool, optional\n",
" |          Add index (row) labels, default True\n",
" |      length : boolean, default False\n",
" |          Add the Series length\n",
" |      dtype : boolean, default False\n",
" |          Add the Series dtype\n",
" |      name : boolean, default False\n",
" |          Add the Series name if not None\n",
" |      max_rows : int, optional\n",
" |          Maximum number of rows to show before truncating. If None, show\n",
" |          all.\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      formatted : string (if not buffer passed)\n",
" |  \n",
" |  to_timestamp(self, freq=None, how='start', copy=True)\n",
" |      Cast to datetimeindex of timestamps, at *beginning* of period.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      freq : string, default frequency of PeriodIndex\n",
" |          Desired frequency\n",
" |      how : {'s', 'e', 'start', 'end'}\n",
" |          Convention for converting period to timestamp; start of period\n",
" |          vs. end\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      ts : Series with DatetimeIndex\n",
" |  \n",
" |  transform(self, func, axis=0, *args, **kwargs)\n",
" |      Call ``func`` on self producing a Series with transformed values\n",
" |      and that has the same axis length as self.\n",
" |      \n",
" |      .. versionadded:: 0.20.0\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      func : function, str, list or dict\n",
" |          Function to use for transforming the data. If a function, must either\n",
" |          work when passed a Series or when passed to Series.apply.\n",
" |      \n",
" |          Accepted combinations are:\n",
" | \n",
" |          - function\n",
" |          - string function name\n",
" |          - list of functions and/or function names, e.g. ``[np.exp, 'sqrt']``\n",
" |          - dict of axis labels -> functions, function names or list of such.\n",
" |      axis : {0 or 'index'}\n",
" |          Parameter needed for compatibility with DataFrame.\n",
" |      *args\n",
" |          Positional arguments to pass to `func`.\n",
" |      **kwargs\n",
" |          Keyword arguments to pass to `func`.\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      Series\n",
" |          A Series that must have the same length as self.\n",
" |      \n",
" |      Raises\n",
" |      ------\n",
" |      ValueError : If the returned Series has a different length than self.\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      Series.agg : Only perform aggregating type operations.\n",
" |      Series.apply : Invoke function on a Series.\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      >>> df = pd.DataFrame({'A': range(3), 'B': range(1, 4)})\n",
" |      >>> df\n",
" |         A  B\n",
" |      0  0  1\n",
" |      1  1  2\n",
" |      2  2  3\n",
" |      >>> df.transform(lambda x: x + 1)\n",
" |         A  B\n",
" |      0  1  2\n",
" |      1  2  3\n",
" |      2  3  4\n",
" |      \n",
" |      Even though the resulting Series must have the same length as the\n",
" |      input Series, it is possible to provide several input functions:\n",
" |      \n",
" |      >>> s = pd.Series(range(3))\n",
" |      >>> s\n",
" |      0    0\n",
" |      1    1\n",
" |      2    2\n",
" |      dtype: int64\n",
" |      >>> s.transform([np.sqrt, np.exp])\n",
" | sqrt exp\n",
" |      0  0.000000   1.000000\n",
" |      1  1.000000   2.718282\n",
" |      2  1.414214   7.389056\n",
" |  \n",
" |  truediv(self, other, level=None, fill_value=None, axis=0)\n",
" |      Floating division of series and other, element-wise (binary operator `truediv`).\n",
" |      \n",
" |      Equivalent to ``series / other``, but with support to substitute a fill_value for\n",
" |      missing data in one of the inputs.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      other : Series or scalar value\n",
" |      fill_value : None or float value, default None (NaN)\n",
" |          Fill existing missing (NaN) values, and any new element needed for\n",
" |          successful Series alignment, with this value before computation.\n",
" |          If data in both corresponding Series locations is missing\n",
" |          the result will be missing\n",
" |      level : int or name\n",
" |          Broadcast across a level, matching Index values on the\n",
" |          passed MultiIndex level\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      result : Series\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      Series.rtruediv\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      >>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])\n",
" |      >>> a\n",
" |      a    1.0\n",
" |      b    1.0\n",
" |      c    1.0\n",
" |      d    NaN\n",
" |      dtype: float64\n",
" |      >>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])\n",
" |      >>> b\n",
" |      a    1.0\n",
" |      b    NaN\n",
" |      d    1.0\n",
" |      e    NaN\n",
" |      dtype: float64\n",
" |      >>> a.add(b, fill_value=0)\n",
" |      a    2.0\n",
" |      b    1.0\n",
" | c 1.0\n",
" |      d    1.0\n",
" |      e    NaN\n",
" |      dtype: float64\n",
" |  \n",
" |  unique(self)\n",
" |      Return unique values of Series object.\n",
" |      \n",
" |      Uniques are returned in order of appearance. Hash table-based unique,\n",
" |      therefore does NOT sort.\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      ndarray or ExtensionArray\n",
" |          The unique values returned as a NumPy array. In case of an\n",
" |          extension-array backed Series, a new\n",
" |          :class:`~api.extensions.ExtensionArray` of that type with just\n",
" |          the unique values is returned. This includes\n",
" |      \n",
" |          * Categorical\n",
" |          * Period\n",
" |          * Datetime with Timezone\n",
" |          * Interval\n",
" |          * Sparse\n",
" |          * IntegerNA\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      unique : Top-level unique method for any 1-d array-like object.\n",
" |      Index.unique : Return Index with unique values from an Index object.\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      >>> pd.Series([2, 1, 3, 3], name='A').unique()\n",
" |      array([2, 1, 3])\n",
" |      \n",
" |      >>> pd.Series([pd.Timestamp('2016-01-01') for _ in range(3)]).unique()\n",
" |      array(['2016-01-01T00:00:00.000000000'], dtype='datetime64[ns]')\n",
" |      \n",
" |      >>> pd.Series([pd.Timestamp('2016-01-01', tz='US/Eastern')\n",
" |      ...            for _ in range(3)]).unique()\n",
" |      <DatetimeArray>\n",
" |      ['2016-01-01 00:00:00-05:00']\n",
" |      Length: 1, dtype: datetime64[ns, US/Eastern]\n",
" |      \n",
" |      An unordered Categorical will return categories in the order of\n",
" |      appearance.\n",
" |      \n",
" |      >>> pd.Series(pd.Categorical(list('baabc'))).unique()\n",
" |      [b, a, c]\n",
" | Categories (3, object): [b, a, c]\n",
" |      \n",
" |      An ordered Categorical preserves the category ordering.\n",
" |      \n",
" |      >>> pd.Series(pd.Categorical(list('baabc'), categories=list('abc'),\n",
" |      ...                          ordered=True)).unique()\n",
" |      [b, a, c]\n",
" |      Categories (3, object): [a < b < c]\n",
" |  \n",
" |  unstack(self, level=-1, fill_value=None)\n",
" |      Unstack, a.k.a. pivot, Series with MultiIndex to produce DataFrame.\n",
" |      The level involved will automatically get sorted.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      level : int, string, or list of these, default last level\n",
" |          Level(s) to unstack, can pass level name\n",
" |      fill_value : replace NaN with this value if the unstack produces\n",
" |          missing values\n",
" |      \n",
" |          .. versionadded:: 0.18.0\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      unstacked : DataFrame\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      >>> s = pd.Series([1, 2, 3, 4],\n",
" |      ...     index=pd.MultiIndex.from_product([['one', 'two'], ['a', 'b']]))\n",
" |      >>> s\n",
" |      one  a    1\n",
" |           b    2\n",
" |      two  a    3\n",
" |           b    4\n",
" |      dtype: int64\n",
" |      \n",
" |      >>> s.unstack(level=-1)\n",
" |           a  b\n",
" |      one  1  2\n",
" |      two  3  4\n",
" |      \n",
" |      >>> s.unstack(level=0)\n",
" |         one  two\n",
" |      a    1    3\n",
" |      b    2    4\n",
" |  \n",
" |  update(self, other)\n",
" |      Modify Series in place using non-NA values from passed\n",
" |      Series. Aligns on index.\n",
" | \n",
" |      Parameters\n",
" |      ----------\n",
" |      other : Series\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      >>> s = pd.Series([1, 2, 3])\n",
" |      >>> s.update(pd.Series([4, 5, 6]))\n",
" |      >>> s\n",
" |      0    4\n",
" |      1    5\n",
" |      2    6\n",
" |      dtype: int64\n",
" |      \n",
" |      >>> s = pd.Series(['a', 'b', 'c'])\n",
" |      >>> s.update(pd.Series(['d', 'e'], index=[0, 2]))\n",
" |      >>> s\n",
" |      0    d\n",
" |      1    b\n",
" |      2    e\n",
" |      dtype: object\n",
" |      \n",
" |      >>> s = pd.Series([1, 2, 3])\n",
" |      >>> s.update(pd.Series([4, 5, 6, 7, 8]))\n",
" |      >>> s\n",
" |      0    4\n",
" |      1    5\n",
" |      2    6\n",
" |      dtype: int64\n",
" |      \n",
" |      If ``other`` contains NaNs the corresponding values are not updated\n",
" |      in the original Series.\n",
" |      \n",
" |      >>> s = pd.Series([1, 2, 3])\n",
" |      >>> s.update(pd.Series([4, np.nan, 6]))\n",
" |      >>> s\n",
" |      0    4\n",
" |      1    2\n",
" |      2    6\n",
" |      dtype: int64\n",
" |  \n",
" |  valid(self, inplace=False, **kwargs)\n",
" |      Return Series without null values.\n",
" |      \n",
" |      .. deprecated:: 0.23.0\n",
" |          Use :meth:`Series.dropna` instead.\n",
" |  \n",
" |  var(self, axis=None, skipna=None, level=None, ddof=1, numeric_only=None, **kwargs)\n",
" |      Return unbiased variance over requested axis.\n",
" | \n",
" |      Normalized by N-1 by default. This can be changed using the ddof argument\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      axis : {index (0)}\n",
" |      skipna : boolean, default True\n",
" |          Exclude NA/null values. If an entire row/column is NA, the result\n",
" |          will be NA\n",
" |      level : int or level name, default None\n",
" |          If the axis is a MultiIndex (hierarchical), count along a\n",
" |          particular level, collapsing into a scalar\n",
" |      ddof : int, default 1\n",
" |          Delta Degrees of Freedom. The divisor used in calculations is N - ddof,\n",
" |          where N represents the number of elements.\n",
" |      numeric_only : boolean, default None\n",
" |          Include only float, int, boolean columns. If None, will attempt to use\n",
" |          everything, then use only numeric data. Not implemented for Series.\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      var : scalar or Series (if level specified)\n",
" |  \n",
" |  view(self, dtype=None)\n",
" |      Create a new view of the Series.\n",
" |      \n",
" |      This function will return a new Series with a view of the same\n",
" |      underlying values in memory, optionally reinterpreted with a new data\n",
" |      type. The new data type must preserve the same size in bytes as to not\n",
" |      cause index misalignment.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      dtype : data type\n",
" |          Data type object or one of their string representations.\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      Series\n",
" |          A new Series object as a view of the same data in memory.\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      numpy.ndarray.view : Equivalent numpy function to create a new view of\n",
" |          the same data in memory.\n",
" |      \n",
" |      Notes\n",
" |      -----\n",
" |      Series are instantiated with ``dtype=float64`` by default. While\n",
" |      ``numpy.ndarray.view()`` will return a view with the same data type as\n",
" | the original array, ``Series.view()`` (without specified dtype)\n",
" |      will try using ``float64`` and may fail if the original data type size\n",
" |      in bytes is not the same.\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      >>> s = pd.Series([-2, -1, 0, 1, 2], dtype='int8')\n",
" |      >>> s\n",
" |      0   -2\n",
" |      1   -1\n",
" |      2    0\n",
" |      3    1\n",
" |      4    2\n",
" |      dtype: int8\n",
" |      \n",
" |      The 8 bit signed integer representation of `-1` is `0b11111111`, but\n",
" |      the same bytes represent 255 if read as an 8 bit unsigned integer:\n",
" |      \n",
" |      >>> us = s.view('uint8')\n",
" |      >>> us\n",
" |      0    254\n",
" |      1    255\n",
" |      2      0\n",
" |      3      1\n",
" |      4      2\n",
" |      dtype: uint8\n",
" |      \n",
" |      The views share the same underlying values:\n",
" |      \n",
" |      >>> us[0] = 128\n",
" |      >>> s\n",
" |      0   -128\n",
" |      1     -1\n",
" |      2      0\n",
" |      3      1\n",
" |      4      2\n",
" |      dtype: int8\n",
" |  \n",
" |  ----------------------------------------------------------------------\n",
" |  Class methods defined here:\n",
" |  \n",
" |  from_array(arr, index=None, name=None, dtype=None, copy=False, fastpath=False) from builtins.type\n",
" |      Construct Series from array.\n",
" |      \n",
" |      .. deprecated :: 0.23.0\n",
" |         Use pd.Series(..) constructor instead.\n",
" |  \n",
" |  from_csv(path, sep=',', parse_dates=True, header=None, index_col=0, encoding=None, infer_datetime_format=False) from builtins.type\n",
" |      Read CSV file.\n",
" |      \n",
" | .. deprecated:: 0.21.0\n",
" |          Use :func:`pandas.read_csv` instead.\n",
" |      \n",
" |      It is preferable to use the more powerful :func:`pandas.read_csv`\n",
" |      for most general purposes, but ``from_csv`` makes for an easy\n",
" |      roundtrip to and from a file (the exact counterpart of\n",
" |      ``to_csv``), especially with a time Series.\n",
" |      \n",
" |      This method only differs from :func:`pandas.read_csv` in some defaults:\n",
" |      \n",
" |      - `index_col` is ``0`` instead of ``None`` (take first column as index\n",
" |        by default)\n",
" |      - `header` is ``None`` instead of ``0`` (the first row is not used as\n",
" |        the column names)\n",
" |      - `parse_dates` is ``True`` instead of ``False`` (try parsing the index\n",
" |        as datetime by default)\n",
" |      \n",
" |      With :func:`pandas.read_csv`, the option ``squeeze=True`` can be used\n",
" |      to return a Series like ``from_csv``.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      path : string file path or file handle / StringIO\n",
" |      sep : string, default ','\n",
" |          Field delimiter\n",
" |      parse_dates : boolean, default True\n",
" |          Parse dates. Different default from read_table\n",
" |      header : int, default None\n",
" |          Row to use as header (skip prior rows)\n",
" |      index_col : int or sequence, default 0\n",
" |          Column to use for index. If a sequence is given, a MultiIndex\n",
" |          is used. Different default from read_table\n",
" |      encoding : string, optional\n",
" |          a string representing the encoding to use if the contents are\n",
" |          non-ascii, for python versions prior to 3\n",
" |      infer_datetime_format : boolean, default False\n",
" |          If True and `parse_dates` is True for a column, try to infer the\n",
" |          datetime format based on the first datetime string. If the format\n",
" |          can be inferred, there often will be a large parsing speed-up.\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      y : Series\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      read_csv\n",
" |  \n",
" |  ----------------------------------------------------------------------\n",
" |  Data descriptors defined here:\n",
" | \n",
" |  asobject\n",
" |      Return object Series which contains boxed values.\n",
" |      \n",
" |      .. deprecated :: 0.23.0\n",
" |      \n",
" |         Use ``astype(object)`` instead.\n",
" |      \n",
" |      *this is an internal non-public method*\n",
" |  \n",
" |  axes\n",
" |      Return a list of the row axis labels.\n",
" |  \n",
" |  dtype\n",
" |      Return the dtype object of the underlying data.\n",
" |  \n",
" |  dtypes\n",
" |      Return the dtype object of the underlying data.\n",
" |  \n",
" |  ftype\n",
" |      Return if the data is sparse|dense.\n",
" |  \n",
" |  ftypes\n",
" |      Return if the data is sparse|dense.\n",
" |  \n",
" |  hasnans\n",
" |      Return if I have any nans; enables various perf speedups.\n",
" |  \n",
" |  imag\n",
" |      Return imag value of vector.\n",
" |  \n",
" |  index\n",
" |      The index (axis labels) of the Series.\n",
" |  \n",
" |  name\n",
" |      Return name of the Series.\n",
" |  \n",
" |  real\n",
" |      Return the real value of vector.\n",
" |  \n",
" |  values\n",
" |      Return Series as ndarray or ndarray-like depending on the dtype.\n",
" |      \n",
" |      .. warning::\n",
" |      \n",
" |         We recommend using :attr:`Series.array` or\n",
" |         :meth:`Series.to_numpy`, depending on whether you need\n",
" |         a reference to the underlying data or a NumPy array.\n",
" |      \n",
" |      Returns\n",
" | -------\n",
|
||
|
" | arr : numpy.ndarray or ndarray-like\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.array : Reference to the underlying data.\n",
|
||
|
" | Series.to_numpy : A NumPy array representing the underlying data.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> pd.Series([1, 2, 3]).values\n",
|
||
|
" | array([1, 2, 3])\n",
|
||
|
" | \n",
|
||
|
" | >>> pd.Series(list('aabc')).values\n",
|
||
|
" | array(['a', 'a', 'b', 'c'], dtype=object)\n",
|
||
|
" | \n",
|
||
|
" | >>> pd.Series(list('aabc')).astype('category').values\n",
|
||
|
" | [a, a, b, c]\n",
|
||
|
" | Categories (3, object): [a, b, c]\n",
|
||
|
" | \n",
|
||
|
" | Timezone aware datetime data is converted to UTC:\n",
|
||
|
" | \n",
|
||
|
" | >>> pd.Series(pd.date_range('20130101', periods=3,\n",
|
||
|
" | ... tz='US/Eastern')).values\n",
|
||
|
" | array(['2013-01-01T05:00:00.000000000',\n",
|
||
|
" | '2013-01-02T05:00:00.000000000',\n",
|
||
|
" | '2013-01-03T05:00:00.000000000'], dtype='datetime64[ns]')\n",
|
||
|
" | \n",
|
||
|
" | ----------------------------------------------------------------------\n",
|
||
|
" | Data and other attributes defined here:\n",
|
||
|
" | \n",
" | cat = <class 'pandas.core.arrays.categorical.CategoricalAccessor'>\n",
" | Accessor object for categorical properties of the Series values.\n",
" | \n",
" | Be aware that assigning to `categories` is an inplace operation, while all\n",
" | methods return new categorical data per default (but can be called with\n",
" | `inplace=True`).\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | data : Series or CategoricalIndex\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s.cat.categories\n",
" | >>> s.cat.categories = list('abc')\n",
" | >>> s.cat.rename_categories(list('cab'))\n",
" | >>> s.cat.reorder_categories(list('cab'))\n",
" | >>> s.cat.add_categories(['d','e'])\n",
" | >>> s.cat.remove_categories(['d'])\n",
" | >>> s.cat.remove_unused_categories()\n",
" | >>> s.cat.set_categories(list('abcde'))\n",
" | >>> s.cat.as_ordered()\n",
" | >>> s.cat.as_unordered()\n",
" | \n",
" | dt = <class 'pandas.core.indexes.accessors.CombinedDatetimelikePropert...\n",
" | Accessor object for datetimelike properties of the Series values.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s.dt.hour\n",
" | >>> s.dt.second\n",
" | >>> s.dt.quarter\n",
" | \n",
" | Returns a Series indexed like the original Series.\n",
" | Raises TypeError if the Series does not contain datetimelike values.\n",
" | \n",
" | plot = <class 'pandas.plotting._core.SeriesPlotMethods'>\n",
" | Series plotting accessor and method.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s.plot.line()\n",
" | >>> s.plot.bar()\n",
" | >>> s.plot.hist()\n",
" | \n",
" | Plotting methods can also be accessed by calling the accessor as a method\n",
" | with the ``kind`` argument:\n",
" | ``s.plot(kind='line')`` is equivalent to ``s.plot.line()``\n",
" | \n",
" | sparse = <class 'pandas.core.arrays.sparse.SparseAccessor'>\n",
" | Accessor for SparseSparse from other sparse matrix data types.\n",
" | \n",
" | str = <class 'pandas.core.strings.StringMethods'>\n",
" | Vectorized string functions for Series and Index. NAs stay NA unless\n",
" | handled otherwise by a particular method. Patterned after Python's string\n",
" | methods, with some inspiration from R's stringr package.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s.str.split('_')\n",
" | >>> s.str.replace('_', '')\n",
" | \n",
" | ----------------------------------------------------------------------\n",
|
||
|
" | Methods inherited from pandas.core.base.IndexOpsMixin:\n",
" | \n",
" | __iter__(self)\n",
" | Return an iterator of the values.\n",
" | \n",
" | These are each a scalar type, which is a Python scalar\n",
" | (for str, int, float) or a pandas scalar\n",
" | (for Timestamp/Timedelta/Interval/Period)\n",
" | \n",
" | factorize(self, sort=False, na_sentinel=-1)\n",
" | Encode the object as an enumerated type or categorical variable.\n",
" | \n",
" | This method is useful for obtaining a numeric representation of an\n",
" | array when all that matters is identifying distinct values. `factorize`\n",
" | is available as both a top-level function :func:`pandas.factorize`,\n",
" | and as a method :meth:`Series.factorize` and :meth:`Index.factorize`.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | sort : boolean, default False\n",
" | Sort `uniques` and shuffle `labels` to maintain the\n",
" | relationship.\n",
" | \n",
" | na_sentinel : int, default -1\n",
" | Value to mark \"not found\".\n",
" | \n",
" | Returns\n",
" | -------\n",
" | labels : ndarray\n",
" | An integer ndarray that's an indexer into `uniques`.\n",
" | ``uniques.take(labels)`` will have the same values as `values`.\n",
" | uniques : ndarray, Index, or Categorical\n",
" | The unique valid values. When `values` is Categorical, `uniques`\n",
" | is a Categorical. When `values` is some other pandas object, an\n",
" | `Index` is returned. Otherwise, a 1-D ndarray is returned.\n",
" | \n",
" | .. note ::\n",
" | \n",
" | Even if there's a missing value in `values`, `uniques` will\n",
" | *not* contain an entry for it.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | cut : Discretize continuous-valued array.\n",
" | unique : Find the unique value in an array.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | These examples all show factorize as a top-level method like\n",
" | ``pd.factorize(values)``. The results are identical for methods like\n",
" | :meth:`Series.factorize`.\n",
" | \n",
" | >>> labels, uniques = pd.factorize(['b', 'b', 'a', 'c', 'b'])\n",
" | >>> labels\n",
" | array([0, 0, 1, 2, 0])\n",
" | >>> uniques\n",
" | array(['b', 'a', 'c'], dtype=object)\n",
" | \n",
" | With ``sort=True``, the `uniques` will be sorted, and `labels` will be\n",
" | shuffled so that the relationship is maintained.\n",
" | \n",
" | >>> labels, uniques = pd.factorize(['b', 'b', 'a', 'c', 'b'], sort=True)\n",
" | >>> labels\n",
" | array([1, 1, 0, 2, 1])\n",
" | >>> uniques\n",
" | array(['a', 'b', 'c'], dtype=object)\n",
" | \n",
" | Missing values are indicated in `labels` with `na_sentinel`\n",
" | (``-1`` by default). Note that missing values are never\n",
" | included in `uniques`.\n",
" | \n",
" | >>> labels, uniques = pd.factorize(['b', None, 'a', 'c', 'b'])\n",
" | >>> labels\n",
" | array([ 0, -1, 1, 2, 0])\n",
" | >>> uniques\n",
" | array(['b', 'a', 'c'], dtype=object)\n",
" | \n",
" | Thus far, we've only factorized lists (which are internally coerced to\n",
" | NumPy arrays). When factorizing pandas objects, the type of `uniques`\n",
" | will differ. For Categoricals, a `Categorical` is returned.\n",
" | \n",
" | >>> cat = pd.Categorical(['a', 'a', 'c'], categories=['a', 'b', 'c'])\n",
" | >>> labels, uniques = pd.factorize(cat)\n",
" | >>> labels\n",
" | array([0, 0, 1])\n",
" | >>> uniques\n",
" | [a, c]\n",
" | Categories (3, object): [a, b, c]\n",
" | \n",
" | Notice that ``'b'`` is in ``uniques.categories``, despite not being\n",
" | present in ``cat.values``.\n",
" | \n",
" | For all other pandas objects, an Index of the appropriate type is\n",
" | returned.\n",
" | \n",
" | >>> cat = pd.Series(['a', 'a', 'c'])\n",
" | >>> labels, uniques = pd.factorize(cat)\n",
" | >>> labels\n",
" | array([0, 0, 1])\n",
" | >>> uniques\n",
" | Index(['a', 'c'], dtype='object')\n",
" | \n",
|
||
|
" | item(self)\n",
" | Return the first element of the underlying data as a python scalar.\n",
" | \n",
" | nunique(self, dropna=True)\n",
" | Return number of unique elements in the object.\n",
" | \n",
" | Excludes NA values by default.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | dropna : boolean, default True\n",
" | Don't include NaN in the count.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | nunique : int\n",
" | \n",
" | to_list = tolist(self)\n",
" | Return a list of the values.\n",
" | \n",
" | These are each a scalar type, which is a Python scalar\n",
" | (for str, int, float) or a pandas scalar\n",
" | (for Timestamp/Timedelta/Interval/Period)\n",
" | \n",
" | See Also\n",
" | --------\n",
" | numpy.ndarray.tolist\n",
" | \n",
" | to_numpy(self, dtype=None, copy=False)\n",
" | A NumPy ndarray representing the values in this Series or Index.\n",
" | \n",
" | .. versionadded:: 0.24.0\n",
" | \n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | dtype : str or numpy.dtype, optional\n",
" | The dtype to pass to :meth:`numpy.asarray`\n",
" | copy : bool, default False\n",
" | Whether to ensure that the returned value is not a view on\n",
" | another array. Note that ``copy=False`` does not *ensure* that\n",
" | ``to_numpy()`` is no-copy. Rather, ``copy=True`` ensures that\n",
" | a copy is made, even if not strictly necessary.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | numpy.ndarray\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.array : Get the actual data stored within.\n",
" | Index.array : Get the actual data stored within.\n",
" | DataFrame.to_numpy : Similar method for DataFrame.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | The returned array will be the same up to equality (values equal\n",
" | in `self` will be equal in the returned array; likewise for values\n",
" | that are not equal). When `self` contains an ExtensionArray, the\n",
" | dtype may be different. For example, for a category-dtype Series,\n",
" | ``to_numpy()`` will return a NumPy array and the categorical dtype\n",
" | will be lost.\n",
" | \n",
" | For NumPy dtypes, this will be a reference to the actual data stored\n",
" | in this Series or Index (assuming ``copy=False``). Modifying the result\n",
" | in place will modify the data stored in the Series or Index (not that\n",
" | we recommend doing that).\n",
" | \n",
" | For extension types, ``to_numpy()`` *may* require copying data and\n",
" | coercing the result to a NumPy type (possibly object), which may be\n",
" | expensive. When you need a no-copy reference to the underlying data,\n",
" | :attr:`Series.array` should be used instead.\n",
" | \n",
" | This table lays out the different dtypes and default return types of\n",
" | ``to_numpy()`` for various dtypes within pandas.\n",
" | \n",
" | ================== ================================\n",
" | dtype array type\n",
" | ================== ================================\n",
" | category[T] ndarray[T] (same dtype as input)\n",
" | period ndarray[object] (Periods)\n",
" | interval ndarray[object] (Intervals)\n",
" | IntegerNA ndarray[object]\n",
" | datetime64[ns] datetime64[ns]\n",
" | datetime64[ns, tz] ndarray[object] (Timestamps)\n",
" | ================== ================================\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> ser = pd.Series(pd.Categorical(['a', 'b', 'a']))\n",
" | >>> ser.to_numpy()\n",
" | array(['a', 'b', 'a'], dtype=object)\n",
" | \n",
" | Specify the `dtype` to control how datetime-aware data is represented.\n",
" | Use ``dtype=object`` to return an ndarray of pandas :class:`Timestamp`\n",
" | objects, each with the correct ``tz``.\n",
" | \n",
" | >>> ser = pd.Series(pd.date_range('2000', periods=2, tz=\"CET\"))\n",
" | >>> ser.to_numpy(dtype=object)\n",
" | array([Timestamp('2000-01-01 00:00:00+0100', tz='CET', freq='D'),\n",
" | Timestamp('2000-01-02 00:00:00+0100', tz='CET', freq='D')],\n",
" | dtype=object)\n",
" | \n",
" | Or ``dtype='datetime64[ns]'`` to return an ndarray of native\n",
" | datetime64 values. The values are converted to UTC and the timezone\n",
" | info is dropped.\n",
" | \n",
" | >>> ser.to_numpy(dtype=\"datetime64[ns]\")\n",
" | ... # doctest: +ELLIPSIS\n",
" | array(['1999-12-31T23:00:00.000000000', '2000-01-01T23:00:00...'],\n",
" | dtype='datetime64[ns]')\n",
" | \n",
|
||
|
" | tolist(self)\n",
" | Return a list of the values.\n",
" | \n",
" | These are each a scalar type, which is a Python scalar\n",
" | (for str, int, float) or a pandas scalar\n",
" | (for Timestamp/Timedelta/Interval/Period)\n",
" | \n",
" | See Also\n",
" | --------\n",
" | numpy.ndarray.tolist\n",
" | \n",
" | transpose(self, *args, **kwargs)\n",
" | Return the transpose, which is by definition self.\n",
" | \n",
" | value_counts(self, normalize=False, sort=True, ascending=False, bins=None, dropna=True)\n",
" | Return a Series containing counts of unique values.\n",
" | \n",
" | The resulting object will be in descending order so that the\n",
" | first element is the most frequently-occurring element.\n",
" | Excludes NA values by default.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | normalize : boolean, default False\n",
" | If True then the object returned will contain the relative\n",
" | frequencies of the unique values.\n",
" | sort : boolean, default True\n",
" | Sort by values.\n",
" | ascending : boolean, default False\n",
" | Sort in ascending order.\n",
" | bins : integer, optional\n",
" | Rather than count values, group them into half-open bins,\n",
" | a convenience for ``pd.cut``, only works with numeric data.\n",
" | dropna : boolean, default True\n",
" | Don't include counts of NaN.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | counts : Series\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.count: Number of non-NA elements in a Series.\n",
" | DataFrame.count: Number of non-NA elements in a DataFrame.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> index = pd.Index([3, 1, 2, 3, 4, np.nan])\n",
" | >>> index.value_counts()\n",
" | 3.0 2\n",
" | 4.0 1\n",
" | 2.0 1\n",
" | 1.0 1\n",
" | dtype: int64\n",
" | \n",
" | With `normalize` set to `True`, returns the relative frequency by\n",
" | dividing all values by the sum of values.\n",
" | \n",
" | >>> s = pd.Series([3, 1, 2, 3, 4, np.nan])\n",
" | >>> s.value_counts(normalize=True)\n",
" | 3.0 0.4\n",
" | 4.0 0.2\n",
" | 2.0 0.2\n",
" | 1.0 0.2\n",
" | dtype: float64\n",
" | \n",
" | **bins**\n",
" | \n",
" | Bins can be useful for going from a continuous variable to a\n",
" | categorical variable; instead of counting unique\n",
" | apparitions of values, divide the index in the specified\n",
" | number of half-open bins.\n",
" | \n",
" | >>> s.value_counts(bins=3)\n",
" | (2.0, 3.0] 2\n",
" | (0.996, 2.0] 2\n",
" | (3.0, 4.0] 1\n",
" | dtype: int64\n",
" | \n",
" | **dropna**\n",
" | \n",
" | With `dropna` set to `False` we can also see NaN index values.\n",
" | \n",
" | >>> s.value_counts(dropna=False)\n",
" | 3.0 2\n",
" | NaN 1\n",
" | 4.0 1\n",
" | 2.0 1\n",
" | 1.0 1\n",
" | dtype: int64\n",
" | \n",
" | ----------------------------------------------------------------------\n",
|
||
|
" | Data descriptors inherited from pandas.core.base.IndexOpsMixin:\n",
" | \n",
" | T\n",
" | Return the transpose, which is by definition self.\n",
" | \n",
" | __dict__\n",
" | dictionary for instance variables (if defined)\n",
" | \n",
" | __weakref__\n",
" | list of weak references to the object (if defined)\n",
" | \n",
" | array\n",
" | The ExtensionArray of the data backing this Series or Index.\n",
" | \n",
" | .. versionadded:: 0.24.0\n",
" | \n",
" | Returns\n",
" | -------\n",
" | array : ExtensionArray\n",
" | An ExtensionArray of the values stored within. For extension\n",
" | types, this is the actual array. For NumPy native types, this\n",
" | is a thin (no copy) wrapper around :class:`numpy.ndarray`.\n",
" | \n",
" | ``.array`` differs from ``.values``, which may require converting the\n",
" | data to a different form.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Index.to_numpy : Similar method that always returns a NumPy array.\n",
" | Series.to_numpy : Similar method that always returns a NumPy array.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | This table lays out the different array types for each extension\n",
" | dtype within pandas.\n",
" | \n",
" | ================== =============================\n",
" | dtype array type\n",
" | ================== =============================\n",
" | category Categorical\n",
" | period PeriodArray\n",
" | interval IntervalArray\n",
" | IntegerNA IntegerArray\n",
" | datetime64[ns, tz] DatetimeArray\n",
" | ================== =============================\n",
" | \n",
" | For any 3rd-party extension types, the array type will be an\n",
" | ExtensionArray.\n",
" | \n",
" | For all remaining dtypes ``.array`` will be a\n",
" | :class:`arrays.NumpyExtensionArray` wrapping the actual ndarray\n",
" | stored within. If you absolutely need a NumPy array (possibly with\n",
" | copying / coercing data), then use :meth:`Series.to_numpy` instead.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | \n",
" | For regular NumPy types like int, and float, a PandasArray\n",
" | is returned.\n",
" | \n",
" | >>> pd.Series([1, 2, 3]).array\n",
" | <PandasArray>\n",
" | [1, 2, 3]\n",
" | Length: 3, dtype: int64\n",
" | \n",
" | For extension types, like Categorical, the actual ExtensionArray\n",
" | is returned\n",
" | \n",
" | >>> ser = pd.Series(pd.Categorical(['a', 'b', 'a']))\n",
" | >>> ser.array\n",
" | [a, b, a]\n",
" | Categories (2, object): [a, b]\n",
" | \n",
|
||
|
" | base\n",
" | Return the base object if the memory of the underlying data is shared.\n",
" | \n",
" | data\n",
" | Return the data pointer of the underlying data.\n",
" | \n",
" | empty\n",
" | \n",
" | flags\n",
" | Return the ndarray.flags for the underlying data.\n",
" | \n",
" | is_monotonic\n",
" | Return boolean if values in the object are\n",
" | monotonic_increasing.\n",
" | \n",
" | .. versionadded:: 0.19.0\n",
" | \n",
" | Returns\n",
" | -------\n",
" | is_monotonic : boolean\n",
" | \n",
" | is_monotonic_decreasing\n",
" | Return boolean if values in the object are\n",
" | monotonic_decreasing.\n",
" | \n",
" | .. versionadded:: 0.19.0\n",
" | \n",
" | Returns\n",
" | -------\n",
" | is_monotonic_decreasing : boolean\n",
" | \n",
" | is_monotonic_increasing\n",
" | Return boolean if values in the object are\n",
" | monotonic_increasing.\n",
" | \n",
" | .. versionadded:: 0.19.0\n",
" | \n",
" | Returns\n",
" | -------\n",
" | is_monotonic : boolean\n",
" | \n",
" | is_unique\n",
" | Return boolean if values in the object are unique.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | is_unique : boolean\n",
" | \n",
" | itemsize\n",
" | Return the size of the dtype of the item of the underlying data.\n",
" | \n",
" | nbytes\n",
" | Return the number of bytes in the underlying data.\n",
" | \n",
" | ndim\n",
" | Number of dimensions of the underlying data, by definition 1.\n",
" | \n",
" | shape\n",
" | Return a tuple of the shape of the underlying data.\n",
" | \n",
" | size\n",
" | Return the number of elements in the underlying data.\n",
" | \n",
" | strides\n",
" | Return the strides of the underlying data.\n",
" | \n",
" | ----------------------------------------------------------------------\n",
" | Data and other attributes inherited from pandas.core.base.IndexOpsMixin:\n",
" | \n",
" | __array_priority__ = 1000\n",
" | \n",
" | ----------------------------------------------------------------------\n",
|
||
|
" | Methods inherited from pandas.core.generic.NDFrame:\n",
" | \n",
" | __abs__(self)\n",
" | \n",
" | __bool__ = __nonzero__(self)\n",
" | \n",
" | __contains__(self, key)\n",
" | True if the key is in the info axis\n",
" | \n",
" | __copy__(self, deep=True)\n",
" | \n",
" | __deepcopy__(self, memo=None)\n",
" | Parameters\n",
" | ----------\n",
" | memo, default None\n",
" | Standard signature. Unused\n",
" | \n",
" | __delitem__(self, key)\n",
" | Delete item\n",
" | \n",
" | __finalize__(self, other, method=None, **kwargs)\n",
" | Propagate metadata from other to self.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : the object from which to get the attributes that we are going\n",
" | to propagate\n",
" | method : optional, a passed method name ; possibly to take different\n",
" | types of propagation actions based on this\n",
" | \n",
" | __getattr__(self, name)\n",
" | After regular attribute access, try looking up the name\n",
" | This allows simpler access to columns for interactive use.\n",
" | \n",
" | __getstate__(self)\n",
" | \n",
" | __hash__(self)\n",
" | Return hash(self).\n",
" | \n",
" | __invert__(self)\n",
" | \n",
" | __neg__(self)\n",
" | \n",
" | __nonzero__(self)\n",
" | \n",
" | __pos__(self)\n",
" | \n",
" | __round__(self, decimals=0)\n",
" | \n",
" | __setattr__(self, name, value)\n",
" | After regular attribute access, try setting the name\n",
" | This allows simpler access to columns for interactive use.\n",
" | \n",
" | __setstate__(self, state)\n",
" | \n",
" | abs(self)\n",
" | Return a Series/DataFrame with absolute numeric value of each element.\n",
" | \n",
" | This function only applies to elements that are all numeric.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | abs\n",
" | Series/DataFrame containing the absolute value of each element.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | numpy.absolute : Calculate the absolute value element-wise.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | For ``complex`` inputs, ``1.2 + 1j``, the absolute value is\n",
" | :math:`\\sqrt{ a^2 + b^2 }`.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | Absolute numeric values in a Series.\n",
" | \n",
" | >>> s = pd.Series([-1.10, 2, -3.33, 4])\n",
" | >>> s.abs()\n",
" | 0 1.10\n",
" | 1 2.00\n",
" | 2 3.33\n",
" | 3 4.00\n",
" | dtype: float64\n",
" | \n",
" | Absolute numeric values in a Series with complex numbers.\n",
" | \n",
" | >>> s = pd.Series([1.2 + 1j])\n",
" | >>> s.abs()\n",
" | 0 1.56205\n",
" | dtype: float64\n",
" | \n",
" | Absolute numeric values in a Series with a Timedelta element.\n",
" | \n",
" | >>> s = pd.Series([pd.Timedelta('1 days')])\n",
" | >>> s.abs()\n",
" | 0 1 days\n",
" | dtype: timedelta64[ns]\n",
" | \n",
" | Select rows with data closest to certain value using argsort (from\n",
" | `StackOverflow <https://stackoverflow.com/a/17758115>`__).\n",
" | \n",
" | >>> df = pd.DataFrame({\n",
" | ... 'a': [4, 5, 6, 7],\n",
" | ... 'b': [10, 20, 30, 40],\n",
" | ... 'c': [100, 50, -30, -50]\n",
" | ... })\n",
" | >>> df\n",
" | a b c\n",
" | 0 4 10 100\n",
" | 1 5 20 50\n",
" | 2 6 30 -30\n",
" | 3 7 40 -50\n",
" | >>> df.loc[(df.c - 43).abs().argsort()]\n",
" | a b c\n",
" | 1 5 20 50\n",
" | 0 4 10 100\n",
" | 2 6 30 -30\n",
" | 3 7 40 -50\n",
" | \n",
|
||
|
" | add_prefix(self, prefix)\n",
" | Prefix labels with string `prefix`.\n",
" | \n",
" | For Series, the row labels are prefixed.\n",
" | For DataFrame, the column labels are prefixed.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | prefix : str\n",
" | The string to add before each label.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series or DataFrame\n",
" | New Series or DataFrame with updated labels.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.add_suffix: Suffix row labels with string `suffix`.\n",
" | DataFrame.add_suffix: Suffix column labels with string `suffix`.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s = pd.Series([1, 2, 3, 4])\n",
" | >>> s\n",
" | 0 1\n",
" | 1 2\n",
" | 2 3\n",
" | 3 4\n",
" | dtype: int64\n",
" | \n",
" | >>> s.add_prefix('item_')\n",
" | item_0 1\n",
" | item_1 2\n",
" | item_2 3\n",
" | item_3 4\n",
" | dtype: int64\n",
" | \n",
" | >>> df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [3, 4, 5, 6]})\n",
" | >>> df\n",
" | A B\n",
" | 0 1 3\n",
" | 1 2 4\n",
" | 2 3 5\n",
" | 3 4 6\n",
" | \n",
" | >>> df.add_prefix('col_')\n",
" | col_A col_B\n",
" | 0 1 3\n",
" | 1 2 4\n",
" | 2 3 5\n",
" | 3 4 6\n",
" | \n",
" | add_suffix(self, suffix)\n",
" | Suffix labels with string `suffix`.\n",
" | \n",
" | For Series, the row labels are suffixed.\n",
" | For DataFrame, the column labels are suffixed.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | suffix : str\n",
" | The string to add after each label.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series or DataFrame\n",
" | New Series or DataFrame with updated labels.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.add_prefix: Prefix row labels with string `prefix`.\n",
" | DataFrame.add_prefix: Prefix column labels with string `prefix`.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s = pd.Series([1, 2, 3, 4])\n",
" | >>> s\n",
" | 0 1\n",
" | 1 2\n",
" | 2 3\n",
" | 3 4\n",
" | dtype: int64\n",
" | \n",
" | >>> s.add_suffix('_item')\n",
" | 0_item 1\n",
" | 1_item 2\n",
" | 2_item 3\n",
" | 3_item 4\n",
" | dtype: int64\n",
" | \n",
" | >>> df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [3, 4, 5, 6]})\n",
" | >>> df\n",
" | A B\n",
" | 0 1 3\n",
" | 1 2 4\n",
" | 2 3 5\n",
" | 3 4 6\n",
" | \n",
" | >>> df.add_suffix('_col')\n",
" | A_col B_col\n",
" | 0 1 3\n",
" | 1 2 4\n",
" | 2 3 5\n",
" | 3 4 6\n",
" | \n",
|
||
|
" | as_blocks(self, copy=True)\n",
" | Convert the frame to a dict of dtype -> Constructor Types that each has\n",
" | a homogeneous dtype.\n",
" | \n",
" | .. deprecated:: 0.21.0\n",
" | \n",
" | NOTE: the dtypes of the blocks WILL BE PRESERVED HERE (unlike in\n",
" | as_matrix)\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | copy : boolean, default True\n",
" | \n",
" | Returns\n",
" | -------\n",
" | values : a dict of dtype -> Constructor Types\n",
" | \n",
" | as_matrix(self, columns=None)\n",
" | Convert the frame to its Numpy-array representation.\n",
" | \n",
" | .. deprecated:: 0.23.0\n",
" | Use :meth:`DataFrame.values` instead.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | columns : list, optional, default:None\n",
" | If None, return all columns, otherwise, returns specified columns.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | values : ndarray\n",
" | If the caller is heterogeneous and contains booleans or objects,\n",
" | the result will be of dtype=object. See Notes.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | DataFrame.values\n",
" | \n",
" | Notes\n",
" | -----\n",
" | Return is NOT a Numpy-matrix, rather, a Numpy-array.\n",
" | \n",
" | The dtype will be a lower-common-denominator dtype (implicit\n",
" | upcasting); that is to say if the dtypes (even of numeric types)\n",
" | are mixed, the one that accommodates all will be chosen. Use this\n",
" | with care if you are not dealing with the blocks.\n",
" | \n",
" | e.g. If the dtypes are float16 and float32, dtype will be upcast to\n",
" | float32. If dtypes are int32 and uint8, dtype will be upcast to\n",
" | int32. By numpy.find_common_type convention, mixing int64 and uint64\n",
" | will result in a float64 dtype.\n",
" | \n",
" | This method is provided for backwards compatibility. Generally,\n",
" | it is recommended to use '.values'.\n",
" | \n",
" | asfreq(self, freq, method=None, how=None, normalize=False, fill_value=None)\n",
" | Convert TimeSeries to specified frequency.\n",
" | \n",
" | Optionally provide filling method to pad/backfill missing values.\n",
" | \n",
" | Returns the original data conformed to a new index with the specified\n",
" | frequency. ``resample`` is more appropriate if an operation, such as\n",
" | summarization, is necessary to represent the data at the new frequency.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | freq : DateOffset object, or string\n",
" | method : {'backfill'/'bfill', 'pad'/'ffill'}, default None\n",
" | Method to use for filling holes in reindexed Series (note this\n",
" | does not fill NaNs that already were present):\n",
" | \n",
" | * 'pad' / 'ffill': propagate last valid observation forward to next\n",
" | valid\n",
" | * 'backfill' / 'bfill': use NEXT valid observation to fill\n",
" | how : {'start', 'end'}, default end\n",
" | For PeriodIndex only, see PeriodIndex.asfreq\n",
" | normalize : bool, default False\n",
" | Whether to reset output index to midnight\n",
" | fill_value : scalar, optional\n",
" | Value to use for missing values, applied during upsampling (note\n",
" | this does not fill NaNs that already were present).\n",
" | \n",
" | .. versionadded:: 0.20.0\n",
" | \n",
" | Returns\n",
" | -------\n",
" | converted : same type as caller\n",
" | \n",
" | See Also\n",
" | --------\n",
" | reindex\n",
" | \n",
" | Notes\n",
" | -----\n",
" | To learn more about the frequency strings, please see `this link\n",
" | <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | \n",
" | Start by creating a series with 4 one minute timestamps.\n",
" | \n",
" | >>> index = pd.date_range('1/1/2000', periods=4, freq='T')\n",
|
||
|
" | >>> series = pd.Series([0.0, None, 2.0, 3.0], index=index)\n",
|
||
|
" | >>> df = pd.DataFrame({'s':series})\n",
|
||
|
" | >>> df\n",
|
||
|
" | s\n",
|
||
|
" | 2000-01-01 00:00:00 0.0\n",
|
||
|
" | 2000-01-01 00:01:00 NaN\n",
|
||
|
" | 2000-01-01 00:02:00 2.0\n",
|
||
|
" | 2000-01-01 00:03:00 3.0\n",
|
||
|
" | \n",
|
||
|
" | Upsample the series into 30 second bins.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.asfreq(freq='30S')\n",
|
||
|
" | s\n",
|
||
|
" | 2000-01-01 00:00:00 0.0\n",
|
||
|
" | 2000-01-01 00:00:30 NaN\n",
|
||
|
" | 2000-01-01 00:01:00 NaN\n",
|
||
|
" | 2000-01-01 00:01:30 NaN\n",
|
||
|
" | 2000-01-01 00:02:00 2.0\n",
|
||
|
" | 2000-01-01 00:02:30 NaN\n",
|
||
|
" | 2000-01-01 00:03:00 3.0\n",
|
||
|
" | \n",
|
||
|
" | Upsample again, providing a ``fill value``.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.asfreq(freq='30S', fill_value=9.0)\n",
|
||
|
" | s\n",
|
||
|
" | 2000-01-01 00:00:00 0.0\n",
|
||
|
" | 2000-01-01 00:00:30 9.0\n",
|
||
|
" | 2000-01-01 00:01:00 NaN\n",
|
||
|
" | 2000-01-01 00:01:30 9.0\n",
|
||
|
" | 2000-01-01 00:02:00 2.0\n",
|
||
|
" | 2000-01-01 00:02:30 9.0\n",
|
||
|
" | 2000-01-01 00:03:00 3.0\n",
|
||
|
" | \n",
|
||
|
" | Upsample again, providing a ``method``.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.asfreq(freq='30S', method='bfill')\n",
|
||
|
" | s\n",
|
||
|
" | 2000-01-01 00:00:00 0.0\n",
|
||
|
" | 2000-01-01 00:00:30 NaN\n",
|
||
|
" | 2000-01-01 00:01:00 NaN\n",
|
||
|
" | 2000-01-01 00:01:30 2.0\n",
|
||
|
" | 2000-01-01 00:02:00 2.0\n",
|
||
|
" | 2000-01-01 00:02:30 3.0\n",
|
||
|
" | 2000-01-01 00:03:00 3.0\n",
|
||
|
" | \n",
|
||
|
" | asof(self, where, subset=None)\n",
|
||
|
" | Return the last row(s) without any NaNs before `where`.\n",
|
||
|
" | \n",
|
||
|
" | The last row (for each element in `where`, if list) without any\n",
|
||
|
" | NaN is taken.\n",
|
||
|
" | In case of a :class:`~pandas.DataFrame`, the last row without NaN\n",
|
||
|
" | considering only the subset of columns (if not `None`)\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.19.0 For DataFrame\n",
|
||
|
" | \n",
|
||
|
" | If there is no good value, NaN is returned for a Series or\n",
|
||
|
" | a Series of NaN values for a DataFrame\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | where : date or array-like of dates\n",
|
||
|
" | Date(s) before which the last row(s) are returned.\n",
|
||
|
" | subset : str or array-like of str, default `None`\n",
|
||
|
" | For DataFrame, if not `None`, only use these columns to\n",
|
||
|
" | check for NaNs.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | scalar, Series, or DataFrame\n",
|
||
|
" | \n",
|
||
|
" | * scalar : when `self` is a Series and `where` is a scalar\n",
|
||
|
" | * Series: when `self` is a Series and `where` is an array-like,\n",
|
||
|
" | or when `self` is a DataFrame and `where` is a scalar\n",
|
||
|
" | * DataFrame : when `self` is a DataFrame and `where` is an\n",
|
||
|
" | array-like\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | merge_asof : Perform an asof merge. Similar to left join.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | Dates are assumed to be sorted. Raises if this is not the case.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | A Series and a scalar `where`.\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series([1, 2, np.nan, 4], index=[10, 20, 30, 40])\n",
|
||
|
" | >>> s\n",
|
||
|
" | 10 1.0\n",
|
||
|
" | 20 2.0\n",
|
||
|
" | 30 NaN\n",
|
||
|
" | 40 4.0\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | >>> s.asof(20)\n",
|
||
|
" | 2.0\n",
|
||
|
" | \n",
|
||
|
" | For a sequence `where`, a Series is returned. The first value is\n",
|
||
|
" | NaN, because the first element of `where` is before the first\n",
|
||
|
" | index value.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.asof([5, 20])\n",
|
||
|
" | 5 NaN\n",
|
||
|
" | 20 2.0\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Missing values are not considered. The following is ``2.0``, not\n",
|
||
|
" | NaN, even though NaN is at the index location for ``30``.\n",
|
||
|
" | \n",
|
||
|
" | >>> s.asof(30)\n",
|
||
|
" | 2.0\n",
|
||
|
" | \n",
|
||
|
" | Take all columns into consideration\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({'a': [10, 20, 30, 40, 50],\n",
|
||
|
" | ... 'b': [None, None, None, None, 500]},\n",
|
||
|
" | ... index=pd.DatetimeIndex(['2018-02-27 09:01:00',\n",
|
||
|
" | ... '2018-02-27 09:02:00',\n",
|
||
|
" | ... '2018-02-27 09:03:00',\n",
|
||
|
" | ... '2018-02-27 09:04:00',\n",
|
||
|
" | ... '2018-02-27 09:05:00']))\n",
|
||
|
" | >>> df.asof(pd.DatetimeIndex(['2018-02-27 09:03:30',\n",
|
||
|
" | ... '2018-02-27 09:04:30']))\n",
|
||
|
" | a b\n",
|
||
|
" | 2018-02-27 09:03:30 NaN NaN\n",
|
||
|
" | 2018-02-27 09:04:30 NaN NaN\n",
|
||
|
" | \n",
|
||
|
" | Take a single column into consideration\n",
|
||
|
" | \n",
|
||
|
" | >>> df.asof(pd.DatetimeIndex(['2018-02-27 09:03:30',\n",
|
||
|
" | ... '2018-02-27 09:04:30']),\n",
|
||
|
" | ... subset=['a'])\n",
|
||
|
" | a b\n",
|
||
|
" | 2018-02-27 09:03:30 30.0 NaN\n",
|
||
|
" | 2018-02-27 09:04:30 40.0 NaN\n",
|
||
|
" | \n",
|
||
|
" | astype(self, dtype, copy=True, errors='raise', **kwargs)\n",
|
||
|
" | Cast a pandas object to a specified dtype ``dtype``.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | dtype : data type, or dict of column name -> data type\n",
|
||
|
" | Use a numpy.dtype or Python type to cast entire pandas object to\n",
|
||
|
" | the same type. Alternatively, use {col: dtype, ...}, where col is a\n",
|
||
|
" | column label and dtype is a numpy.dtype or Python type to cast one\n",
|
||
|
" | or more of the DataFrame's columns to column-specific types.\n",
|
||
|
" | copy : bool, default True\n",
|
||
|
" | Return a copy when ``copy=True`` (be very careful setting\n",
|
||
|
" | ``copy=False`` as changes to values then may propagate to other\n",
|
||
|
" | pandas objects).\n",
|
||
|
" | errors : {'raise', 'ignore'}, default 'raise'\n",
|
||
|
" | Control raising of exceptions on invalid data for provided dtype.\n",
|
||
|
" | \n",
|
||
|
" | - ``raise`` : allow exceptions to be raised\n",
|
||
|
" | - ``ignore`` : suppress exceptions. On error return original object\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.20.0\n",
|
||
|
" | \n",
|
||
|
" | kwargs : keyword arguments to pass on to the constructor\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | casted : same type as caller\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | to_datetime : Convert argument to datetime.\n",
|
||
|
" | to_timedelta : Convert argument to timedelta.\n",
|
||
|
" | to_numeric : Convert argument to a numeric type.\n",
|
||
|
" | numpy.ndarray.astype : Cast a numpy array to a specified type.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> ser = pd.Series([1, 2], dtype='int32')\n",
|
||
|
" | >>> ser\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | dtype: int32\n",
|
||
|
" | >>> ser.astype('int64')\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Convert to categorical type:\n",
|
||
|
" | \n",
|
||
|
" | >>> ser.astype('category')\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | dtype: category\n",
|
||
|
" | Categories (2, int64): [1, 2]\n",
|
||
|
" | \n",
|
||
|
" | Convert to ordered categorical type with custom ordering:\n",
|
||
|
" | \n",
|
||
|
" | >>> cat_dtype = pd.api.types.CategoricalDtype(\n",
|
||
|
" | ... categories=[2, 1], ordered=True)\n",
|
||
|
" | >>> ser.astype(cat_dtype)\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | dtype: category\n",
|
||
|
" | Categories (2, int64): [2 < 1]\n",
|
||
|
" | \n",
|
||
|
" | Note that using ``copy=False`` and changing data on a new\n",
|
||
|
" | pandas object may propagate changes:\n",
|
||
|
" | \n",
|
||
|
" | >>> s1 = pd.Series([1,2])\n",
|
||
|
" | >>> s2 = s1.astype('int64', copy=False)\n",
|
||
|
" | >>> s2[0] = 10\n",
|
||
|
" | >>> s1 # note that s1[0] has changed too\n",
|
||
|
" | 0 10\n",
|
||
|
" | 1 2\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | at_time(self, time, asof=False, axis=None)\n",
|
||
|
" | Select values at particular time of day (e.g. 9:30AM).\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | time : datetime.time or string\n",
|
||
|
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.24.0\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | values_at_time : same type as caller\n",
|
||
|
" | \n",
|
||
|
" | Raises\n",
|
||
|
" | ------\n",
|
||
|
" | TypeError\n",
|
||
|
" | If the index is not a :class:`DatetimeIndex`\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | between_time : Select values between particular times of the day.\n",
|
||
|
" | first : Select initial periods of time series based on a date offset.\n",
|
||
|
" | last : Select final periods of time series based on a date offset.\n",
|
||
|
" | DatetimeIndex.indexer_at_time : Get just the index locations for\n",
|
||
|
" | values at particular time of the day.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> i = pd.date_range('2018-04-09', periods=4, freq='12H')\n",
|
||
|
" | >>> ts = pd.DataFrame({'A': [1,2,3,4]}, index=i)\n",
|
||
|
" | >>> ts\n",
|
||
|
" | A\n",
|
||
|
" | 2018-04-09 00:00:00 1\n",
|
||
|
" | 2018-04-09 12:00:00 2\n",
|
||
|
" | 2018-04-10 00:00:00 3\n",
|
||
|
" | 2018-04-10 12:00:00 4\n",
|
||
|
" | \n",
|
||
|
" | >>> ts.at_time('12:00')\n",
|
||
|
" | A\n",
|
||
|
" | 2018-04-09 12:00:00 2\n",
|
||
|
" | 2018-04-10 12:00:00 4\n",
|
||
|
" | \n",
|
||
|
" | between_time(self, start_time, end_time, include_start=True, include_end=True, axis=None)\n",
|
||
|
" | Select values between particular times of the day (e.g., 9:00-9:30 AM).\n",
|
||
|
" | \n",
|
||
|
" | By setting ``start_time`` to be later than ``end_time``,\n",
|
||
|
" | you can get the times that are *not* between the two times.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | start_time : datetime.time or string\n",
|
||
|
" | end_time : datetime.time or string\n",
|
||
|
" | include_start : boolean, default True\n",
|
||
|
" | include_end : boolean, default True\n",
|
||
|
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.24.0\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | values_between_time : same type as caller\n",
|
||
|
" | \n",
|
||
|
" | Raises\n",
|
||
|
" | ------\n",
|
||
|
" | TypeError\n",
|
||
|
" | If the index is not a :class:`DatetimeIndex`\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | at_time : Select values at a particular time of the day.\n",
|
||
|
" | first : Select initial periods of time series based on a date offset.\n",
|
||
|
" | last : Select final periods of time series based on a date offset.\n",
|
||
|
" | DatetimeIndex.indexer_between_time : Get just the index locations for\n",
|
||
|
" | values between particular times of the day.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> i = pd.date_range('2018-04-09', periods=4, freq='1D20min')\n",
|
||
|
" | >>> ts = pd.DataFrame({'A': [1,2,3,4]}, index=i)\n",
|
||
|
" | >>> ts\n",
|
||
|
" | A\n",
|
||
|
" | 2018-04-09 00:00:00 1\n",
|
||
|
" | 2018-04-10 00:20:00 2\n",
|
||
|
" | 2018-04-11 00:40:00 3\n",
|
||
|
" | 2018-04-12 01:00:00 4\n",
|
||
|
" | \n",
|
||
|
" | >>> ts.between_time('0:15', '0:45')\n",
|
||
|
" | A\n",
|
||
|
" | 2018-04-10 00:20:00 2\n",
|
||
|
" | 2018-04-11 00:40:00 3\n",
|
||
|
" | \n",
|
||
|
" | You get the times that are *not* between two times by setting\n",
|
||
|
" | ``start_time`` later than ``end_time``:\n",
|
||
|
" | \n",
|
||
|
" | >>> ts.between_time('0:45', '0:15')\n",
|
||
|
" | A\n",
|
||
|
" | 2018-04-09 00:00:00 1\n",
|
||
|
" | 2018-04-12 01:00:00 4\n",
|
||
|
" | \n",
|
||
|
" | bfill(self, axis=None, inplace=False, limit=None, downcast=None)\n",
|
||
|
" | Synonym for :meth:`DataFrame.fillna` with ``method='bfill'``.\n",
|
||
|
" | \n",
|
||
|
" | bool(self)\n",
|
||
|
" | Return the bool of a single element PandasObject.\n",
|
||
|
" | \n",
|
||
|
" | This must be a boolean scalar value, either True or False. Raise a\n",
|
||
|
" | ValueError if the PandasObject does not have exactly 1 element, or that\n",
|
||
|
" | element is not boolean\n",
|
||
|
" | \n",
|
||
|
" | clip(self, lower=None, upper=None, axis=None, inplace=False, *args, **kwargs)\n",
|
||
|
" | Trim values at input threshold(s).\n",
|
||
|
" | \n",
|
||
|
" | Assigns values outside boundary to boundary values. Thresholds\n",
|
||
|
" | can be singular values or array like, and in the latter case\n",
|
||
|
" | the clipping is performed element-wise in the specified axis.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | lower : float or array_like, default None\n",
|
||
|
" | Minimum threshold value. All values below this\n",
|
||
|
" | threshold will be set to it.\n",
|
||
|
" | upper : float or array_like, default None\n",
|
||
|
" | Maximum threshold value. All values above this\n",
|
||
|
" | threshold will be set to it.\n",
|
||
|
" | axis : int or string axis name, optional\n",
|
||
|
" | Align object with lower and upper along the given axis.\n",
|
||
|
" | inplace : boolean, default False\n",
|
||
|
" | Whether to perform the operation in place on the data.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.21.0\n",
|
||
|
" | *args, **kwargs\n",
|
||
|
" | Additional keywords have no effect but might be accepted\n",
|
||
|
" | for compatibility with numpy.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series or DataFrame\n",
|
||
|
" | Same type as calling object with the values outside the\n",
|
||
|
" | clip boundaries replaced\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> data = {'col_0': [9, -3, 0, -1, 5], 'col_1': [-2, -7, 6, 8, -5]}\n",
|
||
|
" | >>> df = pd.DataFrame(data)\n",
|
||
|
" | >>> df\n",
|
||
|
" | col_0 col_1\n",
|
||
|
" | 0 9 -2\n",
|
||
|
" | 1 -3 -7\n",
|
||
|
" | 2 0 6\n",
|
||
|
" | 3 -1 8\n",
|
||
|
" | 4 5 -5\n",
|
||
|
" | \n",
|
||
|
" | Clips per column using lower and upper thresholds:\n",
|
||
|
" | \n",
|
||
|
" | >>> df.clip(-4, 6)\n",
|
||
|
" | col_0 col_1\n",
|
||
|
" | 0 6 -2\n",
|
||
|
" | 1 -3 -4\n",
|
||
|
" | 2 0 6\n",
|
||
|
" | 3 -1 6\n",
|
||
|
" | 4 5 -4\n",
|
||
|
" | \n",
|
||
|
" | Clips using specific lower and upper thresholds per column element:\n",
|
||
|
" | \n",
|
||
|
" | >>> t = pd.Series([2, -4, -1, 6, 3])\n",
|
||
|
" | >>> t\n",
|
||
|
" | 0 2\n",
|
||
|
" | 1 -4\n",
|
||
|
" | 2 -1\n",
|
||
|
" | 3 6\n",
|
||
|
" | 4 3\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | >>> df.clip(t, t + 4, axis=0)\n",
|
||
|
" | col_0 col_1\n",
|
||
|
" | 0 6 2\n",
|
||
|
" | 1 -3 -4\n",
|
||
|
" | 2 0 3\n",
|
||
|
" | 3 6 8\n",
|
||
|
" | 4 5 3\n",
|
||
|
" | \n",
|
||
|
" | clip_lower(self, threshold, axis=None, inplace=False)\n",
|
||
|
" | Trim values below a given threshold.\n",
|
||
|
" | \n",
|
||
|
" | .. deprecated:: 0.24.0\n",
|
||
|
" | Use clip(lower=threshold) instead.\n",
|
||
|
" | \n",
|
||
|
" | Elements below the `threshold` will be changed to match the\n",
|
||
|
" | `threshold` value(s). Threshold can be a single value or an array,\n",
|
||
|
" | in the latter case it performs the truncation element-wise.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | threshold : numeric or array-like\n",
|
||
|
" | Minimum value allowed. All values below threshold will be set to\n",
|
||
|
" | this value.\n",
|
||
|
" | \n",
|
||
|
" | * float : every value is compared to `threshold`.\n",
|
||
|
" | * array-like : The shape of `threshold` should match the object\n",
|
||
|
" | it's compared to. When `self` is a Series, `threshold` should be\n",
|
||
|
" | the length. When `self` is a DataFrame, `threshold` should 2-D\n",
|
||
|
" | and the same shape as `self` for ``axis=None``, or 1-D and the\n",
|
||
|
" | same length as the axis being compared.\n",
|
||
|
" | \n",
|
||
|
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
|
||
|
" | Align `self` with `threshold` along the given axis.\n",
|
||
|
" | \n",
|
||
|
" | inplace : boolean, default False\n",
|
||
|
" | Whether to perform the operation in place on the data.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.21.0\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series or DataFrame\n",
|
||
|
" | Original data with values trimmed.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.clip : General purpose method to trim Series values to given\n",
|
||
|
" | threshold(s).\n",
|
||
|
" | DataFrame.clip : General purpose method to trim DataFrame values to\n",
|
||
|
" | given threshold(s).\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | \n",
|
||
|
" | Series single threshold clipping:\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series([5, 6, 7, 8, 9])\n",
|
||
|
" | >>> s.clip(lower=8)\n",
|
||
|
" | 0 8\n",
|
||
|
" | 1 8\n",
|
||
|
" | 2 8\n",
|
||
|
" | 3 8\n",
|
||
|
" | 4 9\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Series clipping element-wise using an array of thresholds. `threshold`\n",
|
||
|
" | should be the same length as the Series.\n",
|
||
|
" | \n",
|
||
|
" | >>> elemwise_thresholds = [4, 8, 7, 2, 5]\n",
|
||
|
" | >>> s.clip(lower=elemwise_thresholds)\n",
|
||
|
" | 0 5\n",
|
||
|
" | 1 8\n",
|
||
|
" | 2 7\n",
|
||
|
" | 3 8\n",
|
||
|
" | 4 9\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | DataFrames can be compared to a scalar.\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({\"A\": [1, 3, 5], \"B\": [2, 4, 6]})\n",
|
||
|
" | >>> df\n",
|
||
|
" | A B\n",
|
||
|
" | 0 1 2\n",
|
||
|
" | 1 3 4\n",
|
||
|
" | 2 5 6\n",
|
||
|
" | \n",
|
||
|
" | >>> df.clip(lower=3)\n",
|
||
|
" | A B\n",
|
||
|
" | 0 3 3\n",
|
||
|
" | 1 3 4\n",
|
||
|
" | 2 5 6\n",
|
||
|
" | \n",
|
||
|
" | Or to an array of values. By default, `threshold` should be the same\n",
|
||
|
" | shape as the DataFrame.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.clip(lower=np.array([[3, 4], [2, 2], [6, 2]]))\n",
|
||
|
" | A B\n",
|
||
|
" | 0 3 4\n",
|
||
|
" | 1 3 4\n",
|
||
|
" | 2 6 6\n",
|
||
|
" | \n",
|
||
|
" | Control how `threshold` is broadcast with `axis`. In this case\n",
|
||
|
" | `threshold` should be the same length as the axis specified by\n",
|
||
|
" | `axis`.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.clip(lower=[3, 3, 5], axis='index')\n",
|
||
|
" | A B\n",
|
||
|
" | 0 3 3\n",
|
||
|
" | 1 3 4\n",
|
||
|
" | 2 5 6\n",
|
||
|
" | \n",
|
||
|
" | >>> df.clip(lower=[4, 5], axis='columns')\n",
|
||
|
" | A B\n",
|
||
|
" | 0 4 5\n",
|
||
|
" | 1 4 5\n",
|
||
|
" | 2 5 6\n",
|
||
|
" | \n",
|
||
|
" | clip_upper(self, threshold, axis=None, inplace=False)\n",
|
||
|
" | Trim values above a given threshold.\n",
|
||
|
" | \n",
|
||
|
" | .. deprecated:: 0.24.0\n",
|
||
|
" | Use clip(upper=threshold) instead.\n",
|
||
|
" | \n",
|
||
|
" | Elements above the `threshold` will be changed to match the\n",
|
||
|
" | `threshold` value(s). Threshold can be a single value or an array,\n",
|
||
|
" | in the latter case it performs the truncation element-wise.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | threshold : numeric or array-like\n",
|
||
|
" | Maximum value allowed. All values above threshold will be set to\n",
|
||
|
" | this value.\n",
|
||
|
" | \n",
|
||
|
" | * float : every value is compared to `threshold`.\n",
|
||
|
" | * array-like : The shape of `threshold` should match the object\n",
|
||
|
" | it's compared to. When `self` is a Series, `threshold` should be\n",
|
||
|
" | the length. When `self` is a DataFrame, `threshold` should 2-D\n",
|
||
|
" | and the same shape as `self` for ``axis=None``, or 1-D and the\n",
|
||
|
" | same length as the axis being compared.\n",
|
||
|
" | \n",
|
||
|
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
|
||
|
" | Align object with `threshold` along the given axis.\n",
|
||
|
" | inplace : boolean, default False\n",
|
||
|
" | Whether to perform the operation in place on the data.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.21.0\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series or DataFrame\n",
|
||
|
" | Original data with values trimmed.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.clip : General purpose method to trim Series values to given\n",
|
||
|
" | threshold(s).\n",
|
||
|
" | DataFrame.clip : General purpose method to trim DataFrame values to\n",
|
||
|
" | given threshold(s).\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series([1, 2, 3, 4, 5])\n",
|
||
|
" | >>> s\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | 2 3\n",
|
||
|
" | 3 4\n",
|
||
|
" | 4 5\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | >>> s.clip(upper=3)\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | 2 3\n",
|
||
|
" | 3 3\n",
|
||
|
" | 4 3\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | >>> elemwise_thresholds = [5, 4, 3, 2, 1]\n",
|
||
|
" | >>> elemwise_thresholds\n",
|
||
|
" | [5, 4, 3, 2, 1]\n",
|
||
|
" | \n",
|
||
|
" | >>> s.clip(upper=elemwise_thresholds)\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | 2 3\n",
|
||
|
" | 3 2\n",
|
||
|
" | 4 1\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | convert_objects(self, convert_dates=True, convert_numeric=False, convert_timedeltas=True, copy=True)\n",
|
||
|
" | Attempt to infer better dtype for object columns.\n",
|
||
|
" | \n",
|
||
|
" | .. deprecated:: 0.21.0\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | convert_dates : boolean, default True\n",
|
||
|
" | If True, convert to date where possible. If 'coerce', force\n",
|
||
|
" | conversion, with unconvertible values becoming NaT.\n",
|
||
|
" | convert_numeric : boolean, default False\n",
|
||
|
" | If True, attempt to coerce to numbers (including strings), with\n",
|
||
|
" | unconvertible values becoming NaN.\n",
|
||
|
" | convert_timedeltas : boolean, default True\n",
|
||
|
" | If True, convert to timedelta where possible. If 'coerce', force\n",
|
||
|
" | conversion, with unconvertible values becoming NaT.\n",
|
||
|
" | copy : boolean, default True\n",
|
||
|
" | If True, return a copy even if no copy is necessary (e.g. no\n",
|
||
|
" | conversion was done). Note: This is meant for internal use, and\n",
|
||
|
" | should not be confused with inplace.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | converted : same as input object\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | to_datetime : Convert argument to datetime.\n",
|
||
|
" | to_timedelta : Convert argument to timedelta.\n",
|
||
|
" | to_numeric : Convert argument to numeric type.\n",
|
||
|
" | \n",
|
||
|
" | copy(self, deep=True)\n",
|
||
|
" | Make a copy of this object's indices and data.\n",
|
||
|
" | \n",
|
||
|
" | When ``deep=True`` (default), a new object will be created with a\n",
|
||
|
" | copy of the calling object's data and indices. Modifications to\n",
|
||
|
" | the data or indices of the copy will not be reflected in the\n",
|
||
|
" | original object (see notes below).\n",
|
||
|
" | \n",
|
||
|
" | When ``deep=False``, a new object will be created without copying\n",
|
||
|
" | the calling object's data or index (only references to the data\n",
|
||
|
" | and index are copied). Any changes to the data of the original\n",
|
||
|
" | will be reflected in the shallow copy (and vice versa).\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | deep : bool, default True\n",
|
||
|
" | Make a deep copy, including a copy of the data and the indices.\n",
|
||
|
" | With ``deep=False`` neither the indices nor the data are copied.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | copy : Series, DataFrame or Panel\n",
|
||
|
" | Object type matches caller.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | When ``deep=True``, data is copied but actual Python objects\n",
|
||
|
" | will not be copied recursively, only the reference to the object.\n",
|
||
|
" | This is in contrast to `copy.deepcopy` in the Standard Library,\n",
|
||
|
" | which recursively copies object data (see examples below).\n",
|
||
|
" | \n",
|
||
|
" | While ``Index`` objects are copied when ``deep=True``, the underlying\n",
|
||
|
" | numpy array is not copied for performance reasons. Since ``Index`` is\n",
|
||
|
" | immutable, the underlying data can be safely shared and a copy\n",
|
||
|
" | is not needed.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> s = pd.Series([1, 2], index=[\"a\", \"b\"])\n",
|
||
|
" | >>> s\n",
|
||
|
" | a 1\n",
|
||
|
" | b 2\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | >>> s_copy = s.copy()\n",
|
||
|
" | >>> s_copy\n",
|
||
|
" | a 1\n",
|
||
|
" | b 2\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | **Shallow copy versus default (deep) copy:**\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series([1, 2], index=[\"a\", \"b\"])\n",
|
||
|
" | >>> deep = s.copy()\n",
|
||
|
" | >>> shallow = s.copy(deep=False)\n",
|
||
|
" | \n",
|
||
|
" | Shallow copy shares data and index with original.\n",
|
||
|
" | \n",
|
||
|
" | >>> s is shallow\n",
|
||
|
" | False\n",
|
||
|
" | >>> s.values is shallow.values and s.index is shallow.index\n",
|
||
|
" | True\n",
|
||
|
" | \n",
|
||
|
" | Deep copy has own copy of data and index.\n",
|
||
|
" | \n",
|
||
|
" | >>> s is deep\n",
|
||
|
" | False\n",
|
||
|
" | >>> s.values is deep.values or s.index is deep.index\n",
|
||
|
" | False\n",
|
||
|
" | \n",
|
||
|
" | Updates to the data shared by shallow copy and original is reflected\n",
|
||
|
" | in both; deep copy remains unchanged.\n",
|
||
|
" | \n",
|
||
|
" | >>> s[0] = 3\n",
|
||
|
" | >>> shallow[1] = 4\n",
|
||
|
" | >>> s\n",
|
||
|
" | a 3\n",
|
||
|
" | b 4\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | >>> shallow\n",
|
||
|
" | a 3\n",
|
||
|
" | b 4\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | >>> deep\n",
|
||
|
" | a 1\n",
|
||
|
" | b 2\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Note that when copying an object containing Python objects, a deep copy\n",
|
||
|
" | will copy the data, but will not do so recursively. Updating a nested\n",
|
||
|
" | data object will be reflected in the deep copy.\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series([[1, 2], [3, 4]])\n",
|
||
|
" | >>> deep = s.copy()\n",
|
||
|
" | >>> s[0][0] = 10\n",
|
||
|
" | >>> s\n",
|
||
|
" | 0 [10, 2]\n",
|
||
|
" | 1 [3, 4]\n",
|
||
|
" | dtype: object\n",
|
||
|
" | >>> deep\n",
|
||
|
" | 0 [10, 2]\n",
|
||
|
" | 1 [3, 4]\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | describe(self, percentiles=None, include=None, exclude=None)\n",
|
||
|
" | Generate descriptive statistics that summarize the central tendency,\n",
|
||
|
" | dispersion and shape of a dataset's distribution, excluding\n",
|
||
|
" | ``NaN`` values.\n",
|
||
|
" | \n",
|
||
|
" | Analyzes both numeric and object series, as well\n",
|
||
|
" | as ``DataFrame`` column sets of mixed data types. The output\n",
|
||
|
" | will vary depending on what is provided. Refer to the notes\n",
|
||
|
" | below for more detail.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | percentiles : list-like of numbers, optional\n",
|
||
|
" | The percentiles to include in the output. All should\n",
|
||
|
" | fall between 0 and 1. The default is\n",
|
||
|
" | ``[.25, .5, .75]``, which returns the 25th, 50th, and\n",
|
||
|
" | 75th percentiles.\n",
|
||
|
" | include : 'all', list-like of dtypes or None (default), optional\n",
|
||
|
" | A white list of data types to include in the result. Ignored\n",
|
||
|
" | for ``Series``. Here are the options:\n",
|
||
|
" | \n",
|
||
|
" | - 'all' : All columns of the input will be included in the output.\n",
|
||
|
" | - A list-like of dtypes : Limits the results to the\n",
|
||
|
" | provided data types.\n",
|
||
|
" | To limit the result to numeric types submit\n",
|
||
|
" | ``numpy.number``. To limit it instead to object columns submit\n",
|
||
|
" | the ``numpy.object`` data type. Strings\n",
|
||
|
" | can also be used in the style of\n",
|
||
|
" | ``select_dtypes`` (e.g. ``df.describe(include=['O'])``). To\n",
|
||
|
" | select pandas categorical columns, use ``'category'``\n",
|
||
|
" | - None (default) : The result will include all numeric columns.\n",
|
||
|
" | exclude : list-like of dtypes or None (default), optional,\n",
|
||
|
" | A black list of data types to omit from the result. Ignored\n",
|
||
|
" | for ``Series``. Here are the options:\n",
|
||
|
" | \n",
|
||
|
" | - A list-like of dtypes : Excludes the provided data types\n",
|
||
|
" | from the result. To exclude numeric types submit\n",
|
||
|
" | ``numpy.number``. To exclude object columns submit the data\n",
|
||
|
" | type ``numpy.object``. Strings can also be used in the style of\n",
|
||
|
" | ``select_dtypes`` (e.g. ``df.describe(include=['O'])``). To\n",
|
||
|
" | exclude pandas categorical columns, use ``'category'``\n",
|
||
|
" | - None (default) : The result will exclude nothing.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series or DataFrame\n",
|
||
|
" | Summary statistics of the Series or Dataframe provided.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | DataFrame.count: Count number of non-NA/null observations.\n",
|
||
|
" | DataFrame.max: Maximum of the values in the object.\n",
|
||
|
" | DataFrame.min: Minimum of the values in the object.\n",
|
||
|
" | DataFrame.mean: Mean of the values.\n",
|
||
|
" | DataFrame.std: Standard deviation of the obersvations.\n",
|
||
|
" | DataFrame.select_dtypes: Subset of a DataFrame including/excluding\n",
|
||
|
" | columns based on their dtype.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | For numeric data, the result's index will include ``count``,\n",
|
||
|
" | ``mean``, ``std``, ``min``, ``max`` as well as lower, ``50`` and\n",
|
||
|
" | upper percentiles. By default the lower percentile is ``25`` and the\n",
|
||
|
" | upper percentile is ``75``. The ``50`` percentile is the\n",
|
||
|
" | same as the median.\n",
|
||
|
" | \n",
|
||
|
" | For object data (e.g. strings or timestamps), the result's index\n",
|
||
|
" | will include ``count``, ``unique``, ``top``, and ``freq``. The ``top``\n",
|
||
|
" | is the most common value. The ``freq`` is the most common value's\n",
|
||
|
" | frequency. Timestamps also include the ``first`` and ``last`` items.\n",
|
||
|
" | \n",
|
||
|
" | If multiple object values have the highest count, then the\n",
|
||
|
" | ``count`` and ``top`` results will be arbitrarily chosen from\n",
|
||
|
" | among those with the highest count.\n",
|
||
|
" | \n",
|
||
|
" | For mixed data types provided via a ``DataFrame``, the default is to\n",
|
||
|
" | return only an analysis of numeric columns. If the dataframe consists\n",
|
||
|
" | only of object and categorical data without any numeric columns, the\n",
|
||
|
" | default is to return an analysis of both the object and categorical\n",
|
||
|
" | columns. If ``include='all'`` is provided as an option, the result\n",
|
||
|
" | will include a union of attributes of each type.\n",
|
||
|
" | \n",
|
||
|
" | The `include` and `exclude` parameters can be used to limit\n",
|
||
|
" | which columns in a ``DataFrame`` are analyzed for the output.\n",
|
||
|
" | The parameters are ignored when analyzing a ``Series``.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | Describing a numeric ``Series``.\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series([1, 2, 3])\n",
|
||
|
" | >>> s.describe()\n",
|
||
|
" | count 3.0\n",
|
||
|
" | mean 2.0\n",
|
||
|
" | std 1.0\n",
|
||
|
" | min 1.0\n",
|
||
|
" | 25% 1.5\n",
|
||
|
" | 50% 2.0\n",
|
||
|
" | 75% 2.5\n",
|
||
|
" | max 3.0\n",
|
||
|
" | dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Describing a categorical ``Series``.\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series(['a', 'a', 'b', 'c'])\n",
|
||
|
" | >>> s.describe()\n",
|
||
|
" | count 4\n",
|
||
|
" | unique 3\n",
|
||
|
" | top a\n",
|
||
|
" | freq 2\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | Describing a timestamp ``Series``.\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series([\n",
|
||
|
" | ... np.datetime64(\"2000-01-01\"),\n",
|
||
|
" | ... np.datetime64(\"2010-01-01\"),\n",
|
||
|
" | ... np.datetime64(\"2010-01-01\")\n",
|
||
|
" | ... ])\n",
|
||
|
" | >>> s.describe()\n",
|
||
|
" | count 3\n",
|
||
|
" | unique 2\n",
|
||
|
" | top 2010-01-01 00:00:00\n",
|
||
|
" | freq 2\n",
|
||
|
" | first 2000-01-01 00:00:00\n",
|
||
|
" | last 2010-01-01 00:00:00\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | Describing a ``DataFrame``. By default only numeric fields\n",
|
||
|
" | are returned.\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({'categorical': pd.Categorical(['d','e','f']),\n",
|
||
|
" | ... 'numeric': [1, 2, 3],\n",
|
||
|
" | ... 'object': ['a', 'b', 'c']\n",
|
||
|
" | ... })\n",
|
||
|
" | >>> df.describe()\n",
|
||
|
" | numeric\n",
|
||
|
" | count 3.0\n",
|
||
|
" | mean 2.0\n",
|
||
|
" | std 1.0\n",
|
||
|
" | min 1.0\n",
|
||
|
" | 25% 1.5\n",
|
||
|
" | 50% 2.0\n",
|
||
|
" | 75% 2.5\n",
|
||
|
" | max 3.0\n",
|
||
|
" | \n",
|
||
|
" | Describing all columns of a ``DataFrame`` regardless of data type.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.describe(include='all')\n",
|
||
|
" | categorical numeric object\n",
|
||
|
" | count 3 3.0 3\n",
|
||
|
" | unique 3 NaN 3\n",
|
||
|
" | top f NaN c\n",
|
||
|
" | freq 1 NaN 1\n",
|
||
|
" | mean NaN 2.0 NaN\n",
|
||
|
" | std NaN 1.0 NaN\n",
|
||
|
" | min NaN 1.0 NaN\n",
|
||
|
" | 25% NaN 1.5 NaN\n",
|
||
|
" | 50% NaN 2.0 NaN\n",
|
||
|
" | 75% NaN 2.5 NaN\n",
|
||
|
" | max NaN 3.0 NaN\n",
|
||
|
" | \n",
|
||
|
" | Describing a column from a ``DataFrame`` by accessing it as\n",
|
||
|
" | an attribute.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.numeric.describe()\n",
|
||
|
" | count 3.0\n",
|
||
|
" | mean 2.0\n",
|
||
|
" | std 1.0\n",
|
||
|
" | min 1.0\n",
|
||
|
" | 25% 1.5\n",
|
||
|
" | 50% 2.0\n",
|
||
|
" | 75% 2.5\n",
|
||
|
" | max 3.0\n",
|
||
|
" | Name: numeric, dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Including only numeric columns in a ``DataFrame`` description.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.describe(include=[np.number])\n",
|
||
|
" | numeric\n",
|
||
|
" | count 3.0\n",
|
||
|
" | mean 2.0\n",
|
||
|
" | std 1.0\n",
|
||
|
" | min 1.0\n",
|
||
|
" | 25% 1.5\n",
|
||
|
" | 50% 2.0\n",
|
||
|
" | 75% 2.5\n",
|
||
|
" | max 3.0\n",
|
||
|
" | \n",
|
||
|
" | Including only string columns in a ``DataFrame`` description.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.describe(include=[np.object])\n",
|
||
|
" | object\n",
|
||
|
" | count 3\n",
|
||
|
" | unique 3\n",
|
||
|
" | top c\n",
|
||
|
" | freq 1\n",
|
||
|
" | \n",
|
||
|
" | Including only categorical columns from a ``DataFrame`` description.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.describe(include=['category'])\n",
|
||
|
" | categorical\n",
|
||
|
" | count 3\n",
|
||
|
" | unique 3\n",
|
||
|
" | top f\n",
|
||
|
" | freq 1\n",
|
||
|
" | \n",
|
||
|
" | Excluding numeric columns from a ``DataFrame`` description.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.describe(exclude=[np.number])\n",
|
||
|
" | categorical object\n",
|
||
|
" | count 3 3\n",
|
||
|
" | unique 3 3\n",
|
||
|
" | top f c\n",
|
||
|
" | freq 1 1\n",
|
||
|
" | \n",
|
||
|
" | Excluding object columns from a ``DataFrame`` description.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.describe(exclude=[np.object])\n",
|
||
|
" | categorical numeric\n",
|
||
|
" | count 3 3.0\n",
|
||
|
" | unique 3 NaN\n",
|
||
|
" | top f NaN\n",
|
||
|
" | freq 1 NaN\n",
|
||
|
" | mean NaN 2.0\n",
|
||
|
" | std NaN 1.0\n",
|
||
|
" | min NaN 1.0\n",
|
||
|
" | 25% NaN 1.5\n",
|
||
|
" | 50% NaN 2.0\n",
|
||
|
" | 75% NaN 2.5\n",
|
||
|
" | max NaN 3.0\n",
|
||
|
" | \n",
|
||
|
" | droplevel(self, level, axis=0)\n",
|
||
|
" | Return DataFrame with requested index / column level(s) removed.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.24.0\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | level : int, str, or list-like\n",
|
||
|
" | If a string is given, must be the name of a level\n",
|
||
|
" | If list-like, elements must be names or positional indexes\n",
|
||
|
" | of levels.\n",
|
||
|
" | \n",
|
||
|
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | DataFrame.droplevel()\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> df = pd.DataFrame([\n",
|
||
|
" | ... [1, 2, 3, 4],\n",
|
||
|
" | ... [5, 6, 7, 8],\n",
|
||
|
" | ... [9, 10, 11, 12]\n",
|
||
|
" | ... ]).set_index([0, 1]).rename_axis(['a', 'b'])\n",
|
||
|
" | \n",
|
||
|
" | >>> df.columns = pd.MultiIndex.from_tuples([\n",
|
||
|
" | ... ('c', 'e'), ('d', 'f')\n",
|
||
|
" | ... ], names=['level_1', 'level_2'])\n",
|
||
|
" | \n",
|
||
|
" | >>> df\n",
|
||
|
" | level_1 c d\n",
|
||
|
" | level_2 e f\n",
|
||
|
" | a b\n",
|
||
|
" | 1 2 3 4\n",
|
||
|
" | 5 6 7 8\n",
|
||
|
" | 9 10 11 12\n",
|
||
|
" | \n",
|
||
|
" | >>> df.droplevel('a')\n",
|
||
|
" | level_1 c d\n",
|
||
|
" | level_2 e f\n",
|
||
|
" | b\n",
|
||
|
" | 2 3 4\n",
|
||
|
" | 6 7 8\n",
|
||
|
" | 10 11 12\n",
|
||
|
" | \n",
|
||
|
" | >>> df.droplevel('level2', axis=1)\n",
|
||
|
" | level_1 c d\n",
|
||
|
" | a b\n",
|
||
|
" | 1 2 3 4\n",
|
||
|
" | 5 6 7 8\n",
|
||
|
" | 9 10 11 12\n",
|
||
|
" | \n",
|
||
|
" | equals(self, other)\n",
|
||
|
" | Test whether two objects contain the same elements.\n",
|
||
|
" | \n",
|
||
|
" | This function allows two Series or DataFrames to be compared against\n",
|
||
|
" | each other to see if they have the same shape and elements. NaNs in\n",
|
||
|
" | the same location are considered equal. The column headers do not\n",
|
||
|
" | need to have the same type, but the elements within the columns must\n",
|
||
|
" | be the same dtype.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | other : Series or DataFrame\n",
|
||
|
" | The other Series or DataFrame to be compared with the first.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | bool\n",
|
||
|
" | True if all elements are the same in both objects, False\n",
|
||
|
" | otherwise.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.eq : Compare two Series objects of the same length\n",
|
||
|
" | and return a Series where each element is True if the element\n",
|
||
|
" | in each Series is equal, False otherwise.\n",
|
||
|
" | DataFrame.eq : Compare two DataFrame objects of the same shape and\n",
|
||
|
" | return a DataFrame where each element is True if the respective\n",
|
||
|
" | element in each DataFrame is equal, False otherwise.\n",
|
||
|
" | assert_series_equal : Return True if left and right Series are equal,\n",
|
||
|
" | False otherwise.\n",
|
||
|
" | assert_frame_equal : Return True if left and right DataFrames are\n",
|
||
|
" | equal, False otherwise.\n",
|
||
|
" | numpy.array_equal : Return True if two arrays have the same shape\n",
|
||
|
" | and elements, False otherwise.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | This function requires that the elements have the same dtype as their\n",
|
||
|
" | respective elements in the other Series or DataFrame. However, the\n",
|
||
|
" | column labels do not need to have the same type, as long as they are\n",
|
||
|
" | still considered equal.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> df = pd.DataFrame({1: [10], 2: [20]})\n",
|
||
|
" | >>> df\n",
|
||
|
" | 1 2\n",
|
||
|
" | 0 10 20\n",
|
||
|
" | \n",
|
||
|
" | DataFrames df and exactly_equal have the same types and values for\n",
|
||
|
" | their elements and column labels, which will return True.\n",
|
||
|
" | \n",
|
||
|
" | >>> exactly_equal = pd.DataFrame({1: [10], 2: [20]})\n",
|
||
|
" | >>> exactly_equal\n",
|
||
|
" | 1 2\n",
|
||
|
" | 0 10 20\n",
|
||
|
" | >>> df.equals(exactly_equal)\n",
|
||
|
" | True\n",
|
||
|
" | \n",
|
||
|
" | DataFrames df and different_column_type have the same element\n",
|
||
|
" | types and values, but have different types for the column labels,\n",
|
||
|
" | which will still return True.\n",
|
||
|
" | \n",
|
||
|
" | >>> different_column_type = pd.DataFrame({1.0: [10], 2.0: [20]})\n",
|
||
|
" | >>> different_column_type\n",
|
||
|
" | 1.0 2.0\n",
|
||
|
" | 0 10 20\n",
|
||
|
" | >>> df.equals(different_column_type)\n",
|
||
|
" | True\n",
|
||
|
" | \n",
|
||
|
" | DataFrames df and different_data_type have different types for the\n",
|
||
|
" | same values for their elements, and will return False even though\n",
|
||
|
" | their column labels are the same values and types.\n",
|
||
|
" | \n",
|
||
|
" | >>> different_data_type = pd.DataFrame({1: [10.0], 2: [20.0]})\n",
|
||
|
" | >>> different_data_type\n",
|
||
|
" | 1 2\n",
|
||
|
" | 0 10.0 20.0\n",
|
||
|
" | >>> df.equals(different_data_type)\n",
|
||
|
" | False\n",
|
||
|
" | \n",
|
||
|
" | ffill(self, axis=None, inplace=False, limit=None, downcast=None)\n",
|
||
|
" | Synonym for :meth:`DataFrame.fillna` with ``method='ffill'``.\n",
|
||
|
" | \n",
|
||
|
" | filter(self, items=None, like=None, regex=None, axis=None)\n",
|
||
|
" | Subset rows or columns of dataframe according to labels in\n",
|
||
|
" | the specified index.\n",
|
||
|
" | \n",
|
||
|
" | Note that this routine does not filter a dataframe on its\n",
|
||
|
" | contents. The filter is applied to the labels of the index.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | items : list-like\n",
|
||
|
" | List of axis to restrict to (must not all be present).\n",
|
||
|
" | like : string\n",
|
||
|
" | Keep axis where \"arg in col == True\".\n",
|
||
|
" | regex : string (regular expression)\n",
|
||
|
" | Keep axis with re.search(regex, col) == True.\n",
|
||
|
" | axis : int or string axis name\n",
|
||
|
" | The axis to filter on. By default this is the info axis,\n",
|
||
|
" | 'index' for Series, 'columns' for DataFrame.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | same type as input object\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | DataFrame.loc\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | The ``items``, ``like``, and ``regex`` parameters are\n",
|
||
|
" | enforced to be mutually exclusive.\n",
|
||
|
" | \n",
|
||
|
" | ``axis`` defaults to the info axis that is used when indexing\n",
|
||
|
" | with ``[]``.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> df = pd.DataFrame(np.array(([1,2,3], [4,5,6])),\n",
|
||
|
" | ... index=['mouse', 'rabbit'],\n",
|
||
|
" | ... columns=['one', 'two', 'three'])\n",
|
||
|
" | \n",
|
||
|
" | >>> # select columns by name\n",
|
||
|
" | >>> df.filter(items=['one', 'three'])\n",
|
||
|
" | one three\n",
|
||
|
" | mouse 1 3\n",
|
||
|
" | rabbit 4 6\n",
|
||
|
" | \n",
|
||
|
" | >>> # select columns by regular expression\n",
|
||
|
" | >>> df.filter(regex='e$', axis=1)\n",
|
||
|
" | one three\n",
|
||
|
" | mouse 1 3\n",
|
||
|
" | rabbit 4 6\n",
|
||
|
" | \n",
|
||
|
" | >>> # select rows containing 'bbi'\n",
|
||
|
" | >>> df.filter(like='bbi', axis=0)\n",
|
||
|
" | one two three\n",
|
||
|
" | rabbit 4 5 6\n",
|
||
|
" | \n",
|
||
|
" | first(self, offset)\n",
|
||
|
" | Convenience method for subsetting initial periods of time series data\n",
|
||
|
" | based on a date offset.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | offset : string, DateOffset, dateutil.relativedelta\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | subset : same type as caller\n",
|
||
|
" | \n",
|
||
|
" | Raises\n",
|
||
|
" | ------\n",
|
||
|
" | TypeError\n",
|
||
|
" | If the index is not a :class:`DatetimeIndex`\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | last : Select final periods of time series based on a date offset.\n",
|
||
|
" | at_time : Select values at a particular time of the day.\n",
|
||
|
" | between_time : Select values between particular times of the day.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> i = pd.date_range('2018-04-09', periods=4, freq='2D')\n",
|
||
|
" | >>> ts = pd.DataFrame({'A': [1,2,3,4]}, index=i)\n",
|
||
|
" | >>> ts\n",
|
||
|
" | A\n",
|
||
|
" | 2018-04-09 1\n",
|
||
|
" | 2018-04-11 2\n",
|
||
|
" | 2018-04-13 3\n",
|
||
|
" | 2018-04-15 4\n",
|
||
|
" | \n",
|
||
|
" | Get the rows for the first 3 days:\n",
|
||
|
" | \n",
|
||
|
" | >>> ts.first('3D')\n",
|
||
|
" | A\n",
|
||
|
" | 2018-04-09 1\n",
|
||
|
" | 2018-04-11 2\n",
|
||
|
" | \n",
|
||
|
" | Notice the data for 3 first calender days were returned, not the first\n",
|
||
|
" | 3 days observed in the dataset, and therefore data for 2018-04-13 was\n",
|
||
|
" | not returned.\n",
|
||
|
" | \n",
|
||
|
" | first_valid_index(self)\n",
|
||
|
" | Return index for first non-NA/null value.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | --------\n",
|
||
|
" | scalar : type of index\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | --------\n",
|
||
|
" | If all elements are non-NA/null, returns None.\n",
|
||
|
" | Also returns None for empty NDFrame.\n",
|
||
|
" | \n",
|
||
|
" | get(self, key, default=None)\n",
|
||
|
" | Get item from object for given key (DataFrame column, Panel slice,\n",
|
||
|
" | etc.). Returns default value if not found.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | key : object\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | value : same type as items contained in object\n",
|
||
|
" | \n",
|
||
|
" | get_dtype_counts(self)\n",
|
||
|
" | Return counts of unique dtypes in this object.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | dtype : Series\n",
|
||
|
" | Series with the count of columns with each dtype.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | dtypes : Return the dtypes in this object.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = [['a', 1, 1.0], ['b', 2, 2.0], ['c', 3, 3.0]]\n",
|
||
|
" | >>> df = pd.DataFrame(a, columns=['str', 'int', 'float'])\n",
|
||
|
" | >>> df\n",
|
||
|
" | str int float\n",
|
||
|
" | 0 a 1 1.0\n",
|
||
|
" | 1 b 2 2.0\n",
|
||
|
" | 2 c 3 3.0\n",
|
||
|
" | \n",
|
||
|
" | >>> df.get_dtype_counts()\n",
|
||
|
" | float64 1\n",
|
||
|
" | int64 1\n",
|
||
|
" | object 1\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | get_ftype_counts(self)\n",
|
||
|
" | Return counts of unique ftypes in this object.\n",
|
||
|
" | \n",
|
||
|
" | .. deprecated:: 0.23.0\n",
|
||
|
" | \n",
|
||
|
" | This is useful for SparseDataFrame or for DataFrames containing\n",
|
||
|
" | sparse arrays.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | dtype : Series\n",
|
||
|
" | Series with the count of columns with each type and\n",
|
||
|
" | sparsity (dense/sparse)\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | ftypes : Return ftypes (indication of sparse/dense and dtype) in\n",
|
||
|
" | this object.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> a = [['a', 1, 1.0], ['b', 2, 2.0], ['c', 3, 3.0]]\n",
|
||
|
" | >>> df = pd.DataFrame(a, columns=['str', 'int', 'float'])\n",
|
||
|
" | >>> df\n",
|
||
|
" | str int float\n",
|
||
|
" | 0 a 1 1.0\n",
|
||
|
" | 1 b 2 2.0\n",
|
||
|
" | 2 c 3 3.0\n",
|
||
|
" | \n",
|
||
|
" | >>> df.get_ftype_counts() # doctest: +SKIP\n",
|
||
|
" | float64:dense 1\n",
|
||
|
" | int64:dense 1\n",
|
||
|
" | object:dense 1\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | groupby(self, by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, observed=False, **kwargs)\n",
|
||
|
" | Group DataFrame or Series using a mapper or by a Series of columns.\n",
|
||
|
" | \n",
|
||
|
" | A groupby operation involves some combination of splitting the\n",
|
||
|
" | object, applying a function, and combining the results. This can be\n",
|
||
|
" | used to group large amounts of data and compute operations on these\n",
|
||
|
" | groups.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | by : mapping, function, label, or list of labels\n",
|
||
|
" | Used to determine the groups for the groupby.\n",
|
||
|
" | If ``by`` is a function, it's called on each value of the object's\n",
|
||
|
" | index. If a dict or Series is passed, the Series or dict VALUES\n",
|
||
|
" | will be used to determine the groups (the Series' values are first\n",
|
||
|
" | aligned; see ``.align()`` method). If an ndarray is passed, the\n",
|
||
|
" | values are used as-is determine the groups. A label or list of\n",
|
||
|
" | labels may be passed to group by the columns in ``self``. Notice\n",
|
||
|
" | that a tuple is interpreted a (single) key.\n",
|
||
|
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
|
||
|
" | Split along rows (0) or columns (1).\n",
|
||
|
" | level : int, level name, or sequence of such, default None\n",
|
||
|
" | If the axis is a MultiIndex (hierarchical), group by a particular\n",
|
||
|
" | level or levels.\n",
|
||
|
" | as_index : bool, default True\n",
|
||
|
" | For aggregated output, return object with group labels as the\n",
|
||
|
" | index. Only relevant for DataFrame input. as_index=False is\n",
|
||
|
" | effectively \"SQL-style\" grouped output.\n",
|
||
|
" | sort : bool, default True\n",
|
||
|
" | Sort group keys. Get better performance by turning this off.\n",
|
||
|
" | Note this does not influence the order of observations within each\n",
|
||
|
" | group. Groupby preserves the order of rows within each group.\n",
|
||
|
" | group_keys : bool, default True\n",
|
||
|
" | When calling apply, add group keys to index to identify pieces.\n",
|
||
|
" | squeeze : bool, default False\n",
|
||
|
" | Reduce the dimensionality of the return type if possible,\n",
|
||
|
" | otherwise return a consistent type.\n",
|
||
|
" | observed : bool, default False\n",
|
||
|
" | This only applies if any of the groupers are Categoricals.\n",
|
||
|
" | If True: only show observed values for categorical groupers.\n",
|
||
|
" | If False: show all values for categorical groupers.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.23.0\n",
|
||
|
" | \n",
|
||
|
" | **kwargs\n",
|
||
|
" | Optional, only accepts keyword argument 'mutated' and is passed\n",
|
||
|
" | to groupby.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | DataFrameGroupBy or SeriesGroupBy\n",
|
||
|
" | Depends on the calling object and returns groupby object that\n",
|
||
|
" | contains information about the groups.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | resample : Convenience method for frequency conversion and resampling\n",
|
||
|
" | of time series.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | See the `user guide\n",
|
||
|
" | <http://pandas.pydata.org/pandas-docs/stable/groupby.html>`_ for more.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> df = pd.DataFrame({'Animal' : ['Falcon', 'Falcon',\n",
|
||
|
" | ... 'Parrot', 'Parrot'],\n",
|
||
|
" | ... 'Max Speed' : [380., 370., 24., 26.]})\n",
|
||
|
" | >>> df\n",
|
||
|
" | Animal Max Speed\n",
|
||
|
" | 0 Falcon 380.0\n",
|
||
|
" | 1 Falcon 370.0\n",
|
||
|
" | 2 Parrot 24.0\n",
|
||
|
" | 3 Parrot 26.0\n",
|
||
|
" | >>> df.groupby(['Animal']).mean()\n",
|
||
|
" | Max Speed\n",
|
||
|
" | Animal\n",
|
||
|
" | Falcon 375.0\n",
|
||
|
" | Parrot 25.0\n",
|
||
|
" | \n",
|
||
|
" | **Hierarchical Indexes**\n",
|
||
|
" | \n",
|
||
|
" | We can groupby different levels of a hierarchical index\n",
|
||
|
" | using the `level` parameter:\n",
|
||
|
" | \n",
|
||
|
" | >>> arrays = [['Falcon', 'Falcon', 'Parrot', 'Parrot'],\n",
|
||
|
" | ... ['Capitve', 'Wild', 'Capitve', 'Wild']]\n",
|
||
|
" | >>> index = pd.MultiIndex.from_arrays(arrays, names=('Animal', 'Type'))\n",
|
||
|
" | >>> df = pd.DataFrame({'Max Speed' : [390., 350., 30., 20.]},\n",
|
||
|
" | ... index=index)\n",
|
||
|
" | >>> df\n",
|
||
|
" | Max Speed\n",
|
||
|
" | Animal Type\n",
|
||
|
" | Falcon Capitve 390.0\n",
|
||
|
" | Wild 350.0\n",
|
||
|
" | Parrot Capitve 30.0\n",
|
||
|
" | Wild 20.0\n",
|
||
|
" | >>> df.groupby(level=0).mean()\n",
|
||
|
" | Max Speed\n",
|
||
|
" | Animal\n",
|
||
|
" | Falcon 370.0\n",
|
||
|
" | Parrot 25.0\n",
|
||
|
" | >>> df.groupby(level=1).mean()\n",
|
||
|
" | Max Speed\n",
|
||
|
" | Type\n",
|
||
|
" | Capitve 210.0\n",
|
||
|
" | Wild 185.0\n",
|
||
|
" | \n",
|
||
|
" | head(self, n=5)\n",
|
||
|
" | Return the first `n` rows.\n",
|
||
|
" | \n",
|
||
|
" | This function returns the first `n` rows for the object based\n",
|
||
|
" | on position. It is useful for quickly testing if your object\n",
|
||
|
" | has the right type of data in it.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | n : int, default 5\n",
|
||
|
" | Number of rows to select.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | obj_head : same type as caller\n",
|
||
|
" | The first `n` rows of the caller object.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | DataFrame.tail: Returns the last `n` rows.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> df = pd.DataFrame({'animal':['alligator', 'bee', 'falcon', 'lion',\n",
|
||
|
" | ... 'monkey', 'parrot', 'shark', 'whale', 'zebra']})\n",
|
||
|
" | >>> df\n",
|
||
|
" | animal\n",
|
||
|
" | 0 alligator\n",
|
||
|
" | 1 bee\n",
|
||
|
" | 2 falcon\n",
|
||
|
" | 3 lion\n",
|
||
|
" | 4 monkey\n",
|
||
|
" | 5 parrot\n",
|
||
|
" | 6 shark\n",
|
||
|
" | 7 whale\n",
|
||
|
" | 8 zebra\n",
|
||
|
" | \n",
|
||
|
" | Viewing the first 5 lines\n",
|
||
|
" | \n",
|
||
|
" | >>> df.head()\n",
|
||
|
" | animal\n",
|
||
|
" | 0 alligator\n",
|
||
|
" | 1 bee\n",
|
||
|
" | 2 falcon\n",
|
||
|
" | 3 lion\n",
|
||
|
" | 4 monkey\n",
|
||
|
" | \n",
|
||
|
" | Viewing the first `n` lines (three in this case)\n",
|
||
|
" | \n",
|
||
|
" | >>> df.head(3)\n",
|
||
|
" | animal\n",
|
||
|
" | 0 alligator\n",
|
||
|
" | 1 bee\n",
|
||
|
" | 2 falcon\n",
|
||
|
" | \n",
|
||
|
" | infer_objects(self)\n",
|
||
|
" | Attempt to infer better dtypes for object columns.\n",
|
||
|
" | \n",
|
||
|
" | Attempts soft conversion of object-dtyped\n",
|
||
|
" | columns, leaving non-object and unconvertible\n",
|
||
|
" | columns unchanged. The inference rules are the\n",
|
||
|
" | same as during normal Series/DataFrame construction.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.21.0\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | converted : same type as input object\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | to_datetime : Convert argument to datetime.\n",
|
||
|
" | to_timedelta : Convert argument to timedelta.\n",
|
||
|
" | to_numeric : Convert argument to numeric type.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> df = pd.DataFrame({\"A\": [\"a\", 1, 2, 3]})\n",
|
||
|
" | >>> df = df.iloc[1:]\n",
|
||
|
" | >>> df\n",
|
||
|
" | A\n",
|
||
|
" | 1 1\n",
|
||
|
" | 2 2\n",
|
||
|
" | 3 3\n",
|
||
|
" | \n",
|
||
|
" | >>> df.dtypes\n",
|
||
|
" | A object\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | >>> df.infer_objects().dtypes\n",
|
||
|
" | A int64\n",
|
||
|
" | dtype: object\n",
|
||
|
" | \n",
|
||
|
" | interpolate(self, method='linear', axis=0, limit=None, inplace=False, limit_direction='forward', limit_area=None, downcast=None, **kwargs)\n",
|
||
|
" | Interpolate values according to different methods.\n",
|
||
|
" | \n",
|
||
|
" | Please note that only ``method='linear'`` is supported for\n",
|
||
|
" | DataFrame/Series with a MultiIndex.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | method : str, default 'linear'\n",
|
||
|
" | Interpolation technique to use. One of:\n",
|
||
|
" | \n",
|
||
|
" | * 'linear': Ignore the index and treat the values as equally\n",
|
||
|
" | spaced. This is the only method supported on MultiIndexes.\n",
|
||
|
" | * 'time': Works on daily and higher resolution data to interpolate\n",
|
||
|
" | given length of interval.\n",
|
||
|
" | * 'index', 'values': use the actual numerical values of the index.\n",
|
||
|
" | * 'pad': Fill in NaNs using existing values.\n",
|
||
|
" | * 'nearest', 'zero', 'slinear', 'quadratic', 'cubic', 'spline',\n",
|
||
|
" | 'barycentric', 'polynomial': Passed to\n",
|
||
|
" | `scipy.interpolate.interp1d`. Both 'polynomial' and 'spline'\n",
|
||
|
" | require that you also specify an `order` (int),\n",
|
||
|
" | e.g. ``df.interpolate(method='polynomial', order=4)``.\n",
|
||
|
" | These use the numerical values of the index.\n",
|
||
|
" | * 'krogh', 'piecewise_polynomial', 'spline', 'pchip', 'akima':\n",
|
||
|
" | Wrappers around the SciPy interpolation methods of similar\n",
|
||
|
" | names. See `Notes`.\n",
|
||
|
" | * 'from_derivatives': Refers to\n",
|
||
|
" | `scipy.interpolate.BPoly.from_derivatives` which\n",
|
||
|
" | replaces 'piecewise_polynomial' interpolation method in\n",
|
||
|
" | scipy 0.18.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.18.1\n",
|
||
|
" | \n",
|
||
|
" | Added support for the 'akima' method.\n",
|
||
|
" | Added interpolate method 'from_derivatives' which replaces\n",
|
||
|
" | 'piecewise_polynomial' in SciPy 0.18; backwards-compatible with\n",
|
||
|
" | SciPy < 0.18\n",
|
||
|
" | \n",
|
||
|
" | axis : {0 or 'index', 1 or 'columns', None}, default None\n",
|
||
|
" | Axis to interpolate along.\n",
|
||
|
" | limit : int, optional\n",
|
||
|
" | Maximum number of consecutive NaNs to fill. Must be greater than\n",
|
||
|
" | 0.\n",
|
||
|
" | inplace : bool, default False\n",
|
||
|
" | Update the data in place if possible.\n",
|
||
|
" | limit_direction : {'forward', 'backward', 'both'}, default 'forward'\n",
|
||
|
" | If limit is specified, consecutive NaNs will be filled in this\n",
|
||
|
" | direction.\n",
|
||
|
" | limit_area : {`None`, 'inside', 'outside'}, default None\n",
|
||
|
" | If limit is specified, consecutive NaNs will be filled with this\n",
|
||
|
" | restriction.\n",
|
||
|
" | \n",
|
||
|
" | * ``None``: No fill restriction.\n",
|
||
|
" | * 'inside': Only fill NaNs surrounded by valid values\n",
|
||
|
" | (interpolate).\n",
|
||
|
" | * 'outside': Only fill NaNs outside valid values (extrapolate).\n",
|
||
|
" | \n",
" | .. versionadded:: 0.21.0\n",
" | \n",
" | downcast : optional, 'infer' or None, defaults to None\n",
" | Downcast dtypes if possible.\n",
" | **kwargs\n",
" | Keyword arguments to pass on to the interpolating function.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series or DataFrame\n",
" | Returns the same object type as the caller, interpolated at\n",
" | some or all ``NaN`` values\n",
" | \n",
" | See Also\n",
" | --------\n",
" | fillna : Fill missing values using different methods.\n",
" | scipy.interpolate.Akima1DInterpolator : Piecewise cubic polynomials\n",
" | (Akima interpolator).\n",
" | scipy.interpolate.BPoly.from_derivatives : Piecewise polynomial in the\n",
" | Bernstein basis.\n",
" | scipy.interpolate.interp1d : Interpolate a 1-D function.\n",
" | scipy.interpolate.KroghInterpolator : Interpolate polynomial (Krogh\n",
" | interpolator).\n",
" | scipy.interpolate.PchipInterpolator : PCHIP 1-d monotonic cubic\n",
" | interpolation.\n",
" | scipy.interpolate.CubicSpline : Cubic spline data interpolator.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | The 'krogh', 'piecewise_polynomial', 'spline', 'pchip' and 'akima'\n",
" | methods are wrappers around the respective SciPy implementations of\n",
" | similar names. These use the actual numerical values of the index.\n",
" | For more information on their behavior, see the\n",
" | `SciPy documentation\n",
" | <http://docs.scipy.org/doc/scipy/reference/interpolate.html#univariate-interpolation>`__\n",
" | and `SciPy tutorial\n",
" | <http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html>`__.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | Filling in ``NaN`` in a :class:`~pandas.Series` via linear\n",
" | interpolation.\n",
" | \n",
" | >>> s = pd.Series([0, 1, np.nan, 3])\n",
" | >>> s\n",
" | 0 0.0\n",
" | 1 1.0\n",
" | 2 NaN\n",
" | 3 3.0\n",
" | dtype: float64\n",
" | >>> s.interpolate()\n",
" | 0 0.0\n",
" | 1 1.0\n",
" | 2 2.0\n",
" | 3 3.0\n",
" | dtype: float64\n",
" | \n",
" | Filling in ``NaN`` in a Series by padding, but filling at most two\n",
" | consecutive ``NaN`` at a time.\n",
" | \n",
" | >>> s = pd.Series([np.nan, \"single_one\", np.nan,\n",
" | ... \"fill_two_more\", np.nan, np.nan, np.nan,\n",
" | ... 4.71, np.nan])\n",
" | >>> s\n",
" | 0 NaN\n",
" | 1 single_one\n",
" | 2 NaN\n",
" | 3 fill_two_more\n",
" | 4 NaN\n",
" | 5 NaN\n",
" | 6 NaN\n",
" | 7 4.71\n",
" | 8 NaN\n",
" | dtype: object\n",
" | >>> s.interpolate(method='pad', limit=2)\n",
" | 0 NaN\n",
" | 1 single_one\n",
" | 2 single_one\n",
" | 3 fill_two_more\n",
" | 4 fill_two_more\n",
" | 5 fill_two_more\n",
" | 6 NaN\n",
" | 7 4.71\n",
" | 8 4.71\n",
" | dtype: object\n",
" | \n",
" | Filling in ``NaN`` in a Series via polynomial interpolation or splines:\n",
" | Both 'polynomial' and 'spline' methods require that you also specify\n",
" | an ``order`` (int).\n",
" | \n",
" | >>> s = pd.Series([0, 2, np.nan, 8])\n",
" | >>> s.interpolate(method='polynomial', order=2)\n",
" | 0 0.000000\n",
" | 1 2.000000\n",
" | 2 4.666667\n",
" | 3 8.000000\n",
" | dtype: float64\n",
" | \n",
" | Fill the DataFrame forward (that is, going down) along each column\n",
" | using linear interpolation.\n",
" | \n",
" | Note how the last entry in column 'a' is interpolated differently,\n",
" | because there is no entry after it to use for interpolation.\n",
" | Note how the first entry in column 'b' remains ``NaN``, because there\n",
" | is no entry before it to use for interpolation.\n",
" | \n",
" | >>> df = pd.DataFrame([(0.0, np.nan, -1.0, 1.0),\n",
" | ... (np.nan, 2.0, np.nan, np.nan),\n",
" | ... (2.0, 3.0, np.nan, 9.0),\n",
" | ... (np.nan, 4.0, -4.0, 16.0)],\n",
" | ... columns=list('abcd'))\n",
" | >>> df\n",
" | a b c d\n",
" | 0 0.0 NaN -1.0 1.0\n",
" | 1 NaN 2.0 NaN NaN\n",
" | 2 2.0 3.0 NaN 9.0\n",
" | 3 NaN 4.0 -4.0 16.0\n",
" | >>> df.interpolate(method='linear', limit_direction='forward', axis=0)\n",
" | a b c d\n",
" | 0 0.0 NaN -1.0 1.0\n",
" | 1 1.0 2.0 -2.0 5.0\n",
" | 2 2.0 3.0 -3.0 9.0\n",
" | 3 2.0 4.0 -4.0 16.0\n",
" | \n",
" | Using polynomial interpolation.\n",
" | \n",
" | >>> df['d'].interpolate(method='polynomial', order=2)\n",
" | 0 1.0\n",
" | 1 4.0\n",
" | 2 9.0\n",
" | 3 16.0\n",
" | Name: d, dtype: float64\n",
" | \n",
" | last(self, offset)\n",
" | Convenience method for subsetting final periods of time series data\n",
" | based on a date offset.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | offset : string, DateOffset, dateutil.relativedelta\n",
" | \n",
" | Returns\n",
" | -------\n",
" | subset : same type as caller\n",
" | \n",
" | Raises\n",
" | ------\n",
" | TypeError\n",
" | If the index is not a :class:`DatetimeIndex`\n",
" | \n",
" | See Also\n",
" | --------\n",
" | first : Select initial periods of time series based on a date offset.\n",
" | at_time : Select values at a particular time of the day.\n",
" | between_time : Select values between particular times of the day.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> i = pd.date_range('2018-04-09', periods=4, freq='2D')\n",
" | >>> ts = pd.DataFrame({'A': [1,2,3,4]}, index=i)\n",
" | >>> ts\n",
" | A\n",
" | 2018-04-09 1\n",
" | 2018-04-11 2\n",
" | 2018-04-13 3\n",
" | 2018-04-15 4\n",
" | \n",
" | Get the rows for the last 3 days:\n",
" | \n",
" | >>> ts.last('3D')\n",
" | A\n",
" | 2018-04-13 3\n",
" | 2018-04-15 4\n",
" | \n",
" | Notice the data for 3 last calendar days were returned, not the last\n",
" | 3 observed days in the dataset, and therefore data for 2018-04-11 was\n",
" | not returned.\n",
" | \n",
" | last_valid_index(self)\n",
" | Return index for last non-NA/null value.\n",
" | \n",
" | Returns\n",
" | --------\n",
" | scalar : type of index\n",
" | \n",
" | Notes\n",
" | --------\n",
" | If all elements are non-NA/null, returns None.\n",
" | Also returns None for empty NDFrame.\n",
" | \n",
" | mask(self, cond, other=nan, inplace=False, axis=None, level=None, errors='raise', try_cast=False, raise_on_error=None)\n",
" | Replace values where the condition is True.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | cond : boolean NDFrame, array-like, or callable\n",
" | Where `cond` is False, keep the original value. Where\n",
" | True, replace with corresponding value from `other`.\n",
" | If `cond` is callable, it is computed on the NDFrame and\n",
" | should return boolean NDFrame or array. The callable must\n",
" | not change input NDFrame (though pandas doesn't check it).\n",
" | \n",
" | .. versionadded:: 0.18.1\n",
" | A callable can be used as cond.\n",
" | \n",
" | other : scalar, NDFrame, or callable\n",
" | Entries where `cond` is True are replaced with\n",
" | corresponding value from `other`.\n",
" | If other is callable, it is computed on the NDFrame and\n",
" | should return scalar or NDFrame. The callable must not\n",
" | change input NDFrame (though pandas doesn't check it).\n",
" | \n",
" | .. versionadded:: 0.18.1\n",
" | A callable can be used as other.\n",
" | \n",
" | inplace : boolean, default False\n",
" | Whether to perform the operation in place on the data.\n",
" | axis : int, default None\n",
" | Alignment axis if needed.\n",
" | level : int, default None\n",
" | Alignment level if needed.\n",
" | errors : str, {'raise', 'ignore'}, default `raise`\n",
" | Note that currently this parameter won't affect\n",
" | the results and will always coerce to a suitable dtype.\n",
" | \n",
" | - `raise` : allow exceptions to be raised.\n",
" | - `ignore` : suppress exceptions. On error return original object.\n",
" | \n",
" | try_cast : boolean, default False\n",
" | Try to cast the result back to the input type (if possible).\n",
" | raise_on_error : boolean, default True\n",
" | Whether to raise on invalid data types (e.g. trying to where on\n",
" | strings).\n",
" | \n",
" | .. deprecated:: 0.21.0\n",
" | \n",
" | Use `errors`.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | wh : same type as caller\n",
" | \n",
" | See Also\n",
" | --------\n",
" | :func:`DataFrame.where` : Return an object of same shape as\n",
" | self.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | The mask method is an application of the if-then idiom. For each\n",
" | element in the calling DataFrame, if ``cond`` is ``False`` the\n",
" | element is used; otherwise the corresponding element from the DataFrame\n",
" | ``other`` is used.\n",
" | \n",
" | The signature for :func:`DataFrame.where` differs from\n",
" | :func:`numpy.where`. Roughly ``df1.where(m, df2)`` is equivalent to\n",
" | ``np.where(m, df1, df2)``.\n",
" | \n",
" | For further details and examples see the ``mask`` documentation in\n",
" | :ref:`indexing <indexing.where_mask>`.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> s = pd.Series(range(5))\n",
" | >>> s.where(s > 0)\n",
" | 0 NaN\n",
" | 1 1.0\n",
" | 2 2.0\n",
" | 3 3.0\n",
" | 4 4.0\n",
" | dtype: float64\n",
" | \n",
" | >>> s.mask(s > 0)\n",
" | 0 0.0\n",
" | 1 NaN\n",
" | 2 NaN\n",
" | 3 NaN\n",
" | 4 NaN\n",
" | dtype: float64\n",
" | \n",
" | >>> s.where(s > 1, 10)\n",
" | 0 10\n",
" | 1 10\n",
" | 2 2\n",
" | 3 3\n",
" | 4 4\n",
" | dtype: int64\n",
" | \n",
" | >>> df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=['A', 'B'])\n",
" | >>> m = df % 3 == 0\n",
" | >>> df.where(m, -df)\n",
" | A B\n",
" | 0 0 -1\n",
" | 1 -2 3\n",
" | 2 -4 -5\n",
" | 3 6 -7\n",
" | 4 -8 9\n",
" | >>> df.where(m, -df) == np.where(m, df, -df)\n",
" | A B\n",
" | 0 True True\n",
" | 1 True True\n",
" | 2 True True\n",
" | 3 True True\n",
" | 4 True True\n",
" | >>> df.where(m, -df) == df.mask(~m, -df)\n",
" | A B\n",
" | 0 True True\n",
" | 1 True True\n",
" | 2 True True\n",
" | 3 True True\n",
" | 4 True True\n",
" | \n",
" | pct_change(self, periods=1, fill_method='pad', limit=None, freq=None, **kwargs)\n",
" | Percentage change between the current and a prior element.\n",
" | \n",
" | Computes the percentage change from the immediately previous row by\n",
" | default. This is useful in comparing the percentage of change in a time\n",
" | series of elements.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | periods : int, default 1\n",
" | Periods to shift for forming percent change.\n",
" | fill_method : str, default 'pad'\n",
" | How to handle NAs before computing percent changes.\n",
" | limit : int, default None\n",
" | The number of consecutive NAs to fill before stopping.\n",
" | freq : DateOffset, timedelta, or offset alias string, optional\n",
" | Increment to use from time series API (e.g. 'M' or BDay()).\n",
" | **kwargs\n",
" | Additional keyword arguments are passed into\n",
" | `DataFrame.shift` or `Series.shift`.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | chg : Series or DataFrame\n",
" | The same type as the calling object.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.diff : Compute the difference of two elements in a Series.\n",
" | DataFrame.diff : Compute the difference of two elements in a DataFrame.\n",
" | Series.shift : Shift the index by some number of periods.\n",
" | DataFrame.shift : Shift the index by some number of periods.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | **Series**\n",
" | \n",
" | >>> s = pd.Series([90, 91, 85])\n",
" | >>> s\n",
" | 0 90\n",
" | 1 91\n",
" | 2 85\n",
" | dtype: int64\n",
" | \n",
" | >>> s.pct_change()\n",
" | 0 NaN\n",
" | 1 0.011111\n",
" | 2 -0.065934\n",
" | dtype: float64\n",
" | \n",
" | >>> s.pct_change(periods=2)\n",
" | 0 NaN\n",
" | 1 NaN\n",
" | 2 -0.055556\n",
" | dtype: float64\n",
" | \n",
" | See the percentage change in a Series where filling NAs with last\n",
" | valid observation forward to next valid.\n",
" | \n",
" | >>> s = pd.Series([90, 91, None, 85])\n",
" | >>> s\n",
" | 0 90.0\n",
" | 1 91.0\n",
" | 2 NaN\n",
" | 3 85.0\n",
" | dtype: float64\n",
" | \n",
" | >>> s.pct_change(fill_method='ffill')\n",
" | 0 NaN\n",
" | 1 0.011111\n",
" | 2 0.000000\n",
" | 3 -0.065934\n",
" | dtype: float64\n",
" | \n",
" | **DataFrame**\n",
" | \n",
" | Percentage change in French franc, Deutsche Mark, and Italian lira from\n",
" | 1980-01-01 to 1980-03-01.\n",
" | \n",
" | >>> df = pd.DataFrame({\n",
" | ... 'FR': [4.0405, 4.0963, 4.3149],\n",
" | ... 'GR': [1.7246, 1.7482, 1.8519],\n",
" | ... 'IT': [804.74, 810.01, 860.13]},\n",
" | ... index=['1980-01-01', '1980-02-01', '1980-03-01'])\n",
" | >>> df\n",
" | FR GR IT\n",
" | 1980-01-01 4.0405 1.7246 804.74\n",
" | 1980-02-01 4.0963 1.7482 810.01\n",
" | 1980-03-01 4.3149 1.8519 860.13\n",
" | \n",
" | >>> df.pct_change()\n",
" | FR GR IT\n",
" | 1980-01-01 NaN NaN NaN\n",
" | 1980-02-01 0.013810 0.013684 0.006549\n",
" | 1980-03-01 0.053365 0.059318 0.061876\n",
" | \n",
" | Percentage of change in GOOG and APPL stock volume. Shows computing\n",
" | the percentage change between columns.\n",
" | \n",
" | >>> df = pd.DataFrame({\n",
" | ... '2016': [1769950, 30586265],\n",
" | ... '2015': [1500923, 40912316],\n",
" | ... '2014': [1371819, 41403351]},\n",
" | ... index=['GOOG', 'APPL'])\n",
" | >>> df\n",
" | 2016 2015 2014\n",
" | GOOG 1769950 1500923 1371819\n",
" | APPL 30586265 40912316 41403351\n",
" | \n",
" | >>> df.pct_change(axis='columns')\n",
" | 2016 2015 2014\n",
" | GOOG NaN -0.151997 -0.086016\n",
" | APPL NaN 0.337604 0.012002\n",
" | \n",
" | pipe(self, func, *args, **kwargs)\n",
" | Apply func(self, \\*args, \\*\\*kwargs).\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | func : function\n",
" | function to apply to the NDFrame.\n",
" | ``args``, and ``kwargs`` are passed into ``func``.\n",
" | Alternatively a ``(callable, data_keyword)`` tuple where\n",
" | ``data_keyword`` is a string indicating the keyword of\n",
" | ``callable`` that expects the NDFrame.\n",
" | args : iterable, optional\n",
" | positional arguments passed into ``func``.\n",
" | kwargs : mapping, optional\n",
" | a dictionary of keyword arguments passed into ``func``.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | object : the return type of ``func``.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | DataFrame.apply\n",
" | DataFrame.applymap\n",
" | Series.map\n",
" | \n",
" | Notes\n",
" | -----\n",
" | \n",
" | Use ``.pipe`` when chaining together functions that expect\n",
" | Series, DataFrames or GroupBy objects. Instead of writing\n",
" | \n",
" | >>> f(g(h(df), arg1=a), arg2=b, arg3=c)\n",
" | \n",
" | You can write\n",
" | \n",
" | >>> (df.pipe(h)\n",
" | ... .pipe(g, arg1=a)\n",
" | ... .pipe(f, arg2=b, arg3=c)\n",
" | ... )\n",
" | \n",
" | If you have a function that takes the data as (say) the second\n",
" | argument, pass a tuple indicating which keyword expects the\n",
" | data. For example, suppose ``f`` takes its data as ``arg2``:\n",
" | \n",
" | >>> (df.pipe(h)\n",
" | ... .pipe(g, arg1=a)\n",
" | ... .pipe((f, 'arg2'), arg1=a, arg3=c)\n",
" | ... )\n",
" | \n",
" | pop(self, item)\n",
" | Return item and drop from frame. Raise KeyError if not found.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | item : str\n",
" | Column label to be popped\n",
" | \n",
" | Returns\n",
" | -------\n",
" | popped : Series\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> df = pd.DataFrame([('falcon', 'bird', 389.0),\n",
" | ... ('parrot', 'bird', 24.0),\n",
" | ... ('lion', 'mammal', 80.5),\n",
" | ... ('monkey', 'mammal', np.nan)],\n",
" | ... columns=('name', 'class', 'max_speed'))\n",
" | >>> df\n",
" | name class max_speed\n",
" | 0 falcon bird 389.0\n",
" | 1 parrot bird 24.0\n",
" | 2 lion mammal 80.5\n",
" | 3 monkey mammal NaN\n",
" | \n",
" | >>> df.pop('class')\n",
" | 0 bird\n",
" | 1 bird\n",
" | 2 mammal\n",
" | 3 mammal\n",
" | Name: class, dtype: object\n",
" | \n",
" | >>> df\n",
" | name max_speed\n",
" | 0 falcon 389.0\n",
" | 1 parrot 24.0\n",
" | 2 lion 80.5\n",
" | 3 monkey NaN\n",
" | \n",
" | rank(self, axis=0, method='average', numeric_only=None, na_option='keep', ascending=True, pct=False)\n",
" | Compute numerical data ranks (1 through n) along axis. Equal values are\n",
" | assigned a rank that is the average of the ranks of those values.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
" | index to direct ranking\n",
" | method : {'average', 'min', 'max', 'first', 'dense'}\n",
" | * average: average rank of group\n",
" | * min: lowest rank in group\n",
" | * max: highest rank in group\n",
" | * first: ranks assigned in order they appear in the array\n",
" | * dense: like 'min', but rank always increases by 1 between groups\n",
" | numeric_only : boolean, default None\n",
" | Include only float, int, boolean data. Valid only for DataFrame or\n",
" | Panel objects\n",
" | na_option : {'keep', 'top', 'bottom'}\n",
" | * keep: leave NA values where they are\n",
" | * top: smallest rank if ascending\n",
" | * bottom: smallest rank if descending\n",
" | ascending : boolean, default True\n",
" | False for ranks by high (1) to low (N)\n",
" | pct : boolean, default False\n",
" | Computes percentage rank of data\n",
" | \n",
" | Returns\n",
" | -------\n",
" | ranks : same type as caller\n",
" | \n",
" | reindex_like(self, other, method=None, copy=True, limit=None, tolerance=None)\n",
" | Return an object with matching indices as other object.\n",
" | \n",
" | Conform the object to the same index on all axes. Optional\n",
" | filling logic, placing NaN in locations having no value\n",
" | in the previous index. A new object is produced unless the\n",
" | new index is equivalent to the current one and copy=False.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | other : Object of the same data type\n",
" | Its row and column indices are used to define the new indices\n",
" | of this object.\n",
" | method : {None, 'backfill'/'bfill', 'pad'/'ffill', 'nearest'}\n",
" | Method to use for filling holes in reindexed DataFrame.\n",
" | Please note: this is only applicable to DataFrames/Series with a\n",
" | monotonically increasing/decreasing index.\n",
" | \n",
" | * None (default): don't fill gaps\n",
" | * pad / ffill: propagate last valid observation forward to next\n",
" | valid\n",
" | * backfill / bfill: use next valid observation to fill gap\n",
" | * nearest: use nearest valid observations to fill gap\n",
" | \n",
" | copy : bool, default True\n",
" | Return a new object, even if the passed indexes are the same.\n",
" | limit : int, default None\n",
" | Maximum number of consecutive labels to fill for inexact matches.\n",
" | tolerance : optional\n",
" | Maximum distance between original and new labels for inexact\n",
" | matches. The values of the index at the matching locations must\n",
" | satisfy the equation ``abs(index[indexer] - target) <= tolerance``.\n",
" | \n",
" | Tolerance may be a scalar value, which applies the same tolerance\n",
" | to all values, or list-like, which applies variable tolerance per\n",
" | element. List-like includes list, tuple, array, Series, and must be\n",
" | the same size as the index and its dtype must exactly match the\n",
" | index's type.\n",
" | \n",
" | .. versionadded:: 0.21.0 (list-like tolerance)\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series or DataFrame\n",
" | Same type as caller, but with changed indices on each axis.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | DataFrame.set_index : Set row labels.\n",
" | DataFrame.reset_index : Remove row labels or move them to new columns.\n",
" | DataFrame.reindex : Change to new indices or expand indices.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | Same as calling\n",
" | ``.reindex(index=other.index, columns=other.columns,...)``.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> df1 = pd.DataFrame([[24.3, 75.7, 'high'],\n",
" | ... [31, 87.8, 'high'],\n",
" | ... [22, 71.6, 'medium'],\n",
" | ... [35, 95, 'medium']],\n",
" | ... columns=['temp_celsius', 'temp_fahrenheit', 'windspeed'],\n",
" | ... index=pd.date_range(start='2014-02-12',\n",
" | ... end='2014-02-15', freq='D'))\n",
" | \n",
" | >>> df1\n",
" | temp_celsius temp_fahrenheit windspeed\n",
" | 2014-02-12 24.3 75.7 high\n",
" | 2014-02-13 31.0 87.8 high\n",
" | 2014-02-14 22.0 71.6 medium\n",
" | 2014-02-15 35.0 95.0 medium\n",
" | \n",
" | >>> df2 = pd.DataFrame([[28, 'low'],\n",
" | ... [30, 'low'],\n",
" | ... [35.1, 'medium']],\n",
" | ... columns=['temp_celsius', 'windspeed'],\n",
" | ... index=pd.DatetimeIndex(['2014-02-12', '2014-02-13',\n",
" | ... '2014-02-15']))\n",
" | \n",
" | >>> df2\n",
" | temp_celsius windspeed\n",
" | 2014-02-12 28.0 low\n",
" | 2014-02-13 30.0 low\n",
" | 2014-02-15 35.1 medium\n",
" | \n",
" | >>> df2.reindex_like(df1)\n",
" | temp_celsius temp_fahrenheit windspeed\n",
" | 2014-02-12 28.0 NaN low\n",
" | 2014-02-13 30.0 NaN low\n",
" | 2014-02-14 NaN NaN NaN\n",
" | 2014-02-15 35.1 NaN medium\n",
" | \n",
" | rename_axis(self, mapper=None, index=None, columns=None, axis=None, copy=True, inplace=False)\n",
" | Set the name of the axis for the index or columns.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | mapper : scalar, list-like, optional\n",
" | Value to set the axis name attribute.\n",
" | index, columns : scalar, list-like, dict-like or function, optional\n",
" | A scalar, list-like, dict-like or functions transformations to\n",
" | apply to that axis' values.\n",
" | \n",
" | Use either ``mapper`` and ``axis`` to\n",
" | specify the axis to target with ``mapper``, or ``index``\n",
" | and/or ``columns``.\n",
" | \n",
" | .. versionchanged:: 0.24.0\n",
" | \n",
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
" | The axis to rename.\n",
" | copy : bool, default True\n",
" | Also copy underlying data.\n",
" | inplace : bool, default False\n",
" | Modifies the object directly, instead of creating a new Series\n",
" | or DataFrame.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Series, DataFrame, or None\n",
" | The same type as the caller or None if `inplace` is True.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | Series.rename : Alter Series index labels or name.\n",
" | DataFrame.rename : Alter DataFrame index labels or name.\n",
" | Index.rename : Set new names on index.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | Prior to version 0.21.0, ``rename_axis`` could also be used to change\n",
" | the axis *labels* by passing a mapping or scalar. This behavior is\n",
" | deprecated and will be removed in a future version. Use ``rename``\n",
" | instead.\n",
" | \n",
" | ``DataFrame.rename_axis`` supports two calling conventions\n",
" | \n",
" | * ``(index=index_mapper, columns=columns_mapper, ...)``\n",
" | * ``(mapper, axis={'index', 'columns'}, ...)``\n",
" | \n",
" | The first calling convention will only modify the names of\n",
" | the index and/or the names of the Index object that is the columns.\n",
" | In this case, the parameter ``copy`` is ignored.\n",
" | \n",
" | The second calling convention will modify the names of\n",
" | the corresponding index if mapper is a list or a scalar.\n",
" | However, if mapper is dict-like or a function, it will use the\n",
" | However, if mapper is dict-like or a function, it will use the\n",
" | deprecated behavior of modifying the axis *labels*.\n",
" | \n",
" | We *highly* recommend using keyword arguments to clarify your\n",
" | intent.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | **Series**\n",
" | \n",
" | >>> s = pd.Series([\"dog\", \"cat\", \"monkey\"])\n",
" | >>> s\n",
" | 0 dog\n",
" | 1 cat\n",
" | 2 monkey\n",
" | dtype: object\n",
" | >>> s.rename_axis(\"animal\")\n",
" | animal\n",
" | 0 dog\n",
" | 1 cat\n",
" | 2 monkey\n",
" | dtype: object\n",
" | \n",
" | **DataFrame**\n",
" | \n",
" | >>> df = pd.DataFrame({\"num_legs\": [4, 4, 2],\n",
" | ... \"num_arms\": [0, 0, 2]},\n",
" | ... [\"dog\", \"cat\", \"monkey\"])\n",
" | >>> df\n",
" | num_legs num_arms\n",
" | dog 4 0\n",
" | cat 4 0\n",
" | monkey 2 2\n",
" | >>> df = df.rename_axis(\"animal\")\n",
" | >>> df\n",
" | num_legs num_arms\n",
" | animal\n",
" | dog 4 0\n",
" | cat 4 0\n",
" | monkey 2 2\n",
" | >>> df = df.rename_axis(\"limbs\", axis=\"columns\")\n",
" | >>> df\n",
" | limbs num_legs num_arms\n",
" | animal\n",
" | dog 4 0\n",
" | cat 4 0\n",
" | monkey 2 2\n",
" | \n",
" | **MultiIndex**\n",
" | \n",
" | >>> df.index = pd.MultiIndex.from_product([['mammal'],\n",
" | ... ['dog', 'cat', 'monkey']],\n",
" | ... names=['type', 'name'])\n",
" | >>> df\n",
" | limbs num_legs num_arms\n",
" | type name\n",
" | mammal dog 4 0\n",
" | cat 4 0\n",
" | monkey 2 2\n",
" | \n",
" | >>> df.rename_axis(index={'type': 'class'})\n",
" | limbs num_legs num_arms\n",
" | class name\n",
" | mammal dog 4 0\n",
" | cat 4 0\n",
" | monkey 2 2\n",
" | \n",
" | >>> df.rename_axis(columns=str.upper)\n",
" | LIMBS num_legs num_arms\n",
" | type name\n",
" | mammal dog 4 0\n",
" | cat 4 0\n",
" | monkey 2 2\n",
" | \n",
" | resample(self, rule, how=None, axis=0, fill_method=None, closed=None, label=None, convention='start', kind=None, loffset=None, limit=None, base=0, on=None, level=None)\n",
" | Resample time-series data.\n",
" | \n",
" | Convenience method for frequency conversion and resampling of time\n",
" | series. Object must have a datetime-like index (`DatetimeIndex`,\n",
" | `PeriodIndex`, or `TimedeltaIndex`), or pass datetime-like values\n",
" | to the `on` or `level` keyword.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | rule : str\n",
" | The offset string or object representing target conversion.\n",
" | how : str\n",
" | Method for down/re-sampling, default to 'mean' for downsampling.\n",
" | \n",
" | .. deprecated:: 0.18.0\n",
" | The new syntax is ``.resample(...).mean()``, or\n",
" | ``.resample(...).apply(<func>)``\n",
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
" | Which axis to use for up- or down-sampling. For `Series` this\n",
" | will default to 0, i.e. along the rows. Must be\n",
" | `DatetimeIndex`, `TimedeltaIndex` or `PeriodIndex`.\n",
" | fill_method : str, default None\n",
" | Filling method for upsampling.\n",
" | \n",
" | .. deprecated:: 0.18.0\n",
" | The new syntax is ``.resample(...).<func>()``,\n",
" | e.g. ``.resample(...).pad()``\n",
" | closed : {'right', 'left'}, default None\n",
" | Which side of bin interval is closed. The default is 'left'\n",
" | for all frequency offsets except for 'M', 'A', 'Q', 'BM',\n",
" | 'BA', 'BQ', and 'W' which all have a default of 'right'.\n",
" | label : {'right', 'left'}, default None\n",
" | Which bin edge label to label bucket with. The default is 'left'\n",
" | for all frequency offsets except for 'M', 'A', 'Q', 'BM',\n",
" | 'BA', 'BQ', and 'W' which all have a default of 'right'.\n",
" | convention : {'start', 'end', 's', 'e'}, default 'start'\n",
" | For `PeriodIndex` only, controls whether to use the start or\n",
" | end of `rule`.\n",
" | kind : {'timestamp', 'period'}, optional, default None\n",
" | Pass 'timestamp' to convert the resulting index to a\n",
" | `DateTimeIndex` or 'period' to convert it to a `PeriodIndex`.\n",
" | By default the input representation is retained.\n",
" | loffset : timedelta, default None\n",
" | Adjust the resampled time labels.\n",
" | limit : int, default None\n",
" | Maximum size gap when reindexing with `fill_method`.\n",
" | \n",
" | .. deprecated:: 0.18.0\n",
" | base : int, default 0\n",
" | For frequencies that evenly subdivide 1 day, the \"origin\" of the\n",
" | aggregated intervals. For example, for '5min' frequency, base could\n",
" | range from 0 through 4. Defaults to 0.\n",
" | on : str, optional\n",
" | For a DataFrame, column to use instead of index for resampling.\n",
" | Column must be datetime-like.\n",
" | \n",
" | .. versionadded:: 0.19.0\n",
" | \n",
" | level : str or int, optional\n",
" | For a MultiIndex, level (name or number) to use for\n",
" | resampling. `level` must be datetime-like.\n",
" | \n",
" | .. versionadded:: 0.19.0\n",
" | \n",
" | Returns\n",
" | -------\n",
" | Resampler object\n",
" | \n",
" | See Also\n",
" | --------\n",
" | groupby : Group by mapping, function, label, or list of labels.\n",
" | Series.resample : Resample a Series.\n",
" | DataFrame.resample: Resample a DataFrame.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | See the `user guide\n",
" | <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#resampling>`_\n",
" | for more.\n",
" | \n",
" | To learn more about the offset strings, please see `this link\n",
" | <http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | \n",
" | Start by creating a series with 9 one minute timestamps.\n",
" | \n",
" | >>> index = pd.date_range('1/1/2000', periods=9, freq='T')\n",
" | >>> series = pd.Series(range(9), index=index)\n",
" | >>> series\n",
" | 2000-01-01 00:00:00 0\n",
" | 2000-01-01 00:01:00 1\n",
" | 2000-01-01 00:02:00 2\n",
" | 2000-01-01 00:03:00 3\n",
" | 2000-01-01 00:04:00 4\n",
" | 2000-01-01 00:05:00 5\n",
" | 2000-01-01 00:06:00 6\n",
" | 2000-01-01 00:07:00 7\n",
" | 2000-01-01 00:08:00 8\n",
" | Freq: T, dtype: int64\n",
" | \n",
" | Downsample the series into 3 minute bins and sum the values\n",
" | of the timestamps falling into a bin.\n",
" | \n",
" | >>> series.resample('3T').sum()\n",
" | 2000-01-01 00:00:00 3\n",
" | 2000-01-01 00:03:00 12\n",
" | 2000-01-01 00:06:00 21\n",
" | Freq: 3T, dtype: int64\n",
" | \n",
" | Downsample the series into 3 minute bins as above, but label each\n",
" | bin using the right edge instead of the left. Please note that the\n",
" | value in the bucket used as the label is not included in the bucket,\n",
" | which it labels. For example, in the original series the\n",
" | bucket ``2000-01-01 00:03:00`` contains the value 3, but the summed\n",
" | value in the resampled bucket with the label ``2000-01-01 00:03:00``\n",
" | does not include 3 (if it did, the summed value would be 6, not 3).\n",
" | To include this value close the right side of the bin interval as\n",
|
||
|
" | illustrated in the example below this one.\n",
|
||
|
" | \n",
|
||
|
" | >>> series.resample('3T', label='right').sum()\n",
|
||
|
" | 2000-01-01 00:03:00 3\n",
|
||
|
" | 2000-01-01 00:06:00 12\n",
|
||
|
" | 2000-01-01 00:09:00 21\n",
|
||
|
" | Freq: 3T, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Downsample the series into 3 minute bins as above, but close the right\n",
|
||
|
" | side of the bin interval.\n",
|
||
|
" | \n",
|
||
|
" | >>> series.resample('3T', label='right', closed='right').sum()\n",
|
||
|
" | 2000-01-01 00:00:00 0\n",
|
||
|
" | 2000-01-01 00:03:00 6\n",
|
||
|
" | 2000-01-01 00:06:00 15\n",
|
||
|
" | 2000-01-01 00:09:00 15\n",
|
||
|
" | Freq: 3T, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Upsample the series into 30 second bins.\n",
|
||
|
" | \n",
|
||
|
" | >>> series.resample('30S').asfreq()[0:5] # Select first 5 rows\n",
|
||
|
" | 2000-01-01 00:00:00 0.0\n",
|
||
|
" | 2000-01-01 00:00:30 NaN\n",
|
||
|
" | 2000-01-01 00:01:00 1.0\n",
|
||
|
" | 2000-01-01 00:01:30 NaN\n",
|
||
|
" | 2000-01-01 00:02:00 2.0\n",
|
||
|
" | Freq: 30S, dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Upsample the series into 30 second bins and fill the ``NaN``\n",
|
||
|
" | values using the ``pad`` method.\n",
|
||
|
" | \n",
|
||
|
" | >>> series.resample('30S').pad()[0:5]\n",
|
||
|
" | 2000-01-01 00:00:00 0\n",
|
||
|
" | 2000-01-01 00:00:30 0\n",
|
||
|
" | 2000-01-01 00:01:00 1\n",
|
||
|
" | 2000-01-01 00:01:30 1\n",
|
||
|
" | 2000-01-01 00:02:00 2\n",
|
||
|
" | Freq: 30S, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Upsample the series into 30 second bins and fill the\n",
|
||
|
" | ``NaN`` values using the ``bfill`` method.\n",
|
||
|
" | \n",
|
||
|
" | >>> series.resample('30S').bfill()[0:5]\n",
|
||
|
" | 2000-01-01 00:00:00 0\n",
|
||
|
" | 2000-01-01 00:00:30 1\n",
|
||
|
" | 2000-01-01 00:01:00 1\n",
|
||
|
" | 2000-01-01 00:01:30 2\n",
|
||
|
" | 2000-01-01 00:02:00 2\n",
|
||
|
" | Freq: 30S, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Pass a custom function via ``apply``\n",
|
||
|
" | \n",
|
||
|
" | >>> def custom_resampler(array_like):\n",
|
||
|
" | ... return np.sum(array_like) + 5\n",
|
||
|
" | ...\n",
|
||
|
" | >>> series.resample('3T').apply(custom_resampler)\n",
|
||
|
" | 2000-01-01 00:00:00 8\n",
|
||
|
" | 2000-01-01 00:03:00 17\n",
|
||
|
" | 2000-01-01 00:06:00 26\n",
|
||
|
" | Freq: 3T, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | For a Series with a PeriodIndex, the keyword `convention` can be\n",
|
||
|
" | used to control whether to use the start or end of `rule`.\n",
|
||
|
" | \n",
|
||
|
" | Resample a year by quarter using 'start' `convention`. Values are\n",
|
||
|
" | assigned to the first quarter of the period.\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series([1, 2], index=pd.period_range('2012-01-01',\n",
|
||
|
" | ... freq='A',\n",
|
||
|
" | ... periods=2))\n",
|
||
|
" | >>> s\n",
|
||
|
" | 2012 1\n",
|
||
|
" | 2013 2\n",
|
||
|
" | Freq: A-DEC, dtype: int64\n",
|
||
|
" | >>> s.resample('Q', convention='start').asfreq()\n",
|
||
|
" | 2012Q1 1.0\n",
|
||
|
" | 2012Q2 NaN\n",
|
||
|
" | 2012Q3 NaN\n",
|
||
|
" | 2012Q4 NaN\n",
|
||
|
" | 2013Q1 2.0\n",
|
||
|
" | 2013Q2 NaN\n",
|
||
|
" | 2013Q3 NaN\n",
|
||
|
" | 2013Q4 NaN\n",
|
||
|
" | Freq: Q-DEC, dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | Resample quarters by month using 'end' `convention`. Values are\n",
|
||
|
" | assigned to the last month of the period.\n",
|
||
|
" | \n",
|
||
|
" | >>> q = pd.Series([1, 2, 3, 4], index=pd.period_range('2018-01-01',\n",
|
||
|
" | ... freq='Q',\n",
|
||
|
" | ... periods=4))\n",
|
||
|
" | >>> q\n",
|
||
|
" | 2018Q1 1\n",
|
||
|
" | 2018Q2 2\n",
|
||
|
" | 2018Q3 3\n",
|
||
|
" | 2018Q4 4\n",
|
||
|
" | Freq: Q-DEC, dtype: int64\n",
|
||
|
" | >>> q.resample('M', convention='end').asfreq()\n",
|
||
|
" | 2018-03 1.0\n",
|
||
|
" | 2018-04 NaN\n",
|
||
|
" | 2018-05 NaN\n",
|
||
|
" | 2018-06 2.0\n",
|
||
|
" | 2018-07 NaN\n",
|
||
|
" | 2018-08 NaN\n",
|
||
|
" | 2018-09 3.0\n",
|
||
|
" | 2018-10 NaN\n",
|
||
|
" | 2018-11 NaN\n",
|
||
|
" | 2018-12 4.0\n",
|
||
|
" | Freq: M, dtype: float64\n",
|
||
|
" | \n",
|
||
|
" | For DataFrame objects, the keyword `on` can be used to specify the\n",
|
||
|
" | column instead of the index for resampling.\n",
|
||
|
" | \n",
|
||
|
" | >>> d = dict({'price': [10, 11, 9, 13, 14, 18, 17, 19],\n",
|
||
|
" | ... 'volume': [50, 60, 40, 100, 50, 100, 40, 50]})\n",
|
||
|
" | >>> df = pd.DataFrame(d)\n",
|
||
|
" | >>> df['week_starting'] = pd.date_range('01/01/2018',\n",
|
||
|
" | ... periods=8,\n",
|
||
|
" | ... freq='W')\n",
|
||
|
" | >>> df\n",
|
||
|
" | price volume week_starting\n",
|
||
|
" | 0 10 50 2018-01-07\n",
|
||
|
" | 1 11 60 2018-01-14\n",
|
||
|
" | 2 9 40 2018-01-21\n",
|
||
|
" | 3 13 100 2018-01-28\n",
|
||
|
" | 4 14 50 2018-02-04\n",
|
||
|
" | 5 18 100 2018-02-11\n",
|
||
|
" | 6 17 40 2018-02-18\n",
|
||
|
" | 7 19 50 2018-02-25\n",
|
||
|
" | >>> df.resample('M', on='week_starting').mean()\n",
|
||
|
" | price volume\n",
|
||
|
" | week_starting\n",
|
||
|
" | 2018-01-31 10.75 62.5\n",
|
||
|
" | 2018-02-28 17.00 60.0\n",
|
||
|
" | \n",
|
||
|
" | For a DataFrame with MultiIndex, the keyword `level` can be used to\n",
|
||
|
" | specify on which level the resampling needs to take place.\n",
|
||
|
" | \n",
|
||
|
" | >>> days = pd.date_range('1/1/2000', periods=4, freq='D')\n",
|
||
|
" | >>> d2 = dict({'price': [10, 11, 9, 13, 14, 18, 17, 19],\n",
|
||
|
" | ... 'volume': [50, 60, 40, 100, 50, 100, 40, 50]})\n",
|
||
|
" | >>> df2 = pd.DataFrame(d2,\n",
|
||
|
" | ... index=pd.MultiIndex.from_product([days,\n",
|
||
|
" | ... ['morning',\n",
|
||
|
" | ... 'afternoon']]\n",
|
||
|
" | ... ))\n",
|
||
|
" | >>> df2\n",
|
||
|
" | price volume\n",
|
||
|
" | 2000-01-01 morning 10 50\n",
|
||
|
" | afternoon 11 60\n",
|
||
|
" | 2000-01-02 morning 9 40\n",
|
||
|
" | afternoon 13 100\n",
|
||
|
" | 2000-01-03 morning 14 50\n",
|
||
|
" | afternoon 18 100\n",
|
||
|
" | 2000-01-04 morning 17 40\n",
|
||
|
" | afternoon 19 50\n",
|
||
|
" | >>> df2.resample('D', level=0).sum()\n",
|
||
|
" | price volume\n",
|
||
|
" | 2000-01-01 21 110\n",
|
||
|
" | 2000-01-02 22 140\n",
|
||
|
" | 2000-01-03 32 150\n",
|
||
|
" | 2000-01-04 36 90\n",
|
||
|
" | \n",
|
||
|
" | sample(self, n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)\n",
|
||
|
" | Return a random sample of items from an axis of object.\n",
|
||
|
" | \n",
|
||
|
" | You can use `random_state` for reproducibility.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | n : int, optional\n",
|
||
|
" | Number of items from axis to return. Cannot be used with `frac`.\n",
|
||
|
" | Default = 1 if `frac` = None.\n",
|
||
|
" | frac : float, optional\n",
|
||
|
" | Fraction of axis items to return. Cannot be used with `n`.\n",
|
||
|
" | replace : bool, default False\n",
|
||
|
" | Sample with or without replacement.\n",
|
||
|
" | weights : str or ndarray-like, optional\n",
|
||
|
" | Default 'None' results in equal probability weighting.\n",
|
||
|
" | If passed a Series, will align with target object on index. Index\n",
|
||
|
" | values in weights not found in sampled object will be ignored and\n",
|
||
|
" | index values in sampled object not in weights will be assigned\n",
|
||
|
" | weights of zero.\n",
|
||
|
" | If called on a DataFrame, will accept the name of a column\n",
|
||
|
" | when axis = 0.\n",
|
||
|
" | Unless weights are a Series, weights must be same length as axis\n",
|
||
|
" | being sampled.\n",
|
||
|
" | If weights do not sum to 1, they will be normalized to sum to 1.\n",
|
||
|
" | Missing values in the weights column will be treated as zero.\n",
|
||
|
" | Infinite values not allowed.\n",
|
||
|
" | random_state : int or numpy.random.RandomState, optional\n",
|
||
|
" | Seed for the random number generator (if int), or numpy RandomState\n",
|
||
|
" | object.\n",
|
||
|
" | axis : int or string, optional\n",
|
||
|
" | Axis to sample. Accepts axis number or name. Default is stat axis\n",
|
||
|
" | for given data type (0 for Series and DataFrames, 1 for Panels).\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | Series or DataFrame\n",
|
||
|
" | A new object of same type as caller containing `n` items randomly\n",
|
||
|
" | sampled from the caller object.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | numpy.random.choice: Generates a random sample from a given 1-D numpy\n",
|
||
|
" | array.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> df = pd.DataFrame({'num_legs': [2, 4, 8, 0],\n",
|
||
|
" | ... 'num_wings': [2, 0, 0, 0],\n",
|
||
|
" | ... 'num_specimen_seen': [10, 2, 1, 8]},\n",
|
||
|
" | ... index=['falcon', 'dog', 'spider', 'fish'])\n",
|
||
|
" | >>> df\n",
|
||
|
" | num_legs num_wings num_specimen_seen\n",
|
||
|
" | falcon 2 2 10\n",
|
||
|
" | dog 4 0 2\n",
|
||
|
" | spider 8 0 1\n",
|
||
|
" | fish 0 0 8\n",
|
||
|
" | \n",
|
||
|
" | Extract 3 random elements from the ``Series`` ``df['num_legs']``:\n",
|
||
|
" | Note that we use `random_state` to ensure the reproducibility of\n",
|
||
|
" | the examples.\n",
|
||
|
" | \n",
|
||
|
" | >>> df['num_legs'].sample(n=3, random_state=1)\n",
|
||
|
" | fish 0\n",
|
||
|
" | spider 8\n",
|
||
|
" | falcon 2\n",
|
||
|
" | Name: num_legs, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | A random 50% sample of the ``DataFrame`` with replacement:\n",
|
||
|
" | \n",
|
||
|
" | >>> df.sample(frac=0.5, replace=True, random_state=1)\n",
|
||
|
" | num_legs num_wings num_specimen_seen\n",
|
||
|
" | dog 4 0 2\n",
|
||
|
" | fish 0 0 8\n",
|
||
|
" | \n",
|
||
|
" | Using a DataFrame column as weights. Rows with larger value in the\n",
|
||
|
" | `num_specimen_seen` column are more likely to be sampled.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.sample(n=2, weights='num_specimen_seen', random_state=1)\n",
|
||
|
" | num_legs num_wings num_specimen_seen\n",
|
||
|
" | falcon 2 2 10\n",
|
||
|
" | fish 0 0 8\n",
|
||
|
" | \n",
|
||
|
" | select(self, crit, axis=0)\n",
|
||
|
" | Return data corresponding to axis labels matching criteria.\n",
|
||
|
" | \n",
|
||
|
" | .. deprecated:: 0.21.0\n",
|
||
|
" | Use df.loc[df.index.map(crit)] to select via labels\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | crit : function\n",
|
||
|
" | To be called on each index (label). Should return True or False\n",
|
||
|
" | axis : int\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | selection : same type as caller\n",
|
||
|
" | \n",
|
||
|
" | set_axis(self, labels, axis=0, inplace=None)\n",
|
||
|
" | Assign desired index to given axis.\n",
|
||
|
" | \n",
|
||
|
" | Indexes for column or row labels can be changed by assigning\n",
|
||
|
" | a list-like or Index.\n",
|
||
|
" | \n",
|
||
|
" | .. versionchanged:: 0.21.0\n",
|
||
|
" | \n",
|
||
|
" | The signature is now `labels` and `axis`, consistent with\n",
|
||
|
" | the rest of pandas API. Previously, the `axis` and `labels`\n",
|
||
|
" | arguments were respectively the first and second positional\n",
|
||
|
" | arguments.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | labels : list-like, Index\n",
|
||
|
" | The values for the new index.\n",
|
||
|
" | \n",
|
||
|
" | axis : {0 or 'index', 1 or 'columns'}, default 0\n",
|
||
|
" | The axis to update. The value 0 identifies the rows, and 1\n",
|
||
|
" | identifies the columns.\n",
|
||
|
" | \n",
|
||
|
" | inplace : boolean, default None\n",
|
||
|
" | Whether to return a new %(klass)s instance.\n",
|
||
|
" | \n",
|
||
|
" | .. warning::\n",
|
||
|
" | \n",
|
||
|
" | ``inplace=None`` currently falls back to to True, but in a\n",
|
||
|
" | future version, will default to False. Use inplace=True\n",
|
||
|
" | explicitly rather than relying on the default.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | renamed : %(klass)s or None\n",
|
||
|
" | An object of same type as caller if inplace=False, None otherwise.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | DataFrame.rename_axis : Alter the name of the index or columns.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | **Series**\n",
|
||
|
" | \n",
|
||
|
" | >>> s = pd.Series([1, 2, 3])\n",
|
||
|
" | >>> s\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | 2 3\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | >>> s.set_axis(['a', 'b', 'c'], axis=0, inplace=False)\n",
|
||
|
" | a 1\n",
|
||
|
" | b 2\n",
|
||
|
" | c 3\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | The original object is not modified.\n",
|
||
|
" | \n",
|
||
|
" | >>> s\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 2\n",
|
||
|
" | 2 3\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | **DataFrame**\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame({\"A\": [1, 2, 3], \"B\": [4, 5, 6]})\n",
|
||
|
" | \n",
|
||
|
" | Change the row labels.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.set_axis(['a', 'b', 'c'], axis='index', inplace=False)\n",
|
||
|
" | A B\n",
|
||
|
" | a 1 4\n",
|
||
|
" | b 2 5\n",
|
||
|
" | c 3 6\n",
|
||
|
" | \n",
|
||
|
" | Change the column labels.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.set_axis(['I', 'II'], axis='columns', inplace=False)\n",
|
||
|
" | I II\n",
|
||
|
" | 0 1 4\n",
|
||
|
" | 1 2 5\n",
|
||
|
" | 2 3 6\n",
|
||
|
" | \n",
|
||
|
" | Now, update the labels inplace.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.set_axis(['i', 'ii'], axis='columns', inplace=True)\n",
|
||
|
" | >>> df\n",
|
||
|
" | i ii\n",
|
||
|
" | 0 1 4\n",
|
||
|
" | 1 2 5\n",
|
||
|
" | 2 3 6\n",
|
||
|
" | \n",
|
||
|
" | slice_shift(self, periods=1, axis=0)\n",
|
||
|
" | Equivalent to `shift` without copying data. The shifted data will\n",
|
||
|
" | not include the dropped periods and the shifted axis will be smaller\n",
|
||
|
" | than the original.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | periods : int\n",
|
||
|
" | Number of periods to move, can be positive or negative\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | shifted : same type as caller\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | While the `slice_shift` is faster than `shift`, you may pay for it\n",
|
||
|
" | later during alignment.\n",
|
||
|
" | \n",
|
||
|
" | squeeze(self, axis=None)\n",
|
||
|
" | Squeeze 1 dimensional axis objects into scalars.\n",
|
||
|
" | \n",
|
||
|
" | Series or DataFrames with a single element are squeezed to a scalar.\n",
|
||
|
" | DataFrames with a single column or a single row are squeezed to a\n",
|
||
|
" | Series. Otherwise the object is unchanged.\n",
|
||
|
" | \n",
|
||
|
" | This method is most useful when you don't know if your\n",
|
||
|
" | object is a Series or DataFrame, but you do know it has just a single\n",
|
||
|
" | column. In that case you can safely call `squeeze` to ensure you have a\n",
|
||
|
" | Series.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | axis : {0 or 'index', 1 or 'columns', None}, default None\n",
|
||
|
" | A specific axis to squeeze. By default, all length-1 axes are\n",
|
||
|
" | squeezed.\n",
|
||
|
" | \n",
|
||
|
" | .. versionadded:: 0.20.0\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | DataFrame, Series, or scalar\n",
|
||
|
" | The projection after squeezing `axis` or all the axes.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | Series.iloc : Integer-location based indexing for selecting scalars.\n",
|
||
|
" | DataFrame.iloc : Integer-location based indexing for selecting Series.\n",
|
||
|
" | Series.to_frame : Inverse of DataFrame.squeeze for a\n",
|
||
|
" | single-column DataFrame.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> primes = pd.Series([2, 3, 5, 7])\n",
|
||
|
" | \n",
|
||
|
" | Slicing might produce a Series with a single value:\n",
|
||
|
" | \n",
|
||
|
" | >>> even_primes = primes[primes % 2 == 0]\n",
|
||
|
" | >>> even_primes\n",
|
||
|
" | 0 2\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | >>> even_primes.squeeze()\n",
|
||
|
" | 2\n",
|
||
|
" | \n",
|
||
|
" | Squeezing objects with more than one value in every axis does nothing:\n",
|
||
|
" | \n",
|
||
|
" | >>> odd_primes = primes[primes % 2 == 1]\n",
|
||
|
" | >>> odd_primes\n",
|
||
|
" | 1 3\n",
|
||
|
" | 2 5\n",
|
||
|
" | 3 7\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | >>> odd_primes.squeeze()\n",
|
||
|
" | 1 3\n",
|
||
|
" | 2 5\n",
|
||
|
" | 3 7\n",
|
||
|
" | dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Squeezing is even more effective when used with DataFrames.\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])\n",
|
||
|
" | >>> df\n",
|
||
|
" | a b\n",
|
||
|
" | 0 1 2\n",
|
||
|
" | 1 3 4\n",
|
||
|
" | \n",
|
||
|
" | Slicing a single column will produce a DataFrame with the columns\n",
|
||
|
" | having only one value:\n",
|
||
|
" | \n",
|
||
|
" | >>> df_a = df[['a']]\n",
|
||
|
" | >>> df_a\n",
|
||
|
" | a\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 3\n",
|
||
|
" | \n",
|
||
|
" | So the columns can be squeezed down, resulting in a Series:\n",
|
||
|
" | \n",
|
||
|
" | >>> df_a.squeeze('columns')\n",
|
||
|
" | 0 1\n",
|
||
|
" | 1 3\n",
|
||
|
" | Name: a, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Slicing a single row from a single column will produce a single\n",
|
||
|
" | scalar DataFrame:\n",
|
||
|
" | \n",
|
||
|
" | >>> df_0a = df.loc[df.index < 1, ['a']]\n",
|
||
|
" | >>> df_0a\n",
|
||
|
" | a\n",
|
||
|
" | 0 1\n",
|
||
|
" | \n",
|
||
|
" | Squeezing the rows produces a single scalar Series:\n",
|
||
|
" | \n",
|
||
|
" | >>> df_0a.squeeze('rows')\n",
|
||
|
" | a 1\n",
|
||
|
" | Name: 0, dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Squeezing all axes wil project directly into a scalar:\n",
|
||
|
" | \n",
|
||
|
" | >>> df_0a.squeeze()\n",
|
||
|
" | 1\n",
|
||
|
" | \n",
|
||
|
" | swapaxes(self, axis1, axis2, copy=True)\n",
|
||
|
" | Interchange axes and swap values axes appropriately.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | y : same as input\n",
|
||
|
" | \n",
|
||
|
" | tail(self, n=5)\n",
|
||
|
" | Return the last `n` rows.\n",
|
||
|
" | \n",
|
||
|
" | This function returns last `n` rows from the object based on\n",
|
||
|
" | position. It is useful for quickly verifying data, for example,\n",
|
||
|
" | after sorting or appending rows.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | n : int, default 5\n",
|
||
|
" | Number of rows to select.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | type of caller\n",
|
||
|
" | The last `n` rows of the caller object.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | DataFrame.head : The first `n` rows of the caller object.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> df = pd.DataFrame({'animal':['alligator', 'bee', 'falcon', 'lion',\n",
|
||
|
" | ... 'monkey', 'parrot', 'shark', 'whale', 'zebra']})\n",
|
||
|
" | >>> df\n",
|
||
|
" | animal\n",
|
||
|
" | 0 alligator\n",
|
||
|
" | 1 bee\n",
|
||
|
" | 2 falcon\n",
|
||
|
" | 3 lion\n",
|
||
|
" | 4 monkey\n",
|
||
|
" | 5 parrot\n",
|
||
|
" | 6 shark\n",
|
||
|
" | 7 whale\n",
|
||
|
" | 8 zebra\n",
|
||
|
" | \n",
|
||
|
" | Viewing the last 5 lines\n",
|
||
|
" | \n",
|
||
|
" | >>> df.tail()\n",
|
||
|
" | animal\n",
|
||
|
" | 4 monkey\n",
|
||
|
" | 5 parrot\n",
|
||
|
" | 6 shark\n",
|
||
|
" | 7 whale\n",
|
||
|
" | 8 zebra\n",
|
||
|
" | \n",
|
||
|
" | Viewing the last `n` lines (three in this case)\n",
|
||
|
" | \n",
|
||
|
" | >>> df.tail(3)\n",
|
||
|
" | animal\n",
|
||
|
" | 6 shark\n",
|
||
|
" | 7 whale\n",
|
||
|
" | 8 zebra\n",
|
||
|
" | \n",
|
||
|
" | take(self, indices, axis=0, convert=None, is_copy=True, **kwargs)\n",
|
||
|
" | Return the elements in the given *positional* indices along an axis.\n",
|
||
|
" | \n",
|
||
|
" | This means that we are not indexing according to actual values in\n",
|
||
|
" | the index attribute of the object. We are indexing according to the\n",
|
||
|
" | actual position of the element in the object.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | indices : array-like\n",
|
||
|
" | An array of ints indicating which positions to take.\n",
|
||
|
" | axis : {0 or 'index', 1 or 'columns', None}, default 0\n",
|
||
|
" | The axis on which to select elements. ``0`` means that we are\n",
|
||
|
" | selecting rows, ``1`` means that we are selecting columns.\n",
|
||
|
" | convert : bool, default True\n",
|
||
|
" | Whether to convert negative indices into positive ones.\n",
|
||
|
" | For example, ``-1`` would map to the ``len(axis) - 1``.\n",
|
||
|
" | The conversions are similar to the behavior of indexing a\n",
|
||
|
" | regular Python list.\n",
|
||
|
" | \n",
|
||
|
" | .. deprecated:: 0.21.0\n",
|
||
|
" | In the future, negative indices will always be converted.\n",
|
||
|
" | \n",
|
||
|
" | is_copy : bool, default True\n",
|
||
|
" | Whether to return a copy of the original object or not.\n",
|
||
|
" | **kwargs\n",
|
||
|
" | For compatibility with :meth:`numpy.take`. Has no effect on the\n",
|
||
|
" | output.\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | taken : same type as caller\n",
|
||
|
" | An array-like containing the elements taken from the object.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | DataFrame.loc : Select a subset of a DataFrame by labels.\n",
|
||
|
" | DataFrame.iloc : Select a subset of a DataFrame by positions.\n",
|
||
|
" | numpy.take : Take elements from an array along an axis.\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | >>> df = pd.DataFrame([('falcon', 'bird', 389.0),\n",
|
||
|
" | ... ('parrot', 'bird', 24.0),\n",
|
||
|
" | ... ('lion', 'mammal', 80.5),\n",
|
||
|
" | ... ('monkey', 'mammal', np.nan)],\n",
|
||
|
" | ... columns=['name', 'class', 'max_speed'],\n",
|
||
|
" | ... index=[0, 2, 3, 1])\n",
|
||
|
" | >>> df\n",
|
||
|
" | name class max_speed\n",
|
||
|
" | 0 falcon bird 389.0\n",
|
||
|
" | 2 parrot bird 24.0\n",
|
||
|
" | 3 lion mammal 80.5\n",
|
||
|
" | 1 monkey mammal NaN\n",
|
||
|
" | \n",
|
||
|
" | Take elements at positions 0 and 3 along the axis 0 (default).\n",
|
||
|
" | \n",
|
||
|
" | Note how the actual indices selected (0 and 1) do not correspond to\n",
|
||
|
" | our selected indices 0 and 3. That's because we are selecting the 0th\n",
|
||
|
" | and 3rd rows, not rows whose indices equal 0 and 3.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.take([0, 3])\n",
|
||
|
" | name class max_speed\n",
|
||
|
" | 0 falcon bird 389.0\n",
|
||
|
" | 1 monkey mammal NaN\n",
|
||
|
" | \n",
|
||
|
" | Take elements at indices 1 and 2 along the axis 1 (column selection).\n",
|
||
|
" | \n",
|
||
|
" | >>> df.take([1, 2], axis=1)\n",
|
||
|
" | class max_speed\n",
|
||
|
" | 0 bird 389.0\n",
|
||
|
" | 2 bird 24.0\n",
|
||
|
" | 3 mammal 80.5\n",
|
||
|
" | 1 mammal NaN\n",
|
||
|
" | \n",
|
||
|
" | We may take elements using negative integers for positive indices,\n",
|
||
|
" | starting from the end of the object, just like with Python lists.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.take([-1, -2])\n",
|
||
|
" | name class max_speed\n",
|
||
|
" | 1 monkey mammal NaN\n",
|
||
|
" | 3 lion mammal 80.5\n",
|
||
|
" | \n",
|
||
|
" | to_clipboard(self, excel=True, sep=None, **kwargs)\n",
|
||
|
" | Copy object to the system clipboard.\n",
|
||
|
" | \n",
|
||
|
" | Write a text representation of object to the system clipboard.\n",
|
||
|
" | This can be pasted into Excel, for example.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | excel : bool, default True\n",
|
||
|
" | - True, use the provided separator, writing in a csv format for\n",
|
||
|
" | allowing easy pasting into excel.\n",
|
||
|
" | - False, write a string representation of the object to the\n",
|
||
|
" | clipboard.\n",
|
||
|
" | \n",
|
||
|
" | sep : str, default ``'\\t'``\n",
|
||
|
" | Field delimiter.\n",
|
||
|
" | **kwargs\n",
|
||
|
" | These parameters will be passed to DataFrame.to_csv.\n",
|
||
|
" | \n",
|
||
|
" | See Also\n",
|
||
|
" | --------\n",
|
||
|
" | DataFrame.to_csv : Write a DataFrame to a comma-separated values\n",
|
||
|
" | (csv) file.\n",
|
||
|
" | read_clipboard : Read text from clipboard and pass to read_table.\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | Requirements for your platform.\n",
|
||
|
" | \n",
|
||
|
" | - Linux : `xclip`, or `xsel` (with `gtk` or `PyQt4` modules)\n",
|
||
|
" | - Windows : none\n",
|
||
|
" | - OS X : none\n",
|
||
|
" | \n",
|
||
|
" | Examples\n",
|
||
|
" | --------\n",
|
||
|
" | Copy the contents of a DataFrame to the clipboard.\n",
|
||
|
" | \n",
|
||
|
" | >>> df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['A', 'B', 'C'])\n",
|
||
|
" | >>> df.to_clipboard(sep=',')\n",
|
||
|
" | ... # Wrote the following to the system clipboard:\n",
|
||
|
" | ... # ,A,B,C\n",
|
||
|
" | ... # 0,1,2,3\n",
|
||
|
" | ... # 1,4,5,6\n",
|
||
|
" | \n",
|
||
|
" | We can omit the the index by passing the keyword `index` and setting\n",
|
||
|
" | it to false.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.to_clipboard(sep=',', index=False)\n",
|
||
|
" | ... # Wrote the following to the system clipboard:\n",
|
||
|
" | ... # A,B,C\n",
|
||
|
" | ... # 1,2,3\n",
|
||
|
" | ... # 4,5,6\n",
|
||
|
" | \n",
|
||
|
" | to_dense(self)\n",
|
||
|
" | Return dense representation of NDFrame (as opposed to sparse).\n",
|
||
|
" | \n",
|
||
|
" | to_excel(self, excel_writer, sheet_name='Sheet1', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, startrow=0, startcol=0, engine=None, merge_cells=True, encoding=None, inf_rep='inf', verbose=True, freeze_panes=None)\n",
|
||
|
" | Write object to an Excel sheet.\n",
|
||
|
" | \n",
|
||
|
" | To write a single object to an Excel .xlsx file it is only necessary to\n",
|
||
|
" | specify a target file name. To write to multiple sheets it is necessary to\n",
|
||
|
" | create an `ExcelWriter` object with a target file name, and specify a sheet\n",
|
||
|
" | in the file to write to.\n",
|
||
|
" | \n",
|
||
|
" | Multiple sheets may be written to by specifying unique `sheet_name`.\n",
|
||
|
" | With all data written to the file it is necessary to save the changes.\n",
|
||
|
" | Note that creating an `ExcelWriter` object with a file name that already\n",
|
||
|
" | exists will result in the contents of the existing file being erased.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | excel_writer : str or ExcelWriter object\n",
|
||
|
" | File path or existing ExcelWriter.\n",
|
||
|
" | sheet_name : str, default 'Sheet1'\n",
|
||
|
" | Name of sheet which will contain DataFrame.\n",
|
||
|
" | na_rep : str, default ''\n",
|
||
|
" | Missing data representation.\n",
|
||
|
" | float_format : str, optional\n",
|
||
|
" | Format string for floating point numbers. For example\n",
|
||
|
" | ``float_format=\"%.2f\"`` will format 0.1234 to 0.12.\n",
|
||
|
" | columns : sequence or list of str, optional\n",
|
||
|
" | Columns to write.\n",
|
||
|
" | header : bool or list of str, default True\n",
|
||
|
" | Write out the column names. If a list of string is given it is\n",
|
||
|
" | assumed to be aliases for the column names.\n",
|
||
|
" | index : bool, default True\n",
|
||
|
" | Write row names (index).\n",
|
||
|
" | index_label : str or sequence, optional\n",
|
||
|
" | Column label for index column(s) if desired. If not specified, and\n",
" | `header` and `index` are True, then the index names are used. A\n",
" | sequence should be given if the DataFrame uses MultiIndex.\n",
" | startrow : int, default 0\n",
" | Upper left cell row to dump data frame.\n",
" | startcol : int, default 0\n",
" | Upper left cell column to dump data frame.\n",
" | engine : str, optional\n",
" | Write engine to use, 'openpyxl' or 'xlsxwriter'. You can also set this\n",
" | via the options ``io.excel.xlsx.writer``, ``io.excel.xls.writer``, and\n",
" | ``io.excel.xlsm.writer``.\n",
" | merge_cells : bool, default True\n",
" | Write MultiIndex and Hierarchical Rows as merged cells.\n",
" | encoding : str, optional\n",
" | Encoding of the resulting excel file. Only necessary for xlwt,\n",
" | other writers support unicode natively.\n",
" | inf_rep : str, default 'inf'\n",
" | Representation for infinity (there is no native representation for\n",
" | infinity in Excel).\n",
" | verbose : bool, default True\n",
" | Display more information in the error logs.\n",
" | freeze_panes : tuple of int (length 2), optional\n",
" | Specifies the one-based bottommost row and rightmost column that\n",
" | is to be frozen.\n",
" | \n",
" | .. versionadded:: 0.20.0\n",
" | \n",
" | See Also\n",
" | --------\n",
" | to_csv : Write DataFrame to a comma-separated values (csv) file.\n",
" | ExcelWriter : Class for writing DataFrame objects into excel sheets.\n",
" | read_excel : Read an Excel file into a pandas DataFrame.\n",
" | read_csv : Read a comma-separated values (csv) file into DataFrame.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | For compatibility with :meth:`~DataFrame.to_csv`,\n",
" | to_excel serializes lists and dicts to strings before writing.\n",
" | \n",
" | Once a workbook has been saved it is not possible to write further data\n",
" | without rewriting the whole workbook.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | \n",
" | Create, write to and save a workbook:\n",
" | \n",
" | >>> df1 = pd.DataFrame([['a', 'b'], ['c', 'd']],\n",
" | ... index=['row 1', 'row 2'],\n",
" | ... columns=['col 1', 'col 2'])\n",
" | >>> df1.to_excel(\"output.xlsx\") # doctest: +SKIP\n",
" | \n",
" | To specify the sheet name:\n",
" | \n",
" | >>> df1.to_excel(\"output.xlsx\",\n",
" | ... sheet_name='Sheet_name_1') # doctest: +SKIP\n",
" | \n",
" | If you wish to write to more than one sheet in the workbook, it is\n",
" | necessary to specify an ExcelWriter object:\n",
" | \n",
" | >>> df2 = df1.copy()\n",
" | >>> with pd.ExcelWriter('output.xlsx') as writer: # doctest: +SKIP\n",
" | ... df1.to_excel(writer, sheet_name='Sheet_name_1')\n",
" | ... df2.to_excel(writer, sheet_name='Sheet_name_2')\n",
" | \n",
" | To set the library that is used to write the Excel file,\n",
" | you can pass the `engine` keyword (the default engine is\n",
" | automatically chosen depending on the file extension):\n",
" | \n",
" | >>> df1.to_excel('output1.xlsx', engine='xlsxwriter') # doctest: +SKIP\n",
" | \n",
" | to_hdf(self, path_or_buf, key, **kwargs)\n",
" | Write the contained data to an HDF5 file using HDFStore.\n",
" | \n",
" | Hierarchical Data Format (HDF) is self-describing, allowing an\n",
" | application to interpret the structure and contents of a file with\n",
" | no outside information. One HDF file can hold a mix of related objects\n",
" | which can be accessed as a group or as individual objects.\n",
" | \n",
" | In order to add another DataFrame or Series to an existing HDF file\n",
" | please use append mode and a different key.\n",
" | \n",
" | For more information see the :ref:`user guide <io.hdf5>`.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | path_or_buf : str or pandas.HDFStore\n",
" | File path or HDFStore object.\n",
" | key : str\n",
" | Identifier for the group in the store.\n",
" | mode : {'a', 'w', 'r+'}, default 'a'\n",
" | Mode to open file:\n",
" | \n",
" | - 'w': write, a new file is created (an existing file with\n",
" | the same name would be deleted).\n",
" | - 'a': append, an existing file is opened for reading and\n",
" | writing, and if the file does not exist it is created.\n",
" | - 'r+': similar to 'a', but the file must already exist.\n",
" | format : {'fixed', 'table'}, default 'fixed'\n",
" | Possible values:\n",
" | \n",
" | - 'fixed': Fixed format. Fast writing/reading. Not-appendable,\n",
" | nor searchable.\n",
" | - 'table': Table format. Write as a PyTables Table structure\n",
" | which may perform worse but allow more flexible operations\n",
" | like searching / selecting subsets of the data.\n",
" | append : bool, default False\n",
" | For Table formats, append the input data to the existing.\n",
" | data_columns : list of columns or True, optional\n",
" | List of columns to create as indexed data columns for on-disk\n",
" | queries, or True to use all columns. By default only the axes\n",
" | of the object are indexed. See :ref:`io.hdf5-query-data-columns`.\n",
" | Applicable only to format='table'.\n",
" | complevel : {0-9}, optional\n",
" | Specifies a compression level for data.\n",
" | A value of 0 disables compression.\n",
" | complib : {'zlib', 'lzo', 'bzip2', 'blosc'}, default 'zlib'\n",
" | Specifies the compression library to be used.\n",
" | As of v0.20.2 these additional compressors for Blosc are supported\n",
" | (default if no compressor specified: 'blosc:blosclz'):\n",
" | {'blosc:blosclz', 'blosc:lz4', 'blosc:lz4hc', 'blosc:snappy',\n",
" | 'blosc:zlib', 'blosc:zstd'}.\n",
" | Specifying a compression library which is not available issues\n",
" | a ValueError.\n",
" | fletcher32 : bool, default False\n",
" | If applying compression use the fletcher32 checksum.\n",
" | dropna : bool, default False\n",
" | If true, ALL nan rows will not be written to store.\n",
" | errors : str, default 'strict'\n",
" | Specifies how encoding and decoding errors are to be handled.\n",
" | See the errors argument for :func:`open` for a full list\n",
" | of options.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | DataFrame.read_hdf : Read from HDF file.\n",
" | DataFrame.to_parquet : Write a DataFrame to the binary parquet format.\n",
" | DataFrame.to_sql : Write to a sql table.\n",
" | DataFrame.to_feather : Write out feather-format for DataFrames.\n",
" | DataFrame.to_csv : Write out to a csv file.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]},\n",
" | ... index=['a', 'b', 'c'])\n",
" | >>> df.to_hdf('data.h5', key='df', mode='w')\n",
" | \n",
" | We can add another object to the same file:\n",
" | \n",
" | >>> s = pd.Series([1, 2, 3, 4])\n",
" | >>> s.to_hdf('data.h5', key='s')\n",
" | \n",
" | Reading from HDF file:\n",
" | \n",
" | >>> pd.read_hdf('data.h5', 'df')\n",
" | A B\n",
" | a 1 4\n",
" | b 2 5\n",
" | c 3 6\n",
" | >>> pd.read_hdf('data.h5', 's')\n",
" | 0 1\n",
" | 1 2\n",
" | 2 3\n",
" | 3 4\n",
" | dtype: int64\n",
" | \n",
" | Deleting file with data:\n",
" | \n",
" | >>> import os\n",
" | >>> os.remove('data.h5')\n",
" | \n",
" | to_json(self, path_or_buf=None, orient=None, date_format=None, double_precision=10, force_ascii=True, date_unit='ms', default_handler=None, lines=False, compression='infer', index=True)\n",
" | Convert the object to a JSON string.\n",
" | \n",
" | Note NaN's and None will be converted to null and datetime objects\n",
" | will be converted to UNIX timestamps.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | path_or_buf : string or file handle, optional\n",
" | File path or object. If not specified, the result is returned as\n",
" | a string.\n",
" | orient : string\n",
" | Indication of expected JSON string format.\n",
" | \n",
" | * Series\n",
" | \n",
" | - default is 'index'\n",
" | - allowed values are: {'split','records','index','table'}\n",
" | \n",
" | * DataFrame\n",
" | \n",
" | - default is 'columns'\n",
" | - allowed values are:\n",
" | {'split','records','index','columns','values','table'}\n",
" | \n",
" | * The format of the JSON string\n",
" | \n",
" | - 'split' : dict like {'index' -> [index],\n",
" | 'columns' -> [columns], 'data' -> [values]}\n",
" | - 'records' : list like\n",
" | [{column -> value}, ... , {column -> value}]\n",
" | - 'index' : dict like {index -> {column -> value}}\n",
" | - 'columns' : dict like {column -> {index -> value}}\n",
" | - 'values' : just the values array\n",
" | - 'table' : dict like {'schema': {schema}, 'data': {data}}\n",
" | describing the data, and the data component is\n",
" | like ``orient='records'``.\n",
" | \n",
" | .. versionchanged:: 0.20.0\n",
" | \n",
" | date_format : {None, 'epoch', 'iso'}\n",
" | Type of date conversion. 'epoch' = epoch milliseconds,\n",
" | 'iso' = ISO8601. The default depends on the `orient`. For\n",
" | ``orient='table'``, the default is 'iso'. For all other orients,\n",
" | the default is 'epoch'.\n",
" | double_precision : int, default 10\n",
" | The number of decimal places to use when encoding\n",
" | floating point values.\n",
" | force_ascii : bool, default True\n",
" | Force encoded string to be ASCII.\n",
" | date_unit : string, default 'ms' (milliseconds)\n",
" | The time unit to encode to, governs timestamp and ISO8601\n",
" | precision. One of 's', 'ms', 'us', 'ns' for second, millisecond,\n",
" | microsecond, and nanosecond respectively.\n",
" | default_handler : callable, default None\n",
" | Handler to call if object cannot otherwise be converted to a\n",
" | suitable format for JSON. Should receive a single argument which is\n",
" | the object to convert and return a serialisable object.\n",
" | lines : bool, default False\n",
" | If 'orient' is 'records' write out line delimited json format. Will\n",
" | throw ValueError if incorrect 'orient' since others are not list\n",
" | like.\n",
" | \n",
" | .. versionadded:: 0.19.0\n",
" | \n",
" | compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}\n",
" | \n",
" | A string representing the compression to use in the output file,\n",
" | only used when the first argument is a filename. By default, the\n",
" | compression is inferred from the filename.\n",
" | \n",
" | .. versionadded:: 0.21.0\n",
" | .. versionchanged:: 0.24.0\n",
" | 'infer' option added and set to default\n",
" | index : bool, default True\n",
" | Whether to include the index values in the JSON string. Not\n",
" | including the index (``index=False``) is only supported when\n",
" | orient is 'split' or 'table'.\n",
" | \n",
" | .. versionadded:: 0.23.0\n",
" | \n",
" | See Also\n",
" | --------\n",
" | read_json\n",
" | \n",
" | Examples\n",
" | --------\n",
" | \n",
" | >>> df = pd.DataFrame([['a', 'b'], ['c', 'd']],\n",
" | ... index=['row 1', 'row 2'],\n",
" | ... columns=['col 1', 'col 2'])\n",
" | >>> df.to_json(orient='split')\n",
" | '{\"columns\":[\"col 1\",\"col 2\"],\n",
" | \"index\":[\"row 1\",\"row 2\"],\n",
" | \"data\":[[\"a\",\"b\"],[\"c\",\"d\"]]}'\n",
" | \n",
" | Encoding/decoding a Dataframe using ``'records'`` formatted JSON.\n",
" | Note that index labels are not preserved with this encoding.\n",
" | \n",
" | >>> df.to_json(orient='records')\n",
" | '[{\"col 1\":\"a\",\"col 2\":\"b\"},{\"col 1\":\"c\",\"col 2\":\"d\"}]'\n",
" | \n",
" | Encoding/decoding a Dataframe using ``'index'`` formatted JSON:\n",
" | \n",
" | >>> df.to_json(orient='index')\n",
" | '{\"row 1\":{\"col 1\":\"a\",\"col 2\":\"b\"},\"row 2\":{\"col 1\":\"c\",\"col 2\":\"d\"}}'\n",
" | \n",
" | Encoding/decoding a Dataframe using ``'columns'`` formatted JSON:\n",
" | \n",
" | >>> df.to_json(orient='columns')\n",
" | '{\"col 1\":{\"row 1\":\"a\",\"row 2\":\"c\"},\"col 2\":{\"row 1\":\"b\",\"row 2\":\"d\"}}'\n",
" | \n",
" | Encoding/decoding a Dataframe using ``'values'`` formatted JSON:\n",
" | \n",
" | >>> df.to_json(orient='values')\n",
" | '[[\"a\",\"b\"],[\"c\",\"d\"]]'\n",
" | \n",
" | Encoding with Table Schema\n",
" | \n",
" | >>> df.to_json(orient='table')\n",
" | '{\"schema\": {\"fields\": [{\"name\": \"index\", \"type\": \"string\"},\n",
" | {\"name\": \"col 1\", \"type\": \"string\"},\n",
" | {\"name\": \"col 2\", \"type\": \"string\"}],\n",
" | \"primaryKey\": \"index\",\n",
" | \"pandas_version\": \"0.20.0\"},\n",
" | \"data\": [{\"index\": \"row 1\", \"col 1\": \"a\", \"col 2\": \"b\"},\n",
" | {\"index\": \"row 2\", \"col 1\": \"c\", \"col 2\": \"d\"}]}'\n",
" | \n",
" | to_latex(self, buf=None, columns=None, col_space=None, header=True, index=True, na_rep='NaN', formatters=None, float_format=None, sparsify=None, index_names=True, bold_rows=False, column_format=None, longtable=None, escape=None, encoding=None, decimal='.', multicolumn=None, multicolumn_format=None, multirow=None)\n",
" | Render an object to a LaTeX tabular environment table.\n",
" | \n",
" | Render an object to a tabular environment table. You can splice\n",
" | this into a LaTeX document. Requires \\usepackage{booktabs}.\n",
" | \n",
" | .. versionchanged:: 0.20.2\n",
" | Added to Series\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | buf : file descriptor or None\n",
" | Buffer to write to. If None, the output is returned as a string.\n",
" | columns : list of label, optional\n",
" | The subset of columns to write. Writes all columns by default.\n",
" | col_space : int, optional\n",
" | The minimum width of each column.\n",
" | header : bool or list of str, default True\n",
" | Write out the column names. If a list of strings is given,\n",
" | it is assumed to be aliases for the column names.\n",
" | index : bool, default True\n",
" | Write row names (index).\n",
" | na_rep : str, default 'NaN'\n",
" | Missing data representation.\n",
" | formatters : list of functions or dict of {str: function}, optional\n",
" | Formatter functions to apply to columns' elements by position or\n",
" | name. The result of each function must be a unicode string.\n",
" | List must be of length equal to the number of columns.\n",
" | float_format : str, optional\n",
" | Format string for floating point numbers.\n",
" | sparsify : bool, optional\n",
" | Set to False for a DataFrame with a hierarchical index to print\n",
" | every multiindex key at each row. By default, the value will be\n",
" | read from the config module.\n",
" | index_names : bool, default True\n",
" | Prints the names of the indexes.\n",
" | bold_rows : bool, default False\n",
" | Make the row labels bold in the output.\n",
" | column_format : str, optional\n",
" | The columns format as specified in `LaTeX table format\n",
" | <https://en.wikibooks.org/wiki/LaTeX/Tables>`__ e.g. 'rcl' for 3\n",
" | columns. By default, 'l' will be used for all columns except\n",
" | columns of numbers, which default to 'r'.\n",
" | longtable : bool, optional\n",
" | By default, the value will be read from the pandas config\n",
" | module. Use a longtable environment instead of tabular. Requires\n",
" | adding a \\usepackage{longtable} to your LaTeX preamble.\n",
" | escape : bool, optional\n",
" | By default, the value will be read from the pandas config\n",
" | module. When set to False prevents from escaping latex special\n",
" | characters in column names.\n",
" | encoding : str, optional\n",
" | A string representing the encoding to use in the output file,\n",
" | defaults to 'ascii' on Python 2 and 'utf-8' on Python 3.\n",
" | decimal : str, default '.'\n",
" | Character recognized as decimal separator, e.g. ',' in Europe.\n",
" | .. versionadded:: 0.18.0\n",
" | multicolumn : bool, default True\n",
" | Use \\multicolumn to enhance MultiIndex columns.\n",
" | The default will be read from the config module.\n",
" | .. versionadded:: 0.20.0\n",
" | multicolumn_format : str, default 'l'\n",
" | The alignment for multicolumns, similar to `column_format`\n",
" | The default will be read from the config module.\n",
" | .. versionadded:: 0.20.0\n",
" | multirow : bool, default False\n",
" | Use \\multirow to enhance MultiIndex rows. Requires adding a\n",
" | \\usepackage{multirow} to your LaTeX preamble. Will print\n",
" | centered labels (instead of top-aligned) across the contained\n",
" | rows, separating groups via clines. The default will be read\n",
" | from the pandas config module.\n",
" | .. versionadded:: 0.20.0\n",
" | \n",
" | Returns\n",
" | -------\n",
" | str or None\n",
" | If buf is None, returns the resulting LaTeX format as a\n",
" | string. Otherwise returns None.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | DataFrame.to_string : Render a DataFrame to a console-friendly\n",
" | tabular output.\n",
" | DataFrame.to_html : Render a DataFrame as an HTML table.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> df = pd.DataFrame({'name': ['Raphael', 'Donatello'],\n",
" | ... 'mask': ['red', 'purple'],\n",
" | ... 'weapon': ['sai', 'bo staff']})\n",
" | >>> df.to_latex(index=False) # doctest: +NORMALIZE_WHITESPACE\n",
" | '\\\\begin{tabular}{lll}\\n\\\\toprule\\n name & mask & weapon\n",
" | \\\\\\\\\\n\\\\midrule\\n Raphael & red & sai \\\\\\\\\\n Donatello &\n",
" | purple & bo staff \\\\\\\\\\n\\\\bottomrule\\n\\\\end{tabular}\\n'\n",
" | \n",
" | to_msgpack(self, path_or_buf=None, encoding='utf-8', **kwargs)\n",
" | Serialize object to input file path using msgpack format.\n",
" | \n",
" | THIS IS AN EXPERIMENTAL LIBRARY and the storage format\n",
" | may not be stable until a future release.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | path : string File path, buffer-like, or None\n",
" | if None, return generated string\n",
" | append : bool whether to append to an existing msgpack\n",
" | (default is False)\n",
" | compress : type of compressor (zlib or blosc), default to None (no\n",
" | compression)\n",
" | \n",
" | to_pickle(self, path, compression='infer', protocol=4)\n",
" | Pickle (serialize) object to file.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | path : str\n",
" | File path where the pickled object will be stored.\n",
" | compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}, default 'infer'\n",
" | A string representing the compression to use in the output file. By\n",
" | default, infers from the file extension in specified path.\n",
" | \n",
" | .. versionadded:: 0.20.0\n",
" | protocol : int\n",
" | Int which indicates which protocol should be used by the pickler,\n",
" | default HIGHEST_PROTOCOL (see [1]_ paragraph 12.1.2). The possible\n",
" | values for this parameter depend on the version of Python. For\n",
" | Python 2.x, possible values are 0, 1, 2. For Python>=3.0, 3 is a\n",
" | valid value. For Python >= 3.4, 4 is a valid value. A negative\n",
" | value for the protocol parameter is equivalent to setting its value\n",
" | to HIGHEST_PROTOCOL.\n",
" | \n",
" | .. [1] https://docs.python.org/3/library/pickle.html\n",
" | .. versionadded:: 0.21.0\n",
" | \n",
" | See Also\n",
" | --------\n",
" | read_pickle : Load pickled pandas object (or any object) from file.\n",
" | DataFrame.to_hdf : Write DataFrame to an HDF5 file.\n",
" | DataFrame.to_sql : Write DataFrame to a SQL database.\n",
" | DataFrame.to_parquet : Write a DataFrame to the binary parquet format.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> original_df = pd.DataFrame({\"foo\": range(5), \"bar\": range(5, 10)})\n",
" | >>> original_df\n",
" | foo bar\n",
" | 0 0 5\n",
" | 1 1 6\n",
" | 2 2 7\n",
" | 3 3 8\n",
" | 4 4 9\n",
" | >>> original_df.to_pickle(\"./dummy.pkl\")\n",
" | \n",
" | >>> unpickled_df = pd.read_pickle(\"./dummy.pkl\")\n",
" | >>> unpickled_df\n",
" | foo bar\n",
" | 0 0 5\n",
" | 1 1 6\n",
" | 2 2 7\n",
" | 3 3 8\n",
" | 4 4 9\n",
" | \n",
" | >>> import os\n",
" | >>> os.remove(\"./dummy.pkl\")\n",
" | \n",
" | to_sql(self, name, con, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None, method=None)\n",
" | Write records stored in a DataFrame to a SQL database.\n",
" | \n",
" | Databases supported by SQLAlchemy [1]_ are supported. Tables can be\n",
" | newly created, appended to, or overwritten.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | name : string\n",
" | Name of SQL table.\n",
" | con : sqlalchemy.engine.Engine or sqlite3.Connection\n",
" | Using SQLAlchemy makes it possible to use any DB supported by that\n",
" | library. Legacy support is provided for sqlite3.Connection objects.\n",
" | schema : string, optional\n",
" | Specify the schema (if database flavor supports this). If None, use\n",
" | default schema.\n",
" | if_exists : {'fail', 'replace', 'append'}, default 'fail'\n",
" | How to behave if the table already exists.\n",
" | \n",
" | * fail: Raise a ValueError.\n",
" | * replace: Drop the table before inserting new values.\n",
" | * append: Insert new values to the existing table.\n",
" | \n",
" | index : bool, default True\n",
" | Write DataFrame index as a column. Uses `index_label` as the column\n",
" | name in the table.\n",
" | index_label : string or sequence, default None\n",
" | Column label for index column(s). If None is given (default) and\n",
" | `index` is True, then the index names are used.\n",
" | A sequence should be given if the DataFrame uses MultiIndex.\n",
" | chunksize : int, optional\n",
" | Rows will be written in batches of this size at a time. By default,\n",
" | all rows will be written at once.\n",
" | dtype : dict, optional\n",
" | Specifying the datatype for columns. The keys should be the column\n",
" | names and the values should be the SQLAlchemy types or strings for\n",
" | the sqlite3 legacy mode.\n",
" | method : {None, 'multi', callable}, default None\n",
" | Controls the SQL insertion clause used:\n",
" | \n",
" | * None : Uses standard SQL ``INSERT`` clause (one per row).\n",
" | * 'multi': Pass multiple values in a single ``INSERT`` clause.\n",
" | * callable with signature ``(pd_table, conn, keys, data_iter)``.\n",
" | \n",
" | Details and a sample callable implementation can be found in the\n",
" | section :ref:`insert method <io.sql.method>`.\n",
" | \n",
" | .. versionadded:: 0.24.0\n",
" | \n",
" | Raises\n",
" | ------\n",
" | ValueError\n",
" | When the table already exists and `if_exists` is 'fail' (the\n",
" | default).\n",
" | \n",
" | See Also\n",
" | --------\n",
" | read_sql : Read a DataFrame from a table.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | Timezone aware datetime columns will be written as\n",
" | ``Timestamp with timezone`` type with SQLAlchemy if supported by the\n",
" | database. Otherwise, the datetimes will be stored as timezone unaware\n",
" | timestamps local to the original timezone.\n",
" | \n",
" | .. versionadded:: 0.24.0\n",
" | \n",
" | References\n",
" | ----------\n",
" | .. [1] http://docs.sqlalchemy.org\n",
" | .. [2] https://www.python.org/dev/peps/pep-0249/\n",
" | \n",
" | Examples\n",
" | --------\n",
" | \n",
" | Create an in-memory SQLite database.\n",
" | \n",
" | >>> from sqlalchemy import create_engine\n",
" | >>> engine = create_engine('sqlite://', echo=False)\n",
" | \n",
" | Create a table from scratch with 3 rows.\n",
" | \n",
" | >>> df = pd.DataFrame({'name' : ['User 1', 'User 2', 'User 3']})\n",
" | >>> df\n",
" | name\n",
" | 0 User 1\n",
" | 1 User 2\n",
" | 2 User 3\n",
" | \n",
" | >>> df.to_sql('users', con=engine)\n",
" | >>> engine.execute(\"SELECT * FROM users\").fetchall()\n",
" | [(0, 'User 1'), (1, 'User 2'), (2, 'User 3')]\n",
" | \n",
" | >>> df1 = pd.DataFrame({'name' : ['User 4', 'User 5']})\n",
" | >>> df1.to_sql('users', con=engine, if_exists='append')\n",
" | >>> engine.execute(\"SELECT * FROM users\").fetchall()\n",
" | [(0, 'User 1'), (1, 'User 2'), (2, 'User 3'),\n",
" | (0, 'User 4'), (1, 'User 5')]\n",
" | \n",
" | Overwrite the table with just ``df1``.\n",
" | \n",
" | >>> df1.to_sql('users', con=engine, if_exists='replace',\n",
" | ... index_label='id')\n",
" | >>> engine.execute(\"SELECT * FROM users\").fetchall()\n",
" | [(0, 'User 4'), (1, 'User 5')]\n",
" | \n",
" | Specify the dtype (especially useful for integers with missing values).\n",
" | Notice that while pandas is forced to store the data as floating point,\n",
" | the database supports nullable integers. When fetching the data with\n",
" | Python, we get back integer scalars.\n",
" | \n",
" | >>> df = pd.DataFrame({\"A\": [1, None, 2]})\n",
" | >>> df\n",
" | A\n",
" | 0 1.0\n",
" | 1 NaN\n",
" | 2 2.0\n",
" | \n",
" | >>> from sqlalchemy.types import Integer\n",
" | >>> df.to_sql('integers', con=engine, index=False,\n",
" | ... dtype={\"A\": Integer()})\n",
" | \n",
" | >>> engine.execute(\"SELECT * FROM integers\").fetchall()\n",
" | [(1,), (None,), (2,)]\n",
" | \n",
" | to_xarray(self)\n",
" | Return an xarray object from the pandas object.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | xarray.DataArray or xarray.Dataset\n",
" | Data in the pandas structure converted to Dataset if the object is\n",
" | a DataFrame, or a DataArray if the object is a Series.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | DataFrame.to_hdf : Write DataFrame to an HDF5 file.\n",
" | DataFrame.to_parquet : Write a DataFrame to the binary parquet format.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | See the `xarray docs <http://xarray.pydata.org/en/stable/>`__\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> df = pd.DataFrame([('falcon', 'bird', 389.0, 2),\n",
" | ... ('parrot', 'bird', 24.0, 2),\n",
" | ... ('lion', 'mammal', 80.5, 4),\n",
" | ... ('monkey', 'mammal', np.nan, 4)],\n",
" | ... columns=['name', 'class', 'max_speed',\n",
" | ... 'num_legs'])\n",
" | >>> df\n",
" | name class max_speed num_legs\n",
" | 0 falcon bird 389.0 2\n",
" | 1 parrot bird 24.0 2\n",
" | 2 lion mammal 80.5 4\n",
" | 3 monkey mammal NaN 4\n",
" | \n",
" | >>> df.to_xarray()\n",
" | <xarray.Dataset>\n",
" | Dimensions: (index: 4)\n",
" | Coordinates:\n",
" | * index (index) int64 0 1 2 3\n",
" | Data variables:\n",
" | name (index) object 'falcon' 'parrot' 'lion' 'monkey'\n",
" | class (index) object 'bird' 'bird' 'mammal' 'mammal'\n",
" | max_speed (index) float64 389.0 24.0 80.5 nan\n",
" | num_legs (index) int64 2 2 4 4\n",
" | \n",
" | >>> df['max_speed'].to_xarray()\n",
" | <xarray.DataArray 'max_speed' (index: 4)>\n",
" | array([389. , 24. , 80.5, nan])\n",
" | Coordinates:\n",
" | * index (index) int64 0 1 2 3\n",
" | \n",
" | >>> dates = pd.to_datetime(['2018-01-01', '2018-01-01',\n",
" | ... '2018-01-02', '2018-01-02'])\n",
" | >>> df_multiindex = pd.DataFrame({'date': dates,\n",
" | ... 'animal': ['falcon', 'parrot', 'falcon',\n",
" | ... 'parrot'],\n",
" | ... 'speed': [350, 18, 361, 15]}).set_index(['date',\n",
" | ... 'animal'])\n",
" | >>> df_multiindex\n",
" | speed\n",
" | date animal\n",
" | 2018-01-01 falcon 350\n",
" | parrot 18\n",
" | 2018-01-02 falcon 361\n",
" | parrot 15\n",
" | \n",
" | >>> df_multiindex.to_xarray()\n",
" | <xarray.Dataset>\n",
" | Dimensions: (animal: 2, date: 2)\n",
" | Coordinates:\n",
" | * date (date) datetime64[ns] 2018-01-01 2018-01-02\n",
" | * animal (animal) object 'falcon' 'parrot'\n",
" | Data variables:\n",
" | speed (date, animal) int64 350 18 361 15\n",
" | \n",
" | truncate(self, before=None, after=None, axis=None, copy=True)\n",
" | Truncate a Series or DataFrame before and after some index value.\n",
" | \n",
" | This is a useful shorthand for boolean indexing based on index\n",
" | values above or below certain thresholds.\n",
" | \n",
" | Parameters\n",
" | ----------\n",
" | before : date, string, int\n",
" | Truncate all rows before this index value.\n",
" | after : date, string, int\n",
" | Truncate all rows after this index value.\n",
" | axis : {0 or 'index', 1 or 'columns'}, optional\n",
" | Axis to truncate. Truncates the index (rows) by default.\n",
" | copy : boolean, default is True\n",
" | Return a copy of the truncated section.\n",
" | \n",
" | Returns\n",
" | -------\n",
" | type of caller\n",
" | The truncated Series or DataFrame.\n",
" | \n",
" | See Also\n",
" | --------\n",
" | DataFrame.loc : Select a subset of a DataFrame by label.\n",
" | DataFrame.iloc : Select a subset of a DataFrame by position.\n",
" | \n",
" | Notes\n",
" | -----\n",
" | If the index being truncated contains only datetime values,\n",
" | `before` and `after` may be specified as strings instead of\n",
" | Timestamps.\n",
" | \n",
" | Examples\n",
" | --------\n",
" | >>> df = pd.DataFrame({'A': ['a', 'b', 'c', 'd', 'e'],\n",
" | ... 'B': ['f', 'g', 'h', 'i', 'j'],\n",
" | ... 'C': ['k', 'l', 'm', 'n', 'o']},\n",
" | ... index=[1, 2, 3, 4, 5])\n",
" | >>> df\n",
" | A B C\n",
" | 1 a f k\n",
" | 2 b g l\n",
" | 3 c h m\n",
" | 4 d i n\n",
" | 5 e j o\n",
" | \n",
" | >>> df.truncate(before=2, after=4)\n",
" | A B C\n",
" | 2 b g l\n",
" | 3 c h m\n",
" | 4 d i n\n",
" | \n",
" | The columns of a DataFrame can be truncated.\n",
" | \n",
" | >>> df.truncate(before=\"A\", after=\"B\", axis=\"columns\")\n",
" | A B\n",
" | 1 a f\n",
" | 2 b g\n",
" | 3 c h\n",
" | 4 d i\n",
" | 5 e j\n",
" | \n",
" | For Series, only rows can be truncated.\n",
" | \n",
" | >>> df['A'].truncate(before=2, after=4)\n",
" | 2 b\n",
" | 3 c\n",
" | 4 d\n",
" | Name: A, dtype: object\n",
" | \n",
" | The index values in ``truncate`` can be datetimes or string\n",
" | dates.\n",
" | \n",
" | >>> dates = pd.date_range('2016-01-01', '2016-02-01', freq='s')\n",
" | >>> df = pd.DataFrame(index=dates, data={'A': 1})\n",
" | >>> df.tail()\n",
" | A\n",
" | 2016-01-31 23:59:56 1\n",
" | 2016-01-31 23:59:57 1\n",
" | 2016-01-31 23:59:58 1\n",
" | 2016-01-31 23:59:59 1\n",
" | 2016-02-01 00:00:00 1\n",
" | \n",
" | >>> df.truncate(before=pd.Timestamp('2016-01-05'),\n",
" | ... after=pd.Timestamp('2016-01-10')).tail()\n",
" | A\n",
" | 2016-01-09 23:59:56 1\n",
" | 2016-01-09 23:59:57 1\n",
" | 2016-01-09 23:59:58 1\n",
" | 2016-01-09 23:59:59 1\n",
" | 2016-01-10 00:00:00 1\n",
" | \n",
" | Because the index is a DatetimeIndex containing only dates, we can\n",
" | specify `before` and `after` as strings. They will be coerced to\n",
" | Timestamps before truncation.\n",
" | \n",
" | >>> df.truncate('2016-01-05', '2016-01-10').tail()\n",
" | A\n",
" | 2016-01-09 23:59:56 1\n",
" | 2016-01-09 23:59:57 1\n",
" | 2016-01-09 23:59:58 1\n",
" | 2016-01-09 23:59:59 1\n",
" | 2016-01-10 00:00:00 1\n",
" | \n",
" | Note that ``truncate`` assumes a 0 value for any unspecified time\n",
" | component (midnight). This differs from partial string slicing, which\n",
" | returns any partially matching dates.\n",
" | \n",
" | >>> df.loc['2016-01-05':'2016-01-10', :].tail()\n",
" | A\n",
" | 2016-01-10 23:59:55 1\n",
" | 2016-01-10 23:59:56 1\n",
" | 2016-01-10 23:59:57 1\n",
" | 2016-01-10 23:59:58 1\n",
" | 2016-01-10 23:59:59 1\n",
" | \n",
" | tshift(self, periods=1, freq=None, axis=0)\n",
|
||
|
" | Shift the time index, using the index's frequency if available.\n",
|
||
|
" | \n",
|
||
|
" | Parameters\n",
|
||
|
" | ----------\n",
|
||
|
" | periods : int\n",
|
||
|
" | Number of periods to move, can be positive or negative\n",
|
||
|
" | freq : DateOffset, timedelta, or time rule string, default None\n",
|
||
|
" | Increment to use from the tseries module or time rule (e.g. 'EOM')\n",
|
||
|
" | axis : int or basestring\n",
|
||
|
" | Corresponds to the axis that contains the Index\n",
|
||
|
" | \n",
|
||
|
" | Returns\n",
|
||
|
" | -------\n",
|
||
|
" | shifted : NDFrame\n",
|
||
|
" | \n",
|
||
|
" | Notes\n",
|
||
|
" | -----\n",
|
||
|
" | If freq is not specified then tries to use the freq or inferred_freq\n",
|
||
|
" | attributes of the index. If neither of those attributes exist, a\n",
|
||
|
" | ValueError is thrown\n",
|
||
|
" | \n",
|
||
|
" |  tz_convert(self, tz, axis=0, level=None, copy=True)\n",
" |      Convert tz-aware axis to target time zone.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      tz : string or pytz.timezone object\n",
" |      axis : the axis to convert\n",
" |      level : int, str, default None\n",
" |          If axis is a MultiIndex, convert a specific level. Otherwise\n",
" |          must be None\n",
" |      copy : boolean, default True\n",
" |          Also make a copy of the underlying data\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      \n",
" |      Raises\n",
" |      ------\n",
" |      TypeError\n",
" |          If the axis is tz-naive.\n",
" |      \n",
" |  tz_localize(self, tz, axis=0, level=None, copy=True, ambiguous='raise', nonexistent='raise')\n",
" |      Localize tz-naive index of a Series or DataFrame to target time zone.\n",
" |      \n",
" |      This operation localizes the Index. To localize the values in a\n",
" |      timezone-naive Series, use :meth:`Series.dt.tz_localize`.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      tz : string or pytz.timezone object\n",
" |      axis : the axis to localize\n",
" |      level : int, str, default None\n",
" |          If axis is a MultiIndex, localize a specific level. Otherwise\n",
" |          must be None\n",
" |      copy : boolean, default True\n",
" |          Also make a copy of the underlying data\n",
" |      ambiguous : 'infer', bool-ndarray, 'NaT', default 'raise'\n",
" |          When clocks moved backward due to DST, ambiguous times may arise.\n",
" |          For example in Central European Time (UTC+01), when going from\n",
" |          03:00 DST to 02:00 non-DST, 02:30:00 local time occurs both at\n",
" |          00:30:00 UTC and at 01:30:00 UTC. In such a situation, the\n",
" |          `ambiguous` parameter dictates how ambiguous times should be\n",
" |          handled.\n",
" |      \n",
" |          - 'infer' will attempt to infer fall dst-transition hours based on\n",
" |            order\n",
" |          - bool-ndarray where True signifies a DST time, False designates\n",
" |            a non-DST time (note that this flag is only applicable for\n",
" |            ambiguous times)\n",
" |          - 'NaT' will return NaT where there are ambiguous times\n",
" |          - 'raise' will raise an AmbiguousTimeError if there are ambiguous\n",
" |            times\n",
" |      nonexistent : str, default 'raise'\n",
" |          A nonexistent time does not exist in a particular timezone\n",
" |          where clocks moved forward due to DST. Valid values are:\n",
" |      \n",
" |          - 'shift_forward' will shift the nonexistent time forward to the\n",
" |            closest existing time\n",
" |          - 'shift_backward' will shift the nonexistent time backward to the\n",
" |            closest existing time\n",
" |          - 'NaT' will return NaT where there are nonexistent times\n",
" |          - timedelta objects will shift nonexistent times by the timedelta\n",
" |          - 'raise' will raise a NonExistentTimeError if there are\n",
" |            nonexistent times\n",
" |      \n",
" |          .. versionadded:: 0.24.0\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      Series or DataFrame\n",
" |          Same type as the input.\n",
" |      \n",
" |      Raises\n",
" |      ------\n",
" |      TypeError\n",
" |          If the TimeSeries is tz-aware and tz is not None.\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      \n",
" |      Localize local times:\n",
" |      \n",
" |      >>> s = pd.Series([1],\n",
" |      ... index=pd.DatetimeIndex(['2018-09-15 01:30:00']))\n",
" |      >>> s.tz_localize('CET')\n",
" |      2018-09-15 01:30:00+02:00    1\n",
" |      dtype: int64\n",
" |      \n",
" |      Be careful with DST changes. When there is sequential data, pandas\n",
" |      can infer the DST time:\n",
" |      \n",
" |      >>> s = pd.Series(range(7), index=pd.DatetimeIndex([\n",
" |      ... '2018-10-28 01:30:00',\n",
" |      ... '2018-10-28 02:00:00',\n",
" |      ... '2018-10-28 02:30:00',\n",
" |      ... '2018-10-28 02:00:00',\n",
" |      ... '2018-10-28 02:30:00',\n",
" |      ... '2018-10-28 03:00:00',\n",
" |      ... '2018-10-28 03:30:00']))\n",
" |      >>> s.tz_localize('CET', ambiguous='infer')\n",
" |      2018-10-28 01:30:00+02:00    0\n",
" |      2018-10-28 02:00:00+02:00    1\n",
" |      2018-10-28 02:30:00+02:00    2\n",
" |      2018-10-28 02:00:00+01:00    3\n",
" |      2018-10-28 02:30:00+01:00    4\n",
" |      2018-10-28 03:00:00+01:00    5\n",
" |      2018-10-28 03:30:00+01:00    6\n",
" |      dtype: int64\n",
" |      \n",
" |      In some cases, inferring the DST is impossible. In such cases, you can\n",
" |      pass an ndarray to the ambiguous parameter to set the DST explicitly\n",
" |      \n",
" |      >>> s = pd.Series(range(3), index=pd.DatetimeIndex([\n",
" |      ... '2018-10-28 01:20:00',\n",
" |      ... '2018-10-28 02:36:00',\n",
" |      ... '2018-10-28 03:46:00']))\n",
" |      >>> s.tz_localize('CET', ambiguous=np.array([True, True, False]))\n",
" |      2018-10-28 01:20:00+02:00    0\n",
" |      2018-10-28 02:36:00+02:00    1\n",
" |      2018-10-28 03:46:00+01:00    2\n",
" |      dtype: int64\n",
" |      \n",
" |      If the DST transition causes nonexistent times, you can shift these\n",
" |      dates forward or backwards with a timedelta object or `'shift_forward'`\n",
" |      or `'shift_backward'`.\n",
" |      >>> s = pd.Series(range(2), index=pd.DatetimeIndex([\n",
" |      ... '2015-03-29 02:30:00',\n",
" |      ... '2015-03-29 03:30:00']))\n",
" |      >>> s.tz_localize('Europe/Warsaw', nonexistent='shift_forward')\n",
" |      2015-03-29 03:00:00+02:00    0\n",
" |      2015-03-29 03:30:00+02:00    1\n",
" |      dtype: int64\n",
" |      >>> s.tz_localize('Europe/Warsaw', nonexistent='shift_backward')\n",
" |      2015-03-29 01:59:59.999999999+01:00    0\n",
" |      2015-03-29 03:30:00+02:00              1\n",
" |      dtype: int64\n",
" |      >>> s.tz_localize('Europe/Warsaw', nonexistent=pd.Timedelta('1H'))\n",
" |      2015-03-29 03:30:00+02:00    0\n",
" |      2015-03-29 03:30:00+02:00    1\n",
" |      dtype: int64\n",
" |      \n",
" |  where(self, cond, other=nan, inplace=False, axis=None, level=None, errors='raise', try_cast=False, raise_on_error=None)\n",
" |      Replace values where the condition is False.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      cond : boolean NDFrame, array-like, or callable\n",
" |          Where `cond` is True, keep the original value. Where\n",
" |          False, replace with corresponding value from `other`.\n",
" |          If `cond` is callable, it is computed on the NDFrame and\n",
" |          should return boolean NDFrame or array. The callable must\n",
" |          not change input NDFrame (though pandas doesn't check it).\n",
" |      \n",
" |          .. versionadded:: 0.18.1\n",
" |              A callable can be used as cond.\n",
" |      \n",
" |      other : scalar, NDFrame, or callable\n",
" |          Entries where `cond` is False are replaced with\n",
" |          corresponding value from `other`.\n",
" |          If other is callable, it is computed on the NDFrame and\n",
" |          should return scalar or NDFrame. The callable must not\n",
" |          change input NDFrame (though pandas doesn't check it).\n",
" |      \n",
" |          .. versionadded:: 0.18.1\n",
" |              A callable can be used as other.\n",
" |      \n",
" |      inplace : boolean, default False\n",
" |          Whether to perform the operation in place on the data.\n",
" |      axis : int, default None\n",
" |          Alignment axis if needed.\n",
" |      level : int, default None\n",
" |          Alignment level if needed.\n",
" |      errors : str, {'raise', 'ignore'}, default `raise`\n",
" |          Note that currently this parameter won't affect\n",
" |          the results and will always coerce to a suitable dtype.\n",
" |      \n",
" |          - `raise` : allow exceptions to be raised.\n",
" |          - `ignore` : suppress exceptions. On error return original object.\n",
" |      \n",
" |      try_cast : boolean, default False\n",
" |          Try to cast the result back to the input type (if possible).\n",
" |      raise_on_error : boolean, default True\n",
" |          Whether to raise on invalid data types (e.g. trying to where on\n",
" |          strings).\n",
" |      \n",
" |          .. deprecated:: 0.21.0\n",
" |      \n",
" |             Use `errors`.\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      wh : same type as caller\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      :func:`DataFrame.mask` : Return an object of same shape as\n",
" |          self.\n",
" |      \n",
" |      Notes\n",
" |      -----\n",
" |      The where method is an application of the if-then idiom. For each\n",
" |      element in the calling DataFrame, if ``cond`` is ``True`` the\n",
" |      element is used; otherwise the corresponding element from the DataFrame\n",
" |      ``other`` is used.\n",
" |      \n",
" |      The signature for :func:`DataFrame.where` differs from\n",
" |      :func:`numpy.where`. Roughly ``df1.where(m, df2)`` is equivalent to\n",
" |      ``np.where(m, df1, df2)``.\n",
" |      \n",
" |      For further details and examples see the ``where`` documentation in\n",
" |      :ref:`indexing <indexing.where_mask>`.\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      >>> s = pd.Series(range(5))\n",
" |      >>> s.where(s > 0)\n",
" |      0    NaN\n",
" |      1    1.0\n",
" |      2    2.0\n",
" |      3    3.0\n",
" |      4    4.0\n",
" |      dtype: float64\n",
" |      \n",
" |      >>> s.mask(s > 0)\n",
" |      0    0.0\n",
" |      1    NaN\n",
" |      2    NaN\n",
" |      3    NaN\n",
" |      4    NaN\n",
" |      dtype: float64\n",
" |      \n",
" |      >>> s.where(s > 1, 10)\n",
" |      0    10\n",
" |      1    10\n",
" |      2    2\n",
" |      3    3\n",
" |      4    4\n",
" |      dtype: int64\n",
" |      \n",
" |      >>> df = pd.DataFrame(np.arange(10).reshape(-1, 2), columns=['A', 'B'])\n",
" |      >>> m = df % 3 == 0\n",
" |      >>> df.where(m, -df)\n",
" |         A  B\n",
" |      0  0 -1\n",
" |      1 -2  3\n",
" |      2 -4 -5\n",
" |      3  6 -7\n",
" |      4 -8  9\n",
" |      >>> df.where(m, -df) == np.where(m, df, -df)\n",
" |            A     B\n",
" |      0  True  True\n",
" |      1  True  True\n",
" |      2  True  True\n",
" |      3  True  True\n",
" |      4  True  True\n",
" |      >>> df.where(m, -df) == df.mask(~m, -df)\n",
" |            A     B\n",
" |      0  True  True\n",
" |      1  True  True\n",
" |      2  True  True\n",
" |      3  True  True\n",
" |      4  True  True\n",
" |      \n",
" |  xs(self, key, axis=0, level=None, drop_level=True)\n",
" |      Return cross-section from the Series/DataFrame.\n",
" |      \n",
" |      This method takes a `key` argument to select data at a particular\n",
" |      level of a MultiIndex.\n",
" |      \n",
" |      Parameters\n",
" |      ----------\n",
" |      key : label or tuple of label\n",
" |          Label contained in the index, or partially in a MultiIndex.\n",
" |      axis : {0 or 'index', 1 or 'columns'}, default 0\n",
" |          Axis to retrieve cross-section on.\n",
" |      level : object, defaults to first n levels (n=1 or len(key))\n",
" |          In case of a key partially contained in a MultiIndex, indicate\n",
" |          which levels are used. Levels can be referred by label or position.\n",
" |      drop_level : bool, default True\n",
" |          If False, returns object with same levels as self.\n",
" |      \n",
" |      Returns\n",
" |      -------\n",
" |      Series or DataFrame\n",
" |          Cross-section from the original Series or DataFrame\n",
" |          corresponding to the selected index levels.\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      DataFrame.loc : Access a group of rows and columns\n",
" |          by label(s) or a boolean array.\n",
" |      DataFrame.iloc : Purely integer-location based indexing\n",
" |          for selection by position.\n",
" |      \n",
" |      Notes\n",
" |      -----\n",
" |      `xs` can not be used to set values.\n",
" |      \n",
" |      MultiIndex Slicers is a generic way to get/set values on\n",
" |      any level or levels.\n",
" |      It is a superset of `xs` functionality, see\n",
" |      :ref:`MultiIndex Slicers <advanced.mi_slicers>`.\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      >>> d = {'num_legs': [4, 4, 2, 2],\n",
" |      ...      'num_wings': [0, 0, 2, 2],\n",
" |      ...      'class': ['mammal', 'mammal', 'mammal', 'bird'],\n",
" |      ...      'animal': ['cat', 'dog', 'bat', 'penguin'],\n",
" |      ...      'locomotion': ['walks', 'walks', 'flies', 'walks']}\n",
" |      >>> df = pd.DataFrame(data=d)\n",
" |      >>> df = df.set_index(['class', 'animal', 'locomotion'])\n",
" |      >>> df\n",
" |                                 num_legs  num_wings\n",
" |      class  animal  locomotion\n",
" |      mammal cat     walks              4          0\n",
" |             dog     walks              4          0\n",
" |             bat     flies              2          2\n",
" |      bird   penguin walks              2          2\n",
" |      \n",
" |      Get values at specified index\n",
" |      \n",
" |      >>> df.xs('mammal')\n",
" |                         num_legs  num_wings\n",
" |      animal locomotion\n",
" |      cat    walks              4          0\n",
" |      dog    walks              4          0\n",
" |      bat    flies              2          2\n",
" |      \n",
" |      Get values at several indexes\n",
" |      \n",
" |      >>> df.xs(('mammal', 'dog'))\n",
" |                  num_legs  num_wings\n",
" |      locomotion\n",
" |      walks              4          0\n",
" |      \n",
" |      Get values at specified index and level\n",
" |      \n",
" |      >>> df.xs('cat', level=1)\n",
" |                         num_legs  num_wings\n",
" |      class  locomotion\n",
" |      mammal walks              4          0\n",
" |      \n",
" |      Get values at several indexes and levels\n",
" |      \n",
" |      >>> df.xs(('bird', 'walks'),\n",
" |      ...       level=[0, 'locomotion'])\n",
" |               num_legs  num_wings\n",
" |      animal\n",
" |      penguin         2          2\n",
" |      \n",
" |      Get values at specified column and axis\n",
" |      \n",
" |      >>> df.xs('num_wings', axis=1)\n",
" |      class   animal   locomotion\n",
" |      mammal  cat      walks         0\n",
" |              dog      walks         0\n",
" |              bat      flies         2\n",
" |      bird    penguin  walks         2\n",
" |      Name: num_wings, dtype: int64\n",
" |      \n",
" |  ----------------------------------------------------------------------\n",
" |  Data descriptors inherited from pandas.core.generic.NDFrame:\n",
" |  \n",
" |  at\n",
" |      Access a single value for a row/column label pair.\n",
" |      \n",
" |      Similar to ``loc``, in that both provide label-based lookups. Use\n",
" |      ``at`` if you only need to get or set a single value in a DataFrame\n",
" |      or Series.\n",
" |      \n",
" |      Raises\n",
" |      ------\n",
" |      KeyError\n",
" |          When label does not exist in DataFrame\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      DataFrame.iat : Access a single value for a row/column pair by integer\n",
" |          position.\n",
" |      DataFrame.loc : Access a group of rows and columns by label(s).\n",
" |      Series.at : Access a single value using a label.\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      >>> df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],\n",
" |      ...                   index=[4, 5, 6], columns=['A', 'B', 'C'])\n",
" |      >>> df\n",
" |          A   B   C\n",
" |      4   0   2   3\n",
" |      5   0   4   1\n",
" |      6  10  20  30\n",
" |      \n",
" |      Get value at specified row/column pair\n",
" |      \n",
" |      >>> df.at[4, 'B']\n",
" |      2\n",
" |      \n",
" |      Set value at specified row/column pair\n",
" |      \n",
" |      >>> df.at[4, 'B'] = 10\n",
" |      >>> df.at[4, 'B']\n",
" |      10\n",
" |      \n",
" |      Get value within a Series\n",
" |      \n",
" |      >>> df.loc[5].at['B']\n",
" |      4\n",
" |  \n",
" |  blocks\n",
" |      Internal property, property synonym for as_blocks().\n",
" |      \n",
" |      .. deprecated:: 0.21.0\n",
" |  \n",
" |  iat\n",
" |      Access a single value for a row/column pair by integer position.\n",
" |      \n",
" |      Similar to ``iloc``, in that both provide integer-based lookups. Use\n",
" |      ``iat`` if you only need to get or set a single value in a DataFrame\n",
" |      or Series.\n",
" |      \n",
" |      Raises\n",
" |      ------\n",
" |      IndexError\n",
" |          When integer position is out of bounds\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      DataFrame.at : Access a single value for a row/column label pair.\n",
" |      DataFrame.loc : Access a group of rows and columns by label(s).\n",
" |      DataFrame.iloc : Access a group of rows and columns by integer position(s).\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      >>> df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],\n",
" |      ...                   columns=['A', 'B', 'C'])\n",
" |      >>> df\n",
" |          A   B   C\n",
" |      0   0   2   3\n",
" |      1   0   4   1\n",
" |      2  10  20  30\n",
" |      \n",
" |      Get value at specified row/column pair\n",
" |      \n",
" |      >>> df.iat[1, 2]\n",
" |      1\n",
" |      \n",
" |      Set value at specified row/column pair\n",
" |      \n",
" |      >>> df.iat[1, 2] = 10\n",
" |      >>> df.iat[1, 2]\n",
" |      10\n",
" |      \n",
" |      Get value within a series\n",
" |      \n",
" |      >>> df.loc[0].iat[1]\n",
" |      2\n",
" |  \n",
" |  iloc\n",
" |      Purely integer-location based indexing for selection by position.\n",
" |      \n",
" |      ``.iloc[]`` is primarily integer position based (from ``0`` to\n",
" |      ``length-1`` of the axis), but may also be used with a boolean\n",
" |      array.\n",
" |      \n",
" |      Allowed inputs are:\n",
" |      \n",
" |      - An integer, e.g. ``5``.\n",
" |      - A list or array of integers, e.g. ``[4, 3, 0]``.\n",
" |      - A slice object with ints, e.g. ``1:7``.\n",
" |      - A boolean array.\n",
" |      - A ``callable`` function with one argument (the calling Series, DataFrame\n",
" |        or Panel) and that returns valid output for indexing (one of the above).\n",
" |        This is useful in method chains, when you don't have a reference to the\n",
" |        calling object, but would like to base your selection on some value.\n",
" |      \n",
" |      ``.iloc`` will raise ``IndexError`` if a requested indexer is\n",
" |      out-of-bounds, except *slice* indexers which allow out-of-bounds\n",
" |      indexing (this conforms with python/numpy *slice* semantics).\n",
" |      \n",
" |      See more at :ref:`Selection by Position <indexing.integer>`.\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      DataFrame.iat : Fast integer location scalar accessor.\n",
" |      DataFrame.loc : Purely label-location based indexer for selection by label.\n",
" |      Series.iloc : Purely integer-location based indexing for\n",
" |          selection by position.\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      \n",
" |      >>> mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},\n",
" |      ...           {'a': 100, 'b': 200, 'c': 300, 'd': 400},\n",
" |      ...           {'a': 1000, 'b': 2000, 'c': 3000, 'd': 4000 }]\n",
" |      >>> df = pd.DataFrame(mydict)\n",
" |      >>> df\n",
" |            a     b     c     d\n",
" |      0     1     2     3     4\n",
" |      1   100   200   300   400\n",
" |      2  1000  2000  3000  4000\n",
" |      \n",
" |      **Indexing just the rows**\n",
" |      \n",
" |      With a scalar integer.\n",
" |      \n",
" |      >>> type(df.iloc[0])\n",
" |      <class 'pandas.core.series.Series'>\n",
" |      >>> df.iloc[0]\n",
" |      a    1\n",
" |      b    2\n",
" |      c    3\n",
" |      d    4\n",
" |      Name: 0, dtype: int64\n",
" |      \n",
" |      With a list of integers.\n",
" |      \n",
" |      >>> df.iloc[[0]]\n",
" |         a  b  c  d\n",
" |      0  1  2  3  4\n",
" |      >>> type(df.iloc[[0]])\n",
" |      <class 'pandas.core.frame.DataFrame'>\n",
" |      \n",
" |      >>> df.iloc[[0, 1]]\n",
" |           a    b    c    d\n",
" |      0    1    2    3    4\n",
" |      1  100  200  300  400\n",
" |      \n",
" |      With a `slice` object.\n",
" |      \n",
" |      >>> df.iloc[:3]\n",
" |            a     b     c     d\n",
" |      0     1     2     3     4\n",
" |      1   100   200   300   400\n",
" |      2  1000  2000  3000  4000\n",
" |      \n",
" |      With a boolean mask the same length as the index.\n",
" |      \n",
" |      >>> df.iloc[[True, False, True]]\n",
" |            a     b     c     d\n",
" |      0     1     2     3     4\n",
" |      2  1000  2000  3000  4000\n",
" |      \n",
" |      With a callable, useful in method chains. The `x` passed\n",
" |      to the ``lambda`` is the DataFrame being sliced. This selects\n",
" |      the rows whose index label is even.\n",
" |      \n",
" |      >>> df.iloc[lambda x: x.index % 2 == 0]\n",
" |            a     b     c     d\n",
" |      0     1     2     3     4\n",
" |      2  1000  2000  3000  4000\n",
" |      \n",
" |      **Indexing both axes**\n",
" |      \n",
" |      You can mix the indexer types for the index and columns. Use ``:`` to\n",
" |      select the entire axis.\n",
" |      \n",
" |      With scalar integers.\n",
" |      \n",
" |      >>> df.iloc[0, 1]\n",
" |      2\n",
" |      \n",
" |      With lists of integers.\n",
" |      \n",
" |      >>> df.iloc[[0, 2], [1, 3]]\n",
" |            b     d\n",
" |      0     2     4\n",
" |      2  2000  4000\n",
" |      \n",
" |      With `slice` objects.\n",
" |      \n",
" |      >>> df.iloc[1:3, 0:3]\n",
" |            a     b     c\n",
" |      1   100   200   300\n",
" |      2  1000  2000  3000\n",
" |      \n",
" |      With a boolean array whose length matches the columns.\n",
" |      \n",
" |      >>> df.iloc[:, [True, False, True, False]]\n",
" |            a     c\n",
" |      0     1     3\n",
" |      1   100   300\n",
" |      2  1000  3000\n",
" |      \n",
" |      With a callable function that expects the Series or DataFrame.\n",
" |      \n",
" |      >>> df.iloc[:, lambda df: [0, 2]]\n",
" |            a     c\n",
" |      0     1     3\n",
" |      1   100   300\n",
" |      2  1000  3000\n",
" |  \n",
" |  is_copy\n",
" |      Return the copy.\n",
" |  \n",
" |  ix\n",
" |      A primarily label-location based indexer, with integer position\n",
" |      fallback.\n",
" |      \n",
" |      Warning: Starting in 0.20.0, the .ix indexer is deprecated, in\n",
" |      favor of the more strict .iloc and .loc indexers.\n",
" |      \n",
" |      ``.ix[]`` supports mixed integer and label based access. It is\n",
" |      primarily label based, but will fall back to integer positional\n",
" |      access unless the corresponding axis is of integer type.\n",
" |      \n",
" |      ``.ix`` is the most general indexer and will support any of the\n",
" |      inputs in ``.loc`` and ``.iloc``. ``.ix`` also supports floating\n",
" |      point label schemes. ``.ix`` is exceptionally useful when dealing\n",
" |      with mixed positional and label based hierarchical indexes.\n",
" |      \n",
" |      However, when an axis is integer based, ONLY label based access\n",
" |      and not positional access is supported. Thus, in such cases, it's\n",
" |      usually better to be explicit and use ``.iloc`` or ``.loc``.\n",
" |      \n",
" |      See more at :ref:`Advanced Indexing <advanced>`.\n",
" |  \n",
" |  loc\n",
" |      Access a group of rows and columns by label(s) or a boolean array.\n",
" |      \n",
" |      ``.loc[]`` is primarily label based, but may also be used with a\n",
" |      boolean array.\n",
" |      \n",
" |      Allowed inputs are:\n",
" |      \n",
" |      - A single label, e.g. ``5`` or ``'a'``, (note that ``5`` is\n",
" |        interpreted as a *label* of the index, and **never** as an\n",
" |        integer position along the index).\n",
" |      - A list or array of labels, e.g. ``['a', 'b', 'c']``.\n",
" |      - A slice object with labels, e.g. ``'a':'f'``.\n",
" |      \n",
" |        .. warning:: Note that contrary to usual python slices, **both** the\n",
" |            start and the stop are included\n",
" |      \n",
" |      - A boolean array of the same length as the axis being sliced,\n",
" |        e.g. ``[True, False, True]``.\n",
" |      - A ``callable`` function with one argument (the calling Series, DataFrame\n",
" |        or Panel) and that returns valid output for indexing (one of the above)\n",
" |      \n",
" |      See more at :ref:`Selection by Label <indexing.label>`\n",
" |      \n",
" |      Raises\n",
" |      ------\n",
" |      KeyError:\n",
" |          when any items are not found\n",
" |      \n",
" |      See Also\n",
" |      --------\n",
" |      DataFrame.at : Access a single value for a row/column label pair.\n",
" |      DataFrame.iloc : Access group of rows and columns by integer position(s).\n",
" |      DataFrame.xs : Returns a cross-section (row(s) or column(s)) from the\n",
" |          Series/DataFrame.\n",
" |      Series.loc : Access group of values using labels.\n",
" |      \n",
" |      Examples\n",
" |      --------\n",
" |      **Getting values**\n",
" |      \n",
" |      >>> df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],\n",
" |      ...      index=['cobra', 'viper', 'sidewinder'],\n",
" |      ...      columns=['max_speed', 'shield'])\n",
" |      >>> df\n",
" |                  max_speed  shield\n",
" |      cobra               1       2\n",
" |      viper               4       5\n",
" |      sidewinder          7       8\n",
" |      \n",
" |      Single label. Note this returns the row as a Series.\n",
" |      \n",
" |      >>> df.loc['viper']\n",
" |      max_speed    4\n",
" |      shield       5\n",
" |      Name: viper, dtype: int64\n",
" |      \n",
" |      List of labels. Note using ``[[]]`` returns a DataFrame.\n",
" |      \n",
" |      >>> df.loc[['viper', 'sidewinder']]\n",
" |                  max_speed  shield\n",
" |      viper               4       5\n",
" |      sidewinder          7       8\n",
" |      \n",
" |      Single label for row and column\n",
" |      \n",
" |      >>> df.loc['cobra', 'shield']\n",
" |      2\n",
" |      \n",
" |      Slice with labels for row and single label for column. As mentioned\n",
" |      above, note that both the start and stop of the slice are included.\n",
" |      \n",
" |      >>> df.loc['cobra':'viper', 'max_speed']\n",
" |      cobra    1\n",
" |      viper    4\n",
" |      Name: max_speed, dtype: int64\n",
" |      \n",
" |      Boolean list with the same length as the row axis\n",
" |      \n",
" |      >>> df.loc[[False, False, True]]\n",
" |                  max_speed  shield\n",
" |      sidewinder          7       8\n",
" |      \n",
" |      Conditional that returns a boolean Series\n",
" |      \n",
" |      >>> df.loc[df['shield'] > 6]\n",
" |                  max_speed  shield\n",
" |      sidewinder          7       8\n",
" |      \n",
" |      Conditional that returns a boolean Series with column labels specified\n",
" |      \n",
" |      >>> df.loc[df['shield'] > 6, ['max_speed']]\n",
" |                  max_speed\n",
" |      sidewinder          7\n",
" |      \n",
" |      Callable that returns a boolean Series\n",
" |      \n",
" |      >>> df.loc[lambda df: df['shield'] == 8]\n",
" |                  max_speed  shield\n",
" |      sidewinder          7       8\n",
" |      \n",
" |      **Setting values**\n",
" |      \n",
" |      Set value for all items matching the list of labels\n",
" |      \n",
" |      >>> df.loc[['viper', 'sidewinder'], ['shield']] = 50\n",
" |      >>> df\n",
" |                  max_speed  shield\n",
" |      cobra               1       2\n",
" |      viper               4      50\n",
" |      sidewinder          7      50\n",
" |      \n",
" |      Set value for an entire row\n",
" |      \n",
" |      >>> df.loc['cobra'] = 10\n",
" |      >>> df\n",
" |                  max_speed  shield\n",
" |      cobra              10      10\n",
" |      viper               4      50\n",
" |      sidewinder          7      50\n",
" |      \n",
" |      Set value for an entire column\n",
" |      \n",
" |      >>> df.loc[:, 'max_speed'] = 30\n",
" |      >>> df\n",
" |                  max_speed  shield\n",
" |      cobra              30      10\n",
" |      viper              30      50\n",
" |      sidewinder         30      50\n",
" |      \n",
" |      Set value for rows matching callable condition\n",
" |      \n",
" |      >>> df.loc[df['shield'] > 35] = 0\n",
" |      >>> df\n",
" |                  max_speed  shield\n",
" |      cobra              30      10\n",
" |      viper               0       0\n",
" |      sidewinder          0       0\n",
" |      \n",
" |      **Getting values on a DataFrame with an index that has integer labels**\n",
" |      \n",
" |      Another example using integers for the index\n",
" |      \n",
" |      >>> df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],\n",
" |      ...      index=[7, 8, 9], columns=['max_speed', 'shield'])\n",
" |      >>> df\n",
" |         max_speed  shield\n",
" |      7          1       2\n",
" |      8          4       5\n",
" |      9          7       8\n",
" |      \n",
" |      Slice with integer labels for rows. As mentioned above, note that both\n",
" |      the start and stop of the slice are included.\n",
" |      \n",
" |      >>> df.loc[7:9]\n",
" |         max_speed  shield\n",
" |      7          1       2\n",
" |      8          4       5\n",
" |      9          7       8\n",
" |      \n",
" |      **Getting values with a MultiIndex**\n",
" |      \n",
" |      A number of examples using a DataFrame with a MultiIndex\n",
" |      \n",
" |      >>> tuples = [\n",
" |      ...    ('cobra', 'mark i'), ('cobra', 'mark ii'),\n",
" |      ...    ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'),\n",
" |      ...    ('viper', 'mark ii'), ('viper', 'mark iii')\n",
" |      ... ]\n",
" |      >>> index = pd.MultiIndex.from_tuples(tuples)\n",
" |      >>> values = [[12, 2], [0, 4], [10, 20],\n",
" |      ...           [1, 4], [7, 1], [16, 36]]\n",
" |      >>> df = pd.DataFrame(values, columns=['max_speed', 'shield'], index=index)\n",
" |      >>> df\n",
" |                           max_speed  shield\n",
" |      cobra      mark i           12       2\n",
" |                 mark ii           0       4\n",
" |      sidewinder mark i           10      20\n",
" |                 mark ii           1       4\n",
" |      viper      mark ii           7       1\n",
" |                 mark iii         16      36\n",
" |      \n",
" |      Single label. Note this returns a DataFrame with a single index.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.loc['cobra']\n",
|
||
|
" | max_speed shield\n",
|
||
|
" | mark i 12 2\n",
|
||
|
" | mark ii 0 4\n",
|
||
|
" | \n",
|
||
|
" | Single index tuple. Note this returns a Series.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.loc[('cobra', 'mark ii')]\n",
|
||
|
" | max_speed 0\n",
|
||
|
" | shield 4\n",
|
||
|
" | Name: (cobra, mark ii), dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Single label for row and column. Similar to passing in a tuple, this\n",
|
||
|
" | returns a Series.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.loc['cobra', 'mark i']\n",
|
||
|
" | max_speed 12\n",
|
||
|
" | shield 2\n",
|
||
|
" | Name: (cobra, mark i), dtype: int64\n",
|
||
|
" | \n",
|
||
|
" | Single tuple. Note using ``[[]]`` returns a DataFrame.\n",
|
||
|
" | \n",
|
||
|
" | >>> df.loc[[('cobra', 'mark ii')]]\n",
|
||
|
" | max_speed shield\n",
|
||
|
" | cobra mark ii 0 4\n",
|
||
|
" | \n",
|
||
|
" | Single tuple for the index with a single label for the column\n",
|
||
|
" | \n",
|
||
|
" | >>> df.loc[('cobra', 'mark i'), 'shield']\n",
|
||
|
" | 2\n",
|
||
|
" | \n",
|
||
|
" | Slice from index tuple to single label\n",
|
||
|
" | \n",
|
||
|
" | >>> df.loc[('cobra', 'mark i'):'viper']\n",
|
||
|
" | max_speed shield\n",
|
||
|
" | cobra mark i 12 2\n",
|
||
|
" | mark ii 0 4\n",
|
||
|
" | sidewinder mark i 10 20\n",
|
||
|
" | mark ii 1 4\n",
|
||
|
" | viper mark ii 7 1\n",
|
||
|
" | mark iii 16 36\n",
|
||
|
" | \n",
|
||
|
" | Slice from index tuple to index tuple\n",
|
||
|
" | \n",
|
||
|
" | >>> df.loc[('cobra', 'mark i'):('viper', 'mark ii')]\n",
|
||
|
" | max_speed shield\n",
|
||
|
" | cobra mark i 12 2\n",
|
||
|
" | mark ii 0 4\n",
|
||
|
" | sidewinder mark i 10 20\n",
|
||
|
" | mark ii 1 4\n",
|
||
|
" | viper mark ii 7 1\n",
|
||
|
" | \n",
|
||
|
" | ----------------------------------------------------------------------\n",
|
||
|
" | Data and other attributes inherited from pandas.core.generic.NDFrame:\n",
|
||
|
" | \n",
|
||
|
" | timetuple = None\n",
|
||
|
" | \n",
|
||
|
" | ----------------------------------------------------------------------\n",
|
||
|
" | Methods inherited from pandas.core.base.PandasObject:\n",
|
||
|
" | \n",
|
||
|
" | __sizeof__(self)\n",
|
||
|
" | Generates the total memory usage for an object that returns\n",
|
||
|
" | either a value or Series of values\n",
|
||
|
" | \n",
|
||
|
" | ----------------------------------------------------------------------\n",
|
||
|
" | Methods inherited from pandas.core.base.StringMixin:\n",
|
||
|
" | \n",
|
||
|
" | __bytes__(self)\n",
|
||
|
" | Return a string representation for a particular object.\n",
|
||
|
" | \n",
|
||
|
" | Invoked by bytes(obj) in py3 only.\n",
|
||
|
" | Yields a bytestring in both py2/py3.\n",
|
||
|
" | \n",
|
||
|
" | __repr__(self)\n",
|
||
|
" | Return a string representation for a particular object.\n",
|
||
|
" | \n",
|
||
|
" | Yields Bytestring in Py2, Unicode String in py3.\n",
|
||
|
" | \n",
|
||
|
" | __str__(self)\n",
|
||
|
" | Return a string representation for a particular Object\n",
|
||
|
" | \n",
|
||
|
" | Invoked by str(df) in both py2/py3.\n",
|
||
|
" | Yields Bytestring in Py2, Unicode String in py3.\n",
|
||
|
" | \n",
|
||
|
" | ----------------------------------------------------------------------\n",
|
||
|
" | Methods inherited from pandas.core.accessor.DirNamesMixin:\n",
|
||
|
" | \n",
|
||
|
" | __dir__(self)\n",
|
||
|
" | Provide method name lookup and completion\n",
|
||
|
" | Only provide 'public' methods\n",
|
||
|
"\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"help(pd.Series)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"### Index and Data Lists\n",
|
||
|
"\n",
|
||
|
"We can create a Series from Python lists (also from NumPy arrays)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 14,
|
||
|
"metadata": {
|
||
|
"collapsed": true
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"myindex = ['USA','Canada','Mexico']"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 15,
|
||
|
"metadata": {
|
||
|
"collapsed": true
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"mydata = [1776,1867,1821]"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 16,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"myser = pd.Series(data=mydata)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 17,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"0 1776\n",
|
||
|
"1 1867\n",
|
||
|
"2 1821\n",
|
||
|
"dtype: int64"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 17,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"myser"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 18,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"USA 1776\n",
|
||
|
"Canada 1867\n",
|
||
|
"Mexico 1821\n",
|
||
|
"dtype: int64"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 18,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"pd.Series(data=mydata,index=myindex)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 23,
|
||
|
"metadata": {
|
||
|
"collapsed": true
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"ran_data = np.random.randint(0,100,4)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 24,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"array([39, 35, 37, 23])"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 24,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"ran_data"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 26,
|
||
|
"metadata": {
|
||
|
"collapsed": true
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"names = ['Andrew','Bobo','Claire','David']"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 27,
|
||
|
"metadata": {},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"ages = pd.Series(ran_data,names)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 28,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"Andrew 39\n",
|
||
|
"Bobo 35\n",
|
||
|
"Claire 37\n",
|
||
|
"David 23\n",
|
||
|
"dtype: int32"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 28,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"ages"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"### From a Dictionary"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 29,
|
||
|
"metadata": {
|
||
|
"collapsed": true
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"ages = {'Sammy':5,'Frank':10,'Spike':7}"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 30,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"{'Frank': 10, 'Sammy': 5, 'Spike': 7}"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 30,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"ages"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 31,
|
||
|
"metadata": {},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"Sammy 5\n",
|
||
|
"Frank 10\n",
|
||
|
"Spike 7\n",
|
||
|
"dtype: int64"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 31,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"pd.Series(ages)"
|
||
|
]
|
||
|
},
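  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A quick extra sketch: when building a Series from a dictionary you can also pass an explicit `index`. Pandas keeps only the labels you list, in that order, and fills any label missing from the dictionary with `NaN` (the `'Oscar'` label below is made up purely to show that fill)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Select and reorder labels; 'Oscar' is not in the dict, so it becomes NaN\n",
    "pd.Series(ages, index=['Spike', 'Sammy', 'Oscar'])"
   ]
  },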
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Key Ideas of a Series"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Named Index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Imaginary Sales Data for 1st and 2nd Quarters for Global Company\n",
    "q1 = {'Japan': 80, 'China': 450, 'India': 200, 'USA': 250}\n",
    "q2 = {'Brazil': 100,'China': 500, 'India': 210,'USA': 260}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Convert into Pandas Series\n",
    "sales_Q1 = pd.Series(q1)\n",
    "sales_Q2 = pd.Series(q2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Japan 80\n",
       "China 450\n",
       "India 200\n",
       "USA 250\n",
       "dtype: int64"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sales_Q1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "80"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Call values based on Named Index\n",
    "sales_Q1['Japan']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "80"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Integer-based location information is also retained!\n",
    "sales_Q1[0]"
   ]
  },
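  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of the explicit accessors: plain `[]` can be ambiguous once the labels themselves are integers, so pandas also offers `.loc` for label-based lookups and `.iloc` for position-based ones."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Explicit label-based lookup\n",
    "sales_Q1.loc['Japan']\n",
    "# Explicit position-based lookup (same value)\n",
    "sales_Q1.iloc[0]"
   ]
  },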
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Be careful with potential errors!**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Wrong Name\n",
    "# sales_Q1['France']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Accidental Extra Space\n",
    "# sales_Q1['USA ']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Capitalization Mistake\n",
    "# sales_Q1['usa']"
   ]
  },
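  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One way to sidestep a `KeyError` on a missing label is `Series.get`, which mirrors `dict.get`: it returns `None` (or a default you supply) instead of raising."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# .get() returns the default instead of raising a KeyError\n",
    "sales_Q1.get('France', 0)"
   ]
  },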
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Operations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['Japan', 'China', 'India', 'USA'], dtype='object')"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Grab just the index keys\n",
    "sales_Q1.keys()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Japan 160\n",
       "China 900\n",
       "India 400\n",
       "USA 500\n",
       "dtype: int64"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Operations are broadcast across the entire Series\n",
    "sales_Q1 * 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Brazil 1.0\n",
       "China 5.0\n",
       "India 2.1\n",
       "USA 2.6\n",
       "dtype: float64"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sales_Q2 / 100"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Between Series"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Brazil NaN\n",
       "China 950.0\n",
       "India 410.0\n",
       "Japan NaN\n",
       "USA 510.0\n",
       "dtype: float64"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Notice how pandas marks unmatched index labels with NaN\n",
    "sales_Q1 + sales_Q2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Brazil 100.0\n",
       "China 950.0\n",
       "India 410.0\n",
       "Japan 80.0\n",
       "USA 510.0\n",
       "dtype: float64"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# You can fill missing matches with any value you want\n",
    "sales_Q1.add(sales_Q2,fill_value=0)"
   ]
  },
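  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same `fill_value` idea works for the other arithmetic methods (`sub`, `mul`, `div`). A quick sketch with `sub`: treating a missing quarter as 0 makes Japan look like an 80-unit drop and Brazil like pure growth, so the fill value is a modeling choice, not just cosmetics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Quarter-over-quarter change; missing entries treated as 0\n",
    "sales_Q2.sub(sales_Q1, fill_value=0)"
   ]
  },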
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "That is all we need to know about Series. Up next: DataFrames!"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}