Commit
DOC: Remove computation.rst in favor of better docstrings (pandas-dev#46170)

* DOC: Remove computation.rst in favor of better docstrings:

* Remove other ref
mroeschke authored Feb 28, 2022
1 parent 367f8a1 commit 21a3b2f
Showing 9 changed files with 74 additions and 225 deletions.
212 changes: 0 additions & 212 deletions doc/source/user_guide/computation.rst

This file was deleted.

1 change: 0 additions & 1 deletion doc/source/user_guide/index.rst
@@ -76,7 +76,6 @@ Guides
boolean
visualization
style
computation
groupby
window
timeseries
14 changes: 10 additions & 4 deletions doc/source/user_guide/window.rst
@@ -427,10 +427,16 @@ can even be omitted:
.. note::

Missing values are ignored and each entry is computed using the pairwise
complete observations. Please see the :ref:`covariance section
<computation.covariance>` for :ref:`caveats
<computation.covariance.caveats>` associated with this method of
calculating covariance and correlation matrices.
complete observations.

Assuming the missing data are missing at random this results in an estimate
for the covariance matrix which is unbiased. However, for many applications
this estimate may not be acceptable because the estimated covariance matrix
is not guaranteed to be positive semi-definite. This could lead to
estimated correlations having absolute values which are greater than one,
and/or a non-invertible covariance matrix. See `Estimation of covariance
matrices <https://en.wikipedia.org/w/index.php?title=Estimation_of_covariance_matrices>`_
for more details.

.. ipython:: python
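The caveat in the note above can be reproduced with a small constructed example. This is an illustrative sketch only: the frame `df` and its column names are made up here, and the values are chosen so that each pair of columns overlaps on a different set of rows.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Hypothetical data: every pair of columns is jointly observed on
# a different block of three rows.
df = pd.DataFrame(
    {
        "a": [1, 2, 3, nan, nan, nan, 1, 2, 3],
        "b": [1, 2, 3, 1, 2, 3, nan, nan, nan],
        "c": [nan, nan, nan, 1, 2, 3, 3, 2, 1],
    }
)

# Each entry is computed from the rows where both columns are present.
cov = df.cov()

# A positive semi-definite matrix has no negative eigenvalues.
eigenvalues = np.linalg.eigvalsh(cov.to_numpy())
min_eigenvalue = eigenvalues.min()

print(cov)
print("smallest eigenvalue:", min_eigenvalue)
```

Here `a` and `b` move together, `b` and `c` move together, yet `a` and `c` move in opposite directions on the rows they share. No single complete dataset could show all three relationships at once, which is why the smallest eigenvalue comes out negative.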
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.6.0.rst
@@ -24,7 +24,7 @@ New features
- :ref:`Added <groupby.multiindex>` multiple levels to groupby (:issue:`103`)
- :ref:`Allow <basics.sorting>` multiple columns in ``by`` argument of ``DataFrame.sort_index`` (:issue:`92`, :issue:`362`)
- :ref:`Added <indexing.basics.get_value>` fast ``get_value`` and ``put_value`` methods to DataFrame (:issue:`360`)
- :ref:`Added <computation.covariance>` ``cov`` instance methods to Series and DataFrame (:issue:`194`, :issue:`362`)
- Added ``cov`` instance methods to Series and DataFrame (:issue:`194`, :issue:`362`)
- :ref:`Added <visualization.barplot>` ``kind='bar'`` option to ``DataFrame.plot`` (:issue:`348`)
- :ref:`Added <basics.idxmin>` ``idxmin`` and ``idxmax`` to Series and DataFrame (:issue:`286`)
- :ref:`Added <io.clipboard>` ``read_clipboard`` function to parse DataFrame from clipboard (:issue:`300`)
4 changes: 2 additions & 2 deletions doc/source/whatsnew/v0.6.1.rst
@@ -7,7 +7,7 @@ Version 0.6.1 (December 13, 2011)
New features
~~~~~~~~~~~~
- Can append single rows (as Series) to a DataFrame
- Add Spearman and Kendall rank :ref:`correlation <computation.correlation>`
- Add Spearman and Kendall rank correlation
options to Series.corr and DataFrame.corr (:issue:`428`)
- :ref:`Added <indexing.basics.get_value>` ``get_value`` and ``set_value`` methods to
Series, DataFrame, and Panel for very low-overhead access (>2x faster in many
@@ -19,7 +19,7 @@ New features
- Implement new :ref:`SparseArray <sparse.array>` and ``SparseList``
data structures. SparseSeries now derives from SparseArray (:issue:`463`)
- :ref:`Better console printing options <basics.console_output>` (:issue:`453`)
- Implement fast :ref:`data ranking <computation.ranking>` for Series and
- Implement fast data ranking for Series and
DataFrame, fast versions of scipy.stats.rankdata (:issue:`428`)
- Implement ``DataFrame.from_items`` alternate
constructor (:issue:`444`)
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.8.0.rst
@@ -145,7 +145,7 @@ Other new features
- Add :ref:`'kde' <visualization.kde>` plot option for density plots
- Support for converting DataFrame to R data.frame through rpy2
- Improved support for complex numbers in Series and DataFrame
- Add :ref:`pct_change <computation.pct_change>` method to all data structures
- Add ``pct_change`` method to all data structures
- Add max_colwidth configuration option for DataFrame console output
- :ref:`Interpolate <missing_data.interpolate>` Series values using index values
- Can select multiple columns from GroupBy
40 changes: 38 additions & 2 deletions pandas/core/frame.py
@@ -9592,6 +9592,14 @@ def corr(
DataFrame or Series.
Series.corr : Compute the correlation between two Series.
Notes
-----
Pearson, Kendall and Spearman correlation are currently computed using pairwise complete observations.
* `Pearson correlation coefficient <https://en.wikipedia.org/wiki/Pearson_correlation_coefficient>`_
* `Kendall rank correlation coefficient <https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient>`_
* `Spearman's rank correlation coefficient <https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient>`_
Examples
--------
>>> def histogram_intersection(a, b):
@@ -9603,7 +9611,14 @@ def corr(
dogs cats
dogs 1.0 0.3
cats 0.3 1.0
"""
>>> df = pd.DataFrame([(1, 1), (2, np.nan), (np.nan, 3), (4, 4)],
... columns=['dogs', 'cats'])
>>> df.corr(min_periods=3)
dogs cats
dogs 1.0 NaN
cats NaN 1.0
""" # noqa:E501
numeric_df = self._get_numeric_data()
cols = numeric_df.columns
idx = cols.copy()
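The new Notes section names three correlation methods. As a rough illustration of how two of them can differ, the sketch below uses a made-up frame in which `y` is a monotonic but non-linear function of `x`. Kendall is left out only to keep the sketch free of optional dependencies, which is an assumption about the installed environment.

```python
import pandas as pd

# Hypothetical data: y grows with x, but not linearly.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0]})
df["y"] = df["x"] ** 3

# Pearson measures linear association.
pearson = df.corr(method="pearson").loc["x", "y"]

# Spearman is computed on ranks, so it only measures monotonic association.
spearman = df.corr(method="spearman").loc["x", "y"]

print("pearson:", pearson)
print("spearman:", spearman)
```

Because the ranks of `x` and `y` are identical, the Spearman coefficient is 1 while the Pearson coefficient stays below 1.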
@@ -9797,7 +9812,28 @@ def corrwith(self, other, axis: Axis = 0, drop=False, method="pearson") -> Serie
See Also
--------
DataFrame.corr : Compute pairwise correlation of columns.
"""
Examples
--------
>>> index = ["a", "b", "c", "d", "e"]
>>> columns = ["one", "two", "three", "four"]
>>> df1 = pd.DataFrame(np.arange(20).reshape(5, 4), index=index, columns=columns)
>>> df2 = pd.DataFrame(np.arange(16).reshape(4, 4), index=index[:4], columns=columns)
>>> df1.corrwith(df2)
one 1.0
two 1.0
three 1.0
four 1.0
dtype: float64
>>> df2.corrwith(df1, axis=1)
a 1.0
b 1.0
c 1.0
d 1.0
e NaN
dtype: float64
""" # noqa:E501
axis = self._get_axis_number(axis)
this = self._get_numeric_data()

14 changes: 13 additions & 1 deletion pandas/core/generic.py
@@ -8522,6 +8522,18 @@ def rank(
3 spider 8.0
4 snake NaN
Ties are assigned the mean of the ranks (by default) for the group.
>>> s = pd.Series(range(5), index=list("abcde"))
>>> s["d"] = s["b"]
>>> s.rank()
a 1.0
b 2.5
c 4.0
d 2.5
e 5.0
dtype: float64
The following example shows how the method behaves with the above
parameters:
@@ -10251,7 +10263,7 @@ def pct_change(
periods : int, default 1
Periods to shift for forming percent change.
fill_method : str, default 'pad'
How to handle NAs before computing percent changes.
How to handle NAs **before** computing percent changes.
limit : int, default None
The number of consecutive NAs to fill before stopping.
freq : DateOffset, timedelta, or str, optional
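The emphasis on filling NAs **before** the percent change can be illustrated by doing the fill by hand. This is a sketch under assumptions: the series `prices` is made up, and the forward fill is written out explicitly with `ffill()` instead of relying on the `fill_method` default, since that default may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical series with one missing observation in the middle.
prices = pd.Series([100.0, np.nan, 110.0])

# No filling: any change that touches the NaN is itself NaN.
unfilled = prices.pct_change(fill_method=None)

# Forward fill first, then compute the change on the filled values.
filled = prices.ffill().pct_change(fill_method=None)

print(unfilled)
print(filled)
```

With the fill applied first, the gap is treated as "no change" and the later value is compared against the last known observation.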
10 changes: 9 additions & 1 deletion pandas/core/series.py
@@ -2566,6 +2566,14 @@ def corr(self, other, method="pearson", min_periods=None) -> float:
DataFrame.corrwith : Compute pairwise correlation with another
DataFrame or Series.
Notes
-----
Pearson, Kendall and Spearman correlation are currently computed using pairwise complete observations.
* `Pearson correlation coefficient <https://en.wikipedia.org/wiki/Pearson_correlation_coefficient>`_
* `Kendall rank correlation coefficient <https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient>`_
* `Spearman's rank correlation coefficient <https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient>`_
Examples
--------
>>> def histogram_intersection(a, b):
@@ -2575,7 +2583,7 @@ def corr(self, other, method="pearson", min_periods=None) -> float:
>>> s2 = pd.Series([.3, .6, .0, .1])
>>> s1.corr(s2, method=histogram_intersection)
0.3
"""
""" # noqa:E501
this, other = self.align(other, join="inner", copy=False)
if len(this) == 0:
return np.nan
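The phrase "pairwise complete observations" in the new Notes section can be checked with a short sketch. The two series below are made up for illustration. The manual calculation keeps only the positions where both values are present and should agree with `Series.corr`.

```python
import numpy as np
import pandas as pd

# Hypothetical series, each missing a value at a different position.
s1 = pd.Series([1.0, 2.0, np.nan, 4.0, 5.0])
s2 = pd.Series([2.0, np.nan, 6.0, 8.0, 11.0])

# Default Pearson correlation; NaN positions are skipped.
direct = s1.corr(s2)

# Manual version: keep only rows where both series have a value.
both_present = s1.notna() & s2.notna()
manual = np.corrcoef(s1[both_present], s2[both_present])[0, 1]

print("Series.corr:", direct)
print("manual:", manual)
```

Only three of the five positions contribute to the result, since positions 1 and 2 each lack one of the two values.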