Docstring return statements #789

Merged
merged 8 commits into from
May 18, 2023
101 changes: 101 additions & 0 deletions CONTRIBUTING.md
@@ -45,6 +45,107 @@ When documenting Python classes, we adhere to the convention of including docstrings
rather than as a class level docstring. Docstrings should only be included at the class-level if a class does
not possess an `__init__` method, for example because it is a static class.
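A minimal sketch of this placement convention (the class, parameter, and attribute names here are illustrative, not real `alibi_detect` objects):

```python
class ExampleDetector:
    def __init__(self, threshold: float = 0.5) -> None:
        """
        Detector docstring lives here, under `__init__`, not at class level.

        Parameters
        ----------
        threshold
            Score threshold above which an instance is flagged.
        """
        self.threshold = threshold


class ExampleRegistry:
    """Static class with no `__init__`, so the docstring is class-level."""

    detectors: dict = {}
```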

#### Conventions

- Names of variables, functions, classes and modules should be written between single back-ticks.
- ``` A `numpy` scalar type that ```
- ``` `X` ```
- ``` `extrapolate_constant_perc` ```

- Simple mathematical equations should be written between single back-ticks to facilitate readability in the console.
- ``` A callable that takes an `N x F` tensor, for ```
- ``` `x >= v, fun(x) >= target` ```

- Complex math should be written in LaTeX.
- ``` function where :math:`link(output - expected\_value) = sum(\phi)` ```

- Other `alibi_detect` objects should be cross-referenced using references of the form `` :role:`~object` ``, where
`role` is one of the roles listed in the [sphinx documentation](https://www.sphinx-doc.org/en/master/usage/restructuredtext/domains.html#cross-referencing-python-objects),
and `object` is the full path of the object to reference. For example, the `MMDDrift` detector's `predict` method
would be referenced with `` :meth:`~alibi_detect.cd.mmd.MMDDrift.predict` ``. This will render as `MMDDrift.predict()` and
link to the relevant API docs page. The same convention can be used to reference objects from other libraries, provided the
library is included in `intersphinx_mapping` in `doc/source/conf.py`. If the `~` is removed, the absolute object location will be
rendered.
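For instance, a docstring using this convention might look as follows (the wrapper function itself is hypothetical; only the cross-reference syntax is the point):

```python
def predict_drift(detector, x):
    """
    Call :meth:`~alibi_detect.cd.mmd.MMDDrift.predict` on `x`.

    The leading ``~`` makes sphinx render the reference as ``MMDDrift.predict()``
    rather than the full dotted path ``alibi_detect.cd.mmd.MMDDrift.predict()``.
    """
    return detector.predict(x)
```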

- Variable values or examples of setting an argument to a specific value should be written in double back-ticks
to facilitate readability, as they are rendered in a block with orange font-color.
- ``` is set to ``True`` ```
- ``` A list of features for which to plot the ALE curves or ``'all'`` for all features. ```
- ``` The search is greedy if ``beam_size=1`` ```
- ``` if the result uses ``segment_labels=(1, 2, 3)`` and ``partial_index=1``, this will return ``[1, 2]``. ```

- Listing the possible values an argument can take.
- ``` Possible values are: ``'all'`` | ``'row'`` | ``None``. ```

- Returning the name of the variable and its description - this is the standard convention and renders well. Writing the
variable types should be avoided, as they would duplicate the existing type annotations.
```
Returns
-------
raw
Array of perturbed text instances.
data
Matrix with 1s and 0s indicating whether a word in the text has not been perturbed for each sample.
```

- Returning only the description. When the name of the variable is not returned, sphinx wrongly interprets the
description as the variable name, which renders the text in italic. If the text exceeds one line, a ``` \ ``` needs
to be included after each line to avoid introducing bullet points at the beginning of each row. Moreover, if for
example the name of a variable is included between single back-ticks, the italic font is cancelled for all the words
except the ones between single back-ticks.
```
Returns
-------
If the user has specified grouping, then the input object is subsampled and an object of the same \
type is returned. Otherwise, a `shap_utils.Data` object containing the result of a k-means algorithm \
is wrapped in a `shap_utils.DenseData` object and returned. The samples are weighted according to the \
frequency of the occurrence of the clusters in the original data.
```

[Review comment - Contributor]
Nitpick:

> and the type, if provided, would be written

Could we be a little more specific here? Not detailing types in the general case (since that is already done by sphinx-autodoc-typehints), but giving types when multiple objects are contained in another object (i.e. a dict), makes a lot of sense IMO. However, in the latter case, are we saying this should always be done? Or is it optional?

[Reply - Collaborator, Author]
I think we're hinting that it's better to give types, however it's not something we've done historically. I've opened an issue to change this across the codebase and then perhaps we can change the language here to be a little stronger.

- Returning an object which contains multiple attributes, each of which is described individually.
In this case the attribute name is written between single back-ticks and the type, if provided, is written in
double back-ticks.
```
Returns
-------
`Explanation` object containing the anchor explaining the instance with additional metadata as attributes. \
Contains the following data-related attributes

- `anchor` : ``List[str]`` - a list of words in the proposed anchor.

- `precision` : ``float`` - the fraction of times that sampled instances where the anchor holds yield \
the same prediction as the original instance. The precision will always be at least `threshold` for a valid anchor.

- `coverage` : ``float`` - the fraction of sampled instances the anchor applies to.
```

- Documenting a dictionary follows the same principle as above, but the key should be written between
double back-ticks.
```
Default perturbation options for ``'similarity'`` sampling

- ``'sample_proba'`` : ``float`` - probability of a word to be masked.

- ``'top_n'`` : ``int`` - number of similar words to sample for perturbations.

- ``'temperature'`` : ``float`` - sample weight hyper-parameter if `use_proba=True`.

- ``'use_proba'`` : ``bool`` - whether to sample according to the words similarity.
```

- Attributes are commented inline to avoid duplication.
```
class ReplayBuffer:
"""
Circular experience replay buffer for `CounterfactualRL` (DDPG) ... in performance.
"""
X: np.ndarray #: Inputs buffer.
Y_m: np.ndarray #: Model's prediction buffer.
...
```
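A self-contained version of the pattern above (the attribute set is truncated here for brevity); the ``#:`` marker tells sphinx to treat the trailing comment as the attribute's documentation:

```python
import numpy as np


class ReplayBuffer:
    """Circular experience replay buffer for `CounterfactualRL` (DDPG)."""

    X: np.ndarray    #: Inputs buffer.
    Y_m: np.ndarray  #: Model's prediction buffer.
```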

For more standard conventions, please check the [numpydoc style guide](https://numpydoc.readthedocs.io/en/stable/format.html).

## Building documentation
We use `sphinx` for building documentation. You can call `make build_docs` from the project root;
the docs will be built under `doc/_build/html`.
6 changes: 3 additions & 3 deletions alibi_detect/ad/adversarialae.py
@@ -290,9 +290,9 @@ def predict(self, X: np.ndarray, batch_size: int = int(1e10), return_instance_sc

Returns
-------
Dictionary containing 'meta' and 'data' dictionaries.
'meta' has the model's metadata.
'data' contains the adversarial predictions and instance level adversarial scores.
Dictionary containing ``'meta'`` and ``'data'`` dictionaries.
- ``'meta'`` has the model's metadata.
- ``'data'`` contains the adversarial predictions and instance level adversarial scores.
"""
adv_score = self.score(X, batch_size=batch_size)

6 changes: 3 additions & 3 deletions alibi_detect/ad/model_distillation.py
@@ -207,9 +207,9 @@ def predict(self, X: np.ndarray, batch_size: int = int(1e10), return_instance_sc

Returns
-------
Dictionary containing 'meta' and 'data' dictionaries.
'meta' has the model's metadata.
'data' contains the adversarial predictions and instance level adversarial scores.
Dictionary containing ``'meta'`` and ``'data'`` dictionaries.
- ``'meta'`` has the model's metadata.
- ``'data'`` contains the adversarial predictions and instance level adversarial scores.
"""
score = self.score(X, batch_size=batch_size)

8 changes: 6 additions & 2 deletions alibi_detect/base.py
@@ -56,7 +56,7 @@ def concept_drift_dict():


class BaseDetector(ABC):
""" Base class for outlier, adversarial and drift detection algorithms. """
"""Base class for outlier, adversarial and drift detection algorithms."""

def __init__(self):
self.meta = copy.deepcopy(DEFAULT_META)
@@ -204,10 +204,12 @@ class Detector(Protocol):

Used for typing legacy save and load functionality in `alibi_detect.saving._tensorflow.saving.py`.

Note:
Note
----
This exists to distinguish between detectors with and without support for config saving and loading. Once all
detector support this then this protocol will be removed.
"""

meta: Dict

def predict(self) -> Any: ...
@@ -219,6 +221,7 @@ class ConfigurableDetector(Detector, Protocol):

Used for typing save and load functionality in `alibi_detect.saving.saving`.
"""

def get_config(self) -> dict: ...

@classmethod
@@ -233,6 +236,7 @@ class StatefulDetectorOnline(ConfigurableDetector, Protocol):

Used for typing save and load functionality in `alibi_detect.saving.saving`.
"""

t: int = 0

def save_state(self, filepath: Union[str, os.PathLike]): ...
63 changes: 37 additions & 26 deletions alibi_detect/cd/base.py
@@ -135,10 +135,12 @@ def __init__(
def preprocess(self, x: Union[np.ndarray, list]) -> Tuple[Union[np.ndarray, list], Union[np.ndarray, list]]:
"""
Data preprocessing before computing the drift scores.

Parameters
----------
x
Batch of instances.

Returns
-------
Preprocessed reference data and new instances.
@@ -174,7 +176,7 @@ def get_splits(

Returns
-------
Combined reference and test instances with labels and optionally a list with tuples of
Combined reference and test instances with labels and optionally a list with tuples of \
train and test indices for optionally different folds.
"""
# create dataset and labels
@@ -268,12 +270,12 @@ def predict(self, x: Union[np.ndarray, list], return_p_val: bool = True,

Returns
-------
Dictionary containing 'meta' and 'data' dictionaries.
'meta' has the model's metadata.
'data' contains the drift prediction and optionally the p-value, performance of the classifier
relative to its expectation under the no-change null, the out-of-fold classifier model
prediction probabilities on the reference and test data as well as the associated reference
and test instances of the out-of-fold predictions, and the trained model.
Dictionary containing ``'meta'`` and ``'data'`` dictionaries.
- ``'meta'`` has the model's metadata.
- ``'data'`` contains the drift prediction and optionally the p-value, performance of the classifier \
relative to its expectation under the no-change null, the out-of-fold classifier model \
prediction probabilities on the reference and test data as well as the associated reference \
and test instances of the out-of-fold predictions, and the trained model.
"""
# compute drift scores
p_val, dist, probs_ref, probs_test, x_ref_oof, x_test_oof = self.score(x)
@@ -394,10 +396,12 @@ def __init__(
def preprocess(self, x: Union[np.ndarray, list]) -> Tuple[Union[np.ndarray, list], Union[np.ndarray, list]]:
"""
Data preprocessing before computing the drift scores.

Parameters
----------
x
Batch of instances.

Returns
-------
Preprocessed reference data and new instances.
@@ -418,17 +422,18 @@ def get_splits(self, x_ref: Union[np.ndarray, list], x: Union[np.ndarray, list])
"""
Split reference and test data into two splits -- one of which to learn test locations
and parameters and one to use for tests.

Parameters
----------
x_ref
Data used as reference distribution.
x
Batch of instances.

Returns
-------
Tuple containing split train data and tuple containing split test data
Tuple containing split train data and tuple containing split test data.
"""

n_ref, n_cur = len(x_ref), len(x)
perm_ref, perm_cur = np.random.permutation(n_ref), np.random.permutation(n_cur)
idx_ref_tr, idx_ref_te = perm_ref[:int(n_ref * self.train_size)], perm_ref[int(n_ref * self.train_size):]
@@ -468,9 +473,9 @@ def predict(self, x: Union[np.ndarray, list], return_p_val: bool = True,

Returns
-------
Dictionary containing 'meta' and 'data' dictionaries.
'meta' has the detector's metadata.
'data' contains the drift prediction and optionally the p-value, threshold, MMD metric and
Dictionary containing ``'meta'`` and ``'data'`` dictionaries.
- ``'meta'`` has the detector's metadata.
- ``'data'`` contains the drift prediction and optionally the p-value, threshold, MMD metric and \
trained kernel.
"""
# compute drift scores
@@ -586,10 +591,12 @@ def __init__(
def preprocess(self, x: Union[np.ndarray, list]) -> Tuple[np.ndarray, np.ndarray]:
"""
Data preprocessing before computing the drift scores.

Parameters
----------
x
Batch of instances.

Returns
-------
Preprocessed reference data and new instances.
@@ -626,9 +633,9 @@ def predict(self, x: Union[np.ndarray, list], return_p_val: bool = True, return_

Returns
-------
Dictionary containing 'meta' and 'data' dictionaries.
'meta' has the model's metadata.
'data' contains the drift prediction and optionally the p-value, threshold and MMD metric.
Dictionary containing ``'meta'`` and ``'data'`` dictionaries.
- ``'meta'`` has the model's metadata.
- ``'data'`` contains the drift prediction and optionally the p-value, threshold and MMD metric.
"""
# compute drift scores
p_val, dist, distance_threshold = self.score(x)
@@ -748,10 +755,12 @@ def __init__(
def preprocess(self, x: Union[np.ndarray, list]) -> Tuple[np.ndarray, np.ndarray]:
"""
Data preprocessing before computing the drift scores.

Parameters
----------
x
Batch of instances.

Returns
-------
Preprocessed reference data and new instances.
@@ -786,9 +795,9 @@ def predict(self, x: Union[np.ndarray, list], return_p_val: bool = True, return_

Returns
-------
Dictionary containing 'meta' and 'data' dictionaries.
'meta' has the model's metadata.
'data' contains the drift prediction and optionally the p-value, threshold and LSDD metric.
Dictionary containing ``'meta'`` and ``'data'`` dictionaries.
- ``'meta'`` has the model's metadata.
- ``'data'`` contains the drift prediction and optionally the p-value, threshold and LSDD metric.
"""
# compute drift scores
p_val, dist, distance_threshold = self.score(x)
@@ -979,10 +988,10 @@ def predict(self, x: Union[np.ndarray, list], drift_type: str = 'batch',

Returns
-------
Dictionary containing 'meta' and 'data' dictionaries.
'meta' has the model's metadata.
'data' contains the drift prediction and optionally the feature level p-values,
threshold after multivariate correction if needed and test statistics.
Dictionary containing ``'meta'`` and ``'data'`` dictionaries.
- ``'meta'`` has the model's metadata.
- ``'data'`` contains the drift prediction and optionally the feature level p-values, threshold after \
multivariate correction if needed and test statistics.
"""
# compute drift scores
p_vals, dist = self.score(x)
@@ -1136,10 +1145,12 @@ def __init__(
def preprocess(self, x: Union[np.ndarray, list]) -> Tuple[np.ndarray, np.ndarray]:
"""
Data preprocessing before computing the drift scores.

Parameters
----------
x
Batch of instances.

Returns
-------
Preprocessed reference data and new instances.
@@ -1181,10 +1192,10 @@ def predict(self,  # type: ignore[override]

Returns
-------
Dictionary containing 'meta' and 'data' dictionaries.
'meta' has the model's metadata.
'data' contains the drift prediction and optionally the p-value, threshold, conditional MMD test statistic
and coupling matrices.
Dictionary containing ``'meta'`` and ``'data'`` dictionaries.
- ``'meta'`` has the model's metadata.
- ``'data'`` contains the drift prediction and optionally the p-value, threshold, conditional MMD test \
statistic and coupling matrices.
"""
# compute drift scores
p_val, dist, distance_threshold, coupling = self.score(x, c)
12 changes: 6 additions & 6 deletions alibi_detect/cd/base_online.py
@@ -184,9 +184,9 @@ def predict(self, x_t: Union[np.ndarray, Any], return_test_stat: bool = True,

Returns
-------
Dictionary containing 'meta' and 'data' dictionaries.
'meta' has the model's metadata.
'data' contains the drift prediction and optionally the test-statistic and threshold.
Dictionary containing ``'meta'`` and ``'data'`` dictionaries.
- ``'meta'`` has the model's metadata.
- ``'data'`` contains the drift prediction and optionally the test-statistic and threshold.
"""
# Compute test stat and check for drift
test_stat = self.score(x_t)
@@ -441,9 +441,9 @@ def predict(self, x_t: Union[np.ndarray, Any], return_test_stat: bool = True,

Returns
-------
Dictionary containing 'meta' and 'data' dictionaries.
'meta' has the model's metadata.
'data' contains the drift prediction and optionally the test-statistic and threshold.
Dictionary containing ``'meta'`` and ``'data'`` dictionaries.
- ``'meta'`` has the model's metadata.
- ``'data'`` contains the drift prediction and optionally the test-statistic and threshold.
"""
# Compute test stat and check for drift
test_stats = self.score(x_t)
8 changes: 3 additions & 5 deletions alibi_detect/cd/classifier.py
@@ -206,11 +206,9 @@ def predict(self, x: Union[np.ndarray, list], return_p_val: bool = True,

Returns
-------
Dictionary containing 'meta' and 'data' dictionaries

- 'meta' - has the model's metadata.

- 'data' - contains the drift prediction and optionally the p-value, performance of the classifier \
Dictionary containing ``'meta'`` and ``'data'`` dictionaries
- ``'meta'`` - has the model's metadata.
- ``'data'`` - contains the drift prediction and optionally the p-value, performance of the classifier \
relative to its expectation under the no-change null, the out-of-fold classifier model \
prediction probabilities on the reference and test data, and the trained model. \
"""