CONTRIBUTING.md

Thanks for contributing to TTK Data!

This repository hosts a list of data sets and example pipelines (primarily stored as ParaView pvsm state files).

This repository has multiple purposes:

  • Each entry is used to produce a screenshot for the Gallery page of TTK's website
  • Each entry serves as a reproducible example with ParaView, as documented on the Tutorial page of TTK's website
  • Each entry is automatically tested by our continuous integration. At the moment, only code features which are used in ttk-data's state files are automatically tested upon code pull requests.
  • Each entry is described in detail in TTK's Example website, for novice users who want to get started with TTK and Python.

Please find below a few guidelines that we invite you to consider before making a pull request.

0. Getting set up with GitHub

Please find below generic recommendations for setting up your fork of TTK's data repository.

  • Setting up your GitHub account:
    • Create an account on GitHub.
    • If applicable, we recommend upgrading this account (for free) to a GitHub Pro account through the GitHub education program.
  • Forking TTK's data repository:
    • Go to TTK's data repository and click on the "Fork" button (top right corner).
    • In the remainder, let @PUBLIC designate the URL of TTK's data repository (i.e. @PUBLIC = https://github.com/topology-tool-kit/ttk-data).
    • Similarly, let @FORK designate the URL of your public fork of TTK's data repository.
  • Creating your private TTK repository:
    • We recommend using a private repository for the development of unpublished features.
    • For this, create and set up a private repository (e.g. ttk-data-yourusername). Let us call it @PRIVATE.
    • Clone your @PRIVATE repository locally and register the two remotes by entering the following commands:
    $ git clone @PRIVATE ttk-data-yourusername
    $ cd ttk-data-yourusername
    $ git remote add ttk-public @PUBLIC
    $ git remote add ttk-fork @FORK
    
  • Daily usage:
    • At this point, you can keep your private repository up to date by regularly entering the following command in it:
    $ git pull ttk-public dev
    
    • When developing a new unpublished feature, we recommend creating a new branch on your @PRIVATE repository.
    • When this feature is ready to be made public (e.g. after publication of the corresponding research), push the corresponding branch to your @FORK. This will enable you to open a pull request (PR) to TTK's data repository.

1. Adding a new data set

  • If you add a new data set to the repository, please update the README file by adding an entry for the data set, specifying its provenance.

2. Adding a new example

  • If you develop some new features in TTK (either by creating a new module or extending an existing one), we strongly invite you to produce an entry in the ttk-data repository (by pull request). Each entry should be provided as follows:

a. ParaView pvsm state file

  • Please add the new state file in the states/ directory.
  • Edit the state file with a text editor and modify any full paths to the input files to make them relative.
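The path edit described above can also be scripted. The following is a minimal sketch, not part of TTK's tooling: the helper name make_paths_relative and the prefix /home/me/ttk-data are hypothetical, and the sketch assumes that all input files live under a single absolute directory which can be replaced textually. Please double check the resulting state file in ParaView before committing it.

```python
def make_paths_relative(state_text, absolute_root, relative_root=""):
    """Replace the absolute directory prefix of input files in a state file.

    state_text    -- full content of the pvsm state file (XML text)
    absolute_root -- hypothetical absolute directory, e.g. "/home/me/ttk-data"
    relative_root -- replacement prefix; the empty string keeps only the
                     remainder of the path (e.g. "ctBones.vti")
    """
    # Normalize the prefix so that exactly one trailing slash is removed.
    prefix = absolute_root.rstrip("/") + "/"
    return state_text.replace(prefix, relative_root)


# Example on a single (simplified) line of a state file.
line = '<Element index="0" value="/home/me/ttk-data/ctBones.vti"/>'
print(make_paths_relative(line, "/home/me/ttk-data"))
```

To process a whole file, read it as text, pass the content to the function and write the result back. A plain textual replacement is used here on purpose, so that the rest of the state file is left byte-for-byte untouched.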

b. Python script

  • This Python script should be an extremely simplified version of the pipeline encoded in the above pvsm state file:
    • This script should be immediately understandable by any first-year student.
    • It is meant to be run in batch mode, directly from the Python interpreter.
  • This file should be located in the python/ directory and have the same name as the state file, but with the py extension instead of the pvsm extension.
  • This script can be automatically generated by ParaView:
    • Once the state file is opened in ParaView, click on File, Save State... and make sure to select the entry Python state file (*.py) from the extension drop-down menu.
    • In the next dialog window (Python State Options), make sure to check the box Skip Rendering Components.
  • Edit the automatically generated (and verbose) script:
    • Insert the following line at the top: #!/usr/bin/env python
    • After the line from paraview.simple import *, remove all the lines before and after the section entitled #setup the data processing pipelines (see the automatically generated comments). The main idea here is to provide a minimalist and simple Python script which only includes the loading of the input data and the key steps of the data analysis pipeline.
    • Insert new lines at the end of the script to store the outputs of the pipeline with SaveData() (see other pre-existing Python scripts):
      • When applicable, please prefer the csv extension, which is more versatile and convenient for post-processing by a novice Python user.
    • Remove any verbose optional code which obscures the clear understanding of the script:
      • Remove the registrationName option from the constructor of each Python object
      • Remove any optional property
      • Remove any pipeline "dead leaf", i.e. any terminal, unnecessary pipeline object which is:
        • not reused by another object
        • or not saved to disk (used in SaveData())
      • Remove any instance of FindSource (and appropriately replace later occurrences of its output)
      • Remove the construction of graphical primitives (tTKIcospheresFromPoints, Tube, GenerateSurfaceNormals) which are not needed in batch mode. In batch mode, the raw data (input of these graphical primitives) is usually more convenient to handle.
      • The simplified script should be straightforward to understand
    • Run your Python script and double-check in ParaView that its output is correct!
    • Format your Python script using Black. You can also integrate it into your local Git repository using pre-commit hooks to ensure that only well-formatted Python code is committed.
    • If possible, run the TTK CI to get the script output hashes and add them to the corresponding platform file in the python/hashes directory (more information in the relevant README file in the python directory).
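Part of the cleanup listed above is mechanical and can be pre-processed before the manual review. The sketch below is only an illustration, not a TTK or ParaView tool: the function names strip_registration_names and find_source_lines are hypothetical, and the regular expression assumes that registrationName is passed as a simple quoted string, as in the scripts generated by ParaView. The remaining steps (optional properties, dead leaves, graphical primitives) still need to be handled by hand.

```python
import re

# Matches registrationName='...' (or "...") plus the comma that follows it.
REGISTRATION_NAME = re.compile(r"""registrationName\s*=\s*(['"]).*?\1\s*,?\s*""")


def strip_registration_names(script):
    """Return the script without any registrationName=... constructor option."""
    return REGISTRATION_NAME.sub("", script)


def find_source_lines(script):
    """Return the 1-based numbers of the lines that still call FindSource."""
    return [
        number
        for number, line in enumerate(script.splitlines(), start=1)
        if "FindSource(" in line
    ]


# Example on a short, made-up excerpt of a generated script.
generated = (
    "from paraview.simple import *\n"
    "reader = XMLImageDataReader(registrationName='ctBones.vti', "
    "FileName=['ctBones.vti'])\n"
    "source = FindSource('ctBones.vti')\n"
)
print(strip_registration_names(generated))
print(find_source_lines(generated))
```

The FindSource calls are only reported, not rewritten, because replacing them requires knowing which Python variable holds the corresponding pipeline object.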

c. MkDocs file

  • This file should be located in the docs/ directory and have the same name as the state file and the Python script, but with the md extension (instead of pvsm or py).

  • Please create a new entry by copying an already existing one.

  • Each new entry should be organized as follows:

    • A screenshot
    • A pipeline description, which includes, for each TTK filter involved in the example, a pointer to its Doxygen documentation.
    • The ParaView command line to reproduce the screenshot
    • The example Python code (automatically inserted)
    • A quick description of the pipeline inputs
      • Each input should contain a link to the actual file on ttk-data
    • A quick description of the pipeline outputs
    • Pointers to the Doxygen documentation of all the TTK filters involved in the example.
  • Note that the output webpage can be visualized locally by entering the command mkdocs serve in the ttk-data directory (installation instructions for pip users are included below; otherwise, please refer to your system's package manager).

  • Once your example is merged into ttk-data, please open a pull request to the main ttk repository, to insert pointers to your example in the Doxygen documentation and ParaView documentation of all the TTK filters involved in your example. See for instance the section "Online examples" in:

  • In the three cases above (pvsm ParaView state file, py Python script, md MkDocs entry), we invite you to check out the other examples already included in ttk-data for inspiration.
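Since the three files of an entry must share the same name, a quick consistency check can be run before opening a pull request. The following is a minimal sketch under the directory layout described above (states/, python/, docs/); the function name missing_companions is hypothetical and this check is not part of TTK's continuous integration.

```python
from pathlib import Path

# Layout taken from the guidelines above: each states/<name>.pvsm is
# expected to come with python/<name>.py and docs/<name>.md.
COMPANIONS = {"python": ".py", "docs": ".md"}


def missing_companions(repo_root):
    """Map each state-file name to the companion files that do not exist."""
    root = Path(repo_root)
    missing = {}
    for state in sorted((root / "states").glob("*.pvsm")):
        absent = [
            f"{directory}/{state.stem}{suffix}"
            for directory, suffix in COMPANIONS.items()
            if not (root / directory / f"{state.stem}{suffix}").is_file()
        ]
        if absent:
            missing[state.stem] = absent
    return missing
```

For instance, calling missing_companions(".") from the root of a local clone returns an empty dictionary when every state file has both its Python script and its MkDocs entry.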

Running mkdocs locally (pip users)

Set up and activate a virtual environment, then install the dependencies:

virtualenv venv
source venv/bin/activate
pip install -r requirements.txt

Run the local server, which reloads automatically on change; by default, it is available at http://localhost:8000/

mkdocs serve