Thanks for contributing to TTK Data!
This repository hosts a list of data sets and example pipelines (primarily stored as ParaView pvsm
state files).
This repository has multiple purposes:
- Each entry is used to produce a screenshot for the Gallery page of TTK's website
- Each entry serves as a reproducible example with ParaView, as documented on the Tutorial page of TTK's website
- Each entry is automatically tested by our continuous integration. At the moment, only code features which are used in ttk-data's state files are automatically tested upon code pull requests.
- Each entry is described in detail in TTK's Example website, for novice users who want to get started with TTK and Python.
Please find below a few guidelines that we invite you to consider before making a pull request.
Please find below generic recommendations for setting up your fork of TTK's data repository.
- Setting up your GitHub account:
- Create an account on GitHub.
- If applicable, we recommend upgrading this account (for free) to a GitHub Pro account through the GitHub education program.
- Forking TTK's main repository:
- Go to TTK's data repository and click on the "Fork" button (top right corner).
- In the remainder, let us designate by `@PUBLIC` the URL of TTK's data repository (i.e. `@PUBLIC = https://github.com/topology-tool-kit/ttk-data`).
- Similarly, let us designate by `@FORK` the URL of your public fork of TTK's public repository.
- Creating your private TTK repository:
- We recommend using a private repository for the development of unpublished features.
- For this, create and set up a private repository (e.g. `ttk-data-yourusername`). Let us call it `@PRIVATE`.
- Clone your `@PRIVATE` repository locally and enter the following commands:
$ git clone @PRIVATE ttk-data-yourusername
$ cd ttk-data-yourusername
$ git remote add ttk-public @PUBLIC
$ git remote add ttk-fork @FORK
- Daily usage:
- At this point, you can regularly keep your private repository up-to-date by entering the following command in it:
$ git pull ttk-public dev
- When developing a new unpublished feature, we recommend creating a new branch on your `@PRIVATE` repository.
- When this feature is ready to be made public (e.g. after publication of the corresponding research), push the corresponding branch to your `@FORK`. This will enable you to open a pull request (PR) to TTK's data repository.
- If you add a new data set to the repository, please update the README file by adding an entry for the data set, specifying its provenance.
- If you develop some new features in TTK (either by creating a new module or extending an existing one), we strongly invite you to produce an entry in the ttk-data repository (by pull request). Each entry should be provided as follows:
- Please add the new state file in the `states/` directory.
- Edit the state file with a text editor and modify any full paths to the input files to make them relative.
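The path edit described above can also be scripted. The following is a minimal sketch, not part of TTK's tooling: it assumes that the absolute paths all start with the location of your local ttk-data checkout and that the relative paths should be expressed with respect to that directory. The function name `make_paths_relative`, the sample XML line and the file name `myData.vti` are hypothetical.

```python
from pathlib import PurePosixPath


def make_paths_relative(state_text: str, checkout_dir: str) -> str:
    """Return state_text with the absolute checkout_dir prefix stripped from paths."""
    # Normalise the prefix so that "/home/me/ttk-data" and "/home/me/ttk-data/"
    # are handled the same way.
    prefix = str(PurePosixPath(checkout_dir)) + "/"
    return state_text.replace(prefix, "")


# Hypothetical excerpt of a state file containing a full path.
sample = '<Element index="0" value="/home/me/ttk-data/myData.vti"/>'
fixed = make_paths_relative(sample, "/home/me/ttk-data/")
print(fixed)
```

Please review the resulting state file in a text editor afterwards, since paths pointing outside of your checkout are left untouched by this sketch.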
- This Python script should be an extremely simplified version of the pipeline encoded in the above `pvsm` state file:
- This script should be immediately understandable by any first-year student.
- It is meant to be run in batch mode, directly from the Python interpreter.
- This file should be located in the `python/` directory and have the same name as the state file, but with the `py` extension instead of the `pvsm` extension.
- This script can be automatically generated by ParaView:
- Once the state file is opened in ParaView, click on `File`, `Save State...` and make sure to select the entry `Python state file (*.py)` from the extension drop-down menu.
- In the next dialog window (`Python State Options`), make sure to check the box `Skip Rendering Components`.
- Edit the automatically generated (and verbose) script:
- Insert the following line at the top:
#!/usr/bin/env python
- After the line `from paraview.simple import *`, remove all the lines before and after the section entitled `#setup the data processing pipelines` (see the automatically generated comments). The main idea here is to provide a minimalist and simple Python script which only includes the loading of the input data and the key steps of the data analysis pipeline.
- Insert new lines at the end of the script to store the outputs of the pipeline with `SaveData()` (see other pre-existing Python scripts):
- Please prefer, when applicable, the `csv` extension, which is more versatile and convenient for post-processing by a novice Python user.
- Remove any verbose optional code which obscures the clear understanding of the script:
- Remove the `registrationName` option from the constructor of each Python object.
- Remove any optional property.
- Remove any pipeline "dead leaf", i.e. any terminal, unnecessary pipeline object which is:
- not reused by another object
- and not saved to disk (i.e. not used in `SaveData()`)
- Remove any instance of `FindSource` (and replace subsequent occurrences of its output appropriately).
- Remove the construction of graphical primitives (`tTKIcospheresFromPoints`, `Tube`, `GenerateSurfaceNormals`) which are not needed in batch mode. In batch mode, the raw data (input of these graphical primitives) is usually more convenient to handle.
- The simplified script should be straightforward to understand.
- Run your Python script and double-check that its output is correct in ParaView!
- Format your Python script using Black. You can also integrate it into your local Git repository using pre-commit hooks to ensure that only well-formatted Python code is committed.
- If possible, run the TTK CI to get the script output hashes and add them to the corresponding platform file in the `python/hashes` directory (more information in the relevant README file in the `python` directory).
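One of the clean-up steps listed above, removing the `registrationName` option, can be partly automated. The sketch below is only an illustration under stated assumptions: it assumes that the option appears as a simple quoted keyword argument on a single line, as in typical ParaView-generated scripts. The helper name `strip_registration_names` and the sample line are hypothetical.

```python
import re

# Matches a registrationName='...' (or "...") keyword argument, together with
# the comma and whitespace that follow it, if any.
_REGISTRATION_NAME = re.compile(r"registrationName=(['\"]).*?\1\s*,?\s*")


def strip_registration_names(script_text: str) -> str:
    """Return script_text without any registrationName keyword argument."""
    return _REGISTRATION_NAME.sub("", script_text)


# Hypothetical line taken from an automatically generated script.
generated = (
    "reader = XMLImageDataReader(registrationName='myData.vti', "
    "FileName=['myData.vti'])"
)
cleaned = strip_registration_names(generated)
print(cleaned)
```

The other simplifications (optional properties, dead leaves, `FindSource`, graphical primitives) depend on the pipeline and are better done by hand.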
- This file should be located in the `docs/` directory and have the same name as the state file and the Python script, but with the `md` extension (instead of `pvsm` or `py`).
- Please create a new entry by copying an already existing one.
- Each new entry should be organized as follows:
- A screenshot
- A pipeline description, which includes, for each TTK filter involved in the example, a pointer to its Doxygen documentation.
- The ParaView command line to reproduce the screenshot
- The example Python code (automatically inserted)
- A quick description of the pipeline inputs
- Each input should contain a link to the actual file on ttk-data
- A quick description of the pipeline outputs
- Pointers to the Doxygen documentation of all the TTK filters involved in the example.
- Note that the output webpage can be visualized locally by entering the command `mkdocs serve` in the ttk-data directory (installation instructions for pip users are included below, otherwise please refer to your system's package manager).
- Once your example is merged to ttk-data, please open a pull request to the main ttk repository, to insert pointers to your example in the doxygen documentation and ParaView documentation of all the TTK filters involved in your example. See for instance the section "Online examples" in:
- In the three cases above (`pvsm` ParaView state file, `py` Python script, `md` MkDocs entry), we invite you to check out the other examples already included in ttk-data for inspiration.
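Before opening a pull request, it can help to check that the three files of an entry share the same name. The following sketch is not part of the repository's tooling: it assumes the layout described above (`states/`, `python/` and `docs/` directories) and works on a plain list of paths relative to the ttk-data directory. The function name `missing_companions` and the example file names are hypothetical.

```python
from pathlib import PurePosixPath


def missing_companions(files):
    """Map each state-file name to the companion files that are missing."""
    present = set(files)
    missing = {}
    for path in sorted(present):
        p = PurePosixPath(path)
        if p.parent.name == "states" and p.suffix == ".pvsm":
            # Same base name, but with the py and md extensions.
            expected = [f"python/{p.stem}.py", f"docs/{p.stem}.md"]
            lacking = [e for e in expected if e not in present]
            if lacking:
                missing[p.stem] = lacking
    return missing


# Hypothetical listing, relative to the ttk-data directory.
listing = [
    "states/myExample.pvsm",
    "python/myExample.py",
    "docs/myExample.md",
    "states/otherExample.pvsm",
    "python/otherExample.py",
]
report = missing_companions(listing)
print(report)
```

In this hypothetical listing, only `otherExample` is reported, because its MkDocs entry is missing.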
Set up and activate a virtual environment:
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
Run a local server, which updates automatically on change; by default it runs on http://localhost:8000/
mkdocs serve