[Review] @baptklein - paper and docs #45
Labels
documentation
Improvements or additions to documentation
JOSS review
Issues related to the review of JOSS submissions
In this submission, the authors present a Gaussian Process code for modelling multi-wavelength astronomical time series. This code combines the efficiency of the GpyTorch library with the flexibility of Spectral Mixture kernels and, thereby, could be an interesting option for modelling large data sets. The code itself reads well, the a detailed API is provided in the Docs. However, in the submitted version of the code and docs, the authors have not, in my opinion, been able to demonstrate the relevance of the code to model real multi-wavelength astronomical time series. In particular, some of the tutorials provided in the docs do not work in their current state, and are only designed to model a simple sine-wave fit, for which GPR are not best suited. Please find below a list of comments / suggestions that I would like the authors to address before I consider reviewing this submission.
Paper
Overall, the article would benefit from including a more thorough description of the code as well as more detailed examples of application. I have noted a few suggestions that I list below.
The authors have designed the code for multi-wavelength astronomical time series, but, in the core of the paper, they do not really describe why multi-wavelength observations are relevant to study their objects of interest. In particular, why do we need GPR to model these light curves. For example, there is no mention of the modelling of stellar variability signals (e.g. activity), for example in the search for exoplanets (transit photometry or Doppler spectroscopy). This could be a highly relevant application of multi-wavelength GP modelling, since stellar activity signals are wavelength dependent.
The authors do not really demonstrate why pgmuvi is particularly adapted for multi-wavelength observations. What are the assumptions of model? A reminder of the mathematical framework would also be appreciated, even if this includes briefly describing GpyTorch. More importantly, I was not convinced by the comparison with other GPR codes (lines 65 to 80). This part lacks a more quantitative comparison with other codes (e.g. computation time and complexity of algorithm). The authors mention that similar results could be obtained with tinygp. Could the authors describe, in more details, (i) for which type of problems pgmuvi should be preferred and (ii) why the learning curve is easier in pgmuvi?
As said above, the applications of the code are limited to projects that have not been published (nor even submitted). Since the tutorials are limited to 1D- sine-wave fitting examples, the validation of the code on real data sets is not yet demonstrated. Could the authors please provide detailed examples of multi-wavelength data sets modelling with pgmuvi (if possible in the paper and in the tutorials – see the section about the Docs).
More minor comments:
Docs
Since the authors did not provide tutorials in the github repository, I consulted the Docs associated with the project. The “Installation and Quickstart” instructions are clear, and the API is well detailed. The “How to use pgmuvi” part is clearly explained and works well (I would recommend to remove the full output of Cell [5] to make it more easily readable from the user). However, as described above, I have a few comments on the relevance of the different tutorials provided.
The “Bayesian inference of the Light Curve PSD” section is simply not working in its current state. From cell [8], the mcmc returns an error. Could the authors find and correct the issues associated with this?
In both tutorials, we are modelling a simple sine-wave curve. I reckon that this tests is essential to make the code user friendly, and I am very happy that the authors provided this test. However, after reading the papers associated with this submission, I believe that this code is designed to do more than a sine-wave fit. Could the authors provide me (or the user) with the following applications:
(1) A relevantly-thought multi-wavelength case (which is what the code has been designed for)
(2) A non-strictly-periodic signal (e.g. quasi-periodic). It would be also interesting to provide a test with highly uneven sampling to test the capacity of the code to find the right periodicities for realistic ground-based observations.
(3) I would also suggest to add an example where the GP is coupled with a deterministic model (i.e. a non-zero mean function). For example, it could be a linear trend or a sine-wave curve. GPs are often coupled with a deterministic mean function where physical parameters can be assessed.
(4) Finally, could I easily use the pgmuvi to generate mock data sets from a GP with a given covariance kernel? If yes, could the authors explain how.
I think that, for consistency, it could be relevant to add installation instructions and tutorials in the pgmuvi github repository, as well as in the Docs page.
The text was updated successfully, but these errors were encountered: