Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

can we simplify how our tests are defined and run? #1292

Closed
drammock opened this issue Apr 13, 2023 · 7 comments
Closed

can we simplify how our tests are defined and run? #1292

drammock opened this issue Apr 13, 2023 · 7 comments
Labels
kind: maintenance Improving maintainability and reducing technical debt needs: discussion Needs discussion before an implementation can be made

Comments

@drammock
Copy link
Collaborator

drammock commented Apr 13, 2023

I'm having trouble wrapping my head around our testing setup, which makes it hard to add new tests (both for interactive local testing and for CIs). I think what we have now is (simplified):

├── .github/workflows/tests.yml
│   ├── run-pytest        -> runs on various OSes and Python versions; calls `pytest` directly
│   ├── build-site        -> runs on 3 OSes; calls `sphinx-build`;
│   │                        checks for (un)expected build warnings
│   ├── new-a11y-CI-test  -> ***MISSING / NOT WORKING YET***
│   └── audit             -> old lighthouse job; "needs" build-site; runs on ubuntu only;
│                            creates small site (just kitchen sink), runs `sphinx-build`,
│                            and runs the lighthouse action on that static site.
│
├── noxfile.py
│   ├── docs  ->  calls `stb compile` and `sphinx-build`
│   └── a11y  ->  calls `playwright install`, `nox -s docs`, and `pytest -k a11y`
│
└── tests/
    ├── test_build.py  -> calls `sphinx-build` via a factory function
    └── test_a11y.py   -> sets up a webserver and calls `axe.run()` a bunch of times

(NOTE: the "a11y" tests in the above diagram refer to work in #1260 that isn't in main yet.

The mix-and-match of stb versus sphinx-build, calling pytest or sphinx-build or stb directly vs calling the nox job... it all has my head spinning; currently we don't use nox in our CI config (except for the "compile MO files" step of the run-pytest job? why??)

Thoughts/advice from anyone are very welcome.

Originally posted by @drammock in #1260 (comment)

@drammock drammock added needs: discussion Needs discussion before an implementation can be made kind: maintenance Improving maintainability and reducing technical debt labels Apr 13, 2023
@drammock
Copy link
Collaborator Author

Just to put a few stakes in the ground:

  1. if we're going to use nox then (probably?) we should use it everywhere that we can, because (probably) that will simplify the testing code. I think this means calling nox sessions in our CI jobs (rather than calls to pytest, stb or sphinx-build).

  2. if we're going to keep using stb then if possible we should always use stb instead of calling sphinx-build directly.

  3. I don't especially love nox and I would be OK if it went away. IMO the dev environment ought to be capable of both building the docs and running the tests. The extra "overhead" of remembering to activate the right environment is for me no big deal, compared to the extra time whenever a new dependency is added and nox needs to rebuild its envs (but doesn't know that because it isn't reading our changelogs). TL;DR it's not clear to me that nox saves me time/effort.

  4. I don't fully understand how stb works but it seems to take care of a lot of things (like running webpack) that I'm not very experienced with, so probably best to keep using it. TL;DR I'm pretty sure stb saves me some time/effort.

@trallard
Copy link
Collaborator

trallard commented Apr 13, 2023

Thanks for opening this @drammock - I am +1 on standardising and hopefully unifying how CI and local dev tasks are performed.

I don't especially love nox and I would be OK if it went away. IMO the dev environment ought to be capable of both building the docs and running the tests. The extra "overhead" of remembering to activate the right environment is for me no big deal, compared to the extra time whenever a new dependency is added and nox needs to rebuild its envs (but doesn't know that because it isn't reading our changelogs). TL;DR it's not clear to me that nox saves me time/effort.

While I have been using nox for some automation tasks over Makefiles (mostly to get the Python syntactic sugar), I acknowledge that using it for development/packaging stretches it beyond its capabilities.

A decent alternative I have found for these cases is hatch - I have adopted this in other packages, and it has been a smooth transition. Hatch also comes with the added benefit that it ensures that the dependencies are always up to date. There is also a discussion around improving/automating the release process, so migrating to a more automated release process #1169, and we can leverage some of the features from hatch (versioning, building, artefacts checking) so it might help us tackle two pain points at the same time?

Of course this would need broader consensus and associated discussion

I don't fully understand how stb works, but it seems to take care of a lot of things

I am in the same boat here, so I did some digging. Note I am far from an expert on how stb works, but it seems to take care of Sphinx-based themes' development (and build) aspect. So it deals with things like nodeenv setup and other node-related and CSS/SASS tasks (which is nicer than having to deal with Python and node dependencies separately/manually).

So I assume it is only needed when there are changes on JS, styles, and all those webpack-y things (so it is technically not needed if, for example, we only make docs content changes and need to rebuild the docs - like in build-site).

😱 As I was writing this, I realised one of my sources of confusion was that nox -s docs and nox -s docs-live both use stb AND sphinx-build (unless explicitly skipped), so until now, building the docs AND compiling the assets was one and the same task and I did not think about it in depth? nox abstracts this but maybe the sessions should be called compile and build? (that seems a tangent topic tho)

My best guess is that sometimes we call stb or sphinx-build vs nox -s ... to separate these concerns or remove these abstractions? Or maybe the call to pytest in /.github/tests/yml pre-dates the addition of nox

phew that was a long one, so TLD;R

+1 on adopting a more intuitive way to call our "dev" and "build" tasks, ideally consistent through CI and local use

@12rambau
Copy link
Collaborator

My turn:

I agree we should rationalize things here. and multiple points have been raised in the previous messages:

what's the difference between stb and sphinx-build and why do we use both ?

To make things easy, "stb" is our compiler and deals with all our assets and "sphinx-build" is only here to buid our documentation.
stb can also be used to build and serve the documentation but we decided to keep the sphinx-build command from our side to mimick more closely what will happens on people computers when they use the "make html" command. In my opinion there is no easy solution here we need to keep both. I did some digging into the nox docs-live session in #1265 and it's not clear to me what change people want to listen to (I personnaly never use this session, the normal build is fast enough). so my take here is we will continue to use both but I'm not sure docs-live is really useful.

change our packaging strategy ?

I think we are mixing things here. The packaging of the lib is not causing any trouble from my side, the github action works smoothly and we are not facing any problem with setuptools. I think switching to hatch or poetry will just increase our burden for no practical advantages. #1169 is more about improving the github action to do everything in one command because I'm lazy.

the elephant in the room: nox

I love nox (or tox). I hate dealing with virtual environment and remember where they are. I use nox (or tox) in all the libs I'm working in and I enven build application dashboard with it now (Jupyter based). Considering its spread in the Python community I think it would be a mistake to go away from it. I think our main issue is the rebuild detection process. We tweek it in _should_install and I think it would be better to simply rely on the built-in method that work smoothly. Both pypi and webpack will just say "nothing to see here".

side note: I'm initially a C++ developer so I'm use to compilation time. maybe that makes me more patient than the usual Python developer 😄

Also nox is designed to be a "super pre-commit". Anything that is too long to be in pre-commits should go there and you launch it with 1 single command: nox then you go grab a coffee and come back when it's done. I think we went a bit to far and some of the commands should go elsewhere.

finally the test structure

Yes everything is duplicated in nox and the CI. I think it would be easier to maintain to have everything in noxfile.py and launch specific sessions in the test.yml. nox is designed for that.

my personnal hidden gem

When I did the spring cleaning in #1271 I was not comfortable with all the translation step in nox. That's typically something that should not be there as it cannot be launch with all the other session as it needs posargs. To me this should be a cron job in our actions (that's what I do in my repositories) that refresh the translation when necessary.

PS: I recently switch from a mac computer to a windows PC and I realize the proof-reader is not able to switch automatically from english to french (my default) so until I find a solution I apologize for the millions of typo.

@drammock
Copy link
Collaborator Author

stb can also be used to build and serve the documentation but we decided to keep the sphinx-build command from our side to mimick more closely what will happens on people computers when they use the "make html" command.

I'm curious who is the "we" and when it was decided. Would be nice to have a "decision log" or something similar (a simple GH repo wiki page would be good enough I think). That said: I'm looking closer at stb and it appears that it does not provide a "static doc build" command, so we can't actually replace sphinx-build with stb ... anyway.

I did some digging into the nox docs-live session in #1265... my take here is we will continue to use both but I'm not sure docs-live is really useful.

When docs-live was working normally I used it frequently so yes let's please keep it.

I think our main issue [with nox] is the rebuild detection process. We tweek it in _should_install and I think it would be better to simply rely on the built-in method that work smoothly. Both pypi and webpack will just say "nothing to see here".

Do you mean that nox has built-in way of deciding whether to reinstall the env, and we're messing with it?

I think we went a bit to far [with nox] and some of the commands should go elsewhere.

Specifically which ones should go elsewhere? (and where is elsewhere?)

When I did the spring cleaning in #1271 I was not comfortable with all the translation step in nox. That's typically something that should not be there as it cannot be launch with all the other session as it needs posargs. To me this should be a cron job in our actions (that's what I do in my repositories) that refresh the translation when necessary.

I don't really know how babel works so I probably can't offer alternative solutions off the top of my head. But I do know that there's a step in GH actions file that says "compile the MO files" and it's the only step that uses nox. So if we can skip using nox for that, it will at least simplify the GH Actions a little!

@choldgraf
Copy link
Collaborator

A few thoughts from me, I'll try to keep it short and split out by sub topic!

Nox

  • i personally use nox a lot and find that it really helps reduce cognitive load and toil.
  • i also don't even really know how to use virtual environments independently (I'm probably on the lower end of the developer skills spectrum, but maybe that's a good thing to ensure our developer practices are accessible)
  • if i recall, our nox build actually changes nox's default and does not rebuild the env each time. We could change that back (but then build times may increase a bit)
  • or we could do a simple hash of our project.toml file and trigger a rebuild whenever it changes?
  • also worth noting that there's nothing stopping someone from maintaining their own local dev environment and running commands manually rather than using nox.
  • Proposal: time box ourselves to 30 minutes to see if it is trivial to do a "check the hash of pyproject.toml to auto reinstall the environment if needed". If it's not trivial, then tell nox to reinstall every time.
  • proposal: for the more complex stuff like language translation, try to automate this and either remove from nox or note that we don't expect the translation nox functions to be used by humans, they're just a way to help automation.

Standardizing in cd

  • agreed it would be better if we just used nox in the CD as well. Or tox or whatever but I do find it's easier if there's just one way to do things.
  • proposal: use nox liberally in our CI/CD

Theme builder

  • yep the main benefit from this is standardizing asset compilation and the node environment installation, as this was a huge pain before STB existed.
  • and just to reiterate - the point of STB is not to help you build documentation, it is to help you prepare a theme for release. It has some quality of life features like the live server but it's core functionality is compilation of assets and packaging. That's why we often use it for compilation and sphinx-build for building
  • the "live server" preview also has some nice quality of life improvements because it creates the docs fresh in a temp directory
  • but sphinx is so slow at building that TBH I've found the auto reloading to not be that helpful for iteration anyway
  • proposal: maybe we should just use the theme builder only for this compilation step and for the purpose of publishing

Babel

  • regarding translations I'd be happy if those were moved to a GitHub action via a cron job (even if they still used nox). I just use nox to automate common actions in general , so that i don't have to keep looking up how to do them, before automating it at a later moment. So +1 from me

Build tool

  • I also am not sure what would be the benefit of hatch or poetry over our current build step, but I'm open to hearing suggestions if people think there is a clear rationale for them

Decision log

  • I suspect this decision was made in a PR and in it's review. Unless we specify what kinds of decisions must follow a separate process, I suspect almost all of our decisions will continue to happen in issues/PRs and whoever reviews them will be the decider. We can change that for some things if we want to have more formal governance. But it's a balance between making decisions formally and inclusively, and making them efficiently without too much effort.

Finally I suggest that we identify any topics that require ongoing conversation, and create issues for them with dedicated conversation

@drammock
Copy link
Collaborator Author

proposal: use nox liberally in our CI/CD

+1

also worth noting that there's nothing stopping someone from maintaining their own local dev environment and running commands manually rather than using nox.

That ought to be easy, but it is not (cf. #1149)

STB has some quality of life features like the live server but it's core functionality is compilation of assets and packaging [...] but sphinx is so slow at building that TBH I've found the auto reloading to not be that helpful for iteration anyway

ha, I'm used to MNE-Python docs, to me the rebuild of our theme docs feels fast(ish).

regarding translations I'd be happy if those were moved to a GitHub action via a cron job (even if they still used nox)

+1

I suspect almost all of our decisions will continue to happen in issues/PRs and whoever reviews them will be the decider.

I think that is appropriate for the current size of the team. I'm not suggesting an entirely new process; rather I'm suggesting that a bit gets tacked on the end: where someone notices that a decision may be controversial (maybe there is no clear "best" or "right" way) or have wide consequences, and if so they add to the decision log a very short summary of what the decision is, the rationale, and a link to the discussion (PR).

trallard added a commit that referenced this issue May 24, 2024
This PR addresses some of the concerns/issues raised in #1292
@trallard
Copy link
Collaborator

I believe most of the points raised here have been addressed by #1759
The only outstanding refactor/improvement missing is moving translation workflows to tox, so I will close this issue. We can re-open if anyone else disagrees, or we need to bring in more discussions here.

ivanov pushed a commit to ivanov/pydata-sphinx-theme that referenced this issue Jun 5, 2024
This PR addresses some of the concerns/issues raised in pydata#1292
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: maintenance Improving maintainability and reducing technical debt needs: discussion Needs discussion before an implementation can be made
Projects
None yet
Development

No branches or pull requests

4 participants