Please add to quickstart "how to use" section #3
Comments
Hello Israel-Laguan, I can briefly try to explain the dilemma I'm in with gppm. When I started a few months ago, I hit on it at exactly the right time: there was no ready-made solution to get the power consumption of the P40 under control, but sasha0552 had just released nvidia-pstate, on which gppm is essentially based. I wanted a solution for very specific scenarios: enabling cheap computers with as many P40s as possible (they were very cheap at the time) to run a whole bunch of llama.cpp instances at the same time. The aim was to make this usable for teams, very small companies (like a handful of people or so), or agentic setups where inference runs almost continuously and there is little idle time.

When I introduced gppm on Reddit, many P40 users adopted it immediately, although they actually only needed much simpler functionality. This is exactly what sasha0552 recognized, and he published nvidia-pstated. nvidia-pstated does something I had already done here: a29a3ea. You observe whether the GPU is working (or wants to work) or not and then switch the performance state accordingly. I even experimented with switching the performance state very frequently, even between individual tokens. However, the whole thing didn't meet with much interest, which is why I didn't develop it any further. When there was no more feedback from users, I assumed that most of them had switched to nvidia-pstated in the meantime. For users with a small number of GPUs, or where the GPU does nothing 99% of the time, it probably makes sense because of its low complexity.

From that point on, I developed gppm only according to my personal needs. This has 1. led to the current architecture, which I find somewhat suboptimal, and 2. meant that I have put little effort into the documentation and have left half-finished things like the GUI in place without developing them further.
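The switching mechanism described above (observe whether the GPU is working, then pick a performance state) can be sketched roughly as follows. This is a minimal illustration, not gppm's or nvidia-pstated's actual code: the thresholds, the P-state numbers, and the `read_util`/`apply_pstate` callbacks are all assumptions. In practice `read_util` could come from NVML (e.g. pynvml's `nvmlDeviceGetUtilizationRates`) and `apply_pstate` could call into nvidia-pstate.

```python
import time

# Illustrative P-state values (assumptions, not taken from gppm):
HIGH_PSTATE = 0   # P0: full performance while the GPU has work
LOW_PSTATE = 8    # P8: low-power state while idle

def desired_pstate(gpu_util_percent: int) -> int:
    """Pick a performance state from the current GPU utilization."""
    return HIGH_PSTATE if gpu_util_percent > 0 else LOW_PSTATE

def monitor_loop(read_util, apply_pstate, interval_s: float = 0.5):
    """Poll utilization and switch the P-state only when it changes.

    read_util:    callable returning GPU utilization in percent
    apply_pstate: callable that sets the given P-state on the GPU
    """
    current = None
    while True:
        target = desired_pstate(read_util())
        if target != current:
            apply_pstate(target)
            current = target
        time.sleep(interval_s)
```

Polling like this is what makes very frequent switching (even between individual tokens) possible in principle, at the cost of a busy monitoring loop.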
But I'll be happy to make up for that if it helps; thanks again for the feedback. As far as the architecture of gppm is concerned, I can briefly outline the roadmap. Originally, gppm stood for "GPU Power and Performance Manager". However, I now also use it to launch and manage my entire stack, which is why I've renamed it: gppm is now a recursive acronym and stands for "gppm power process manager", just like GNU stands for "GNU's Not Unix". I don't like that so much, because I prefer a concrete tool for a concrete purpose. That's why I'm going to split this into two projects in the future, which can be used in combination or individually.
I think your solution is solid for the time being. Maybe you can try to leave the project without rough edges and then move on. The project is good; it just needs some polish.
First of all thanks for the great work!
Context
I was trying to use it with my new build (2 P40s on Ubuntu 24.04 / Pop!_OS), and it seems to run. I had to check the code and found an endpoint with some info:
Great, but I also found some errors. The endpoint named /gui was also not working.
Request
Besides the existing quickstart, it would be nice to have a couple of ways to help users use the program. Some things that I, as a user, would find useful:
A reference of available endpoints and examples of use
An example of using one single P40, guiding up to calling the llama.cpp instance with a query like "how many squares are on a chessboard?". I notice this is partially mentioned already; it would just be nice to have a "now query llama.cpp with a prompt" section or similar.
An example of using 2 or more P40s and maybe changing a config, then inspecting the change.
Just in case it is needed, how to disable/uninstall.
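For the "now query llama.cpp with a prompt" step requested above, a minimal sketch could look like this. It assumes a llama.cpp server instance is already running on its default address (127.0.0.1:8080); the `SERVER` address and the helper names are illustrative, and the payload targets llama.cpp's `/completion` endpoint.

```python
import json
from urllib import request

# Default address of a local llama.cpp server; adjust if gppm launches
# the instance elsewhere (assumption for this example).
SERVER = "http://127.0.0.1:8080"

def build_completion_request(prompt: str, n_predict: int = 64) -> dict:
    """JSON payload for llama.cpp's /completion endpoint."""
    return {"prompt": prompt, "n_predict": n_predict}

def ask(prompt: str) -> str:
    """POST the prompt to the server and return the generated text."""
    payload = json.dumps(build_completion_request(prompt)).encode()
    req = request.Request(
        f"{SERVER}/completion",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]

# Usage (with a llama.cpp server running):
#   print(ask("How many squares are on a chessboard?"))
```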
Other considerations
It worked fine. I don't think this is needed, but I'll share my PC specs:
CPU: AMD Ryzen 3 3200G
RAM: 8 GB (3200 MHz)
GPU(s): 2 Nvidia P40
OS: Pop!_OS 24.04