ESP32-P4 optimizations? (TFMIC-38) #95

Open
nicklasb opened this issue Aug 25, 2024 · 2 comments

nicklasb commented Aug 25, 2024

Hi,

The ESP32-P4 holds a lot of promise; it might even make it possible to run inference on a "real" DL model like YOLO in a respectable time. With that in mind, I have some thoughts on what I would like to see happen with this library:

Memory
The most glaring issue on the S3 right now is that, because ML models don't fit in RAM, the tensor arena ends up in PSRAM in its entirety.
Initially it looked like the MicroAllocator could be used to place the non-persistent part of the tensor arena in SRAM, but it seemed to be protected in some C++ way and I couldn't get that to work earlier. Even now, it appears this must be slowing down inference by several times.
Either way, the memory handling must evolve significantly (there is TCM, after all; don't let PSRAM spoil that).
If this stays the same on the P4, it would nullify most of the benefits, since inference would just wait on PSRAM access all the time and end up maybe 10% faster, which would be sad. A rough sketch of the kind of split I mean is below.
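
Something along these lines, assuming a recent tflite-micro that exposes the MicroAllocator::Create() overload taking separate persistent and non-persistent arenas, plus ESP-IDF's heap_caps_malloc(); the BuildInterpreter helper and the arena sizes are just illustrative, not anything this repo provides:

```cpp
// Hypothetical sketch: keep the persistent part of the tensor arena in PSRAM,
// but place the non-persistent scratch/activation buffers in internal SRAM.
// Assumes the MicroAllocator::Create() overload with separate persistent and
// non-persistent arenas is available in the bundled tflite-micro.
#include "esp_heap_caps.h"
#include "tensorflow/lite/micro/micro_allocator.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

namespace {
constexpr size_t kPersistentArenaSize = 512 * 1024;     // illustrative, can live in PSRAM
constexpr size_t kNonPersistentArenaSize = 128 * 1024;  // illustrative, must fit in internal SRAM
}  // namespace

tflite::MicroInterpreter* BuildInterpreter(
    const tflite::Model* model,
    const tflite::MicroMutableOpResolver<10>& resolver) {
  // Persistent allocations (tensor metadata, bookkeeping) tolerate PSRAM latency.
  uint8_t* persistent_arena = static_cast<uint8_t*>(
      heap_caps_malloc(kPersistentArenaSize, MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT));
  // Non-persistent buffers are touched on every inference, so keep them internal.
  uint8_t* non_persistent_arena = static_cast<uint8_t*>(
      heap_caps_malloc(kNonPersistentArenaSize, MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT));
  if (persistent_arena == nullptr || non_persistent_arena == nullptr) {
    return nullptr;
  }

  tflite::MicroAllocator* allocator = tflite::MicroAllocator::Create(
      persistent_arena, kPersistentArenaSize,
      non_persistent_arena, kNonPersistentArenaSize);

  auto* interpreter = new tflite::MicroInterpreter(model, resolver, allocator);
  if (interpreter->AllocateTensors() != kTfLiteOk) {
    return nullptr;
  }
  return interpreter;
}
```

The point is simply that the buffers touched on every inference should be able to sit in internal SRAM (or TCM on the P4), even when the persistent part has to spill over to PSRAM.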

XAI Extensions
If the memory situation improves, the extensions and the FPU(!) would really start to matter. Are there any plans for custom kernels that exploit them?

That was just my five cents, but I feel it is important for your customers to know a little about what is planned down the line. If the ESP32-P4 made the ESP32 seriously usable for a real YOLO model, for example, it would be a game changer for a lot of use cases where you currently have to involve an RPi 5 or similar instead, which, while less attractive to me from a product development and deployment standpoint, I know will do the job.

github-actions bot changed the title from "ESP32-P4 optimizations?" to "ESP32-P4 optimizations? (TFMIC-38)" on Aug 25, 2024
nicklasb commented Aug 25, 2024

Maybe this should be in the esp-nn repo, but it feels like that one is more about the XAI extensions. :-)

I just saw that this has been answered there. :-)

nicklasb (Author) commented

Closing as this is being discussed in other threads.
