Extending Pose Estimation Model for 3D Objects: Customization and Challenges #111

Open
manojs8473 opened this issue Oct 11, 2023 · 1 comment
Labels
good first issue Good for newcomers question Further information is requested

Comments

@manojs8473

Hello!

First of all, thank you for delivering this incredible work!
I'm interested in customizing the current model to estimate the 3D pose of objects like a baseball bat or tennis racket in the hands of the actor in addition to the 3D pose of the human body, which the model already does successfully. I have a few questions and doubts regarding this task:

Customizing Skeleton Hierarchy: Is it possible to customize the current skeleton hierarchy and add new bones or edges to represent the bat or racket? I assume this would be necessary to include these objects in the pose estimation.

Architectural Changes: What sort of changes will be required in the architecture of the model to accommodate the estimation of 3D pose for objects? Are there any specific layers or components that need to be modified or added?

Training Data Volume: Could you provide insights into the volume of data that the model would require for training to achieve good accuracy in estimating the 3D pose of both the human body and objects like baseball bats and tennis rackets?

Your comments and suggestions on how to approach this customization would be immensely appreciated. Thank you!

@AmmarkoV AmmarkoV self-assigned this Dec 8, 2023
@AmmarkoV AmmarkoV added good first issue Good for newcomers question Further information is requested labels Dec 8, 2023
@AmmarkoV
Collaborator

AmmarkoV commented Dec 8, 2023

Hello!
Thank you for your kind words!

Sorry about the delay in responding. I am currently writing my PhD thesis, and over the last few months I have been abroad for almost two months for project meetings, conferences, and a secondment in Italy, so I was not logged in to GitHub and did not see the issues. I received the 2FA warning, logged in some time later, and saw the issue today! :(

A lot of excellent questions!
First of all, for object 3D pose you will need to train an RGB -> 2D heatmap estimator that produces 2D "joint" data for the objects of your choice.

For a tennis racket, for example, 5 points: the handle, the top of the racket, the two sides, and its center.
For a baseball bat, 3 points: the handle, the top of the bat, and its middle, etc.
Although models such as SAM and Mask R-CNN can now automatically segment the racket, bat, etc., you will still need some landmarks to incorporate them into the 3D pose solution.
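
To make the landmark idea concrete, here is a minimal sketch of turning the output of such a heatmap estimator into 2D "joints". It assumes one heatmap per landmark in a `(K, H, W)` NumPy array. The landmark names, the `heatmaps_to_keypoints` helper, and the confidence threshold are illustrative assumptions, not part of MocapNET.

```python
import numpy as np

# Hypothetical landmark names; only the counts and rough locations come from the thread.
RACKET_LANDMARKS = ["handle", "top", "left_side", "right_side", "center"]
BAT_LANDMARKS = ["handle", "middle", "top"]


def heatmaps_to_keypoints(heatmaps, threshold=0.1):
    """Convert (K, H, W) heatmaps into a (K, 3) array of (x, y, confidence).

    x and y are normalized to [0, 1]. Landmarks whose heatmap peak is below
    `threshold` are left as zeros, meaning "not detected".
    """
    num_joints, height, width = heatmaps.shape
    keypoints = np.zeros((num_joints, 3), dtype=np.float32)
    for k in range(num_joints):
        # Location of the strongest response in heatmap k.
        flat_index = int(np.argmax(heatmaps[k]))
        row, col = divmod(flat_index, width)
        confidence = float(heatmaps[k, row, col])
        if confidence >= threshold:
            x = col / max(width - 1, 1)
            y = row / max(height - 1, 1)
            keypoints[k] = (x, y, confidence)
    return keypoints
```

The resulting rows can be appended to the 2D body joints, so the tool is treated as a few extra joints of the skeleton.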

You can easily extend the BVH file to accommodate extra geometry.
If you look at https://github.com/FORTH-ModelBasedTracker/MocapNET/blob/master/dataset/headerWithHeadAndOneMotion.bvh
and at
http://www.dcs.shef.ac.uk/intranet/research/public/resmes/CS0111.pdf
I think you can easily extend the BVH armature with such a shape.
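
As a rough sketch of what such an extension could look like, the helper below generates the BVH `HIERARCHY` text for a racket subtree that could be placed inside a hand joint. The joint names, the offsets (in made-up centimetres), and the `Zrotation Xrotation Yrotation` channel order are assumptions for illustration; they should be matched to the conventions of the BVH file you are extending.

```python
def bvh_joint(name, offset, children=(), depth=0):
    """Return BVH HIERARCHY text for one joint and all of its children.

    `children` is a sequence of (name, offset, children) tuples.
    A joint without children is closed with an End Site, as BVH expects.
    """
    pad = "  " * depth
    lines = [
        f"{pad}JOINT {name}",
        f"{pad}{{",
        f"{pad}  OFFSET {offset[0]:.2f} {offset[1]:.2f} {offset[2]:.2f}",
        # Assumed channel order; use the order of the target BVH file.
        f"{pad}  CHANNELS 3 Zrotation Xrotation Yrotation",
    ]
    if children:
        for child_name, child_offset, grandchildren in children:
            lines.append(bvh_joint(child_name, child_offset, grandchildren, depth + 1))
    else:
        lines += [
            f"{pad}  End Site",
            f"{pad}  {{",
            f"{pad}    OFFSET 0.00 0.00 0.00",
            f"{pad}  }}",
        ]
    lines.append(f"{pad}}}")
    return "\n".join(lines)


# Hypothetical racket: handle -> center -> (top, left side, right side).
racket_subtree = bvh_joint(
    "racketHandle", (0, 0, 0),
    children=[
        ("racketCenter", (0, 45, 0), [
            ("racketTop", (0, 15, 0), []),
            ("racketLeftSide", (-12, 0, 0), []),
            ("racketRightSide", (12, 0, 0), []),
        ]),
    ],
)
print(racket_subtree)
```

Every added joint also adds channels, so the `MOTION` section of the file needs matching extra columns per frame.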

In terms of the MocapNET model, you will need to add the new "joints" of the racket/bat to the NSRM matrices.
The description of how to build the descriptor is here: http://users.ics.forth.gr/~argyros/mypapers/2021_11_BMVC_Qammaz.pdf . In my opinion the architecture could remain the same; it should scale to one more joint with no problems.
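
To illustrate why the architecture only has to cope with a larger input, here is a simplified pairwise descriptor over 2D joints. This is not the NSRM formulation from the paper (please follow the paper for that); it is a stand-in that only shows how a pairwise matrix grows from N×N to (N+K)×(N+K) when K tool landmarks are appended. The function name and the sample coordinates are made up.

```python
import numpy as np


def pairwise_angle_matrix(points_2d):
    """Map (N, 2) points to an (N, N) matrix of pairwise directions.

    Entry [i, j] is the angle of the vector from point i to point j,
    divided by pi so that values lie in [-1, 1]. The diagonal is 0.
    Simplified stand-in for illustration, not the published NSRM descriptor.
    """
    points = np.asarray(points_2d, dtype=np.float64)
    delta = points[None, :, :] - points[:, None, :]  # delta[i, j] = p_j - p_i
    angles = np.arctan2(delta[..., 1], delta[..., 0])
    return angles / np.pi


# Made-up normalized 2D coordinates: 4 body joints and 5 racket landmarks.
body = np.array([[0.50, 0.20], [0.50, 0.50], [0.40, 0.80], [0.60, 0.80]])
racket = np.array([[0.70, 0.50], [0.90, 0.30], [0.80, 0.35], [0.85, 0.45], [0.82, 0.40]])

descriptor_body = pairwise_angle_matrix(body)
descriptor_full = pairwise_angle_matrix(np.vstack([body, racket]))
print(descriptor_body.shape, descriptor_full.shape)
```

The body-only block of the larger matrix is unchanged, so the existing input is a subset of the extended one and only the first layer of the network sees a different input size.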

MocapNET is typically trained on 3M pose samples. Given a BVH source like the one I use
https://drive.google.com/file/d/1Zt-MycqhMylfBUqgmW9sLBclNNxoNGqV/view?usp=drive_link
you will need to write a program that goes through each BVH file and adds the extra joints for your "tool", be that a racket, hammer, baseball bat, etc. You will then have a dataset with enough samples.

Unfortunately FORTH, which is the license holder for this work, prevents me from sharing the training code for the network. However, I think that with the Python code shared here: https://github.com/FORTH-ModelBasedTracker/MocapNET/tree/mnet4/src/python/mnet4
plus a little ChatGPT help :D for the missing parts, you can be successful in implementing what you propose!
