Using stored checkpoint in a java program #141
Comments
That would be nontrivial. Checkpoints are stored using Torch serialization, so you'd either have to implement a decoder in Java or write checkpoints from Lua in some other format. You'd also need to implement the forward pass in Java for whatever Torch modules your network contains, and these would need to be binary compatible with the Torch implementations. Can I ask what use case you have in mind? There may be an easier way.
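To give a sense of what reimplementing the forward pass would involve, here is a minimal Java sketch of one fully connected layer followed by a softmax and a categorical draw. It assumes the weights have already been exported from Lua into plain arrays; the class and method names are hypothetical and are not part of Torch or this project. A real network would also need every recurrent module ported and checked against the Torch outputs.

```java
import java.util.Random;

// Hypothetical sketch: these names are illustrative, not a Torch or project API.
final class OutputLayerSketch {
    // y = W x + b, with W stored as [rows][cols].
    static double[] linear(double[][] w, double[] b, double[] x) {
        double[] y = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            double sum = b[i];
            for (int j = 0; j < x.length; j++) {
                sum += w[i][j] * x[j];
            }
            y[i] = sum;
        }
        return y;
    }

    // Numerically stable softmax: subtract the maximum before exponentiating.
    static double[] softmax(double[] logits) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : logits) {
            max = Math.max(max, v);
        }
        double[] p = new double[logits.length];
        double total = 0.0;
        for (int i = 0; i < logits.length; i++) {
            p[i] = Math.exp(logits[i] - max);
            total += p[i];
        }
        for (int i = 0; i < p.length; i++) {
            p[i] /= total;
        }
        return p;
    }

    // Draw an index from the categorical distribution p.
    static int sample(double[] p, Random rng) {
        double r = rng.nextDouble();
        double cumulative = 0.0;
        for (int i = 0; i < p.length; i++) {
            cumulative += p[i];
            if (r < cumulative) {
                return i;
            }
        }
        return p.length - 1; // guard against floating-point rounding
    }
}
```

Sampling one character would then chain the three calls, for example `OutputLayerSketch.sample(OutputLayerSketch.softmax(OutputLayerSketch.linear(w, b, hidden)), rng)`, where `hidden` stands for the output of the ported recurrent layers.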
Yeah, sure.
If you are mainly concerned about performance, then rewriting in Java does not seem like a good solution; it would be much simpler to optimize the existing Lua sampling. There is certainly low-hanging fruit for optimization, such as pull request #138. Beyond that, sampling speed is fundamentally limited by the model itself: generating each character requires some large matrix multiplies. You should make sure that your BLAS implementation is properly set up. As a last resort you can also try training smaller models; depending on the dataset and training, you may find that smaller models perform just as well as larger ones.
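As a rough illustration of why model size dominates sampling speed, the Java sketch below counts the multiply-adds in the matrix products of one recurrent layer for a single generated character. It assumes an LSTM-style layer with four gates, which this thread does not specify, and it ignores biases and elementwise operations; the class name and the layer sizes are hypothetical.

```java
// Hypothetical sizing helper; the layer shape is an assumption, not taken from the project.
final class SamplingCostSketch {
    // An LSTM-style layer computes 4 gate pre-activations per time step, each a
    // (hiddenSize x (inputSize + hiddenSize)) matrix-vector product.
    static long multiplyAddsPerStep(int inputSize, int hiddenSize) {
        return 4L * hiddenSize * (inputSize + hiddenSize);
    }

    public static void main(String[] args) {
        long large = multiplyAddsPerStep(512, 512); // assumed "large" layer
        long small = multiplyAddsPerStep(128, 128); // assumed "small" layer
        System.out.println("large: " + large + ", small: " + small
                + ", ratio: " + (large / small));
    }
}
```

Under these assumptions, shrinking both sizes from 512 to 128 cuts the per-character matrix work by a factor of 16, since the cost grows with the product of the two dimensions.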
Yeah, I'll try training some smaller models too and see whether they perform just as well while sampling faster.
Can the stored checkpoints be loaded by a Java program so that sampling can be done in Java itself? If so, could someone please share some ideas on how to do this?