Releases: tony-framework/TonY
new TonY release
v0.3.33 move System.exit(exitCode) out of try block so that finally block can…
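For context on why the placement of `System.exit` matters: in Java, a call to `System.exit` inside a `try` block halts the JVM before the matching `finally` block runs. The sketch below illustrates the general pattern only and is not TonY's actual code; the class name `ExitAfterFinally` and the methods `doWork` and `run` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ExitAfterFinally {
    // Records the order of events so the behavior can be inspected.
    static final List<String> events = new ArrayList<>();

    // Hypothetical stand-in for the real work; returns a process exit code.
    static int doWork() {
        events.add("work");
        return 0;
    }

    // Runs the work and the cleanup, then returns the exit code instead of exiting.
    static int run() {
        int exitCode = 1; // assume failure until the work completes
        try {
            exitCode = doWork();
            // Calling System.exit(exitCode) here would halt the JVM
            // before the finally block below gets a chance to run.
        } finally {
            events.add("cleanup");
        }
        return exitCode;
    }

    public static void main(String[] args) {
        int exitCode = run();
        // Exit only after the try/finally has completed.
        System.exit(exitCode);
    }
}
```

Keeping the exit call in `main`, outside `run`, also makes the cleanup order easy to check without terminating the process.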
Fix TensorBoard port conflict
version bump (#462) Co-authored-by: Cheng Ren
Enable port reuse option
version upgrade (#460) Co-authored-by: Cheng Ren
Bug fix: Tony source archive extraction during parallel builds
- Fixed archive extraction issues with parallel workflows by using unique naming (#455)
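One common way to avoid collisions when several builds extract the same archive at once is to give each extraction its own uniquely named directory. The sketch below shows that idea with the JDK's `Files.createTempDirectory`; it is an assumption about the general technique, not the change made in #455, and the class name `UniqueExtractDir`, the method `newExtractDir`, and the `src-archive-` prefix are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class UniqueExtractDir {
    // Creates a fresh directory on every call. The JDK generates a unique
    // name starting with the given prefix, so concurrent callers do not
    // end up sharing an extraction target.
    static Path newExtractDir(Path parent, String prefix) throws IOException {
        Files.createDirectories(parent);
        return Files.createTempDirectory(parent, prefix);
    }

    public static void main(String[] args) throws IOException {
        Path parent = Paths.get(System.getProperty("java.io.tmpdir"), "extract-demo");
        Path first = newExtractDir(parent, "src-archive-");
        Path second = newExtractDir(parent, "src-archive-");
        // The two paths differ even though the prefix is the same.
        System.out.println(first);
        System.out.println(second);
    }
}
```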
Bug Fixes
Fixes the bug where a retry fails if the previous failure was due to a heartbeat failure.
Miscellaneous Fixes
Changelog:
- Add GitHub workflow to auto-release to Maven Central on tags (#439)
- Add GitHub Actions-based build workflow (#438)
- Fix retry in case of untracked task failures (#436)
- Update README.md (#433)
- Add job_id env in container (#431)
- Added fix to report training as successful even when some worker jobs have failed (#428)
- Exposing container diagnostic information for failed tasks in the tonyhistory files (#429)
- Fixing AM RPC port binding issue (#427)
- Support to specify application type (#426)
- Enable debugging with TonY AM (#425)
- Exposing container logs URL in TonY Portal (#424)
- Refine README.md about TonY in gcp (#423)
- Add support for CUDA 10.1, add generic CUDA script and update documen… (#419)
- Update install_gpu_cu10.sh (#417)
- Change default allocation timeout to -1 to avoid incompatible behavior. Bump version (#416)
- Add wait timeout for allocated container registration (#413)
- Add option to stop the application on failure of certain jobs (#414)
- Handle untracked job type crash by failing the application for #405 (#406)
- TonY should be able to launch jobs using dependency graph (#403)
Identifies Killed Apps in TonY Portal
See #399 for more details
MXNet support, scalability improvements and bug fixes
- Scalability improvements to support 200+ concurrent executors.
- UX fix to print out the AM log URL and logging
- Support MXNet
- Fix port race condition problem
Adds Sorting/Filtering to TonY Portal
See #354 for more details.
Released on Maven Central.
Made fetching final task status more robust
See #350 for more details. Released on Maven Central.