fix typos
Signed-off-by: omahs <[email protected]>
omahs committed Nov 2, 2024
1 parent 35e5bd7 commit 34cab21
Showing 5 changed files with 5 additions and 5 deletions.
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -50,7 +50,7 @@ We use ``black`` as our style guide. To fix your format run `pip install pre-com
1. Methods should be atomic. A method shouldn't be longer than 75 lines, e.g. can be fit into the computer screen without scrolling.
1. If a method has arguments that don't fit into one line, each argument should be in its own line for readability.
1. Add ``__init__.py`` for every folder.
-1. F-strings are prefered to formatted strings.
+1. F-strings are preferred to formatted strings.
1. Loggers are preferred to print. Use the logger from NeMo via ``from nemo.utils import logging``
1. Private functions (functions start with ``_``) shouldn't be called outside its host file.
1. If a comment lasts multiple lines, use ``'''`` instead of ``#``.
2 changes: 1 addition & 1 deletion README.md
@@ -11,7 +11,7 @@ NeMo-Aligner is a scalable toolkit for efficient model alignment. The toolkit ha

The NeMo-Aligner toolkit is built using the NeMo Framework, which enables scalable training across thousands of GPUs using tensor, data, and pipeline parallelism for all alignment components. Additionally, our checkpoints are cross-compatible with the NeMo ecosystem, facilitating inference deployment and further customization (https://github.com/NVIDIA/NeMo-Aligner).

-The toolkit is currently in it's early stages. We are committed to improving the toolkit to make it easier for developers to pick and choose different alignment algorithms to build safe, helpful, and reliable models.
+The toolkit is currently in its early stages. We are committed to improving the toolkit to make it easier for developers to pick and choose different alignment algorithms to build safe, helpful, and reliable models.

## Key Features

2 changes: 1 addition & 1 deletion docs/RLHFTraining.md
@@ -49,5 +49,5 @@ We have many optimizations available for performance that you can enable.
* `model.ppo.length_params.max_length`: This sets the max tokens to generate in rollouts, it defaults to half the sequence length of the model. But if you know your model is not too verbose (e.g. after supervised fine tuning) then you can set it to a lower number.

#### Critic performance optimization hyperparameters
-* `trainer.ppo.combine_rm_and_critic_server`: When enabled, inference requests to the critic server will also return the rewards. This saves the need of having to run a seperate reward model server..
+* `trainer.ppo.combine_rm_and_critic_server`: When enabled, inference requests to the critic server will also return the rewards. This saves the need of having to run a separate reward model server..
* `model.offload_adam_states`: When enabled, offload the distributed adam optimizer states onto CPU during inference. This allows us to save memory during inference for a bigger `trainer.ppo.inference_micro_batch_size`. No effect if the optimizer is not distributed adam.
2 changes: 1 addition & 1 deletion docs/user-guide/rlhf.rst
@@ -407,7 +407,7 @@ We test the scaling of our TRT-LLM integration by running Llama3 70B Actor and L
+------------------+-------------------+-----------------------------+----------------------+--------------------+
.. note::
-for 64x32 config we used a ``rollout_micro_batch_size`` of 16 instead of 8 due to the additional memory from the the distributed optimizer.
+for 64x32 config we used a ``rollout_micro_batch_size`` of 16 instead of 8 due to the additional memory from the distributed optimizer.
We also support running RLHF on Llama3.1 405B Actor and Reward Model. The following numbers are generated with ``num_rollout_samples=128``, ``global_batch_size=128``, reshard turned off, engine offloading set to False.
2 changes: 1 addition & 1 deletion docs/user-guide/steerlm.rst
@@ -395,7 +395,7 @@ Run Inference
web_server=False \
port=1427
-Please wait for the server to be ready before proceeeding.
+Please wait for the server to be ready before proceeding.

#. Create Python helper functions:

