Release 21.09 (September 30th 2021)

@jiazhihao jiazhihao released this 06 Oct 14:57

Frontend Support

  • PyBind11 is now the default Python frontend in FlexFlow.

Control Replication

Distributed training

  • FlexFlow now uses NCCL AllReduce for gradient synchronization by default. To switch to the distributed parameter server, set FF_USE_NCCL=OFF in cmake.
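
As a rough illustration of the option above, the flag can be passed as a cache entry when configuring the build. This is only a sketch: it assumes an out-of-source build directory named build at the top of the FlexFlow checkout, which is a common layout but is not something these notes specify. Any other options your build needs are omitted.

```shell
# Assumption: run from the top of the FlexFlow source tree,
# using a separate build directory named "build".
mkdir -p build
cd build

# Disable NCCL so gradients are synchronized through the
# distributed parameter server instead of NCCL AllReduce.
cmake -DFF_USE_NCCL=OFF ..
```

Re-running cmake with -DFF_USE_NCCL=ON (or removing the cached entry) should restore the NCCL default.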

Distributed inference

  • Passing comp_mode = CompMode::INFERENCE as an additional argument to model.compile runs a DNN model in inference mode.
  • Various bug fixes and performance improvements for distributed inference in FlexFlow.

Operators

  • New operators include AggregateSpec and Multi-Head Attention.

Machine Model

  • FlexFlow now supports a new machine model that models network topology more precisely and simulates traffic at the granularity of individual packets.