v0.11.0
Fluid
Release v0.11.0 includes a new feature, PaddlePaddle Fluid. Fluid is designed to let users program in the style of PyTorch and TensorFlow Eager Execution. In these systems there is no longer the concept of a model, and applications include neither a symbolic description of a graph of operators nor a sequence of layers. Instead, applications look exactly like a usual program that describes a process of training or inference. The difference between Fluid and PyTorch or Eager Execution is that Fluid doesn't rely on Python's control flow, `if-then-else` or `for`. Instead, Fluid provides its own C++ implementations of these constructs and their Python bindings using the `with` statement. For example:
In v0.11.0, we provide a C++ class `Executor` to run a Fluid program. `Executor` works like an interpreter. In future versions, we will improve `Executor` into a debugger like GDB, and we might provide some compilers which, for example, take an application like the one above and output an equivalent C++ source program. That program can then be compiled using `nvcc` to generate binaries that use CUDA, or using `icc` to generate binaries that make full use of Intel CPUs.
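The idea of recording control flow through `with` and then interpreting the recorded program can be sketched in plain Python. This is a toy illustration only: `Program`, `While`, and `ToyExecutor` are hypothetical names invented for this sketch, not the Fluid API, and the real `Executor` is a C++ class. The sketch assumes operations can be stored as Python callables, which a real system would replace with serializable operator descriptions.

```python
# Toy illustration only: Program, While, and ToyExecutor are hypothetical
# names invented for this sketch. They are not part of the Fluid API.


class Program:
    """Records operations instead of running them immediately."""

    def __init__(self):
        self.ops = []
        self._stack = [self.ops]  # innermost block currently being recorded

    def append(self, op):
        self._stack[-1].append(op)

    def assign(self, name, fn):
        # Record "name = fn(env)"; nothing is computed at this point.
        self.append(("assign", name, fn))

    def while_(self, cond):
        return While(self, cond)


class While:
    """Context manager: ops recorded inside `with` become the loop body."""

    def __init__(self, program, cond):
        self.program = program
        self.cond = cond
        self.body = []

    def __enter__(self):
        self.program._stack.append(self.body)
        return self

    def __exit__(self, exc_type, exc, tb):
        self.program._stack.pop()
        if exc_type is None:
            # Attach the finished loop to the enclosing block.
            self.program.append(("while", self.cond, self.body))
        return False


class ToyExecutor:
    """Interprets a recorded Program one operation at a time."""

    def run(self, program, env=None):
        env = dict(env or {})
        self._run_block(program.ops, env)
        return env

    def _run_block(self, ops, env):
        for op in ops:
            if op[0] == "assign":
                _, name, fn = op
                env[name] = fn(env)
            elif op[0] == "while":
                _, cond, body = op
                while cond(env):
                    self._run_block(body, env)


# Build the program. The `with` body runs once, only to record the loop body.
prog = Program()
prog.assign("i", lambda env: 0)
prog.assign("total", lambda env: 0)
with prog.while_(lambda env: env["i"] < 5):
    prog.assign("total", lambda env: env["total"] + env["i"])
    prog.assign("i", lambda env: env["i"] + 1)

# The loop itself runs only here, inside the interpreter.
result = ToyExecutor().run(prog)
print(result["total"])  # 0 + 1 + 2 + 3 + 4 = 10
```

Because the loop is stored as data rather than executed by Python, the same recorded program could in principle be handed to a different backend, which is the property the compiler plans above depend on.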
New Features
- Release `Fluid`.
- Add C-API for model inference.
- Use the Fluid API to create a simple GAN demo.
- Add a developer guide on performance tuning.
- Add retry when downloading `paddle.v2.dataset`.
- Link protobuf-lite instead of protobuf in C++ to reduce the binary size.
- Release the Elastic Deep Learning (EDL) feature.
- Add new-style CMake functions for Paddle, based on the Bazel API.
- Automatically download and compile with the Intel® MKLML library as CBLAS when building with `WITH_MKL=ON`.
- Intel® MKL-DNN on PaddlePaddle:
  - Complete 11 MKL-DNN layers: Convolution, Fully Connected, Pooling, ReLU, Tanh, ELU, Softmax, BatchNorm, AddTo, Concat, LRN.
  - Complete 3 MKL-DNN networks: VGG-19, ResNet-50, GoogleNet.
  - Benchmark on an Intel Skylake 6148 CPU: 2~3x training speedup compared with MKLML.
- Add the `softsign` activation.
- Add the dot product layer.
- Add the L2 distance layer.
- Add the sub-nested sequence layer.
- Add the kmax sequence score layer.
- Add the sequence slice layer.
- Add the row convolution layer.
- Add mobile-friendly webpages.
Improvements
- Build and install using a single `whl` package.
- Support custom evaluation in the V2 API.
- Change `PADDLE_ONLY_CPU` to `PADDLE_WITH_GPU`, since we will support many kinds of devices.
- Remove buggy BarrierStat.
- Clean up and remove unused functions in paddle::Parameter.
- Remove ProtoDataProvider.
- Huber loss supports both regression and classification.
- Add the `stride` parameter for sequence pooling layers.
- Enable the v2 API to use cuDNN batch normalization automatically.
- The BN layer's parameters can be shared by fixing the parameter name.
- Support variable-dimension input features for the 2D convolution operation.
- Refine the CMake CUDA configuration to automatically detect the GPU architecture.
- Improved website navigation.