
Merge pull request #12 from fszewczyk/doxygen
Doxygen
fszewczyk authored Nov 8, 2023
2 parents 0bff0af + 93a06a2 commit 7c09af8
Showing 7 changed files with 406 additions and 51 deletions.
18 changes: 18 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,18 @@
name: Documentation

on:
  push:
    branches:
      - master
      - doxygen

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: DenverCoder1/[email protected]
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          branch: gh-pages
          folder: docs/html
          config_file: Doxyfile
9 changes: 6 additions & 3 deletions docs/Doxyfile → Doxyfile
@@ -42,7 +42,7 @@ DOXYFILE_ENCODING = UTF-8
# title of most generated pages and in a few other places.
# The default value is: My Project.

PROJECT_NAME = "Shkyera Tensor"
PROJECT_NAME = "Shkyera Grad"

# The PROJECT_NUMBER tag can be used to enter a project or revision number. This
# could be handy for archiving the generated documentation or if some version
@@ -54,7 +54,7 @@ PROJECT_NUMBER = 0.0.1
# for a project that appears at the top of each page and should give viewer a
# quick idea about the purpose of the project. Keep the description short.

PROJECT_BRIEF = "Header-only C++ library for Deep Learning"
PROJECT_BRIEF = "micrograd, but in C++ and better"

# With the PROJECT_LOGO tag one can specify a logo or an icon that is included
# in the documentation. The maximum height of the logo should not exceed 55
@@ -918,7 +918,10 @@ WARN_LOGFILE =
# Note: If this tag is empty the current directory is searched.

INPUT = "README.md" \
"include/src"
"docs/tutorials/Cheatsheet.md" \
"docs/tutorials/GetStarted.md" \
"examples/README.md" \
"include/src" \

# This tag can be used to specify the character encoding of the source files
# that doxygen parses. Internally doxygen uses the UTF-8 encoding. Doxygen uses
58 changes: 33 additions & 25 deletions README.md
@@ -7,6 +7,7 @@ micrograd, but in C++ and better.
</i>
<p></p>

+ [![Documentation](https://github.com/fszewczyk/shkyera-grad/actions/workflows/docs.yml/badge.svg)](https://fszewczyk.github.io/shkyera-grad/index.html)
[![LinuxBuild](https://github.com/fszewczyk/shkyera-grad/actions/workflows/linux.yml/badge.svg)](https://github.com/fszewczyk/shkyera-grad/actions/workflows/linux.yml)
[![MacOSBuild](https://github.com/fszewczyk/shkyera-grad/actions/workflows/macos.yml/badge.svg)](https://github.com/fszewczyk/shkyera-grad/actions/workflows/macos.yml)
[![WindowsBuild](https://github.com/fszewczyk/shkyera-grad/actions/workflows/windows.yml/badge.svg)](https://github.com/fszewczyk/shkyera-grad/actions/workflows/windows.yml)
@@ -16,21 +17,22 @@ micrograd, but in C++ and better.

This is a small header-only library of a scalar-valued autograd based on [Andrej Karpathy's micrograd](https://github.com/karpathy/micrograd). It provides a high-level, PyTorch-like API for creating and training simple neural networks.

+ It supports multiple optimizers, such as Adam or SGD, all the most common activation functions, and the basic types of neural layers. All of it is wrapped in a simple, header-only library.

## Usage

- Make sure your compiler supports C++17. Shkyera Grad is a header-only library, so the only thing you need to do is to include it in your project.
+ Check out our [Get Started Guide](https://fszewczyk.github.io/shkyera-grad/md_docs_tutorials_GetStarted.html) to learn the basics of _Shkyera Engine_.

- ```cpp
- #include "include/ShkyeraGrad.hpp"
- ```
+ ## Showcase

- Check out the [examples](examples/README.md) for a quick start on Shkyera Grad. In the meantime, here's a neural network that learns the XOR function.
+ Here's a small example showcasing a feed-forward network learning the XOR function. Check out the `examples/` folder for more examples.

```cpp
#include "include/ShkyeraGrad.hpp"
#include "shkyera-grad/include/ShkyeraGrad.hpp"

int main() {
using namespace shkyera;
using T = Type::float32;

std::vector<Vec32> xs;
std::vector<Vec32> ys;
Expand All @@ -41,32 +43,38 @@ int main() {
xs.push_back(Vec32::of({0, 1})); ys.push_back(Vec32::of({1}));
xs.push_back(Vec32::of({1, 1})); ys.push_back(Vec32::of({0}));

auto mlp = SequentialBuilder<Type::float32>::begin()
.add(Linear32::create(2, 15))
.add(ReLU32::create())
.add(Dropout32::create(15, 5, 0.2))
.add(ReLU32::create())
.add(Linear32::create(5, 1))
.add(Sigmoid32::create())
.build();
auto network = SequentialBuilder<Type::float32>::begin()
.add(Linear32::create(2, 15))
.add(ReLU32::create())
.add(Linear32::create(15, 5))
.add(ReLU32::create())
.add(Linear32::create(5, 1))
.add(Sigmoid32::create())
.build();

Optimizer32 optimizer = Optimizer<Type::float32>(mlp->parameters(), 0.1);
Loss::Function32 lossFunction = Loss::MSE<Type::float32>;

// ------ TRAINING THE NETWORK ------- //
for (size_t epoch = 0; epoch < 100; epoch++) {
auto optimizer = Adam32(network->parameters(), 0.05);
auto lossFunction = Loss::MSE<T>;

for (size_t epoch = 0; epoch < 100; epoch++) { // We train for 100 epochs
auto epochLoss = Val32::create(0);

optimizer.reset();
for (size_t sample = 0; sample < xs.size(); ++sample) {
Vec32 pred = mlp->forward(xs[sample]);
auto loss = lossFunction(pred, ys[sample]);
optimizer.reset(); // Reset the gradients
for (size_t sample = 0; sample < xs.size(); ++sample) { // We go through each sample
Vec32 pred = network->forward(xs[sample]); // We get some prediction
auto loss = lossFunction(pred, ys[sample]); // And calculate its error

epochLoss = epochLoss + loss;
epochLoss = epochLoss + loss; // Store the loss for feedback
}
optimizer.step();
optimizer.step(); // Update the parameters

auto averageLoss = epochLoss / Val32::create(xs.size());
std::cout << "Epoch: " << epoch + 1 << " Loss: " << averageLoss->getValue() << std::endl;
}

std::cout << "Epoch: " << epoch + 1 << " Loss: " << epochLoss->getValue() << std::endl;
for (size_t sample = 0; sample < xs.size(); ++sample) { // Go through each example
Vec32 pred = network->forward(xs[sample]); // Predict result
std::cout << xs[sample] << " -> " << pred[0] << "\t| True: " << ys[sample][0] << std::endl;
}
}
```
73 changes: 73 additions & 0 deletions docs/tutorials/Cheatsheet.md
@@ -0,0 +1,73 @@
# Cheatsheet

This page contains all the info you need to develop your models using Shkyera Grad.

## Types

Almost all of the classes in _Shkyera Grad_ are implemented using templates. To simplify creation of these objects, we introduced a standard way to instantiate objects with floating-point template parameters, i.e.

```cpp
Linear32 = Linear<float>
Optimizer32 = Optimizer<Type::float32>
Loss::MSE64 = Loss::MSE<double>
Adam64 = Adam<Type::f64>

{Class}32 = {Class}<Type::float32> = {Class}<float>
{Class}64 = {Class}<Type::float64> = {Class}<double>
```
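
For example, `Val32` used in the README showcase corresponds to `Value<Type::float32>`, so the two lines below should create the same scalar (a minimal sketch; the value 2.0 is arbitrary):

```cpp
auto a = Val32::create(2.0);                 // short alias
auto b = Value<Type::float32>::create(2.0);  // equivalent explicit template form
```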
## Layers
Here's a full list of available layers:
```cpp
auto linear = Linear32::create(inputSize, outputSize);
auto dropout = Dropout32::create(inputSize, outputSize, dropoutRate);
```
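
Layers are chained into a network with `SequentialBuilder`, as in the README showcase. Here is a minimal sketch combining both layer types (the sizes and the dropout rate are arbitrary):

```cpp
auto network = SequentialBuilder<Type::float32>::begin()
                   .add(Linear32::create(2, 15))
                   .add(ReLU32::create())
                   .add(Dropout32::create(15, 5, 0.2))
                   .add(ReLU32::create())
                   .add(Linear32::create(5, 1))
                   .add(Sigmoid32::create())
                   .build();
```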

## Optimizers

These are all implemented optimizers:

```cpp
auto simple = Optimizer32(network->parameters(), learningRate);
auto sgdWithMomentum = SGD32(network->parameters(), learningRate, momentum);      // momentum defaults to 0.9
auto adam = Adam32(network->parameters(), learningRate, beta1, beta2, epsilon);   // defaults: beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8
```
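
All of them are constructed from `network->parameters()` and expose the same `reset()`/`step()` pair used in the training loop below, so they should be interchangeable. A rough sketch (the learning rate and momentum values are arbitrary):

```cpp
auto optimizer = SGD32(network->parameters(), 0.1, 0.9); // e.g. instead of Adam32

optimizer.reset();   // reset the gradients
// ... forward passes and loss computation for each sample ...
optimizer.step();    // update the parameters
```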

## Loss functions

Optimization can be performed with any of these predefined loss functions:

```cpp
auto L1 = Loss::MAE32;
auto L2 = Loss::MSE32;
auto crossEntropy = Loss::CrossEntropy32;
```
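
Each of these takes a prediction vector and a target vector, as in the training loop below. A minimal sketch:

```cpp
auto lossFunction = Loss::MSE32;             // pick any of the losses above
Vec32 pred = network->forward(xs[sample]);   // prediction for one sample
auto loss = lossFunction(pred, ys[sample]);  // error of that prediction
```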

## Generic Training Loop

Simply copy-paste this code to quickly train your network:

```cpp
using T = Type::float32; // feel free to change it to float64

auto optimizer = Adam<T>(network->parameters(), 0.05);
auto lossFunction = Loss::MSE<T>;

for (size_t epoch = 0; epoch < 100; epoch++) {
    auto epochLoss = Value<T>::create(0);

    optimizer.reset();
    for (size_t sample = 0; sample < xs.size(); ++sample) {
        Vector<T> pred = network->forward(xs[sample]);
        auto loss = lossFunction(pred, ys[sample]);

        epochLoss = epochLoss + loss;
    }
    optimizer.step();

    auto averageLoss = epochLoss / Value<T>::create(xs.size());
    std::cout << "Epoch: " << epoch + 1 << " Loss: " << averageLoss->getValue() << std::endl;
}
```
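
After training, you can inspect the predictions with the same kind of loop as in the README showcase:

```cpp
for (size_t sample = 0; sample < xs.size(); ++sample) {
    Vector<T> pred = network->forward(xs[sample]); // prediction for this sample
    std::cout << xs[sample] << " -> " << pred[0] << "\t| True: " << ys[sample][0] << std::endl;
}
```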
