Merge pull request #132 from FluxML/dev
for a 0.1.11 release
Showing 3 changed files with 94 additions and 3 deletions.
Project.toml
@@ -1,7 +1,7 @@
 name = "MLJFlux"
 uuid = "094fc8d1-fd35-5302-93ea-dabda2abf845"
 authors = ["Anthony D. Blaom <[email protected]>", "Ayush Shridhar <[email protected]>"]
-version = "0.1.10"
+version = "0.1.11"

 [deps]
 CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"

@@ -15,8 +15,8 @@ Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
 Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"

 [compat]
-CategoricalArrays = "^0.8.1,^0.9"
-ColorTypes = "^0.10.3"
+CategoricalArrays = "^0.10"
+ColorTypes = "^0.10.3, 0.11"
 ComputationalResources = "^0.3.2"
 Flux = "^0.10.4, ^0.11, 0.12"
 LossFunctions = "^0.5, ^0.6"
paper.bib
@@ -0,0 +1,53 @@
@article{Blaom2020,
  doi = {10.21105/joss.02704},
  url = {https://doi.org/10.21105/joss.02704},
  year = {2020},
  publisher = {The Open Journal},
  volume = {5},
  number = {55},
  pages = {2704},
  author = {Anthony D. Blaom and Franz Kiraly and Thibaut Lienart and Yiannis Simillides and Diego Arenas and Sebastian J. Vollmer},
  title = {{MLJ}: A Julia package for composable machine learning},
  journal = {Journal of Open Source Software}
}

@article{Innes2018,
  doi = {10.21105/joss.00602},
  url = {https://doi.org/10.21105/joss.00602},
  year = {2018},
  publisher = {The Open Journal},
  volume = {3},
  number = {25},
  pages = {602},
  author = {Mike Innes},
  title = {Flux: Elegant machine learning with Julia},
  journal = {Journal of Open Source Software}
}

@article{Julia-2017,
  title = {Julia: A fresh approach to numerical computing},
  author = {Bezanson, Jeff and Edelman, Alan and Karpinski, Stefan and Shah, Viral B},
  journal = {SIAM {R}eview},
  volume = {59},
  number = {1},
  pages = {65--98},
  year = {2017},
  publisher = {SIAM},
  doi = {10.1137/141000671}
}

@online{FastAI.jl,
  title = {Best practices for deep learning in Julia, inspired by fastai},
  url = {https://github.com/FluxML/FastAI.jl}
}

@online{MLJFlux.jl,
  author = {Shridhar, Ayush and Blaom, Anthony},
  title = {An interface to the deep learning package Flux.jl from the MLJ.jl toolbox},
  url = {https://github.com/FluxML/MLJFlux.jl}
}

@online{JuliaBenchmarks,
  title = {Julia Micro-Benchmarks},
  url = {https://julialang.org/benchmarks/}
}
paper.md

@@ -0,0 +1,38 @@
---
title: 'MLJFlux: Deep learning interface to the MLJ toolbox'
tags:
  - Julia
  - Machine learning
  - Deep learning
authors:
  - name: Ayush Shridhar
    orcid: 0000-0003-3550-691X
    affiliation: 1
  - name: Anthony Blaom
    affiliation: 2
affiliations:
  - name: International Institute of Information Technology, Bhubaneswar, India
    index: 1
  - name: Department of Computer Science, University of Auckland
    index: 2
date: 26 March 2021
bibliography: paper.bib
---
# Introduction
We present _MLJFlux.jl_ [@MLJFlux.jl], an interface between the _MLJ_ machine learning toolbox [@Blaom2020] and the _Flux.jl_ deep learning framework [@Innes2018], both written in the _Julia_ programming language [@Julia-2017]. MLJFlux makes it possible to implement supervised deep learning models while adhering to the MLJ workflow: users familiar with the MLJ design can write their models in Flux with only slight modifications and then perform all tasks provided by the MLJ model specification. The interface also provides options to train the model on different hardware and to warm-start the model after changing specific hyper-parameters.
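To make the workflow concrete, here is a minimal sketch of fitting an MLJFlux model like any other MLJ model. The dataset loader is illustrative, and `Short` is one of MLJFlux's built-in builders; names reflect the MLJFlux 0.1 API as we understand it, not a definitive listing:

```julia
using MLJ, MLJFlux, Flux

X, y = @load_iris  # small tabular classification dataset bundled with MLJ

# Construct the model exactly as any other MLJ model; `Short` builds a
# network with one hidden layer and dropout.
clf = NeuralNetworkClassifier(builder = MLJFlux.Short(n_hidden = 16),
                              epochs = 10)

mach = machine(clf, X, y)   # bind model to data
fit!(mach)                  # train for 10 epochs
yhat = predict(mach, X)     # probabilistic predictions, per the MLJ spec
```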
Julia solves the _"two language problem"_ in scientific computing, where high-level languages such as Python or Ruby are easy to use but often slow, while low-level languages are fast but difficult to use. Using _just-in-time_ compilation and _multiple dispatch_, Julia offers the best of both worlds, matching the performance of low-level languages while retaining the ease of use of high-level ones [@JuliaBenchmarks].
While _FastAI.jl_ [@FastAI.jl] has made significant progress towards enabling the creation and delivery of deep learning models, its focus is primarily on neural-network paradigms. It provides convenience functions for data loading, modeling, and tuning, but the user must still write the preprocessing pipelines and specify the loss function, optimizer, and other hyper-parameters. MLJFlux.jl removes the need for this boilerplate code by leveraging the MLJ design, which makes it ideal for basic prototyping and experimentation.
# Statement of Need
While MLJ supports many statistical models, it lacks support for deep learning models. MLJFlux adds this support by interfacing MLJ with the Flux.jl deep learning framework. Converting a Flux model into an MLJ specification amounts to wrapping it in the appropriate MLJFlux container and specifying other hyper-parameters such as the loss function, optimizer, and number of epochs. The wrapped model can then be used like any other MLJ model. MLJFlux models implement the MLJ warm-restart interface, which means training can be resumed from where it left off when the number of epochs is increased or the optimizer settings (e.g., the learning rate) are modified. Consequently, an MLJFlux model can also be wrapped as an MLJ _IteratedModel_, making early stopping, model snapshots, callbacks, cyclic learning rates, and other controls available.
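Continuing the sketch above, a warm restart and an iterated wrapper might look as follows. The `IteratedModel` keywords and control names follow the MLJIteration API as we recall it; treat this as illustrative rather than definitive:

```julia
# Warm restart: bumping `epochs` and refitting resumes training from the
# current learned parameters rather than starting from scratch.
clf.epochs = 15
fit!(mach)   # runs only the 5 additional epochs

# Handing iteration control to MLJ: early stopping via a patience control,
# with an iteration-count safety limit.
iterated_clf = IteratedModel(model = clf,
                             resampling = Holdout(fraction_train = 0.8),
                             measure = log_loss,
                             controls = [Step(1), Patience(5), NumberLimit(100)])
mach2 = machine(iterated_clf, X, y)
fit!(mach2)
```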
MLJFlux provides four container types in which a Flux model can be wrapped. To follow the MLJ design, each is derived from either `MLJModelInterface.Probabilistic` or `MLJModelInterface.Deterministic`, depending on the type of task. At the core of each wrapper is a _builder_ attribute that specifies the neural network architecture (the Flux model) given the shape of the data.
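As an illustration of the builder idea, a custom architecture might be supplied like this. The two-argument `build` signature reflects MLJFlux 0.1 as we understand it (later versions also pass an RNG), so this is a sketch rather than a definitive implementation:

```julia
using MLJFlux, Flux

# A custom builder: MLJFlux calls `build` with input/output dimensions
# inferred from the data and expects a Flux model in return.
mutable struct MyNetwork <: MLJFlux.Builder
    n_hidden::Int
end

MLJFlux.build(b::MyNetwork, n_in, n_out) =
    Chain(Dense(n_in, b.n_hidden, relu),
          Dense(b.n_hidden, n_out))

clf = NeuralNetworkClassifier(builder = MyNetwork(32))
```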
MLJFlux has been written with ease of use in mind. The core idea is to allow rapid modeling, tuning, and visualization of deep learning models via Flux.jl, while reusing the mature and efficient functionality already offered by MLJ.
# References |