Changing weight initializers to follow standard #196

Merged
merged 28 commits into from Feb 27, 2024

Commits (28 commits)
a1066ba
initial work for initializers
MartinuzziFrancesco Jan 13, 2024
7530157
exports
MartinuzziFrancesco Jan 13, 2024
ba9ec64
fixing types and starting tests
MartinuzziFrancesco Jan 18, 2024
3a5a622
start of input_layers, start of streamline to new api
MartinuzziFrancesco Jan 20, 2024
ab3337c
made ESN work with new initilaizers, started separation of different …
MartinuzziFrancesco Jan 21, 2024
8cdc646
sparse layer
Jay-sanjay Jan 21, 2024
2d96405
HybridESN working, modified docs and readme to follow changes
MartinuzziFrancesco Jan 21, 2024
8b6ceb1
sparse layer
Jay-sanjay Jan 21, 2024
d8f5822
informed layer
Jay-sanjay Jan 21, 2024
9273407
bernoulli sample layer
Jay-sanjay Jan 21, 2024
dc3905f
irrational sample layer
Jay-sanjay Jan 21, 2024
2d3cc4d
added delay line backward reservoir
Jay-sanjay Jan 31, 2024
95c7870
added cycle jumps reservoir
Jay-sanjay Jan 31, 2024
2e89a38
added simple cycle reservoir
Jay-sanjay Jan 31, 2024
6ecceb2
added pseudo svd reservoir
Jay-sanjay Jan 31, 2024
4c3925b
small changes to inits, start of streamline
MartinuzziFrancesco Feb 4, 2024
ebde232
various test fixes
MartinuzziFrancesco Feb 12, 2024
a3c764e
some work on the DeepESN
MartinuzziFrancesco Feb 14, 2024
49865f0
fixed DeepESN and added tests
MartinuzziFrancesco Feb 21, 2024
0e405f2
merging main
MartinuzziFrancesco Feb 21, 2024
331b651
fixed formatting
MartinuzziFrancesco Feb 26, 2024
a56e043
resolved merge conflicts
MartinuzziFrancesco Feb 27, 2024
bd2b9a0
format
MartinuzziFrancesco Feb 27, 2024
edfff16
julia bump
MartinuzziFrancesco Feb 27, 2024
8c9056f
bump random
MartinuzziFrancesco Feb 27, 2024
00bafa5
bumps for WI and Distributions
MartinuzziFrancesco Feb 27, 2024
15b7d45
bump Distributions
MartinuzziFrancesco Feb 27, 2024
c028bcd
rm echostatenetwork.jl
MartinuzziFrancesco Feb 27, 2024
11 changes: 8 additions & 3 deletions Project.toml
@@ -13,27 +13,32 @@ LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
MLJLinearModels = "6ee0df7b-362f-4a72-a706-9e79364fb692"
NNlib = "872c559c-99b0-510c-b3b7-b6c96a88d5cd"
Optim = "429524aa-4258-5aef-a3af-852621145aeb"
PartialFunctions = "570af359-4316-4cb7-8c74-252c00c2016b"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
WeightInitializers = "d49dbf32-c5c2-4618-8acc-27bb2598ef2d"

[compat]
Adapt = "3.3.3, 4"
Aqua = "0.8"
CellularAutomata = "0.0.2"
DifferentialEquations = "7"
Distances = "0.10"
Distributions = "0.24.5, 0.25"
Distributions = "0.25.36"
LIBSVM = "0.8"
LinearAlgebra = "1.10"
MLJLinearModels = "0.9.2"
NNlib = "0.8.4, 0.9"
Optim = "1"
Random = "1"
PartialFunctions = "1.2"
Random = "1.10"
SafeTestsets = "0.1"
SparseArrays = "1.10"
Statistics = "1.10"
Test = "1"
julia = "1.6"
WeightInitializers = "0.1.5"
julia = "1.10"

[extras]
Aqua = "4c88cf16-eb10-579e-8560-4a9242c79595"
13 changes: 9 additions & 4 deletions README.md
@@ -51,14 +51,15 @@ test = data[:, (shift + train_len):(shift + train_len + predict_len - 1)]
Now that we have the data, we can initialize the ESN with the chosen parameters. Given that this is a quick example, we are going to change as few parameters as possible. For more detailed examples and explanations of the functions, please refer to the documentation.

```julia
input_size = 3
res_size = 300
esn = ESN(input_data;
reservoir = RandSparseReservoir(res_size, radius = 1.2, sparsity = 6 / res_size),
input_layer = WeightedLayer(),
esn = ESN(input_data, input_size, res_size;
reservoir = rand_sparse(; radius = 1.2, sparsity = 6 / res_size),
input_layer = weighted_init,
nla_type = NLAT2())
```

The echo state network can now be trained and tested. If not specified, the training will always be Ordinary Least Squares regression. The full range of training methods is detailed in the documentation.
The echo state network can now be trained and tested. If not specified, the training will always be ordinary least squares regression. The full range of training methods is detailed in the documentation.

```julia
output_layer = train(esn, target_data)
@@ -103,3 +104,7 @@ If you use this library in your work, please cite:
url = {http://jmlr.org/papers/v23/22-0611.html}
}
```

## Acknowledgements

This project was possible thanks to initial funding through the [Google Summer of Code](https://summerofcode.withgoogle.com/) 2020 program. Francesco M. further acknowledges [ScaDS.AI](https://scads.ai/) and [RSC4Earth](https://rsc4earth.de/) for supporting the current progress on the library.
18 changes: 11 additions & 7 deletions docs/src/esn_tutorials/hybrid.md
@@ -1,6 +1,6 @@
# Hybrid Echo State Networks

Following the idea of giving physical information to machine learning models, the hybrid echo state networks [^1] try to achieve this results by feeding model data into the ESN. In this example, it is explained how to create and leverage such models in ReservoirComputing.jl. The full script for this example is available [here](https://github.com/MartinuzziFrancesco/reservoir-computing-examples/blob/main/hybrid/hybrid.jl). This example was run on Julia v1.7.2.
Following the idea of giving physical information to machine learning models, hybrid echo state networks [^1] try to achieve this by feeding model data into the ESN. This example shows how to create and leverage such models in ReservoirComputing.jl.

## Generating the data

@@ -47,25 +47,29 @@ function prior_model_data_generator(u0, tspan, tsteps, model = lorenz)
end
```

Given the initial condition, time span, and time steps, this function returns the data for the chosen model. Now, using the `Hybrid` method, it is possible to input all this information to the model.
Given the initial condition, time span, and time steps, this function returns the data for the chosen model. Now, using the `KnowledgeModel` method, it is possible to input all this information to `HybridESN`.

```@example hybrid
using ReservoirComputing, Random
Random.seed!(42)

hybrid = Hybrid(prior_model_data_generator, u0, tspan_train, train_len)
km = KnowledgeModel(prior_model_data_generator, u0, tspan_train, train_len)

esn = ESN(input_data,
reservoir = RandSparseReservoir(300),
variation = hybrid)
in_size = 3
res_size = 300
hesn = HybridESN(km,
input_data,
in_size,
res_size;
reservoir = rand_sparse)
```

## Training and Prediction

The training and prediction of the Hybrid ESN can proceed as usual:

```@example hybrid
output_layer = train(esn, target_data, StandardRidge(0.3))
output_layer = train(hesn, target_data, StandardRidge(0.3))
output = hesn(Generative(predict_len), output_layer)
```

14 changes: 7 additions & 7 deletions docs/src/esn_tutorials/lorenz_basic.md
@@ -1,6 +1,6 @@
# Lorenz System Forecasting

This example expands on the readme Lorenz system forecasting to better showcase how to use methods and functions provided in the library for Echo State Networks. Here the prediction method used is `Generative`, for a more detailed explanation of the differences between `Generative` and `Predictive` please refer to the other examples given in the documentation. The full script for this example is available [here](https://github.com/MartinuzziFrancesco/reservoir-computing-examples/blob/main/lorenz_basic/lorenz_basic.jl). This example was run on Julia v1.7.2.
This example expands on the readme Lorenz system forecasting to better showcase how to use the methods and functions the library provides for Echo State Networks. Here the prediction method used is `Generative`; for a more detailed explanation of the differences between `Generative` and `Predictive`, please refer to the other examples given in the documentation.

## Generating the data

@@ -46,25 +46,25 @@ using ReservoirComputing

#define ESN parameters
res_size = 300
in_size = 3
res_radius = 1.2
res_sparsity = 6 / 300
input_scaling = 0.1

#build ESN struct
esn = ESN(input_data;
variation = Default(),
reservoir = RandSparseReservoir(res_size, radius = res_radius, sparsity = res_sparsity),
input_layer = WeightedLayer(scaling = input_scaling),
esn = ESN(input_data, in_size, res_size;
reservoir = rand_sparse(; radius = res_radius, sparsity = res_sparsity),
input_layer = weighted_init(; scaling = input_scaling),
reservoir_driver = RNN(),
nla_type = NLADefault(),
states_type = StandardStates())
```

Most of the parameters chosen here mirror the defaults, so spelling them out is not strictly necessary; the readme example is identical to this one except that it omits these explicit calls. Going line by line, starting from `res_size`: this value determines the dimensions of the reservoir matrix. In this case, a size of 300 has been chosen, so the reservoir matrix will be 300 x 300. This is not always the case, since some input layer constructions can modify the dimensions of the reservoir, but in that case everything is taken care of internally.

The `res_radius` determines the scaling of the spectral radius of the reservoir matrix; a proper scaling is necessary to assure the Echo State Property. The default value in the `RandSparseReservoir()` method is 1.0 in accordance with the most commonly followed guidelines found in the literature (see [^2] and references therein). The `sparsity` of the reservoir matrix in this case is obtained by choosing a degree of connections and dividing that by the reservoir size. Of course, it is also possible to simply choose any value between 0.0 and 1.0 to test behaviors for different sparsity values. In this example, the call to the parameters inside `RandSparseReservoir()` was done explicitly to showcase the meaning of each of them, but it is also possible to simply pass the values directly, like so `RandSparseReservoir(1.2, 6/300)`.
The `res_radius` determines the scaling of the spectral radius of the reservoir matrix; a proper scaling is necessary to ensure the Echo State Property. The default value in the `rand_sparse` method is 1.0, in accordance with the most commonly followed guidelines found in the literature (see [^2] and references therein). The `sparsity` of the reservoir matrix is in this case obtained by choosing a degree of connections and dividing it by the reservoir size. Of course, it is also possible to simply choose any value between 0.0 and 1.0 to test behaviors for different sparsity values.

The value of `input_scaling` determines the upper and lower bounds of the uniform distribution of the weights in the `WeightedLayer()`. Like before, this value can be passed either as an argument or as a keyword argument `WeightedLayer(0.1)`. The value of 0.1 represents the default. The default input layer is the `DenseLayer`, a fully connected layer. The details of the weighted version can be found in [^3], for this example, this version returns the best results.
The value of `input_scaling` determines the upper and lower bounds of the uniform distribution from which the weights in `weighted_init` are drawn. The value of 0.1 is the default. The default input layer is `scaled_rand`, a dense matrix. The details of the weighted version can be found in [^3]; for this example, this version returns the best results.
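
As a minimal sketch of these parameters in isolation (assuming the `(rng, type, dims...; kwargs...)` calling convention introduced by this pull request, and the keyword names used in the calls above), the initializers can also be invoked directly to inspect the matrices they produce:

```julia
using ReservoirComputing, Random

rng = Xoshiro(42)

# Reservoir matrix: res_size x res_size, spectral radius rescaled to res_radius,
# with roughly a res_sparsity fraction of nonzero entries.
W = rand_sparse(rng, Float32, res_size, res_size;
    radius = res_radius, sparsity = res_sparsity)

# Input matrix: weights drawn uniformly within (-input_scaling, input_scaling).
W_in = weighted_init(rng, Float32, res_size, in_size; scaling = input_scaling)

size(W)  # (300, 300): the reservoir matrix is res_size x res_size
```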

The reservoir driver represents the dynamics of the reservoir. In the standard ESN definition, these dynamics are obtained through a Recurrent Neural Network (RNN), and this is reflected by calling the `RNN` driver for the `ESN` struct. This option is set as the default, and unless there is the need to change parameters, it is not needed. The full equation is the following:

44 changes: 36 additions & 8 deletions src/ReservoirComputing.jl
@@ -9,20 +9,21 @@
using MLJLinearModels
using NNlib
using Optim
using PartialFunctions
using Random
using SparseArrays
using Statistics
using WeightInitializers

export NLADefault, NLAT1, NLAT2, NLAT3
export StandardStates, ExtendedStates, PaddedStates, PaddedExtendedStates
export StandardRidge, LinearModel
export AbstractLayer, create_layer
export WeightedLayer, DenseLayer, SparseLayer, MinimumLayer, InformedLayer, NullLayer
export BernoulliSample, IrrationalSample
export AbstractReservoir, create_reservoir
export RandSparseReservoir, PseudoSVDReservoir, DelayLineReservoir
export DelayLineBackwardReservoir, SimpleCycleReservoir, CycleJumpsReservoir, NullReservoir
export scaled_rand, weighted_init, sparse_init, informed_init, minimal_init
export rand_sparse, delay_line, delay_line_backward, cycle_jumps, simple_cycle, pseudo_svd
export RNN, MRNN, GRU, GRUParams, FullyGated, Minimal
export ESN, Default, Hybrid, train
export ESN, train
export HybridESN, KnowledgeModel
export DeepESN
export RECA, train
export RandomMapping, RandomMaps
export Generative, Predictive, OutputLayer
@@ -72,6 +73,31 @@
Predictive(prediction_data, prediction_len)
end

#fallbacks for initializers
for initializer in (:rand_sparse, :delay_line, :delay_line_backward, :cycle_jumps,
:simple_cycle, :pseudo_svd,
:scaled_rand, :weighted_init, :sparse_init, :informed_init, :minimal_init)
NType = ifelse(initializer === :rand_sparse, Real, Number)
@eval function ($initializer)(dims::Integer...; kwargs...)
return $initializer(_default_rng(), Float32, dims...; kwargs...)
end
@eval function ($initializer)(rng::AbstractRNG, dims::Integer...; kwargs...)
return $initializer(rng, Float32, dims...; kwargs...)
end
@eval function ($initializer)(::Type{T},
dims::Integer...; kwargs...) where {T <: $NType}
return $initializer(_default_rng(), T, dims...; kwargs...)
end
@eval function ($initializer)(rng::AbstractRNG; kwargs...)
return __partial_apply($initializer, (rng, (; kwargs...)))
end
@eval function ($initializer)(rng::AbstractRNG,
::Type{T}; kwargs...) where {T <: $NType}
return __partial_apply($initializer, ((rng, T), (; kwargs...)))
end
@eval ($initializer)(; kwargs...) = __partial_apply($initializer, (; kwargs...))
end

#general
include("states.jl")
include("predict.jl")
@@ -84,7 +110,9 @@
include("esn/esn_input_layers.jl")
include("esn/esn_reservoirs.jl")
include("esn/esn_reservoir_drivers.jl")
include("esn/echostatenetwork.jl")
include("esn/esn.jl")
include("esn/deepesn.jl")
include("esn/hybridesn.jl")
include("esn/esn_predict.jl")

#reca
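The fallback methods generated in the loop above give every exported initializer a uniform calling convention: it can be called with or without an RNG and element type, or partially applied with keyword arguments and completed later. A rough usage sketch under those assumptions:

```julia
using ReservoirComputing, Random

# Explicit rng, element type, and dimensions.
W = rand_sparse(Xoshiro(0), Float32, 300, 300; radius = 1.2, sparsity = 0.02)

# Dims-only call: the fallbacks supply the internal RNG and Float32.
W_in = scaled_rand(300, 3)

# Partial application: keywords are captured now, rng/type/dims supplied later.
# This is the form used as `reservoir = rand_sparse(; radius = 1.2)` in `ESN`.
init = rand_sparse(; radius = 1.2, sparsity = 0.02)
W2 = init(Xoshiro(1), Float32, 300, 300)
```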
46 changes: 46 additions & 0 deletions src/esn/deepesn.jl
@@ -0,0 +1,46 @@
struct DeepESN{I, S, N, T, O, M, B, ST, W, IS} <: AbstractEchoStateNetwork
res_size::I
train_data::S
nla_type::N
input_matrix::T
reservoir_driver::O
reservoir_matrix::M
bias_vector::B
states_type::ST
washout::W
states::IS
end

function DeepESN(train_data,
in_size::Int,
res_size::Int;
depth::Int = 2,
input_layer = fill(scaled_rand, depth),
bias = fill(zeros64, depth),
reservoir = fill(rand_sparse, depth),
reservoir_driver = RNN(),
nla_type = NLADefault(),
states_type = StandardStates(),
washout::Int = 0,
rng = _default_rng(),
T = Float64,
matrix_type = typeof(train_data))
if states_type isa AbstractPaddedStates
in_size = size(train_data, 1) + 1
train_data = vcat(Adapt.adapt(matrix_type, ones(1, size(train_data, 2))),
train_data)
end

reservoir_matrix = [reservoir[i](rng, T, res_size, res_size) for i in 1:depth]
input_matrix = [i == 1 ? input_layer[i](rng, T, res_size, in_size) :
input_layer[i](rng, T, res_size, res_size) for i in 1:depth]
bias_vector = [bias[i](rng, res_size) for i in 1:depth]
inner_res_driver = reservoir_driver_params(reservoir_driver, res_size, in_size)
states = create_states(inner_res_driver, train_data, washout, reservoir_matrix,
input_matrix, bias_vector)
train_data = train_data[:, (washout + 1):end]

DeepESN(res_size, train_data, nla_type, input_matrix,
inner_res_driver, reservoir_matrix, bias_vector, states_type, washout,
states)
end
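
A minimal construction sketch for the constructor above, using synthetic data; downstream training would presumably go through `train` as with the other ESN variants, since `DeepESN` shares the `AbstractEchoStateNetwork` supertype:

```julia
using ReservoirComputing, Random

# Hypothetical data: 3 input features over 500 time steps.
train_data = rand(Float32, 3, 500)

# Two stacked reservoirs of 300 units each, relying on the defaults above
# (scaled_rand input layers, rand_sparse reservoirs, zero bias vectors).
deepesn = DeepESN(train_data, 3, 300;
    depth = 2,
    rng = Xoshiro(17),
    T = Float32)
```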