Hello!
I am working on a model to predict the effects of neutron irradiation on materials. It relies on large amounts of data, since equation-based approaches are not straightforward for this kind of prediction. The issue I encountered was that any training based on the dde.data.DataSet class was really slow with my data, even when I reduced batch_size, although this wasn't the case with small datasets such as the one in the examples (dataset.train and dataset.test).
Examining the Model class further, I found two issues:
Firstly, the Model.batch_size attribute is never actually used; during training it is effectively replaced by the batch_size argument passed to Model.train.
Secondly, when self.data.train_next_batch(self.batch_size) is called, the data returned is exactly the full input data. The cause is in the DataSet class: its train_next_batch simply returns the full dataset, ignoring the batch_size that is passed to it. Thankfully this is a simple fix, since a correct version of train_next_batch is already defined in other Data classes such as Triple and can easily be adapted. Here is the fix I used:
```python
def train_next_batch(self, batch_size=None):
    if batch_size is None:
        return self.train_x, self.train_y
    indices = self.train_sampler.get_next(batch_size)
    return (
        self.train_x[indices],
        self.train_y[indices],
    )
```
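To check that the adapted method actually returns mini-batches rather than the whole dataset, here is a minimal self-contained sketch. SimpleBatchSampler is a hypothetical stand-in for DeepXDE's internal sampler (the real one also handles epoch bookkeeping), and the DataSet class here is reduced to just the batching logic:

```python
import numpy as np

class SimpleBatchSampler:
    """Hypothetical stand-in for dde.data.sampler.BatchSampler:
    shuffles an index pool and hands out consecutive slices."""

    def __init__(self, num_samples, shuffle=True):
        self.indices = np.arange(num_samples)
        if shuffle:
            np.random.shuffle(self.indices)
        self._pos = 0

    def get_next(self, batch_size):
        # Reshuffle and restart when the pool is exhausted.
        if self._pos + batch_size > len(self.indices):
            np.random.shuffle(self.indices)
            self._pos = 0
        batch = self.indices[self._pos : self._pos + batch_size]
        self._pos += batch_size
        return batch

class DataSet:
    """Reduced sketch of dde.data.DataSet with the fixed batching."""

    def __init__(self, train_x, train_y):
        self.train_x = train_x
        self.train_y = train_y
        self.train_sampler = SimpleBatchSampler(len(train_x))

    def train_next_batch(self, batch_size=None):
        if batch_size is None:
            return self.train_x, self.train_y
        indices = self.train_sampler.get_next(batch_size)
        return self.train_x[indices], self.train_y[indices]

X = np.random.rand(10000, 3)
y = np.random.rand(10000, 1)
data = DataSet(X, y)
bx, by = data.train_next_batch(64)
# bx.shape == (64, 3), by.shape == (64, 1): only a 64-row slice
# is fed to each training step, not the full 10000-row arrays.
```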
Finally, if I may make a suggestion: it would be helpful to have a mode for the DataSet class that loads the data from disk in batches instead of reading the whole dataset into memory, since the latter can bloat the memory and crash the kernel.
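One possible shape for such a mode, sketched with NumPy memory mapping. MemmapDataSet is a hypothetical class name, not part of DeepXDE; the idea is that the arrays stay on disk and only the rows of the current batch are copied into RAM:

```python
import os
import tempfile
import numpy as np

class MemmapDataSet:
    """Hypothetical out-of-core dataset (not part of DeepXDE)."""

    def __init__(self, x_path, y_path, n, x_dim, y_dim, dtype="float64"):
        # np.memmap maps the files into virtual memory; the OS loads
        # pages lazily, so indexing touches only the rows of a batch.
        self.train_x = np.memmap(x_path, dtype=dtype, mode="r", shape=(n, x_dim))
        self.train_y = np.memmap(y_path, dtype=dtype, mode="r", shape=(n, y_dim))
        self.n = n

    def train_next_batch(self, batch_size):
        indices = np.random.choice(self.n, size=batch_size, replace=False)
        # Fancy indexing copies only the selected rows into memory.
        return np.asarray(self.train_x[indices]), np.asarray(self.train_y[indices])

# Demo: write two small binary files, then stream batches from them.
tmpdir = tempfile.mkdtemp()
x_path = os.path.join(tmpdir, "x.dat")
y_path = os.path.join(tmpdir, "y.dat")
np.memmap(x_path, dtype="float64", mode="w+", shape=(1000, 3))[:] = np.random.rand(1000, 3)
np.memmap(y_path, dtype="float64", mode="w+", shape=(1000, 1))[:] = np.random.rand(1000, 1)

ds = MemmapDataSet(x_path, y_path, n=1000, x_dim=3, y_dim=1)
bx, by = ds.train_next_batch(32)
# bx.shape == (32, 3); the full arrays are never materialized in RAM.
```

For datasets too large even for a single file, the same interface could wrap a set of shard files, but the memmap version already covers the "crashes the kernel" case described above.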
I hope this helps, and thank you for the great work on this library; I am really enjoying learning from it!
If you are not using equations, you should try techniques suited to purely data-driven surrogates. DeepXDE provides DeepONets for parametric learning. You could also try Fourier neural operators, which are not available in DeepXDE.
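For context, a DeepONet predicts G(u)(y) = sum_k b_k(u) * t_k(y): a branch net encodes the input function u sampled at m sensors, a trunk net encodes the query location y, and their features are combined by a dot product. A toy NumPy sketch, with random single-layer weights standing in for the trained networks (shapes and names are illustrative only, not DeepXDE's API):

```python
import numpy as np

rng = np.random.default_rng(0)
m, dim_y, p = 50, 1, 20  # sensor count, query dimension, latent width

# Random weights stand in for trained branch/trunk networks.
W_branch = rng.normal(size=(m, p))
W_trunk = rng.normal(size=(dim_y, p))

def deeponet(u_sensors, y_queries):
    """G(u)(y) ~= sum_k b_k(u) * t_k(y); biases and deeper layers
    are omitted to keep the combination rule visible."""
    b = np.tanh(u_sensors @ W_branch)   # (p,) branch features
    t = np.tanh(y_queries @ W_trunk)    # (n_q, p) trunk features
    return t @ b                        # (n_q,) one output per query

u = rng.normal(size=m)             # one input function at m sensors
y = rng.normal(size=(100, dim_y))  # 100 query points
out = deeponet(u, y)
# out.shape == (100,): one prediction per query location
```

The appeal for this use case is that, once trained, the same network evaluates a new irradiation profile at arbitrary query points without retraining.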