Hi!

I have two (somewhat related) questions.

The first one is about the reset method, DeepLearningExamples/PyTorch/Detection/SSD/src/coco_pipeline.py, lines 258 to 267 at 64ea93d:

```python
def reset(self):
    """
    DALI iterators do not support resetting before the end of the epoch
    and will ignore such request.
    """
    if self._counter > self._size:
        self._counter = self._counter % self._size
    else:
        logging.warning("DALI iterator does not support resetting while epoch is not finished. Ignoring...")
```
What we can see here is that self._counter cycles in a way that makes the number of iterations over the dataset differ from epoch to epoch. For example, say our dataset has 5 samples and our batch_size is 4. In epoch 0, the data loader makes 2 iterations over the dataset (the first batch has 4 samples and the second batch also has 4 samples; I have no idea where the 3 additional samples in the second batch come from, my guess is that they pad the last batch to reach a full batch_size?). At this point self._counter = 8, and before epoch 1 starts the reset method sets self._counter = 8 % 5 = 3. In epoch 1 we then get just one iteration over the dataset (the first batch has 4 samples, which increases the counter to self._counter = 7; that is larger than self._size, so the loop breaks).

So in the end we had 2 iterations over the dataset in the first epoch, but only 1 iteration in the second epoch. Is this behavior intended, or is it a bug? My "dirty" fix would be simply to reset the counter to zero, so that every epoch has the same number of iterations.
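To make the arithmetic above concrete, here is a minimal, self-contained sketch (not the DALI iterator itself) that simulates the counter logic, assuming a batch is produced while the counter is at most self._size and the counter grows by batch_size after each batch; run_epoch and reset_modulo are made-up helper names used only for this illustration:

```python
# Minimal simulation of the counter logic from the snippet above.
# Assumption: the iterator yields a batch while counter <= size and
# advances the counter by batch_size per batch.

def run_epoch(counter, size, batch_size):
    """Return (new_counter, number_of_batches) for one epoch."""
    batches = 0
    while counter <= size:
        counter += batch_size
        batches += 1
    return counter, batches

def reset_modulo(counter, size):
    """Reset behaviour from the snippet: wrap the counter instead of zeroing it."""
    return counter % size if counter > size else counter

size, batch_size = 5, 4

counter = 0
for epoch in range(3):
    counter, batches = run_epoch(counter, size, batch_size)
    print(f"modulo reset - epoch {epoch}: {batches} batch(es), counter = {counter}")
    counter = reset_modulo(counter, size)
# modulo reset - epoch 0: 2 batch(es), counter = 8   -> reset to 8 % 5 = 3
# modulo reset - epoch 1: 1 batch(es), counter = 7   -> reset to 7 % 5 = 2
# modulo reset - epoch 2: 1 batch(es), counter = 6

counter = 0
for epoch in range(3):
    counter, batches = run_epoch(counter, size, batch_size)
    print(f"zero reset   - epoch {epoch}: {batches} batch(es)")
    counter = 0  # the "dirty" fix: every epoch now runs the same 2 batches
```

With the modulo-style reset the number of batches per epoch drifts (2, 1, 1, ...), while resetting the counter to zero keeps it at 2 for every epoch.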
The second question is about these "padded" samples used to fill the last batch. How are they created: are they randomly chosen, or something else?