python train.py ExtremeNet
loading all datasets...
using 4 threads
loading from cache file: ./cache/coco_extreme_train2017.pkl
loading annotations into memory...
Done (t=12.73s)
creating index...
index created!
loading from cache file: ./cache/coco_extreme_train2017.pkl
loading annotations into memory...
Done (t=12.93s)
creating index...
index created!
loading from cache file: ./cache/coco_extreme_train2017.pkl
loading annotations into memory...
Done (t=10.87s)
creating index...
index created!
loading from cache file: ./cache/coco_extreme_train2017.pkl
loading annotations into memory...
Done (t=15.55s)
creating index...
index created!
system config...
{'batch_size': 24,
 'cache_dir': './cache',
 'chunk_sizes': [4, 5, 5, 5, 5],
 'config_dir': './config',
 'data_dir': './data',
 'data_rng': <mtrand.RandomState object at 0x7f87c7ffa480>,
 'dataset': 'MSCOCOExtreme',
 'decay_rate': 10,
 'display': 5,
 'learning_rate': 0.00025,
 'max_iter': 250000,
 'nnet_rng': <mtrand.RandomState object at 0x7f87c7ffa4c8>,
 'opt_algo': 'adam',
 'prefetch_size': 10,
 'pretrain': './cache/CornerNet_500000.pkl',
 'result_dir': './results',
 'sampling_function': 'kp_detection',
 'snapshot': 50000,
 'snapshot_name': 'ExtremeNet',
 'stepsize': 200000,
 'test_split': 'testdev',
 'train_split': 'train',
 'val_iter': 100,
 'val_split': 'val',
 'weight_decay': False,
 'weight_decay_rate': 1e-05,
 'weight_decay_type': 'l2'}
db config...
{'ae_threshold': 0.5,
 'aggr_weight': 0.1,
 'border': 128,
 'categories': 80,
 'center_thresh': 0.1,
 'data_aug': True,
 'gaussian_bump': True,
 'gaussian_iou': 0.7,
 'gaussian_radius': -1,
 'input_size': [511, 511],
 'lighting': True,
 'max_per_image': 100,
 'merge_bbox': False,
 'nms_algorithm': 'exp_soft_nms',
 'nms_kernel': 3,
 'nms_threshold': 0.5,
 'output_sizes': [[128, 128]],
 'rand_color': True,
 'rand_crop': True,
 'rand_pushes': False,
 'rand_samples': False,
 'rand_scale_max': 1.4,
 'rand_scale_min': 0.6,
 'rand_scale_step': 0.1,
 'rand_scales': array([0.6, 0.7, 0.8, 0.9, 1. , 1.1, 1.2, 1.3]),
 'scores_thresh': 0.1,
 'special_crop': False,
 'suppres_ghost': True,
 'test_scales': [1],
 'top_k': 40,
 'weight_exp': 8}
len of db: 118287
start prefetching data...
shuffling indices...
start prefetching data...
start prefetching data...
shuffling indices...
shuffling indices...
building model...
module_file: models.ExtremeNet
start prefetching data...
shuffling indices...
total parameters: 198531504
loading from pretrained model
loading from ./cache/CornerNet_500000.pkl
setting learning rate to: 0.00025
training start...
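For context: in this codebase the "system config" above is loaded from a JSON config file (presumably config/ExtremeNet.json), and batch_size is split across GPUs according to chunk_sizes, one entry per device, so [4, 5, 5, 5, 5] sums to the batch_size of 24 and implies five visible GPUs. A minimal sketch of that consistency check; the field names batch_size and chunk_sizes come from the dump above, while the file path and the top-level "system" key are assumptions:

# Sketch: check that the GPU split in the training config is self-consistent.
# The config/ExtremeNet.json path and the "system" key are assumptions;
# batch_size and chunk_sizes are the fields printed in the system config above.
import json

with open("config/ExtremeNet.json") as f:
    system_cfg = json.load(f)["system"]

batch_size  = system_cfg["batch_size"]     # 24 in the log above
chunk_sizes = system_cfg["chunk_sizes"]    # [4, 5, 5, 5, 5] -> one entry per GPU

assert sum(chunk_sizes) == batch_size, "chunk_sizes must sum to batch_size"
print(f"this config expects {len(chunk_sizes)} GPU(s)")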
0%| | 0/250000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 225, in <module>
    train(training_dbs, None, args.start_iter, args.debug)
  File "train.py", line 159, in train
    training_loss = nnet.train(**training)
  File "/home/rencong/ExtremeNet/nnet/py_factory.py", line 81, in train
    loss = self.network(xs, ys)
  File "/home/rencong/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/rencong/ExtremeNet/models/py_utils/data_parallel.py", line 66, in forward
    inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids, self.chunk_sizes)
  File "/home/rencong/ExtremeNet/models/py_utils/data_parallel.py", line 77, in scatter
    return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim, chunk_sizes=self.chunk_sizes)
  File "/home/rencong/ExtremeNet/models/py_utils/scatter_gather.py", line 30, in scatter_kwargs
    inputs = scatter(inputs, target_gpus, dim, chunk_sizes) if inputs else []
  File "/home/rencong/ExtremeNet/models/py_utils/scatter_gather.py", line 25, in scatter
    return scatter_map(inputs)
  File "/home/rencong/ExtremeNet/models/py_utils/scatter_gather.py", line 18, in scatter_map
    return list(zip(*map(scatter_map, obj)))
  File "/home/rencong/ExtremeNet/models/py_utils/scatter_gather.py", line 20, in scatter_map
    return list(map(list, zip(*map(scatter_map, obj))))
  File "/home/rencong/ExtremeNet/models/py_utils/scatter_gather.py", line 15, in scatter_map
    return Scatter.apply(target_gpus, chunk_sizes, dim, obj)
  File "/home/rencong/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/_functions.py", line 89, in forward
    outputs = comm.scatter(input, target_gpus, chunk_sizes, ctx.dim, streams)
  File "/home/rencong/anaconda3/lib/python3.6/site-packages/torch/cuda/comm.py", line 148, in scatter
    return tuple(torch._C._scatter(tensor, devices, chunk_sizes, dim, streams))
RuntimeError: CUDA error: invalid device ordinal (exchangeDevice at /opt/conda/conda-bld/pytorch_1550802451070/work/aten/src/ATen/cuda/detail/CUDAGuardImpl.h:28)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6d (0x7f8821feb69d in /home/rencong/anaconda3/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x4f223c (0x7f881f16d23c in /home/rencong/anaconda3/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #2: <unknown function> + 0x5fc38e (0x7f87fbb9638e in /home/rencong/anaconda3/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #3: <unknown function> + 0x739e55 (0x7f87fbcd3e55 in /home/rencong/anaconda3/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #4: at::TypeDefault::copy(at::Tensor const&, bool, c10::optional<c10::Device>) const + 0x74 (0x7f87fbe4f204 in /home/rencong/anaconda3/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #5: at::native::to(at::Tensor const&, at::TensorOptions const&, bool, bool) + 0xc6d (0x7f87fbc327fd in /home/rencong/anaconda3/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #6: at::TypeDefault::to(at::Tensor const&, at::TensorOptions const&, bool, bool) const + 0x2c (0x7f87fbe0bcbc in /home/rencong/anaconda3/lib/python3.6/site-packages/torch/lib/libcaffe2.so)
frame #7: torch::autograd::VariableType::to(at::Tensor const&, at::TensorOptions const&, bool, bool) const + 0x19c (0x7f87fe532e1c in /home/rencong/anaconda3/lib/python3.6/site-packages/torch/lib/libtorch.so.1)
frame #8: torch::cuda::scatter(at::Tensor const&, c10::ArrayRef<long>, c10::optional<std::vector<long, std::allocator<long> > > const&, long, c10::optional<std::vector<c10::optional<at::cuda::CUDAStream>, std::allocator<c10::optional<at::cuda::CUDAStream> > > > const&) + 0x7a8 (0x7f881f183da8 in /home/rencong/anaconda3/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #9: <unknown function> + 0x5124de (0x7f881f18d4de in /home/rencong/anaconda3/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #10: <unknown function> + 0xfd760 (0x7f881ed78760 in /home/rencong/anaconda3/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #21: THPFunction_apply(_object*, _object*) + 0x6ad (0x7f881ef7482d in /home/rencong/anaconda3/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
terminate called without an active exception
Aborted (core dumped)
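The "invalid device ordinal" is raised in the scatter step (models/py_utils/data_parallel.py -> scatter_gather.py above), which copies one chunk of the batch to each GPU; with five entries in chunk_sizes it typically means the config assumes more GPUs than are actually visible to the process. A rough sketch of that check, under the assumption that the fix is simply to make the split match the hardware, either by editing batch_size/chunk_sizes or by exposing more GPUs via CUDA_VISIBLE_DEVICES:

# Sketch (not the repo's code): compare the configured GPU split with the GPUs
# that are actually visible, and print a suggested adjustment.
import torch

chunk_sizes = [4, 5, 5, 5, 5]       # from the system config above; sums to batch_size 24
n_gpus = torch.cuda.device_count()  # GPUs visible to this process

if n_gpus == 0:
    print("no CUDA device visible; check the driver or CUDA_VISIBLE_DEVICES")
elif len(chunk_sizes) > n_gpus:
    suggested = chunk_sizes[:n_gpus]
    print(f"config expects {len(chunk_sizes)} GPUs but only {n_gpus} are visible; "
          f"e.g. set batch_size={sum(suggested)} and chunk_sizes={suggested}, "
          f"or run with CUDA_VISIBLE_DEVICES listing {len(chunk_sizes)} GPUs")
else:
    print("GPU split matches the visible devices")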
Did you solve this issue?