
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 7.79 GiB total capacity; 6.03 GiB already allocated; 10.94 MiB free; 6.17 GiB reserved in total by PyTorch) #20

Open
abdul-mannan-khan opened this issue Jul 10, 2021 · 1 comment


@abdul-mannan-khan

I tried to follow this repository. However, when I ran

export CUDA_VISIBLE_DEVICES="0" && python3 main.py --is_sim --obj_mesh_dir objects/blocks --num_obj 4 --push_rewards --experience_replay --explore_rate_decay --check_row --tcp_port 19997 --place --future_reward_discount 0.65 --max_train_actions 20000 --random_actions --common_sense --trial_reward

I got the following error:

Traceback (most recent call last):
  File "main.py", line 1755, in <module>
    one_train_test_run(args)
  File "main.py", line 1563, in one_train_test_run
    training_base_directory, best_dict = main(args)
  File "main.py", line 1141, in main
    trainer.backprop(prev_color_heightmap, prev_valid_depth_heightmap, prev_primitive_action, prev_best_pix_ind, label_value, goal_condition=prev_goal_condition)
  File "/home/khan/cop_ws/src/good_robot/trainer.py", line 717, in backprop
    push_predictions, grasp_predictions, place_predictions, state_feat, output_prob = self.forward(color_heightmap, depth_heightmap, is_volatile=False, specific_rotation=best_pix_ind[0], goal_condition=goal_condition)
  File "/home/khan/cop_ws/src/good_robot/trainer.py", line 445, in forward
    output_prob, state_feat = self.model.forward(input_color_data, input_depth_data, is_volatile, specific_rotation, goal_condition=goal_condition)
  File "/home/khan/cop_ws/src/good_robot/models.py", line 246, in forward
    interm_push_feat, interm_grasp_feat, interm_place_feat, tiled_goal_condition = self.layers_forward(rotate_theta, input_color_data, input_depth_data, goal_condition, tiled_goal_condition)
  File "/home/khan/cop_ws/src/good_robot/models.py", line 301, in layers_forward
    interm_place_depth_feat = self.place_depth_trunk.features(rotate_depth)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/khan/anaconda3/envs/raisim_env/lib/python3.8/site-packages/torchvision/models/densenet.py", line 33, in forward
    new_features = super(_DenseLayer, self).forward(x)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 135, in forward
    return F.batch_norm(
  File "/home/khan/.local/lib/python3.8/site-packages/torch/nn/functional.py", line 2149, in batch_norm
    return torch.batch_norm(
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 7.79 GiB total capacity; 6.09 GiB already allocated; 28.69 MiB free; 6.26 GiB reserved in total by PyTorch)
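The numbers in the message itself are worth a second look: the allocation that failed is tiny (2.00 MiB), while PyTorch has already reserved 6.26 GiB of the 7.79 GiB card. A rough back-of-the-envelope reading of the reported figures (assuming the usual meaning of these fields — "reserved" is live tensors plus the caching allocator's pool, "free" is what cudaMalloc could still hand out):

```python
# Figures copied from the RuntimeError above, converted to MiB.
total_capacity = 7.79 * 1024      # GPU 0 total memory
already_allocated = 6.09 * 1024   # live tensors
reserved = 6.26 * 1024            # live tensors + PyTorch's cache
free = 28.69                      # still available to cudaMalloc
requested = 2.00                  # the allocation that failed

# Memory held by the caching allocator but not backing live tensors:
cached_but_unused = reserved - already_allocated
print(f"cached but unused: {cached_but_unused:.0f} MiB")

# Memory outside PyTorch entirely (driver, display, other processes):
outside_pytorch = total_capacity - reserved - free
print(f"outside PyTorch: {outside_pytorch:.0f} MiB")
```

Roughly 174 MiB is cached but unused, yet a 2 MiB request still fails, which suggests the cached blocks are fragmented and the card is essentially full, rather than any single tensor being too large.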

After reading some blog posts, I found a discussion which pointed out that this error is caused by the batch size and can be fixed by reducing it. To do this, I looked into the trainer.py file and found the following section of the code:

# Construct minibatch of size 1 (b,c,h,w)
input_color_image.shape = (input_color_image.shape[0], input_color_image.shape[1], input_color_image.shape[2], 1)
input_depth_image.shape = (input_depth_image.shape[0], input_depth_image.shape[1], input_depth_image.shape[2], 1)
input_color_data = torch.from_numpy(input_color_image.astype(np.float32)).permute(3,2,0,1)
input_depth_data = torch.from_numpy(input_depth_image.astype(np.float32)).permute(3,2,0,1)
if self.flops:
    # sorry for the super random code here, but this is where we will check the
    # floating point operations (flops) counts and parameters counts for now...
    print('input_color_data trainer: ' + str(input_color_data.size()))
    class Wrapper(object):
        custom_params = {'input_color_data': input_color_data, 'input_depth_data': input_depth_data, 'goal_condition': goal_condition}
    def input_constructor(shape):
        return Wrapper.custom_params
    flops, params = get_model_complexity_info(self.model, color_heightmap.shape, as_strings=True, print_per_layer_stat=True, input_constructor=input_constructor)
    print('flops: ' + flops + ' params: ' + params)
    exit(0)
# Pass input data through model
output_prob, state_feat = self.model.forward(input_color_data, input_depth_data, is_volatile, specific_rotation, goal_condition=goal_condition)
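For reference, the shape manipulation above turns a single H×W×C heightmap into a one-sample NCHW batch, so the batch size at this point is already 1. A minimal sketch of the same transform (the 224×224×3 shape is made up for illustration):

```python
import numpy as np

# A fake 224x224 RGB heightmap in (H, W, C) layout, as in the snippet above.
img = np.zeros((224, 224, 3), dtype=np.float32)

# Append a trailing singleton axis: (H, W, C) -> (H, W, C, 1).
img.shape = (img.shape[0], img.shape[1], img.shape[2], 1)

# Reorder axes with (3, 2, 0, 1): (H, W, C, 1) -> (1, C, H, W),
# i.e. the (b, c, h, w) layout the model expects.
batch = np.transpose(img, (3, 2, 0, 1))
print(batch.shape)  # (1, 3, 224, 224)
```

Since the batch dimension is already 1 here, there may be nothing left to "divide" at this particular spot.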

I think I need to divide something to reduce the batch size. Can anyone help, please?
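One thing that sometimes helps with this kind of OOM, independent of batch size, is making sure no autograd graph is kept alive when gradients are not needed, e.g. by wrapping evaluation-only forward passes in torch.no_grad(). A generic sketch, not the repository's actual code — the model here is a placeholder standing in for the DenseNet trunks in models.py:

```python
import torch
import torch.nn as nn

# Placeholder model standing in for the real network.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())

x = torch.zeros(1, 3, 64, 64)

# Without no_grad, the output carries an autograd graph that pins
# intermediate activations in memory until the graph is freed.
y_train = model(x)

# Inside no_grad, no graph is built, so activations can be freed eagerly.
with torch.no_grad():
    y_eval = model(x)

print(y_train.requires_grad, y_eval.requires_grad)  # True False
```

Calling torch.cuda.empty_cache() can also release the allocator's cached-but-unused blocks back to the driver, though it does not reduce the memory held by live tensors.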

@ahmedhassen7
