subprocess.CalledProcessError: #20

Open
Dhanachandra opened this issue Jan 13, 2022 · 1 comment
Comments

@Dhanachandra

I got the following error:
[2022-01-13 14:47:32,154] [INFO] [launch.py:131:sigkill_handler] Killing subprocess 2273
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/gpt2_lm/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ubuntu/anaconda3/envs/gpt2_lm/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/anaconda3/envs/gpt2_lm/lib/python3.8/site-packages/deepspeed/launcher/launch.py", line 167, in <module>
    main()
  File "/home/ubuntu/anaconda3/envs/gpt2_lm/lib/python3.8/site-packages/deepspeed/launcher/launch.py", line 156, in main
    sigkill_handler(signal.SIGTERM, None) # not coming back
  File "/home/ubuntu/anaconda3/envs/gpt2_lm/lib/python3.8/site-packages/deepspeed/launcher/launch.py", line 137, in sigkill_handler
    raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/ubuntu/anaconda3/envs/gpt2_lm/bin/python', '-u', 'run_clm.py', '--local_rank=0', '--deepspeed', 'ds_config.json', '--model_name_or_path', 'gpt2-xl', '--train_file', '../../dataset/train.txt', '--validation_file', '../../dataset/test.txt', '--do_train', '--do_eval', '--fp16', '--overwrite_cache', '--evaluation_strategy=steps', '--output_dir', 'finetuned', '--eval_steps', '500', '--num_train_epochs', '1', '--gradient_accumulation_steps', '2', '--per_device_train_batch_size', '1', '--per_device_eval_batch_size', '1']' died with <Signals.SIGKILL: 9>.
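
For anyone reading the last line of the error above: "died with <Signals.SIGKILL: 9>" is how Python's `subprocess.CalledProcessError` prints a negative return code, meaning the training process was killed by signal 9 from outside rather than exiting with an error of its own. The sketch below only illustrates that convention on a POSIX system. The helper `describe_returncode` and the shortened command are hypothetical and not part of DeepSpeed or `run_clm.py`. What sent the signal is not shown in the log; on Linux the kernel out-of-memory killer is a common source, but that is an assumption here, not something the traceback confirms.

```python
import signal
import subprocess


def describe_returncode(returncode: int) -> str:
    """Explain a subprocess return code in words (hypothetical helper)."""
    if returncode < 0:
        # subprocess convention on POSIX: -N means "killed by signal N".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"unknown signal {-returncode}"
        return f"killed by {name}"
    if returncode == 0:
        return "exited normally"
    return f"exited with status {returncode}"


# Rebuild an error shaped like the one in the report.
# The command list is shortened for illustration only.
err = subprocess.CalledProcessError(
    returncode=-signal.SIGKILL,
    cmd=["python", "-u", "run_clm.py"],
)

print(describe_returncode(err.returncode))
print(err)  # the message names the signal, as in the traceback above
```

If the out-of-memory assumption holds, the kernel log on the machine would normally record the kill; checking it is a reasonable first step before changing the training configuration.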

@IdoAmit198

Having the same issue here.
