Training fails with raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format, looking for guidance #177

Open
yueool opened this issue Apr 18, 2024 · 1 comment

Comments

@yueool

yueool commented Apr 18, 2024

Training command:
python tasks/run.py --config=egs/datasets/x6/lm3d_radnerf_sr.yaml --exp_name=motion2video_nerf/may_head --reset


Traceback (most recent call last):
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 151, in fit
self.run_single_process(self.task)
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 209, in run_single_process
self.restore_weights(checkpoint)
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 476, in restore_weights
getattr(task_ref, k).load_state_dict(v, strict=True)
File "D:\GeneFacePlusPlus_py39\python\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for RADNeRFwithSR:
size mismatch for blink_encoder.1.weight: copying a param with shape torch.Size([8, 32]) from checkpoint, the shape in current model is torch.Size([2, 32]).
size mismatch for blink_encoder.1.bias: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([2]).
'pkill' is not recognized as an internal or external command,
operable program or batch file.
Traceback (most recent call last):
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 151, in fit
self.run_single_process(self.task)
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 209, in run_single_process
self.restore_weights(checkpoint)
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 476, in restore_weights
getattr(task_ref, k).load_state_dict(v, strict=True)
File "D:\GeneFacePlusPlus_py39\python\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for RADNeRFwithSR:
size mismatch for blink_encoder.1.weight: copying a param with shape torch.Size([8, 32]) from checkpoint, the shape in current model is torch.Size([2, 32]).
size mismatch for blink_encoder.1.bias: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([2]).

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\GeneFacePlusPlus_py39\tasks\run.py", line 28, in <module>
run_task()
File "D:\GeneFacePlusPlus_py39\tasks\run.py", line 16, in run_task
task_cls.start()
File "D:\GeneFacePlusPlus_py39\utils\commons\base_task.py", line 272, in start
trainer.fit(cls)
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 156, in fit
subprocess.check_call(f'pkill -f "GeneFace_worker ({hparams["work_dir"]}"', shell=True)
File "D:\GeneFacePlusPlus_py39\python\lib\subprocess.py", line 373, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'pkill -f "GeneFace_worker (checkpoints/motion2video_nerf/may_head"' returned non-zero exit status 1.

My environment:
Windows 10, Python 3.9, torch 2.0.1

I'm stuck here and can't get past it. Any guidance would be appreciated.
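
Note on the second traceback: it looks like a side effect rather than the root cause. After the state_dict error, the trainer's cleanup step shells out to `pkill`, which Windows does not provide, so `subprocess.check_call` raises `CalledProcessError` on top of the original error. A minimal sketch of a more tolerant cleanup call, assuming it is acceptable to skip the cleanup when the tool is missing (`run_if_available` is a hypothetical helper, not part of the repository):

```python
import shutil
import subprocess


def run_if_available(argv):
    """Run a best-effort cleanup command; return False if the tool is not installed."""
    if shutil.which(argv[0]) is None:
        # e.g. pkill on Windows: skip instead of raising a second, misleading error
        return False
    # subprocess.call returns the exit status instead of raising on a non-zero one
    subprocess.call(argv)
    return True
```

With something like this in place of the `check_call`, only the original `RuntimeError` would be reported on Windows.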

@abinggo

abinggo commented Apr 23, 2024

I ran into the same problem. I worked around it with non-strict matching; see my git repository for the details.
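
For readers without access to that repository, here is a rough sketch of what non-strict loading could look like. It is not necessarily the approach used above. One caveat: in PyTorch, `load_state_dict(..., strict=False)` only tolerates missing or unexpected keys and still raises on size mismatches such as `blink_encoder.1.weight` ([8, 32] in the checkpoint vs. [2, 32] in the model), so the mismatched entries have to be filtered out first. The helper names `filter_matching_shapes` and `load_compatible` are hypothetical:

```python
def filter_matching_shapes(checkpoint_state, model_state):
    """Split checkpoint entries into those that fit the model and those that do not."""
    compatible, skipped = {}, []
    for name, value in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            compatible[name] = value
        else:
            # key is absent from the model, or its shape differs
            skipped.append(name)
    return compatible, skipped


def load_compatible(module, checkpoint_state):
    """Load only shape-compatible parameters and return the names that were skipped."""
    compatible, skipped = filter_matching_shapes(checkpoint_state, module.state_dict())
    # strict=False lets the model keep its current values for the filtered-out keys
    module.load_state_dict(compatible, strict=False)
    return skipped
```

In `restore_weights`, something like `load_compatible(getattr(task_ref, k), v)` could stand in for the strict call. The skipped layers (here the blink encoder's second layer) keep their freshly initialized weights, so it is worth checking whether the config is meant to match the checkpoint's size of 8 before relying on this.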
