Training fails with raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format, requesting guidance #177

yueool · 2024-04-18T17:01:14Z

Training command:
python tasks/run.py --config=egs/datasets/x6/lm3d_radnerf_sr.yaml --exp_name=motion2video_nerf/may_head --reset

Traceback (most recent call last):
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 151, in fit
self.run_single_process(self.task)
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 209, in run_single_process
self.restore_weights(checkpoint)
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 476, in restore_weights
getattr(task_ref, k).load_state_dict(v, strict=True)
File "D:\GeneFacePlusPlus_py39\python\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for RADNeRFwithSR:
size mismatch for blink_encoder.1.weight: copying a param with shape torch.Size([8, 32]) from checkpoint, the shape in current model is torch.Size([2, 32]).
size mismatch for blink_encoder.1.bias: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([2]).
'pkill' is not recognized as an internal or external command,
operable program or batch file.
Traceback (most recent call last):
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 151, in fit
self.run_single_process(self.task)
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 209, in run_single_process
self.restore_weights(checkpoint)
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 476, in restore_weights
getattr(task_ref, k).load_state_dict(v, strict=True)
File "D:\GeneFacePlusPlus_py39\python\lib\site-packages\torch\nn\modules\module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for RADNeRFwithSR:
size mismatch for blink_encoder.1.weight: copying a param with shape torch.Size([8, 32]) from checkpoint, the shape in current model is torch.Size([2, 32]).
size mismatch for blink_encoder.1.bias: copying a param with shape torch.Size([8]) from checkpoint, the shape in current model is torch.Size([2]).

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\GeneFacePlusPlus_py39\tasks\run.py", line 28, in <module>
run_task()
File "D:\GeneFacePlusPlus_py39\tasks\run.py", line 16, in run_task
task_cls.start()
File "D:\GeneFacePlusPlus_py39\utils\commons\base_task.py", line 272, in start
trainer.fit(cls)
File "D:\GeneFacePlusPlus_py39\utils\commons\trainer.py", line 156, in fit
subprocess.check_call(f'pkill -f "GeneFace_worker ({hparams["work_dir"]}"', shell=True)
File "D:\GeneFacePlusPlus_py39\python\lib\subprocess.py", line 373, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'pkill -f "GeneFace_worker (checkpoints/motion2video_nerf/may_head"' returned non-zero exit status 1.

My environment:
Windows 10, Python 3.9, torch 2.0.1

I'm stuck here and can't get past it. Any guidance would be appreciated.
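
Note on the second traceback: the CalledProcessError is a side effect rather than the root cause. After restore_weights fails, trainer.py line 156 shells out to pkill, which does not exist on Windows, so check_call raises on top of the original size-mismatch error. A minimal sketch of a platform guard is shown below; worker_cleanup_command and cleanup_workers are hypothetical helper names used for illustration and are not part of the GeneFace++ code base.

```python
import subprocess
import sys
from typing import Optional


def worker_cleanup_command(work_dir: str, platform: str = sys.platform) -> Optional[str]:
    """Return the pkill command line, or None on platforms without pkill.

    Hypothetical helper: the command string mirrors the one shown in the
    traceback (trainer.py line 156).
    """
    if platform.startswith("win"):
        # Windows has no pkill, so there is nothing sensible to run here.
        return None
    return f'pkill -f "GeneFace_worker ({work_dir}"'


def cleanup_workers(work_dir: str) -> None:
    """Best-effort cleanup that never masks the original training error."""
    cmd = worker_cleanup_command(work_dir)
    if cmd is None:
        return
    # subprocess.call returns the exit status instead of raising, so a
    # non-zero status (pkill exits with 1 when no process matched) is ignored.
    subprocess.call(cmd, shell=True)
```

With a guard like this, only the RuntimeError from load_state_dict would be reported, which is the error that actually needs fixing.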

abinggo · 2024-04-23T05:59:21Z

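I ran into the same problem. I worked around it by loading the checkpoint with non-strict matching; see my git repository for the details.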
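
For readers without access to that repository, here is a rough sketch of what non-strict loading could look like. It is an assumption about the approach, not the code from the repository mentioned above. Note that in PyTorch, load_state_dict(..., strict=False) only tolerates missing and unexpected keys; a size mismatch such as blink_encoder.1.weight ([8, 32] vs. [2, 32]) still raises, so the mismatched entries have to be dropped first. The function names filter_matching_shapes and load_non_strict are hypothetical.

```python
from typing import Any, Dict, List, Mapping, Tuple


def filter_matching_shapes(
    model_state: Mapping[str, Any],
    checkpoint_state: Mapping[str, Any],
) -> Tuple[Dict[str, Any], List[str]]:
    """Split checkpoint entries into loadable tensors and skipped key names.

    An entry is loadable only if the model has a parameter with the same
    name and the same shape.
    """
    loadable: Dict[str, Any] = {}
    skipped: List[str] = []
    for key, value in checkpoint_state.items():
        target = model_state.get(key)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            loadable[key] = value
        else:
            skipped.append(key)
    return loadable, skipped


def load_non_strict(module: Any, checkpoint_state: Mapping[str, Any]) -> List[str]:
    """Load shape-compatible tensors into a torch.nn.Module; return skipped keys."""
    loadable, skipped = filter_matching_shapes(module.state_dict(), checkpoint_state)
    # strict=False accepts that the dropped keys are now missing; those
    # parameters keep their freshly initialised values.
    module.load_state_dict(loadable, strict=False)
    return skipped
```

In trainer.py this would stand in for the getattr(task_ref, k).load_state_dict(v, strict=True) call at line 476, for example skipped = load_non_strict(getattr(task_ref, k), v). Keep in mind that the skipped blink_encoder layer would then start from random weights. The 8-versus-2 mismatch may also indicate that the config differs from the one used to produce the checkpoint, which is worth checking before relying on this workaround.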
