
RuntimeError: nvrtc: error: invalid value for --gpu-architecture (-arch) nvrtc compilation failed: #38

Open
b0b6a opened this issue Mar 11, 2024 · 3 comments


b0b6a commented Mar 11, 2024

Hello,

When I run the following command:

```
python inference_for_demo_video.py \
    --wav_path data/audio/acknowledgement_chinese.m4a \
    --style_clip_path data/style_clip/3DMM/M030_front_surprised_level3_001.mat \
    --pose_path data/pose/RichardShelby_front_neutral_level1_001.mat \
    --image_path data/src_img/cropped/zp1.png \
    --disable_img_crop \
    --cfg_scale 1.0 \
    --max_gen_len 30 \
    --output_name acknowledgement_chinese@M030_front_surprised_level3_001@zp1
```

I get the following error:
```
ffmpeg version 4.2.2 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 7.3.0 (crosstool-NG 1.23.0.449-a04d0)
  configuration: --prefix=/tmp/build/80754af9/ffmpeg_1587154242452/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeho --cc=/tmp/build/80754af9/ffmpeg_1587154242452/_build_env/bin/x86_64-conda_cos6-linux-gnu-cc --disable-doc --enable-avresample --enable-gmp --enable-hardcoded-tables --enable-libfreetype --enable-libvpx --enable-pthreads --enable-libopus --enable-postproc --enable-pic --enable-pthreads --enable-shared --enable-static --enable-version3 --enable-zlib --enable-libmp3lame --disable-nonfree --enable-gpl --enable-gnutls --disable-openssl --enable-libopenh264 --enable-libx264
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'data/audio/acknowledgement_english.m4a':
  Metadata:
    major_brand     : M4A
    minor_version   : 0
    compatible_brands: M4A isommp42
    creation_time   : 2023-12-20T14:25:20.000000Z
    iTunSMPB        : 00000000 00000840 00000000 00000000000C23C0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  Duration: 00:00:16.57, start: 0.044000, bitrate: 246 kb/s
    Stream #0:0(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, mono, fltp, 244 kb/s (default)
    Metadata:
      creation_time   : 2023-12-20T14:25:20.000000Z
      handler_name    : Core Media Audio
Stream mapping:
  Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
-async is forwarded to lavfi similarly to -af aresample=async=1:min_hard_comp=0.100000:first_pts=0.
Output #0, wav, to 'tmp/acknowledgement_english@M030_front_neutral_level1_001@male_face--device=cpu/acknowledgement_english@M030_front_neutral_level1_001@male_face--device=cpu_16K.wav':
  Metadata:
    major_brand     : M4A
    minor_version   : 0
    compatible_brands: M4A isommp42
    iTunSMPB        : 00000000 00000840 00000000 00000000000C23C0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    ISFT            : Lavf58.29.100
    Stream #0:0(und): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s (default)
    Metadata:
      creation_time   : 2023-12-20T14:25:20.000000Z
      handler_name    : Core Media Audio
      encoder         : Lavc58.54.100 pcm_s16le
size=     518kB time=00:00:16.57 bitrate= 256.0kbits/s speed=1.23e+03x
video:0kB audio:518kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.014706%
Some weights of the model checkpoint at jonatasgrosman/wav2vec2-large-xlsr-53-english were not used when initializing Wav2Vec2Model: ['lm_head.weight', 'lm_head.bias']
- This IS expected if you are initializing Wav2Vec2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2Model from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Traceback (most recent call last):
  File "inference_for_demo_video.py", line 224, in <module>
    max_audio_len=args.max_gen_len,
  File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "inference_for_demo_video.py", line 105, in inference_one_video
    ddim_num_step=ddim_num_step,
  File "/home/ziyang/dreamtalk-main/core/networks/diffusion_net.py", line 226, in sample
    ready_style_code=ready_style_code,
  File "/home/ziyang/dreamtalk-main/core/networks/diffusion_net.py", line 170, in ddim_sample
    x_t_double, t=t_tensor_double, **context_double
  File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ziyang/dreamtalk-main/core/networks/diffusion_util.py", line 126, in forward
    style_code = self.style_encoder(style_clip, style_pad_mask)
  File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ziyang/dreamtalk-main/core/networks/generator.py", line 193, in forward
    style_code = self.aggregate_method(permute_style, pad_mask)
  File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ziyang/dreamtalk-main/core/networks/self_attention_pooling.py", line 31, in forward
    att_logits = self.W(batch_rep).squeeze(-1)
  File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/ziyang/anaconda3/envs/dt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ziyang/dreamtalk-main/core/networks/mish.py", line 51, in forward
    return mish(input)
RuntimeError: nvrtc: error: invalid value for --gpu-architecture (-arch)
nvrtc compilation failed:

#define NAN __int_as_float(0x7fffffff)
#define POS_INFINITY __int_as_float(0x7f800000)
#define NEG_INFINITY __int_as_float(0xff800000)

template<typename T>
__device__ T maximum(T a, T b) {
  return isnan(a) ? a : (a > b ? a : b);
}

template<typename T>
__device__ T minimum(T a, T b) {
  return isnan(a) ? a : (a < b ? a : b);
}

extern "C" __global__
void fused_tanh_mul(float* t0, float* t1, float* aten_mul) {
{
  float v = __ldg(t0 + (((512 * blockIdx.x + threadIdx.x) / 65536) * 65536 + 256 * (((512 * blockIdx.x + threadIdx.x) / 256) % 256)) + (512 * blockIdx.x + threadIdx.x) % 256);
  float v_1 = __ldg(t1 + (((512 * blockIdx.x + threadIdx.x) / 65536) * 65536 + 256 * (((512 * blockIdx.x + threadIdx.x) / 256) % 256)) + (512 * blockIdx.x + threadIdx.x) % 256);
  aten_mul[(((512 * blockIdx.x + threadIdx.x) / 65536) * 65536 + 256 * (((512 * blockIdx.x + threadIdx.x) / 256) % 256)) + (512 * blockIdx.x + threadIdx.x) % 256] = v * (tanhf(v_1));
}
}
```

Is there a problem with my GPU? I haven't been able to solve this so far.
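
In case it helps anyone compare setups, this kind of check narrows the error down (a minimal sketch using standard torch APIs; an `nvrtc` `-arch` failure usually means the CUDA toolkit the installed wheel was built against cannot target the GPU's compute capability):

```python
# Minimal sketch: compare the CUDA toolkit the torch wheel was built
# with against the GPU's compute capability. NVRTC rejects -arch when
# asked to compile for an architecture newer than it knows about
# (e.g. an sm_86 GPU with a CUDA 10.x build of torch).
import torch

print("torch:", torch.__version__)
print("CUDA build:", torch.version.cuda)  # toolkit bundled with the wheel
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print("GPU:", torch.cuda.get_device_name(0))
    print(f"compute capability: sm_{major}{minor}")
```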

Best regards,
Ziyang Jiao

@murphytju

I am running into the same problem and haven't solved it yet. If you find a solution, can you update this issue? Thanks.


b0b6a commented Mar 11, 2024

@murphytju
Of course, but I am still working on it.

@SuperMaximus1984

Same problem here. I think it might be related to Torch.
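
If it is Torch, there is a stopgap worth trying (a sketch, not a confirmed fix: it assumes the crash comes from TorchScript JIT-compiling the fused `tanh`/`mul` kernel from `mish.py` through NVRTC, which is what the dump above suggests):

```python
# Sketch of a workaround: turn off TorchScript's GPU kernel fusion so
# mish() runs as plain eager CUDA ops and NVRTC is never invoked.
# These are internal torch._C hooks present in the PyTorch 1.x line;
# call them once, before building or running the model.
import torch

torch._C._jit_override_can_fuse_on_gpu(False)  # legacy GPU fuser off
torch._C._jit_set_texpr_fuser_enabled(False)   # tensor-expression fuser off
```

If the versions really do mismatch, the proper fix is presumably reinstalling a torch wheel built for a CUDA release that supports your GPU's architecture.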
