Torchbench models that don't run in dynamo runners #1901

Open
msaroufim opened this issue Sep 15, 2023 · 1 comment

msaroufim commented Sep 15, 2023

There are small nuances in how the dynamo runners benchmark models that can make certain torchbench models fail.

Some models might be explicitly skipped; others might fail because of a dtype conversion. This can be frustrating because if you add a model to torchbench, like clip or cm3leon, you won't see it in the pt2 dashboard, so I'm creating this giant tracker issue to solve this.

To repro: look at the logs in HUD, e.g. https://ossci-raw-job-status.s3.amazonaws.com/log/16535270177, and compare them to the model names in models/ and canary_models/

If a model shows up under "Unique to Torchbench", that means it's not showing up in the pt2 dashboard.
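The comparison described above can be sketched roughly as follows. This is a minimal sketch, not the script used for this issue: `list_model_names` and `compare_model_sets` are hypothetical helpers, it assumes one directory per model under models/ and canary_models/, and it assumes the dynamo runner model names have already been extracted from the HUD log by some other means (the log format is not covered here).

```python
from pathlib import Path


def list_model_names(*roots):
    """Collect subdirectory names under each root (e.g. models/, canary_models/)."""
    names = set()
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue
        for entry in root.iterdir():
            # Assumption: one directory per model; skip private/hidden dirs
            # such as __pycache__.
            if entry.is_dir() and not entry.name.startswith(("_", ".")):
                names.add(entry.name)
    return names


def compare_model_sets(torchbench_names, dynamo_names):
    """Return the names that appear on only one side of the comparison."""
    torchbench_names = set(torchbench_names)
    dynamo_names = set(dynamo_names)
    return {
        "unique_to_torchbench": sorted(torchbench_names - dynamo_names),
        "unique_to_dynamo": sorted(dynamo_names - torchbench_names),
    }
```

Calling `compare_model_sets(list_model_names("models", "canary_models"), names_from_log)` would then give the two "unique to" lists, where `names_from_log` is whatever set of names was pulled out of the log.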

There are some concrete things we could do better in the dynamo runners, starting with erroring loudly, but we should also track what these failures are.
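As one illustration of what "erroring loudly" could look like, here is a minimal sketch of a post-run check. `MissingModelsError` and `check_all_models_ran` are hypothetical names and are not part of the dynamo runners; the sketch assumes the runner can hand over a mapping of model name to status for every model that actually produced a result.

```python
class MissingModelsError(RuntimeError):
    """Raised when expected models never produced a benchmark result."""


def check_all_models_ran(expected, results, allowed_skips=()):
    """Fail loudly if an expected model has no result and is not an allowed skip.

    expected: iterable of model names that should have run.
    results: mapping of model name -> status string for models that did run.
    allowed_skips: names that are intentionally excluded.
    """
    allowed = set(allowed_skips)
    missing = sorted(
        name for name in set(expected) if name not in results and name not in allowed
    )
    if missing:
        # List every missing model so the failure is visible in the job log.
        raise MissingModelsError(
            f"{len(missing)} model(s) produced no result: {', '.join(missing)}"
        )
    return True
```

Keeping the intentional skips in an explicit list means a newly added model that silently drops out raises an error instead of just being absent from the dashboard.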

Notably, I found this problem while investigating stable diffusion and cm3leon.


msaroufim commented Sep 16, 2023

My dedup script has some mistakes but is mostly right; I'm just keeping this here for now so I can preprocess it.

| Torchbench models | Dynamo runner models | Unique to Torchbench | Unique to Dynamo |
| --- | --- | --- | --- |
| clip | hf_T5_generate | clip | |
| diffuser_instruct_pix2pix | hf_T5_large | diffuser_instruct_pix2pix | |
| fambench_dlrm | hf_Whisper | fambench_dlrm | |
| gat | lennard_jones | gat | |
| gcn | llama | gcn | |
| hf_GPT2_generate | llama_v2_7b_16h | hf_GPT2_generate | |
| hf_MPT_7b_instruct | maml_omniglot | hf_MPT_7b_instruct | |
| lit_llama | mnasnet1_0 | lit_llama | |
| lit_llama_generate | mobilenet_v2 | lit_llama_generate | |
| lit_llama_lora | mobilenet_v3_large | lit_llama_lora | |
| llama_v2_13b | moco | llama_v2_13b | |
| llama_v2_70b | nanogpt_generate | llama_v2_70b | |
| llama_v2_7b | nvidia_deeprecommender | llama_v2_7b | |
| sage | opacus_cifar10 | sage | |
| torchrec_dlrm | phlippe_densenet | torchrec_dlrm | |
| BERT_pytorch | phlippe_resnet | BERT_pytorch | |
| Background_Matting | pyhpc_equation_of_state | Background_Matting | |
| DALLE2_pytorch | pyhpc_isoneutral_mixing | DALLE2_pytorch | |
| LearningToPaint | pyhpc_turbulent_kinetic_energy | LearningToPaint | |
| Super_SloMo | pytorch_CycleGAN_and_pix2pix | Super_SloMo | |
| alexnet | pytorch_stargan | alexnet | pytorch_stargan |
| basic_gnn_edgecnn | hf_T5_generate | basic_gnn_edgecnn | |
| basic_gnn_gcn | hf_T5_large | basic_gnn_gcn | |
| basic_gnn_gin | hf_Whisper | basic_gnn_gin | |
| basic_gnn_sage | lennard_jones | basic_gnn_sage | |
| cm3leon_generate | llama | cm3leon_generate | |
| dcgan | llama_v2_7b_16h | dcgan | |
| demucs | maml_omniglot | demucs | |
| densenet121 | mnasnet1_0 | densenet121 | |
| detectron2_fasterrcnn_r_101_c4 | mobilenet_v2 | detectron2_fasterrcnn_r_101_c4 | |
| detectron2_fasterrcnn_r_101_dc5 | mobilenet_v3_large | detectron2_fasterrcnn_r_101_dc5 | |
| detectron2_fasterrcnn_r_101_fpn | moco | detectron2_fasterrcnn_r_101_fpn | |
| detectron2_fasterrcnn_r_50_c4 | nanogpt_generate | detectron2_fasterrcnn_r_50_c4 | |
| detectron2_fasterrcnn_r_50_dc5 | nvidia_deeprecommender | detectron2_fasterrcnn_r_50_dc5 | |
| detectron2_fasterrcnn_r_50_fpn | opacus_cifar10 | detectron2_fasterrcnn_r_50_fpn | |
| detectron2_fcos_r_50_fpn | phlippe_densenet | detectron2_fcos_r_50_fpn | |
| detectron2_maskrcnn | phlippe_resnet | detectron2_maskrcnn | |
| detectron2_maskrcnn_r_101_c4 | pyhpc_equation_of_state | detectron2_maskrcnn_r_101_c4 | |
| detectron2_maskrcnn_r_101_fpn | pyhpc_isoneutral_mixing | detectron2_maskrcnn_r_101_fpn | |
| detectron2_maskrcnn_r_50_c4 | pyhpc_turbulent_kinetic_energy | detectron2_maskrcnn_r_50_c4 | |
| detectron2_maskrcnn_r_50_fpn | pytorch_CycleGAN_and_pix2pix | detectron2_maskrcnn_r_50_fpn | |
| dlrm | pytorch_stargan | dlrm | pytorch_stargan |
| doctr_det_predictor | hf_T5_generate | doctr_det_predictor | |
| doctr_reco_predictor | hf_T5_large | doctr_reco_predictor | |
| drq | hf_Whisper | drq | |
| fambench_xlmr | lennard_jones | fambench_xlmr | |
| fastNLP_Bert | llama | fastNLP_Bert | |
| functorch_dp_cifar10 | llama_v2_7b_16h | functorch_dp_cifar10 | |
| functorch_maml_omniglot | maml_omniglot | functorch_maml_omniglot | |
| hf_Albert | mnasnet1_0 | hf_Albert | |
| hf_Bart | mobilenet_v2 | hf_Bart | |
| hf_Bert | mobilenet_v3_large | hf_Bert | |
| hf_Bert_large | moco | hf_Bert_large | |
| hf_BigBird | nanogpt_generate | hf_BigBird | |
| hf_DistilBert | nvidia_deeprecommender | hf_DistilBert | |
| hf_GPT2 | opacus_cifar10 | hf_GPT2 | |
| hf_GPT2_large | phlippe_densenet | hf_GPT2_large | |
| hf_Longformer | phlippe_resnet | hf_Longformer | |
| hf_Reformer | pyhpc_equation_of_state | hf_Reformer | |
| hf_T5 | pyhpc_isoneutral_mixing | hf_T5 | |
| hf_T5_base | pytorch_CycleGAN_and_pix2pix | | |
| hf_T5_generate | pytorch_stargan | | pytorch_stargan |
| hf_T5_large | hf_T5_generate | | |
| hf_Whisper | hf_T5_large | | |
| lennard_jones | hf_Whisper | | |
| llama | lennard_jones | | |
| llama_v2_7b_16h | llama | | |
| maml | llama_v2_7b_16h | maml | |
| maml_omniglot | maml_omniglot | | |
| mnasnet1_0 | mnasnet1_0 | | |
| mobilenet_v2 | mobilenet_v2 | | |
| mobilenet_v2_quantized_qat | mobilenet_v3_large | mobilenet_v2_quantized_qat | |
| mobilenet_v3_large | moco | | |
| moco | nanogpt_generate | | |
| nanogpt_generate | nvidia_deeprecommender | | |
| nvidia_deeprecommender | opacus_cifar10 | | |
| opacus_cifar10 | phlippe_densenet | | |
| phlippe_densenet | phlippe_resnet | | |
| phlippe_resnet | pyhpc_equation_of_state | | |
| pyhpc_equation_of_state | pyhpc_isoneutral_mixing | | |
| pyhpc_isoneutral_mixing | pyhpc_turbulent_kinetic_energy | | |
| pyhpc_turbulent_kinetic_energy | pytorch_CycleGAN_and_pix2pix | | |
| pytorch_CycleGAN_and_pix2pix | pytorch_stargan | | pytorch_stargan |
| pytorch_unet | hf_T5_large | pytorch_unet | |
| resnet152 | hf_Whisper | resnet152 | |
| resnet18 | lennard_jones | resnet18 | |
| resnet50 | llama | resnet50 | |
| resnet50_quantized_qat | llama_v2_7b_16h | resnet50_quantized_qat | |
| resnext50_32x4d | maml_omniglot | resnext50_32x4d | |
| sam | mnasnet1_0 | sam | |
| shufflenet_v2_x1_0 | mobilenet_v2 | shufflenet_v2_x1_0 | |
| simple_gpt | mobilenet_v3_large | simple_gpt | |
| soft_actor_critic | moco | soft_actor_critic | |
| speech_transformer | nanogpt_generate | speech_transformer | |
| squeezenet1_1 | nvidia_deeprecommender | squeezenet1_1 | |
| stable_diffusion_text_encoder | opacus_cifar10 | stable_diffusion_text_encoder | |
| stable_diffusion_unet | phlippe_densenet | stable_diffusion_unet | |
| tacotron2 | phlippe_resnet | tacotron2 | |
| timm_efficientdet | pyhpc_equation_of_state | timm_efficientdet | |
| timm_efficientnet | pyhpc_isoneutral_mixing | timm_efficientnet | |
| timm_nfnet | pyhpc_turbulent_kinetic_energy | timm_nfnet | |
| timm_regnet | pytorch_CycleGAN_and_pix2pix | timm_regnet | |
| timm_resnest | pytorch_stargan | timm_resnest | pytorch_stargan |
| timm_vision_transformer | hf_T5_base | timm_vision_transformer | |
| timm_vision_transformer_large | hf_T5_generate | timm_vision_transformer_large | |
| timm_vovnet | hf_T5_large | timm_vovnet | |
| tts_angular | hf_Whisper | tts_angular | |
| vgg16 | lennard_jones | vgg16 | |
| vision_maskrcnn | llama | vision_maskrcnn | |
| yolov3 | llama_v2_7b_16h | yolov3 | |
| | maml_omniglot | | |
| | mnasnet1_0 | | |
| | mobilenet_v2 | | |
| | mobilenet_v3_large | | |
| | moco | | |
| | nanogpt_generate | | |
| | nvidia_deeprecommender | | |
| | opacus_cifar10 | | |
| | phlippe_densenet | | |
| | phlippe_resnet | | |
| | pyhpc_equation_of_state | | |
| | pyhpc_isoneutral_mixing | | |
| | pyhpc_turbulent_kinetic_energy | | |
| | pytorch_CycleGAN_and_pix2pix | | |
| | pytorch_stargan | | pytorch_stargan |
| | hf_T5_base | | |
| | hf_T5_generate | | |
| | hf_T5_large | | |
| | hf_Whisper | | |
| | lennard_jones | | |
| | llama | | |
| | llama_v2_7b_16h | | |
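For the preprocessing step, a minimal sketch of an order-preserving dedup that also rebuilds the four columns is below. It is not the original dedup script: `dedupe`, `build_rows`, and `to_markdown` are hypothetical helpers, and the sketch assumes each column is an independent list padded to the same length rather than a row-aligned record.

```python
from itertools import zip_longest


def dedupe(names):
    """Drop repeated names while keeping first-seen order."""
    return list(dict.fromkeys(names))


def build_rows(torchbench, dynamo):
    """Build (torchbench, dynamo, unique_to_torchbench, unique_to_dynamo) rows."""
    torchbench, dynamo = dedupe(torchbench), dedupe(dynamo)
    tb_set, dy_set = set(torchbench), set(dynamo)
    only_tb = [n for n in torchbench if n not in dy_set]
    only_dy = [n for n in dynamo if n not in tb_set]
    # The four columns have different lengths, so pad the short ones with "".
    return list(zip_longest(torchbench, dynamo, only_tb, only_dy, fillvalue=""))


def to_markdown(rows):
    """Render the rows as a pipe-delimited table."""
    header = "| Torchbench models | Dynamo runner models | Unique to Torchbench | Unique to Dynamo |"
    divider = "| --- | --- | --- | --- |"
    body = ["| " + " | ".join(row) + " |" for row in rows]
    return "\n".join([header, divider, *body])
```

Deduplicating before computing the differences keeps a name that was logged several times from appearing more than once in the Dynamo runner column.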

pytorchmergebot pushed a commit to pytorch/pytorch that referenced this issue Sep 19, 2023
Helps debug pytorch/benchmark#1901

I will wait until the ONNX beartype sev is fixed before merging

Pull Request resolved: #109536
Approved by: https://github.com/xuzhao9