Hi authors,
Thanks for the great work. I ran into an issue while following the Uni-Mol+ README to generate the training dataset, and I hope to get some advice on it.
When I run `python ../get_3d_lmdb.py train`, tqdm estimates ~250 hrs to finish the dataset generation, even though our machine has 112 CPU cores. I then tested the speed on 10 molecules and found that sequential and parallel processing take about the same time. I narrowed it down to the function `rdkit_3d_gen` (shown below), which prevents any speedup from multiprocessing.
If I comment out `AllChem.EmbedMolecule(mol, randomSeed=seed, maxAttempts=1000)`, parallel processing runs at normal speed.
I would appreciate any suggestions on how to fix this issue. Looking forward to hearing from you.
Thank you
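For reference, here is a minimal sketch of the kind of sequential-versus-parallel timing check described above. It assumes a hypothetical pure-Python stand-in (`cpu_bound_stand_in`) in place of the real `rdkit_3d_gen`. The helper names, workload size, and process count are illustrative and are not taken from the Uni-Mol+ code.

```python
import time
from multiprocessing import Pool


def cpu_bound_stand_in(n):
    """Hypothetical stand-in for per-molecule work (not the RDKit call)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed(run):
    """Call a zero-argument callable and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = run()
    return result, time.perf_counter() - start


def compare(func, items, processes):
    """Time func over items sequentially, then with a process pool."""
    seq_result, seq_time = timed(lambda: [func(x) for x in items])
    with Pool(processes) as pool:
        # The pool is created before timing so only the map call is measured.
        par_result, par_time = timed(lambda: pool.map(func, items))
    # Both paths should compute the same values in the same order.
    assert seq_result == par_result
    return seq_time, par_time


if __name__ == "__main__":
    items = [200_000] * 10  # ten equally sized tasks, mirroring the 10-molecule test
    seq_time, par_time = compare(cpu_bound_stand_in, items, processes=4)
    print(f"sequential: {seq_time:.3f}s  parallel: {par_time:.3f}s")
```

Swapping the stand-in for the real per-molecule function should show whether the pool itself scales on the machine, or whether the slowdown is specific to the embedding call.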