Different type of inputs by using UniMolRepr to generate embeddings. #217

Open
WzWang-2000 opened this issue Apr 19, 2024 · 2 comments

@WzWang-2000

When using unimol_tools, I found that UniMolRepr.get_repr can take a dictionary as input. However, when I tried it, I got the following error.

clf = UniMolRepr(data_type='molecule', remove_hs=False)
mol_dict = {}
mol_dict['atoms'] = atoms
mol_dict['coordinates'] = atoms_coord
clf.get_repr(mol_dict, return_atomic_reprs=True)
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[2], line 9
      7 mol_dict['atoms'] = atoms
      8 mol_dict['coordinates'] = atoms_coord
----> 9 clf.get_repr(mol_dict, return_atomic_reprs=True)

File ~/miniconda3/lib/python3.9/site-packages/unimol_tools-1.0.0-py3.9.egg/unimol_tools/predictor.py:84, in UniMolRepr.get_repr(self, data, return_atomic_reprs)
     82 else:
     83     raise ValueError('Unknown data type: {}'.format(type(data)))
---> 84 datahub = DataHub(data=data,
     85                  task='repr',
     86                  is_train=False,
     87                  **self.params,
     88                 )
     89 dataset = MolDataset(datahub.data['unimol_input'])
     90 self.trainer = Trainer(task='repr')

File ~/miniconda3/lib/python3.9/site-packages/unimol_tools-1.0.0-py3.9.egg/unimol_tools/data/datahub.py:42, in DataHub.__init__(self, data, is_train, save_path, **params)
     40 self.multiclass_cnt = params.get('multiclass_cnt', None)
     41 self.ss_method = params.get('target_normalize', 'none')
---> 42 self._init_data(**params)

File ~/miniconda3/lib/python3.9/site-packages/unimol_tools-1.0.0-py3.9.egg/unimol_tools/data/datahub.py:55, in DataHub._init_data(self, **params)
     44 def _init_data(self, **params):
...
     70     data = pd.DataFrame(data).rename(columns={smiles_col: 'SMILES'})
     72 elif isinstance(data, list):
     73     # load from smiles list

KeyError: 'target'

I believe the issue may lie in datareader.py. Indenting the "_ = data.pop('target')" line seems to resolve the problem, but I am unsure whether this is the correct approach.

@Naplessss
Contributor

Thank you for pointing this out. This issue could potentially be due to the Python version in relation to data.pop. We will address and fix this issue as soon as possible. Alternatively, you are welcome to submit a pull request for this fix.

@emotionor
Contributor

We have fixed the bug in !214

data.pop('target', None) returns None when the 'target' key is missing, so it does not raise a KeyError.

Please make sure that the code you are currently using has been updated to the latest version.
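For anyone who wants to check the behaviour without installing unimol_tools, the two forms of `pop` can be compared on a plain dictionary. This is only a minimal sketch: the helper names `pop_target_strict` and `pop_target_lenient` and the water-like values in `mol_dict` are made up for illustration and are not taken from datareader.py.

```python
# Stand-in for the dictionary passed to get_repr in the report above.
# The values are illustrative only; note there is no 'target' key.
mol_dict = {
    "atoms": ["O", "H", "H"],
    "coordinates": [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]],
}


def pop_target_strict(data):
    """One-argument pop: raises KeyError when 'target' is absent."""
    return data.pop("target")


def pop_target_lenient(data):
    """Two-argument pop: returns the default (None) when 'target' is absent."""
    return data.pop("target", None)


# Work on copies so the original dictionary is left untouched.
try:
    pop_target_strict(dict(mol_dict))
    strict_error = None
except KeyError as exc:
    strict_error = exc

lenient_result = pop_target_lenient(dict(mol_dict))

print(repr(strict_error))  # KeyError('target')
print(lenient_result)      # None
```

The one-argument form reproduces the `KeyError: 'target'` from the traceback, while the two-argument form quietly returns `None`. `dict.pop` behaves this way on every Python 3 version.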
