Overview
The answer_chemistry_query tool uses the LlaSMol-Mistral-7B model to answer chemistry-related queries. It handles name conversions, property predictions, molecule descriptions, synthesis planning, and more.
Note: This tool requires properly tagged input from structure_chem_prompt and cannot be used when LOW_VRAM mode is enabled.
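To make the tagging requirement concrete, here is a minimal sketch of a pre-flight check that looks for well-formed <SMILES> and <IUPAC> tag pairs in a query. The helper name find_tagged_identifiers and the regular expression are assumptions for illustration; they are not part of this tool and do not reproduce what structure_chem_prompt does internally.

```python
import re

# Hypothetical helper, not part of the tool: matches a complete
# <SMILES>...</SMILES> or <IUPAC>...</IUPAC> pair with non-empty content.
TAG_PATTERN = re.compile(r"<(SMILES|IUPAC)>\s*(\S.*?)\s*</\1>")


def find_tagged_identifiers(query: str) -> list[tuple[str, str]]:
    """Return (tag, identifier) pairs for every well-formed tag pair in the query."""
    return [(m.group(1), m.group(2)) for m in TAG_PATTERN.finditer(query)]


tagged = "Can you tell me the IUPAC name of <SMILES> C1CCOC1 </SMILES>?"
untagged = "Can you tell me the IUPAC name of C1CCOC1?"

# An empty result suggests the query has not been tagged yet.
print(find_tagged_identifiers(tagged))
print(find_tagged_identifiers(untagged))
```

A check like this only confirms that tags are present and paired; it says nothing about whether the identifier inside them is chemically valid.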
Function Signature
Parameters
The chemistry-related query string, with chemical identifiers wrapped in <SMILES> and <IUPAC> tags.
Response
The generated response containing the requested chemical information, property prediction, or synthesis plan.
If generation fails, the tool instead returns "Error generating response: [error details]".
Supported Query Types
LlaSMol supports 14 distinct chemistry task types organized into four categories:
1. Name Conversion (4 tasks)
Convert an IUPAC name to its molecular formula.
Example: "What is the molecular formula of <IUPAC> 2,5-diphenyl-1,3-oxazole </IUPAC>?"
Convert an IUPAC name to SMILES notation.
Example: "Please provide the SMILES representation for <IUPAC> 4-ethyl-4-methyloxolan-2-one </IUPAC>."
Convert SMILES notation to IUPAC name.
Example: "Can you tell me the IUPAC name of <SMILES> C1CCOC1 </SMILES>?"
Convert SMILES notation to molecular formula.
Example: "What is the molecular formula for <SMILES> S=P1(N(CCCl)CCCl)NCCCO1 </SMILES>?"
2. Property Prediction (6 tasks)
Predict aqueous solubility of a molecule.
Example: "How soluble is <SMILES> CC(C)Cl </SMILES>?"
Predict lipophilicity (octanol-water partition coefficient) of a molecule.
Example: "What is the lipophilicity of <SMILES> CCO </SMILES>?"
Predict whether a molecule can penetrate the blood-brain barrier.
Example: "Can <SMILES> CC(C)Cc1ccc(cc1)C(C)C(=O)O </SMILES> cross the blood-brain barrier?"
Predict clinical toxicity of a molecule.
Example: "Is <SMILES> CN1C=NC2=C1C(=O)N(C(=O)N2C)C </SMILES> toxic?"
Predict HIV replication inhibition activity.
Example: "Does <SMILES> CC(=O)Nc1ccc(O)cc1 </SMILES> inhibit HIV?"
Predict potential side effects of a molecule.
Example: "What are the side effects of <SMILES> CC(C)NCC(COc1ccccc1)O </SMILES>?"
3. Molecule Description (2 tasks)
Generate a natural language description of a molecule’s structure and properties.
Example: "Describe this molecule: <SMILES> CCOC(=O)C1=CN=CN1[C@H](C)C1=CC=CC=C1 </SMILES>"
Generate SMILES for a molecule matching a natural language description.
Example: "Generate a molecule that is a beta-blocker with moderate lipophilicity."
4. Reaction Prediction (2 tasks)
Predict reaction products from given reactants.
Example: "What are the products of the reaction between <SMILES> CCO </SMILES> and <SMILES> CC(=O)O </SMILES>?"
Predict reactants needed to synthesize a target molecule.
Example: "What reactants are needed to synthesize <SMILES> CC(=O)OCC </SMILES>?"
LlaSMol Integration
The tool uses the LLM4Chem package with the following configuration:
Model Details
- Model: osunlp/LlaSMol-Mistral-7B
- Device: CUDA (GPU required)
- Framework: LLM4Chem generation pipeline
Usage Examples
Example 1: SMILES to IUPAC Conversion
Query:
Example 2: Molecule Description
Query:
Example 3: Solubility Prediction
Query:
System Requirements
GPU and VRAM
This tool requires:
- CUDA-capable GPU
- Sufficient VRAM to load LlaSMol-Mistral-7B (~14GB recommended)
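One plausible reading of the ~14GB figure is the size of the model weights alone, assuming a nominal 7 billion parameters stored as 16-bit values. The arithmetic below is a rough sketch under that assumption, not a measurement of this tool; actual usage will be higher once activations and the generation cache are included.

```python
def estimate_weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate the memory needed for model weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Assumption: a nominal 7e9 parameters at 2 bytes each (16-bit weights).
weights_gb = estimate_weight_memory_gb(7e9)
print(f"Approximate weight memory: {weights_gb:.1f} GB")
```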
LOW_VRAM Mode
When LOW_VRAM=True in the configuration:
Response Tracking
The tool stores responses in the llasmol_response module:
Error Handling
The tool handles several error conditions:
- LOW_VRAM enabled: Raises RuntimeError
- Generation failure: Returns "Error generating response: [error details]"
- Invalid query format: May produce unexpected results if tags are missing
Best Practices
- Always preprocess queries: Use structure_chem_prompt first to tag chemical identifiers
- Validate SMILES: Consider using validate_smiles_rdkit to check SMILES validity before querying
- Check VRAM: Ensure sufficient GPU memory is available
- Use specific queries: LlaSMol performs best with clear, specific questions matching the supported task types
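Because the tool signals failure in two different ways (a raised RuntimeError when LOW_VRAM is enabled, and a returned error string when generation fails), callers may want a single place that normalizes both. The wrapper below is a minimal sketch of that idea. The name safe_query is hypothetical, the tool is passed in as a plain callable because its exact signature is not shown here, and the fake_* functions are stand-ins that only exercise the wrapper without a GPU.

```python
from typing import Callable, Optional

# Prefix of the error string documented under Error Handling.
ERROR_PREFIX = "Error generating response:"


def safe_query(tool: Callable[[str], str], query: str) -> Optional[str]:
    """Call the tool and return its answer, or None if no usable answer came back."""
    try:
        response = tool(query)
    except RuntimeError:
        # Documented behaviour when LOW_VRAM is enabled.
        return None
    if response.startswith(ERROR_PREFIX):
        # Generation failed; the tool reported it as a string instead of raising.
        return None
    return response


# Stand-in callables with canned behaviour (assumptions, not the real tool).
def fake_ok_tool(query: str) -> str:
    return "oxolane"


def fake_low_vram_tool(query: str) -> str:
    raise RuntimeError("LOW_VRAM mode is enabled")


def fake_failing_tool(query: str) -> str:
    return "Error generating response: out of memory"
```

Catching RuntimeError broadly also swallows unrelated runtime failures, so a real caller may prefer to log the exception before returning None.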