Welcome to ChemAgent
ChemAgent is a Plan-and-Execute agent that leverages RDKit, LangGraph, and LLMs to handle chemistry-related tasks, with optional RAG support from PubChem. This guide will help you run your first successful chemistry query.
ChemAgent uses GPT-4o for planning and, optionally, LlaSMol-Mistral-7B for specialized chemistry tasks. Make sure you have your OpenAI API key ready.
Your First Query
The simplest way to get started is to run a basic IUPAC to SMILES conversion query.
Check the output
The process_input() function returns a tuple containing:
- result: The final answer to your query
- completed: Boolean indicating if the task completed successfully
- attempts: Number of replanning attempts made
- llasmol_response: Raw response from the LlaSMol model (if LOW_VRAM=False)
- llasmol_errors: Any validation errors encountered
- formatted_input: The structured input created by the planner
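As a minimal sketch of working with that return value, the six fields can be unpacked into a named structure. The import path and exact call signature of process_input() are not shown in this guide, so the snippet uses a local stand-in with the same return shape. AgentResult, the single-string argument, and the placeholder values are all assumptions made for illustration.

```python
from typing import Any, NamedTuple


class AgentResult(NamedTuple):
    """Hypothetical wrapper naming the six documented return values, in order."""
    result: Any
    completed: bool
    attempts: int
    llasmol_response: Any
    llasmol_errors: Any
    formatted_input: Any


def process_input(query: str) -> tuple:
    """Stand-in for ChemAgent's process_input(); swap in the real import.

    Returns placeholder values only; it performs no chemistry.
    """
    return ("<answer>", True, 0, None, None, f"<structured form of: {query}>")


# Unpack positionally, matching the order documented above.
outcome = AgentResult(*process_input("Convert the IUPAC name 'ethanol' to SMILES"))

if outcome.completed:
    print(f"Answer: {outcome.result} (replanning attempts: {outcome.attempts})")
else:
    print(f"Task did not complete; validation errors: {outcome.llasmol_errors}")
```

Naming the fields avoids relying on positional indexes such as `output[1]` when checking whether the task completed.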
Running from Command Line
You can also run queries directly from the terminal.
Common Query Examples
Using RAG for Enhanced Context
Enable PubChem RAG to provide additional context from the PubChem database.
Processing Images with GPT-4o
ChemAgent can extract chemistry information from images using GPT-4o’s vision capabilities.
Supported Query Types
ChemAgent supports a wide range of chemistry tasks:
Name Conversion
- IUPAC to Molecular Formula
- IUPAC to SMILES
- SMILES to IUPAC
- SMILES to Molecular Formula
Property Prediction
- Solubility (ESOL)
- LIPO (Lipophilicity)
- BBBP (Blood-brain barrier permeability)
- Clintox (Clinical toxicity)
- HIV activity
- Side Effects
Molecule Tasks
- Molecule Captioning
- Molecule Generation
- Molecule Description
Reaction Chemistry
- Forward Synthesis
- Retrosynthesis
Understanding the Agent Architecture
ChemAgent uses a Plan-and-Execute architecture:
- Planner: Creates a step-by-step plan using GPT-4o
- Executor: Executes each step using specialized tools:
  - structure_chem_prompt: Tags IUPAC/SMILES information
  - answer_chemistry_query: Processes queries using LlaSMol (if enabled)
  - validate_smiles_rdkit: Validates SMILES output using RDKit
- Replanner: Updates the plan based on results and replans if needed
The agent automatically validates all SMILES outputs using RDKit to ensure chemical accuracy.
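The plan, execute, validate, and replan cycle described above can be sketched as a generic loop. This is not ChemAgent's implementation (which is built on LangGraph). Every callable, the max_attempts parameter, and the toy parenthesis check are hypothetical stand-ins, chosen so the example runs without GPT-4o, LlaSMol, or RDKit.

```python
from typing import Callable, List, Optional, Tuple


def run_plan_and_execute(
    query: str,
    plan: Callable[[str], List[str]],
    execute: Callable[[str], str],
    validate: Callable[[str], Optional[str]],
    replan: Callable[[str, str], List[str]],
    max_attempts: int = 3,
) -> Tuple[Optional[str], bool, int]:
    """Generic plan-and-execute loop; all callables are hypothetical stand-ins.

    validate() returns None when the output is acceptable, else an error message.
    Returns (result, completed, replanning_attempts).
    """
    steps = plan(query)
    attempts = 0
    while True:
        output = ""
        for step in steps:
            output = execute(step)  # the last step's output is treated as the answer
        error = validate(output)
        if error is None:
            return output, True, attempts
        if attempts >= max_attempts:
            return None, False, attempts
        attempts += 1
        steps = replan(query, error)


def toy_validate(text: str) -> Optional[str]:
    # Placeholder check only (balanced parentheses); the real agent validates with RDKit.
    return None if text.count("(") == text.count(")") else "unbalanced parentheses"


# The first plan yields an unbalanced string; the replanned step yields a balanced one.
result, completed, attempts = run_plan_and_execute(
    query="demo",
    plan=lambda q: ["CC(O"],
    execute=lambda step: step,
    validate=toy_validate,
    replan=lambda q, err: ["CC(O)C"],
)
print(result, completed, attempts)
```

The returned triple mirrors the result, completed, and attempts values that process_input() reports.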
Low VRAM Mode
If you’re running on a system with limited VRAM (less than 15GB), the agent defaults to LOW_VRAM mode:
- LlaSMol model is not loaded
- Only GPT-4o is used for all tasks
- Significantly reduced memory footprint
- Still provides accurate results for most queries
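One way to picture the effect of this mode is a guard around the LlaSMol-backed tool. The sketch below is an assumption about how such a guard could look, not the project's code. The function body, the low_vram parameter, and the placeholder return value are hypothetical; only the flag name and the error message appear in this guide.

```python
LOW_VRAM = True  # mirrors the flag described above; the real one lives in the project's config


def answer_chemistry_query(query: str, low_vram: bool = LOW_VRAM) -> str:
    """Hypothetical guard: refuse to call LlaSMol when the model has not been loaded."""
    if low_vram:
        raise RuntimeError(
            "answer_chemistry_query tool cannot be used with LOW_VRAM enabled"
        )
    return f"<LlaSMol response to: {query}>"  # placeholder, no model is called


try:
    answer_chemistry_query("CCO")
except RuntimeError as exc:
    print(f"LlaSMol unavailable: {exc}")
```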
Tracking Results
All query results are automatically logged to run_logs.csv:
- Query text
- Number of attempts
- Completion status
- Validation errors (if any)
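To inspect the log, a short script along these lines may help. It assumes run_logs.csv sits in the current working directory and starts with a header row. The column names are not specified in this guide, so the sketch reads whatever header the file provides rather than hard-coding any. load_run_log is a hypothetical helper name.

```python
import csv
from pathlib import Path
from typing import Dict, List


def load_run_log(path: str = "run_logs.csv") -> List[Dict[str, str]]:
    """Read the log into a list of dicts keyed by the file's own header row."""
    log_path = Path(path)
    if not log_path.exists():
        return []  # no queries have been logged yet
    with log_path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


rows = load_run_log()
print(f"{len(rows)} logged queries")
for row in rows[-5:]:  # show the five most recent entries
    print(row)
```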
Next Steps
- Installation Guide: Complete setup instructions for production use
- Core Concepts: Learn about the agent architecture
- API Reference: Detailed documentation of all functions and tools
- Guides: More chemistry query examples and use cases
Troubleshooting
GraphRecursionError
If you encounter a GraphRecursionError, the agent exceeded the recursion limit (default: 50). This usually means the query is too complex or vague.
Solution: Simplify your query or increase the recursion limit in the code.
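In LangGraph, the recursion limit is supplied per call through the run configuration. The fragment below is a sketch of that mechanism, assuming the agent invokes a compiled LangGraph graph. The names app and inputs are placeholders that do not come from this guide, and where ChemAgent sets its own limit is not shown here.

```python
# Assumption: `app` is the compiled LangGraph graph behind the agent and
# `inputs` is its input state; neither name comes from this guide.
RECURSION_LIMIT = 100  # raised from the default of 50 mentioned above

config = {"recursion_limit": RECURSION_LIMIT}

# With LangGraph installed, the limit is passed per call and the error can be caught:
# from langgraph.errors import GraphRecursionError
# try:
#     app.invoke(inputs, config=config)
# except GraphRecursionError:
#     print("Plan did not converge; try a simpler query.")
print(config)
```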
SMILES Validation Errors
If SMILES validation fails, check the llasmol_errors return value for details.
Solution: The agent will automatically replan and try to fix the error. If it persists, the SMILES may be fundamentally invalid.
Image Not Found
If the image path is invalid, the agent will print a warning and ignore the image.
Solution: Verify the image path exists and is accessible.
Low VRAM RuntimeError
If you see “answer_chemistry_query tool cannot be used with LOW_VRAM enabled”, the agent tried to use LlaSMol when it’s disabled.
Solution: Set LOW_VRAM=False in plan_execute_agent/config.py and ensure you have ≥15GB VRAM.