Overview
ChemAgent requires several environment variables to be configured before running the agent. These variables are loaded from a .env file in the root directory of the project.
Setup Instructions
1. Create .env File
The project uses python-dotenv to load environment variables. Create a .env file in the root directory:
The .env file is automatically loaded using load_dotenv(override=True) in the agent scripts.
2. Required Environment Variables
OpenAI API Key
ChemAgent uses OpenAI’s API for several components:
- GPT-4o for the plan-and-execute agent (rdkit_agent.py:65)
- GPT-4o for structuring chemical prompts with tags (chem_tools.py:64)
- AsyncOpenAI client for RAG queries (rdkit_agent.py:268)
OPENAI_API_KEY
Usage Locations:
- plan_execute_agent/rdkit_agent.py - ChatOpenAI LLM initialization
- plan_execute_agent/chem_tools.py - OpenAI client for structured outputs
- plan_execute_agent/pubchem_rag/llm_response.py - RAG query processing
.env
WandB Configuration (Optional)
For fine-tuning tasks, Weights & Biases (WandB) integration is available:
.env
WandB variables are only required if you plan to fine-tune the LlaSMol models.
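A quick way to see which variables are present is a small check script. The sketch below is illustrative only: check_env is a hypothetical helper, and WANDB_API_KEY is an assumed name for the optional WandB variable, since the exact WandB variable names are not listed here. It reads from the process environment, so run it after the .env file has been loaded.

```python
import os

# OPENAI_API_KEY is the required variable named in this guide.
REQUIRED = ("OPENAI_API_KEY",)
# Assumption: WANDB_API_KEY stands in for the optional WandB settings.
OPTIONAL = ("WANDB_API_KEY",)


def check_env(environ=os.environ, required=REQUIRED, optional=OPTIONAL):
    """Return (missing_required, unset_optional) for the given mapping."""
    # Empty strings count as unset, since an empty key is not usable.
    missing = [name for name in required if not environ.get(name)]
    unset_optional = [name for name in optional if not environ.get(name)]
    return missing, unset_optional


missing, unset_optional = check_env()
if missing:
    print("Missing required variables:", ", ".join(missing))
else:
    print("All required variables are set.")
if unset_optional:
    print("Optional variables not set:", ", ".join(unset_optional))
```

Passing a plain dict instead of os.environ makes the helper easy to test without changing the real environment.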
Environment Loading
The .env file is loaded in multiple locations:
Agent Scripts
- plan_execute_agent/rdkit_agent.py:40-42
- plan_execute_agent/chem_tools.py:45
The override=True flag ensures that environment variables in .env take precedence over system environment variables.
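The precedence rule can be illustrated with a plain-Python model of the merge. This is a sketch of the behavior only, not python-dotenv's implementation: merge_env is a hypothetical helper, and the key values are made-up placeholders.

```python
def merge_env(environ, dotenv_values, override):
    """Model how values from a .env file combine with an existing environment."""
    merged = dict(environ)
    for key, value in dotenv_values.items():
        # With override=True the .env value always wins; otherwise an
        # already-set variable is left untouched.
        if override or key not in merged:
            merged[key] = value
    return merged


# Placeholder values standing in for a shell export and a .env entry.
system_env = {"OPENAI_API_KEY": "sk-from-shell"}
dotenv_file = {"OPENAI_API_KEY": "sk-from-dotenv"}

with_override = merge_env(system_env, dotenv_file, override=True)
without_override = merge_env(system_env, dotenv_file, override=False)

print(with_override["OPENAI_API_KEY"])     # sk-from-dotenv
print(without_override["OPENAI_API_KEY"])  # sk-from-shell
```

In practice this means a stale key exported in your shell will not shadow the one in .env, which is convenient for local development but worth remembering if you rely on injected secrets in deployment.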
Distributed Training Variables
For fine-tuning with distributed training:
These variables are typically set automatically by your distributed training launcher (e.g., torchrun, deepspeed) and don’t need manual configuration.
Dependencies
The environment configuration requires:
Verification
To verify your environment is configured correctly:
Next Steps
VRAM Settings
Configure VRAM requirements for LlaSMol models
Model Selection
Choose and configure LlaSMol models