Environment Variables

Overview

ChemAgent reads its configuration from environment variables, which must be set before running the agent. These variables are loaded from a .env file in the root directory of the project.

Setup Instructions

1. Create .env File

The project uses python-dotenv to load environment variables. Create a .env file in the root directory:

cp .env.example .env

The .env file is automatically loaded using load_dotenv(override=True) in the agent scripts.

2. Required Environment Variables

OpenAI API Key

ChemAgent uses OpenAI’s API for several components:

GPT-4o for the plan-and-execute agent (rdkit_agent.py:65)
GPT-4o for structuring chemical prompts with tags (chem_tools.py:64)
AsyncOpenAI client for RAG queries (rdkit_agent.py:268)

Variable Name: OPENAI_API_KEY

Usage Locations:

plan_execute_agent/rdkit_agent.py - ChatOpenAI LLM initialization
plan_execute_agent/chem_tools.py - OpenAI client for structured outputs
plan_execute_agent/pubchem_rag/llm_response.py - RAG query processing

Configuration:

.env

OPENAI_API_KEY=your_api_key_here

The OpenAI API key is required for the agent to function. Without it, the agent will fail to initialize.
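
Because the agent cannot start without the key, it can help to check for it before any client is constructed. The following is a minimal sketch, not code from the repository; the helper name require_env is hypothetical, and it assumes the .env file has already been loaded.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration; ChemAgent itself may handle a
    missing key differently.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value
```

Calling require_env("OPENAI_API_KEY") right after load_dotenv(override=True) would surface a missing key as one readable error instead of a failure deeper in client initialization.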

WandB Configuration (Optional)

For fine-tuning tasks, Weights & Biases (WandB) integration is available:

.env

WANDB_PROJECT=your_project_name
WANDB_WATCH=gradients
WANDB_LOG_MODEL=checkpoint

Usage: LLM4Chem/finetune.py for tracking fine-tuning experiments

WandB variables are only required if you plan to fine-tune the LlaSMol models.
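
To see which of these optional variables are set before launching a fine-tuning run, a small check like the one below may be useful. It is a sketch under stated assumptions: the helper names wandb_settings and wandb_enabled are hypothetical, and treating WANDB_PROJECT as the on/off signal is an assumption, not something confirmed by LLM4Chem/finetune.py.

```python
import os

# The three variables listed in the .env example above.
WANDB_VARS = ("WANDB_PROJECT", "WANDB_WATCH", "WANDB_LOG_MODEL")


def wandb_settings(environ=os.environ) -> dict:
    """Collect the WandB variables that are set and non-empty."""
    return {name: environ[name] for name in WANDB_VARS if environ.get(name)}


def wandb_enabled(environ=os.environ) -> bool:
    """Assumption: tracking counts as enabled only when a project name is set."""
    return "WANDB_PROJECT" in wandb_settings(environ)
```

Passing a plain dict instead of os.environ makes the helpers easy to try out without touching the real environment.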

Environment Loading

The .env file is loaded in multiple locations:

Agent Scripts

from dotenv import load_dotenv

load_dotenv(override=True)  # Loads .env file

Locations:

plan_execute_agent/rdkit_agent.py:40-42
plan_execute_agent/chem_tools.py:45

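The override=True flag ensures that values in .env take precedence over variables of the same name already set in the system environment. A key exported in your shell is therefore replaced by the .env value when the agent scripts start.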
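
The precedence rule can be illustrated without python-dotenv installed. The sketch below is a simplified stand-in written for this page, not python-dotenv's implementation: apply_env_file is a hypothetical function that only mimics how the override flag decides between an existing value and the one in the file.

```python
def apply_env_file(lines, environ, override):
    """Apply KEY=VALUE lines to `environ`, mimicking the precedence rule only."""
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key, value = key.strip(), value.strip()
        # With override=False, a key that already exists is left untouched.
        if override or key not in environ:
            environ[key] = value
    return environ


shell_env = {"OPENAI_API_KEY": "from-shell"}
dotenv_lines = ["OPENAI_API_KEY=from-dotenv"]

# override=False: the value already in the environment is kept.
kept = apply_env_file(dotenv_lines, dict(shell_env), override=False)
# override=True: the value from the file replaces it.
replaced = apply_env_file(dotenv_lines, dict(shell_env), override=True)

print(kept["OPENAI_API_KEY"])      # from-shell
print(replaced["OPENAI_API_KEY"])  # from-dotenv
```

If the agent appears to use a stale or unexpected key, checking the value in .env first is a reasonable step, since with override=True that file wins.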

Distributed Training Variables

For fine-tuning with distributed training:

LOCAL_RANK=0
WORLD_SIZE=1

Usage: LLM4Chem/finetune.py for multi-GPU training coordination

These variables are typically set automatically by your distributed training launcher (e.g., torchrun, deepspeed) and don’t need manual configuration.
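
A training script usually reads these values with single-process fallbacks so that it also runs without a launcher. The following is a sketch of that pattern, not an excerpt from LLM4Chem/finetune.py; the function name distributed_info is hypothetical, and the defaults mirror the LOCAL_RANK=0 and WORLD_SIZE=1 values shown above.

```python
import os


def distributed_info(environ=os.environ):
    """Read launcher-provided rank variables, with single-process defaults.

    Returns (local_rank, world_size, is_distributed).
    """
    local_rank = int(environ.get("LOCAL_RANK", 0))
    world_size = int(environ.get("WORLD_SIZE", 1))
    return local_rank, world_size, world_size > 1
```

With no launcher, distributed_info() falls back to rank 0 in a world of size 1, so the same script can serve both single-GPU and multi-GPU runs.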

Dependencies

The environment configuration requires:

python-dotenv==0.19.1
langchain-openai==0.1.25
openai

Install via:

pip install -r agent_requirements.txt
pip install -r comb_requirements.txt
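
After installing, the versions that actually ended up in the environment can be checked against the pins above. This is a minimal sketch using only the standard library; installed_versions is a hypothetical helper name, and the package list simply repeats the three dependencies named in this section.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    names = ["python-dotenv", "langchain-openai", "openai"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'not installed'}")
```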

Verification

To verify your environment is configured correctly:

import os
from dotenv import load_dotenv

load_dotenv(override=True)

# Check if OpenAI API key is loaded
if os.getenv("OPENAI_API_KEY"):
    print("✓ OpenAI API key configured")
else:
    print("✗ OpenAI API key not found")

Next Steps

VRAM Settings

Model Selection