gpt4o_chem_extract function.
Overview
The image processing pipeline uses GPT-4o’s vision capabilities to:- Extract chemical names and formulas from diagrams
- Read SMILES strings from images
- Identify structural features from molecular drawings
- Process scanned documents and handwritten notes
Basic Usage
Standalone Image Extraction
With Agent Integration
The--image flag automatically integrates image extraction into the agent workflow:
How It Works
The image processing workflow (plan_execute_agent/rdkit_agent.py:257):
Integration with Queries
When usingprocess_input() with an image, the extracted text is automatically combined with your query:
Supported Image Formats
The function supports common image formats:- PNG (
.png) - JPEG (
.jpg,.jpeg) - GIF (
.gif) - BMP (
.bmp) - WebP (
.webp)
Use Cases
Document OCR
Extract chemical data from scanned papers and patents
Structure Recognition
Read molecular structures from diagrams
Lab Notes
Process handwritten chemical formulas
Whiteboard Capture
Digitize reactions from classroom photos
Examples
Extract SMILES from Structural Diagram
Identify Compound from Image
Extract Reaction from Scheme
Batch Processing
Combining with RAG
Use both image extraction and PubChem RAG for comprehensive analysis:Best Practices
Image Quality
Image Quality
- Use high-resolution images (at least 300 DPI for scans)
- Ensure good contrast between text/structures and background
- Avoid blurry or distorted images
- Crop to relevant content when possible
Query Formulation
Query Formulation
- Be specific about what to extract
- Mention if image contains multiple compounds
- Indicate expected format (SMILES, IUPAC, formula)
- Provide context when ambiguity is possible
Validation
Validation
- Always validate extracted SMILES with
validate_smiles_rdkit - Cross-reference extracted names with databases
- Review complex structures manually
- Use multiple angles/views for 3D structures
Performance
Performance
- Process images asynchronously for better performance
- Cache results for repeated queries
- Batch similar images together
- Consider preprocessing (crop, enhance) before extraction
Error Handling
Limitations
Advanced Usage
Custom Extraction Prompts
See Also
- PubChem RAG - Augment with database information
- Name Conversion - Convert extracted representations
- SMILES Validation - Validate extracted SMILES
- Molecule Operations - Process extracted structures