BioMCP Architecture Diagrams

This page describes BioMCP's architecture, data flows, and workflows.

System Architecture Overview

BioMCP consists of three main layers:

Client Layer

  • CLI Interface: Command-line tool for direct interaction
  • Claude Desktop: AI assistant integration via MCP protocol
  • Python SDK: Programmatic access for custom applications
  • Custom MCP Clients: Any MCP-compatible client

BioMCP Core

  • MCP Server: Handles protocol communication
  • Request Router: Directs queries to appropriate handlers
  • Cache Layer: Caches API responses with domain-specific expiration
  • Domain Handlers: Specialized processors for each data type
      • Articles Handler (PubMed/PubTator3)
      • Trials Handler (ClinicalTrials.gov, NCI)
      • Variants Handler (MyVariant.info)
      • Genes Handler (MyGene.info)
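
The relationship between the Request Router and the Domain Handlers can be pictured as a dispatch table. The following is a minimal sketch, not BioMCP's actual code: the `register` decorator, the `route` function, and the handler names are all hypothetical and exist only for illustration.

```python
from typing import Callable

Handler = Callable[[dict], dict]

# Hypothetical registry mapping a domain name to its handler function.
_handlers: dict[str, Handler] = {}


def register(domain: str) -> Callable[[Handler], Handler]:
    """Decorator that records a handler under a domain name."""
    def decorator(func: Handler) -> Handler:
        _handlers[domain] = func
        return func
    return decorator


@register("article")
def handle_article(query: dict) -> dict:
    # A real handler would call PubMed/PubTator3; this one only echoes.
    return {"domain": "article", "query": query}


@register("variant")
def handle_variant(query: dict) -> dict:
    # A real handler would call MyVariant.info; this one only echoes.
    return {"domain": "variant", "query": query}


def route(domain: str, query: dict) -> dict:
    """Direct a query to the handler registered for its domain."""
    try:
        handler = _handlers[domain]
    except KeyError:
        raise ValueError(f"unknown domain: {domain!r}") from None
    return handler(query)
```

Calling `route("article", {"gene": "BRAF"})` dispatches to `handle_article`, while an unregistered domain raises a `ValueError` with a readable message.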

External APIs

  • PubMed/PubTator3: Biomedical literature
  • ClinicalTrials.gov: US clinical trials registry
  • NCI CTS API: National Cancer Institute trials
  • MyVariant.info: Genetic variant annotations
  • MyGene.info: Gene information
  • cBioPortal: Cancer genomics data
  • AlphaGenome: Variant effect predictions

Data Flow Architecture

  1. User Request: Query submitted via CLI, Claude, or SDK
  2. Cache Check: System checks for cached results
  3. API Request: If cache miss, fetch from external API
  4. Result Processing: Normalize and enrich data
  5. Cache Storage: Store results for future use
  6. Response Delivery: Return formatted results to user
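
The six steps above follow the cache-aside pattern. A minimal sketch, assuming a simple in-memory dictionary as the cache: `handle_request`, its parameters, and the default TTL are hypothetical names and values chosen for this example, not BioMCP's API.

```python
import time
from typing import Any, Callable

# Hypothetical in-memory cache: key -> (expiry timestamp, processed result).
_cache: dict[str, tuple[float, Any]] = {}


def handle_request(
    key: str,
    fetch: Callable[[], Any],
    process: Callable[[Any], Any],
    ttl_seconds: float = 300.0,
) -> Any:
    """Return a processed result, consulting the cache before the API."""
    now = time.monotonic()
    entry = _cache.get(key)
    if entry is not None and entry[0] > now:   # step 2: cache hit
        return entry[1]
    raw = fetch()                              # step 3: cache miss, call the API
    result = process(raw)                      # step 4: normalize and enrich
    _cache[key] = (now + ttl_seconds, result)  # step 5: store for future use
    return result                              # step 6: deliver to the caller
```

A second call with the same key inside the TTL window returns the stored result without invoking `fetch` again.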

Key Workflows

Search Workflow

  1. Think Tool: Plan search strategy
  2. Execute Search: Query relevant data sources
  3. Enrich Results: Add contextual information
  4. Combine Data: Merge results from multiple sources
  5. Format Output: Present in user-friendly format

Article Search Pipeline

  1. Query Processing: Parse user input
  2. Entity Recognition: Normalize gene/disease names
  3. PubTator3 Search: Query literature database
  4. Preprint Integration: Include bioRxiv/medRxiv if enabled
  5. cBioPortal Enrichment: Add cancer genomics data for genes
  6. Result Merging: Combine all data sources

Clinical Trial Matching

  1. Patient Profile: Parse eligibility criteria
  2. Location Filter: Geographic constraints
  3. Molecular Profile: Mutation requirements
  4. Prior Treatments: Treatment history matching
  5. Scoring Algorithm: Rank trials by relevance
  6. Contact Extraction: Retrieve site information
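
The filtering and scoring steps could look like the sketch below. The page does not specify the scoring algorithm, so everything here is an assumption made for illustration: the `Trial` and `Patient` fields, the weights, and the rule that a conflicting prior treatment excludes a trial. The trial identifiers are placeholders, not real registry IDs.

```python
from dataclasses import dataclass, field


@dataclass
class Trial:
    trial_id: str
    required_mutations: set[str] = field(default_factory=set)
    excluded_treatments: set[str] = field(default_factory=set)
    states: set[str] = field(default_factory=set)


@dataclass
class Patient:
    mutations: set[str]
    prior_treatments: set[str]
    state: str


def score(trial: Trial, patient: Patient) -> float:
    """Hypothetical relevance score; 0.0 means the patient is excluded."""
    if trial.excluded_treatments & patient.prior_treatments:
        return 0.0  # prior treatment conflicts with the trial's exclusions
    total = 0.0
    if trial.required_mutations:
        matched = trial.required_mutations & patient.mutations
        total += 2.0 * len(matched) / len(trial.required_mutations)
    if patient.state in trial.states:
        total += 1.0  # at least one site in the patient's state
    return total


def rank(trials: list[Trial], patient: Patient) -> list[Trial]:
    """Return trials with a positive score, most relevant first."""
    scored = [(score(t, patient), t) for t in trials]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for s, t in scored if s > 0]
```

Weighting the molecular match above the location match is an arbitrary choice in this sketch; a real ranking would need clinically validated criteria.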

Variant Interpretation

  1. Input Parsing: Process VCF/MAF files
  2. Batch Processing: Group variants efficiently
  3. Annotation Gathering:
       • Clinical significance from MyVariant.info
       • Population frequency data
       • In silico predictions
       • Literature evidence
       • Clinical trial associations
  4. AlphaGenome Integration: Regulatory predictions (optional)
  5. Tier Classification: Categorize by clinical relevance
  6. Report Generation: Create interpretation summary
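
The batch processing step amounts to splitting a long variant list into fixed-size groups so that each group can be annotated in one request. A minimal sketch of that grouping follows. The `batched` helper and the batch size are assumptions for illustration, and the variant strings in the usage note are made-up placeholders.

```python
from itertools import islice
from typing import Iterable, Iterator


def batched(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    # islice pulls up to `size` items; an empty list ends the loop.
    while chunk := list(islice(iterator, size)):
        yield chunk
```

For example, `list(batched(["v1", "v2", "v3", "v4", "v5"], 2))` produces three groups, the last one holding the single leftover item.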

Architecture Patterns

Caching Strategy

  • Multi-tier Cache: Memory → Disk → External
  • Smart TTL: Domain-specific expiration times
  • Cache Key Generation: Include all query parameters
  • Invalidation Logic: Clear on errors or updates
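
Two of these points, cache key generation and domain-specific TTLs, are easy to show in code. The sketch below is one possible approach rather than BioMCP's implementation: the function names and the TTL values are hypothetical.

```python
import hashlib
import json

# Hypothetical per-domain expiration times, in seconds.
DOMAIN_TTL = {"article": 3600, "trial": 900, "variant": 86400}
DEFAULT_TTL = 600


def make_cache_key(domain: str, params: dict) -> str:
    """Build a deterministic key that covers every query parameter."""
    # Sorting keys makes the result independent of parameter order.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"), default=str)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{domain}:{digest}"


def ttl_for(domain: str) -> int:
    """Look up the expiration time for a domain, with a fallback."""
    return DOMAIN_TTL.get(domain, DEFAULT_TTL)
```

Because every parameter feeds the hash, two queries that differ only in, say, the page number get separate cache entries, while reordering the same parameters yields the same key.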

Error Handling

  • Retry Logic: Exponential backoff for transient errors
  • Rate Limiting: Respect API limits with queuing
  • Graceful Degradation: Return partial results when possible
  • Clear Error Messages: Help users troubleshoot issues
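
Retry with exponential backoff can be sketched as follows. This is an illustrative example under stated assumptions: `TransientError`, `with_retries`, and the default delays are hypothetical, and a real client would map specific failures (timeouts, HTTP 429 or 503) onto the retryable category.

```python
import asyncio
import random


class TransientError(Exception):
    """Hypothetical marker for errors that are worth retrying."""


async def with_retries(call, max_attempts=4, base_delay=0.5, max_delay=8.0):
    """Await `call()`, retrying transient failures with growing delays."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt (0.5s, 1s, 2s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so concurrent clients do not retry in lockstep.
            await asyncio.sleep(delay + random.uniform(0, delay / 10))
```

Only `TransientError` is retried; any other exception propagates immediately, which keeps permanent failures such as invalid queries from being retried pointlessly.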

Authentication Flow

  1. Check for user-provided API key
  2. Fall back to environment variable
  3. Use public access if no key available
  4. Handle authentication errors gracefully
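
The first three steps form a simple precedence chain. A minimal sketch, assuming a placeholder environment variable name (`EXAMPLE_API_KEY` is hypothetical, as is the `resolve_api_key` function); step 4, error handling, is left out of the sketch.

```python
import os
from typing import Optional


def resolve_api_key(
    user_key: Optional[str] = None,
    env_var: str = "EXAMPLE_API_KEY",  # placeholder name, not a real setting
) -> Optional[str]:
    """Return the API key to use, or None for public access."""
    if user_key:                       # step 1: explicit key wins
        return user_key
    env_key = os.environ.get(env_var)  # step 2: environment fallback
    if env_key:
        return env_key
    return None                        # step 3: no key, use public access
```

Returning `None` rather than raising lets the caller decide whether public access is acceptable for the API in question.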

Performance Optimization

  • Request Batching: Combine multiple queries
  • Parallel Execution: Concurrent API calls
  • Connection Pooling: Reuse HTTP connections
  • Result Streaming: Return data as available
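
Parallel execution pairs naturally with the graceful degradation described under Error Handling: run the source queries concurrently and keep whatever succeeds. The sketch below uses the standard library's `asyncio.gather` with `return_exceptions=True`. The `gather_partial` helper and the source names in the usage note are hypothetical.

```python
import asyncio
from typing import Any, Awaitable


async def gather_partial(
    tasks: dict[str, Awaitable[Any]],
) -> tuple[dict[str, Any], dict[str, str]]:
    """Run named awaitables concurrently; split successes from failures."""
    names = list(tasks)
    # return_exceptions=True keeps one failure from cancelling the rest.
    outcomes = await asyncio.gather(
        *(tasks[name] for name in names), return_exceptions=True
    )
    results: dict[str, Any] = {}
    errors: dict[str, str] = {}
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, Exception):
            errors[name] = str(outcome)
        else:
            results[name] = outcome
    return results, errors
```

A caller might pass something like `{"articles": search_articles(q), "trials": search_trials(q)}` and then return `results` to the user along with a note listing the sources in `errors`.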

Deployment Options

Local Development

  • Single process with in-memory cache
  • Direct file system access
  • Simple configuration

Docker Deployment

  • Containerized application
  • Volume-mounted cache
  • Environment-based configuration

Cloud Deployment

  • Load-balanced instances
  • Shared Redis cache
  • Auto-scaling capabilities
  • Monitoring integration

Creating Documentation Diagrams

For visual diagrams, we recommend:

  1. ASCII Art: Universal compatibility
       • Use tools like asciiflow.com
       • Store in docs/assets/ directory
  2. Screenshots: For complex UIs
       • Annotate with arrows/labels
       • Save as PNG in docs/assets/
  3. External Tools:
       • draw.io for flowcharts
       • Lucidchart for professional diagrams
       • Export as static images

ASCII System Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                              USER INTERFACES                             │
├────────────────┬───────────────────┬───────────────┬───────────────────┤
│                │                   │               │                   │
│   CLI Tool     │  Claude Desktop   │  Python SDK   │   Custom Client   │
│  (biomcp)      │   (MCP Client)    │   (async)     │    (your app)     │
│                │                   │               │                   │
└────────┬───────┴─────────┬─────────┴───────┬───────┴───────────┬───────┘
         │                 │                 │                   │
         └─────────────────┴─────────────────┴───────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│                            BioMCP CORE SERVER                            │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────┐  ┌────────────┐  │
│  │   Router    │  │ Rate Limiter │  │ Cache Manager│  │   Logger   │  │
│  │             │  │              │  │              │  │            │  │
│  └──────┬──────┘  └──────────────┘  └──────────────┘  └────────────┘  │
│         │                                                               │
│         ▼                                                               │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │                      Domain Handlers                             │   │
│  ├─────────────┬─────────────┬─────────────┬──────────────────────┤   │
│  │  Articles   │   Trials    │  Variants   │  Genes/Drugs/Disease │   │
│  │  Handler    │   Handler   │  Handler    │      Handler         │   │
│  └──────┬──────┴──────┬──────┴──────┬──────┴──────────┬───────────┘   │
│         │             │             │                 │                 │
└─────────┼─────────────┼─────────────┼─────────────────┼─────────────────┘
          │             │             │                 │
          ▼             ▼             ▼                 ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                          EXTERNAL DATA SOURCES                           │
├─────────────┬─────────────┬─────────────┬──────────────────────────────┤
│             │             │             │                              │
│  PubMed/    │ Clinical    │ MyVariant   │        BioThings Suite       │
│  PubTator3  │ Trials.gov  │   .info     │  (MyGene/MyDisease/MyChem)  │
│             │    + NCI    │             │                              │
│             │             │             │                              │
├─────────────┴─────────────┴─────────────┴──────────────────────────────┤
│                                                                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐                 │
│  │  cBioPortal  │  │  AlphaGenome │  │  Europe PMC  │                 │
│  │   (Cancer)   │  │ (Predictions)│  │  (Preprints) │                 │
│  └──────────────┘  └──────────────┘  └──────────────┘                 │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

See also: Quick Architecture Reference

Next Steps