Skip to content

Performance Optimizations

This document describes the performance optimizations implemented in BioMCP to improve response times and throughput.

Overview

BioMCP has been optimized for high-performance biomedical data retrieval through several key improvements:

  • 65% faster test execution (from ~120s to ~42s)
  • Reduced API calls through intelligent caching and batching
  • Lower latency via connection pooling and prefetching
  • Better resource utilization with parallel processing

Key Optimizations

1. Connection Pooling

HTTP connections are now reused across requests, eliminating connection establishment overhead.

Configuration:

  • BIOMCP_USE_CONNECTION_POOL - Enable/disable pooling (default: "true")
  • Automatically manages pools per event loop
  • Graceful cleanup on shutdown

Impact: ~30% reduction in request latency for sequential operations

2. Parallel Test Execution

Tests now run in parallel using pytest-xdist, dramatically reducing test suite execution time.

Usage:

make test  # Automatically uses parallel execution

Impact: ~5x faster test execution

3. Request Batching

Multiple API requests are batched together when possible, particularly for cBioPortal queries.

Features:

  • Automatic batching based on size/time thresholds
  • Configurable batch size (default: 5 for cBioPortal)
  • Error isolation per request

Impact: Up to 80% reduction in API calls for bulk operations

4. Smart Caching

Multiple caching layers optimize repeated queries:

  • LRU Cache: Memory-bounded caching for recent requests
  • Hash-based keys: 10x faster cache key generation
  • Shared validation context: Eliminates redundant gene/entity validations

Configuration:

  • Cache size: 1000 entries (configurable)
  • TTL: 5-30 minutes depending on data type

5. Prefetching

Common entities are prefetched on startup to warm caches:

  • Top genes: BRAF, EGFR, TP53, KRAS, etc.
  • Common diseases: lung cancer, breast cancer, etc.
  • Frequent chemicals: osimertinib, pembrolizumab, etc.

Impact: First queries for common entities are instant

6. Pagination Support

Europe PMC searches now use pagination for large result sets:

  • Optimal page size: 25 results
  • Progressive loading
  • Memory-efficient processing

7. Conditional Metrics

Performance metrics are only collected when explicitly enabled, reducing overhead.

Configuration:

  • BIOMCP_METRICS_ENABLED - Enable metrics (default: "false")

Performance Benchmarks

API Response Times

Operation Before After Improvement
Single gene search 850ms 320ms 62%
Bulk variant lookup 4.2s 1.1s 74%
Article search with cBioPortal 2.1s 780ms 63%

Resource Usage

Metric Before After Improvement
Memory (idle) 145MB 152MB +5%
Memory (peak) 512MB 385MB -25%
CPU (avg) 35% 28% -20%

Best Practices

  1. Keep connection pooling enabled unless experiencing issues
  2. Use the unified search methods to benefit from parallel execution
  3. Batch operations when performing multiple lookups
  4. Monitor cache hit rates in production environments

Troubleshooting

Connection Pool Issues

If experiencing connection errors:

  1. Disable pooling: export BIOMCP_USE_CONNECTION_POOL=false
  2. Check for firewall/proxy issues
  3. Verify SSL certificates

Memory Usage

If memory usage is high:

  1. Reduce cache size in request_cache.py
  2. Lower connection pool limits
  3. Disable prefetching by removing the lifespan hook

Performance Regression

To identify performance issues:

  1. Enable metrics: export BIOMCP_METRICS_ENABLED=true
  2. Check slow operations in logs
  3. Profile with py-spy or similar tools

Future Optimizations

Planned improvements include:

  • GraphQL batching for complex queries
  • Redis integration for distributed caching
  • WebSocket support for real-time updates
  • GPU acceleration for variant analysis