Benchmarks¶
This page defines practical benchmark patterns for BioMCP.
The objective is reproducible command behavior, not synthetic leaderboard numbers.
Benchmark goals¶
- Compare latency across entities.
- Track output-size drift.
- Validate cache behavior.
- Catch regressions in command startup overhead.
Baseline command set¶
Use a fixed baseline so runs are comparable over time.
biomcp get gene BRAF
biomcp get variant "BRAF V600E"
biomcp get trial NCT02576665
biomcp search article -g BRAF --limit 5
Latency measurement¶
Use repeated runs and report median + spread.
Example with hyperfine:
Recommendations:
- run on stable network,
- avoid mixed-background workloads,
- capture raw command outputs for auditing.
Output-size tracking¶
Track markdown and JSON sizes independently.
Why this matters:
- markdown size impacts prompt context cost,
- JSON size affects API and queue throughput,
- sudden growth can signal response-shape regressions.
Cache-behavior checks¶
Compare first call vs second call for cache-eligible endpoints.
Capture timing for both runs to verify expected improvement.
Date-validation contract checks¶
Invalid dates should fail before network calls.
Examples:
biomcp search article -g BRAF --since 2024-13-01 --limit 1
biomcp search article -g BRAF --since 2024-02-30 --limit 1
Expected behavior:
- immediate validation error,
- no long wait for upstream API timeouts.
Cache-directory contract checks¶
Downloaded artifacts should use platform cache paths.
Use command help and health output to validate paths in your environment.
Reporting template¶
When sharing benchmark results, include:
- command list,
- machine details,
- network assumptions,
- cache warm/cold status,
- median and p95 latency,
- output byte counts.
Caveats¶
- Public APIs can throttle unpredictably.
- Upstream schema changes can affect command duration.
- Geographic distance to API hosts influences baseline latency.
Recommended cadence¶
- Run quick checks on every release candidate.
- Run fuller benchmark suites before major parser or rendering changes.