FDA Integration Security Documentation¶
Overview¶
This document outlines the security measures implemented in the BioMCP FDA integration to ensure safe handling of medical data and protection against common vulnerabilities.
Security Features¶
1. Input Validation & Sanitization¶
All user inputs are validated and sanitized before being sent to the FDA API:
- Injection Prevention: Removes characters that could be used for SQL injection, XSS, or command injection (
<>\"';&|\\
) - Length Limits: Enforces maximum lengths on all input fields
- Type Validation: Ensures parameters match expected types (dates, numbers, etc.)
- Format Validation: Validates specific formats (e.g., YYYY-MM-DD for dates)
Implementation: src/biomcp/openfda/input_validation.py
# Example usage
from biomcp.openfda.input_validation import sanitize_input, validate_drug_name
safe_drug = validate_drug_name("Aspirin<script>") # Returns "Aspirin"
safe_input = sanitize_input("'; DROP TABLE;") # SQL injection blocked
2. API Key Protection¶
API keys are protected at multiple levels:
- Cache Key Exclusion: API keys are removed before generating cache keys
- No Logging: API keys are never logged, even in debug mode
- Environment Variables: Keys stored in environment variables, not in code
- Validation: API key format is validated before use
Implementation: src/biomcp/openfda/cache.py
, src/biomcp/openfda/utils.py
3. Rate Limiting¶
Client-side rate limiting prevents API quota exhaustion:
- Token Bucket Algorithm: Allows bursts while maintaining average rate
- Configurable Limits: 40 requests/minute without key, 240 with key
- Concurrent Request Limiting: Maximum 10 concurrent requests via semaphore
- Automatic Backoff: Delays requests when approaching limits
Implementation: src/biomcp/openfda/rate_limiter.py
4. Circuit Breaker Pattern¶
Prevents cascading failures when FDA API is unavailable:
- Failure Threshold: Opens after 5 consecutive failures
- Recovery Timeout: Waits 60 seconds before retry attempts
- Half-Open State: Tests recovery with limited requests
- Automatic Recovery: Returns to normal operation when API recovers
States:
- CLOSED: Normal operation
- OPEN: Blocking all requests (API is down)
- HALF_OPEN: Testing if API has recovered
5. Memory Protection¶
Prevents memory exhaustion from large responses:
- Response Size Limits: Maximum 1MB per cached response
- Cache Size Limits: Maximum 100 entries in cache
- FIFO Eviction: Oldest entries removed when cache is full
- Size Validation: Large responses rejected before caching
Configuration:
6. File Operation Security¶
Secure handling of cache files:
- File Locking: Uses
fcntl
for exclusive/shared locks - Atomic Operations: Writes to temp files then renames
- Race Condition Prevention: Locks prevent concurrent modifications
- Permission Control: Files created without world-write permissions
Implementation: src/biomcp/openfda/drug_shortages.py
Security Best Practices¶
For Developers¶
- Never Log Sensitive Data
# BAD
logger.debug(f"API key: {api_key}")
# GOOD
logger.debug("API key configured" if api_key else "No API key")
- Always Validate Input
from biomcp.openfda.input_validation import validate_drug_name
# Always validate before using
safe_drug = validate_drug_name(user_input)
if safe_drug:
# Use safe_drug, not user_input
await search_adverse_events(drug=safe_drug)
- Use Rate Limiting
from biomcp.openfda.rate_limiter import rate_limited_request
# Wrap API calls with rate limiting
result = await rate_limited_request(make_api_call, params)
For System Administrators¶
-
API Key Management
-
Store API keys in environment variables
- Rotate keys regularly (recommended: every 90 days)
- Use different keys for dev/staging/production
-
Monitor key usage for anomalies
-
Monitoring
-
Set up alerts for circuit breaker state changes
- Monitor rate limit consumption
- Track cache hit/miss ratios
-
Log validation failures (potential attacks)
-
Resource Limits
Threat Model¶
Threats Addressed¶
Threat | Mitigation | Implementation |
---|---|---|
SQL Injection | Input sanitization | input_validation.py |
XSS Attacks | HTML/JS character removal | sanitize_input() |
Command Injection | Shell metacharacter removal | sanitize_input() |
API Key Exposure | Exclusion from logs/cache | cache.py , utils.py |
DoS via Rate Limits | Client-side rate limiting | rate_limiter.py |
Cascading Failures | Circuit breaker pattern | CircuitBreaker class |
Memory Exhaustion | Response size limits | MAX_RESPONSE_SIZE |
Race Conditions | File locking | fcntl usage |
Cache Poisoning | Input validation | build_safe_query() |
Residual Risks¶
-
API Key Compromise: If environment is compromised, keys are accessible
-
Mitigation: Use secret management systems in production
-
Zero-Day FDA API Vulnerabilities: Unknown vulnerabilities in FDA API
-
Mitigation: Monitor FDA security advisories
-
Distributed DoS: Multiple clients could still overwhelm FDA API
- Mitigation: Implement global rate limiting at gateway level
Compliance Considerations¶
HIPAA (If Applicable)¶
While FDA's public APIs don't contain PHI, if extended to include patient data:
- Encryption: Use TLS for all API communications
- Audit Logging: Log all data access (but not the data itself)
- Access Controls: Implement user authentication/authorization
- Data Retention: Define and enforce retention policies
FDA Data Usage¶
- Attribution: Always include FDA disclaimers in responses
- Data Currency: Warn users that data may not be real-time
- Medical Decisions: Explicitly state data is not for clinical decisions
- Rate Limits: Respect FDA's terms of service
Security Testing¶
Automated Tests¶
Run security tests with:
Tests cover:
- Input validation
- Cache key security
- Rate limiting
- Circuit breaker
- File operations
Manual Security Review¶
Checklist for security review:
- [ ] No sensitive data in logs
- [ ] All inputs validated
- [ ] Rate limiting functional
- [ ] Circuit breaker triggers correctly
- [ ] Cache size limited
- [ ] File operations are atomic
- [ ] API keys not in cache keys
- [ ] Error messages don't leak information
Incident Response¶
If API Key is Compromised¶
- Immediate: Revoke compromised key at FDA portal
- Generate: Create new API key
- Update: Update environment variables
- Restart: Restart services to load new key
- Audit: Review logs for unauthorized usage
If Rate Limits Exceeded¶
- Check: Verify circuit breaker state
- Wait: Allow circuit breaker recovery timeout
- Reduce: Lower request rate if needed
- Monitor: Check for abnormal usage patterns
If Security Vulnerability Found¶
- Assess: Determine severity and exploitability
- Patch: Develop and test fix
- Deploy: Roll out fix with monitoring
- Document: Update this security documentation
- Notify: Inform users if data was at risk
Configuration Reference¶
Environment Variables¶
Variable | Default | Description |
---|---|---|
OPENFDA_API_KEY |
None | FDA API key for higher rate limits |
BIOMCP_FDA_CACHE_TTL |
15 | Cache TTL in minutes |
BIOMCP_FDA_MAX_CACHE_SIZE |
100 | Maximum cache entries |
BIOMCP_FDA_MAX_RESPONSE_SIZE |
1048576 | Maximum response size in bytes |
BIOMCP_SHORTAGE_CACHE_TTL |
24 | Drug shortage cache TTL in hours |
Security Headers¶
When deploying as a web service, add these headers:
headers = {
"X-Content-Type-Options": "nosniff",
"X-Frame-Options": "DENY",
"X-XSS-Protection": "1; mode=block",
"Strict-Transport-Security": "max-age=31536000; includeSubDomains",
"Content-Security-Policy": "default-src 'self'"
}
Contact¶
For security issues, contact: [email protected] (create this address)
For FDA API issues, see: https://open.fda.gov/apis/
Last Updated: 2025-08-07 Version: 1.0