How to Search NCI Organizations and Interventions¶
This guide demonstrates how to use BioMCP's NCI-specific tools to search for cancer research organizations, interventions (drugs, devices, procedures), and biomarkers.
Prerequisites¶
All NCI tools require an API key from api.cancer.gov:
# Set as environment variable
export NCI_API_KEY="your-key-here"
# Or provide per-request in your prompts
"Find cancer centers in Boston, my NCI API key is YOUR_KEY"
Organization Search and Lookup¶
Understanding Organization Search¶
The NCI Organization database contains:
- Cancer research centers and hospitals
- Clinical trial sponsors
- Academic institutions
- Pharmaceutical companies
- Government facilities
Basic Organization Search¶
Find organizations by name:
# CLI
biomcp organization search --name "MD Anderson" --api-key YOUR_KEY
# Python
orgs = await nci_organization_searcher(
    name="MD Anderson",
    api_key="your-key"
)
# MCP/AI Assistant
"Search for MD Anderson Cancer Center, my NCI API key is YOUR_KEY"
Location-Based Search¶
CRITICAL: Always pass city AND state together. Supplying only one of them causes the API to return an Elasticsearch error.
# ✅ CORRECT - City and state together
orgs = await nci_organization_searcher(
    city="Houston",
    state="TX",
    api_key="your-key"
)
# ❌ WRONG - Will cause API error
orgs = await nci_organization_searcher(
    city="Houston",  # Missing state!
    api_key="your-key"
)
# ❌ WRONG - Will cause API error
orgs = await nci_organization_searcher(
    state="TX",  # Missing city!
    api_key="your-key"
)
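The city/state rule can also be enforced before any request is sent. The following is a minimal sketch of such a guard; `validate_location` is a hypothetical helper name, not part of BioMCP, and it checks only that both values are present or both are absent.

```python
from typing import Optional


def validate_location(city: Optional[str], state: Optional[str]) -> None:
    """Raise ValueError unless city and state are both set or both omitted.

    Hypothetical helper, not part of BioMCP.
    """
    if bool(city) != bool(state):
        missing = "state" if city else "city"
        raise ValueError(
            f"Location search needs city and state together; missing {missing}"
        )
```

Calling `validate_location(city, state)` just before `nci_organization_searcher` turns a remote Elasticsearch error into an immediate, readable local one.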
Organization Types¶
Search by organization type:
# Find academic cancer centers
academic_centers = await nci_organization_searcher(
    organization_type="Academic",
    api_key="your-key"
)
# Find pharmaceutical companies
pharma_companies = await nci_organization_searcher(
    organization_type="Industry",
    api_key="your-key"
)
# Find government research facilities
gov_facilities = await nci_organization_searcher(
    organization_type="Government",
    api_key="your-key"
)
Valid organization types:
- Academic: Universities and medical schools
- Industry: Pharmaceutical and biotech companies
- Government: NIH, FDA, VA hospitals
- Community: Community hospitals and clinics
- Network: Research networks and consortiums
- Other: Other organization types
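When the type comes from user input, it may help to map it onto one of the six spellings listed above before searching. This is a sketch only: `VALID_ORG_TYPES` and `normalize_org_type` are hypothetical names, and whether the API itself is case-sensitive is an assumption not confirmed by this guide.

```python
# The six values listed in this guide; the tuple name is hypothetical.
VALID_ORG_TYPES = (
    "Academic", "Industry", "Government", "Community", "Network", "Other",
)


def normalize_org_type(value: str) -> str:
    """Map input such as ' industry' to the canonical spelling 'Industry'."""
    lookup = {t.lower(): t for t in VALID_ORG_TYPES}
    try:
        return lookup[value.strip().lower()]
    except KeyError:
        raise ValueError(
            f"Unknown organization type {value!r}; "
            f"expected one of: {', '.join(VALID_ORG_TYPES)}"
        ) from None
```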
Getting Organization Details¶
Retrieve complete information about a specific organization:
# Get organization by ID
org_details = await nci_organization_getter(
    organization_id="NCI-2011-03337",
    api_key="your-key"
)
# Returns:
# - Full name and aliases
# - Contact information
# - Address and location
# - Associated clinical trials
# - Organization type and status
Practical Organization Workflows¶
Find Regional Cancer Centers¶
import os

async def find_cancer_centers_by_region(state: str, cities: list[str]):
    """Find all cancer centers in specific cities within a state"""
    all_centers = []
    for city in cities:
        # ALWAYS use city + state together
        centers = await nci_organization_searcher(
            city=city,
            state=state,
            organization_type="Academic",
            api_key=os.getenv("NCI_API_KEY")
        )
        all_centers.extend(centers)
    # Remove duplicates (keyed by organization ID)
    unique_centers = {org['id']: org for org in all_centers}
    return list(unique_centers.values())

# Example: Find cancer centers in major Texas cities
texas_centers = await find_cancer_centers_by_region(
    state="TX",
    cities=["Houston", "Dallas", "San Antonio", "Austin"]
)
Find Trial Sponsors¶
from typing import Optional

async def find_trial_sponsors_by_type(org_type: str, name_filter: Optional[str] = None):
    """Find organizations sponsoring trials"""
    # Search organizations
    orgs = await nci_organization_searcher(
        name=name_filter,
        organization_type=org_type,
        api_key=os.getenv("NCI_API_KEY")
    )
    # For each org, get details including trial count
    sponsors = []
    for org in orgs[:10]:  # Limit to avoid rate limits
        details = await nci_organization_getter(
            organization_id=org['id'],
            api_key=os.getenv("NCI_API_KEY")
        )
        if details.get('trial_count', 0) > 0:
            sponsors.append(details)
    return sorted(sponsors, key=lambda x: x.get('trial_count', 0), reverse=True)

# Find pharmaceutical companies with active trials
pharma_sponsors = await find_trial_sponsors_by_type("Industry")
Intervention Search and Lookup¶
Understanding Interventions¶
Interventions in clinical trials include:
- Drugs: Chemotherapy, targeted therapy, immunotherapy
- Devices: Medical devices, diagnostic tools
- Procedures: Surgical techniques, radiation protocols
- Biologicals: Cell therapies, vaccines, antibodies
- Behavioral: Lifestyle interventions, counseling
- Other: Dietary supplements, alternative therapies
Drug Search¶
Find specific drugs or drug classes:
# CLI - Find a specific drug
biomcp intervention search --name pembrolizumab --type Drug --api-key YOUR_KEY
# CLI - Find drug class
biomcp intervention search --name "PD-1 inhibitor" --type Drug --api-key YOUR_KEY
# Python - Search with synonyms
drugs = await nci_intervention_searcher(
    name="pembrolizumab",
    intervention_type="Drug",
    synonyms=True,  # Include Keytruda, MK-3475, etc.
    api_key="your-key"
)
# Search for drug combinations
combos = await nci_intervention_searcher(
    name="nivolumab AND ipilimumab",
    intervention_type="Drug",
    api_key="your-key"
)
Device and Procedure Search¶
# Find medical devices
devices = await nci_intervention_searcher(
    intervention_type="Device",
    name="robot",  # Surgical robots
    api_key="your-key"
)
# Find procedures
procedures = await nci_intervention_searcher(
    intervention_type="Procedure",
    name="minimally invasive",
    api_key="your-key"
)
# Find radiation protocols
radiation = await nci_intervention_searcher(
    intervention_type="Radiation",
    name="proton beam",
    api_key="your-key"
)
Getting Intervention Details¶
# Get complete intervention information
intervention = await nci_intervention_getter(
    intervention_id="INT123456",
    api_key="your-key"
)
# Returns:
# - Official name and synonyms
# - Intervention type and subtype
# - Mechanism of action (for drugs)
# - FDA approval status
# - Associated clinical trials
# - Manufacturer information
Practical Intervention Workflows¶
Drug Development Pipeline¶
async def analyze_drug_pipeline(drug_target: str):
    """Analyze drugs in development for a specific target"""
    # Search for drugs targeting specific pathway
    drugs = await nci_intervention_searcher(
        name=drug_target,
        intervention_type="Drug",
        api_key=os.getenv("NCI_API_KEY")
    )
    pipeline = {
        "preclinical": [],
        "phase1": [],
        "phase2": [],
        "phase3": [],
        "approved": []
    }
    for drug in drugs:
        # Get detailed information
        details = await nci_intervention_getter(
            intervention_id=drug['id'],
            api_key=os.getenv("NCI_API_KEY")
        )
        # Categorize by development stage
        if details.get('fda_approved'):
            pipeline['approved'].append(details)
        else:
            # Check associated trials for phase
            trial_phases = details.get('trial_phases', [])
            if 'PHASE3' in trial_phases:
                pipeline['phase3'].append(details)
            elif 'PHASE2' in trial_phases:
                pipeline['phase2'].append(details)
            elif 'PHASE1' in trial_phases:
                pipeline['phase1'].append(details)
            else:
                pipeline['preclinical'].append(details)
    return pipeline

# Analyze PD-1/PD-L1 inhibitor pipeline
pd1_pipeline = await analyze_drug_pipeline("PD-1 inhibitor")
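The categorization step in the workflow above can be pulled out into a pure function, which makes the "most advanced phase wins" rule easy to test without calling the API. This is a sketch: `classify_stage` and `PHASE_BUCKETS` are hypothetical names, and the `fda_approved` flag and `PHASE1`/`PHASE2`/`PHASE3` labels are taken from the example above, which itself assumes that response shape.

```python
from typing import Iterable

# Checked from most to least advanced, mirroring the if/elif chain above.
PHASE_BUCKETS = (
    ("PHASE3", "phase3"),
    ("PHASE2", "phase2"),
    ("PHASE1", "phase1"),
)


def classify_stage(fda_approved: bool, trial_phases: Iterable[str]) -> str:
    """Return the pipeline bucket name for one intervention record."""
    if fda_approved:
        return "approved"
    phases = set(trial_phases)
    for phase, bucket in PHASE_BUCKETS:
        if phase in phases:
            return bucket
    # No recognized phase label: same fallback as the example above.
    return "preclinical"
```

Inside the loop, the whole if/elif chain would then reduce to `pipeline[classify_stage(bool(details.get('fda_approved')), details.get('trial_phases', []))].append(details)`.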
Compare Similar Interventions¶
async def compare_interventions(intervention_names: list[str]):
    """Compare multiple interventions side by side"""
    comparisons = []
    for name in intervention_names:
        # Search for intervention
        results = await nci_intervention_searcher(
            name=name,
            synonyms=True,
            api_key=os.getenv("NCI_API_KEY")
        )
        if results:
            # Get detailed info for first match
            details = await nci_intervention_getter(
                intervention_id=results[0]['id'],
                api_key=os.getenv("NCI_API_KEY")
            )
            comparisons.append({
                "name": details['name'],
                "type": details['type'],
                "synonyms": details.get('synonyms', []),
                "fda_approved": details.get('fda_approved', False),
                "trial_count": len(details.get('trials', [])),
                "mechanism": details.get('mechanism_of_action', 'Not specified')
            })
    return comparisons

# Compare checkpoint inhibitors
comparison = await compare_interventions([
    "pembrolizumab",
    "nivolumab",
    "atezolizumab",
    "durvalumab"
])
Biomarker Search¶
Understanding Biomarker Types¶
The NCI API supports two biomarker types:
- reference_gene: Gene-based biomarkers (e.g., EGFR, BRAF)
- branch: Pathway/branch biomarkers
Note: You cannot search by gene symbol directly; use the name parameter.
Basic Biomarker Search¶
# Search for PD-L1 biomarkers
pdl1_biomarkers = await nci_biomarker_searcher(
    name="PD-L1",
    api_key="your-key"
)
# Search for specific biomarker type
gene_biomarkers = await nci_biomarker_searcher(
    biomarker_type="reference_gene",
    api_key="your-key"
)
Biomarker Analysis Workflow¶
async def analyze_trial_biomarkers(disease: str):
    """Find biomarkers used in trials for a disease"""
    # Get all biomarkers
    all_biomarkers = await nci_biomarker_searcher(
        biomarker_type="reference_gene",
        api_key=os.getenv("NCI_API_KEY")
    )
    # Filter by disease association
    disease_biomarkers = []
    for biomarker in all_biomarkers:
        if disease.lower() in str(biomarker).lower():
            disease_biomarkers.append(biomarker)
    # Group by frequency
    biomarker_counts = {}
    for bio in disease_biomarkers:
        name = bio.get('name', 'Unknown')
        biomarker_counts[name] = biomarker_counts.get(name, 0) + 1
    # Sort by frequency
    return sorted(
        biomarker_counts.items(),
        key=lambda x: x[1],
        reverse=True
    )

# Find most common biomarkers in lung cancer trials
lung_biomarkers = await analyze_trial_biomarkers("lung cancer")
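The group-and-sort steps in the workflow above can be written more compactly with `collections.Counter` from the standard library. This is a sketch: `count_biomarker_names` is a hypothetical helper, and it assumes, as the example above does, that each result is a dict with an optional `name` key.

```python
from collections import Counter


def count_biomarker_names(biomarkers: list[dict]) -> list[tuple[str, int]]:
    """Count records per biomarker name, most frequent first."""
    counts = Counter(b.get("name", "Unknown") for b in biomarkers)
    # most_common() with no argument returns every (name, count) pair,
    # ordered from highest to lowest count.
    return counts.most_common()
```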
Combined Workflows¶
Regional Drug Development Analysis¶
async def analyze_regional_drug_development(
    state: str,
    cities: list[str],
    drug_class: str
):
    """Analyze drug development in a specific region"""
    # Step 1: Find organizations in the region
    organizations = []
    for city in cities:
        orgs = await nci_organization_searcher(
            city=city,
            state=state,
            organization_type="Industry",
            api_key=os.getenv("NCI_API_KEY")
        )
        organizations.extend(orgs)
    # Step 2: Find drugs of interest
    drugs = await nci_intervention_searcher(
        name=drug_class,
        intervention_type="Drug",
        api_key=os.getenv("NCI_API_KEY")
    )
    # Step 3: Cross-reference trials
    regional_development = []
    for drug in drugs[:10]:  # Limit for performance
        drug_details = await nci_intervention_getter(
            intervention_id=drug['id'],
            api_key=os.getenv("NCI_API_KEY")
        )
        # Check if any trials are sponsored by regional orgs
        for trial in drug_details.get('trials', []):
            for org in organizations:
                if org['id'] in str(trial):
                    regional_development.append({
                        'drug': drug_details['name'],
                        'organization': org['name'],
                        'location': f"{org.get('city', '')}, {org.get('state', '')}",
                        'trial': trial
                    })
    return regional_development

# Analyze immunotherapy development in California
ca_immuno = await analyze_regional_drug_development(
    state="CA",
    cities=["San Francisco", "San Diego", "Los Angeles"],
    drug_class="immunotherapy"
)
Organization to Intervention Pipeline¶
async def org_to_intervention_pipeline(org_name: str):
    """Trace from organization to their interventions"""
    # Find organization
    orgs = await nci_organization_searcher(
        name=org_name,
        api_key=os.getenv("NCI_API_KEY")
    )
    if not orgs:
        return None
    # Get organization details
    org_details = await nci_organization_getter(
        organization_id=orgs[0]['id'],
        api_key=os.getenv("NCI_API_KEY")
    )
    # Get their trials
    org_trials = org_details.get('trials', [])
    # Extract unique interventions
    interventions = set()
    for trial_id in org_trials[:20]:  # Sample trials
        trial = await trial_getter(
            nct_id=trial_id,
            source="nci",
            api_key=os.getenv("NCI_API_KEY")
        )
        if trial.get('interventions'):
            interventions.update(trial['interventions'])
    # Get details for each intervention
    intervention_details = []
    for intervention_name in interventions:
        results = await nci_intervention_searcher(
            name=intervention_name,
            api_key=os.getenv("NCI_API_KEY")
        )
        if results:
            intervention_details.append(results[0])
    return {
        'organization': org_details,
        'trial_count': len(org_trials),
        'interventions': intervention_details
    }

# Analyze Genentech's intervention portfolio
genentech_portfolio = await org_to_intervention_pipeline("Genentech")
Best Practices¶
1. Always Use City + State Together¶
# ✅ GOOD - Prevents API errors
await nci_organization_searcher(city="Boston", state="MA")
# ❌ BAD - Will cause Elasticsearch error
await nci_organization_searcher(city="Boston")
2. Handle Rate Limits¶
import asyncio

async def search_with_rate_limit(searches: list):
    """Execute searches with rate limiting"""
    results = []
    for search in searches:
        result = await search()
        results.append(result)
        # Add delay to respect rate limits
        await asyncio.sleep(0.1)  # At most ~10 requests per second
    return results
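A fixed delay keeps requests strictly sequential. When some parallelism is acceptable, an `asyncio.Semaphore` can cap the number of requests in flight instead. The sketch below is one way to do that; `gather_limited` is a hypothetical helper name, and the default of 3 concurrent requests is an arbitrary assumption, not a documented NCI limit.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def gather_limited(
    searches: list[Callable[[], Awaitable[T]]],
    max_concurrent: int = 3,
) -> list[T]:
    """Run zero-argument coroutine functions, at most max_concurrent at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(search: Callable[[], Awaitable[T]]) -> T:
        # Each task waits here until a slot is free.
        async with semaphore:
            return await search()

    # asyncio.gather returns results in the same order as its inputs.
    return await asyncio.gather(*(run_one(s) for s in searches))
```

Each element of `searches` is a callable that returns an awaitable, for example `lambda: nci_organization_searcher(city="Boston", state="MA")`, matching the `await search()` convention used in the example above.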
3. Use Pagination for Large Results¶
async def get_all_organizations(org_type: str):
    """Get all organizations of a type using pagination"""
    all_orgs = []
    page = 1
    while True:
        orgs = await nci_organization_searcher(
            organization_type=org_type,
            page=page,
            page_size=100,  # Maximum allowed
            api_key=os.getenv("NCI_API_KEY")
        )
        if not orgs:
            break
        all_orgs.extend(orgs)
        page += 1
        # Note: Total count may not be available
        if len(orgs) < 100:
            break
    return all_orgs
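The same stop-on-short-page logic can be reused across the organization, intervention, and biomarker searches by writing it once as an async generator. This is a sketch under assumptions: `iter_pages` and its `fetch_page` callback are hypothetical names, and the `max_pages` cap is an added safety limit, not an API constraint.

```python
from typing import Any, AsyncIterator, Awaitable, Callable


async def iter_pages(
    fetch_page: Callable[[int, int], Awaitable[list[Any]]],
    page_size: int = 100,
    max_pages: int = 50,
) -> AsyncIterator[Any]:
    """Yield items page by page until a short or empty page, or max_pages."""
    for page in range(1, max_pages + 1):
        items = await fetch_page(page, page_size)
        for item in items:
            yield item
        # A page shorter than page_size (including empty) is the last one.
        if len(items) < page_size:
            break
```

Here `fetch_page` would wrap a search call, for example `lambda page, size: nci_organization_searcher(organization_type="Academic", page=page, page_size=size, api_key=os.getenv("NCI_API_KEY"))`, and callers consume results with `async for org in iter_pages(fetch_page)`.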
4. Cache Results¶
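import os

# Note: functools.lru_cache does not work on async functions. It would cache
# the coroutine object, which can be awaited only once. Cache the awaited
# result in a plain dict instead (unbounded here; add eviction if needed).
_org_search_cache: dict = {}

async def cached_org_search(city: str, state: str, org_type: str):
    """Cache organization searches to reduce API calls"""
    key = (city, state, org_type)
    if key not in _org_search_cache:
        _org_search_cache[key] = await nci_organization_searcher(
            city=city,
            state=state,
            organization_type=org_type,
            api_key=os.getenv("NCI_API_KEY")
        )
    return _org_search_cache[key]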
Troubleshooting¶
Common Errors and Solutions¶
- "Search Too Broad" Error
  - Always use city + state together for location searches
  - Add more specific filters (name, type)
  - Reduce page_size parameter
- "NCI API key required"
  - Set NCI_API_KEY environment variable
  - Or provide api_key parameter in function calls
  - Or include in prompt: "my NCI API key is YOUR_KEY"
- No Results Found
  - Check spelling of organization/drug names
  - Try partial name matches
  - Remove filters and broaden search
  - Enable synonyms for intervention searches
- Rate Limit Exceeded
  - Add delays between requests
  - Reduce concurrent requests
  - Cache frequently accessed data
  - Consider upgrading API key tier
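For rate-limit errors in particular, "add delays between requests" is often implemented as a retry with exponential backoff. The sketch below shows the pattern in isolation. `retry_with_backoff` is a hypothetical helper, and because this guide does not say which exception BioMCP raises on a rate-limit response, `retry_on` defaults to `Exception` and should be narrowed once the real type is known.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    retry_on: tuple[type[BaseException], ...] = (Exception,),
) -> T:
    """Await call(), retrying with exponentially growing, jittered delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            # Delay doubles each attempt: base, 2*base, 4*base, ...
            delay = base_delay * 2 ** (attempt - 1)
            # Random jitter keeps parallel callers from retrying in lockstep.
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

A call would be wrapped as `await retry_with_backoff(lambda: nci_organization_searcher(name="Mayo", api_key=os.getenv("NCI_API_KEY")))`.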
Debugging Tips¶
# Enable debug logging
import logging
import os
logging.basicConfig(level=logging.DEBUG)

# Test API key
async def test_nci_connection():
    try:
        result = await nci_organization_searcher(
            name="Mayo",
            api_key=os.getenv("NCI_API_KEY")
        )
        print(f"✅ API key valid, found {len(result)} results")
    except Exception as e:
        print(f"❌ API key error: {e}")

# Check specific organization exists
async def verify_org_id(org_id: str):
    try:
        org = await nci_organization_getter(
            organization_id=org_id,
            api_key=os.getenv("NCI_API_KEY")
        )
        print(f"✅ Organization found: {org['name']}")
    except Exception as e:  # A bare except would also swallow KeyboardInterrupt
        print(f"❌ Organization lookup failed for {org_id}: {e}")
Next Steps¶
- Review NCI prompts examples for AI assistant usage
- Explore trial search with biomarkers
- Learn about variant effect prediction
- Set up API authentication