How to: find articles¶
This guide shows practical literature-search patterns.
Translate a question into filters¶
When the gene, disease, or drug is already known, put that anchor in a typed
flag and keep the mechanism, phenotype, dataset, or outcome in -k.
Known anchor plus concept:
Unknown entity, keyword first:
Do not guess -g, -d, or --drug when the question is trying to identify
the entity itself. Keep the first search keyword-only, or start with
biomcp discover "<question>" if you want a typed follow-up command first.
Question-format terms can stay in the article filters: PubMed ESearch cleans
bounded filler words from unfielded gene, disease, drug, and keyword terms
provider-locally, while query echoes and non-PubMed sources keep the original
wording.
If the whole keyword exactly matches a gene, drug, or disease vocabulary label
or alias, keyword-only article search may return a typed get suggestion in
See also, _meta.next_commands, and JSON _meta.suggestions[]. Treat that
as a structured follow-up option, but do not expect direct entity suggestions
for multi-concept phrases such as BRAF V600E or lung cancer immunotherapy.
Dataset or method question:
Refine with typed flags before paginating:
If the first page reveals the gene, disease, or drug that actually anchors the question, rerun with that typed flag before you spend time paginating a noisy keyword-only result set.
Avoid keyword reformulation loops¶
When an agent is iterating on one literature task, pass a short local
--session label and request JSON. If the next keyword search overlaps the
previous same-session keyword by at least 60% after BioMCP removes common
search filler words, JSON _meta.suggestions[] can point to a better fallback:
inspect the prior hits with article batch, map the topic with discover, or
narrow by publication year when the current page supports that retry.
biomcp --json search article -k "Oncotype DX review" --session lit-review-1 --limit 5
biomcp --json search article -k "Oncotype DX DCIS" --session lit-review-1 --limit 5
Treat --session as a non-secret local correlation label. Do not put PHI,
credentials, email addresses, or user identifiers in it. Markdown article
search output does not show loop-breaker suggestions.
Start from a known anchor¶
search article always works without credentials. BioMCP keeps
sort=relevance as the default, but the effective ranking mode depends on the
query: keyword-bearing searches default to hybrid scoring, while entity-only
searches default to lexical directness. LitSense2 joins keyword-bearing
federated searches, and the Semantic Scholar leg is still eligible whenever the
filter set is compatible. S2_API_KEY upgrades those Semantic Scholar
requests to authenticated quota; without it, BioMCP uses the shared pool.
BioMCP also caps each federated source's contribution after deduplication and
before ranking. Default: 40% of --limit on federated pools with at least
three surviving primary sources. Rows count against their primary source after
deduplication. Use --max-per-source <N> to override that cap, use
--max-per-source 0 for the default cap explicitly, and set it equal to
--limit to disable capping.
Search PubMed directly¶
Direct PubMed search and the compatible federated PubMed leg apply the same question-format cleanup before ESearch, so a keyword question can still echo as written while PubMed receives content terms.
Add disease context¶
Tune semantic versus lexical balance¶
biomcp search article -k "Hirschsprung disease ganglion cells" --ranking-mode hybrid --weight-semantic 0.5 --weight-lexical 0.2 --limit 5
Use --ranking-mode lexical to force the old directness comparator on a
keyword query, --ranking-mode semantic to sort by the LitSense2-derived
semantic signal first, or --weight-* flags to retune the default hybrid
formula 0.4*semantic + 0.3*lexical + 0.2*citations + 0.1*position. Rows
without LitSense2 provenance contribute semantic=0 in semantic-aware
ranking modes.
Cap one source explicitly¶
Constrain by date¶
Exclude preprints when supported¶
Pull the full-text section¶
Fetch several shortlisted papers at once¶
Use article batch after search when you already know the candidate PMIDs or
DOIs and want compact title/journal/year/entity cards before opening one paper
in full detail. The helper preserves input order and still works when
S2_API_KEY is unset.
Use --type carefully¶
--type on the default --source all route uses Europe PMC + PubMed when the
other selected filters are PubMed-compatible. If you also need
--open-access or --no-preprints, PubMed drops out and the search collapses
to Europe PMC-only with an explicit note. Use --source pubmed when you want
PubMed-only article search on the compatible filter set and do not need those
PubMed-incompatible filters.
Inspect the ranking rationale in JSON¶
Look for semantic_scholar_enabled, row-level matched_sources, and
ranking metadata to see why a paper ranked where it did. Hybrid rows expose
normalized semantic, lexical, citation, and source-position components plus the
composite score; lexical rows preserve the existing directness metadata.
Inspect the executed search plan¶
Markdown:
JSON / MCP-friendly text output:
--debug-plan adds a top-level debug_plan payload in JSON and prepends the
same payload as a fenced JSON block in markdown. Request JSON+plan for MCP
callers with --json --debug-plan.
Follow-up pattern¶
After identifying key papers, pivot to trials or variants: