Skip to content

How to: find articles

This guide shows practical literature-search patterns.

Translate a question into filters

When the gene, disease, or drug is already known, put that anchor in a typed flag and keep the mechanism, phenotype, dataset, or outcome in -k.

Known anchor plus concept:

biomcp search article -g TP53 -k "apoptosis gene regulation" --limit 5

Unknown entity, keyword first:

biomcp search article -k '"cafe-au-lait spots" neurofibromas disease' --type review --limit 5

Do not guess -g, -d, or --drug when the question is trying to identify the entity itself. Keep the first search keyword-only, or start with biomcp discover "<question>" if you want a typed follow-up command first. Question-format terms can stay in the article filters: PubMed ESearch cleans bounded filler words from unfielded gene, disease, drug, and keyword terms provider-locally, while query echoes and non-PubMed sources keep the original wording.

If the whole keyword exactly matches a gene, drug, or disease vocabulary label or alias, keyword-only article search may return a typed get suggestion in See also, _meta.next_commands, and JSON _meta.suggestions[]. Treat that as a structured follow-up option, but do not expect direct entity suggestions for multi-concept phrases such as BRAF V600E or lung cancer immunotherapy.

Dataset or method question:

biomcp search article -k "TCGA mutation analysis dataset" --type review --limit 5

Refine with typed flags before paginating:

biomcp search article --drug amiodarone -k "photosensitivity mechanism" --limit 5

If the first page reveals the gene, disease, or drug that actually anchors the question, rerun with that typed flag before you spend time paginating a noisy keyword-only result set.

Avoid keyword reformulation loops

When an agent is iterating on one literature task, pass a short local --session label and request JSON. If the next keyword search overlaps the previous same-session keyword by at least 60% after BioMCP removes common search filler words, JSON _meta.suggestions[] can point to a better fallback: inspect the prior hits with article batch, map the topic with discover, or narrow by publication year when the current page supports that retry.

biomcp --json search article -k "Oncotype DX review" --session lit-review-1 --limit 5
biomcp --json search article -k "Oncotype DX DCIS" --session lit-review-1 --limit 5

Treat --session as a non-secret local correlation label. Do not put PHI, credentials, email addresses, or user identifiers in it. Markdown article search output does not show loop-breaker suggestions.

Start from a known anchor

biomcp search article -g BRAF --limit 10

search article always works without credentials. BioMCP keeps sort=relevance as the default, but the effective ranking mode depends on the query: keyword-bearing searches default to hybrid scoring, while entity-only searches default to lexical directness. LitSense2 joins keyword-bearing federated searches, and the Semantic Scholar leg is still eligible whenever the filter set is compatible. S2_API_KEY upgrades those Semantic Scholar requests to authenticated quota; without it, BioMCP uses the shared pool. BioMCP also caps each federated source's contribution after deduplication and before ranking. Default: 40% of --limit on federated pools with at least three surviving primary sources. Rows count against their primary source after deduplication. Use --max-per-source <N> to override that cap, use --max-per-source 0 for the default cap explicitly, and set it equal to --limit to disable capping.

Search PubMed directly

biomcp search article -g BRAF --source pubmed --limit 5

Direct PubMed search and the compatible federated PubMed leg apply the same question-format cleanup before ESearch, so a keyword question can still echo as written while PubMed receives content terms.

Add disease context

biomcp search article -g BRAF -d melanoma --limit 10

Tune semantic versus lexical balance

biomcp search article -k "Hirschsprung disease ganglion cells" --ranking-mode hybrid --weight-semantic 0.5 --weight-lexical 0.2 --limit 5

Use --ranking-mode lexical to force the old directness comparator on a keyword query, --ranking-mode semantic to sort by the LitSense2-derived semantic signal first, or --weight-* flags to retune the default hybrid formula 0.4*semantic + 0.3*lexical + 0.2*citations + 0.1*position. Rows without LitSense2 provenance contribute semantic=0 in semantic-aware ranking modes.

Cap one source explicitly

biomcp search article -k "Kartagener syndrome ciliopathy" --limit 50 --max-per-source 10

Constrain by date

biomcp search article -g BRAF --since 2024-01-01 --limit 10

Exclude preprints when supported

biomcp search article -g BRAF --since 2024-01-01 --no-preprints --limit 10

Pull the full-text section

biomcp get article 22663011 fulltext

Fetch several shortlisted papers at once

biomcp article batch 22663011 24200969 39073865

Use article batch after search when you already know the candidate PMIDs or DOIs and want compact title/journal/year/entity cards before opening one paper in full detail. The helper preserves input order and still works when S2_API_KEY is unset.

Use --type carefully

biomcp search article -g BRAF --type review --limit 5

--type on the default --source all route uses Europe PMC + PubMed when the other selected filters are PubMed-compatible. If you also need --open-access or --no-preprints, PubMed drops out and the search collapses to Europe PMC-only with an explicit note. Use --source pubmed when you want PubMed-only article search on the compatible filter set and do not need those PubMed-incompatible filters.

Inspect the ranking rationale in JSON

env -u S2_API_KEY biomcp --json search article -g BRAF --limit 3

Look for semantic_scholar_enabled, row-level matched_sources, and ranking metadata to see why a paper ranked where it did. Hybrid rows expose normalized semantic, lexical, citation, and source-position components plus the composite score; lexical rows preserve the existing directness metadata.

Inspect the executed search plan

Markdown:

env -u S2_API_KEY biomcp search article -g BRAF --debug-plan --limit 3

JSON / MCP-friendly text output:

env -u S2_API_KEY biomcp --json search article -g BRAF --debug-plan --limit 3

--debug-plan adds a top-level debug_plan payload in JSON and prepends the same payload as a fenced JSON block in markdown. Request JSON+plan for MCP callers with --json --debug-plan.

Follow-up pattern

After identifying key papers, pivot to trials or variants:

biomcp search trial -c melanoma --mutation "BRAF V600E" --limit 5
biomcp search variant -g BRAF --limit 5