Skip to content

How to Predict Variant Effects with AlphaGenome

This guide demonstrates how to use Google DeepMind's AlphaGenome to predict regulatory effects of genetic variants on gene expression, chromatin accessibility, and splicing.

Overview

AlphaGenome predicts how DNA variants affect:

  • Gene Expression: Log-fold changes in nearby genes
  • Chromatin Accessibility: ATAC-seq/DNase-seq signal changes
  • Splicing: Effects on splice sites and exon inclusion
  • Regulatory Elements: Impact on enhancers, promoters, and TFBS
  • 3D Chromatin: Changes in chromatin interactions

For technical details on the AlphaGenome integration, see the AlphaGenome API Reference.

Setup and API Key

Get Your API Key

  1. Visit AlphaGenome Portal
  2. Register for non-commercial use
  3. Receive API key via email

For detailed setup instructions, see Authentication and API Keys.

Configure API Key

Option 1: Environment Variable (Personal Use)

export ALPHAGENOME_API_KEY="your-key-here"

Option 2: Per-Request (AI Assistants)

"Predict effects of BRAF V600E. My AlphaGenome API key is YOUR_KEY_HERE"

Option 3: Configuration File

# .env file
ALPHAGENOME_API_KEY=your-key-here

Install AlphaGenome (Optional)

For local predictions:

git clone https://github.com/google-deepmind/alphagenome.git
cd alphagenome && pip install .

Basic Variant Prediction

Simple Prediction

Predict effects of BRAF V600E mutation:

# CLI
biomcp variant predict chr7 140753336 A T

# Python
result = await client.variants.predict(
    chromosome="chr7",
    position=140753336,
    reference="A",
    alternate="T"
)

# MCP Tool
result = await alphagenome_predictor(
    chromosome="chr7",
    position=140753336,
    reference="A",
    alternate="T"
)

Understanding Results

# Gene expression changes
for gene in result.gene_expression:
    print(f"{gene.name}: {gene.log2_fold_change}")
    # Positive = increased expression
    # Negative = decreased expression
    # |value| > 1.0 = strong effect

# Chromatin accessibility
for region in result.chromatin:
    print(f"{region.type}: {region.change}")
    # Positive = more open chromatin
    # Negative = more closed chromatin

# Splicing effects
for splice in result.splicing:
    print(f"{splice.event}: {splice.delta_psi}")
    # PSI = Percent Spliced In
    # Positive = increased inclusion

Tissue-Specific Predictions

Single Tissue Analysis

Predict effects in specific tissues using UBERON terms:

# Breast tissue analysis
result = await alphagenome_predictor(
    chromosome="chr17",
    position=41246481,
    reference="G",
    alternate="A",
    tissue_types=["UBERON:0000310"]  # breast
)

# Common tissue codes:
# UBERON:0000310 - breast
# UBERON:0002107 - liver
# UBERON:0002367 - prostate
# UBERON:0000955 - brain
# UBERON:0002048 - lung
# UBERON:0001155 - colon

Multi-Tissue Comparison

Compare effects across tissues:

tissues = [
    "UBERON:0000310",  # breast
    "UBERON:0002107",  # liver
    "UBERON:0002048"   # lung
]

results = {}
for tissue in tissues:
    results[tissue] = await alphagenome_predictor(
        chromosome="chr17",
        position=41246481,
        reference="G",
        alternate="A",
        tissue_types=[tissue]
    )

# Compare gene expression across tissues
for tissue, result in results.items():
    print(f"\n{tissue}:")
    for gene in result.gene_expression[:3]:
        print(f"  {gene.name}: {gene.log2_fold_change}")

Analysis Window Sizes

Choosing the Right Interval

Different interval sizes capture different regulatory effects:

# Short-range (promoter effects)
result_2kb = await alphagenome_predictor(
    chromosome="chr7",
    position=140753336,
    reference="A",
    alternate="T",
    interval_size=2048  # 2kb
)

# Medium-range (enhancer-promoter)
result_128kb = await alphagenome_predictor(
    chromosome="chr7",
    position=140753336,
    reference="A",
    alternate="T",
    interval_size=131072  # 128kb (default)
)

# Long-range (TAD-level effects)
result_1mb = await alphagenome_predictor(
    chromosome="chr7",
    position=140753336,
    reference="A",
    alternate="T",
    interval_size=1048576  # 1Mb
)

Interval Size Guide:

  • 2kb: Promoter variants, TSS mutations
  • 16kb: Local regulatory elements
  • 128kb: Enhancer-promoter interactions (default)
  • 512kb: Long-range regulatory
  • 1Mb: TAD boundaries, super-enhancers

Clinical Workflows

Workflow 1: VUS (Variant of Unknown Significance) Analysis

async def analyze_vus(chromosome: str, position: int, ref: str, alt: str):
    # Step 1: Think about the analysis
    await think(
        thought=f"Analyzing VUS at {chromosome}:{position} {ref}>{alt}",
        thoughtNumber=1
    )

    # Step 2: Get variant annotations
    variant_id = f"{chromosome}:g.{position}{ref}>{alt}"
    try:
        known_variant = await variant_getter(variant_id)
        if known_variant.clinical_significance:
            return f"Already classified: {known_variant.clinical_significance}"
    except:
        pass  # Variant not in databases

    # Step 3: Predict regulatory effects
    prediction = await alphagenome_predictor(
        chromosome=chromosome,
        position=position,
        reference=ref,
        alternate=alt,
        interval_size=131072
    )

    # Step 4: Analyze impact
    high_impact_genes = [
        g for g in prediction.gene_expression
        if abs(g.log2_fold_change) > 1.0
    ]

    # Step 5: Search literature
    if high_impact_genes:
        gene_symbols = [g.name for g in high_impact_genes[:3]]
        articles = await article_searcher(
            genes=gene_symbols,
            keywords=["pathogenic", "disease", "mutation"]
        )

    return {
        "variant": f"{chromosome}:{position} {ref}>{alt}",
        "high_impact_genes": high_impact_genes,
        "regulatory_assessment": assess_regulatory_impact(prediction),
        "literature_support": len(articles) if high_impact_genes else 0
    }

def assess_regulatory_impact(prediction):
    """Classify regulatory impact severity"""
    max_expression_change = max(
        abs(g.log2_fold_change) for g in prediction.gene_expression
    ) if prediction.gene_expression else 0

    if max_expression_change > 2.0:
        return "HIGH - Strong regulatory effect"
    elif max_expression_change > 1.0:
        return "MODERATE - Significant regulatory effect"
    elif max_expression_change > 0.5:
        return "LOW - Mild regulatory effect"
    else:
        return "MINIMAL - No significant regulatory effect"

Workflow 2: Non-coding Variant Prioritization

async def prioritize_noncoding_variants(variants: list[dict], disease_genes: list[str]):
    """Rank non-coding variants by predicted impact on disease genes"""

    results = []

    for variant in variants:
        # Predict effects
        prediction = await alphagenome_predictor(
            chromosome=variant["chr"],
            position=variant["pos"],
            reference=variant["ref"],
            alternate=variant["alt"]
        )

        # Check impact on disease genes
        disease_impact = {}
        for gene in prediction.gene_expression:
            if gene.name in disease_genes:
                disease_impact[gene.name] = gene.log2_fold_change

        # Calculate priority score
        if disease_impact:
            max_impact = max(abs(v) for v in disease_impact.values())
            results.append({
                "variant": variant,
                "disease_genes_affected": disease_impact,
                "priority_score": max_impact,
                "chromatin_changes": len([c for c in prediction.chromatin if c.change > 0.5])
            })

    # Sort by priority
    results.sort(key=lambda x: x["priority_score"], reverse=True)
    return results

# Example usage
variants_to_test = [
    {"chr": "chr17", "pos": 41246000, "ref": "A", "alt": "G"},
    {"chr": "chr17", "pos": 41246500, "ref": "C", "alt": "T"},
    {"chr": "chr17", "pos": 41247000, "ref": "G", "alt": "A"}
]

breast_cancer_genes = ["BRCA1", "BRCA2", "TP53", "PTEN"]
prioritized = await prioritize_noncoding_variants(variants_to_test, breast_cancer_genes)

Workflow 3: Splicing Analysis

async def analyze_splicing_variant(gene: str, exon: int, variant_pos: int, ref: str, alt: str):
    """Analyze potential splicing effects of a variant"""

    # Get gene information
    gene_info = await gene_getter(gene)
    chromosome = f"chr{gene_info.genomic_location.chr}"

    # Predict splicing effects
    prediction = await alphagenome_predictor(
        chromosome=chromosome,
        position=variant_pos,
        reference=ref,
        alternate=alt,
        interval_size=16384  # Smaller window for splicing
    )

    # Analyze splicing predictions
    splicing_effects = []
    for event in prediction.splicing:
        if abs(event.delta_psi) > 0.1:  # 10% change in splicing
            splicing_effects.append({
                "type": event.event_type,
                "change": event.delta_psi,
                "affected_exon": event.exon,
                "interpretation": interpret_splicing(event)
            })

    # Search for similar splicing variants
    articles = await article_searcher(
        genes=[gene],
        keywords=[f"exon {exon}", "splicing", "splice site"]
    )

    return {
        "variant": f"{gene} exon {exon} {ref}>{alt}",
        "splicing_effects": splicing_effects,
        "likely_consequence": predict_consequence(splicing_effects),
        "literature_precedent": len(articles)
    }

def interpret_splicing(event):
    """Interpret splicing changes"""
    if event.delta_psi > 0.5:
        return "Strong increase in exon inclusion"
    elif event.delta_psi > 0.1:
        return "Moderate increase in exon inclusion"
    elif event.delta_psi < -0.5:
        return "Strong exon skipping"
    elif event.delta_psi < -0.1:
        return "Moderate exon skipping"
    else:
        return "Minimal splicing change"

Research Applications

Enhancer Variant Analysis

async def analyze_enhancer_variant(chr: str, pos: int, ref: str, alt: str, target_gene: str):
    """Analyze variant in potential enhancer region"""

    # Use larger window to capture enhancer-promoter interactions
    prediction = await alphagenome_predictor(
        chromosome=chr,
        position=pos,
        reference=ref,
        alternate=alt,
        interval_size=524288  # 512kb
    )

    # Find target gene effect
    target_effect = None
    for gene in prediction.gene_expression:
        if gene.name == target_gene:
            target_effect = gene.log2_fold_change
            break

    # Analyze chromatin changes
    chromatin_opening = sum(
        1 for c in prediction.chromatin
        if c.change > 0 and c.type == "enhancer"
    )

    return {
        "variant_location": f"{chr}:{pos}",
        "target_gene": target_gene,
        "expression_change": target_effect,
        "enhancer_activity": "increased" if chromatin_opening > 0 else "decreased",
        "likely_enhancer": abs(target_effect or 0) > 0.5 and chromatin_opening > 0
    }

Pharmacogenomic Predictions

async def predict_drug_response_variant(drug_target: str, variant: dict):
    """Predict how variant affects drug target expression"""

    # Get drug information
    drug_info = await drug_getter(drug_target)
    target_genes = drug_info.targets

    # Predict variant effects
    prediction = await alphagenome_predictor(
        chromosome=variant["chr"],
        position=variant["pos"],
        reference=variant["ref"],
        alternate=variant["alt"],
        tissue_types=["UBERON:0002107"]  # liver for drug metabolism
    )

    # Check effects on drug targets
    target_effects = {}
    for gene in prediction.gene_expression:
        if gene.name in target_genes:
            target_effects[gene.name] = gene.log2_fold_change

    # Interpret results
    if any(abs(effect) > 1.0 for effect in target_effects.values()):
        response = "Likely altered drug response"
    elif any(abs(effect) > 0.5 for effect in target_effects.values()):
        response = "Possible altered drug response"
    else:
        response = "Unlikely to affect drug response"

    return {
        "drug": drug_target,
        "variant": variant,
        "target_effects": target_effects,
        "prediction": response,
        "recommendation": "Consider dose adjustment" if "altered" in response else "Standard dosing"
    }

Best Practices

1. Validate Input Coordinates

# Always use "chr" prefix
chromosome = "chr7"  # ✅ Correct
# chromosome = "7"   # ❌ Wrong

# Use 1-based positions (not 0-based)
position = 140753336  # ✅ 1-based

2. Handle API Errors Gracefully

try:
    result = await alphagenome_predictor(...)
except Exception as e:
    if "API key" in str(e):
        print("Please provide AlphaGenome API key")
    elif "Invalid sequence" in str(e):
        print("Check chromosome and position")
    else:
        print(f"Prediction failed: {e}")

3. Combine with Other Tools

# Complete variant analysis pipeline
async def comprehensive_variant_analysis(variant_id: str):
    # 1. Get known annotations
    known = await variant_getter(variant_id)

    # 2. Predict regulatory effects
    prediction = await alphagenome_predictor(
        chromosome=f"chr{known.chr}",
        position=known.pos,
        reference=known.ref,
        alternate=known.alt
    )

    # 3. Search literature
    articles = await article_searcher(
        variants=[variant_id],
        genes=[known.gene.symbol]
    )

    # 4. Find relevant trials
    trials = await trial_searcher(
        other_terms=[known.gene.symbol, "mutation"]
    )

    return {
        "annotations": known,
        "predictions": prediction,
        "literature": articles,
        "trials": trials
    }

4. Interpret Results Appropriately

def interpret_expression_change(log2_fc):
    """Convert log2 fold change to interpretation"""
    if log2_fc > 2.0:
        return "Very strong increase (>4x)"
    elif log2_fc > 1.0:
        return "Strong increase (2-4x)"
    elif log2_fc > 0.5:
        return "Moderate increase (1.4-2x)"
    elif log2_fc < -2.0:
        return "Very strong decrease (<0.25x)"
    elif log2_fc < -1.0:
        return "Strong decrease (0.25-0.5x)"
    elif log2_fc < -0.5:
        return "Moderate decrease (0.5-0.7x)"
    else:
        return "Minimal change"

Limitations and Considerations

Technical Limitations

  • Human only: GRCh38 reference genome
  • SNVs only: No indels or structural variants
  • Exact coordinates: Must have precise genomic position
  • Sequence context: Requires reference sequence match

Interpretation Caveats

  • Predictions not certainties: Validate with functional studies
  • Context matters: Cell type, developmental stage affect outcomes
  • Indirect effects: May miss complex regulatory cascades
  • Population variation: Individual genetic background influences

Troubleshooting

Common Issues

"API key required"

  • Set environment variable or provide per-request
  • Check key validity at AlphaGenome portal

"Invalid sequence length"

  • Verify chromosome format (use "chr" prefix)
  • Check position is within chromosome bounds
  • Ensure ref/alt are single nucleotides

"No results returned"

  • May be no genes in analysis window
  • Try larger interval size
  • Check if variant is in gene desert

Installation issues

  • Ensure Python 3.10+
  • Try pip install --upgrade pip first
  • Check for conflicting protobuf versions

Next Steps