methylphase

Documentation for the methylphase toolkit

Project repository on GitHub: SorenHeidelbach/methylphase

Quickstart

Use this minimal workflow to go from an indexed, mod-tagged BAM to phased methylation labels.

1) Collect inputs
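
The command in step 2 takes three inputs: a floria haploset file, an indexed BAM carrying modification tags, and a motif string. The following is a minimal pre-flight sketch, assuming the index sits next to the BAM under one of the names samtools index commonly writes (.bam.bai, .bai, or .bam.csi). The function name check_inputs is hypothetical and not part of methylphase.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that the files passed to phase-variants exist.
# Prints one line per problem and returns non-zero if anything is missing.
check_inputs() {
  local bam=$1 haploset=$2 failed=0

  [ -s "$bam" ] || { echo "missing or empty BAM: $bam"; failed=1; }
  [ -s "$haploset" ] || { echo "missing or empty haploset: $haploset"; failed=1; }

  # Assumption: the index lives beside the BAM under a standard samtools name.
  if [ ! -e "$bam.bai" ] && [ ! -e "${bam%.bam}.bai" ] && [ ! -e "$bam.csi" ]; then
    echo "no index found for: $bam"
    failed=1
  fi
  return "$failed"
}
```

For example, `check_inputs sample.mod.bam floria/contig/contig.haploset && echo "inputs look complete"`. If samtools is installed, a non-zero count from `samtools view sample.mod.bam | head -n 100 | grep -c -E 'M[Mm]:Z:'` suggests the reads carry modification tags.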

2) Run the end-to-end pipeline

This command runs split-reads internally, merges haplotype and methylation features, selects the best latent variant model, and imputes read labels.

methylphase phase-variants \
  --floria floria/contig/contig.haploset \
  --bam sample.mod.bam \
  --motif GATC_6mA_1 \
  --out results/phase_variants

Key outputs land under results/phase_variants: split_reads/*, dataset.tsv, categories.toml, best_model.json, best_responsibilities.tsv, imputed.tsv, and imputed_labels.tsv.
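
To confirm that a run produced everything listed above, a small check along these lines may help. It is a sketch only: missing_outputs is a hypothetical helper, not a methylphase subcommand, and it tests only that the files exist, not that their contents are valid.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list expected phase-variants outputs that are absent.
# Prints nothing when every listed output is present.
missing_outputs() {
  local out_dir=$1 name
  for name in dataset.tsv categories.toml best_model.json \
              best_responsibilities.tsv imputed.tsv imputed_labels.tsv; do
    [ -e "$out_dir/$name" ] || echo "$name"
  done
  # split_reads/ should exist and contain at least one entry.
  if [ -z "$(ls -A "$out_dir/split_reads" 2>/dev/null)" ]; then
    echo "split_reads/"
  fi
}
```

Running `missing_outputs results/phase_variants` prints one name per missing output, so empty output means the directory is complete.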

3) Inspect quick summaries
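
One way to get a quick look at the tabular outputs, without assuming anything about their columns, is to print each file's line count and first few rows. The helper peek_tsvs below is hypothetical and not a methylphase subcommand. It skips any file that is absent.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the line count and first rows of each TSV output.
# Makes no assumption about column names or whether a header row is present.
peek_tsvs() {
  local out_dir=$1 rows=${2:-5} f
  for f in "$out_dir"/dataset.tsv "$out_dir"/best_responsibilities.tsv \
           "$out_dir"/imputed.tsv "$out_dir"/imputed_labels.tsv; do
    [ -f "$f" ] || continue
    printf '== %s (%s lines)\n' "${f##*/}" "$(wc -l < "$f" | tr -d ' ')"
    head -n "$rows" "$f"
  done
}
```

For example, `peek_tsvs results/phase_variants 3` shows the first three lines of each table that exists.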

Alternative quick run: split and aggregate only

If you only need read clustering and per-read methylation tables, run extract and split-reads directly:

methylphase extract \
  --bam sample.mod.bam \
  --motif CG_5 \
  --output-dir results/extract

methylphase split-reads \
  --bam sample.mod.bam \
  --motif-file motifs.tsv \
  --output-dir results/split \
  --cluster-algorithm hdbscan \
  --emit-fastq \
  --threads 8