Hapnet | Davinacklab

hapnet: installation guide and use

What Hapnet does

Hapnet builds a population-aware haplotype network from an aligned FASTA file. It produces:

A publication-ready network figure (PNG, PDF, or SVG).
TSV logs covering haplotype definitions, sequence-to-haplotype membership, shared haplotypes, and a summary.

Requirements

Python 3.9+
An aligned FASTA file (all sequences same length)

Quick start (most users)

Open Terminal (macOS/Linux) or PowerShell (Windows)
Install hapnet with pip, ideally inside a fresh virtual environment:

macOS/Linux
pip install hapnet

Windows (PowerShell)
pip install hapnet

Run Hapnet on your aligned FASTA:
hapnet your_alignment.fasta --out network.png --log-prefix run1

Outputs will be written to your current folder (or wherever you specify).

Input FASTA format (important)

1) Sequences must be aligned.

All sequences must be the same length, as in the output of an aligner such as MAFFT, MUSCLE, or Clustal.
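Before running hapnet, it can help to confirm that every sequence really is the same length. The following is a minimal sketch of such a check, not part of hapnet itself; the function names `read_fasta` and `find_length_outliers` are hypothetical, and the parser assumes unique headers and plain FASTA text.

```python
from collections import Counter
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}, joining wrapped sequence lines."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def find_length_outliers(records: Dict[str, str]) -> Dict[str, int]:
    """Return {header: length} for sequences that differ from the most common length."""
    if not records:
        return {}
    lengths = {h: len(s) for h, s in records.items()}
    expected = Counter(lengths.values()).most_common(1)[0][0]
    return {h: n for h, n in lengths.items() if n != expected}


# Small illustrative alignment: the third sequence is two characters short.
example = """>Ind1_Pop1
ACGT-ACGT
>Ind2_Pop2
ACGTTACGT
>Ind3_Pop2
ACGTTAC
""".splitlines()

print(find_length_outliers(read_fasta(example)))  # {'Ind3_Pop2': 7}
```

To check a real file, pass an open file handle instead of `example`, for instance `read_fasta(open("your_alignment.fasta"))`. An empty result means all sequences share one length.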

2) Population must be the last underscore-delimited token

Hapnet parses population identity from the final underscore-delimited token in each FASTA header.

Examples:

>Ind1_Pop1
>Ind2_Pop2
>Ind3_Pop2

More complex headers are fine as long as population is last:

>Ind7_Site1_2019_Pop3
>MN605578_Pneocaeca_RI

In these examples, the populations are interpreted as Pop1, Pop2, Pop3, RI.
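The rule above can be sketched in a few lines of Python. This illustrates the described behaviour only and is not hapnet's actual code; `population_from_header` is a hypothetical helper, and how hapnet treats headers without any underscore is not stated here.

```python
def population_from_header(header: str) -> str:
    """Return the text after the last underscore in a FASTA header."""
    # Drop the leading '>' and surrounding whitespace, then split once from the right.
    return header.lstrip(">").strip().rsplit("_", 1)[-1]


for h in [">Ind1_Pop1", ">Ind7_Site1_2019_Pop3", ">Sample1_Pop2_COI"]:
    print(h, "->", population_from_header(h))
```

The last header shows the common pitfall: because `COI` is the final token, it is what this rule picks up as the population.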

Command-line options

hapnet input.fasta --out network.png --log-prefix run1

--out can be network.png, network.pdf, or network.svg

--log-prefix sets the prefix for output TSV files (e.g., run1_*)

Output files explained

If you run:

hapnet input.fasta --out network.png --log-prefix run1

You will get:

1. network.png (the haplotype network figure: nodes sized by frequency, shared haplotypes shown as population pie charts, mutation tick marks on edges)

2. run1_haplotypes.tsv (one row per haplotype - haplotype ID, sequence, total count, etc.)

3. run1_membership.tsv (maps each input sequence header to its haplotype ID)

4. run1_summary.tsv (summary statistics, e.g., total haplotypes and the number of private and shared haplotypes)
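The TSV logs can be loaded with Python's standard `csv` module. The sketch below assumes each file starts with a header row, which is not confirmed here, and it deliberately does not assume any column names; `read_tsv` and `preview` are hypothetical helpers.

```python
import csv
from pathlib import Path
from typing import Dict, List


def read_tsv(path: str) -> List[Dict[str, str]]:
    """Load a tab-separated file with a header row into a list of dict rows."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def preview(path: str, n: int = 3) -> None:
    """Print the row count, the column names, and the first n rows of a TSV file."""
    rows = read_tsv(path)
    print(f"{path}: {len(rows)} rows")
    if rows:
        print("columns:", ", ".join(rows[0].keys()))
    for row in rows[:n]:
        print(row)
```

After a run with `--log-prefix run1`, calling `preview("run1_membership.tsv")` shows which columns are actually present before you write any downstream analysis against them.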

Recommended workflow for real data sets

1. Align sequences (MAFFT/MUSCLE/etc.).

2. Ensure headers end with _PopulationLabel.

3. Run hapnet.

4. Use the TSV logs for downstream analysis, QC, and manuscript tables.
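For repeated runs, step 3 can be scripted from Python. This is a minimal sketch that uses only the command-line options documented above; `build_hapnet_command` and `run_hapnet` are hypothetical wrappers, and the sketch assumes hapnet is on your PATH and signals failure with a non-zero exit status.

```python
import shlex
import subprocess
from typing import List


def build_hapnet_command(fasta: str, out: str, log_prefix: str) -> List[str]:
    """Assemble the documented hapnet command line as an argument list."""
    return ["hapnet", fasta, "--out", out, "--log-prefix", log_prefix]


def run_hapnet(fasta: str, out: str, log_prefix: str) -> None:
    """Run hapnet; raises CalledProcessError if it exits with a non-zero status."""
    cmd = build_hapnet_command(fasta, out, log_prefix)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)


# Show the command that would be executed, without running it.
print(shlex.join(build_hapnet_command("input.fasta", "network.png", "run1")))
```

Passing the arguments as a list, rather than one shell string, avoids quoting problems when file names contain spaces.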

Troubleshooting

"command not found: hapnet"

You likely forgot to activate the environment, or you installed hapnet into a different environment from the one you're using.

"Sequences are not the same length"

Your FASTA is not aligned, or some sequences have missing ends. Realign and/or trim so that all sequences are the same length.

"My populations are wrong"

hapnet uses the final underscore-delimited token. If your header ends in _2019 or _COI, hapnet will treat that as the population. Fix this by renaming headers so that the population is last.
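Renaming can be done in bulk when the population always sits at the same position in the header. The following is a minimal sketch under that assumption; `move_token_to_end` is a hypothetical helper, not a hapnet function, and `index` is the zero-based position of the population token.

```python
def move_token_to_end(header: str, index: int) -> str:
    """Move the underscore-delimited token at `index` to the end of the header."""
    prefix = ">" if header.startswith(">") else ""
    tokens = header[len(prefix):].split("_")
    population = tokens.pop(index)  # raises IndexError if the header has too few tokens
    return prefix + "_".join(tokens + [population])


print(move_token_to_end(">Ind1_Pop1_2019", 1))  # >Ind1_2019_Pop1
```

Apply it to every header line (those starting with `>`) and leave sequence lines untouched.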

"Plot is crowded/nodes overlap"

Dense networks can become hard to visualize when there are many haplotypes. The TSV logs remain correct even if the figure is crowded; consider plotting subsets or summarizing.
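One way to plot a subset is to write a smaller FASTA containing only the populations of interest and run hapnet on that file. The sketch below assumes records are already held as a {header: sequence} dictionary and applies the same last-token rule for the population; `subset_by_population` and `to_fasta` are hypothetical helpers.

```python
from typing import Dict, Set


def subset_by_population(records: Dict[str, str], keep: Set[str]) -> Dict[str, str]:
    """Keep only records whose final underscore-delimited token is in `keep`."""
    return {h: s for h, s in records.items() if h.rsplit("_", 1)[-1] in keep}


def to_fasta(records: Dict[str, str]) -> str:
    """Format {header: sequence} as FASTA text, one sequence per line."""
    return "".join(f">{h}\n{s}\n" for h, s in records.items())


# Illustrative records only.
records = {
    "Ind1_Pop1": "ACGTACGT",
    "Ind2_Pop2": "ACGTACGA",
    "Ind3_Pop2": "ACGTACGA",
    "Ind7_Site1_2019_Pop3": "ACGAACGT",
}

print(to_fasta(subset_by_population(records, {"Pop2", "Pop3"})), end="")
```

Write the returned text to a new file and pass that file to hapnet in place of the full alignment. Haplotype counts in the subset will reflect only the retained sequences.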

Citation

If you use hapnet in an analysis that you publish, please cite it as follows:

Davinack, A.A. (2026). hapnet: Population-aware haplotype networks in Python (v0.1.0). https://pypi.org/project/hapnet

Davinack Research Group @ Wheaton College
