About Alvis

Alvis is a tool for visualizing amino acid conservation across multiple sequence alignments of protein families. It takes aligned FASTA sequences, identifies conserved positions above a configurable threshold, and produces an SVG diagram.

How to use it

  1. Define your groups. Each group represents one protein family or alignment. Give it a name, set a conservation threshold (percentage of sequences that must agree at a position for it to be considered conserved), and provide the aligned FASTA sequences — either by pasting them directly or uploading a file. Alternatively, upload a ZIP containing multiple FASTA files to populate all groups at once.
  2. Choose a representative. Once an alignment is loaded, pick which sequence to use as the reference for position numbering. By default this is the first sequence, but if you attach a PDB structure, Alvis will automatically select the sequence that best matches the structure's chain.
  3. Attach PDB structures (optional). Upload a PDB file or fetch one by ID from the RCSB. Alvis runs DSSP to extract secondary structure (helices, sheets, coils) and maps it onto the alignment, drawing it below the conservation line in the SVG.
  4. Cross-conservation (optional). Include a file named all.fasta in your ZIP (or add it as a cross-alignment). This is a single alignment that combines representative sequences from all groups. Alvis identifies positions conserved across all families and draws dashed connecting lines between groups in the diagram.
  5. Generate. Click "Generate SVG" to run the analysis and view the result.
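
For readers preparing input offline, the ZIP layout from steps 1 and 4 can be sketched in Python. This is only an illustration, not Alvis's own loader: the function names `parse_fasta` and `load_groups`, the accepted extensions, and the rule that a group takes its name from the file name are all assumptions.

```python
import io
import zipfile


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)  # sequence lines may wrap over several lines
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def load_groups(zip_bytes, cross_name="all.fasta"):
    """Split a ZIP into per-group alignments and the optional cross-alignment.

    Assumption: each FASTA file is one group, named after the file.
    """
    groups, cross = {}, None
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            base = info.filename.rsplit("/", 1)[-1]
            if not base.lower().endswith((".fasta", ".fa")):
                continue  # skip anything that is not a FASTA file
            records = parse_fasta(archive.read(info).decode("utf-8"))
            if base.lower() == cross_name:
                cross = records
            else:
                groups[base.rsplit(".", 1)[0]] = records
    return groups, cross
```

Calling `load_groups(open("families.zip", "rb").read())` on a hypothetical archive would return a dictionary of groups plus the records of all.fasta, or `None` for the latter if the archive does not contain it.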

Methodology

For each column in an alignment, Alvis finds the most common residue and counts the sequences that carry it. If its frequency (count / total sequences) meets or exceeds the threshold, the position is marked as conserved. Gap characters count against conservation: gapped sequences remain in the total, so a column where half the sequences have gaps can reach a frequency of at most 50%, and only if all remaining sequences share the same residue. Conserved positions are reported as 1-indexed ungapped coordinates in the representative sequence.
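
The rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Alvis's actual code: the function name `conserved_positions` is hypothetical, the threshold is passed as a fraction rather than a percentage, `-` and `.` are both treated as gaps, and columns where the representative itself has a gap are skipped because they have no ungapped coordinate.

```python
from collections import Counter

GAP_CHARS = "-."  # assumption: both characters denote gaps


def conserved_positions(sequences, representative_index=0, threshold=0.8):
    """Return conserved positions as 1-indexed ungapped coordinates
    in the representative sequence."""
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("aligned sequences must all have the same length")

    representative = sequences[representative_index]
    total = len(sequences)
    conserved = []
    ungapped_pos = 0  # running ungapped coordinate in the representative

    for col in range(length):
        rep_is_gap = representative[col] in GAP_CHARS
        if not rep_is_gap:
            ungapped_pos += 1

        # Count residues only; gapped sequences still count in `total`.
        counts = Counter(
            seq[col].upper() for seq in sequences if seq[col] not in GAP_CHARS
        )
        if not counts or rep_is_gap:
            continue

        top_count = counts.most_common(1)[0][1]
        if top_count / total >= threshold:
            conserved.append(ungapped_pos)

    return conserved
```

For the toy alignment `["MK-LV", "MKALV", "MR-LV", "MKALI"]` with a threshold of 0.75 and the first sequence as representative, the third column is not conserved (A appears in only 2 of 4 sequences), and the remaining columns map to ungapped positions 1, 2, 3 and 4.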

Secondary structure assignment uses DSSP (Dictionary of Secondary Structure of Proteins) on the provided PDB file. The PDB chain is aligned to the FASTA representative via substring matching or pairwise alignment, and DSSP segments are remapped to FASTA coordinates for display.
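
The substring-matching case can be sketched in Python as below. The names `find_offset` and `remap_segments` and the `(kind, start, end)` segment format are assumptions made for illustration; the pairwise-alignment fallback mentioned above is not shown, and chain coordinates are assumed to be 1-indexed positions in the chain sequence rather than PDB residue numbers.

```python
def find_offset(representative, chain):
    """Return the shift that converts chain coordinates to representative
    coordinates, or None if neither sequence contains the other."""
    rep = representative.replace("-", "").upper()
    chain = chain.upper()
    if not rep or not chain:
        return None
    idx = rep.find(chain)  # chain is a fragment of the representative
    if idx >= 0:
        return idx
    idx = chain.find(rep)  # representative is a fragment of the chain
    if idx >= 0:
        return -idx
    return None


def remap_segments(segments, offset, rep_length):
    """Shift (kind, start, end) segments into representative coordinates,
    clipping or dropping parts that fall outside the representative."""
    remapped = []
    for kind, start, end in segments:
        new_start = max(start + offset, 1)
        new_end = min(end + offset, rep_length)
        if new_start <= new_end:
            remapped.append((kind, new_start, new_end))
    return remapped
```

With the representative `MK-TAYIAKQR--QISFVK` and the chain `AYIAKQRQ`, the chain starts at ungapped position 4 of the representative, so the offset is 3 and a helix spanning chain residues 2 to 6 is drawn at representative positions 5 to 9.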
