Variant translation library for Clojure.
- VCF variant ⇄ HGVS
- Finding HGVS aliases
- Conversion of genomic coordinates between assemblies
Clojure CLI/deps.edn:
varity/varity {:mvn/version "0.10.1"}
Leiningen/Boot:
[varity "0.10.1"]
To use varity with Clojure 1.8, you must include a dependency on clojure-future-spec.
We introduced enhancements to the description of protein changes by varity.vcf-to-hgvs, specifically making deletions more clinically meaningful:
- exon-intron boundary deletions:
A deletion that overlaps an exon-intron boundary throws an exception, because alterations affecting splice sites are predicted to cause splicing abnormalities.
- stop codon deletions:
When a deletion contains a stop codon, varity.vcf-to-hgvs produces one of the following outputs, depending on the alteration sequence:
- If the alteration sequence contains a stop codon, varity describes the variant as a deletion-insertion.
- Otherwise, it outputs p.?
The default value of the :prefer-deletion? option has changed to false.
(require '[varity.vcf-to-hgvs :as v2h])
(v2h/vcf-variant->coding-dna-hgvs {:chr "chr7", :pos 140924774, :ref "GGGAGGC", :alt "G"}
                                  "path/to/hg38.fa" "path/to/refGene.txt.gz")
;;=> (#clj-hgvs/hgvs "NM_004333:c.-95GCCTCC[3]")
To restore the previous behavior, specify :prefer-deletion? true in the options map.
(v2h/vcf-variant->coding-dna-hgvs {:chr "chr7", :pos 140924774, :ref "GGGAGGC", :alt "G"}
                                  "path/to/hg38.fa" "path/to/refGene.txt.gz"
                                  {:prefer-deletion? true})
;;=> (#clj-hgvs/hgvs "NM_004333:c.-77_-72delGCCTCC")
All positions are represented as one-based numbers, and all ranges are represented as one-based closed intervals. For example, {:pos 3} represents the third position from the start, and {:chr "chr1", :start 1, :end 3} represents the first three bases of chromosome 1.
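Some other formats, BED for instance, use zero-based half-open intervals, so coordinates may need converting when data crosses that boundary. The following is a minimal sketch of the conversion in plain Clojure. The helper names `closed->half-open`, `half-open->closed`, and `region-length` are hypothetical and are not part of varity.

```clojure
;; Hypothetical helpers (not part of varity) for moving between
;; one-based closed intervals and zero-based half-open intervals.

(defn closed->half-open
  "Converts a one-based closed region to a zero-based half-open region.
  Only :start shifts; :end keeps its numeric value."
  [region]
  (update region :start dec))

(defn half-open->closed
  "Converts a zero-based half-open region back to a one-based closed region."
  [region]
  (update region :start inc))

(defn region-length
  "Returns the number of bases covered by a one-based closed region."
  [{:keys [start end]}]
  (inc (- end start)))

(closed->half-open {:chr "chr1", :start 1, :end 3})
;;=> {:chr "chr1", :start 0, :end 3}

(region-length {:chr "chr1", :start 1, :end 3})
;;=> 3
```

Because both ends of a closed interval are included, the length is `end - start + 1`, which is why the first three bases are written as `:start 1, :end 3`.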
varity.vcf-to-hgvs provides functions to convert a VCF-style variant into HGVS.
The returned HGVS is a clj-hgvs data structure.
(require '[varity.vcf-to-hgvs :as v2h])
(v2h/vcf-variant->hgvs {:chr "chr7", :pos 55191822, :ref "T", :alt "G"}
                       "path/to/hg38.fa" "path/to/refGene.txt.gz")
;;=> ({:coding-dna #clj-hgvs/hgvs "NM_005228:c.2573T>G",
;;     :protein #clj-hgvs/hgvs "p.L858R"})
Use clj-hgvs.core/format to obtain HGVS text.
(require '[clj-hgvs.core :as hgvs])
(def l858r (-> (v2h/vcf-variant->protein-hgvs {:chr "chr7", :pos 55191822, :ref "T", :alt "G"}
                                              "path/to/hg38.fa" "path/to/refGene.txt.gz")
               first))
(hgvs/format l858r {:amino-acid-format :long})
;;=> "p.Leu858Arg"
varity.hgvs-to-vcf provides functions to convert HGVS into VCF-style variants.
(require '[varity.hgvs-to-vcf :as h2v]
         '[clj-hgvs.core :as hgvs])
(h2v/hgvs->vcf-variants #clj-hgvs/hgvs "NM_005228:c.2573T>G" "path/to/hg38.fa" "path/to/refGene.txt.gz")
;;=> ({:chr "chr7", :pos 55191822, :ref "T", :alt "G"})
(h2v/hgvs->vcf-variants #clj-hgvs/hgvs "c.2573T>G" "EGFR" "path/to/hg38.fa" "path/to/refGene.txt.gz")
;;=> ({:chr "chr7", :pos 55191822, :ref "T", :alt "G"})
(h2v/hgvs->vcf-variants #clj-hgvs/hgvs "p.A222V" "MTHFR" "path/to/hg38.fa" "path/to/refGene.txt.gz")
;;=> ({:chr "chr1", :pos 11796320, :ref "GG", :alt "CA"}
;;     {:chr "chr1", :pos 11796320, :ref "GG", :alt "AA"}
;;     {:chr "chr1", :pos 11796320, :ref "GG", :alt "TA"}
;;     {:chr "chr1", :pos 11796321, :ref "G", :alt "A"})
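A protein-level HGVS can correspond to more than one nucleotide change, so the result may hold several candidates, as in the p.A222V example. The result is an ordinary sequence of maps, so it can be narrowed with standard sequence functions. The sketch below uses the p.A222V output as literal data, so it runs without reference files. `snv?` is a hypothetical helper, not part of varity.

```clojure
;; Candidate variants copied from the p.A222V example output.
(def candidates
  [{:chr "chr1", :pos 11796320, :ref "GG", :alt "CA"}
   {:chr "chr1", :pos 11796320, :ref "GG", :alt "AA"}
   {:chr "chr1", :pos 11796320, :ref "GG", :alt "TA"}
   {:chr "chr1", :pos 11796321, :ref "G", :alt "A"}])

(defn snv?
  "Hypothetical predicate: true when both alleles are a single base."
  [{r :ref, a :alt}]
  (and (= 1 (count r)) (= 1 (count a))))

;; Keep only single-nucleotide candidates.
(filter snv? candidates)
;;=> ({:chr "chr1", :pos 11796321, :ref "G", :alt "A"})
```

Whether single-nucleotide candidates are the right ones to keep depends on the application. The filter only illustrates that the returned data needs no special handling.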
varity.hgvs/find-aliases finds alternative HGVS expressions for the same variant.
(require '[varity.hgvs :as vhgvs])
(vhgvs/find-aliases #clj-hgvs/hgvs "NM_000059:c.162CAA[1]"
                    "path/to/hg38.fa" "path/to/refGene.txt.gz")
;;=> (#clj-hgvs/hgvs "NM_000059:c.162CAA[1]"
;;     #clj-hgvs/hgvs "NM_000059:c.165_167delCAA")
To convert a genomic coordinate between assemblies, use varity.lift/convert-coord with a chain file:
(require '[varity.lift :as lift])
(lift/convert-coord {:chr "chr1", :pos 743267} "path/to/hg19ToHg38.over.chain.gz")
;;=> {:chr "chr1", :pos 807887}
Copyright 2017-2022 Xcoo, Inc.
Licensed under the Apache License, Version 2.0.
The algorithm of varity.fusion was initially developed by Norio Tanaka at Cancer Precision Medicine Center, Japanese Foundation for Cancer Research. We thank him for his scientific insight and technical help.