If I'm not wrong, the rRNA HMM models come from Barrnap. Tbh, I've had similar issues recently with other pipelines that rely on Barrnap for rRNA detection, also with a collection of archaeal circularised genomes. They mainly miss the 5S but occasionally also some copies of the 16S. For single-genome annotations I ended up using Infernal with the covariance models from Rfam.
The HMMs do come from Barrnap, but the Anvio readme file about how they imported them says they arbitrarily added a GA cutoff of 750 bitscore that wasn't present in the original HMMs.
Thanks for bringing this up, @philipwoods. Do you happen to have any insight into whether changing that GA cutoff changes the rate at which 5S rRNA genes are recognized?
We could also remove the GA cutoff and use a significance threshold there instead. I'm happy to take a look and see what works well, but it would be much better if this were done by someone who is genuinely interested in identifying 5S rRNAs more effectively, and we always appreciate any help :)
Short description of the problem
Even in single-contig circularized genomes, anvi-run-hmms does not reliably detect the 5S rRNA gene.
anvi'o version
System info
Our operating system is Red Hat Enterprise Linux. Anvio was installed in a conda environment.
Detailed description of the issue
Looking at the ribosomal RNA HMM readme file, it appears that the ribosomal RNA HMMs all use a GA cutoff of 750 bitscore. However, the 5S rRNA gene is short, and when I run the 5S HMM independently on circularized genomes I see that alignment along the full length of the HMM (about 115 nt) only reaches an E-value of about 1e-14 or 1e-15 and a bitscore of 52-55. This suggests that the current bitscore cutoff makes it impossible to detect the 5S rRNA gene with anvi-run-hmms. After more testing on a few other genomes to refine the cutoff, I would suggest changing the GA bitscore cutoff for the 5S rRNA HMM to 45.
Files / commands to reproduce the issue
To test Anvio HMMs on a circularized genome found here:
anvi-get-sequences-for-hmm-hits -c Methanosarcina_acetivorans_GCF_000007345.1.db --hmm-sources Ribosomal_RNA_5S -o 5S-test.fasta
To run the HMM independently of Anvio:
nhmmer 5S-rRNA.hmm Methanosarcina_acetivorans_GCF_000007345.1.fna
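To compare hits against different cutoffs without rerunning the search, something like the following rough sketch could help. It assumes nhmmer was rerun with --tblout to write a tabular file, and that the bit score is the 14th whitespace-separated column of each non-comment row; please check that against the header of your own output. The function names and the example rows are made up for illustration and are not real results.

```python
# Sketch: which nhmmer hits would survive a given bitscore cutoff?
# Assumption: input is the text of an nhmmer --tblout file, with the bit
# score in the 14th column (index 13) of every non-comment row.

def parse_scores(tblout_text):
    """Return a list of (target_name, bit_score) pairs."""
    hits = []
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        hits.append((fields[0], float(fields[13])))
    return hits


def passing(hits, cutoff):
    """Keep only the hits whose bit score is at least `cutoff`."""
    return [hit for hit in hits if hit[1] >= cutoff]


# Hypothetical rows, NOT real output: names, coordinates and scores are
# invented to mimic a short full-length 5S hit.
EXAMPLE_TBLOUT = """\
# target name  accession  query name  accession  hmmfrom  hmm to  ...
contig_1 - 5S_rRNA - 1 115 100200 100314 100195 100320 5000000 + 1.2e-14 53.1 2.0 -
contig_1 - 5S_rRNA - 1 115 2500400 2500286 2500405 2500280 5000000 - 4.0e-15 54.6 1.8 -
"""

hits = parse_scores(EXAMPLE_TBLOUT)
for cutoff in (750, 45):
    print(cutoff, len(passing(hits, cutoff)))
```

With scores in the range described above, a cutoff of 750 keeps nothing while a cutoff of 45 keeps every full-length hit.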
To test the HMM independently with the GA threshold:
nhmmer --cut_ga 5S-rRNA.hmm Methanosarcina_acetivorans_GCF_000007345.1.fna
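For anyone who wants to try a lower threshold locally, here is a minimal sketch of how the GA line in a plain-text HMM file could be rewritten. It assumes the HMMER3 text format, where the gathering threshold sits on a header line starting with GA, and it replaces every such line in the text it is given. The function name and the trimmed header below are hypothetical, and this is not how anvi'o itself manages its HMM sources.

```python
import re

# Sketch: rewrite the GA (gathering threshold) line(s) of HMMER3 text.
# Assumption: GA appears at the start of a header line, followed by two
# numbers, as in "GA    750 750;".

def set_ga_cutoff(hmm_text, cutoff):
    """Return hmm_text with every GA header line set to `cutoff`."""
    new_line = f"GA    {cutoff:.2f} {cutoff:.2f};"
    return re.sub(r"^GA\s+.*$", new_line, hmm_text, flags=re.MULTILINE)


# Hypothetical, heavily trimmed header used only to exercise the function.
EXAMPLE_HEADER = """\
NAME  5S_rRNA
LENG  115
ALPH  DNA
GA    750 750;
"""

print(set_ga_cutoff(EXAMPLE_HEADER, 45))
```

The edited text would be written back to a copy of the .hmm file, and the --cut_ga command could then be rerun against that copy.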