Skip to content

rafalab/encodeRNAseqEvaluation

Repository files navigation

gencode.v16.annotation.gtf is the original ENCODE gtf file and can be obtained here:
ftp://ftp.sanger.ac.uk/pub/gencode/release_16/gencode.v16.annotation.gtf.gz

I used make-tables.R
to extract the transcripts and genes and make two new gtf files.

gencode.v16.annotation.genes.gtf.gz
gencode.v16.annotation.transcripts.gtf.gz

These tables are subsets of the original except for these addition to the attribute column:
1- the first addition is a unique ID (feature_id)
2- the second addition is the row number of the original file corresponding to the entry (original_row)
3- rows were appended to the end to annotate the ENCODE spikeins

Note that the files exon.id.txt, genes.id.txt and transcript.id.txt contain just the feature_ids

You can use the file sanitycheck.R to check if the tables columns and rows match

An example submission is here: exampleSubmission.tgz

Note 1: that there is also an exon table:

gencode.v16.annotation.exon.gtf.gz

in case we want to permit future exon submissions 

Note 2: the files spikein_exon.gtf, spikein_gene.gtf, spikein_transcript.gtf contain just the spike-ins. These have been appended to the end of the annotation files

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages