Hi everyone,
I want to calculate the GC content of transcripts in a GTF file like this:
chr1 Cufflinks transcript 3 22 1000 + . gene_id "CUFF.23955"; transcript_id "CUFF.23955.1";
chr1 Cufflinks exon 3 10 1000 + . gene_id "CUFF.23955"; transcript_id "CUFF.23955.1"; exon_number "1";
chr1 Cufflinks exon 13 18 1000 + . gene_id "CUFF.23955"; transcript_id "CUFF.23955.1"; exon_number "2";
chr1 Cufflinks exon 20 22 1000 + . gene_id "CUFF.23955"; transcript_id "CUFF.23955.1"; exon_number "3";
and the genome sequence that the transcript comes from is:
>chr1
GTAGCGTCTCCGACGCGGATATGACCGCACGCTGATGCTCCCAGGGATGAGAGGCGTGCG
I need to get the sequence of the transcript first and then calculate its GC content.
How can I get the transcript sequence? In this case, the exons would be AGCGTCTC + ACGCGG + TAT, so the transcript sequence would be AGCGTCTCACGCGGTAT.
Is this possible in Galaxy?
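
To make the steps concrete, here is a rough Python sketch of what I am trying to do outside Galaxy. It is only an illustration on the toy data above, not a tested pipeline: the function names (transcript_sequences, gc_content) are made up for this post, the parser assumes whitespace-separated fields with no spaces in the source column, and it assumes all exons of a transcript lie on the same chromosome and strand.

```python
import re
from collections import defaultdict

# Toy data copied from the example in this post.
GENOME = {"chr1": "GTAGCGTCTCCGACGCGGATATGACCGCACGCTGATGCTCCCAGGGATGAGAGGCGTGCG"}
GTF = """\
chr1 Cufflinks transcript 3 22 1000 + . gene_id "CUFF.23955"; transcript_id "CUFF.23955.1";
chr1 Cufflinks exon 3 10 1000 + . gene_id "CUFF.23955"; transcript_id "CUFF.23955.1"; exon_number "1";
chr1 Cufflinks exon 13 18 1000 + . gene_id "CUFF.23955"; transcript_id "CUFF.23955.1"; exon_number "2";
chr1 Cufflinks exon 20 22 1000 + . gene_id "CUFF.23955"; transcript_id "CUFF.23955.1"; exon_number "3";
"""

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def transcript_sequences(gtf_text, genome):
    """Return {transcript_id: spliced sequence} built from the exon records."""
    exons = defaultdict(list)
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        # Assumption: 8 whitespace-separated columns, then the attribute string.
        fields = line.split(None, 8)
        if len(fields) < 9 or fields[2] != "exon":
            continue
        chrom, start, end, strand, attrs = (
            fields[0], int(fields[3]), int(fields[4]), fields[6], fields[8]
        )
        match = re.search(r'transcript_id "([^"]+)"', attrs)
        if match is None:
            continue
        exons[match.group(1)].append((chrom, start, end, strand))

    sequences = {}
    for transcript_id, parts in exons.items():
        parts.sort(key=lambda p: p[1])  # order exons by genomic start
        # GTF coordinates are 1-based and inclusive, hence start - 1.
        seq = "".join(genome[c][s - 1:e] for c, s, e, _ in parts)
        if parts[0][3] == "-":
            # Minus-strand transcripts are the reverse complement.
            seq = seq.translate(COMPLEMENT)[::-1]
        sequences[transcript_id] = seq
    return sequences


def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


seqs = transcript_sequences(GTF, GENOME)
for transcript_id, seq in seqs.items():
    print(transcript_id, seq, f"{gc_content(seq):.3f}")
# prints: CUFF.23955.1 AGCGTCTCACGCGGTAT 0.588
```

For the example transcript this gives 10 G/C bases out of 17, which is the number I would like to get for every transcript in the file.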