Cufflinks merging more than one transcript on bacterial genomes

Noa Sher

24 Jan 2012

8:07 a.m.


Jeremy Goecks

25 Jan

12:13 p.m.

Noa,

...

This is one thing I would like help with: is it worth simply reducing the max intron size to nothing? What is the accepted consensus when using TopHat on bacterial genomes?

I'm not sure that folks on this list have much experience with bacterial transcriptome analysis. You might try seqanswers.com or email the TopHat/Cufflinks authors directly at tophat.cufflinks@gmail.com. If you find something interesting elsewhere, please feel free to share it with the Galaxy community.

...

When I look at the second TopHat file, the accepted hits, all hits align nicely with known genes. However, when I run Cufflinks I run into the following issues. When I use a reference genome, I get, in addition to the known transcripts, a bunch of very long transcripts spanning very large genomic regions. Also, I have two genes that are very near each other but run in opposite directions (which you can see beautifully in the TopHat accepted hits alignments, with different colors for each strand), yet they merge into a single CUFF identifier. Is there any way I can address this? Is there something I am missing with respect to parameters I have to change because I am working on a bacterial genome?

Reference genome or reference gene annotation? Using a reference genome to correct for bias should not change the assembled transcripts, only their expression levels. You can use a reference gene annotation either as ground truth or as a guide; using the annotation as ground truth ensures that Cufflinks will only assemble transcripts defined in the annotation.

Good luck,
J.
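To make the three options above concrete, here is a minimal Python sketch that only builds (and does not run) a Cufflinks argument list for each case. It assumes the flags as I understand them from the Cufflinks manual: `-G/--GTF` for annotation as ground truth, `-g/--GTF-guide` for annotation as a guide, and `-b/--frag-bias-correct` for bias correction against a genome FASTA. The helper name `cufflinks_command` and all file names are hypothetical, chosen for illustration; check the flags against the Cufflinks version you have installed.

```python
from typing import List, Optional


def cufflinks_command(
    bam: str,
    mode: str = "de_novo",
    annotation: Optional[str] = None,
    bias_genome: Optional[str] = None,
    out_dir: str = "cufflinks_out",
) -> List[str]:
    """Build an argument list for cufflinks without executing it.

    mode (hypothetical labels for this sketch):
      "de_novo"      - assemble from the alignments alone
      "ground_truth" - pass the annotation with -G (annotated transcripts only)
      "guide"        - pass the annotation with -g (annotation guides assembly)
    """
    cmd = ["cufflinks", "-o", out_dir]

    if mode in ("ground_truth", "guide"):
        if annotation is None:
            raise ValueError(f"mode {mode!r} needs a reference annotation (GTF/GFF)")
        # Assumed flags: -G restricts to the annotation, -g uses it as a guide.
        cmd += ["-G" if mode == "ground_truth" else "-g", annotation]
    elif mode != "de_novo":
        raise ValueError(f"unknown mode: {mode!r}")

    if bias_genome is not None:
        # Assumed flag: -b takes a genome FASTA for fragment bias correction.
        # Per the reply above, this should affect expression levels only.
        cmd += ["-b", bias_genome]

    # The alignment file (e.g. TopHat's accepted hits) goes last.
    cmd.append(bam)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for mode in ("de_novo", "ground_truth", "guide"):
        ann = None if mode == "de_novo" else "genes.gtf"
        print(" ".join(cufflinks_command("accepted_hits.bam", mode, ann)))
```

Building the list rather than a shell string keeps file names with spaces intact if the command is later handed to `subprocess.run`.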
