Cufflinks reporting FPKM values of all zeroes (0)
Hello!
I have come across a problem where Cufflinks is reporting all FPKM values as zeroes (0). I have a unique RNA-Seq project from a collaborator who is studying eyesight using tree shrews. I found that Ensembl (http://useast.ensembl.org/Tupaia_belangeri/Info/Index) has the FASTA file for the tree shrew genome (only 2x coverage, so not very good in the first place) and had this file indexed in our local instance of Galaxy. I ran TopHat, and it looks as if TopHat ran fine because I'm getting anywhere from 71-80% properly paired when I check the stats using "Flagstat." I then take the accepted hits BAM file from TopHat plus the GTF RefGene file from Ensembl for tree shrew and load both into Cufflinks. Cufflinks seems to run okay, but when I inspect its three output files, all the FPKM values are 0.
I have two other RNA-Seq projects (human and mouse), and both of these worked fine through TopHat and Cuff(links/Compare/Diff) with a RefGene GTF file on our local instance of Galaxy (as well as on the Galaxy instance at Penn State), so it makes me think that both TopHat and Cufflinks are working okay.
I'm wondering if it has something to do with the tree shrew reference genome. Has anyone encountered anything like this? If so, how did you solve the problem? If not, do you have any suggestions as to what I can do next? Any help/info would be greatly appreciated.
Thanks,
David
Hello David,
There is most likely some mismatch between the input datasets. Some things to check:
First, double-check that the sequence identifiers in the reference genome exactly match those in the RefGene GTF file, and modify them if necessary.
Second, make certain that your RefGene file is sorted the same way as the BAM file (usually by reference genome chromosome/scaffold/etc. and then by start coordinate).
If neither of these solves the issue, Cufflinks has an email list for questions: tophat.cufflinks at gmail.com
These are the essentials from this prior Q/A thread concerning the same issue: http://lists.bx.psu.edu/pipermail/galaxy-user/2011-March/002267.html
Hopefully this helps! Best wishes for your project,
Jen
Galaxy team
(In reply to David K Crossman's message of 7/19/11, 12:52 PM.)
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/
-- Jennifer Jackson http://usegalaxy.org/ http://galaxyproject.org/
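The two checks suggested in Jen's reply (matching identifiers, consistent sorting) could be run outside Galaxy with a short Python script. This is only a rough sketch, not part of Cufflinks or Galaxy: the function names are made up for illustration, and the sort check assumes "sorted" means each sequence name forms one contiguous block with non-decreasing start coordinates. It does not compare against the actual sequence order in the BAM header, which would still need to be checked separately.

```python
def fasta_names(fasta_lines):
    """Collect sequence names (first word after '>') from FASTA header lines."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_records(gtf_lines):
    """Yield (seqname, start) for every non-blank, non-comment GTF line."""
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        yield cols[0], int(cols[3])  # column 1 = seqname, column 4 = start


def names_missing_from_reference(fasta_lines, gtf_lines):
    """Return GTF sequence names that do not occur in the reference FASTA."""
    reference = fasta_names(fasta_lines)
    annotated = {name for name, _ in gtf_records(gtf_lines)}
    return sorted(annotated - reference)


def is_coordinate_sorted(gtf_lines):
    """True if records are grouped by seqname and ordered by start within each group."""
    finished = set()   # sequence names whose block has already ended
    current = None
    last_start = 0
    for name, start in gtf_records(gtf_lines):
        if name != current:
            if name in finished:
                return False  # same sequence appears in two separate blocks
            if current is not None:
                finished.add(current)
            current, last_start = name, start
        elif start < last_start:
            return False      # start coordinate went backwards
        else:
            last_start = start
    return True
```

With real files, the line iterables would simply be open file handles, e.g. `names_missing_from_reference(open("genome.fa"), open("genes.gtf"))` (both file names are placeholders). A non-empty result means some annotated sequences can never overlap any aligned read, which is one plausible cause of FPKM values of 0.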
It appears that you are now using Ensembl data. Make sure the GTF is properly formatted: check whether "chr" has been added to the sequence names in the first column. It may be worth checking.
Vasu
(In reply to Jennifer Jackson's message of Tuesday, July 19, 2011, 5:17 PM.)
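Vasu's point about the first column could be handled with a small helper that rewrites the sequence names in a GTF file. This is a hedged sketch with hypothetical function names. Whether the prefix should be added or stripped depends on how the sequences are named in the indexed FASTA file, and for a scaffold-level assembly the names may not follow a "chr" convention at all, so inspect both files first.

```python
def add_prefix(gtf_lines, prefix="chr"):
    """Yield GTF lines with `prefix` prepended to column 1 where it is missing."""
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            yield line  # pass comments and blank lines through unchanged
            continue
        seqname, sep, rest = line.partition("\t")
        if not seqname.startswith(prefix):
            seqname = prefix + seqname
        yield seqname + sep + rest


def strip_prefix(gtf_lines, prefix="chr"):
    """Yield GTF lines with `prefix` removed from column 1 where it is present."""
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            yield line
            continue
        seqname, sep, rest = line.partition("\t")
        if seqname.startswith(prefix):
            seqname = seqname[len(prefix):]
        yield seqname + sep + rest
```

The rewritten lines can be written to a new file and uploaded to Galaxy in place of the original GTF; the original file is left untouched so the change is easy to undo.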
participants (3)
- David K Crossman
- Jennifer Jackson
- vasu punj