Dear Galaxy,

I know this issue has been discussed multiple times, but I think what I'm trying to do is a little bit different and I wanted to see if it is viable.

Some time ago, I used bowtie and the included index for Mus musculus to do an alignment. Now I'm looking to use Cufflinks/compare/diff for expression. I annotated my alignments (SAM files) with the GFF files found here:

ftp://biomirror.aarnet.edu.au/biomirror/ncbigenomes/M_musculus/special_requests/gff3/

I've tried running Cufflinks but have gotten 0 FPKM for all genes when using a reference annotation such as Ensembl or UCSC. If I were to upload the GFF files I posted here, concatenate them, and then use that as the reference, would it work? I'm going to try, but just thought I'd throw it out there first. The main issue is that in my SAM/BAM files the reference names are GenBank IDs (gi|#|), while all the other references I tried use chr1 and the like. Now that I think about it, I might try just downloading the NCBI GTF from iGenomes.

Regardless, let me know what you think! Thanks!

--
Richard Linchangco
PSM in Bioinformatics
College of Computing and Informatics
University of North Carolina at Charlotte
Tel: (630)440-7010
rlinchan@uncc.edu, rlinch2@gmail.com
Hello Richard,

The best way to evaluate the files is to compare them with the specifications in the Cufflinks documentation (http://cufflinks.cbcb.umd.edu/gff.html) and then to run them through the tool. However, I can let you know that from a quick look, many of the key attributes are not in these files (transcript start site, gene name).

This is one of the areas that can be the trickiest to get working. Identifiers (specifically chromosome identifiers) must match exactly, file formats must be in spec, and there are certain attributes in the GTF/GFF files that Cufflinks will take advantage of if present (and ignore if not). The issues with the UCSC and Ensembl GTF files in Cufflinks were also likely due to missing attributes in the ninth field.

iGenomes does have datasets that were specifically annotated to have a full complement of these attributes. You can use either UCSC's data for mm9 or mm10 at Galaxy (mm10 is new in Galaxy - today! - so you would need to run TopHat again).

http://cufflinks.cbcb.umd.edu/igenomes.html

Best,
Jen
Galaxy team

On 6/6/12 12:32 PM, Richard Linchangco wrote:
Dear Galaxy, I know this issue has been discussed multiple times but I think what I'm trying to do is a little bit different and I wanted to see if it is viable.
Some time ago, I used bowtie and the included index for Mus musculus to do an alignment. Now I'm looking to use Cufflinks/compare/diff for expression. I annotated my alignments (SAM files) with the GFF files found here:
ftp://biomirror.aarnet.edu.au/biomirror/ncbigenomes/M_musculus/special_requests/gff3/
I've tried running Cufflinks but have gotten 0 FPKM for all genes when using a reference annotation such as Ensembl or UCSC. If I were to upload the GFF files I posted here, concatenate them, and then use that as the reference, would it work? I'm going to try, but just thought I'd throw it out there first. The main issue is that in my SAM/BAM files the reference names are GenBank IDs (gi|#|), while all the other references I tried use chr1 and the like. Now that I think about it, I might try just downloading the NCBI GTF from iGenomes. Regardless, let me know what you think! Thanks!
-- Jennifer Jackson http://galaxyproject.org
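
For anyone hitting the same 0 FPKM symptom, the two points raised in the reply (chromosome identifiers must match exactly, and the ninth GTF field should carry the expected attributes) can be checked before running Cufflinks. The following is only a rough sketch in plain Python, not part of Galaxy or Cufflinks. The function names, the sample identifiers, and the list of optional attributes (`tss_id`, `gene_name`, `p_id`) are assumptions made for illustration; the Cufflinks documentation linked above is the authority on which attributes are actually used.

```python
import re

def sam_reference_names(sam_lines):
    """Collect reference sequence names from the @SQ header lines of a SAM file."""
    names = set()
    for line in sam_lines:
        if not line.startswith("@"):
            break  # the header ends where alignment records begin
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names

def gtf_summary(gtf_lines):
    """Return (sequence names, attribute keys) seen in a GTF file."""
    seqnames, keys = set(), set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete 9-column record
        seqnames.add(cols[0])
        # GTF attributes look like: key "value"; key "value";
        keys.update(re.findall(r'(\w+)\s+"[^"]*"', cols[8]))
    return seqnames, keys

# Made-up sample data standing in for real files (hypothetical identifiers).
sam_header = [
    "@HD\tVN:1.0\tSO:unsorted\n",
    "@SQ\tSN:gi|000000|ref|EXAMPLE_1|\tLN:1000\n",
]
gtf = [
    'chr1\texample\texon\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
]

sam_names = sam_reference_names(sam_header)
gtf_names, attr_keys = gtf_summary(gtf)

# Names present in both files; an empty set means no read can be
# assigned to any annotated transcript.
shared = sam_names & gtf_names
print("shared sequence names:", sorted(shared))

# Assumed list of optional attributes worth looking for.
optional = {"tss_id", "gene_name", "p_id"}
print("optional attributes not found:", sorted(optional - attr_keys))
```

With real data, the two lists would be replaced by open file handles, for example `open("alignments.sam")` and `open("genes.gtf")` (both file names hypothetical). If no sequence names are shared, as with `gi|#|` names in the alignments and `chr1`-style names in the annotation, the alignments and the annotation refer to different naming schemes and one of them has to be regenerated or renamed.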