Cufflinks error with Illumina iGenomes .GTF for annotation
Hi all,

I am attempting to use tophat>cufflinks>cuffmerge>cuffdiff to compare transcript expression in 3 samples (no replicates, Illumina single-end reads). Using the built-in UCSC mm9 reference genome I can complete the analysis just fine, with the caveat that there is no annotation.

When I repeat the analysis using the Illumina iGenomes UCSC mm9 .gtf annotation file, I get the following error in Cufflinks:

An error occurred running this job:
cufflinks v1.3.0
cufflinks -q --no-update-check -I 300000 -F 0.100000 -j 0.150000 -p 8 -G /galaxy/main_pool/pool5/files/004/309/dataset_4309547.dat -N
Error running cufflinks. return code = -11
cufflinks: /lib64/libz.so.1: no version information available

I have set the identifier/build as "Mouse July 2007 (NCBI37/mm9) (mm9)", so that does not seem to be the problem. Suggestions as to how to fix this problem OR add annotations to the already completed analysis would be terrific.

Thanks!
Sarah
Hello Sarah,

This specific error code has been seen before from Cuffdiff when there is a format problem with the GTF file. A few things to double check:

1) That you downloaded the iGenomes .tar archive to your local computer or server, unpacked it using a utility or on the command line (tar -xvf), then uploaded *only* the .gtf annotation file to Galaxy? Also confirm that the Galaxy datatype is gtf and that there are no metadata problems. These usually show up as warnings within a yellow box, either in the dataset or on the "Edit Attributes" form (click on the pencil icon for the dataset).

This particular archive is known to give a few problems while unpacking (various warnings due to the age of the software used to create the archive, and not all data is usable), but the .gtf data is intact and is small enough to load through a direct browser upload. FTP is not needed, but you could use FTP if you wanted. If you needed to use FTP because of the data size, or you loaded via a URL directly from the source, then you very likely loaded the .tar archive itself and will need to start over: download the .tar, unpack it, and load just the .gtf data.

2) When you uploaded the .gtf file to Galaxy, you *did not check* the box next to "Convert spaces to tabs:". Both the original and the uploaded .gtf file should have nine tab-delimited columns of data. If you have 12 columns, then the box was checked and the format is incorrect. You will want to reload the .gtf dataset (without converting spaces to tabs), confirm after loading that the GTF format is correct and assigned, and re-run cuffdiff.

If you have any problems or are unsure about how your data fits with these checks, please submit a bug report directly from the failed CuffDiff run, from the job as executed on the public main Galaxy instance. Be sure to leave all inputs and the error dataset undeleted in your history (undelete if necessary).

Hopefully this helps!

Jen
Galaxy team

On 6/1/12 10:27 AM, Sarah Elisabeth Ewald wrote:
Hi all,
I am attempting to use tophat>cufflinks>cuffmerge>cuffdiff to compare transcript expression in 3 samples (no replicates, illumina single-end reads). Using the built in UCSC mm9 reference genome I can complete the analysis just fine, with the caveat that there is no annotation.
When I repeat the analysis using the illumina igenome UCSC mm9 .gtf annotation file I get the following error in Cufflinks:
An error occurred running this job:
cufflinks v1.3.0
cufflinks -q --no-update-check -I 300000 -F 0.100000 -j 0.150000 -p 8 -G /galaxy/main_pool/pool5/files/004/309/dataset_4309547.dat -N
Error running cufflinks. return code = -11
cufflinks: /lib64/libz.so.1: no version information available
I have set the identifier/build as "Mouse July 2007 (NCBI37/mm9) (mm9)", so that does not seem to be the problem. Suggestions as to how to fix this problem OR add annotations to the already completed analysis would be terrific.
Thanks!
Sarah
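The two checks in Jen's reply (that the upload is not the .tar archive itself, and that every data line has nine tab-delimited columns) can be run locally before uploading. The following is only a minimal sketch in Python, not part of Galaxy or Cufflinks. The function names and the file name genes.gtf are hypothetical, and it assumes a plain-text, uncompressed GTF file in which lines starting with "#" are comments.

```python
import tarfile
from collections import Counter


def gtf_column_counts(path):
    """Count how many data lines have each number of tab-separated fields."""
    counts = Counter()
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            # Skip blank lines and comment lines (assumed to start with '#').
            if not line or line.startswith("#"):
                continue
            counts[len(line.split("\t"))] += 1
    return counts


def looks_like_valid_gtf(path):
    """Hypothetical helper: True if the file is not a tar archive and
    every data line has exactly nine tab-separated fields."""
    if tarfile.is_tarfile(path):
        # The archive itself was supplied instead of the unpacked .gtf.
        return False
    counts = gtf_column_counts(path)
    return bool(counts) and set(counts) == {9}


if __name__ == "__main__":
    import sys

    # Usage (file name is an assumption): python check_gtf.py genes.gtf
    gtf_path = sys.argv[1] if len(sys.argv) > 1 else "genes.gtf"
    print("fields per line:", dict(gtf_column_counts(gtf_path)))
    print("looks valid:", looks_like_valid_gtf(gtf_path))
```

If the reported field count is anything other than 9 for every line, the file was most likely altered on upload (for example by "Convert spaces to tabs") and should be reloaded from the unpacked original.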
-- Jennifer Jackson http://galaxyproject.org