Hello Delong,

Duplicated GFF IDs are not permitted in reference annotation inputs for this tool suite. There are a few options.

1 - edit the file to remove/reduce the duplicates. There could be scientific consequences when doing this, so consider carefully.

2 - use another source. iGenomes is a recommended option. An added benefit is that these files contain additional attributes in the 9th field that the tools use, which enables full functionality. You can read about this in the "inputs" section for each tool in the manual, which I'll link below.

The human iGenomes gtf file is already in the public Main Galaxy instance under Shared Data -> Data Libraries -> iGenomes. Or, you can download the original data from the Cufflinks web site, extract the gtf, and load it wherever you are using Galaxy (local, cloud, other public instance).

http://wiki.galaxyproject.org/Support#Tools_on_the_Main_server
     Example RNA-seq analysis tools.
http://cufflinks.cbcb.umd.edu/manual.html
http://cufflinks.cbcb.umd.edu/igenomes.html
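
For option 1, a minimal Python sketch like the one below may help locate the affected records before you edit anything. It assumes a duplicate means the same transcript_id appearing on more than one sequence name (column 1). That is a guess about this file, not a description of how cuffcompare itself performs the check. The function names are made up for illustration.

```python
import gzip
import re
from collections import defaultdict

# Matches the transcript_id attribute in the 9th GTF field.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def find_duplicate_transcript_ids(gtf_lines):
    """Return {transcript_id: sorted sequence names} for IDs seen on more than one sequence.

    Assumption: "duplicate" means one transcript_id on several sequences.
    """
    seen = defaultdict(set)
    for line in gtf_lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete 9-column GTF record
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            seen[match.group(1)].add(fields[0])
    return {tid: sorted(names) for tid, names in seen.items() if len(names) > 1}


def report_duplicates(path):
    """Print each repeated transcript_id with the sequences it occurs on."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        duplicates = find_duplicate_transcript_ids(handle)
    for tid, names in sorted(duplicates.items()):
        print(tid, ",".join(names), sep="\t")
    return duplicates
```

Calling report_duplicates("gencode.v17.annotation.gtf.gz") on the downloaded file would list any IDs that meet this definition. The sketch only reports them; which records to remove, if any, remains a scientific decision.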

Best,

Jen
Galaxy team


On 7/4/13 6:30 AM, Delong, Zhou wrote:
Hello,
I was doing an RNA analysis and wanted to compare the transcription and expression of two samples using a reference annotation. However, this is the error message I got:

=====Quote=====
Error running cuffmerge. 
[Thu Jul  4 07:32:59 2013] Beginning transcriptome assembly merge
-------------------------------------------

[Thu Jul  4 07:32:59 2013] Preparing output location cm_output/
[Thu Jul  4 07:34:07 2013] Converting GTF files to SAM
[07:34:07] Loading reference annotation.
[07:34:07] Loading reference annotation.
[Thu Jul  4 07:34:08 2013] Quantitating transcripts
You are using Cufflinks v2.1.1, which is the most recent release.
Command line:
cufflinks -o cm_output/ -F 0.05 -g /galaxy/main_pool/pool7/files/006/446/dataset_6446730.dat -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 4 cm_output/tmp/mergeSam_fileIO17rb 
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
File cm_output/tmp/mergeSam_fileIO17rb doesn't appear to be a valid BAM file, trying SAM...
[07:34:08] Loading reference annotation.
[07:35:53] Inspecting reads and determining fragment length distribution.
Processed 33854 loci.                       
Map Properties:
	Normalized Map Mass: 8719.00
	Raw Map Mass: 8719.00
	Fragment Length Distribution: Truncated Gaussian (default)
	              Default Mean: 200
	           Default Std Dev: 80
[07:35:53] Assembling transcripts and estimating abundances.
Processed 33854 loci.                       
[Thu Jul  4 07:39:29 2013] Comparing against reference file /galaxy/main_pool/pool7/files/006/446/dataset_6446730.dat
You are using Cufflinks v2.1.1, which is the most recent release.
Error: duplicate GFF ID 'ENST00000361547.2' encountered!
	[FAILED]
Error: could not execute cuffcompare

======End quote======

The job runs fine without the reference annotation.
The annotation file I used can be downloaded here:
ftp://ftp.sanger.ac.uk/pub/gencode/release_17/gencode.v17.annotation.gtf.gz

Can anyone help me please?
Thanks,
Delong

-- 
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org