Thanks Jeremy, I've changed the names from protein_id to p_id using text edit, and I'm going to try it again as soon as it's loaded back into the Galaxy site. I've been shy of using the UCSC site because I'm not a bioinformatics person and I'm learning on the fly, so I was not confident I would download the right GTF file for hg19 in one simple (with emphasis on the word simple!) step. If you know exactly how to do it, that'd be great. I'll let everyone know if the p_id thing fixes the ensemble.gtf file for good.
Let's first back up to the original issue, because I didn't address it correctly earlier.

Issue: Cuffdiff isn't producing all the output, and the problem is at least partially due to the GTF file being provided to it.

Solution: The GTF that you want to provide to Cuffdiff is not the reference GTF but the GTF generated by Cuffcompare; in particular, you want the GTF file of combined transcripts. So, at least with Cuffdiff, the problem is not where you got your reference GTF. The combined transcripts produced by Cuffcompare will have tss_id and p_id attributes included, so you won't need to worry about adding them.

Now, to the question of getting a reference gene annotation from UCSC. This is straightforward:

(a) in Galaxy tools, go to Get Data --> UCSC Main
(b) select clade/genome/assembly
(c) for group, use 'Genes and Gene Prediction Tracks'
(d) for track, use your favorite annotation; both RefSeq and Ensembl are good
(e) for table, choose the default (refGene for RefSeq, ensGene for Ensembl)
(f) region: genome
(g) output format: GTF
(h) make sure 'Send to Galaxy' is checked

That's it. Hope this helps,
J.
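
P.S. If you want to confirm that a GTF actually carries the tss_id and p_id attributes before handing it to Cuffdiff, something like the following rough Python sketch could help. It is not part of Galaxy or Cufflinks; the function names and the file name in the usage comment are made up for illustration, and it assumes the usual 9-column, tab-separated GTF layout with attributes written as key "value"; pairs.

```python
import re

# Matches GTF attribute pairs such as: gene_id "XLOC_000001";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"\s*;')


def parse_attributes(field):
    """Turn the 9th GTF column into a dict of attribute name -> value."""
    return dict(ATTR_RE.findall(field))


def count_missing(lines, required=("tss_id", "p_id")):
    """Count records and, for each required attribute, how many records lack it.

    `lines` is any iterable of GTF lines (an open file works).
    Blank lines, comment lines and lines with fewer than 9 columns are skipped.
    """
    missing = {name: 0 for name in required}
    total = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue
        total += 1
        attrs = parse_attributes(cols[8])
        for name in required:
            if name not in attrs:
                missing[name] += 1
    return total, missing


# Hypothetical usage (the file name is only an example):
# with open("combined_transcripts.gtf") as handle:
#     total, missing = count_missing(handle)
#     print(total, missing)
```

If the counts of missing attributes are close to the total number of records, the file is probably a plain reference GTF rather than the combined transcripts from Cuffcompare.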