Hi Jen,
Thank you for the information regarding FastQ. It was really helpful.
Lately, I have been getting the following error: "Error getting history update from this server - Bad Gateway". This occurred after I tried to re-upload some pre-aligned and indexed BAM files from NCBI GEO, because I was hoping to generate and retrieve FPKM/RPKM values from them.
This has now been resolved; very sorry for the confusion it caused.
Unfortunately, my old files are still not available on Galaxy, and I get an Internal Server Error when trying to retrieve them, although I can get the workflow for them.
Same, resolved now.
The last weird error is that when I use Cuffdiff, I get an FPKM of 0 with p/q values of 1 all the time. This should not be the case, as the BAM files are from two different organs. It happens for every single gene, which indicates that something is wrong. I was able to retrieve the GTF file from UCSC main with the following settings:
Insect - D. pseudoobscura; Group: Genes and Gene Prediction Tracks; Track: Flybase; Table: FlybaseGene; Output format: GTF.
On 5/19/13 1:35 PM, Zain A Alvi wrote:
I was wondering whether these settings are fine, or whether I should change the Group to mRNA or adjust some other setting. The one that is available on UCSC is the old dp3 file from 2004, though. The latest GFF on FlyBase is 3.1. Is there any way to convert it to a GTF file?

Hi Zain,
I can't recommend a conversion tool, but there are a few on the web that could be tested out, if you decide to go that route. I do know that certain GFF3 files directly from FlyBase have been problematic with the RNA-seq tools due to duplicated "ID" attributes. I don't know whether this affects all versions or just the dm3 version. That said, the issue has been isolated to a few records (a gene mapping to >1 location), and there isn't any reason why you shouldn't test out the /D. pseudoobscura/ version and then adjust it, if needed.

The GTF file from the UCSC Table browser is correct, but Cuffdiff is looking for attributes that this version of the file does not have. If you look at the 9th field of the file to examine these attributes and compare it to the Cuffdiff input documentation, you can see how they differ. The gene_id and transcript_id have the same value, and other attributes, such as tss_id and p_id, are not present. There is nothing wrong with the file, but without these attributes populated in a particular way, certain calculations will not be done.
http://cufflinks.cbcb.umd.edu/manual.html

These variations are just different projects following slightly different file specifications. Some are content variations, some are format variations. This is common with this file type family (GFF, GTF, GFF3). This is why iGenomes creates files specifically for certain genomes for use with this tool set.

When you do obtain a file that has the format and content you want to use, double check that the chromosome names are *exactly* the same between the reference genome, Tophat output, and GTF or GFF3 file. Mismatches can also lead to calculations being missed.
http://wiki.galaxyproject.org/Support#Tools_on_the_Main_server

iGenomes did not produce a file for fruit fly, but you could request one from them. This is where they publish the data for other genomes, and there is a link to the project at the top of the page:
http://cufflinks.cbcb.umd.edu/igenomes.html

Good luck with your project,
Jen
Galaxy team
Sorry for so many questions. Thank you again for the great help.
Sincerely,
Zain
-- Jennifer Hillman-Jackson Galaxy Support and Training http://galaxyproject.org
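
The two checks suggested in the reply above (looking at the 9th field for gene_id, transcript_id, tss_id and p_id, and comparing chromosome names exactly) could be scripted roughly as follows. This is only a minimal sketch using the Python standard library, not part of Galaxy or Cufflinks. The function names, the sample records, and the chromosome names "chr2" and "2" are hypothetical and chosen for illustration; real files would be read from disk instead.

```python
import re
from collections import Counter

# Matches quoted GTF attribute pairs such as: gene_id "CG1-RA";
# Unquoted values are ignored by this simple pattern.
_ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"\s*;?')


def parse_gtf_attributes(field9):
    """Return the quoted key/value pairs of the 9th GTF column as a dict."""
    return dict(_ATTR_RE.findall(field9))


def summarize_gtf(lines, wanted=("gene_id", "transcript_id", "tss_id", "p_id")):
    """Count how many records carry each wanted attribute, how many have
    gene_id equal to transcript_id, and which chromosome names occur."""
    present = Counter()
    chroms = set()
    same_id = 0
    total = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        total += 1
        chroms.add(cols[0])
        attrs = parse_gtf_attributes(cols[8])
        for key in wanted:
            if key in attrs:
                present[key] += 1
        if "gene_id" in attrs and attrs["gene_id"] == attrs.get("transcript_id"):
            same_id += 1
    return {
        "records": total,
        "present": dict(present),
        "same_gene_and_transcript_id": same_id,
        "chroms": chroms,
    }


def chrom_mismatches(gtf_chroms, reference_chroms):
    """List names found in only one of the two sets (exact, case-sensitive)."""
    gtf_chroms, reference_chroms = set(gtf_chroms), set(reference_chroms)
    return {
        "only_in_gtf": sorted(gtf_chroms - reference_chroms),
        "only_in_reference": sorted(reference_chroms - gtf_chroms),
    }


# Hypothetical records shaped like a Table Browser GTF export:
# gene_id and transcript_id are identical, tss_id and p_id are absent.
sample = [
    'chr2\tflyBaseGene\texon\t100\t200\t0.0\t+\t.\tgene_id "CG1-RA"; transcript_id "CG1-RA";',
    'chr2\tflyBaseGene\tCDS\t120\t180\t0.0\t+\t0\tgene_id "CG1-RA"; transcript_id "CG1-RA";',
]
summary = summarize_gtf(sample)
print(summary["present"], summary["same_gene_and_transcript_id"])
# Hypothetical reference naming without the "chr" prefix.
print(chrom_mismatches(summary["chroms"], {"2"}))
```

To use it on a real file, the list of lines can be replaced with an open file handle, and the reference names can be taken from the FASTA headers or the BAM header of the Tophat output. Any name reported under only_in_gtf or only_in_reference would need to be reconciled before running Cuffdiff.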