Hi Humberto,

Yes, my apologies, this should have been included in the original reply. The 'locus' field in the Cuffdiff files refers to a gene bound, not to individual transcripts. To get to the transcripts, you need to go back to the inputs to Cuffdiff. If you used Cuffmerge, the "merged transcripts" GTF file is the correct file to use as input to "Extract". If you used only Cuffcompare, use the "combined transcripts" GTF.

To find out which transcripts are associated with which gene bound, compare the attributes in the Cuffmerge merged transcripts GTF (9th column: gene_id, tss_id, etc.) with Cuffdiff's "gene_id" and "tss_id" values. Depending on the file, these values also appear in the test_id column. The comparison against a Cuffcompare GTF is similar.

You can filter on the GTF attributes with the tool "Filter and Sort -> Filter GTF data by attribute values_list". Cut out the column of interest from the Cuffdiff file ("Text Manipulation -> Cut"), edit as desired, and use it as the list filter. Or explore the other GFF filter options in the same tool group.

Take care,

Jen
Galaxy team

On 9/13/12 11:14 AM, Humberto Boncristiani wrote:
Hi
"Fetch Sequences -> Extract Genomic DNA" does not accept Cuffdiff files. Should I convert these files to a specific format?
Thanks,
Humberto.
*Dr. Humberto Boncristiani* National Research Council (NRC) Fellow Adjunct Research Associate Department of Biology Univ. North Carolina at Greensboro 312 Eberhart Bldg Greensboro, NC 27403, USA. Tel.:(1) 336-256-2591 Fax: (1) 336-334-5839 email: humbfb@gmail.com <mailto:humbfb@gmail.com>
On Sep 13, 2012, at 2:06 PM, Jennifer Jackson wrote:
Hello,
By no annotation, do you mean that species-specific annotation (GTF) was not used, and that you want to compare against a protein database like GenBank NR or RefSeq? If so, the instructions are below. Please let us know if you had something else in mind.
The sequence extraction can be done on Galaxy Main (if that is where you are working), but the BLAST will need to be run on a local or cloud install. To get set up (instance and data), start here: http://getgalaxy.org http://usegalaxy.org/cloud
The BLAST+ wrapper recently moved from the distribution to the Tool Shed, but there are installation tools integrated to help get this into your instance. See the latest News Brief for details (Sept 7, 2012) - these are also good to follow as you maintain your instance: http://wiki.g2.bx.psu.edu/News http://wiki.g2.bx.psu.edu/DevNewsBriefs/2012_09_07
Questions about local/cloud installs are best directed to the galaxy-dev@bx.psu.edu mailing list: http://wiki.g2.bx.psu.edu/Mailing%20Lists
To extract the transcript sequences, use the tool 'Fetch Sequences -> Extract Genomic DNA'. This will accept a custom reference genome from the history, if you have been using one, by changing the option "Source for Genomic Data:" to "History".
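Since the extraction tool works from interval data such as a transcripts GTF, it can help to narrow that GTF to the genes of interest first. Below is a minimal Python sketch of that filtering step done outside Galaxy. It is not how the Galaxy tools are implemented. It assumes the Cuffdiff file is tab-separated with a header row containing a `gene_id` column, and that the GTF 9th column holds attributes of the form `gene_id "XLOC_000001"; transcript_id "TCONS_00000001";`. The function names and the sample identifiers are hypothetical.

```python
import csv
import io
import re

# Matches one GTF attribute of the form: key "value"
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def parse_gtf_attributes(field):
    """Turn a GTF 9th-column string into a {key: value} dict."""
    return dict(ATTR_RE.findall(field))


def read_gene_ids(diff_text, column="gene_id"):
    """Collect the values of one column from tab-separated Cuffdiff text.

    Assumes the first line is a header that names the column.
    """
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    return {row[column] for row in reader}


def filter_gtf(gtf_text, wanted, key="gene_id"):
    """Return the GTF lines whose attribute `key` is in the set `wanted`."""
    kept = []
    for line in gtf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        if parse_gtf_attributes(cols[8]).get(key) in wanted:
            kept.append(line)
    return kept
```

To use it, read both files into strings, call `read_gene_ids` on the Cuffdiff text, pass the resulting set to `filter_gtf` together with the GTF text, and write the kept lines to a new GTF for the extraction step. Passing `key="tss_id"` (with the matching Cuffdiff column) would filter on TSS groups instead.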
Hopefully this helps,
Jen Galaxy team
On 9/13/12 10:09 AM, Humberto Boncristiani wrote:
Hi.
I have Cuffdiff files with gene differential expression results. I don't have the annotation, so I need to extract the sequences from the genome coordinates and then BLAST them to identify the genes. What is the easiest way to do this?
Thanks.
Humberto
-- Jennifer Jackson http://galaxyproject.org