October 2011 - galaxy-user - lists.galaxyproject.org

New Genome Load Request
by Mark Guiltinan 26 Jan '12

Hello Galaxy,

Would you please perform the following new genome load?

Theobroma cacao

All the necessary resource files can be found and downloaded at:
http://cocoagendb.cirad.fr/gbrowse/download.html

Thank you,
Mark Guiltinan

Mark Guiltinan
Professor of Plant Molecular Biology
Penn State University
Department of Horticulture
422 Life Sciences Building
University Park, PA 16802-5807
Phone 814 863-7957
mjg9(a)psu.edu
Web Site: http://guiltinanlab.cas.psu.edu

Deleted history
by Herve Rhinn 19 Jan '12

Hi,

I have not used Galaxy for a few months, and it seems that at least one of my histories was deleted. I assume this is because of prolonged inactivity or because I exceeded the new quotas. Is there by any chance a way to regain access to it temporarily so that I can download some of the data, or has it been erased completely?

Thanks a lot,
Herve Rhinn

Patch for better FASTQ description handling
by Florent Angly 30 Nov '11

Hi,

I have found an issue with the way FASTQ read descriptions are handled by the Galaxy utilities:
https://bitbucket.org/galaxy/galaxy-central/issue/665/paired-end-code-misha…

Please consider pulling my patch.

Thanks,
Florent
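
For context, a FASTQ title line starts with "@" and carries a read identifier, optionally followed by whitespace and a free-text description. The sketch below only illustrates that distinction. It is not Florent's patch and not Galaxy's code, and the function name split_fastq_header is hypothetical.

```python
def split_fastq_header(header_line):
    """Split a FASTQ title line into (identifier, description).

    Hypothetical helper for illustration only; it is not part of Galaxy.
    """
    if not header_line.startswith("@"):
        raise ValueError("FASTQ title lines must start with '@'")
    # Drop the leading '@' and any trailing line ending.
    title = header_line[1:].rstrip("\r\n")
    # The identifier is everything up to the first run of whitespace;
    # whatever follows is the optional description.
    parts = title.split(None, 1)
    identifier = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return identifier, description
```

Code that pairs reads by name would typically compare only the identifier part, so a description after the identifier should not change which reads are treated as mates.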

Re: [galaxy-user] Names for genes in RNA-Seq analysis (Emilie Chautard)
by Emilie Chautard 29 Nov '11

Hi Olivier,

Did you try running Cuffcompare (part of Cufflinks) on your results? According to the Cufflinks manual (http://cufflinks.cbcb.umd.edu/manual.html):

> Cufflinks includes a program that you can use to help analyze the transfrags you assemble. The program cuffcompare helps you:
> - Compare your assembled transcripts to a reference annotation
> [...]

In the Galaxy version of Cuffcompare, I think you can provide a reference annotation file using "Use Reference Annotation:", which will be compared with your Cufflinks results. It makes a "union" of the transcripts obtained with Cufflinks and the annotation file (both in *.gtf format). You can then obtain a transcript identifier for the transcripts that are already annotated. It also provides a class code for each transcript, which can indicate, for example, a potential isoform.

Hope this helps.
Emilie

--
Emilie Chautard, PhD
Postdoctoral Fellow
Ontario Institute for Cancer Research
MaRS Centre, South Tower
101 College Street, Suite 800
Toronto, Ontario, Canada M5G 0A3
Tel: 416-673-8518
Toll-free: 1-866-678-6427
www.oicr.on.ca

> Message: 7
> Date: Thu, 20 Oct 2011 15:12:45 +0200
> From: GANDRILLON OLIVIER <olivier.gandrillon(a)univ-lyon1.fr>
> To: "galaxy-user(a)bx.psu.edu" <galaxy-user(a)bx.psu.edu>
> Subject: [galaxy-user] Names for genes in RNA-Seq analysis
> Message-ID: <CAC5EAED.8E99%olivier.gandrillon(a)univ-lyon1.fr>
> Content-Type: text/plain; charset="windows-1252"
>
> Hello
>
> I am using Galaxy to analyse RNA-seq libraries made from chicken cells.
>
> I just groomed my sequences, passed them through TopHat and then Cufflinks.
>
> This worked well and in the end I get a list of genes and their respective
> FPKM values.
>
> My only problem is that the names of the genes do not appear in the
> listing; they are simply referenced as "CUFF.1, CUFF.2, " etc?
>
> Could you please tell me how I could obtain gene names? (I went through the
> FAQ and could not get the answer).
>
> Sincerely
>
> Olivier
>
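
As a rough illustration of the suggestion above, the following sketch reads a Cuffcompare combined GTF and maps each original Cufflinks identifier to a gene name and class code. It assumes that the attribute column carries keys named oId, gene_name and class_code; those key names and both function names are assumptions to check against your own output file.

```python
import re

# Matches GTF attribute pairs of the form: key "value";
ATTRIBUTE_PATTERN = re.compile(r'(\w+)\s+"([^"]*)"')


def parse_gtf_attributes(attribute_field):
    """Turn the ninth GTF column into a dict of attribute name -> value."""
    return dict(ATTRIBUTE_PATTERN.findall(attribute_field))


def map_cufflinks_ids_to_gene_names(gtf_lines):
    """Map original Cufflinks IDs to (gene_name, class_code).

    Assumes the attributes 'oId', 'gene_name' and 'class_code' are present;
    records without both an 'oId' and a 'gene_name' are skipped.
    """
    names = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attributes = parse_gtf_attributes(fields[8])
        original_id = attributes.get("oId")
        gene_name = attributes.get("gene_name")
        if original_id and gene_name:
            names[original_id] = (gene_name, attributes.get("class_code", ""))
    return names
```

Transcripts that do not overlap anything in the reference annotation would simply have no gene name to report, so they stay out of the mapping.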

cufflinks error
by jh yu 03 Nov '11

Dear all,

Recently I have been using Cufflinks to analyze differential expression between different conditions, but one run failed with the following error:

An error occurred running this job:
cufflinks v1.0.3
cufflinks -q --no-update-check -I 1 -F 0.050000 -j 0.050000 -p 8 -g /galaxy/main_database/files/002/991/dataset_2991920.dat
Error running cufflinks.
[bam_header_read] EOF marker is absent.
[bam_header_read] invalid BAM binary header (this

However, when I used the same parameters to analyze another file, it worked well:

19,904 lines
format: gtf, database: rhodRHA1
Info: cufflinks v1.0.3
cufflinks -q --no-update-check -I 1 -F 0.050000 -j 0.050000 -p 8 -g /galaxy/main_database/files/002/991/dataset_2991920.dat

The only difference is the size of the input files: the input file for the failed run is 23 GB, while the input file for the successful run is 3.5 GB. Is the size causing the failure?

Thank you in advance.

Best wishes!

Sincerely,
Jinhai YU

Jinhai YU, Ph.D Candidate
010-64888521
Institute of Biophysics, Chinese Academy of Sciences,
15 Datun Road, Chaoyang District, Beijing, 100101, China
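
The "EOF marker is absent" message refers to the empty 28-byte BGZF block that the SAM/BAM specification places at the end of a complete BAM file. A missing marker often points to a truncated file, for example an interrupted upload, though that is not confirmed for this job. The sketch below checks a local copy for the marker; the function names are hypothetical.

```python
import os

# The 28-byte empty BGZF block that terminates a complete BAM file,
# as given in the SAM/BAM format specification.
BGZF_EOF_MARKER = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "4243" "0200"
    "1b00" "0300" "00000000" "00000000"
)


def tail_is_bgzf_eof(tail_bytes):
    """Return True if the given bytes end with the BGZF EOF marker."""
    return tail_bytes[-len(BGZF_EOF_MARKER):] == BGZF_EOF_MARKER


def bam_has_eof_marker(path):
    """Check whether the file at 'path' ends with the BGZF EOF marker."""
    marker_length = len(BGZF_EOF_MARKER)
    if os.path.getsize(path) < marker_length:
        return False  # too short to contain the marker at all
    with open(path, "rb") as handle:
        # Read only the last 28 bytes so that large files are cheap to check.
        handle.seek(-marker_length, os.SEEK_END)
        return tail_is_bgzf_eof(handle.read())
```

If the check fails on the original file as well, the file was probably incomplete before it reached Galaxy; if it only fails on the uploaded copy, re-uploading is the likelier fix.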

cufflinks set parameters
by dongdong zhaoweiming 03 Nov '11

Hi,

This is my first time using Cufflinks in Galaxy, and two of the options confuse me:

Perform Bias Correction: No / Yes
Bias detection and correction can significantly improve accuracy of transcript abundance estimates.

Set Parameters for Paired-end Reads? (not recommended): No / Yes

What is bias correction, and how does it work? And for "Set Parameters for Paired-end Reads? (not recommended)", what is the important difference between the recommended and the not-recommended setting?

Thanks a lot!
dongdong

data format
by Klaudyna Borewicz 03 Nov '11

Hi,

I would like to use Galaxy to run LEfSe, but I don't know how to get the data into the tabular format that is required (http://huttenhower.org/galaxy/tool_runner?tool_id=LEfSe_for)

My data are 454 FASTA files that I analyzed with RDP to get the classification. That works fine: I get a .txt file that I can load into Galaxy, and it looks like this:

norank Root 37646
unclassified_Root 9
domain Bacteria 37637
unclassified_Bacteria 5998
phylum OD1 0
unclassified_OD1 0
genus OD1_genera_incertae_sedis 0
phylum BRC1 0
unclassified_BRC1 0
genus BRC1_genera_incertae_sedis 0
phylum Deferribacteres 0
unclassified_"Deferribacteres" 0
class Deferribacteres 0
unclassified_Deferribacteres 0
order Deferribacterales 0
unclassified_Deferribacterales 0
family Deferribacterales_incertae_sedis 0
unclassified_Deferribacterales_incertae_sedis 0
genus Caldithrix 0
family Deferribacteraceae 0
unclassified_Deferribacteraceae 0
genus Calditerrivibrio 0
genus Mucispirillum 0
.....

But I need to have the labels in a hierarchical organization, and I cannot find a way to get it to work. Please let me know if you have any suggestions, or whether RDP is just not the way to go.

Thank you, and I hope to hear from you soon,
Klaudyna

--
____________________________________________________
Klaudyna Borewicz M.Sc, B.Sc
Department of Veterinary and Biomedical Sciences
University of Minnesota
1971 Commonwealth Ave.
St. Paul, Minnesota 55108
Phone: (612)624-6226
FAX: (612)625-5203
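
One possible way to get hierarchical labels is sketched below, under several assumptions: that the RDP rows appear in depth-first order, that each ranked row can be read as (rank, name, count), that the "unclassified_" rows have been dropped beforehand, and that "|" is the separator wanted between levels. The rank list and the function name build_lineages are illustrative, not part of RDP or LEfSe.

```python
# Assumed rank order, taken from the ranks visible in the listing above.
RANK_ORDER = ["norank", "domain", "phylum", "class", "order", "family", "genus"]


def build_lineages(rows):
    """Convert depth-first (rank, name, count) rows into (lineage, count).

    Each lineage joins the names of all ancestors with '|', for example
    'Root|Bacteria|OD1'.
    """
    depth_of = {rank: index for index, rank in enumerate(RANK_ORDER)}
    stack = []  # (depth, name) pairs for the current path from the root
    lineages = []
    for rank, name, count in rows:
        depth = depth_of[rank]
        # Leave any branch that is at the same level or deeper than this row.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        stack.append((depth, name))
        lineages.append(("|".join(node for _, node in stack), count))
    return lineages
```

With one such table per sample, the counts could then be merged on the lineage label to give one row per taxon and one column per sample.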

Downloading files
by KNIGHT M.R. 02 Nov '11

Is there any way to download processed files by, e.g., FTP rather than "Save as" in the web browser? I have a large (9.6 GB) BAM file to download, and downloading it via the browser seems unstable for some reason.

Thanks,
Marc.

Removing low quality reads
by Getiria Onsongo 02 Nov '11

Galaxy Users,

I would like to filter a .bam file to remove reads with low mapping quality, especially ambiguously mapped reads (MAPQ = 0). I can easily do this with the command-line version of samtools, as shown below.

samtools view -bq 20 hba1.bam > hba1_MAPQ20.bam

None of the tools available under "NGS:SAM Tools" (e.g., Generate pileup and Filter SAM) provides an option for removing reads with low mapping quality.

The history at http://main.g2.bx.psu.edu/u/onsongo/h/obtaininghighqualityreads shows the results I would like to obtain.

Data 2 shows the results of Picard tools SAM/BAM Alignment Summary Metrics <http://main.g2.bx.psu.edu/tool_runner?tool_id=PicardASMetrics> on hba1.bam, which contains reads with MAPQ values less than 20. As shown in this summary HTML, PF_READS_ALIGNED = 775 and PF_HQ_ALIGNED_READS = 241.

Data 4 shows the results of Picard tools SAM/BAM Alignment Summary Metrics <http://main.g2.bx.psu.edu/tool_runner?tool_id=PicardASMetrics> on hba1_MAPQ20.bam, which contains only reads with MAPQ greater than or equal to 20. As shown in this summary HTML, PF_READS_ALIGNED = 241 and PF_HQ_ALIGNED_READS = 241.

Is there a way in Galaxy to filter a BAM file to remove reads with low mapping quality, similar to the samtools command line shown above?

Thanks,
Getiria

--
Getiria Onsongo, Ph.D.
Bioinformatics Research Scientist
Masonic Cancer Center, University of Minnesota
Minneapolis, MN 55455
Phone: 612-625-0101
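
To make the filtering rule concrete, here is a minimal sketch of what the -q 20 threshold does, applied to SAM text lines rather than to a binary BAM file. It is not a Galaxy tool, the function name filter_sam_by_mapq is hypothetical, and silently skipping malformed lines is a choice made for this example.

```python
def filter_sam_by_mapq(sam_lines, min_mapq=20):
    """Yield header lines plus alignments whose MAPQ is at least min_mapq.

    MAPQ is the fifth tab-separated field of a SAM alignment line.
    """
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines are always kept
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # SAM alignments have 11 mandatory fields
        if int(fields[4]) >= min_mapq:
            yield line
```

For example, list(filter_sam_by_mapq(open("hba1.sam"))) would keep the header and every alignment with MAPQ of 20 or more, where hba1.sam is assumed to be a SAM-format copy of the BAM file above.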

Re: [galaxy-user] Names for genes in RNA-Seq analysis
by Jennifer Jackson 02 Nov '11

Hello Olivier,

When deleting data, it takes the server a short amount of time to refresh. It may take a bit longer right now since many people are performing this action at the same time.

For the RNA-seq analysis question, reference annotation GTF files are used by the Cuff* programs. (These are different from the result GTF files produced by the programs.) For reference annotation GTF files, there are many sources, including Ensembl and UCSC.

Here are links to a tutorial and an FAQ that can help with your usage question:
http://main.g2.bx.psu.edu/u/jeremy/p/galaxy-rna-seq-analysis-exercise
http://main.g2.bx.psu.edu/u/jeremy/p/transcriptome-analysis-faq

However, there are many small details involved in running the tools to get optimal results. Questions of this type concerning functionality are best directed to the tool authors at tophat.cufflinks(a)gmail.com

Take care,
Jen
Galaxy team

On 10/25/11 11:24 PM, GANDRILLON OLIVIER wrote:
> Hello Jennifer
>
> On 26/10/11 02:15, "Jennifer Jackson" <jen(a)bx.psu.edu> wrote:
>
>> Hello Olivier,
>>
>> Are you using a reference gene annotation GTF file?
>
> Well, if I do, I do not know :-)
>
> What I used is first TopHat and then Cufflinks. At what stage shall I use
> such a GTF file?
> And additionally: where shall I find such a GTF file?
>
> As I understand it, there are two sources for a GTF file:
> 1. The output from Cufflinks generates one (but there are no gene names in
> it..)
> 2. I could get one from Ensembl?
>
> Shall I then use Cuffcompare on the Cufflinks output?
>
> Best
>
> Olivier
>
> PS: I am now stuck with the Galaxy website, which states that I am over my
> disk quota. I have deleted a couple of files, but this does not seem to
> help...

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support
