November 2011 - galaxy-user - lists.galaxyproject.org

Downloading files
by KNIGHT M.R. 02 Nov '11

Is there any way of downloading processed files by, e.g., FTP rather than using "Save as" in the web browser? I have a large (9.6 GB) BAM file to download, and downloading it via the browser seems unstable for some reason. Thanks, Marc.

Re: [galaxy-user] Dataset not appearing as available input to SAM analyses
by Jennifer Jackson 02 Nov '11

Hi Clare,

Galaxy is fairly transparent, so this occurs sometimes and can mean that either the server is very busy or there is another problem, perhaps with the dataset.

Please note that mpileup is not supported yet, just pileup, although this is on our short list of to-do upgrades. If you used mpileup, this could very likely be the root cause of the tool error ("server error"). You can track the upgrade to mpileup through this ticket: https://bitbucket.org/galaxy/galaxy-central/issue/524

To troubleshoot a pileup file:

1 - Double-check the format of your data against the specification as outlined on the tool form for "NGS: SAM Tools -> Generate pileup from BAM dataset".
2 - Try running the file type assignment (or the job) again to rule out a transient error.

If pileup was used and the cause of the error is still unclear, we can take a look and provide feedback. Please share a link to your history: use "Options -> Share or Publish" from the top of the history panel, generate a share link, then email it to me directly. Please note which dataset presents the problem and confirm that "samtools pileup" was used.

Thanks!

Jen
Galaxy team

On 11/2/11 8:12 AM, James Nelson wrote:
> Thanks Jen, but the very first time I tried changing the type to "pileup", Galaxy came back with
>
> Server Error
> An error occurred. See the error logs for more information. (Turn debug on to display exception reports here)
>
> I somehow don't think the user's supposed to see the "error logs" or exception reports, so what next?
>
> Clare
>
> ----- Original Message -----
> From: "Jennifer Jackson"<jen(a)bx.psu.edu>
> To: "James Nelson"<jcn(a)k-state.edu>
> Cc: galaxy-user(a)bx.psu.edu
> Sent: Wednesday, November 2, 2011 8:11:09 AM
> Subject: Re: [galaxy-user] Dataset not appearing as available input to SAM analyses
>
> Hello Clare,
>
> Perhaps the loaded pileup datasets are not assigned to the "pileup" datatype? Click on the pencil icon in the upper right corner of the loaded pileup dataset, then use "Edit Attributes -> Change data type".
>
> Hopefully this helps,
>
> Best,
>
> Jen
> Galaxy team
>
> On 10/29/11 4:10 PM, James Nelson wrote:
>> Hi, I'm trying to use the SAM/Filter pileup analysis on some pileup output files I've uploaded. But when I click the analysis link, these output files don't appear in the Select dataset dropdown menu; instead only one file, the output from FASTQ Summary Statistics, which is obviously not pileup output, is shown.
>>
>> Thanks
>> Clare Nelson
>> ___________________________________________________________
>> The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
>>
>> http://lists.bx.psu.edu/listinfo/galaxy-dev
>>
>> To manage your subscriptions to this and other Galaxy lists, please use the interface at:
>>
>> http://lists.bx.psu.edu/

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support

Galaxy tracker browser freezes when zoomed in
by Steve 02 Nov '11

Hi, I'm going through the RNA-seq exercise at http://main.g2.bx.psu.edu/u/jeremy/p/galaxy-rna-seq-analysis-exercise. I'm at step 2 of mapping the reads where I've run tophat and have loaded the output files and the regGene tracks onto a new track browser. When I zoom in (to ~45kbp window), the accepted hits and one of the splice junction tracks from tophat do not show data and instead show a crossed pattern and the browser no longer allows clicking and dragging to move the window. Please advise. Cheers, Steve

Removing low quality reads
by Getiria Onsongo 02 Nov '11

Galaxy Users,

I would like to filter a .bam file to remove reads with low mapping quality, especially ambiguously mapped reads (MAPQ = 0). I can easily do this using the command line version of samtools as shown below.

    samtools view -bq 20 hba1.bam > hba1_MAPQ20.bam

None of the options available under "NGS: SAM Tools" (e.g., Generate pileup and Filter SAM) provide an option for removing reads with low mapping quality.

The history shown in http://main.g2.bx.psu.edu/u/onsongo/h/obtaininghighqualityreads shows the results I would like to obtain. Data 2 shows the results of Picard tools SAM/BAM Alignment Summary Metrics <http://main.g2.bx.psu.edu/tool_runner?tool_id=PicardASMetrics> on hba1.bam, which contains reads with MAPQ values less than 20. As shown in this summary html, PF_READS_ALIGNED = 775 and PF_HQ_ALIGNED_READS = 241. Data 4 shows the results of the same Picard tool on hba1_MAPQ20.bam, which contains only reads with MAPQ greater than or equal to 20. As shown in this summary html, PF_READS_ALIGNED = 241 and PF_HQ_ALIGNED_READS = 241.

Is there a way in Galaxy to filter a bam file to remove low quality mapped reads, similar to the samtools command line alternative shown above?

Thanks,
Getiria

--
Getiria Onsongo, Ph.D.
Bioinformatics Research Scientist
Masonic Cancer Center, University of Minnesota
Minneapolis, MN 55455
Phone: 612-625-0101
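
For readers who want to see what the MAPQ cutoff in the samtools command does, here is a minimal sketch in Python. It works on SAM text rather than BAM, and it is only an illustration of the filtering rule (keep the header, keep alignments whose fifth column is at least the threshold). The function name `filter_sam_by_mapq` and the sample records are made up for this example and are not part of Galaxy or samtools.

```python
def filter_sam_by_mapq(lines, min_mapq=20):
    """Yield SAM header lines and alignments with MAPQ >= min_mapq.

    Hypothetical helper for illustration; it mirrors the rule behind
    `samtools view -q 20` on plain SAM text, not on BAM.
    """
    for line in lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            # A SAM alignment has 11 mandatory columns; skip anything shorter.
            continue
        # Column 5 (index 4) holds the mapping quality.
        if int(fields[4]) >= min_mapq:
            yield line


# Made-up records: one header line and three alignments with MAPQ 0, 20, 37.
sam_lines = [
    "@HD\tVN:1.0\tSO:coordinate\n",
    "read1\t0\tchr16\t100\t0\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read2\t0\tchr16\t200\t20\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read3\t0\tchr16\t300\t37\t4M\t*\t0\t0\tACGT\tIIII\n",
]

kept = list(filter_sam_by_mapq(sam_lines, min_mapq=20))
kept_names = [l.split("\t")[0] for l in kept if not l.startswith("@")]
print(kept_names)
```

With the sample records, the ambiguously mapped read (MAPQ = 0) is dropped and the two reads at or above the cutoff are kept.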

Expected Transcriptome BLAST
by Colicchio, Jack M 02 Nov '11

Hey, Jack Colicchio here, a PhD student at KU. I am about to get an Illumina next-gen transcriptome data set that I would like to align and quantify against a list of expected transcripts from Mimulus guttatus. The expected transcripts are in .gff format, and I was wondering how I could get that file uploaded to your website so that I can align my transcriptome against it. I successfully uploaded the .gff file and can view it on your site, but I do not know how I could BLAST my .fastq data from Illumina against this file. Thanks, Jack

Re: [galaxy-user] Names for genes in RNA-Seq analysis
by Jennifer Jackson 02 Nov '11

Hello Olivier,

When deleting data, it takes the server a short amount of time to refresh. It may take a bit longer right now since many people are performing this action at the same time.

For the RNA-seq analysis question, reference annotation GTF files are used by the Cuff* programs. (These are different from the result GTF files produced by the programs.) For reference annotation GTF files, there are many sources, including Ensembl and UCSC.

Here are links to a tutorial and an FAQ that can help with your usage question:

http://main.g2.bx.psu.edu/u/jeremy/p/galaxy-rna-seq-analysis-exercise
http://main.g2.bx.psu.edu/u/jeremy/p/transcriptome-analysis-faq

However, there are many small details to running the tools to get optimal results. These types of questions concerning functionality are best directed to the tool authors at tophat.cufflinks(a)gmail.com

Take care,

Jen
Galaxy team

On 10/25/11 11:24 PM, GANDRILLON OLIVIER wrote:
> Hello Jennifer
>
> On 26/10/11 02:15, "Jennifer Jackson"<jen(a)bx.psu.edu> wrote:
>
>> Hello Olivier,
>>
>> Are you using a reference gene annotation GTF file?
>
> Well, if I do, I do not know :-)
>
> What I used is first TopHat and then Cufflinks. At what stage shall I use such a GTF file?
> And additionally: where shall I find such a GTF file?
>
> As I understand it, there are two sources for a GTF file:
> 1. The output from Cufflinks generates one (but there are no gene names in it).
> 2. I could get one from Ensembl?
>
> Shall I then use Cuffcompare on the Cufflinks output?
>
> Best
>
> Olivier
>
> PS: I am now stuck with the Galaxy website, which states that I am over my disk quota. I have deleted a couple of files but this does not seem to help...

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support

Cannot get rid of custom build error
by James Vincent 02 Nov '11

Hello, I've tried to create a custom build (starting from the Trackster page) and clearly made some mistake. I get a NoConverterException error, which is almost certainly my fault for entering something wrong. However, now the Custom Builds page shows this error and I cannot make it go away. I have a local install, and I have killed and restarted it and logged in and out, but the error stays. I cannot try to enter another custom build because there is nothing on the page except this error. Does anyone know how to get rid of this? Any help in getting around this error is appreciated. Jim

fastq groomer
by arabidopsis 02 Nov '11

Hi all, FASTQ Groomer has Solexa or Illumina 1.3+ as input quality formats. I asked at the sequencing facility about their machine and output, and they said their format was Illumina 1.8+ (the newest). I tried to convert my fastq file into Sanger with FASTQ Groomer, using Illumina 1.3+ as the input option, and got all reads with a quality of around 10... Does this mean that Galaxy cannot be used on a dataset with 1.8+ encoding, or was something else wrong? Thanks, Slon
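
One possible explanation, offered as an assumption and not as a diagnosis of this particular file: Illumina 1.8+ writes quality characters with an ASCII offset of 33 (the same as Sanger), while Illumina 1.3+ uses an offset of 64. Decoding offset-33 characters as if they were offset-64 shifts every score down by 31, which would turn a score of 40 into 9. The sketch below shows only that arithmetic; `decode_phred` and the quality string are hypothetical and say nothing about how FASTQ Groomer itself handles the out-of-range values.

```python
def decode_phred(quality_string, offset):
    """Convert a FASTQ quality string to integer scores for a given ASCII offset.

    Hypothetical helper for illustration only.
    """
    return [ord(ch) - offset for ch in quality_string]


# Made-up quality string of the kind an offset-33 (Sanger / Illumina 1.8+) file contains.
qual = "IIIIHHGF#"

as_offset_33 = decode_phred(qual, 33)  # read with the matching offset
as_offset_64 = decode_phred(qual, 64)  # read as if it were Illumina 1.3+

print(as_offset_33)
print(as_offset_64)
```

If the file really is offset 33, it may already be in the Sanger encoding, in which case choosing Sanger as the input type would be the thing to try.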

get data through FTP site
by Benoit HENNUY 02 Nov '11

Hello, I uploaded data through the FTP site (on Oct 25) and I was not able to retrieve them to include them in a history. I suppose that the files should appear in the FTP list under the Get Data page. These data are named "P2 fastq .gz" files, and I would like to use them in an RNA-seq workflow. Thank you for helping me to use these data. Best regards, Benoit HENNUY GIGA Genomics University of Liege Belgium

Re: [galaxy-user] Names for genes in RNA-Seq analysis
by Michael Gooch 02 Nov '11

Regarding the GTF files for Cufflinks, how do I obtain one for all human mRNA that actually contains gene names rather than accession numbers? I went to the UCSC Table Browser, but their files contain accession numbers that I don't know how to decode en masse.
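
One way to approach this, sketched here under assumptions: if a separate two-column table mapping accession to gene name can be obtained, the `gene_id` attribute in each GTF line could be rewritten in bulk. The function `rename_gene_ids`, the accession `NM_EXAMPLE1`, and the gene name `GENE_A` are placeholders invented for this example; where the mapping table comes from is left open.

```python
import re

# Matches the gene_id attribute in the ninth GTF column, e.g. gene_id "NM_EXAMPLE1";
GENE_ID_PATTERN = re.compile(r'gene_id "([^"]*)"')


def rename_gene_ids(gtf_lines, accession_to_gene):
    """Yield GTF lines with gene_id accessions replaced by gene names.

    Hypothetical helper: accessions missing from the mapping are left unchanged,
    and the transcript_id attribute is never touched.
    """
    def swap(match):
        accession = match.group(1)
        return 'gene_id "%s"' % accession_to_gene.get(accession, accession)

    for line in gtf_lines:
        if line.startswith("#"):
            # Comment lines are passed through as-is.
            yield line
        else:
            yield GENE_ID_PATTERN.sub(swap, line)


# Placeholder mapping and GTF record, for illustration only.
mapping = {"NM_EXAMPLE1": "GENE_A"}
gtf = [
    'chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "NM_EXAMPLE1"; transcript_id "NM_EXAMPLE1";\n',
    'chr1\tsrc\texon\t500\t600\t.\t+\t.\tgene_id "NM_EXAMPLE2"; transcript_id "NM_EXAMPLE2";\n',
]

renamed = list(rename_gene_ids(gtf, mapping))
print("".join(renamed), end="")
```

Keeping `transcript_id` as the accession means each transcript stays individually identifiable while transcripts of the same gene share one `gene_id`.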