October 2013 - galaxy-user - lists.galaxyproject.org

Re: [galaxy-user] SNP calling problems (Jennifer Jackson)
by garzetti 02 Oct '13

Hi Jen,

thank you for your answer! I used the Add or Replace Groups tool and it worked well, so I could run the FreeBayes tool with no problem!

Now I have another question: I have been pre-processing my data with the NGS: GATK tools according to their Best Practices and I am ready for SNP calling. I have read the Unified Genotyper documentation and, since I am working with bacterial genome sequences, I would need to set the -sample-ploidy argument to 1 (default 2). I cannot find this option in the Galaxy version of this tool, not even in the advanced options. How can I do that?

Thank you very much!
Debora

> Message: 3
> Date: Fri, 27 Sep 2013 14:02:50 -0700
> From: Jennifer Jackson <jen(a)bx.psu.edu>
> To: garzetti <garzetti(a)mvp.uni-muenchen.de>
> Cc: galaxy-user(a)bx.psu.edu
> Subject: Re: [galaxy-user] SNP calling problems
> Message-ID: <5245F27A.7020200(a)bx.psu.edu>
> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>
> Hi Debora,
>
> Sorry to hear that you are having problems. We can help get you going
> again! Please see below:
>
> On 9/26/13 7:20 AM, garzetti wrote:
>
>> Dear all,
>>
>> I have been looking for an answer to my problem in all the Galaxy
>> Support resources but with no success. I am sorry if this topic has
>> already been discussed!
>>
>> So, I am analyzing MiSeq data on the main Galaxy.
>> I have Fastq files from 4 paired-end samples. After having checked the
>> quality with FastQC and groomed them, I have performed a BWA mapping,
>> filtered the results and converted the SAM to BAM files (for each
>> sample separately). I have then called SNPs with Freebayes and
>> SAMtools, encountering problems in both cases.
>>
>> 1) SAMtools: if I run the Generate pileup tool, then the Filter pileup
>> doesn't recognize any valid format in the files I have in my History
>> and I cannot go on with the analysis. Why is that? What can I do?
>>
> Make sure that the output format is set as "pileup" and the tool will
> accept the input. Click on the pencil icon to make the datatype
> assignment change.
> http://wiki.galaxyproject.org/Support#Tool_doesn.27t_recognize_dataset
>
> Note that Mpileup has an option to produce .bcf format, and that is not
> the same as pileup. If you have selected that type of output, then
> either re-run the tool with options that create pileup format, or
> convert bcf -> vcf and use one of the tools that work with vcf format to
> work with your data downstream from there.
>
>> 2) I have performed variant calling with Freebayes on single BAM files
>> and on one merged BAM file from all four of my BWA mapping files. In all
>> cases, the last column is "unknown", while it should be the name of my
>> sample. This is not a big deal for the single vcf files, but from the
>> merged BAM file, I cannot discriminate from which sample the SNPs were
>> detected. I think there is a problem in the BAM files, which are not
>> properly indexed. Also Freebayes needs an RG tag.
>> Is there a tool in Galaxy I can use to index BAM files, adding the RG
>> tag?
>>
> The tool "NGS: Picard (beta) -> Add or Replace Groups" can be used to
> annotate SAM/BAM files. This tool can be a bit picky about formats, so
> just watch for that if you get an error.
>
> Quick tip: You can click on the bug icon on failed datasets to see
> the complete error message and it will often tell you exactly what is
> wrong so that you can correct it (this doesn't automatically submit a
> bug, which is good to know when you are in a hurry at night or on
> weekends or just want to troubleshoot yourself). You can use this on any
> error dataset to get more information if the dataset's "i" info button's
> stderr/stdout links or attributes "Info" field do not provide enough
> details.
> => This works on servers that have bug reporting enabled. The public
> Main server does, and it is straightforward to configure on local/cloud
> instances, including your own, even if you use one only for small local
> file manipulations or file backup/storage (very handy, and key file
> backups are always a good idea when doing analysis anywhere).
> See the Admin wiki section for more.
>
> Going forward, there is a short screencast about the Learning resources
> in Galaxy here in a Page. It will be uploaded to Vimeo sometime in the
> next 24 hrs, and will likely be updated to include the very latest as
> the infrastructure updates on Main settle out in the next weeks or so,
> but for now here is the link. Click on the "Learning Resources" graphic
> to launch the quick tour:
> https://main.g2.bx.psu.edu/u/galaxyproject/p/screencasts-usegalaxyorg
>
> Galaxy team's Vimeo account: http://vimeo.com/channels/581769
> We are uploading all of our vids, old & new, right now and over the next
> few days. We really like them and hope our users do too and follow along.
> The public Main server will have direct links to this content, in the
> center home page, soon as part of the "New & Improved" Galaxy experience!
> I won't give an ETA, as this is in progress, but can hint that soon ==
> expected very soon. (!)
>
> Good luck and let us know if you need more help,
>
> Jen
> Galaxy team
>
>> I hope someone can help me!
>>
>> Thank you very much!
>> Debora
>>

--
Debora Garzetti, PhD Student
AG Rakin
Max von Pettenkofer-Institute, LMU
Pettenkoferstraße 9A
80336 Munich
E-mail: garzetti(a)mvp.uni-muenchen.de
Phone: +49 (0)89 2180 72915
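
For anyone following this thread outside Galaxy: FreeBayes takes its sample names from the @RG lines in the BAM header (the SM field), which would explain the "unknown" column when read groups are missing. Below is a minimal Python sketch for checking that, assuming the header has been exported as plain text (for example with `samtools view -H`). The function name and the example header are made up for illustration.

```python
def read_group_samples(header_text):
    """Map each read-group ID to its sample name (SM), or None if SM is absent."""
    samples = {}
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # After the record type, header fields are tab-separated TAG:VALUE pairs.
        fields = dict(
            field.split(":", 1) for field in line.split("\t")[1:] if ":" in field
        )
        if "ID" in fields:
            samples[fields["ID"]] = fields.get("SM")
    return samples


# Hypothetical header text, for illustration only.
example_header = "\n".join([
    "@HD\tVN:1.4\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:1000",
    "@RG\tID:run1\tSM:sampleA\tPL:ILLUMINA\tLB:lib1",
    "@RG\tID:run2\tPL:ILLUMINA",
])

for rg_id, sample in read_group_samples(example_header).items():
    print(rg_id, sample if sample else "MISSING SM tag")
```

An empty result, or a read group without SM, would be a hint that the read groups still need to be added before merging.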

Tophat Error: segment-based junction search failed with err
by Delong, Zhou 02 Oct '13

Hello,

I don't know why I still have this problem. I have run TopHat2 on different datasets; sometimes it goes well, but sometimes I get this error. I run only one job at a time on a virtual machine with 8 GB of memory, without using the Galaxy platform. I tried the --no-coverage-search option, but it changes nothing.

Thanks.
Delong

________________________________
From: Delong, Zhou
Sent: 27 August 2013 9:36
To: galaxy-user(a)bx.psu.edu
Subject: Tophat Error: segment-based junction search failed with err

Hello,

I have run several analyses with TopHat 2 on my local instance of Galaxy and I get this error for all of them:

segment-based junction search failed with err = 1 or -9

Here is an example of a full error report:

Error in tophat:
[2013-08-23 11:56:58] Beginning TopHat run (v2.0.6)
-----------------------------------------------
[2013-08-23 11:56:58] Checking for Bowtie
    Bowtie version: 2.0.2.0
[2013-08-23 11:56:58] Checking for Samtools
    Samtools version: 0.1.18.0
[2013-08-23 11:56:58] Checking for Bowtie index files
[2013-08-23 11:56:58] Checking for reference FASTA file
[2013-08-23 11:56:58] Generating SAM header for /usr/local/data/bowtie2/hg19/hg19
    format: fastq
    quality scale: phred33 (default)
[2013-08-23 11:58:04] Preparing reads
    left reads: min. length=50, max. length=50, 145339247 kept reads (34946 discarded)
    right reads: min. length=50, max. length=50, 145340153 kept reads (34040 discarded)
[2013-08-23 14:16:21] Mapping left_kept_reads to genome hg19 with Bowtie2
[2013-08-24 01:04:37] Mapping left_kept_reads_seg1 to genome hg19 with Bowtie2 (1/2)
[2013-08-24 03:38:22] Mapping left_kept_reads_seg2 to genome hg19 with Bowtie2 (2/2)
[2013-08-24 05:29:58] Mapping right_kept_reads to genome hg19 with Bowtie2
[2013-08-24 19:50:22] Mapping right_kept_reads_seg1 to genome hg19 with Bowtie2 (1/2)
[2013-08-24 22:36:38] Mapping right_kept_reads_seg2 to genome hg19 with Bowtie2 (2/2)
[2013-08-25 01:40:37] Searching for junctions via segment mapping
    Coverage-search algorithm is turned on, making this step very slow
    Please try running TopHat again with the option (--no-coverage-search) if this step takes too much time or memory.
    [FAILED]
Error: segment-based junction search failed with err =-9
Collecting potential splice sites in islands
cp: cannot stat `/home/galaxy/galaxy-dist/database/job_working_directory/000/515/tophat_out/deletions.bed': No such file or directory
cp: cannot stat `/home/galaxy/galaxy-dist/database/job_working_directory/000/515/tophat_out/insertions.bed': No such file or directory

I did some research on the internet and it seems to be a memory problem. Is there any solution other than rerunning these jobs on a more powerful machine? And why did Bowtie/TopHat discard different numbers of reads? What will the impact be? Does it mean that the job can still run even if the paired-end inputs do not match exactly?

Thanks,
Delong
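
On the "err = -9" part: the TopHat driver is a Python script, and Python's subprocess module reports a child process killed by a signal as a negative return code. Under that reading, -9 would mean signal 9 (SIGKILL), which on Linux is what the kernel's out-of-memory killer sends. This is an interpretation consistent with the memory explanation above, not something the log states outright. A small sketch of the decoding (the helper name is made up):

```python
import signal


def describe_exit_code(code):
    """Explain a subprocess-style return code (negative means killed by a signal)."""
    if code == 0:
        return "success"
    if code < 0:
        try:
            name = signal.Signals(-code).name
        except ValueError:
            # Signal number not defined on this platform.
            name = "unknown signal"
        return "killed by signal %d (%s)" % (-code, name)
    return "exited with status %d" % code


for err in (1, -9):
    print(err, "->", describe_exit_code(err))
```

If the process really was killed for memory, the kernel log on the virtual machine (for example via `dmesg`) would usually show a matching out-of-memory entry.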

Bioinformatics Training Course in Portugal
by Pedro Fernandes 02 Oct '13

Course Announcement

NOTE: please apply as soon as possible, the period for applications is exceptionally short due to operational reasons

*ARANGS13*
Automated and reproducible analysis of NGS data

IMPORTANT DATES for ARANGS13
Deadline for applications: October 8th 2013
Notification of acceptance dates: October 15th 2013
Course date: October 21st - October 24th 2013

Course Description:

Next generation sequencing (NGS) technologies for DNA have resulted in a yet bigger deluge of data. Researchers are learning that analysing such data sets is becoming the bottleneck in their work. In many cases, several steps in these analyses are fairly generic (e.g. quality control filtering, alignment to reference sequences, typing) so that off-the-shelf pipelines can be applied. In other cases, novel research approaches require development of new analysis pipelines. Either way, all analysis steps should be repeatable and any changes made to the data (e.g. renaming, annotation, alignment) should be recorded so that the provenance of the results is clear and inferences are reproducible.

In this brief workshop we will establish several best practices of reproducibility and provenance recording in the (comparative) analysis of data obtained by NGS. In doing so we will encounter the commonly used technologies that enable these best practices by working through use cases that illustrate the underlying principles. Building on the basis of workflow development, we will further illustrate how custom-built workflows can be manipulated using graphical platforms (e.g. Galaxy, Taverna, etc.).

Best practices
  Standardized project organization
  Projects 'runnable' without user intervention
  No loss of data, metadata, parameters or source code through versioning
  Sharing of scripts and workflows

Technologies
  Next generation sequencing platforms
  File formats (e.g. FASTQ, SAM/BAM, GFF3)
  Command-line executables, command line scripting and batching
  High-level programming with domain-specific toolkits
  Revision control systems
  Workflow environments (both visual and command line)

Use cases
  Phylogenetic placement of metagenomic data
  Typing of pathogens
  Comparative analysis of multicellular genomic data
  Post-assembly: handling richly annotated genomes

More information, including application instructions, available at
http://gtpb.igc.gulbenkian.pt/bicourses/ARANGS13/

Thank you
Pedro Fernandes
GTPB coordinator

--
Pedro Fernandes
Instituto Gulbenkian de Ciência
Apartado 14
2781-901 OEIRAS
PORTUGAL
Tel +351 21 4407912
http://gtpb.igc.gulbenkian.pt

question about samtools in Galaxy
by Robert Jackman 02 Oct '13

Hi,

Can anyone tell me if the ability to randomly sample a sam or bam file (view -s) is available via Galaxy samtools? I can't find it but it might be an option that I am missing.

sincerely,
Robert Jackman
raiseal(a)gmail.com
rjackman(a)bu.edu
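
For reference, on the command line `samtools view -s FLOAT` uses the integer part of the value as the random seed and the fractional part as the fraction of reads to keep. Whether the Galaxy wrapper exposes this is a separate question. The sketch below only builds such a command; the file names are placeholders, the helper name is made up, and it assumes a samtools release that supports -s.

```python
def subsample_command(in_bam, out_bam, fraction, seed=0):
    """Build a `samtools view -s` command keeping roughly `fraction` of the reads."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    if seed < 0:
        raise ValueError("seed must be a non-negative integer")
    # -s expects SEED.FRACTION, e.g. seed 42 with fraction 0.1 becomes "42.1".
    whole, digits = ("%.6f" % fraction).split(".")
    digits = digits.rstrip("0")
    if whole != "0" or not digits:
        raise ValueError("fraction cannot be represented with 6 decimal places")
    return ["samtools", "view", "-b", "-s", "%d.%s" % (seed, digits),
            "-o", out_bam, in_bam]


print(" ".join(subsample_command("input.bam", "subsample.bam", 0.1, seed=42)))
```

Using the same seed on the same input should select the same reads again, which helps when the subsample needs to be reproducible.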

Galaxy STARTUP ERROR!
by giovanni pascarella 02 Oct '13

Hi!

I am trying to launch Galaxy locally but I'm getting the following error:

tkx417:galaxy pascarellagiovanni$ sh run.sh
Traceback (most recent call last):
  File "./scripts/paster.py", line 33, in <module>
    serve.run()
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/util/pastescript/serve.py", line 1049, in run
    invoke(command, command_name, options, args[1:])
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/util/pastescript/serve.py", line 1055, in invoke
    exit_code = runner.run(args)
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/util/pastescript/serve.py", line 220, in run
    result = self.command()
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/util/pastescript/serve.py", line 643, in command
    app = loadapp( app_spec, name=app_name, relative_to=base, global_conf=vars)
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/util/pastescript/loadwsgi.py", line 350, in loadapp
    return loadobj(APP, uri, name=name, **kw)
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/util/pastescript/loadwsgi.py", line 374, in loadobj
    global_conf=global_conf)
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/util/pastescript/loadwsgi.py", line 399, in loadcontext
    global_conf=global_conf)
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/util/pastescript/loadwsgi.py", line 423, in _loadconfig
    return loader.get_context(object_type, name, global_conf)
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/util/pastescript/loadwsgi.py", line 561, in get_context
    section)
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/util/pastescript/loadwsgi.py", line 620, in _context_from_explicit
    value = import_string(found_expr)
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/util/pastescript/loadwsgi.py", line 125, in import_string
    return pkg_resources.EntryPoint.parse("x=" + s).load(False)
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/pkg_resources.py", line 1954, in load
    entry = __import__(self.module_name, globals(),globals(), ['__name__'])
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/web/__init__.py", line 4, in <module>
    from framework import expose
  File "/Users/pascarellagiovanni/Desktop/galaxy/lib/galaxy/web/framework/__init__.py", line 40, in <module>
    from babel.support import Translations
  File "/Users/pascarellagiovanni/Desktop/galaxy/eggs/Babel-0.9.4-py2.7.egg/babel/support.py", line 29, in <module>
    from babel.dates import format_date, format_datetime, format_time, LC_TIME
  File "/Users/pascarellagiovanni/Desktop/galaxy/eggs/Babel-0.9.4-py2.7.egg/babel/dates.py", line 34, in <module>
    LC_TIME = default_locale('LC_TIME')
  File "/Users/pascarellagiovanni/Desktop/galaxy/eggs/Babel-0.9.4-py2.7.egg/babel/core.py", line 642, in default_locale
    return '_'.join(filter(None, parse_locale(locale)))
  File "/Users/pascarellagiovanni/Desktop/galaxy/eggs/Babel-0.9.4-py2.7.egg/babel/core.py", line 763, in parse_locale
    raise ValueError('expected only letters, got %r' % lang)
ValueError: expected only letters, got 'utf-8'

Does anybody know what this is about?

Thanks,
Giovanni
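
The last frame shows Babel trying to parse a locale value of 'utf-8', which suggests that one of the locale environment variables is set to a bare encoding such as UTF-8 rather than a full name like en_US.UTF-8. That is a guess from the traceback, not a confirmed diagnosis. A rough sketch for spotting such values follows; the variable list and the pattern are assumptions for illustration, not Babel's own logic.

```python
import os
import re

# Locale-related variables to inspect (an assumed list, not taken from Babel).
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LC_TIME", "LANG")

# Accepts language[_TERRITORY][.encoding][@modifier] and the special C/POSIX names.
WELL_FORMED = re.compile(
    r"^(C|POSIX|[A-Za-z]{2,3}(_[A-Za-z0-9]+)?)(\.[\w-]+)?(@\w+)?$"
)


def suspicious_locale_vars(environ=None):
    """Return the locale variables whose value does not look like a locale name."""
    environ = os.environ if environ is None else environ
    return {
        name: environ[name]
        for name in LOCALE_VARS
        if environ.get(name) and not WELL_FORMED.match(environ[name])
    }


print(suspicious_locale_vars())
```

If a variable is flagged, setting it to a full locale name in the shell before running run.sh is a common thing to try.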

Re: [galaxy-user] main galaxy server is down?
by Jennifer Jackson 01 Oct '13

Hi Boaz,

I incorrectly believed yesterday that the NGS queue was moving; it is now under review again. The grey queued jobs will eventually execute, so I wouldn't delete them quite yet, but jobs are not processing at this time. We will update the banner on the public Main server if this is extended.
http://wiki.galaxyproject.org/Support#Dataset_status_and_how_jobs_execute

For high-priority work, a move to an alternate Galaxy solution is an option. A cloud Galaxy is one good choice: it is a sure thing but has associated costs. Other public servers are also potential choices. Each has different tools available and other requirements, so check these out and see what may work. Links for you to review are here:
http://wiki.galaxyproject.org/BigPicture/Choices
http://wiki.galaxyproject.org/Cloud
http://wiki.galaxyproject.org/PublicGalaxyServers

Good luck, and thanks for your patience during our upgrade,

Jen
Galaxy team

On 10/1/13 1:53 AM, Boaz Shaanan wrote:
> Thanks Jennifer, My job is still in the queue, so I'll keep it that way. From other people's complaints, it looks as if it's the mapping jobs that mostly have problems (mine is one like that too - a Lastz run). Any particular reason or just the general server performance issue?
>
> Thanks,
>
> Boaz
>
>
> Boaz Shaanan, Ph.D.
> Dept. of Life Sciences
> Ben-Gurion University of the Negev
> Beer-Sheva 84105
> Israel
>
> E-mail: bshaanan(a)bgu.ac.il
> Phone: 972-8-647-2220 Skype: boaz.shaanan
> Fax: 972-8-647-2992 or 972-8-646-1710
>
> ________________________________________
> From: Jennifer Jackson [jen(a)bx.psu.edu]
> Sent: Tuesday, October 01, 2013 12:27 AM
> To: בעז שאנן
> Cc: galaxy-user(a)lists.bx.psu.edu
> Subject: Re: [galaxy-user] main galaxy server is down?
>
> Hello Boaz,
>
> Performance should be improved by now. Please allow your jobs to run if
> still queued. If any fail due to fileserver or cluster issues, please
> simply re-run. Our apologies for these inconveniences, improvements are
> due very soon!
>
> Best,
>
> Jen
> Galaxy team
>
> On 9/27/13 2:37 PM, Boaz Shaanan wrote:
>> Hi,
>>
>> Is the main galaxy server down? Or very sloooow? I have a Lastz job (not too demanding and one that has been run several times before) waiting for a long time already.
>>
>> Thanks,
>>
>> Boaz
>>
>>
>> Boaz Shaanan, Ph.D.
>> Dept. of Life Sciences
>> Ben-Gurion University of the Negev
>> Beer-Sheva 84105
>> Israel
>>
>> E-mail: bshaanan(a)bgu.ac.il
>> Phone: 972-8-647-2220 Skype: boaz.shaanan
>> Fax: 972-8-647-2992 or 972-8-646-1710
>>
>> ___________________________________________________________
>> The Galaxy User list should be used for the discussion of
>> Galaxy analysis and other features on the public server
>> at usegalaxy.org. Please keep all replies on the list by
>> using "reply all" in your mail client. For discussion of
>> local Galaxy instances and the Galaxy source code, please
>> use the Galaxy Development list:
>>
>> http://lists.bx.psu.edu/listinfo/galaxy-dev
>>
>> To manage your subscriptions to this and other Galaxy lists,
>> please use the interface at:
>>
>> http://lists.bx.psu.edu/
>>
>> To search Galaxy mailing lists use the unified search at:
>>
>> http://galaxyproject.org/search/mailinglists/
> --
> Jennifer Hillman-Jackson
> http://galaxyproject.org
>

--
Jennifer Hillman-Jackson
http://galaxyproject.org

problem with logging in after resetting password
by Siew-Lan Ang 01 Oct '13

Hello,

I am having problems accessing my Galaxy account. I forgot my password and asked for it to be reset. When the new password was sent to me, it didn't work when I tried it. I have reset it four times, and each time the new password has not worked for logging in. I hope you can solve the problem.

Regards,
Siew-Lan Ang

____________________
Siew-Lan Ang
NIMR
The Ridgeway
London, NW7 1AA
UK
Tel: 44 (0)2088162426
Fax: 44 (0)2088162523

bwa indexing not automatic?
by Joshua Orvis 01 Oct '13

I installed a new copy of galaxy today and then added the bwa_wrappers tool. After I upload my reference genome and left/right reads I get output like this each time I try to run bwa for illumina:

The alignment failed.
Error aligning sequence.
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwt_restore_bwt] fail to open file '/seq/gscidA/www/gscid_devel/htdocs/galaxy-dist/database/files/000/dataset_4.dat.bwt'. Abort!
/bin/sh: line 1: 11607 Aborted bwa aln -t 4 -I /seq/gscidA/www/gscid_devel/htdocs/galaxy-dist/database/files/000/dataset_4.dat /seq/gscidA/www/gscid_devel/htdocs/galaxy-dist/database/files/000/dataset_2.dat > /tmp/tmpMuj4l2/tmpUIdxXT

If I manually do the 'bwa index' command on dataset_4.dat it works but in the past this seemed to happen automatically. Any clue what's going on here?
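
The "fail to open file ... .bwt" line indicates that bwa aln was given a reference with no index files next to it. Here is a small sketch for checking which index files are missing before aligning. The five suffixes are the ones current `bwa index` writes; older releases may write additional files. The helper name and the temporary file layout are made up for illustration.

```python
import os
import tempfile

# Files that `bwa index` is expected to create next to the reference.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(reference_path):
    """Return the expected index files that do not exist next to the reference."""
    return [
        reference_path + suffix
        for suffix in BWA_INDEX_SUFFIXES
        if not os.path.exists(reference_path + suffix)
    ]


# Demonstration on a throwaway directory holding only a .bwt file.
with tempfile.TemporaryDirectory() as workdir:
    ref = os.path.join(workdir, "reference.fa")
    open(ref + ".bwt", "w").close()
    still_missing = [os.path.basename(p) for p in missing_bwa_index_files(ref)]

print(still_missing)
```

A non-empty result for the dataset path in the error message would be consistent with the indexing step not having run.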
