September 2015 - galaxy-dev - lists.galaxyproject.org

Picard sam2fastq tool can output file including both single and paired reads?
by Dooley, Damion 16 Sep '15

A researcher here was trying to use the Galaxy Picard sam2fastq tool to generate a single fastq file from a sam file that has a mix of paired and unpaired data. She was thinking the second input on the sheet might control this:

> Do you want to output a fastq file per read group (two fastq files per read group if the group is paired)
> Yes / No
> OUTPUT_PER_RG; default=False

But preliminary testing indicates there is no output scenario that includes both paired and unpaired data in the output (especially as just one file). I just wanted to verify that this is the case, since the docs don't cover this scenario.

We're planning instead to write a little Galaxy tool that does what she accomplishes on the command line:

> samtools view -bS in.sam > out.bam
> bamtools convert -format fastq -in in.bam > out.fastq

which includes unpaired reads too.

Thanks for feedback,

d.

Hsiao lab, BC Public Health Microbiology & Reference Laboratory, BC Centre for Disease Control
655 West 12th Avenue, Vancouver, British Columbia, V5Z 4R4 Canada
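For illustration only, here is a minimal Python sketch of the idea behind the planned tool: write every primary read in a SAM file to one FASTQ stream, whether it is paired or not. It is not what bamtools or Picard actually do internally; the function name `sam_to_fastq` and the `/1` and `/2` suffix convention are assumptions made for this example, and the flag bits are the standard SAM ones.

```python
# Translation table for reverse-complementing reads stored on the reverse strand.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_to_fastq(sam_lines):
    """Yield one FASTQ record (as a string) per primary read in a SAM stream.

    Hypothetical helper: paired and unpaired reads go to the same output.
    """
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        if seq == "*" or qual == "*":
            continue  # no sequence or no qualities stored for this record
        if flag & 0x10:
            # The read is stored reverse-complemented; restore the original orientation.
            seq = seq.translate(_COMPLEMENT)[::-1]
            qual = qual[::-1]
        if flag & 0x1:
            # Paired read: mark first (0x40) and second mate (assumed naming convention).
            name += "/1" if flag & 0x40 else "/2"
        yield "@%s\n%s\n+\n%s\n" % (name, seq, qual)
```

Something like `out.writelines(sam_to_fastq(open("in.sam")))` would then produce a single file holding both kinds of reads, which is the behaviour being asked about.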

Mothur count.seqs
by Shane Sturrock 15 Sep '15

I’ve had a report from my users that the count.seqs function in Mothur doesn’t work and just produces an empty table. This is also what I’m seeing when testing it, both on the current July 2015 distribution and on my backup server, which is still running the May 2015 version.

I’ve attached a test set. Running count.seqs on it with the latest Mothur installation seems to work according to the logs, but the output file is empty. The datatype needs to be set to names when this is imported, of course.

I would like to get this working because at the moment my users have to go back to the CLI version to do their work.

Shane

Dr. Shane Sturrock
NZGL BioIT Admin
nzgl(a)biomatters.com
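As a rough cross-check while the wrapper is broken, the expected table can be approximated outside Galaxy. The sketch below assumes the usual two-column names layout (representative name, then a comma-separated list of all names in that group) and assumes a two-column output header; both function names are hypothetical and this is not Mothur's own code.

```python
def count_names(names_lines):
    """Return [(representative, total)] from names-file lines.

    Assumes each line is: representative <whitespace> name1,name2,...
    """
    counts = []
    for line in names_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore blank or malformed lines
        representative, members = parts
        total = len([m for m in members.split(",") if m])
        counts.append((representative, total))
    return counts


def format_count_table(counts):
    """Render the counts as a tab-separated table (header layout is an assumption)."""
    rows = ["Representative_Sequence\ttotal"]
    rows += ["%s\t%d" % (rep, total) for rep, total in counts]
    return "\n".join(rows) + "\n"
```

If this produces rows for the attached test set while the Galaxy output stays empty, that would point at the wrapper rather than at the input file.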

simplifying Galaxy interface to the max (zen-like)
by Piotr Grabowski 15 Sep '15

Dear Galaxy-Devs,

I am developing a tool for biologists and have managed to integrate it with Galaxy (it's a KNIME/R-based machine learning workflow). The tool doesn't require much interaction from the user: just a pasted-in list of IDs and maybe 1-2 text entry fields, then Execute.

Even though the inner workings of Galaxy are great and I have got to grips with them, we believe that for our needs the workflow-based system with a tool and data list is a bit confusing for non-computational people who don't want to spend too much time learning the interface (I know it's not complicated, but we know how people are...).

So what we're aiming for is some sort of Google Search-like front end, as simple as possible. Could anyone point me in the right direction on how to do it? Or has anyone ever done something like this? I imagine it should be easy, since we want to remove almost all elements, not write them from scratch.

Any ideas?

Best,
Piotr

FastQC galaxy issue
by Hakeem Almabrazi 14 Sep '15

Hi,

I have encountered the following issue when I try to use the FastQC tool in Galaxy. The fastq file validates with the fastqvalidator tool, and the same files have been processed by other tools (e.g. bwa) without any complaints about the fastq. Also, if I run FastQC from the command line it executes without any issue.

I have updated my Galaxy repository in case there are new updates, and the FastQC version is v0.11.2.

Is this something to do with the FastQC wrapper in Galaxy? If it helps, the fastq files are on the file system and I link them into Galaxy using the Link option with fastqsanger as the data type.

Any help will be highly appreciated.

...........
Fatal error: Exit code 1 ()
Failed to process L-20417_S7_L007_R2_001.fastq
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: ID line didn't start with '@'
    at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:158)
    at uk.ac.babraham.FastQC.Sequence.FastQFile.<init>(FastQFile.java:89)
    at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:104)
    at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:62)
    at uk.ac.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:122)
    at uk.ac.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:95)
    at uk.ac.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:308)
Traceback (most recent call last):
  File "/gpfs/home/galaxyadmin/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/fastqc/8c650f7f76e9/fastqc/rgFastQC.py", line 162, in <module>
    fastqc_runner.run_fastqc()
  File "/gpfs/home/galaxyadmin/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/fastqc/8c650f7f76e9/fastqc/rgFastQC.py", line 136, in run_fastqc
    self.copy_output_file_to_dataset()
  File "/gpfs/home/galaxyadmin/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/fastqc/8c650f7f76e9/fastqc/rgFastQC.py", line 109, in copy_output_file_to_dataset
    with open(result_file[0], 'rb') as fsrc:
IndexError: list index out of range
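One way to narrow down where the "ID line didn't start with '@'" complaint comes from is to scan the exact file Galaxy hands to the wrapper and report the first record that breaks the four-line FASTQ layout. The sketch below is a hypothetical helper (`first_bad_record` is not part of FastQC or Galaxy) and assumes plain-text, strictly four-line records.

```python
def first_bad_record(lines):
    """Return (record_number, reason) for the first malformed record, or None.

    Assumes strict four-line FASTQ records: ID, sequence, separator, qualities.
    """
    record = []
    number = 0
    for line in lines:
        record.append(line.rstrip("\r\n"))
        if len(record) < 4:
            continue
        number += 1
        header, seq, plus, qual = record
        if not header.startswith("@"):
            return number, "ID line does not start with '@'"
        if not plus.startswith("+"):
            return number, "separator line does not start with '+'"
        if len(seq) != len(qual):
            return number, "sequence and quality lengths differ"
        record = []
    if any(record):
        # Leftover non-empty lines that do not form a full record.
        return number + 1, "truncated record"
    return None
```

Running it as `first_bad_record(open(path))` on the path Galaxy actually passes to the job (rather than on the original file) would show whether the two differ.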

GCC2017 Needs a Host!
by Dave Clements 14 Sep '15

Hello all,

The Galaxy Community Conference has been held annually since 2010. GCC2016 <http://galaxyproject.org/GCC2016> will be held June 25-29 at Indiana University <http://indiana.edu/> in Bloomington, Indiana, United States.

*We are now seeking proposals <https://wiki.galaxyproject.org/Documents?action=AttachFile&do=view&target=G…> from organizations interested in hosting GCC2017.*

Galaxy's informal policy of rotating between North America and elsewhere every other year is now a formal policy: Hosts for GCC2017 need to be located outside of North America.

GCC draws 200+ participants from data-intensive life science research. Participants come from around the world, from all career stages, and do research spanning the tree of life. Universities, hospitals and medical schools, research organizations, and industry are all represented, including some of the largest and most influential research organizations in the world.

What do you need to host GCC2017, you ask? In approximately decreasing order of importance:

- *Enthusiasm to plan and organize several events over 5 days for more than 200 people.*
- Space for 4-6 parallel training sessions over two full days, with each space able to accommodate 25 to 75 participants.
- Central meeting space with capacity for 250-300 people.
- Affordability (or access to sponsorship funds; there's a reason we've always held GCC in academic settings).
- Nearby space for breakouts, poster sessions and sponsors.
- Nearby space for lunch and coffee breaks.
- Good wifi for all events (that's 200+ people and their devices).
- Space for hackathons for 2 days before the event.
- Easy to get to by air.
- Nearby, affordable housing, or easy walking distance or easy public transport from lodging options to conference facilities.
- Close proximity to a pub and other social hubs.

Hosts get copious (and enthusiastic) organizational support from Galaxy's Outreach Team (who have been helping to organize GCC since 2011), and even more copious gratitude (even *adulation!*) from the Galaxy Community.

See the *full call for host proposals <https://wiki.galaxyproject.org/Documents?action=AttachFile&do=view&target=G…>* for more details on what should be covered in proposals. Proposals are due by the end of 30 October 2015.

If you are interested in possibly hosting GCC2017 then please contact Galaxy Outreach <outreach(a)galaxyproject.org> with any questions. Our goal is to announce GCC2017 at GCC2016 this coming June.

Hoping to work with you in 2017!

Dave Clements and the Galaxy Team <https://wiki.galaxyproject.org/GalaxyTeam>

--
http://galaxyproject.org/
http://getgalaxy.org/
http://usegalaxy.org/
https://wiki.galaxyproject.org/

Removing LWR Support from Galaxy
by John Chilton 14 Sep '15

Hello All,

The LWR has been deprecated for over a year now. It has been renamed and replaced by Pulsar, which has many advantages over the previous LWR code base. Instructions on migrating from the LWR to Pulsar can be found here - http://pulsar.readthedocs.org/en/latest/upgrading.html.

There have been several great efforts to clean up the Galaxy codebase lately, and I think it may be time to remove legacy LWR support to help with these. Would there be any objections to removing LWR support from Galaxy with, say, release 15.10 next month?

Thanks,
-John

Subtleties of <version_command> and escaping dollar signs
by Peter Cock 14 Sep '15

Hello all, I've just noticed that dollar signs for accessing environment variables must be escaped in the XML <command> tag, but must not be escaped in the <version_command> tag: https://github.com/peterjc/pico_galaxy/commit/4613a08139a3dfa07c3b0411ac8a9… I presume this is because <command> is parsed as a Cheetah template where dollar means a Python variable (like the input and output parameters), while for the <version_command> this does not happen? Has anyone else been caught out by this? Is it worth adding a note about this to the wiki? https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax Regards, Peter
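To make the asymmetry described above concrete, here is a small Python sketch of a helper that prepares a shell fragment for the <command> tag by backslash-escaping any dollar sign that is not already escaped, while the same fragment would go into <version_command> unchanged. The helper name `escape_for_command_tag` and the variable name `TOOL_DIR` are hypothetical, and the sketch simply encodes the behaviour reported in this message rather than anything taken from the Galaxy source.

```python
import re

# A dollar sign that is not already preceded by a backslash.
_UNESCAPED_DOLLAR = re.compile(r"(?<!\\)\$")


def escape_for_command_tag(shell_fragment):
    """Backslash-escape bare '$' so the template engine leaves it for the shell.

    Only meant for environment-variable references; dollar signs that should
    refer to tool parameters must not be passed through this helper.
    """
    return _UNESCAPED_DOLLAR.sub(r"\\$", shell_fragment)


# The same fragment as it would appear in each tag (hypothetical variable name).
version_command_text = "$TOOL_DIR/run.sh --version"
command_text = escape_for_command_tag(version_command_text)
```

Here `command_text` holds the fragment with a backslash in front of the dollar sign, and `version_command_text` keeps the bare form.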

Opening Galaxy with a specified History Name
by rbrown1422＠comcast.net 14 Sep '15

To all, I would like to open a Galaxy session programmatically (schedule Galaxy with their user name and password) and have the Galaxy session open with a history name I specify. So if there was work done previously under that history name, those output files will appear in the history window -- without the user having to change the history name manually to see them. Is this possible? Thanks, bob
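One building block for this is looking up a history by name through the Galaxy API. The Python sketch below assumes the instance exposes the `/api/histories` listing and accepts an API key as the `key` query parameter; the function names are hypothetical, and the sketch does not cover how to make a browser session switch to the history that is found.

```python
import json
import urllib.parse
import urllib.request


def fetch_histories(base_url, api_key):
    """Fetch the caller's histories as a list of dicts (assumes /api/histories)."""
    query = urllib.parse.urlencode({"key": api_key})
    url = "%s/api/histories?%s" % (base_url.rstrip("/"), query)
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def find_history_id(histories, name):
    """Return the id of the first history with the given name, or None."""
    for history in histories:
        if history.get("name") == name:
            return history.get("id")
    return None
```

A scheduler script could call `find_history_id(fetch_histories(url, key), "my analysis")` and create the history only when the result is None.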

Process stuck in unix 'T' State
by Calvin Morrison 14 Sep '15

Hi guys,

I have a script that is 'running' according to Galaxy, but it never completes. When I looked in the terminal, the processes were stuck in the Unix 'T' (stopped) state, which means they are never going to finish unless something resumes them.

Is Galaxy causing these scripts to go into the T state, or is the problem somewhere else? I'm at a loss.

Thanks,
Calvin Morrison
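For anyone wanting to check their own server for the same symptom, the Python sketch below lists processes whose `ps` state starts with `T` (stopped by a job-control signal) or `t` (stopped under a tracer). It assumes a Linux-style `ps` that accepts `-eo pid,stat,comm`; the function names are hypothetical, and the sketch only reports the state, it does not say what stopped the process.

```python
import subprocess


def stopped_processes(ps_output):
    """Return [(pid, stat, command)] for rows whose STAT begins with 'T' or 't'."""
    found = []
    for row in ps_output.splitlines()[1:]:  # first line is the column header
        parts = row.split(None, 2)
        if len(parts) == 3 and parts[1][:1] in ("T", "t"):
            found.append((int(parts[0]), parts[1], parts[2]))
    return found


def list_stopped():
    """Run ps and return the stopped processes (assumes procps-style options)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,stat,comm"],
        capture_output=True, text=True, check=True,
    )
    return stopped_processes(result.stdout)
```

A stopped process can normally be resumed by sending it SIGCONT, for example with `os.kill(pid, signal.SIGCONT)`, which can help tell a one-off stop from something that keeps re-stopping the job.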

Galaxy configuration
by Clinton Chee 14 Sep '15

Dear Galaxy devs / support,

I am trying to understand the following snippet from job_conf.xml (I want to use drmaa v1):

------------
<destination id="pbs_drmaa_orion" runner="drmaa" tags="merc">
    <param id="destination">galaxy@merc</param>
</destination>
------------

Starting from someone else's configuration above, I am trying to customize it for my system, and I'm trying to understand where and how the parameters are associated:

- tags?
- destination in param id? Is this the queue? I also checked the DRMAA specification but cannot find "destination" as a keyword.
- galaxy@merc? Is merc pointing to the hostname, and galaxy the queue name? (I know the institution's cluster is called "merc", but I don't know if the "merc" in the tag is for convenience or is being read as the hostname.)

I've looked through:

https://wiki.galaxyproject.org/Admin/Config/Performance/ProductionServer
https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster
https://wiki.galaxyproject.org/Admin/Config/Jobs - describes job_conf.xml

but cannot make sense of it. E.g. for "tags" it says "Tags to which this destination belongs.", with the example tags="longwalltime,bigcluster".

Do you have more detailed documentation on configuration?

Thanks
Clinton

Clinton Chee (PhD)
PBS Application Engineer
Altair
Email: chee(a)altair.com
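To separate the pieces of the snippet being asked about, the Python sketch below parses it with the standard library and lays out what is an attribute of the destination (id, runner, tags) and what is a named parameter handed to the runner. It only shows the structure; what a given runner does with the "destination" parameter or with the text "galaxy@merc" is not settled by this code, and the function name `describe_destination` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The snippet quoted in the question.
SNIPPET = """
<destination id="pbs_drmaa_orion" runner="drmaa" tags="merc">
    <param id="destination">galaxy@merc</param>
</destination>
"""


def describe_destination(xml_text):
    """Split a <destination> element into its attributes and its named params."""
    element = ET.fromstring(xml_text.strip())
    # tags is a comma-separated list of labels attached to the destination.
    tags = [t.strip() for t in element.get("tags", "").split(",") if t.strip()]
    # Each <param> is a key/value pair: the id is the key, the text is the value.
    params = {p.get("id"): (p.text or "").strip() for p in element.findall("param")}
    return {
        "id": element.get("id"),
        "runner": element.get("runner"),
        "tags": tags,
        "params": params,
    }
```

Seen this way, "merc" in `tags` and "merc" in `galaxy@merc` are two unrelated strings that happen to match: the first is a label on the destination, the second is part of a parameter value.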
