January 2015 - galaxy-dev - lists.galaxyproject.org

data sharing without duplication?
by Fernandez Edgar 30 Jan '15


Good morning Gents,

I hope everyone is well. I was wondering if it's possible to share data on Galaxy without every user having to copy the data into their own history.

Let me explain: teachers will be sharing data with their students, and the students will need to work with that specific data. However, I would like them to be able to read that data without replicating it in their histories. Is that possible?

Regards,
Edgar Fernandez
System Administrator (Linux)
Direction Générale des Technologies de l'Information et de la Communication
Office: 1-514-343-6111 ext. 16568
Université de Montréal
PAVILLON ROGER-GAUDRY, bureau X-218

3 2

Nothing being tested on Test and main Tool Shed?
by Peter Cock 30 Jan '15


Hello all,

I am currently hoping to review the automated test results for some repositories which I have recently updated, in one case for dependency handling, in the other for functional changes:

https://testtoolshed.g2.bx.psu.edu/view/peterjc/mummer
https://testtoolshed.g2.bx.psu.edu/view/peterjc/ncbi_blast_plus

These have not yet been tested. On further investigation of a sample of my other tools, it appears none of them have been tested on the Test Tool Shed since 2014-09-15, e.g.

https://testtoolshed.g2.bx.psu.edu/view/peterjc/seq_rename
https://testtoolshed.g2.bx.psu.edu/view/peterjc/sample_seqs
https://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats
https://testtoolshed.g2.bx.psu.edu/view/peterjc/effectivet3
https://testtoolshed.g2.bx.psu.edu/view/peterjc/clinod

Similarly, some of my tools on the Main Tool Shed appear not to have been tested since 2014-09-21, e.g.

https://toolshed.g2.bx.psu.edu/view/peterjc/seq_rename
https://toolshed.g2.bx.psu.edu/view/peterjc/effectivet3
https://toolshed.g2.bx.psu.edu/view/peterjc/clinod

or 2014-10-27:

https://toolshed.g2.bx.psu.edu/view/peterjc/sample_seqs
https://toolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats

Is there a known problem with the automated tool testing (previously every second night) on the Tool Sheds? Or have you had to further reduce the testing cycle?

Testing less frequently seems fine, say fortnightly, if this can be supplemented by testing updated tools every night. That would give Tool Authors prompt feedback on their updates, but also catch regressions where changes in Galaxy break a previously working tool.

Regards,
Peter
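
The schedule suggested in the post above (a full pass roughly every fortnight, plus a nightly pass over anything updated since its last test) can be sketched as a small selection function. This is only an illustration of the idea, not how the Tool Shed test framework actually works: the function name `repos_to_test`, the tuple layout, the 14-day default, and the repository names and dates in the example are all made up for the sketch.

```python
from datetime import date, timedelta


def repos_to_test(repos, today, full_cycle_days=14):
    """Pick the repositories that should be tested tonight.

    `repos` is an iterable of (name, last_updated, last_tested) tuples;
    last_tested may be None for a repository that was never tested.
    The tuple layout and the 14-day default are assumptions for this sketch.
    """
    cutoff = today - timedelta(days=full_cycle_days)
    selected = []
    for name, last_updated, last_tested in repos:
        never_tested = last_tested is None
        # Updated after its last test: the author wants prompt feedback.
        updated_since_test = not never_tested and last_updated > last_tested
        # Not tested within the full cycle: catch regressions caused by Galaxy changes.
        overdue = not never_tested and last_tested <= cutoff
        if never_tested or updated_since_test or overdue:
            selected.append(name)
    return selected


# Hypothetical repositories and dates, for illustration only.
example = [
    ("repo_a", date(2015, 1, 28), date(2014, 9, 15)),  # updated since last test
    ("repo_b", date(2014, 9, 1), date(2015, 1, 25)),   # recently tested, unchanged
    ("repo_c", date(2014, 8, 1), date(2014, 9, 15)),   # unchanged but overdue
]
print(repos_to_test(example, today=date(2015, 1, 30)))  # ['repo_a', 'repo_c']
```

The two conditions map onto the two goals named in the post: the "updated since test" check gives prompt feedback on updates, and the "overdue" check is the slower regression sweep.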

4 17

Galactic News!!! February 2015 Edition
by Dave Clements 30 Jan '15


1 0

SGE integration
by Ryan G 29 Jan '15


I'm trying to get my instance of Galaxy working with Sun Grid Engine. The page https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster is not 100% clear on how to do this, but here's what my job_conf.xml looks like:

    <?xml version="1.0"?>
    <job_conf>
        <plugins>
            <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner" workers="4"/>
            <plugin id="sge" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner">
                <param id="drmaa_library_path">/apps/sys/sge/ge2011.11/lib/linux-x64/libdrmaa.so</param>
            </plugin>
        </plugins>
        <handlers>
            <handler id="sge"/>
        </handlers>
        <destinations default="sge">
            <destination id="local" runner="local" />
            <destination id="sge" runner="sge">
            </destination>
        </destinations>
    </job_conf>

I tried to submit a job via Galaxy, and at first the log files indicated that the shell script was created in the job_working_dir and that I was given a job ID, but when the job completed, Galaxy didn't know about it. Now I have no jobs running and I can't submit anything via Galaxy. It doesn't appear to submit them to SGE or even create the directory for the job. paster.log doesn't show anything either.

So, I guess I could use a little help with getting Galaxy running with SGE. I have a simple setup, so nothing fancy. Any ideas where to get started?
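
When debugging a configuration like the one above, one cheap first step is to confirm that the file is at least internally consistent before looking at the cluster side. The sketch below uses only Python's standard `xml.etree.ElementTree` and checks two cross-references: every destination's `runner` names a declared plugin, and the `default` destination exists. The function name `check_job_conf` is hypothetical, and this is not Galaxy's own loader; it says nothing about handlers, DRMAA, or SGE itself.

```python
import xml.etree.ElementTree as ET


def check_job_conf(xml_text):
    """Return a list of cross-reference problems found in a job_conf.xml string.

    An empty list only means these two simple checks passed; it does not
    mean Galaxy will accept or correctly use the configuration.
    """
    root = ET.fromstring(xml_text)
    problems = []

    # Runner plugins declared under <plugins>.
    plugin_ids = {p.get("id") for p in root.findall("./plugins/plugin")}

    destinations = root.find("destinations")
    if destinations is None:
        return ["no <destinations> section found"]

    dest_ids = set()
    for dest in destinations.findall("destination"):
        dest_ids.add(dest.get("id"))
        runner = dest.get("runner")
        if runner not in plugin_ids:
            problems.append(
                "destination %r uses undefined runner %r" % (dest.get("id"), runner)
            )

    default = destinations.get("default")
    if default is not None and default not in dest_ids:
        problems.append("default destination %r is not defined" % default)
    return problems


# Same structure as the configuration in the post (XML declaration omitted).
SAMPLE = """<job_conf>
  <plugins>
    <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner" workers="4"/>
    <plugin id="sge" type="runner" load="galaxy.jobs.runners.drmaa:DRMAAJobRunner">
      <param id="drmaa_library_path">/apps/sys/sge/ge2011.11/lib/linux-x64/libdrmaa.so</param>
    </plugin>
  </plugins>
  <handlers>
    <handler id="sge"/>
  </handlers>
  <destinations default="sge">
    <destination id="local" runner="local"/>
    <destination id="sge" runner="sge"/>
  </destinations>
</job_conf>"""

print(check_job_conf(SAMPLE))  # []
```

The posted file passes both checks, so if there is a configuration problem it would have to lie elsewhere, for example in parts this sketch deliberately does not inspect.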

1 0

Re: [galaxy-dev] NFS configuration questions + NFS /etc/fstab configuration issue with AMI issue
by Dannon Baker 29 Jan '15


Hey Scott,

Glad you figured out the other two issues. Which file exactly are you editing that is getting overwritten? And when does it get overwritten (on reboot, randomly, on cluster terminate/start, etc.)?

-Dannon

On Sun Jan 25 2015 at 12:20:12 PM Scott Jeschonek <scottj(a)averesystems.com> wrote:

> Revising all of the below…
>
> I figured out the configuration location and file system issues.
>
> The only issue I have now is that my configured universe_wsgi.ini file is
> being overwritten with defaults. I modified it in the s3 bucket and
> locally but to no avail. Any ideas?
>
> On Jan 24, 2015, at 12:17 PM, Scott Jeschonek <scottj(a)averesystems.com>
> wrote:
>
> Hi,
>
> I’m setting up a ‘local’ Galaxy server in AWS and had a few questions that
> I can’t seem to answer.
>
> 1. *Directory configuration in universe_wsgi.ini*: I want to use an NFS
> share from another server. The purpose would be to deposit results there.
> I may also want to point to another NFS server as a centralized reference /
> index file system. I tried changing the path for the Indices setting and
> restarted Galaxy, but CloudMan is showing the file system as “error”.
> Basically I am trying to wrap my head around the steps for re-pointing to
> NFS mounts (is it just a matter of changing the paths in universe_wsgi.ini
> and then restarting everything?)
>
> 2. *CloudMan AMI and fstab*: I’m using the latest CloudMan AMI instance
> and it seems to have an issue with NFS configurations in the fstab. I am
> not quite sure what is going on; the entry is correct. I’m able to
> manually mount the filesystems as well, so the paths are correct. Is there
> something specific I need to change on that image?
>
> Thanks in advance for any assistance!
>
> Scott Jeschonek
> Avere Systems
>
> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client. To manage your subscriptions to this
> and other Galaxy lists, please use the interface at:
> https://lists.galaxyproject.org/
>
> To search Galaxy mailing lists use the unified search at:
> http://galaxyproject.org/search/mailinglists/

2 1

Data manager .loc file not installing in correct place
by Jeremy Liu 29 Jan '15


Galaxy Devs,

I have a (semi) working data manager and I'm stuck on what exactly is going wrong. Most of it appears to be working properly: the intended file (pouya_test_motifs.bed.bgz) is being downloaded to the right directory (/galaxy-dist/tool-data/motifs). However, the .loc file that describes the data table isn't being written to the right place. It should be going to /galaxy-dist/tool-data/motif_databases.loc, but it ends up in ./database/tmp/tmp-toolshed-gmfcrTJ8T4R/motif_databases.loc. I think it's because /galaxy-dist/tool_data_conf.xml isn't being updated.

In addition, I have been working off of the example data manager data_manager_fetch_genome_all_fasta. What is tool_data_table_conf.xml.sample used for? Does it get automatically merged with tool_data_table_conf.xml in the Galaxy root directory?

For reference, the tool is "region_motif_data_manager" in the testtoolshed. Has anyone else run into this problem?

Thanks!
Jeremy Liu
jeremy.liu(a)yale.edu
PO Box 207077
New Haven, CT 06520-7077
352.871.1258

2 2

Galaxy XML from python's argparse
by Eric Rasche 26 Jan '15


Howdy devs,

I put together a library recently, and since it seems to be functional I thought I'd share it with the rest of -dev in case it's of interest to anyone. If anyone has feedback, bugs/issues, or PRs, I'd be happy to receive them!

https://github.com/erasche/gxargparse

So, what is it?

gxargparse is a drop-in replacement for argparse which can generate Galaxy Tool XML on demand. When I say "drop-in" replacement, I mean it. Through some python magic, as soon as you `pip install gxargparse`, your argparse will (maybe*) be transparently wrapped by gxargparse, and you'll have the --generate_galaxy_xml flag available, which will generate Galaxy Tool XML.

This means *no code changes required*, and free tool XML generation. However, beware that it is "free tool XML"; you will likely need to make some manual corrections to it before publishing tools (repeat labels, for instance). Still, if you're converting an argparse tool with hundreds of arguments for use in Galaxy, this could save you a lot of initial manual work.

* I say *maybe* because it depends a bit on your python module load order, which is something completely outside of my control. The package comes with a command line tool <https://github.com/erasche/gxargparse#it-doesnt-work> which spits out a path you can stick in PYTHONPATH to fix this issue.

Where to get it?

Now available on pypi (gxargparse) and github (erasche/gxargparse).

I *strongly* recommend against installing it system wide, as any bugs in it could render all argparse based python tools broken on your system. It's much more reasonable to use it in a virtualenv.

Known Problems

- argument_groups are not dealt with specially
- prefix_chars and other lesser used features are not (yet) supported
- anything with a repeat is a bit of a hack
- no translation from argparse to conditionals/which yet figured out

Bug reports/suggestions are welcome <https://github.com/erasche/gxargparse/issues/>!

Cheers,
Eric

--
Eric Rasche
Programmer II
Center for Phage Technology
Rm 312A, BioBio
Texas A&M University
College Station, TX 77843
404-692-2048
esr(a)tamu.edu
rasche.eric(a)yandex.ru
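
To make the "no code changes required" claim in the post above concrete, here is the kind of ordinary argparse script it refers to. The sketch uses only the standard library; the tool, its arguments, and the function name `build_parser` are invented for illustration and do not come from gxargparse.

```python
import argparse


def build_parser():
    """Build a plain argparse parser; nothing here is specific to Galaxy."""
    parser = argparse.ArgumentParser(
        description="Filter lines of a text file by length (hypothetical tool)."
    )
    parser.add_argument("input", help="path to the input text file")
    parser.add_argument(
        "--min-length", type=int, default=0,
        help="keep only lines at least this many characters long",
    )
    parser.add_argument(
        "--skip-blank", action="store_true",
        help="drop empty lines before filtering",
    )
    return parser


# Parse an explicit argument list so the example runs without a command line.
args = build_parser().parse_args(["reads.txt", "--min-length", "5"])
print(args.input, args.min_length, args.skip_blank)
```

In a real script the parser would be called as `parse_args()` with no arguments so that it reads the command line. According to the post, once gxargparse is installed and picked up ahead of the standard module, running such a script with --generate_galaxy_xml should emit a first draft of the tool XML; that behaviour is taken from the post and is not exercised by this sketch.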

1 0

Re: [galaxy-dev] CloudMan + Ansible + AWS
by Brad Chapman 26 Jan '15


Enis, John and all;

I spotted ansible-cloudman on GitHub today, which reminded me that I've been meaning to write about the approach we set up late last year to run bcbio on AWS. It uses elasticluster (https://github.com/gc3-uzh-ch/elasticluster), which has the advantage of being all Ansible scripts and bootstrapping from standard images, so no more making AMIs. It also uses SLURM instead of SGE, which is a nice change.

We wrote an interface that automates all of the stuff you need to set up on AWS: IAM users, VPCs and whatnot. It is a pretty streamlined process from the command line, including specifying the cluster size and stopping/starting it:

https://bcbio-nextgen.readthedocs.org/en/latest/contents/cloud.html#aws-set…

I also wrote up some benchmarking work to give an idea of using it in practice:

http://bcb.io/2014/12/19/awsbench/

All of the code is here:

https://github.com/chapmanb/bcbio-nextgen-vm

As always, happy to overlap/share with whatever y'all decide to do. We could make the bcbio-specific stuff optional as needed, although it is pretty lightweight: just the driver scripts and a Docker image of bcbio. It's basically a ready-to-use cluster with this little extra added, so hopefully it could be useful for future plans with CloudMan.

Hope this is useful,
Brad

2 1

data collections - workflow - bug?
by Torsten Houwaart 26 Jan '15


Hello Galaxy Devs,

I was using data collections (for the first time) for a new workflow of ours and I ran into this problem. There was no complaint by the workflow-editor and I could start the workflow but then <see below> happened. If you need more information about the workflow or otherwise, let me know.

Best,
Torsten H.

job traceback:

Traceback (most recent call last):
  File "/usr/local/galaxy/galaxy-dist/lib/galaxy/jobs/runners/__init__.py", line 565, in finish_job
    job_state.job_wrapper.finish( stdout, stderr, exit_code )
  File "/usr/local/galaxy/galaxy-dist/lib/galaxy/jobs/__init__.py", line 1250, in finish
    self.sa_session.flush()
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/scoping.py", line 114, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/session.py", line 1718, in flush
    self._flush(objects)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/session.py", line 1789, in _flush
    flush_context.execute()
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/unitofwork.py", line 331, in execute
    rec.execute(self)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/unitofwork.py", line 475, in execute
    uow
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/persistence.py", line 59, in save_obj
    mapper, table, update)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/orm/persistence.py", line 485, in _emit_update_statements
    execute(statement, params)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/engine/base.py", line 1449, in execute
    params)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/engine/base.py", line 1584, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/engine/base.py", line 1698, in _execute_context
    context)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/engine/base.py", line 1691, in _execute_context
    context)
  File "/usr/local/galaxy/galaxy-dist/eggs/SQLAlchemy-0.7.9-py2.7-linux-x86_64-ucs2.egg/sqlalchemy/engine/default.py", line 331, in do_execute
    cursor.execute(statement, parameters)
DBAPIError: (TransactionRollbackError) deadlock detected
DETAIL: Process 3144 waits for ShareLock on transaction 2517124; blocked by process 3143.
Process 3143 waits for ShareLock on transaction 2517123; blocked by process 3144.
HINT: See server log for query details.
'UPDATE workflow_invocation SET update_time=%(update_time)s WHERE workflow_invocation.id = %(workflow_invocation_id)s' {'update_time': datetime.datetime(2015, 1, 26, 14, 20, 4, 155440), 'workflow_invocation_id': 5454}

2 1

Tool development - Selecting a single item from input dataset.
by Vimalkumar Velayudhan 26 Jan '15


Hi all,

I am trying to create a select box with the possibility of selecting only a single item from the input dataset (figure 1). This works fine, but the option for selecting multiple files is still visible (figure 2). The multiple="false" attribute has no effect.

Figure: http://i.imgur.com/oJVFCoF.png

I have the following in my XML:

    <param format="tabular" name="ribo_files" type="data"
           label="Select Ribo-Seq alignment file" multiple="false" >
    </param>

Any suggestions?

galaxy-dist revision 5f4c13d622b8

Regards,
Vimalkumar Velayudhan

3 3