Creating new dataset collections in a workflow
by Aaron Petkau
Hey,
So, I've been working on a tool which will produce a new dataset collection
as output. I was following some of the instructions from
https://bitbucket.org/galaxy/galaxy-central/pull-requests/582/allow-tools....
I managed to get the tool itself working, but when I go to use it in a
workflow I'm getting errors. Mainly:
History does not include a dataset collection of the correct type or
containing the correct types of datasets
I'm wondering if there's something I'm doing wrong, or if tools which
produce dataset collections are not supported within workflows? I'm
working with the second case in that merge request, using an input list as
the structure for my output list.
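For reference, the pattern in question declares the output collection's structure from an input list along these lines (a sketch with placeholder names, not my actual tool):

```xml
<outputs>
  <!-- the output list mirrors the structure of the selected input list -->
  <collection name="output_list" type="list" structured_like="input_list"
              inherit_format="true" label="Processed collection"/>
</outputs>
```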
Thanks,
Aaron
6 years, 11 months
Remote Hackathon for Tools and Dataset Collections
by John Chilton
Hello all,
We are planning a remote hackathon for Galaxy tool developers to hack
on dataset collections. Dataset collections enable MapReduce style
workflows in Galaxy and have come a long way over the last year+.
Several groups are using dataset collections and newer tools to
express workflows of various degrees of complexity that were not
possible in Galaxy before.
The remote nature of this should give people who don't have the
opportunity to come to GCC hackathons (which have always been
productive and a lot of fun) a chance to participate in a Galaxy
hackathon. Hopefully having a well defined topic will allow us to
accomplish a lot and let people who don't have particular tasks in
mind find something to work on very quickly.
We have been collecting ideas to work on here -
https://github.com/galaxyproject/tools-iuc/issues/239, but we expect
most participation to come from tool developers who want help adding
collection support to their existing tools and workflows simply
showing up and taking part.
If you are interested in participating in the hackathon but not
interested in actual tool development, we will assemble a list of
smaller, manageable Python and JavaScript tasks to work on, and
documentation is chronically lacking for collections, so we could use
help there too; no actual coding would be required.
This is the first remote hackathon we have organized, so if you have
ideas or advice about how to run it, please let us know. We are
currently thinking of a two-day event in which a core group of us
would be available on IRC all day, with four Google Hangouts across
those days to organize, answer questions, and report progress. We are
currently thinking September 17th and 18th.
Thanks,
Galaxy IUC
cancelled by slurm -> job is fine
by Alexander Vowinkel
Hi,
I have a big problem here.
Jobs that are cancelled by slurm appear to Galaxy as having finished normally.
For me this is especially bad because all following workflow steps
go on working with corrupted/empty/whatever data.
In the job stderr I can find:
slurmd[w4]: *** JOB 194 CANCELLED AT 2015-08-06T04:04:38 ***
slurmd[w4]: Unable to unlink domain socket: No such file or directory
slurmd[w4]: unlink(/tmp/slurm/slurmd_spool/job00194/slurm_script): No
such file or directory
slurmd[w4]: rmdir(/tmp/slurm/slurmd_spool/job00194): No such file or directory
In the galaxy log for that job:
galaxy.jobs.runners.drmaa DEBUG 2015-08-06 04:02:10,050 (4278)
submitting file
/mnt/galaxy/tmp/job_working_directory/004/4278/galaxy_4278.sh
galaxy.jobs.runners.drmaa INFO 2015-08-06 04:02:10,056 (4278) queued as 192
galaxy.jobs DEBUG 2015-08-06 04:02:10,185 (4278) Persisting job
destination (destination id: slurm_cluster)
[...]
galaxy.jobs.runners.drmaa DEBUG 2015-08-06 04:04:39,525 (4278/192)
state change: job finished normally
galaxy.jobs DEBUG 2015-08-06 04:04:45,806 job 4278 ended (finish() executed in (5290.522 ms))
galaxy.datatypes.metadata DEBUG 2015-08-06 04:04:45,837 Cleaning up
external metadata files
galaxy.datatypes.metadata DEBUG 2015-08-06 04:04:46,100 Failed to
cleanup MetadataTempFile temp files from
/mnt/galaxy/tmp/job_working_directory/004/4278/metadata_out_HistoryDatasetAssociation_9426_Ib1Niz:
No JSON object could be decoded
galaxy.datatypes.metadata DEBUG 2015-08-06 04:04:46,397 Failed to
cleanup MetadataTempFile temp files from
/mnt/galaxy/tmp/job_working_directory/004/4278/metadata_out_HistoryDatasetAssociation_9427_8X77j4:
No JSON object could be decoded
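As a stopgap, the cancellation is at least visible in the captured stderr, so something like the following check could flag these jobs until the runner reports the state correctly (a hypothetical sketch, not anything Galaxy does itself):

```python
import re

def job_was_cancelled(stderr_text):
    # slurmd writes "*** JOB <id> CANCELLED AT <timestamp> ***" to the
    # job's stderr when it kills a job, so look for that marker.
    return re.search(r"\*\*\* JOB \d+ CANCELLED AT \S+ \*\*\*", stderr_text) is not None

stderr = "slurmd[w4]: *** JOB 194 CANCELLED AT 2015-08-06T04:04:38 ***"
print(job_was_cancelled(stderr))  # → True
```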
Can someone please check that?
Best,
Alexander
Data upload as part of workflow?
by Scott Szakonyi
Hello all,
I have a request to make data upload functions available as part of a
workflow. The majority of data retrieval functions are not available as
workflow items in the default system. Is there a simple way to enable
individual upload tools for workflow use, or would that involve serious
mucking around in the core Galaxy software?
Thanks!
--
Scott B. Szakonyi
Research Programmer
*Center for Research Computing*
107 Information Technology Center
Notre Dame, IN 46556
http://crc.nd.edu
Externalizing Galaxy config in docker-galaxy-stable
by Oksana Korol
Hi,
My question is to the docker-galaxy-stable community. I would like to use
or extend this image with some Galaxy settings externalized. For instance,
I would like to define my own GALAXY_UID and GALAXY_HOME environment
variables. I have tried the -e flag when building and running the container,
but that doesn't work:
> docker run -d -p 8080:80 -p 8021:21 -e "GALAXY_UID=1777" -e
"GALAXY_HOME=/home/galaxy/env_test" --name galaxy-env-test
bgruening/galaxy-stable
...
> docker exec -ti galaxy-env-test bash
># getent passwd galaxy
galaxy:x:1450:1450:Galaxy user:/home/galaxy:
As you can see from above, GALAXY_UID is 1450, as hardcoded in the
Dockerfile, and not 1777, as I've specified. Same goes for the home
directory.
Is there any other way to set those variables? If not, what would be
the best way to proceed? Ideally, I would like to extend the
galaxy-stable Docker image rather than change the existing one. Currently, I
see no option but to fork
https://github.com/bgruening/docker-galaxy-stable and change the Dockerfile
to externalize those (and other) variables; I hope there are better
suggestions than this.
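For what it's worth, one way I could imagine extending the image without forking it is a small child Dockerfile that remaps the user after the fact (an untested sketch; it assumes the image's user is literally named galaxy and its home is /home/galaxy):

```dockerfile
FROM bgruening/galaxy-stable

# Remap the hardcoded galaxy user/group to the desired UID/GID
# (1777 here) and fix ownership of the home directory.
RUN usermod -u 1777 galaxy && \
    groupmod -g 1777 galaxy && \
    chown -R galaxy:galaxy /home/galaxy
```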
Cheers,
Oksana
Galaxy run.sh process crashing
by Hans Vasquez-Gross
Hi,
We have been running Galaxy on our website for the last year to run
short scripts (< 10 seconds) and have had no issues.
However, we just upgraded to the latest Galaxy source and set up the tools
necessary for NGS mapping and assembly. Our Galaxy instance keeps crashing
while jobs are running; it seems to happen only when jobs have been running
for a longish amount of time.
I've already converted the database to use PostgreSQL. The paster.log
doesn't have any informative error messages. Any suggestions on how to fix
or further troubleshoot this issue would be appreciated.
Thank you,
-Hans
LSF cluster weird behaviours!
by Hakeem Almabrazi
Hello everyone,
I posted this earlier but I am afraid it did not go through; I hope this one does :).
I was able to setup galaxy to work with our HPC cluster using the LSF scheduler. So far so good except with few exceptions:
1) I noticed that after a long idle period (for example overnight), submitted jobs do not get executed and do not show up in the queue when I run the "bjobs" command from the command line, as if they were never submitted to LSF. However, if I then submit a job from the command line (e.g. >bsub sleep 5) and check the queue with bjobs, I see that job as well as the other jobs that I could not see before.
Weird ....
Has anyone seen this behavior before? Is this related to galaxy setup? Is there anything I should try out to get rid of such behavior?
2) Also related to the LSF setup: every time I restart Galaxy it does not come back up; instead it crashes. If I then start it again, it starts fine. Here is the message I keep seeing after the first restart:
"galaxy.jobs.runners.state_handler_factory DEBUG 2015-08-04 08:12:17,484 Loaded 'failure' state handler from module galaxy.jobs.runners.state_handlers.resubmit "
Any idea how to get rid of this as well? Is this a job still in the database that I need to clean up manually? If so, can you tell me which table(s) to look into to clear it out?
3) Finally, how do I control the resources (i.e. cores) given to a submitted job in Galaxy?
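(For context: my understanding is that resource requests are usually set per destination in Galaxy's job_conf.xml, with the drmaa runner passing a nativeSpecification string through to bsub. A sketch with placeholder ids and values, not verified against our setup:)

```xml
<destinations default="lsf_4core">
  <destination id="lsf_4core" runner="drmaa">
    <!-- hypothetical LSF options: 4 cores, 8 GB memory -->
    <param id="nativeSpecification">-n 4 -M 8000000</param>
  </destination>
</destinations>
```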
Thank you in advance for any tips or hints to resolve these issues.
Best regards,
Hak
Disclaimer: This email and its attachments may be confidential and are intended solely for the use of the individual to whom it is addressed. If you are not the intended recipient, any reading, printing, storage, disclosure, copying or any other action taken in respect of this e-mail is prohibited and may be unlawful. If you are not the intended recipient, please notify the sender immediately by using the reply function and then permanently delete what you have received. Any views or opinions expressed are solely those of the author and do not necessarily represent those of Sidra Medical and Research Center.
Sample Tracking Feature in Galaxy
by Hakeem Almabrazi
Hi,
I am trying to see if we can use the sample tracking features in Galaxy to collect some data from the technicians. However, I have no idea where to start; I guess I do not know how to set it up in Galaxy.
Here is what I have done so far: from the admin panel, I created two forms, a Sequencing Request Form and a Sequencing Sample Form.
When I want to create a new sequencing request type, it tells me that I need the two forms: "Creating a new request type requires two form definitions, a Sequencing Request Form, and a Sequencing Sample Form, which must be created first. Click the Create new form button to create them." However, I have already created them; the request type creator does not see them. How do I get past this point?
Also, some online tutorials show a menu called "Lab" in the main menu of Galaxy. I do not see that in mine. Is there something I need to set up here?
I appreciate your help.
Regards
passing the name attribute from the DB to a sniffer
by Josh Woodring
Dear all,
I am working on porting crop modelling tools into a custom version of
Galaxy; unfortunately, the files are rigidly formatted in similar
ways and not easily distinguished. As such, while I can ensure that the
tools I write create valid output files, I can't write a simple
sniffer that will figure out the type of an uploaded file from keywords or
content formatting alone. It seems my most practical option is to
import the name attribute from the history_dataset_association and test
whether the name has the right extension and internal formatting style. I
just don't know how I could access that information in the sniffer code.
Does anyone know how to do this, or where I could add things properly in
the code myself?
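For context, my understanding is that a datatype's sniff method only receives the file path, so content is all it can look at. A minimal sketch of the shape I mean (the class name and header keyword are hypothetical, and in Galaxy this would subclass a real datatype):

```python
class CropModelFormat:
    """Hypothetical datatype; in Galaxy this would subclass data.Text."""
    file_ext = "crop"

    def sniff(self, filename):
        # Galaxy hands the sniffer only a path to the file on disk;
        # the original upload name is not available here, so the
        # decision must come from content alone.
        with open(filename) as handle:
            # "*EXP.DETAILS" is a made-up header keyword for illustration.
            return handle.readline().startswith("*EXP.DETAILS")
```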
Thanks
Josh
--
from Joshua Woodring (woodring.josh94(at)gmail.com)
Nil mihi rescribas, tu tamen ipse veni!
install_tool_shed_repositories.py returns HTTP Error 400: Bad Request
by Mic
Hello,
I found this shed_tool_conf.xml (
https://raw.githubusercontent.com/galaxyproject/usegalaxy-playbook/c55aa0...)
file in the usegalaxy-playbook repository. I copied the information for
snpSift_filter out of the XML file and pasted it into the command below:
$ python
/Users/lorencm/galaxy/scripts/api/install_tool_shed_repositories.py -a
4bb8f6eb85efca463d2e60b9704d6caf -l http://localhost:8080
-n snpSift_filter -o pcingola -r c052639fa666
--panel-section-name 'snpEff' --repository-deps --tool-deps -u
http://toolshed.g2.bx.psu.edu/
However, I received the following error message:
HTTP Error 400: Bad Request
{"traceback": "Traceback (most recent call last):\n File
\"/Users/lorencm/galaxy/lib/galaxy/web/framework/decorators.py\", line 251,
in decorator\n rval = func( self, trans, *args, **kwargs)\n File
\"/Users/lorencm/galaxy/lib/galaxy/webapps/galaxy/api/tool_shed_repositories.py\",
line 246, in install_repository_revision\n payload )\n File
\"/Users/lorencm/galaxy/lib/tool_shed/galaxy_install/install_manager.py\",
line 704, in install\n changeset_revision )\n File
\"/Users/lorencm/galaxy/lib/tool_shed/galaxy_install/install_manager.py\",
line 505, in __get_install_info_from_tool_shed\n raise
exceptions.RequestParameterInvalidException( invalid_parameter_message
)\nRequestParameterInvalidException: No information is available for the
requested repository revision.\nOne or more of the following parameter
values is likely invalid:\ntool_shed_url:
http://toolshed.g2.bx.psu.edu/\nname: snpSift_filter\nowner:
pcingola\nchangeset_revision: c052639fa666\n\n", "err_msg": "No information
is available for
I also posted it on Github (
https://github.com/galaxyproject/usegalaxy-playbook/issues/8)
How can I fix this problem?
Thank you in advance.
Mic