Running jobs as real user and extra_file_path
by Louise-Amélie Schmitt
Hi everyone,
I just wanted to ask how the extra_file_path is handled when jobs run
as the real user, since the file_path is only writable by the galaxy
user. Any clue?
Thanks,
L-A
5 years, 9 months
error loading files into galaxy
by Hakeem Almabrazi
Hi,
I started getting the following error whenever I try to load a file into my local Galaxy instance.
Traceback (most recent call last):
  File "/usr/local/galaxy/galaxy-dist/tools/data_source/upload.py", line 8, in <module>
    from galaxy import eggs
ImportError: cannot import name eggs
I would appreciate it if someone could tell me what might cause this issue and how to resolve it.
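One possible cause of an error like this (an assumption; nothing in the report confirms it) is that Python resolves the name "galaxy" to something other than the package shipped with galaxy-dist, for example a stray galaxy.py or another installed package of the same name. A minimal sketch for checking where a package would be imported from is shown below. The helper name locate_package is hypothetical, and the code is written for Python 3; on a Python 2 interpreter, printing galaxy.__file__ after "import galaxy" gives the same information.

```python
import importlib.util


def locate_package(name):
    """Return the file a top-level module or package would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Nothing called `name` is visible on sys.path.
        return None
    return spec.origin


if __name__ == "__main__":
    # "galaxy" is the package named in the traceback above.
    print(locate_package("galaxy"))
```

If the printed path is not inside the Galaxy checkout, the import is picking up a different module than the one upload.py expects.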
Regards,
6 years, 10 months
Upload issue in local installation
by Batsal Devkota
I installed Galaxy locally on a Linux server. However, I cannot upload files, no matter how small (I have tried FASTA files of a few kB). When I try to upload, the link to the file shows up in the History and gets a new number (purple box). When I click on the link, I get 'Dataset is uploading' forever.
In the terminal window where I start galaxy, I get the following error report:
92.17.41.13 - - [02/Aug/2012:15:33:32 -0400] "GET / HTTP/1.1" 200 - "-" "Mozilla/5.0 (Windows NT 5.2; rv:13.0) Gecko/20100101 Firefox/13.0.1"
92.17.41.13 - - [02/Aug/2012:15:33:32 -0400] "GET /root/tool_menu HTTP/1.1" 200 - "http://redhat:8080/" "Mozilla/5.0 (Windows NT 5.2; rv:13.0) Gecko/20100101 Firefox/13.0.1"
92.17.41.13 - - [02/Aug/2012:15:33:32 -0400] "GET /history HTTP/1.1" 200 - "http://redhat:8080/" "Mozilla/5.0 (Windows NT 5.2; rv:13.0) Gecko/20100101 Firefox/13.0.1"
92.17.41.13 - - [02/Aug/2012:15:33:33 -0400] "POST /root/user_get_usage HTTP/1.1" 200 - "http://redhat:8080/history" "Mozilla/5.0 (Windows NT 5.2; rv:13.0) Gecko/20100101 Firefox/13.0.1"
92.17.41.13 - - [02/Aug/2012:15:33:40 -0400] "GET /tool_runner?tool_id=upload1 HTTP/1.1" 200 - "http://redhat:8080/root/tool_menu" "Mozilla/5.0 (Windows NT 5.2; rv:13.0) Gecko/20100101 Firefox/13.0.1"
92.17.41.13 - - [02/Aug/2012:15:33:59 -0400] "POST /tool_runner/upload_async_create HTTP/1.1" 200 - "http://redhat:8080/" "Mozilla/5.0 (Windows NT 5.2; rv:13.0) Gecko/20100101 Firefox/13.0.1"
92.17.41.13 - - [02/Aug/2012:15:33:59 -0400] "GET /tool_runner/upload_async_message HTTP/1.1" 200 - "http://redhat:8080/" "Mozilla/5.0 (Windows NT 5.2; rv:13.0) Gecko/20100101 Firefox/13.0.1"
92.17.41.13 - - [02/Aug/2012:15:33:59 -0400] "GET /history HTTP/1.1" 200 - "http://redhat:8080/tool_runner/upload_async_message" "Mozilla/5.0 (Windows NT 5.2; rv:13.0) Gecko/20100101 Firefox/13.0.1"
92.17.41.13 - - [02/Aug/2012:15:33:59 -0400] "POST /root/user_get_usage HTTP/1.1" 200 - "http://redhat:8080/history" "Mozilla/5.0 (Windows NT 5.2; rv:13.0) Gecko/20100101 Firefox/13.0.1"
92.17.41.13 - - [02/Aug/2012:15:34:04 -0400] "POST /root/history_item_updates HTTP/1.1" 200 - "http://redhat:8080/history" "Mozilla/5.0 (Windows NT 5.2; rv:13.0) Gecko/20100101 Firefox/13.0.1"
92.17.41.13 - - [02/Aug/2012:15:34:08 -0400] "POST /root/history_item_updates HTTP/1.1" 200 - "http://redhat:8080/history" "Mozilla/5.0 (Windows NT 5.2; rv:13.0) Gecko/20100101 Firefox/13.0.1"
The last line keeps repeating forever; a new line is written every 4 seconds.
I am stuck and don't know where to look. Please help.
Batsal.
6 years, 11 months
filtering a <param> of type 'data' so only one type is available
by Dan Tenenbaum
Hi,
I have a tool wrapper with a <param> of type="data".
Currently, this renders as a text box with a drop down list that has
52 items in it (the number of things I have in my history, I guess).
I'd like to filter this list so that only some items in the history
(in my case any item whose name ends with '.rda') are shown in this
dropdown.
It seems like maybe the 'format' parameter to the 'param' tag is what
I want, but:
1) I tried format="rda" and that didn't seem to change anything.
2) "rda" is not in the datatypes.conf file which the documentation
suggests is required?
(.rda is an extension used for serialized R objects.)
The tool I'm working on only accepts rda files as input and the reason
I'm asking for this is that it is all too easy to accidentally feed it
a file of some other type. If the dropdown could be filtered so that
only items which will work with the tool are shown, that would be
great.
Is there a way to do that?
Thanks,
Dan
8 years, 3 months
Workflow param error leads to template error
by Paul-Michael Agapow
Looking for pointers on what might be causing this problem:
A user had a moderately complicated workflow that, when run with certain parameter values, results in a traceback that ends:
Module workflow_run_mako:232 in render_body
>> for ic in oc.input_step.module.get_data_inputs():
AttributeError: 'WorkflowStep' object has no attribute 'module'
After some puzzling, I realized one of the parameters was outside the allowed range (a max on an integer param) and that Galaxy was trying to render "run.mako" to flag the error, but erroring out in the process. Why Galaxy tries to access this non-existent attribute is unclear to me. Any insight or places I should start exploring?
Galaxy version: various (started with a 6-month-old one, upgraded to the latest production release to see if that would fix the error; it didn't)
Hosting OS: CentOS 6
------
Paul Agapow (paul-michael.agapow(a)hpa.org.uk)
Bioinformatics, Health Protection Agency (UK)
8 years, 4 months
Re: [galaxy-dev] Job execution order mixed-up
by Jean-Francois Payotte
Hi John,
Thank you for your answer and for trying to help. This is greatly
appreciated!
I haven't really made any progress in tracking down this error, and
hopefully this weird behaviour will not happen anymore with the November
4th distribution.
But here are my answers to your questions, in case it would ring a bell:
Has this behaviour been reported with any other workflow?
It has been reported with 2 different workflows as of now. These 2
workflows don't have anything in common, except that they are huge (one
of them has 37 steps, producing a total of about 110 datasets).
Are you running Galaxy as a single process or multiple processes? If
multiple processes, how many web, handler and manager processes do you
have and are they all on the same machine?
We are running Galaxy in multiple processes with 5 web servers, 3 job
handlers and no manager (I believe the manager was rendered obsolete in
one of the latest Galaxy distributions). All these processes are run on
the same machine.
Have you made any modifications to Galaxy that could result in this
behaviour?
No.
What is the value of track_jobs_in_database in your universe_wsgi.ini
configuration file?
We never touched this part of the configuration file and the line still
reads: "#track_jobs_in_database = None".
After reading your answer, I've decided to modify this line to:
"track_jobs_in_database = True"
Unfortunately, after running one of the faulty workflows several times (5x),
I noticed that one of the runs still showed this strange behaviour, where
some jobs were executed before their inputs were ready.
Do you think this issue could be related to the fact that we are using
Galaxy with the multiple processes configuration? We implemented this
configuration some time ago because some of our users were complaining
about the slow responsiveness of the web interface.
Would you recommend using Galaxy without the multiple processes
configuration? (Let's say, if updating to the November 4th distribution
doesn't fix this issue.)
I guess you are probably using the multiple processes configuration as
well on Galaxy main?
Thanks again for your help!
Jean-François
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Posted by John Chilton on Nov 09, 2013; 2:50pm
Hello Jean-François,
Have you made any progress tracking down this error? This appears very
serious, but to tell you the truth I have no clue what could cause it. The
distribution you are using is pretty old at this point, and I feel like if
it were a bug that is exhibited under relatively standard parameter
combinations, someone else would have reported it by now.
Can you tell me some things: has this been reported with any other
workflows? Is there anything special about this workflow? Can you rebuild
the workflow and see if the error occurs again?
Additional questions if the problem is not restricted to the workflow:
are you running Galaxy as a single process or multiple processes? If
multiple processes, how many web, handler, and manager processes do you
have? Are they all on the same machine? Have you made any modifications to
Galaxy that could result in this behavior? What is the value of
track_jobs_in_database in your universe_wsgi.ini configuration file?
-John
On Thu, Nov 7, 2013 at 10:34 AM, Jean-Francois Payotte <[hidden email]>
wrote:
Dear Galaxy mailing-list,
Once again I come seeking your help. I hope someone has already had this
issue or will have an idea of where to look to solve it. :)
One of our users reported having workflows failing because some steps were
executed before all their inputs were ready.
You can find a screenshot attached, where we can see that step (42) "Sort
on data 39" has been executed while step (39) is still waiting to run
(gray box).
This behaviour has been reproduced with at least two different Galaxy
tools (one custom, and the sort tool which comes standard with Galaxy).
This behaviour seems to be somewhat random: when a workflow where this
issue occurs was run twice, steps were executed in the wrong order in only
one of the two runs.
I could be wrong, but I don't think this issue is grid-related as, from my
understanding, Galaxy is not using SGE's job dependency functionality.
I believe all jobs stay in some internal queue (within Galaxy) until all
input files are ready, and only then is the job submitted to the cluster.
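The behaviour described in the previous paragraph (a job is held until every input is ready) can be illustrated with a minimal sketch. This is not Galaxy's job handler code: the Job class, the ready_jobs function, and the "ok" and "queued" state strings are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical stand-in for a queued job; not Galaxy's model class."""
    name: str
    input_ids: list = field(default_factory=list)


def ready_jobs(waiting, dataset_states):
    """Return the jobs whose input datasets are all in the 'ok' state.

    dataset_states maps a dataset id to a state string; an id that is
    missing from the mapping is treated as not ready.
    """
    return [
        job for job in waiting
        if all(dataset_states.get(ds) == "ok" for ds in job.input_ids)
    ]


if __name__ == "__main__":
    states = {38: "ok", 39: "queued"}
    waiting = [Job("Sort on data 39", [39]), Job("Step on data 38", [38])]
    # Only the job whose single input (38) is 'ok' is released.
    print([job.name for job in ready_jobs(waiting, states)])
```

Under a check like this, "Sort on data 39" would stay in the queue while dataset 39 is still waiting to run, which is the opposite of what the screenshot shows.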
Any help or any hint on what to look at to solve this issue would be
greatly appreciated.
We updated our Galaxy instance to the August 12th distribution on October
1st, and I believe we never experienced this issue before the update.
Many thanks for your help,
Jean-François
8 years, 6 months
Using Mesos to Enable distributed computing under Galaxy?
by Kyle Ellrott
I think one of the aspects where Galaxy is a bit soft is the ability to do
distributed tasks. The current system of split/replicate/merge tasks based
on file type is a bit limited and hard for tool developers to expand upon.
Distributed computing is a non-trivial thing to implement and I think it
would be a better use of our time to use an already existing framework. And
it would also mean one less API for tool writers to have to develop for.
I was wondering if anybody has looked at Mesos (http://mesos.apache.org/).
You can see an overview of the Mesos architecture at
https://github.com/apache/mesos/blob/master/docs/Mesos-Architecture.md
The important thing about Mesos is that it provides an API for C/C++,
Java/Scala and Python to write distributed frameworks. There are already
implementations of frameworks for common parallel programming systems such
as:
- Hadoop (https://github.com/mesos/hadoop)
- MPI (https://github.com/apache/mesos/blob/master/docs/Running-torque-or-mpi-on...)
- Spark (http://spark-project.org)
And you can find an example Python framework at
https://github.com/apache/mesos/tree/master/src/examples/python
Integration with Galaxy would have three parts:
1) Add a system config variable to Galaxy called 'MESOS_URL' that is then
passed to tool wrappers and allows them to contact the local mesos
infrastructure (assuming the system has been configured) or pass a null if
the system isn't available.
2) Write a tool runner that works as a Mesos framework and executes
single-CPU jobs on the distributed system.
3) For instances where mesos is not available at a system wide level (say
they only have access to an SGE based cluster), but the user wants to run
distributed jobs, write a wrapper that can create a mesos cluster using the
existing queueing system. For example, right now I run a Mesos system under
the SGE queue system.
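Part 1 of the list above can be sketched from the tool wrapper's side. The proposal does not say how 'MESOS_URL' would reach the wrapper, so this sketch assumes an environment variable; the function name choose_backend and the example address are hypothetical.

```python
import os


def choose_backend(environ=None):
    """Return ("mesos", url) when MESOS_URL is set and non-empty, else ("local", None).

    Assumption: the setting arrives as an environment variable named MESOS_URL.
    """
    env = os.environ if environ is None else environ
    url = env.get("MESOS_URL", "").strip()
    if url:
        return ("mesos", url)
    # An unset or empty variable plays the role of the "null" in the proposal.
    return ("local", None)


if __name__ == "__main__":
    # Hypothetical address, used only to show the two outcomes.
    print(choose_backend({"MESOS_URL": "mesos-master.example.org:5050"}))
    print(choose_backend({}))
```

A wrapper written this way keeps working on systems without Mesos, because the absence of the variable simply selects the non-distributed path.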
I'm curious to see what other people think.
Kyle
8 years, 7 months
user specific access/options
by Jennifer Jackson
Hi Petr,
I am going to forward your email to the galaxy-dev list so that the
development community can offer comments/suggestions.
Best,
Jen
Galaxy team
On 8/30/11 2:27 AM, Petr Novak wrote:
> Hi everybody,
> I am developing an application on a Galaxy server. One of the requirements
> is to create a user-specific list of options. Is it possible to access
> $__user_email__ somehow in the <options> tag or in <conditional>? I did
> not find documentation on how to use Cheetah in Galaxy tool XML files, but
> judging from the files provided with Galaxy, Cheetah is used only in the
> <command> and <config> tags. Is that right? If it could be used in any part
> of the XML definition file, it would be much easier to generate the XML
> dynamically based on $__user_email__.
> Does anybody have an idea how to manage this problem?
> Petr Novak
--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/Support
8 years, 8 months
Default value for data_column not working?
by Peter Cock
Hi all,
I'm trying to write a new tool working with tabular data (specifically
a Reciprocal Best Hits (RBH) tool using BLAST style tabular output).
I want the user to be able to choose a column number (from one of the
input files), but I have a default column in mind. However, Galaxy
doesn't seem to obey the default column number:
<inputs>
    <param name="a_vs_b" type="data" format="tabular" label="Hits from querying A against B" description="Tabular file, e.g. BLAST output" />
    <param name="b_vs_a" type="data" format="tabular" label="Hits from querying B against A" description="Tabular file, e.g. BLAST output" />
    <param name="id1" type="data_column" data_ref="a_vs_b" multiple="False" numerical="False" value="1" label="Column containing query ID" help="This is column 1 in standard BLAST tabular output" />
    <param name="id2" type="data_column" data_ref="a_vs_b" multiple="False" numerical="False" value="2" label="Column containing match ID" help="This is column 2 in standard BLAST tabular output"/>
    <param name="score" type="data_column" data_ref="a_vs_b" multiple="False" numerical="False" value="12" label="Column containing score to rank on" help="The bitscore is column 12 in standard BLAST tabular output"/>
</inputs>
I've tried giving the default column value numerically (as shown), and
also using value="c2" etc. Regardless, Galaxy just defaults to the
first column.
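For reference, the intended defaults (columns 1, 2 and 12) map onto a standard 12-column BLAST tabular line as in the sketch below. It says nothing about how Galaxy renders the form; it only shows a tool script that accepts both spellings tried above ("2" and "c2"). The function names parse_column and pick_fields are hypothetical.

```python
def parse_column(value):
    """Convert a column spec such as "2" or "c2" to a zero-based index."""
    text = str(value).strip().lower()
    if text.startswith("c"):
        text = text[1:]
    number = int(text)
    if number < 1:
        raise ValueError("column numbers start at 1")
    return number - 1


def pick_fields(line, id1="1", id2="2", score="12"):
    """Return (query ID, match ID, score) from one tab-separated line."""
    fields = line.rstrip("\n").split("\t")
    return (
        fields[parse_column(id1)],
        fields[parse_column(id2)],
        float(fields[parse_column(score)]),
    )


if __name__ == "__main__":
    # Made-up example line in the 12-column BLAST tabular layout.
    line = "q1\ts1\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-50\t200.0"
    print(pick_fields(line))
```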
Is this a bug, or am I doing something wrong?
Thanks,
Peter
8 years, 8 months