On Fri, Dec 5, 2014 at 1:25 PM, Fernandez Edgar <edgar.fernandez@umontreal.ca> wrote:

Hello again Nate,

 

I’ve uninstalled and reinstalled a fresh package of galaxy and torque.

I’ve also configured, built and installed pbs_drmaa.

 

Finally, when I start galaxy, the main process works but handler0 doesn’t.

Here is the error:

galaxy.jobs.manager DEBUG 2014-12-05 13:12:10,035 Starting job handler

galaxy.jobs INFO 2014-12-05 13:12:10,035 Handler 'handler0' will load all configured runner plugins

galaxy.jobs.runners.state_handler_factory DEBUG 2014-12-05 13:12:10,040 Loaded 'failure' state handler from module galaxy.jobs.runners.state_handlers.resubmit

Traceback (most recent call last):

  File "/home/galaxy/galaxy-dist/lib/galaxy/webapps/galaxy/buildapp.py", line 44, in app_factory

    app = UniverseApplication( global_conf = global_conf, **kwargs )

  File "/home/galaxy/galaxy-dist/lib/galaxy/app.py", line 136, in __init__

    self.job_manager = manager.JobManager( self )

  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/manager.py", line 23, in __init__

    self.job_handler = handler.JobHandler( app )

  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/handler.py", line 32, in __init__

    self.dispatcher = DefaultJobDispatcher( app )

  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/handler.py", line 715, in __init__

    self.job_runners = self.app.job_config.get_job_runner_plugins( self.app.config.server_name )

  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/__init__.py", line 626, in get_job_runner_plugins

    rval[id] = runner_class( self.app, runner[ 'workers' ], **runner.get( 'kwds', {} ) )

  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py", line 61, in __init__

    drmaa = __import__( "drmaa" )

  File "build/bdist.linux-x86_64/egg/drmaa/__init__.py", line 63, in <module>

  File "build/bdist.linux-x86_64/egg/drmaa/session.py", line 39, in <module>

  File "build/bdist.linux-x86_64/egg/drmaa/helpers.py", line 36, in <module>

  File "build/bdist.linux-x86_64/egg/drmaa/wrappers.py", line 56, in <module>

  File "/usr/lib64/python2.6/ctypes/__init__.py", line 353, in __init__

    self._handle = _dlopen(self._name, mode)

OSError: /usr/local/pbs-drmaa/lib/: cannot read file data: Is a directory

Removing PID file handler0.pid

Exception in thread Thread-1 (most likely raised during interpreter shutdown):

Traceback (most recent call last):

  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner

  File "/usr/lib64/python2.6/threading.py", line 484, in run

  File "/home/galaxy/galaxy-dist/lib/tool_shed/galaxy_install/update_repository_manager.py", line 93, in __restarter

  File "/home/galaxy/galaxy-dist/lib/tool_shed/galaxy_install/update_repository_manager.py", line 133, in sleep

  File "/usr/lib64/python2.6/threading.py", line 137, in release

<type 'exceptions.TypeError'>: 'NoneType' object is not callable

Please advise!


Hi Fernandez,

Since you've switched to the DRMAA job runner plugin, you need to provide the path to libdrmaa.so itself. I believe you have set $DRMAA_LIBRARY_PATH to /usr/local/pbs-drmaa/lib/ (a directory), whereas it should be /usr/local/pbs-drmaa/lib/libdrmaa.so
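
The setting described above can be checked before starting Galaxy. What follows is only a minimal sketch, not part of Galaxy or of the drmaa package: the helper name check_drmaa_library_path is hypothetical, and it inspects the path without trying to load the library.

```python
import os


def check_drmaa_library_path(environ=None):
    """Return (ok, message) describing the DRMAA_LIBRARY_PATH setting.

    Hypothetical helper for illustration: it only looks at the path and
    never attempts to load the shared library.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("DRMAA_LIBRARY_PATH")
    if not path:
        return False, "DRMAA_LIBRARY_PATH is not set"
    if os.path.isdir(path):
        # This is the situation behind "cannot read file data: Is a directory".
        return False, "%s is a directory; point it at the libdrmaa.so file" % path
    if not os.path.isfile(path):
        return False, "%s does not exist" % path
    return True, "%s is a regular file" % path


if __name__ == "__main__":
    ok, message = check_drmaa_library_path()
    print(("OK: " if ok else "Problem: ") + message)
```

Run as the galaxy user, in the same environment that starts the handler, this should report a problem whenever the variable names a directory rather than the library file.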

--nate

 

Cordialement / Regards,

Edgar Fernandez

 

From: Nate Coraor [mailto:nate@bx.psu.edu]
Sent: December-05-14 11:27 AM
To: Fernandez Edgar
Cc: galaxy-dev@bx.psu.edu
Subject: Re: [galaxy-dev] galaxy with torque

 

On Fri, Dec 5, 2014 at 9:13 AM, Fernandez Edgar <edgar.fernandez@umontreal.ca> wrote:

Hello Nate,

 

Thank you for this explanation, this clears up a lot and I have a better understanding of how galaxy works.

I still have some points on which I would like clarification:

 

1.       What is the purpose of the four variables defined under my [server:handler0] section in the file config/galaxy.ini?

 

At a minimum, use = egg:Paste#http is required as this tells PasteScript, which is used to start the Galaxy server, what web server module will be used. The other variables are optional to some degree but you must set unique/unused ports for each [server:].
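
As a rough illustration of the "unique ports" requirement, here is a minimal sketch that lists the [server:...] sections of an ini file and reports ports used more than once. The sample text is illustrative only: the handler0 values are the ones quoted in this thread, the port given for main is an assumption, and the function names are hypothetical.

```python
import configparser
from collections import Counter

# Illustrative sample; the port for [server:main] is an assumption.
SAMPLE_INI = """
[server:main]
use = egg:Paste#http
port = 8080

[server:handler0]
use = egg:Paste#http
port = 9010
use_threadpool = True
threadpool_workers = 10
"""


def server_ports(ini_text):
    """Map each [server:NAME] section to its configured port (as a string)."""
    # interpolation=None keeps any '%' characters in the file literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return {
        section.split(":", 1)[1]: parser.get(section, "port", fallback=None)
        for section in parser.sections()
        if section.startswith("server:")
    }


def duplicate_ports(ports):
    """Return the sorted list of ports claimed by more than one server."""
    counts = Counter(p for p in ports.values() if p is not None)
    return sorted(port for port, n in counts.items() if n > 1)


if __name__ == "__main__":
    ports = server_ports(SAMPLE_INI)
    print(ports)
    print("duplicate ports:", duplicate_ports(ports))
```

An empty duplicate list means every server section that sets a port has its own.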

 

2.       What does the load attribute (="galaxy.jobs.runners.pbs:PBSJobRunner") of the plugin tag define in the file config/job_conf.xml?

 

This instructs Galaxy to load a specific module and class as a job running plugin. In this case it is the PBSJobRunner:
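
To illustrate what a "module:Class" load value means, here is a minimal sketch of resolving such a string to a class. It is not Galaxy's own loader (the traceback in this thread shows Galaxy calling __import__ directly); the function name load_plugin_class is hypothetical, and a standard-library class stands in for PBSJobRunner, which is only importable inside a Galaxy installation.

```python
import importlib


def load_plugin_class(load_spec):
    """Resolve a 'package.module:ClassName' string to the class object."""
    module_name, _, class_name = load_spec.partition(":")
    if not module_name or not class_name:
        raise ValueError("expected 'module:Class', got %r" % load_spec)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


if __name__ == "__main__":
    # Stand-in for load="galaxy.jobs.runners.pbs:PBSJobRunner".
    cls = load_plugin_class("collections:OrderedDict")
    print(cls)
```

If the module cannot be imported, the import error surfaces at this point, which matches where the pbs_python failure appears in the log.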

 

 

3.       Finally, when I start galaxy, I see my two processes:

a.       galaxy   11104     1 84 08:43 ?        00:00:05 python ./scripts/paster.py serve config/galaxy.ini --server-name=main --pid-file=main.pid --log-file=main.log --daemon

b.      galaxy   11112     1 83 08:43 ?        00:00:05 python ./scripts/paster.py serve config/galaxy.ini --server-name=handler0 --pid-file=handler0.pid --log-file=handler0.log --daemon

However, the second one fails and here is the error message I’ve been getting:

galaxy.jobs.manager DEBUG 2014-12-05 08:43:19,059 Starting job handler

galaxy.jobs INFO 2014-12-05 08:43:19,059 Handler 'handler0' will load all configured runner plugins

Traceback (most recent call last):

  File "/home/galaxy/galaxy-dist/lib/galaxy/webapps/galaxy/buildapp.py", line 44, in app_factory

    app = UniverseApplication( global_conf = global_conf, **kwargs )

  File "/home/galaxy/galaxy-dist/lib/galaxy/app.py", line 136, in __init__

    self.job_manager = manager.JobManager( self )

  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/manager.py", line 23, in __init__

    self.job_handler = handler.JobHandler( app )

  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/handler.py", line 32, in __init__

    self.dispatcher = DefaultJobDispatcher( app )

  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/handler.py", line 715, in __init__

    self.job_runners = self.app.job_config.get_job_runner_plugins( self.app.config.server_name )

  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/__init__.py", line 586, in get_job_runner_plugins

    module = __import__( module_name )

  File "/home/galaxy/galaxy-dist/lib/galaxy/jobs/runners/pbs.py", line 32, in <module>

    raise Exception( egg_message % str( e ) )

Exception:

 

The 'pbs' runner depends on 'pbs_python' which is not installed or not

configured properly.  Galaxy's "scramble" system should make this installation

simple, please follow the instructions found at:

 

    http://wiki.galaxyproject.org/Admin/Config/Performance/Cluster

 

Additional errors may follow:

/home/galaxy/galaxy-dist/eggs/pbs_python-4.3.5-py2.6-linux-x86_64-ucs4.egg/_pbs.so: undefined symbol: log_record

 

Removing PID file handler0.pid

 

 

Please try the following:

 

In /home/galaxy/galaxy-dist/eggs.ini, change the version of pbs_python to 4.4.0. Then, re-scramble pbs_python:

 

% cd /home/galaxy/galaxy-dist

% rm -rf ./eggs/pbs_python-4.3.5-py2.6-linux-x86_64-ucs4.egg

% LIBTORQUE_DIR=/usr/local/torque/lib python ./scripts/scramble.py -e pbs_python

 

If pbs_python 4.4.0 does not work, you'll need to use the DRMAA interface to Torque instead.

 

--nate

 

Any suggestions are more than welcome, since I am under a lot of pressure to make this work.

Thanks gents !!!

 

Cordialement / Regards,

Edgar Fernandez

 

From: Nate Coraor [mailto:nate@bx.psu.edu]
Sent: December-04-14 4:09 PM
To: Fernandez Edgar
Cc: John Chilton; galaxy-dev@bx.psu.edu
Subject: Re: [galaxy-dev] galaxy with torque

 

On Thu, Dec 4, 2014 at 11:03 AM, Fernandez Edgar <edgar.fernandez@umontreal.ca> wrote:

Good morning gents,

 

I found one of your previous answers on the internet.

It helped me figure out my problem with job_conf.xml.

So I finally made galaxy start without a glitch.

 

First, I added the following lines in the config/galaxy.ini file:

[server:handler0]

use = egg:Paste#http

port = 9010

use_threadpool = True

threadpool_workers = 10

I’ve also changed the config/job_conf.xml:

<?xml version="1.0"?>

<job_conf>

  <plugins>

    <plugin id="pbs" type="runner" load="galaxy.jobs.runners.pbs:PBSJobRunner"/>

  </plugins>

  <handlers>

    <handler id="handler0"/>

  </handlers>

  <destinations default="torque">

    <destination id="torque" runner="pbs"/>

  </destinations>

</job_conf>

 

Now, I’m uncertain what needs to listen on port 9010, set under the [server:handler0] section.

 

Hi Edgar,

 

There are many ways to run Galaxy servers. By default, if starting using the provided run.sh script, Galaxy starts and runs in a single process, which is defined by the [server:main] section in galaxy.ini. If your intent is to run Galaxy in this default single process setup, you can remove the [server:handler0] section from galaxy.ini and set your handler id in job_conf.xml to "main" like so:

 

<handlers default="main">

  <handler id="main"/>

</handlers>

 

However, for most Galaxy servers that see a moderate amount of use, it is a good idea to run multiple processes. At a minimum, this would be one process which serves web requests (which is typically proxied by a traditional webserver such as Apache) but which does not handle running Galaxy jobs, and a second process which does not serve web requests, but which handles running jobs. In that case you can keep galaxy.ini as you have it now (with both a [server:main] section and a [server:handler0] section), and with the handler in job_conf.xml defined as <handler id="handler0"/>.
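
The pairing described above (each handler id in job_conf.xml naming a [server:...] section in galaxy.ini) can be sanity-checked with a short script. This is a minimal sketch under stated assumptions: both files are embedded as trimmed sample strings based on the snippets in this thread, and the function name unmatched_handlers is hypothetical.

```python
import configparser
import xml.etree.ElementTree as ET

# Trimmed samples based on the configuration quoted in this thread.
GALAXY_INI = """
[server:main]
use = egg:Paste#http

[server:handler0]
use = egg:Paste#http
port = 9010
"""

JOB_CONF = """<?xml version="1.0"?>
<job_conf>
  <plugins>
    <plugin id="pbs" type="runner" load="galaxy.jobs.runners.pbs:PBSJobRunner"/>
  </plugins>
  <handlers>
    <handler id="handler0"/>
  </handlers>
  <destinations default="torque">
    <destination id="torque" runner="pbs"/>
  </destinations>
</job_conf>
"""


def unmatched_handlers(ini_text, job_conf_xml):
    """Return handler ids that have no matching [server:ID] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    servers = {
        s.split(":", 1)[1] for s in parser.sections() if s.startswith("server:")
    }
    root = ET.fromstring(job_conf_xml)
    handler_ids = [h.get("id") for h in root.findall("./handlers/handler")]
    return [h for h in handler_ids if h not in servers]


if __name__ == "__main__":
    print("handlers without a server section:", unmatched_handlers(GALAXY_INI, JOB_CONF))
```

An empty list means every handler has a server section to run in; a hostname used as a handler id would show up here as unmatched.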

 

However, run.sh run without arguments will only start the server process defined as [server:main]. To start all server processes in galaxy.ini, use:

 

% GALAXY_RUN_ALL=1 sh run.sh --daemon

 

Documentation on multiprocess Galaxy setups can be found at:

 

 

Other documentation on running a "production" Galaxy service (including using a proxy server) can be found at:

 

 

--nate

 

 

Cordialement / Regards,

Edgar Fernandez

 

From: John Chilton [mailto:jmchilton@gmail.com]
Sent: December-03-14 9:53 AM
To: Fernandez Edgar
Cc: galaxy-dev@bx.psu.edu
Subject: Re: [galaxy-dev] galaxy with torque

 

That handler id seems wrong... at least it is not what I am used to. It needs to match a server section specified in your Galaxy ini file - usually this is a simple string like handler0 or something.

 

It looks like the specific error is :

 

/home/galaxy/galaxy-dist/eggs/pbs_python-4.3.5-py2.6-linux-x86_64-ucs4.egg/_pbs.so: undefined symbol: log_record

 

I am not sure what causes this - some subtle incompatibility.

 

So the pbs_python 4.4.0 egg is available on eggs.galaxyproject.org (http://eggs.galaxyproject.org/pbs_python/pbs_python-4.4.0.tar.gz) - I think it may be needed for newer torque versions - do you want to update the version specified in eggs.ini, delete the old egg, and rescramble the egg with 4.4.0?

 

-John

 

P.S. Since you are setting up a new server I would strongly suggest using postgres instead of MySQL - but the previous comment about it not needing to be accessed on the compute servers is correct. 

 

 

On Tue, Dec 2, 2014 at 3:10 PM, Fernandez Edgar <edgar.fernandez@umontreal.ca> wrote:

Hello again,

 

I’m very close to making pbs_python work but I’m hitting a new wall.

So I’ve created the file config/job_conf.xml which looks like this

<?xml version="1.0"?>

<job_conf>

    <plugins>

        <plugin id="pbs" type="runner" load="galaxy.jobs.runners.pbs:PBSJobRunner"/>

    </plugins>

    <handlers default="gavroche.esi.umontreal.ca">

        <handler id="gavroche.esi.umontreal.ca" tags="pbs"/>

    </handlers>

    <destinations default="pbs_default">

        <destination id="pbs_default" runner="pbs" tags="mycluster"/>

        <destination id="pbs_longjobs" runner="pbs" tags="mycluster,longjobs">

            <param id="Resource_List">walltime=72:00:00</param>

        </destination>

    </destinations>

</job_conf>

gavroche.esi.umontreal.ca is my torque server.

I know that in your documentation it doesn’t say to put in a handlers tag but galaxy doesn’t parse the xml without it.

 

Now, once I try to start galaxy, I get the error you see in the file paster.log attached to this email.

Can anyone help please?

 

Cordialement / Regards,

Edgar Fernandez

 

From: Fernandez Edgar
Sent: December-02-14 1:33 PM
To: 'Rémy Dernat'
Cc: John Chilton; galaxy-dev@bx.psu.edu
Subject: RE: Re: [galaxy-dev] galaxy with torque

 

Thank you for that correction.

 

Just a small FYI (maybe it will be useful to update the wiki)…

 

I had to export three variables to make the scramble possible:

 

export PBS_PYTHON_INCLUDEDIR=/usr/local/torque/include/

export PBSCONFIG=/usr/local/torque/bin/pbs-config

export LIBTORQUE_DIR=/usr/local/torque/lib/libtorque.so

python scripts/scramble.py -e pbs_python

 

Cordialement / Regards,

Edgar Fernandez

 

From: Rémy Dernat [mailto:remy.d1@gmail.com]
Sent: December-02-14 11:49 AM
To: Fernandez Edgar
Cc: John Chilton; galaxy-dev@bx.psu.edu
Subject: Re: Re: [galaxy-dev] galaxy with torque

 

Sorry about answer 7: there is no benefit to doing that. Once the egg is built, there is nothing more to do, unless you change your Python version. It is normal for that variable to be empty: it is not set in your shell's environment, it is only set for the single command it prefixes:

LIBTORQUE_DIR=/path/to/libtorque python scripts/scramble.py -e pbs_python
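
The scoping of that variable can be demonstrated with a minimal sketch. It launches a child Python process with LIBTORQUE_DIR in its environment, which is roughly what the shell does for a VAR=value prefix, and then launches another child without it. The function name run_child is hypothetical and the path is the same placeholder as in the command above.

```python
import os
import subprocess
import sys

# The child process just reports whether it can see LIBTORQUE_DIR.
CHILD_CODE = "import os; print(os.environ.get('LIBTORQUE_DIR', '<unset>'))"


def run_child(extra_env=None):
    """Run a child Python process and return what it sees for LIBTORQUE_DIR."""
    env = dict(os.environ)
    env.pop("LIBTORQUE_DIR", None)  # start from a clean state for the demo
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Comparable to: LIBTORQUE_DIR=/path/to/libtorque python ...
    print("with prefix:   ", run_child({"LIBTORQUE_DIR": "/path/to/libtorque"}))
    # A later command started without the prefix does not see the variable.
    print("without prefix:", run_child())
```

Only the first child sees the value, which is why echoing the variable in the shell afterwards shows nothing.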

 

2014-12-02 16:27 GMT+01:00 Rémy Dernat <remy.d1@gmail.com>:

Hi Edgar,

 

You are right. It is very annoying... 

 

So, to answer your questions:

1/ The first Google / Wikipedia result for DRMAA: http://en.wikipedia.org/wiki/DRMAA

It is a common API for talking to any DRM system (SGE, Torque, etc.).

 

2/ This builds a library (the Python bindings, pbs_python) for your Torque installation.

 

3/ MySQL access is only needed by your galaxy frontend.

 

4/ Internet access is not required for your compute nodes (except the Galaxy one), but it helps if, for example, you want to use a package manager on them.

 

5/ For my part, I use permissions like 760 on the Galaxy directory. It depends on your needs. Some applications might need access to your Galaxy installation, but you should keep the binaries, the Galaxy installation and your data (datasets, libraries...) separate. Do not forget to share these folders over NFS (if needed).

 

6/ Sorry, no idea; but I see no reason for that to become unavailable, if your proxy is well configured.

 

7/ You have to put this command line into a file which will be sourced like $HOME_GALAXY/.bashrc or your environment file ("environment_setup_file" in universe_wsgi.ini or config/galaxy.ini)

 

Regards,
Remy

 

2014-12-02 13:59 GMT+01:00 Fernandez Edgar <edgar.fernandez@umontreal.ca>:

Hi everyone,

 

I’m guessing, Remy, that you clicked the send button by mistake on your HTC device.

It happens to me ALL the time…

 

I wanted to take this opportunity to add some questions to the three questions in my previous email.

So here goes:

 

4.       Is it necessary that my torque compute nodes have internet access?

Because right now, only my torque server and my galaxy server have internet access.

However, communication between the submit node (a.k.a. galaxy server) and the torque server is enabled.

Likewise, between the torque server and the compute nodes.

5.       Furthermore, would I disable any galaxy functionality if I changed the permissions on the whole galaxy installation directory like so: chmod -R 700 galaxy_install_directory

I have created a user galaxy for running everything that is galaxy.

6.       I’ve actually installed my galaxy server using a proxy server (as described on your web site). Can I still use the Report Tool functionalities on galaxy?

7.       Are there any further instructions for setting up PBS (running jobs via the TORQUE resource manager)? I ask because I have successfully scrambled the pbs_python egg with the command:

LIBTORQUE_DIR=/path/to/libtorque python scripts/scramble.py -e pbs_python

However, LIBTORQUE_DIR is empty.

 

Once again, thank you in advance for all your help!

 

Cordialement / Regards,

Edgar Fernandez

 

From: remy.d1@gmail.com [mailto:remy.d1@gmail.com]
Sent: December-02-14 6:34 AM
To: Fernandez Edgar; John Chilton
Cc: galaxy-dev@bx.psu.edu
Subject: Re: [galaxy-dev] galaxy with torque

 

1 a drmaa d

Sent from my HTC

----- Reply message -----
From: "Fernandez Edgar" <edgar.fernandez@umontreal.ca>
To: "John Chilton" <jmchilton@gmail.com>
Cc: "galaxy-dev@bx.psu.edu" <galaxy-dev@bx.psu.edu>
Subject: [galaxy-dev] galaxy with torque
Date: Wed., Nov. 26, 2014 19:09


Hi John,

First, thank you very much for your prompt answer.
It's extremely appreciated.

Secondly, I have some other questions; whatever answers you can provide will be greatly helpful.
Please forgive my beginner-level understanding of DRM systems.

1. Once I compile, make and install the Torque submit node code on the server running Galaxy, what exactly is the purpose of DRMAA?

2. What exactly is the PBS step described here: https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster
galaxy_user@galaxy_server% LIBTORQUE_DIR=/path/to/libtorque python scripts/scramble.py -e pbs_python
What does it do exactly?

3. All servers, which include my torque server, torque compute nodes and torque submit node (a.k.a. galaxy server), have the galaxy user defined and its home shared on all of them. This means all of them have access (via NFS) to the installation directory of galaxy. But what about access to the MySQL server? Does the torque server or do the compute nodes need access to that service?

I hope I'm not sending too many emails/questions...
Thank you very much!

Cordialement / Regards,
Edgar Fernandez

-----Original Message-----
From: John Chilton [mailto:jmchilton@gmail.com]
Sent: November-25-14 12:07 PM
To: Fernandez Edgar
Cc: galaxy-dev@bx.psu.edu
Subject: Re: [galaxy-dev] galaxy with torque

I am not sure we have a walkthrough for Torque specifically - but if you have Galaxy up and running and you can qsub commands to torque - hopefully you have done all of the hard parts.

You will need a DRMAA library for your torque setup - https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster
suggests compiling pbs_drmaa and outlines how to set it up. After that you just need to add a plugin and default destination to your job_conf.xml file - also outlined on that wiki page.

Other good resources to consult if you are scaling up your Galaxy this way are:
https://wiki.galaxyproject.org/Admin/Config/Performance/ProductionServer
https://wiki.galaxyproject.org/Events/GCC2014/TrainingDay/AdminWalkthrough

Good luck and let us know if you encounter any problems.

-John

On Fri, Nov 21, 2014 at 2:30 PM, Fernandez Edgar <edgar.fernandez@umontreal.ca> wrote:
> Hello all,
>
>
>
> My name is Edgar Fernandez. I’m a sys. admin. at University of Montreal.
>
> I’ve contacted you a while back about installing galaxy and I’ve
> successfully done it on a redhat 6 server.
>
>
>
> I find myself in a situation where I need to utilise all my redhat
> servers (which are identical to the server running the galaxy website).
>
>
>
> I’ve also successfully installed a torque server with compute nodes
> and client nodes.
>
>
>
> What are the last steps to make the link between galaxy and torque?
>
> Also, once that connection is made, how will galaxy keep track of the
> jobs sent?
>
> I mean, how will it know that a job that just finished belongs to this
> user and not another?
>
>
>
> Also, my torque installation is set up so that my server running galaxy
> is both a submit node and a client node.
>
> I hope this is not a problem.
>
>
>
> Please help!
>
>
>
> Cordialement / Regards,
>
>
>
> Edgar Fernandez
>
> System Administrator (Linux)
>
> Direction Générale des Technologies de l'Information et de la
> Communication
>
> Office: 1-514-343-6111 ext. 16568
>
>
>
> Université de Montréal
>
> PAVILLON ROGER-GAUDRY, bureau X-218
>
>
>
>
> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client.  To manage your subscriptions to this and other
> Galaxy lists, please use the interface at:
>   https://lists.galaxyproject.org/
>
> To search Galaxy mailing lists use the unified search at:
>   http://galaxyproject.org/search/mailinglists/