Marius,
Thanks for this suggestion! John also suggested this on IRC. I had thought
about this, but I initially avoided it for some technical reasons. However,
I'm happy to say this is working for me. Thanks again for the pointers!
Phil
On Thu, Dec 1, 2016 at 11:33 AM, Marius van den Beek <m.vandenbeek(a)gmail.com> wrote:
I hope John can give some more input on this, but if you're really short
on time you could try the SSH-based CLI runner directly from Galaxy,
without Pulsar. I'm very happy with this setup driving a Torque 4 cluster.
The relevant part of my job_conf.xml looks like this:
```
<destination id="bicalc" runner="cli">
    <param id="shell_plugin">SecureShell</param>
    <param id="job_plugin">Torque</param>
    <param id="shell_username">myusername</param>
    <param id="shell_hostname">submitnode.curie.fr</param>
    <param id="job_Resource_List">walltime=2:00:00,nodes=1:ppn=8,mem=32gb</param>
    <param id="remote_metadata">true</param>
    <env file="/bioinfo/guests/mvandenb/galaxy/.venv/bin/activate" />
</destination>
```
The only requirement is that your Galaxy user can log in to the submit
node over SSH without a password.
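A quick way to check that requirement before touching job_conf.xml is to ask ssh to fail rather than prompt. The sketch below is only an illustration: the helper names `build_ssh_check` and `passwordless_login_works` are made up for this example, and the user and host are the placeholder values from the destination above. `BatchMode=yes` and `ConnectTimeout` are standard OpenSSH client options.

```python
import shlex
import subprocess


def build_ssh_check(user, host):
    """Return an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never prompt; fail if key-based auth is unavailable
        "-o", "ConnectTimeout=10",  # give up quickly if the host is unreachable
        f"{user}@{host}",
        "true",                     # harmless remote command; only the exit status matters
    ]


def passwordless_login_works(user, host):
    """Run the check; True when the remote `true` command exits with status 0."""
    result = subprocess.run(build_ssh_check(user, host), capture_output=True)
    return result.returncode == 0


# Show the command that would be run (placeholder values from the example destination).
print(shlex.join(build_ssh_check("myusername", "submitnode.curie.fr")))
```

Run `passwordless_login_works(...)` as the same system user that runs Galaxy, since that is the account whose SSH key the CLI runner will use.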
Best,
Marius
On 01/12/2016 15:35, Philip Blood wrote:
Hi Folks,
I'm working on a time-sensitive project (just a day or two left to sort it
out) that requires I be able to submit jobs from a remote resource (TACC)
to my resources (Pittsburgh Supercomputing Center) using a shared
filesystem for data rather than staging via Pulsar. I've tried to set it up
so that Pulsar does not do any staging, but is just handling the remote job
submission and tool dependencies. However, Pulsar continues to use a local
staging directory for the data and then copies the data back to the Galaxy
directory.
What I'm hoping to do is have all the data stay in the shared filesystem
for the entire course of the job. The quick test below does not have input
data, but just issues commands on the remote compute node and generates
output.
I'm using the latest stable releases of Galaxy (16.07, installed on a TACC
VM) and Pulsar (installed via pip at PSC). If anyone can provide some quick
pointers, I'd appreciate it.
Here are relevant parts of my configuration and some of the log output:
*Shared filesystem:* /hopper
*Local staging dir created by Pulsar:* /usr/local/packages/pulsar/etc/files/staging
*galaxy.ini*
file_path = /hopper/sy3l67p/blood/galaxy_home/database/files
new_file_path = /hopper/sy3l67p/blood/galaxy_home/database/tmp
job_working_directory = /hopper/sy3l67p/blood/galaxy_home/database/job_working_directory
tool_dependency_dir = /hopper/sy3l67p/blood/galaxy_home/database/dependencies
dependency_resolvers_config_file = /hopper/sy3l67p/blood/galaxy_home/config/dependency_resolvers_conf.xml
*job_conf.xml*
<destination id="pulsar" runner="pulsar_rest">
    <param id="url">https://128.182.99.126:8913/</param>
    <param id="default_file_action">none</param>
    <param id="file_action_config">/home/tg455546/galaxy/config/file_actions.yaml</param>
    <param id="dependency_resolution">remote</param>
    <param id="submit_native_specification">-q batch</param>
</destination>
*file_actions.yml*
paths:
  # If Galaxy, the Pulsar, and the compute nodes all mount the same directory
  # staging can be disabled altogether for given paths.
  - path: /hopper/sy3l67p/blood/galaxy_home/database/files
    action: none
  - path: /hopper/sy3l67p/blood/galaxy_home/database/job_working_directory
    action: none
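
As a rough illustration of what these rules are meant to express, the sketch below maps a path to the action of the first rule whose directory contains it. This is a simplified model written for this thread, not Pulsar's actual matching code; `action_for` and `RULES` are hypothetical names, and returning `None` stands in for "fall back to the destination's default_file_action".

```python
from pathlib import PurePosixPath

# Rules copied from the file_actions configuration above.
RULES = [
    ("/hopper/sy3l67p/blood/galaxy_home/database/files", "none"),
    ("/hopper/sy3l67p/blood/galaxy_home/database/job_working_directory", "none"),
]


def action_for(path, rules=RULES):
    """Return the action of the first rule whose directory contains `path`.

    Returns None when no rule matches, meaning the caller would fall back
    to whatever default action the destination defines.
    """
    candidate = PurePosixPath(path)
    for prefix, action in rules:
        root = PurePosixPath(prefix)
        # Match on whole path components, so "files_extra" does not match "files".
        if candidate == root or root in candidate.parents:
            return action
    return None


# The two paths that appear in the "Copying path" log entry.
for p in (
    "/hopper/sy3l67p/blood/galaxy_home/database/files/000/dataset_50.dat",
    "/usr/local/packages/pulsar/etc/files/staging/50/working/output",
):
    print(p, "->", action_for(p))
```

Under this model, the dataset path on /hopper matches the first rule, while the Pulsar staging path matches neither rule.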
*app.yml*
---
manager:
  type: queued_drmaa
dependency_resolvers_config_file: /usr/local/packages/pulsar/etc/dependency_resolvers_conf.xml
tool_dependency_dir: /usr/local/packages/pulsar/dependencies
*Galaxy paster.log*
galaxy.jobs DEBUG 2016-12-01 08:07:35,770 (50) Working directory for job is: /hopper/sy3l67p/blood/galaxy_home/database/job_working_directory/000/50
pulsar.client.staging.down INFO 2016-12-01 08:08:08,305 collecting output output with action FileAction[action_type=copy]
pulsar.client.client DEBUG 2016-12-01 08:08:08,814 Copying path [/usr/local/packages/pulsar/etc/files/staging/50/working/output] to [/hopper/sy3l67p/blood/galaxy_home/database/files/000/dataset_50.dat]
*Pulsar uwsgi.log*
2016-12-01 09:07:38,174 INFO [pulsar.managers.base.base_drmaa][[manager=_default_]-[action=preprocess]-[job=50]] Submitting DRMAA job with nativeSpecification [-q batch]
t #78bc [ 539.97] -> drmaa_allocate_job_template
t #78bc [ 539.97] <- drmaa_allocate_job_template =0: jt=0x7f69cc002f80
t #78bc [ 539.97] -> drmaa_set_attribute(jt=0x7f69cc002f80, name='drmaa_remote_command', value='/usr/local/packages/pulsar/etc/files/staging/50/command.sh')
t #78bc [ 539.97] -> fsd_template_set_attr(drmaa_remote_command=/usr/local/packages/pulsar/etc/files/staging/50/command.sh)
t #78bc [ 539.97] <- drmaa_set_attribute =0
t #78bc [ 539.97] -> drmaa_set_attribute(jt=0x7f69cc002f80, name='drmaa_output_path', value=':/usr/local/packages/pulsar/etc/files/staging/50/stdout')
t #78bc [ 539.97] -> fsd_template_set_attr(drmaa_output_path=:/usr/local/packages/pulsar/etc/files/staging/50/stdout)
t #78bc [ 539.97] <- drmaa_set_attribute =0
t #78bc [ 539.97] -> drmaa_set_attribute(jt=0x7f69cc002f80, name='drmaa_job_name', value='pulsar_50')
t #78bc [ 539.97] -> fsd_template_set_attr(drmaa_job_name=pulsar_50)
t #78bc [ 539.97] <- drmaa_set_attribute =0
t #78bc [ 539.97] -> drmaa_set_attribute(jt=0x7f69cc002f80, name='drmaa_error_path', value=':/usr/local/packages/pulsar/etc/files/staging/50/stderr')
t #78bc [ 539.97] -> fsd_template_set_attr(drmaa_error_path=:/usr/local/packages/pulsar/etc/files/staging/50/stderr)
t #78bc [ 539.97] -> drmaa_set_attribute(jt=0x7f69cc002f80, name='drmaa_native_specification', value='-q batch')
t #78bc [ 539.97] -> fsd_template_set_attr(drmaa_native_specification=-q batch)
t #78bc [ 539.97] <- drmaa_set_attribute =0
t #78bc [ 539.97] -> drmaa_run_job(jt=0x7f69cc002f80)
t #78bc [ 539.97] -> pbsdrmaa_session_run_impl(jt=0x7f69cc002f80, bulk_idx=-1)
t #78bc [ 539.97] -> fsd_template_set_attr(Checkpoint=u)
t #78bc [ 539.97] -> fsd_template_set_attr(Keep_Files=n)
t #78bc [ 539.97] -> fsd_template_set_attr(Priority=0)
t #78bc [ 539.97] -> pbsdrmaa_write_tmpfile
t #78bc [ 539.97] <- pbsdrmaa_write_tmpfile=/tmp/pbs_drmaa.n6VpuR
t #78bc [ 539.97] -> fsd_template_set_attr(Job_Name=pulsar_50)
t #78bc [ 539.97] -> fsd_template_set_attr(Output_Path=/usr/local/packages/pulsar/etc/files/staging/50/stdout)
t #78bc [ 539.97] -> fsd_template_set_attr(Error_Path=/usr/local/packages/pulsar/etc/files/staging/50/stderr)
t #78bc [ 539.97] -> fsd_template_set_attr(Variable_List=PBS_O_WORKDIR=/usr/local/packages/pulsar/etc)
t #78bc [ 539.97] -> pbsdrmaa_submit_apply_native_specification({native_specification=(null)})
d #78bc [ 539.97] * self->destination_queue = batch
t #78bc [ 539.97] -> fsd_template_set_attr(submit_args=-q batch)
d #78bc [ 539.97] * set attr: submit_args = -q batch
d #78bc [ 539.97] * set attr: Job_Name = pulsar_50
d #78bc [ 539.97] * set attr: Variable_List = PBS_O_WORKDIR=/usr/local/packages/pulsar/etc
d #78bc [ 539.97] * set attr: Priority = 0
d #78bc [ 539.97] * set attr: Output_Path = /usr/local/packages/pulsar/etc/files/staging/50/stdout
d #78bc [ 539.97] * set attr: Keep_Files = n
d #78bc [ 539.97] * set attr: Error_Path = /usr/local/packages/pulsar/etc/files/staging/50/stderr
d #78bc [ 539.97] * set attr: Checkpoint = u
2016-12-01 09:08:06,707 INFO [pulsar.client.staging.down][[manager=_default_]-[action=postprocess]-[job=50]] collecting output output with action FileAction[action_type=copy]
2016-12-01 09:08:06,708 INFO [pulsar.managers.stateful][[manager=_default_]-[action=postprocess]-[job=50]] Status of job [50] changed to [complete]. No callbacks enabled.
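
To see at a glance which staging action was actually applied, the relevant entries can be pulled out of the logs with two regular expressions. This is a small sketch, assuming each log entry sits on a single line; the function names are made up for this example, and `SAMPLE` repeats two entries from the Galaxy paster.log above.

```python
import re

# Matches the action reported by pulsar.client.staging.down.
ACTION_RE = re.compile(r"FileAction\[action_type=(\w+)\]")
# Matches the source and destination of a "Copying path [...] to [...]" entry.
COPY_RE = re.compile(r"Copying path\s*\[([^\]]+)\]\s*to\s*\[([^\]]+)\]")


def staging_actions(log_text):
    """Return every action_type mentioned in the log text, in order."""
    return ACTION_RE.findall(log_text)


def copies(log_text):
    """Return (source, destination) pairs for every copy reported in the log text."""
    return COPY_RE.findall(log_text)


SAMPLE = (
    "pulsar.client.staging.down INFO 2016-12-01 08:08:08,305 collecting output "
    "output with action FileAction[action_type=copy]\n"
    "pulsar.client.client DEBUG 2016-12-01 08:08:08,814 Copying path "
    "[/usr/local/packages/pulsar/etc/files/staging/50/working/output] to "
    "[/hopper/sy3l67p/blood/galaxy_home/database/files/000/dataset_50.dat]\n"
)

print(staging_actions(SAMPLE))
for src, dst in copies(SAMPLE):
    print(src, "->", dst)
```

Any source path that does not start with /hopper indicates that the output was written to the local staging directory before being copied back.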
--
Philip D. Blood, Ph.D.
Senior Computational Scientist Voice: (412) 268-9329
Pittsburgh Supercomputing Center Fax: (412) 268-5832
Carnegie Mellon University Email: blood(a)psc.edu
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client. To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
https://lists.galaxyproject.org/
To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/