[GSoC2021] [OGI] Participation in Google Summer of Code 2021
by Robin Haw
Dear All,
The Open Genome Informatics team serves as an "umbrella" organization to support the efforts of many open-access, open-source bioinformatics projects for Google Summer of Code (GSoC) <https://summerofcode.withgoogle.com/>. Among this list of projects are Reactome and GMOD and its software projects -- JBrowse, Galaxy, WormBase, and others.
Call for 2021 Project Ideas and Mentors: We are seeking project ideas to post and attract talented students to this year’s Summer of Code competition. If you have a project idea for which you would like to mentor a student, please contact Robin Haw, Marc Gillespie, and Scott Cain (emails above).
You can also submit your ideas here<http://gmod.org/wiki/GSOC_Project_Ideas_2021>.
For more information, please refer to the Open Genome Informatics page on the GMOD.org website <http://gmod.org/wiki/GSoC>.
The mentoring organization application deadline with GSoC is February 19th at 2 pm EST, so if you are interested in taking part with the team, please let us know as soon as possible.
Please forward this to others who might be interested in taking part.
If you have any questions please let us know.
Thanks,
Robin, Marc, and Scott.
10 months, 2 weeks
Galaxy install problems
by Luc Cornet
Dear Galaxy admins,
I am contacting you about multiple problems we have encountered while installing our own Galaxy server.
We have been struggling to install Galaxy on our HPC system for a year and a half. We have faced so many problems while installing Galaxy that we have lost count.
Yet, our main problem was, and remains, the use of Pulsar...
Even after following two training sessions and switching to a brand-new CentOS 8 cluster (even though CentOS 8 reaches end of life at the end of the year), we are still not able to use our Galaxy instance.
We followed the Ansible documentation and we are now able to reproduce the training sessions (i.e., execute an analysis with Pulsar on our HPC system). Nevertheless, we are not able to connect Pulsar with Slurm (which makes it unusable). We do not want to use DRMAA because it is not well maintained and has compatibility issues with Slurm. Instead, we chose to use the CLI runner (as mentioned in the Galaxy docs), but we are stuck (see the pulsarservers.yaml file below).
Is it possible to use something other than DRMAA to connect Pulsar to the scheduler?
Thanks,
Best regards
Luc Cornet
- - pulsarservers.yaml - -
# Put your Galaxy server's fully qualified domain name (FQDN) (or the FQDN of the RabbitMQ server) above.
pulsar_root: /opt/pulsar

pulsar_pip_install: true
pulsar_pycurl_ssl_library: openssl
pulsar_systemd: true
pulsar_systemd_runner: webless

pulsar_create_user: false
pulsar_user: {name: pulsar, shell: /bin/bash}

pulsar_optional_dependencies:
  - pyOpenSSL
  # For remote transfers initiated on the Pulsar end rather than the Galaxy end
  - pycurl
  # drmaa required if connecting to an external DRM using it.
  - drmaa
  # kombu needed if using a message queue
  - kombu
  # amqp 5.0.3 changes behaviour in an unexpected way, pin for now.
  - 'amqp==5.0.2'
  # psutil and pylockfile are optional dependencies but can make Pulsar
  # more robust in small ways.
  - psutil

pulsar_yaml_config:
  conda_auto_init: True
  conda_auto_install: True
  staging_directory: "{{ pulsar_staging_dir }}"
  persistence_directory: "{{ pulsar_persistence_dir }}"
  tool_dependency_dir: "{{ pulsar_dependencies_dir }}"
  # The following are the settings for the pulsar server to contact the message queue with related timeouts etc.
  message_queue_url: "pyamqp://galaxy_au:{{ rabbitmq_password_galaxy_au }}@{{ galaxy_server_url }}:5671//pulsar/galaxy_au?ssl=1"
  managers:
    _default_:
      type: queued_cli
      job_plugin: slurm
      native_specification: "-p batch --tasks=1 --cpus-per-task=2 --mem-per-cpu=1000 -t 10:00"
  min_polling_interval: 0.5
  amqp_publish_retry: True
  amqp_publish_retry_max_retries: 5
  amqp_publish_retry_interval_start: 10
  amqp_publish_retry_interval_step: 10
  amqp_publish_retry_interval_max: 60

# We also need to create the dependency resolver file so pulsar knows how to
# find and install dependencies for the tools we ask it to run. The simplest
# method which covers 99% of the use cases is to use conda auto installs similar
# to how Galaxy works.
pulsar_dependency_resolvers:
  - name: conda
    args:
      - name: auto_init
        value: true
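One frequent source of silent failures in this kind of setup is a malformed message_queue_url (wrong vhost encoding or missing ssl flag). As a quick sanity check, the URL's components can be inspected with Python's standard library; the host and password below are placeholders, not the real values:

```python
from urllib.parse import urlsplit, parse_qs

# Placeholder credentials/host; the real values come from the Ansible variables.
url = "pyamqp://galaxy_au:secret@mq.example.org:5671//pulsar/galaxy_au?ssl=1"

parts = urlsplit(url)
print(parts.scheme)           # scheme expected by kombu: pyamqp
print(parts.username)         # galaxy_au
print(parts.hostname)         # mq.example.org
print(parts.port)             # 5671
print(parts.path)             # //pulsar/galaxy_au -> RabbitMQ vhost "/pulsar/galaxy_au"
print(parse_qs(parts.query))  # the ssl=1 flag
```

Note the double slash after the port: the vhost itself starts with a slash, so "/pulsar/galaxy_au" must appear as "//pulsar/galaxy_au" in the URL.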
------------
Luc Cornet, PhD
Bio-informatician
Mycology and Aerobiology
Sciensano
1 year, 6 months
Re: Running slurm job with pulsar
by Luc Cornet
Dear Gianmauro,
Thanks for your answer. I think that my question was not clear enough.
Below you will find answers to your suggestions; hopefully this will make it clearer how to guide me.
* check the Pulsar log for error messages
-> I looked into the staging directory, but since my test analysis executes successfully, there is no error log.
When I launch a test from the Galaxy GUI, it is executed successfully on the cluster.
My problem is that it is not executed as a job, i.e., not via the srun or sbatch command (so not through the scheduler).
I would like Pulsar to be able to submit a job on the cluster, just like other users, and not execute the analysis directly "in the terminal".
* verify whether your Pulsar server can reach the cluster through ssh
-> The Pulsar server is on the HPC cluster. The connection between Galaxy (RabbitMQ) and the cluster (Pulsar) is fine.
* in the staging directory of your job there should be a command.sh file. You
can try to run it manually (sbatch command.sh or something similar)
Yes, indeed, I have this file in the staging directory (see below).
When I execute the command.sh file with sbatch command.sh, it fails immediately, which is expected since command.sh is not a Slurm batch script.
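One way to test this manually (an untested sketch, not a fix for the Pulsar manager itself) is to wrap command.sh in a minimal batch script carrying the same resources as the native_specification above; the staging path passed as $1 is hypothetical:

```shell
# Generate a minimal Slurm wrapper around Pulsar's command.sh.
# The #SBATCH line mirrors native_specification from pulsarservers.yaml.
cat > wrap_command.sh <<'EOF'
#!/bin/bash
#SBATCH -p batch --tasks=1 --cpus-per-task=2 --mem-per-cpu=1000 -t 10:00
bash "$1"
EOF
chmod +x wrap_command.sh
head -n 1 wrap_command.sh
```

It could then be submitted as `sbatch wrap_command.sh /path/to/staging/<job_id>/command.sh` (path is a placeholder) to check whether command.sh itself runs cleanly under the scheduler.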
total 48
-rw-r--r-- 1 pulsar pulsar 4 Jun 17 14:15 use_metadata_directory
-rw-r--r-- 1 pulsar pulsar 10 Jun 17 14:15 tool_version
-rw-r--r-- 1 pulsar pulsar 59 Jun 17 14:15 tool_id
drwxr-xr-x 2 pulsar pulsar 6 Jun 17 14:15 tool_files
drwxr-xr-x 2 pulsar pulsar 6 Jun 17 14:15 metadata
drwxr-xr-x 2 pulsar pulsar 6 Jun 17 14:15 configs
drwxr-xr-x 5 pulsar pulsar 30 Jun 17 14:15 ..
-rw-r--r-- 1 pulsar pulsar 2551 Jun 17 14:15 launch_config
drwxr-xr-x 2 pulsar pulsar 46 Jun 17 14:15 inputs
-rw-r--r-- 1 pulsar pulsar 4 Jun 17 14:15 preprocessed
-rwx------ 1 pulsar pulsar 5441 Jun 17 14:15 command.sh
-rw-r--r-- 1 pulsar pulsar 0 Jun 17 14:15 stdout
drwxr-xr-x 2 pulsar pulsar 6 Jun 17 14:15 home
drwxr-xr-x 2 pulsar pulsar 58 Jun 17 14:15 working
-rw-r--r-- 1 pulsar pulsar 4 Jun 17 14:15 running
-rw-r--r-- 1 pulsar pulsar 546 Jun 17 14:15 stderr
drwxr-xr-x 2 pulsar pulsar 26 Jun 17 14:15 outputs
-rw-r--r-- 1 pulsar pulsar 1 Jun 17 14:15 return_code
-rw-r--r-- 1 pulsar pulsar 10 Jun 17 14:15 final_status
-rw-r--r-- 1 pulsar pulsar 0 Jun 17 14:15 postprocessed
drwxr-xr-x 9 pulsar pulsar 4096 Jun 17 14:15 .
------------
Luc Cornet, PhD
Bio-informatician
Mycology and Aerobiology
Sciensano
----- Original Message -----
From: "Gianmauro Cuccuru" <gmauro(a)informatik.uni-freiburg.de>
To: "Luc Cornet" <luc.cornet(a)uliege.be>, "HelpGalaxy" <galaxy-dev(a)lists.galaxyproject.org>
Cc: "Colignon David" <David.Colignon(a)uliege.be>, "Baurain Denis" <Denis.Baurain(a)uliege.be>, "Pierre Becker" <Pierre.Becker(a)sciensano.be>
Sent: Friday, June 18, 2021 11:48:30
Subject: Re: [galaxy-dev] Running slurm job with pulsar
Hi Luc,
I am not a Slurm expert, but I can suggest several things:
* check the Pulsar log for error messages
* verify whether your Pulsar server can reach the cluster through ssh
* in the staging directory of your job there should be a command.sh file. You
can try to run it manually (sbatch command.sh or something similar) and
see if it works
Cheers,
Gianmauro
On 17.06.21 20:18, Luc Cornet wrote:
> Dear all,
>
> I am trying to launch a Slurm job with Pulsar using the CLI runner (instead of DRMAA).
> The Pulsar playbook below passes without problems, but the analysis still runs outside of Slurm.
> The analysis is executed successfully, but not in a Slurm job.
>
> Can you help me launch Slurm jobs with Pulsar?
> What did I miss?
>
> Thanks,
> Luc
>
>
> ```
> # Put your Galaxy server's fully qualified domain name (FQDN) (or the FQDN of the RabbitMQ server) above.
>
> pulsar_root: /opt/pulsar
>
> pulsar_pip_install: true
> pulsar_pycurl_ssl_library: openssl
> pulsar_systemd: true
> pulsar_systemd_runner: webless
>
> pulsar_create_user: false
> pulsar_user: {name: pulsar, shell: /bin/bash}
>
> pulsar_optional_dependencies:
>   - pyOpenSSL
>   # For remote transfers initiated on the Pulsar end rather than the Galaxy end
>   - pycurl
>   # drmaa required if connecting to an external DRM using it.
>   - drmaa
>   # kombu needed if using a message queue
>   - kombu
>   # amqp 5.0.3 changes behaviour in an unexpected way, pin for now.
>   - 'amqp==5.0.2'
>   # psutil and pylockfile are optional dependencies but can make Pulsar
>   # more robust in small ways.
>   - psutil
>
> pulsar_yaml_config:
>   conda_auto_init: True
>   conda_auto_install: True
>   staging_directory: "{{ pulsar_staging_dir }}"
>   persistence_directory: "{{ pulsar_persistence_dir }}"
>   tool_dependency_dir: "{{ pulsar_dependencies_dir }}"
>   # The following are the settings for the pulsar server to contact the message queue with related timeouts etc.
>   message_queue_url: "pyamqp://galaxy_au:{{ rabbitmq_password_galaxy_au }}@{{ galaxy_server_url }}:5671//pulsar/galaxy_au?ssl=1"
>   managers:
>     _default_:
>       type: queued_cli
>       job_plugin: slurm
>       native_specification: "-p batch --tasks=1 --cpus-per-task=2 --mem-per-cpu=1000 -t 10:00"
>   min_polling_interval: 0.5
>   amqp_publish_retry: True
>   amqp_publish_retry_max_retries: 5
>   amqp_publish_retry_interval_start: 10
>   amqp_publish_retry_interval_step: 10
>   amqp_publish_retry_interval_max: 60
>
> # We also need to create the dependency resolver file so pulsar knows how to
> # find and install dependencies for the tools we ask it to run. The simplest
> # method which covers 99% of the use cases is to use conda auto installs similar
> # to how Galaxy works.
> pulsar_dependency_resolvers:
>   - name: conda
>     args:
>       - name: auto_init
>         value: true
> ```
>
> ------------
> Luc Cornet, PhD
> Bio-informatician
> Mycology and Aerobiology
> Sciensano
> ___________________________________________________________
> Please keep all replies on the list by using "reply all"
> in your mail client. To manage your subscriptions to this
> and other Galaxy lists, please use the interface at:
> %(web_page_url)s
>
> To search Galaxy mailing lists use the unified search at:
> http://galaxyproject.org/search/
--
Gianmauro Cuccuru
UseGalaxy.eu
Bioinformatics Group
Department of Computer Science
Albert-Ludwigs-University Freiburg
Georges-Köhler-Allee 106
79110 Freiburg, Germany
1 year, 7 months
Running slurm job with pulsar
by Luc Cornet
Dear all,
I am trying to launch a Slurm job with Pulsar using the CLI runner (instead of DRMAA).
The Pulsar playbook below passes without problems, but the analysis still runs outside of Slurm.
The analysis is executed successfully, but not in a Slurm job.
Can you help me launch Slurm jobs with Pulsar?
What did I miss?
Thanks,
Luc
```
# Put your Galaxy server's fully qualified domain name (FQDN) (or the FQDN of the RabbitMQ server) above.
pulsar_root: /opt/pulsar

pulsar_pip_install: true
pulsar_pycurl_ssl_library: openssl
pulsar_systemd: true
pulsar_systemd_runner: webless

pulsar_create_user: false
pulsar_user: {name: pulsar, shell: /bin/bash}

pulsar_optional_dependencies:
  - pyOpenSSL
  # For remote transfers initiated on the Pulsar end rather than the Galaxy end
  - pycurl
  # drmaa required if connecting to an external DRM using it.
  - drmaa
  # kombu needed if using a message queue
  - kombu
  # amqp 5.0.3 changes behaviour in an unexpected way, pin for now.
  - 'amqp==5.0.2'
  # psutil and pylockfile are optional dependencies but can make Pulsar
  # more robust in small ways.
  - psutil

pulsar_yaml_config:
  conda_auto_init: True
  conda_auto_install: True
  staging_directory: "{{ pulsar_staging_dir }}"
  persistence_directory: "{{ pulsar_persistence_dir }}"
  tool_dependency_dir: "{{ pulsar_dependencies_dir }}"
  # The following are the settings for the pulsar server to contact the message queue with related timeouts etc.
  message_queue_url: "pyamqp://galaxy_au:{{ rabbitmq_password_galaxy_au }}@{{ galaxy_server_url }}:5671//pulsar/galaxy_au?ssl=1"
  managers:
    _default_:
      type: queued_cli
      job_plugin: slurm
      native_specification: "-p batch --tasks=1 --cpus-per-task=2 --mem-per-cpu=1000 -t 10:00"
  min_polling_interval: 0.5
  amqp_publish_retry: True
  amqp_publish_retry_max_retries: 5
  amqp_publish_retry_interval_start: 10
  amqp_publish_retry_interval_step: 10
  amqp_publish_retry_interval_max: 60

# We also need to create the dependency resolver file so pulsar knows how to
# find and install dependencies for the tools we ask it to run. The simplest
# method which covers 99% of the use cases is to use conda auto installs similar
# to how Galaxy works.
pulsar_dependency_resolvers:
  - name: conda
    args:
      - name: auto_init
        value: true
```
------------
Luc Cornet, PhD
Bio-informatician
Mycology and Aerobiology
Sciensano
1 year, 7 months
Release of Galaxy 21.05
by Marius van den Beek
Dear Community,
The Galaxy Committers team is pleased to announce the release of Galaxy
21.05.
The release announcement for developers and admins can be found at
https://docs.galaxyproject.org/en/master/releases/21.05_announce.html and
user facing release notes are at
https://docs.galaxyproject.org/en/master/releases/21.05_announce_user.html.
A few release highlights are:
¡Galaxy, ahora en español!
------------------------------------
Thanks to Wendi Bacon, the Spanish language translation of Galaxy has been
finalised and merged, so if you prefer to use Galaxy in Spanish, now you can!
This update will be part of an ongoing project from Spanish speakers within
the Galaxy community to keep the Galaxy interface localisation up to date,
and to produce some Spanish language training materials in the GTN.
Bugfixes and Stability
-----------------------------
This release of Galaxy features fewer user-facing changes, as a huge amount
of developer time went into making this a maintenance release with better
testing, better stability, and more bugfixes.
But watch out, this is all in preparation for the next release of Galaxy,
21.09,
which will have some of the biggest changes in years!
New development stack
--------------------------------
Galaxy release 21.09 will ship with a new web framework (FastAPI
<https://fastapi.tiangolo.com/>), the Celery
<https://docs.celeryproject.org/en/stable/index.html> task queue, and
process management using Circus <https://circus.readthedocs.io/en/latest/>.
You can preview this new stack now by running *APP_WEBSERVER=dev ./run.sh*.
Celery for background tasks
--------------------------------------
Galaxy can now run certain tasks in the background. The Celery workers are
currently not required, but if activated they can perform certain
long-running tasks, such as creating history export archives. Celery tasks
will bridge the gap between rapid requests that can be handled during a web
request and jobs that require extensive and relatively slow setup.
More robust selection of job handlers
-------------------------------------------------
Job throughput can be increased by starting Galaxy with multiple external
job handler processes. Jobs were traditionally assigned to a job handler
process by the web handler or workflow handler process that created the job.
Since release 19.01, Galaxy has supported additional mechanisms that use
database serialization techniques to let job handlers assign jobs to
themselves. This mechanism is more robust and doesn't require that all job
handler processes be alive and known to the web handler process. Galaxy now
determines the best method for assigning jobs based on the database in use,
if the assignment method is not set explicitly. Older job assignment methods
will be removed in Galaxy release 21.09. For more details see the Job Handler
Assignment Methods section
<https://docs.galaxyproject.org/en/release_21.05/admin/scaling.html#job-ha...>
of the Galaxy documentation.
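For admins who want to opt in explicitly rather than rely on the automatic choice, the self-assignment mechanism described above is configured in job_conf.xml; a minimal sketch (handler ids are illustrative, see the linked scaling documentation for the exact syntax supported by your release) might look like:

```xml
<job_conf>
    <!-- Handlers claim jobs themselves via database SKIP LOCKED queries -->
    <handlers assign_with="db-skip-locked">
        <handler id="handler0"/>
        <handler id="handler1"/>
    </handlers>
</job_conf>
```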
Check out the full release notes
<https://docs.galaxyproject.org/en/master/releases/21.05_announce.html> for
a lot more detail; there are many more enhancements, new visualizations, and
bugfixes, as well as instructions for upgrading your Galaxy installation.
Thanks for using Galaxy!
1 year, 7 months
User Data for Research Purpose
by Shamse Tasnim Cynthia
Hello,
I am Shamse Tasnim Cynthia, currently pursuing my M.Sc in Computer Science
at the University of Saskatchewan, Canada, under the supervision of Dr.
Banani Roy. My current thesis work involves working on adaptive user
interfaces for Software Workflow Management Systems. It has been found that
existing SWFMSs do not provide special interaction techniques or visual
elements to assist with composing complex workflows and do not consider
scientists' preferences based on domain expertise, gender, or other
individual preferences. To overcome this, we want to investigate how user
preferences will affect modelling in different environments. In this
regard, I need user data to analyze user preferences and investigate usage
logs to refine the user model.
Galaxy, being the most popular SWFMS, installed on over 168 research
servers worldwide, certainly holds a huge amount of user data that
could be used in my research.
Therefore, it would be very helpful for me if I could obtain this data to
complete my research work, and it would be a great opportunity for me to
contribute to the Software Workflow development community.
Thank you.
Regards
Shamse Tasnim Cynthia
iSE Lab
Department of Computer Science,
University of Saskatchewan,
Saskatoon, Saskatchewan, Canada.
Contact: uji657(a)usask.ca
tasnim.cynthia99(a)gmail.com
1 year, 7 months
Pulsar connection
by Luc Cornet
Dear Galaxy admins,
We have a connection problem with Pulsar on our HPC cluster.
We used ansible-galaxy to install the Galaxy/Pulsar pair, as in the admin training session.
The two playbooks run successfully, but when we launch an analysis (a simple bwa test), the job remains pending without any error message.
The same two playbooks work very well on a virtual machine (CentOS 8), and the bwa test works there too. We simply modified the Pulsar servers file to use our HPC CentOS 8 system instead of the VM, and we don't understand why it no longer works. From the cluster, we can see that there is no connection from the pulsar user.
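When a job hangs in the pending state with no errors, one quick check is whether the RabbitMQ port is reachable from a cluster node at all. A plain TCP probe is enough for that (the host name in the comment is a placeholder, not our real server):

```python
import socket

def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (hypothetical host): can_reach("galaxy.example.org", 5671)
# should return True from a node where Pulsar is expected to run.
```

If this returns False from the HPC node, the problem is network/firewall reachability rather than the Pulsar configuration itself.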
Can you please help us fix this?
Thanks
Luc
------------
Luc Cornet, PhD
Bio-informatician
Mycology and Aerobiology
Sciensano
1 year, 7 months