Monitoring Dashboard for Galaxy
by evan clark
I remember there previously being an additional tool that ran on a
different port from Galaxy and allowed monitoring of performance
and running jobs. Is this tool still packaged with Galaxy and, if so, how
can it be activated?
3 years, 10 months
galaxy config from 17.05 to 18.05
by Fernandez Edgar
Hello gents,
I would greatly appreciate some help configuring my new installation of Galaxy 18.05.
I found most of the configuration I needed, but I'm missing the following configuration in green.
Here is the diff of my Galaxy 17.05 configuration against the sample file:
[galaxy@esilbac3a ~]$ diff galaxy-17.05/config/galaxy.ini galaxy-17.05/config/galaxy.ini.sample
32c32
< port = 7112
---
> #port = 8080
37c37
< host = 0.0.0.0
---
> #host = 127.0.0.1
50,59d49
< # ---- HTTP PBS DRMAA HANDLER -----------------------------------------------
<
< # Configuration of the pbs drmaa handler
<
< [server:handler0]
< use = egg:Paste#http
< port = 8088
< use_threadpool = True
< threadpool_workers = 10
<
74c64
< prefix = /galaxy-prod
---
> prefix = /galaxy
95c85
< filter-with = proxy-prefix
---
> #filter-with = proxy-prefix
109c99
< database_connection = mysql://galaxy:qaz1wsx2@localhost/galaxy_db_prod2?unix_socket=/var/lib/mysql/mysql.sock
---
> #database_connection = sqlite:///./database/universe.sqlite?isolation_level=IMMEDIATE
119c109
< database_engine_option_pool_recycle = 7200
---
> #database_engine_option_pool_recycle = -1
152c142
< file_path = /home/galaxy/galaxy-prod/database/files
---
> #file_path = database/files
155c145
< new_file_path = /home/galaxy/galaxy-prod/database/tmp
---
> #new_file_path = database/tmp
161c151
< tool_config_file = /home/galaxy/galaxy-prod/config/tool_conf.xml,/home/galaxy/galaxy-prod/config/shed_tool_conf.xml
---
> #tool_config_file = config/tool_conf.xml,config/shed_tool_conf.xml
197c187
< tool_dependency_dir = /home/galaxy/galaxy-prod/tool-data/toolshed.dependency.dir
---
> #tool_dependency_dir = database/dependencies
401c391
< job_working_directory = /home/galaxy/galaxy-prod/database/jobs_directory
---
> #job_working_directory = database/jobs_directory
466c456
< smtp_server = smtp.umontreal.ca:25
---
> #smtp_server = None
475c465
< smtp_ssl = False
---
> #smtp_ssl = False
486c476
< error_email_to = edgar.fernandez(a)umontreal.ca
---
> #error_email_to = None
493c483
< email_from = Galaxy ESI Project <rootbac(a)esi.umontreal.ca>
---
> #email_from = None
679c669
< apache_xsendfile = True
---
> #apache_xsendfile = False
693c683
< upstream_gzip = False
---
> #upstream_gzip = False
848c838
< debug = False
---
> #debug = False
861c851
< use_interactive = False
---
> use_interactive = True
989c979
< id_secret = 3s1G@l@xyPr0j3ct
---
> #id_secret = USING THE DEFAULT IS NOT SECURE!
1038c1028
< admin_users = rootbac(a)esi.umontreal.ca
---
> #admin_users = None
1041c1031
< require_login = True
---
> #require_login = False
1049c1039
< allow_user_creation = False
---
> #allow_user_creation = True
1052c1042
< allow_user_deletion = True
---
> #allow_user_deletion = False
1055c1045
< allow_user_impersonation = True
---
> #allow_user_impersonation = False
1060c1050
< allow_user_dataset_purge = True
---
> #allow_user_dataset_purge = True
1066c1056
< new_user_dataset_access_role_default_private = True
---
> #new_user_dataset_access_role_default_private = False
1200c1190
< enable_quotas = True
---
> #enable_quotas = False
1234c1224
< job_config_file = /home/galaxy/galaxy-prod/config/job_conf.xml
---
> #job_config_file = config/job_conf.xml
1292c1282
< retry_job_output_collection = 20
---
> #retry_job_output_collection = 0
1310c1300
< cleanup_job = onsuccess
---
> #cleanup_job = always
Please help me find what I am missing.
Best regards,
Edgar Fernandez
Administrateur Système (Linux)
Technologies de l'Information
Université de Montréal
PAVILLON ROGER-GAUDRY, bureau X-210
* Bur. : 1-514-343-6111 poste 16568
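One way to check which customised settings still need a home in the new installation is to pull the option names out of the diff and look for them in the new release's sample configuration. The following is only a minimal sketch: the function names are hypothetical, it assumes classic `diff` output as shown above, and it only flags names for manual review, since options may have been renamed or moved between releases.

```python
import re

# Matches a "key = value" line from the left-hand file of classic diff output,
# e.g. "< port = 7112". Comment lines ("< # ...") and section headers
# ("< [server:handler0]") do not match.
_LEFT_OPTION = re.compile(r"^<\s*([A-Za-z_][\w\-]*)\s*=\s*(.*)$")


def customised_options(diff_text):
    """Return a list of (name, value) pairs found on the '<' side of the diff."""
    pairs = []
    for line in diff_text.splitlines():
        match = _LEFT_OPTION.match(line)
        if match:
            pairs.append((match.group(1), match.group(2).strip()))
    return pairs


def missing_from(new_config_text, names):
    """Return the option names that never appear as a key in the new config text.

    A key counts as present if a line starts with "name =" or "name:",
    whether or not the line is commented out with a leading '#'.
    """
    missing = []
    for name in sorted(set(names)):
        pattern = r"^\s*#?\s*" + re.escape(name) + r"\s*[:=]"
        if not re.search(pattern, new_config_text, flags=re.MULTILINE):
            missing.append(name)
    return missing
```

To use it, save the diff to a file, read the new sample configuration from whatever path your installation provides, and pass the names from `customised_options` to `missing_from`. Anything it reports is a candidate for a renamed or relocated setting rather than proof that the setting no longer exists.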
3 years, 10 months
Re: [galaxy-dev] collections with more than 25,000 items
by Mohammad Heydarian
We initially had issues with Collections containing thousands of datasets;
they were related to the limit on the number of jobs in the Slurm queue.
Substantially increasing this limit fixed our issue.
Cheers,
Mo Heydarian
On Thu, Aug 30, 2018 at 11:18 AM Peter Cock <p.j.a.cock(a)googlemail.com>
wrote:
> There is a sweet spot for splitting your BLAST query fasta file
> by sequence - one big file with 25000 sequences is not great,
> but one sequence per file is the worst possible option.
>
> This is due to all the extra overheads: you would have 25000
> jobs submitted to the cluster, each of which would load the
> BLAST binary and database off disk etc. There are also
> going to be Galaxy overheads with a large collection.
>
> I would suggest that somewhere around 500 to 1000 gene sequences
> per FASTA query file is likely a safe choice. If you have very
> long sequences (e.g. chromosomes or contigs), then use fewer.
>
> As to the number of threads for each BLAST job, more is better,
> but what to pick will depend on your cluster and how often there
> are threads free on nodes. I would suggest trying 4, 8 or 16 threads.
>
> I hope that helps.
>
> Peter
>
>
> On Thu, Aug 30, 2018 at 3:50 PM Jochen Bick <jochen.bick(a)usys.ethz.ch>
> wrote:
> >
> > Thanks Peter,
> >
> > so my idea was to split my problem into single blast jobs and run them
> > only on one core...
> > So my file has 25000 sequences and I'm blasting them against all NCBI
> > proteins (nr). This just takes too long. I guess because the database
> > is also very big? I tested this on the first 10 sequences and it took
> > about 10 minutes. But maybe this is still not faster than running all
> > at once?
> > How many cores would you give such a job?
> >
> > Cheers Jochen
> >
> > On 30.08.2018 16:44, Peter Cock wrote:
> > > If there are any limits, it would be down to the Galaxy Admin's job
> > > settings - something generic with collections.
> > >
> > > Personally I've not done this - I tend to concatenate FASTA files
> > > to make large files with multiple sequences instead.
> > >
> > > (And then we have the optional task splitting enabled so that Galaxy
> > > breaks up the multiple-sequence FASTA file into chunks which
> > > get shared out on our cluster for better throughput before
> > > concatenating the output back into a single file.)
> > >
> > > Peter
> > > On Thu, Aug 30, 2018 at 3:37 PM Jochen Bick <jochen.bick(a)usys.ethz.ch> wrote:
> > >>
> > >> Hi,
> > >>
> > >> is there any limit on running BLAST jobs from a collection of single FASTA
> > >> files? I started a job but it does not get executed... it's just sending
> > >> for about an hour.
> > >>
> > >> Cheers Jochen
> > >>
> > >> --
> > >> ETH Zurich
> > >> *Jochen Bick*
> > >> Animal Physiology
> > >> Institute of Agricultural Sciences
> > >> Postal address: Universitätstrasse 2 / LFW B 58.1
> > >> Office: Tannenstrasse 1 / TAN D 6.2
> > >> 8092 Zurich, Switzerland
> > >>
> > >> Phone +41 44 632 28 25
> > >> jochen.bick(a)usys.ethz.ch <mailto:jochen.bick@usys.ethz.ch>
> > >> www.ap.ethz.ch
> > >> ___________________________________________________________
> > >> Please keep all replies on the list by using "reply all"
> > >> in your mail client. To manage your subscriptions to this
> > >> and other Galaxy lists, please use the interface at:
> > >> https://lists.galaxyproject.org/
> > >>
> > >> To search Galaxy mailing lists use the unified search at:
> > >> http://galaxyproject.org/search/
3 years, 10 months
Re: [galaxy-dev] collections with more than 25,000 items
by Peter Cock
There is a sweet spot for splitting your BLAST query fasta file
by sequence - one big file with 25000 sequences is not great,
but one sequence per file is the worst possible option.
This is due to all the extra overheads: you would have 25000
jobs submitted to the cluster, each of which would load the
BLAST binary and database off disk etc. There are also
going to be Galaxy overheads with a large collection.
I would suggest that somewhere around 500 to 1000 gene sequences
per FASTA query file is likely a safe choice. If you have very
long sequences (e.g. chromosomes or contigs), then use fewer.
As to the number of threads for each BLAST job, more is better,
but what to pick will depend on your cluster and how often there
are threads free on nodes. I would suggest trying 4, 8 or 16 threads.
I hope that helps.
Peter
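The chunking suggested above can be sketched in a few lines of standard-library Python. This is only an illustration, not how Galaxy's task splitting works: the function names and the output naming scheme are hypothetical. With 25000 sequences, 500 records per file gives 50 files and 1000 per file gives 25.

```python
def split_fasta(lines, records_per_chunk=500):
    """Yield lists of lines, each holding at most `records_per_chunk` FASTA records.

    A record starts at a line beginning with '>' and runs until the next one.
    """
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be at least 1")
    chunk, count = [], 0
    for line in lines:
        if line.startswith(">"):
            if count == records_per_chunk:
                # Current chunk is full: hand it over and start a new one.
                yield chunk
                chunk, count = [], 0
            count += 1
        chunk.append(line)
    if chunk:
        yield chunk


def write_chunks(in_path, out_prefix, records_per_chunk=500):
    """Write each chunk to '<out_prefix>_NNNN.fasta' and return the paths."""
    paths = []
    with open(in_path) as handle:
        chunks = split_fasta(handle, records_per_chunk)
        for index, chunk in enumerate(chunks, start=1):
            path = "%s_%04d.fasta" % (out_prefix, index)
            with open(path, "w") as out:
                out.writelines(chunk)
            paths.append(path)
    return paths
```

The generator reads the input one line at a time, so it does not need to hold a large query file in memory. The resulting files can then be uploaded as a much smaller collection than one file per sequence.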
On Thu, Aug 30, 2018 at 3:50 PM Jochen Bick <jochen.bick(a)usys.ethz.ch> wrote:
>
> Thanks Peter,
>
> so my idea was to split my problem into single blast jobs and run them
> only on one core...
> So my file has 25000 sequences and I'm blasting them against all NCBI
> proteins (nr). This just takes too long. I guess because the database
> is also very big? I tested this on the first 10 sequences and it took
> about 10 minutes. But maybe this is still not faster than running all at once?
> How many cores would you give such a job?
>
> Cheers Jochen
>
> On 30.08.2018 16:44, Peter Cock wrote:
> > If there are any limits, it would be down to the Galaxy Admin's job
> > settings - something generic with collections.
> >
> > Personally I've not done this - I tend to concatenate FASTA files
> > to make large files with multiple sequences instead.
> >
> > (And then we have the optional task splitting enabled so that Galaxy
> > breaks up the multiple-sequence FASTA file into chunks which
> > get shared out on our cluster for better throughput before
> > concatenating the output back into a single file.)
> >
> > Peter
> > On Thu, Aug 30, 2018 at 3:37 PM Jochen Bick <jochen.bick(a)usys.ethz.ch> wrote:
> >>
> >> Hi,
> >>
> >> is there any limit on running BLAST jobs from a collection of single FASTA
> >> files? I started a job but it does not get executed... it's just sending
> >> for about an hour.
> >>
> >> Cheers Jochen
3 years, 10 months
collections with more than 25,000 items
by Jochen Bick
Hi,
is there any limit on running BLAST jobs from a collection of single FASTA
files? I started a job but it does not get executed... it's just sending
for about an hour.
Cheers Jochen
--
ETH Zurich
*Jochen Bick*
Animal Physiology
Institute of Agricultural Sciences
Postal address: Universitätstrasse 2 / LFW B 58.1
Office: Tannenstrasse 1 / TAN D 6.2
8092 Zurich, Switzerland
Phone +41 44 632 28 25
jochen.bick(a)usys.ethz.ch <mailto:jochen.bick@usys.ethz.ch>
www.ap.ethz.ch
3 years, 10 months
java options
by Matthias Bernt
Dear list,
I'm struggling to set Java options for Galaxy tools. Currently I use
`<env file="job_setup_script.sh"/>` in my job_conf.xml, and in the script
I explored two ways to set Java options:
`_JAVA_OPTIONS="-Xmx5G -Xms1G -Djava.io.tmpdir=/work/songalax/tmp"`
But then Java prints "Picked up _JAVA_OPTIONS: ..." to stderr, which is
interpreted as an error by some tools. If I'm correct, this is currently
the default. I have seen this happening, but can't remember the name
of the tool.
So I switched to this 'trick':
`alias java='java -Xmx5G -Xms1G -Djava.io.tmpdir=/work/songalax/tmp'`
but the problem is that the tool script is called and not sourced, and
therefore aliases are not used (could this be changed?). Furthermore,
for tools which explicitly set the Java options, my settings would be
ignored anyway (an example is the MSGFPlusAdapter, which calls `java
-Xmx3500m`, so my -Xmx is overwritten; passing GALAXY_MEMORY_MB to
the corresponding parameter might be a solution for this tool).
Any thoughts or suggestions on how to set the parameters in a production
environment are very welcome.
Cheers,
Matthias
--
-------------------------------------------
Matthias Bernt
Bioinformatics Service
Molekulare Systembiologie (MOLSYB)
Helmholtz-Zentrum für Umweltforschung GmbH - UFZ/
Helmholtz Centre for Environmental Research GmbH - UFZ
Permoserstraße 15, 04318 Leipzig, Germany
Phone +49 341 235 482296,
m.bernt(a)ufz.de, www.ufz.de
Sitz der Gesellschaft/Registered Office: Leipzig
Registergericht/Registration Office: Amtsgericht Leipzig
Handelsregister Nr./Trade Register Nr.: B 4703
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board:
MinDirig Wilfried Kraus
Wissenschaftlicher Geschäftsführer/Scientific Managing Director:
Prof. Dr. Dr. h.c. Georg Teutsch
Administrative Geschäftsführerin/ Administrative Managing Director:
Prof. Dr. Heike Graßmann
-------------------------------------------
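The two ideas in the post, deriving `-Xmx` from GALAXY_MEMORY_MB and coping with the "Picked up _JAVA_OPTIONS" notice, can be sketched as a small wrapper. This is an illustration under assumptions, not Galaxy's own mechanism: it assumes GALAXY_MEMORY_MB is present in the job environment as the post suggests, and the function names, the 1024 MB fallback and the 90 percent headroom are all hypothetical choices.

```python
import os


def java_heap_flag(environ=None, default_mb=1024, headroom_percent=90):
    """Build an -Xmx flag from GALAXY_MEMORY_MB, leaving some headroom.

    The fallback and the headroom are illustrative assumptions, not Galaxy defaults.
    """
    env = os.environ if environ is None else environ
    try:
        mem_mb = int(env.get("GALAXY_MEMORY_MB", default_mb))
    except ValueError:
        mem_mb = default_mb
    return "-Xmx%dm" % max(1, mem_mb * headroom_percent // 100)


def java_command(jar, args=(), environ=None, tmpdir=None):
    """Return an argument list that passes the options explicitly to java."""
    cmd = ["java", java_heap_flag(environ)]
    if tmpdir:
        cmd.append("-Djava.io.tmpdir=%s" % tmpdir)
    cmd += ["-jar", jar]
    cmd += list(args)
    return cmd


def strip_picked_up_notice(stderr_text):
    """Drop the JVM's 'Picked up _JAVA_OPTIONS: ...' lines from captured stderr."""
    kept = [
        line
        for line in stderr_text.splitlines()
        if not line.startswith("Picked up _JAVA_OPTIONS:")
    ]
    return "\n".join(kept)
```

Passing the options on the command line avoids the stderr notice entirely, at the cost of needing a wrapper per tool. The filter is a fallback for cases where `_JAVA_OPTIONS` has to stay set and the tool's stderr is inspected for errors.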
3 years, 10 months
European Galaxy Days, 19/20 November 2018 in Freiburg, Germany.
by Hans-Rudolf Hotz
Dear all
We are happy to announce the European Galaxy Days which will be held 19
and 20 November 2018 in Freiburg, Germany.
https://galaxyproject.org/events/2018-europe-dev/
Similar to the events we organized in 2016, 2014 and 2012, the aim is to
discuss the status of the Galaxy project, new developments, interfaces
to other systems, extensions and best practice in reproducible research.
The program is planned as follows:
Monday, November 19th
We intend to have a full day of talks from you. We especially encourage
Galaxy users to present their work with Galaxy.
Please indicate your interest in presenting when you register (we will
then get in contact with you), or contact us directly.
Tuesday, November 20th
This day is more on the technical side, with presentations, tutorials and
hands-on exercises. Currently we plan to discuss/present:
- Galaxy and Machine learning
- Single Cell RNA Seq analysis with Galaxy
- Combining Galaxy with Shiny
Registration is now open (space is limited). There will be no conference
fee, though you need to cover your own food and accommodation (we are
currently looking for sponsors to cover the lunches on the two days). We
recommend you register soon to secure your spot:
https://tinyurl.com/EGD2018
Looking forward to seeing many of you in Freiburg
Regards,
Jean-François, Bjoern and Hans-Rudolf
3 years, 10 months
Permanently Deleted Data
by Emm L
Hello,
I am emailing in desperate hope for some help here. I trimmed my files and waited for days for them to finish. When they were finished, I meant to purge my deleted data to make space in my account, but I accidentally purged my entire history. The Galaxy account I am using is under the email emm.elle(a)outlook.com
I was wondering if someone in IT could get me back my history. The history name is WES for ML3.
Thank you very much.
Michelle Lim
3 years, 10 months
tool datatypes
by Matthias Bernt
Dear list,
just a request for links to documentation: how can I implement
tool-specific data types? I'm developing a set of tools that need their
own data types, but I don't want to add them to Galaxy's core data types
(yet).
I've seen examples of tools that had a datatypes_conf.xml, so I created
one, but it seems to be ignored.
Cheers,
Matthias
--
-------------------------------------------
Matthias Bernt
Bioinformatics Service
Molekulare Systembiologie (MOLSYB)
Helmholtz-Zentrum für Umweltforschung GmbH - UFZ/
Helmholtz Centre for Environmental Research GmbH - UFZ
Permoserstraße 15, 04318 Leipzig, Germany
Phone +49 341 235 482296,
m.bernt(a)ufz.de, www.ufz.de
Sitz der Gesellschaft/Registered Office: Leipzig
Registergericht/Registration Office: Amtsgericht Leipzig
Handelsregister Nr./Trade Register Nr.: B 4703
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board:
MinDirig Wilfried Kraus
Wissenschaftlicher Geschäftsführer/Scientific Managing Director:
Prof. Dr. Dr. h.c. Georg Teutsch
Administrative Geschäftsführerin/ Administrative Managing Director:
Prof. Dr. Heike Graßmann
-------------------------------------------
3 years, 10 months
Unable to install tools from toolshed/mercurial issue in release_18.05 instance
by Peter Briggs
Dear devs
I've encountered what appears to be a subtle bug with release 18.05, which breaks the installation of tools from the toolshed, and appears to be a result of not having mercurial (hg) available in /usr/bin on the system that Galaxy is installed on (in this case Scientific Linux 6.5).
When attempting to install a tool (e.g. devteam/fastqc) from the main toolshed via the admin interface, after clicking "install" the tool installation status goes immediately to "Error". The tool repository isn't cloned to "shed_tools" and no dependencies are installed.
I've been unable to find any error messages in the logs. However, attempting to install via the API does return the message:
Error cloning repository: [Errno 2] No such file or directory
which comes from the "clone_repository" function in lib/tool_shed/util/hg_util.py (when something goes wrong with the "hg clone ..." command).
Installing Mercurial 1.3 via yum on the server and attempting tool installation again gives a slightly different error via the API:
Error cloning repository: Command '['hg', 'clone', '-r', u'17', u'https://toolshed.g2.bx.psu.edu/repos/devteam/fastqc', u'/XXXXXXXXXXXXXX/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/fastqc/c15237684a01/fastqc']' returned non-zero exit status 255
Output was:
abort: No such file or directory: /XXXXXXXXXXXXXXXXXX/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/fastqc/c15237684a01/fastqc
Uninstalling the system Mercurial and instead installing version 3.7.3 and making a link from /usr/bin/hg seems to fix the problem, and tools can be installed without problems.
I couldn't find any evidence of this being reported before, and I don't know if I've missed some configuration detail which means that Galaxy isn't picking up hg from its virtualenv instead of /usr/bin.
Has anyone else encountered this problem? And is there a fix/workaround for it (other than horrible links to /usr/bin/hg)?
Any advice gratefully received!
Best wishes
Peter
--
Peter Briggs peter.briggs(a)manchester.ac.uk
Bioinformatics Core Facility University of Manchester
B.1083 Michael Smith Bldg Tel: (0161) 2751482
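A quick way to see which `hg` a process would find, and which version it is, is sketched below. It is a standalone diagnostic with hypothetical function names, not part of Galaxy. It assumes the usual first line of `hg --version` output, "Mercurial Distributed SCM (version X.Y.Z)". The report above only establishes that 1.3 failed and 3.7.3 worked, so no minimum version is assumed here.

```python
import re
import shutil
import subprocess


def parse_hg_version(version_output):
    """Extract the version as a tuple of ints, e.g. (3, 7, 3), or None."""
    match = re.search(r"\(version (\d+(?:\.\d+)*)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def find_hg(search_path=None):
    """Return the full path of the first 'hg' on the search path, or None.

    With search_path=None the PATH of the current process is used.
    """
    return shutil.which("hg", path=search_path)


def hg_report(search_path=None):
    """Describe the hg executable that a subprocess call would pick up."""
    executable = find_hg(search_path)
    if executable is None:
        return "hg not found on the search path"
    result = subprocess.run(
        [executable, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    version = parse_hg_version(result.stdout)
    return "%s -> version %s" % (executable, version)


if __name__ == "__main__":
    print(hg_report())
```

For the result to be meaningful, run it as the Galaxy user with the same environment (including any activated virtualenv) as the Galaxy server process, since the "[Errno 2] No such file or directory" message is what a subprocess call reports when the executable cannot be found.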
3 years, 10 months