Re: [galaxy-dev] Slow repsonses viewing histories

15 Jul 2015

      Hi Christian and Carl,

Thanks both for the replies.

To answer your questions in reverse order. I have about XX histories in my account each with an average of about XX datasets. Total data in my account is about 1TB.

It is indeed an admin account and other users with close to 1TB of data do not have a similar slow down. Although their data is spread over far fewer histories. Is there a way then to prevent the file_name attribute being requested for admin accounts so I can see if this speeds things back up again?

Although the Galaxy server is running on my iMac the data is stored external on a large directly attached NAS. I think I first noticed this slow down after deleting and purging a bunch of older histories to free space on the NAS. I have tried running some of the cleanup_datasets scripts but they are actually returning errors and not running right now (can give you the error messages if necessary).

The slowdown is actually getting worse now and it is even slow to display tool pages, as well as often getting this error if it is really slow:
Proxy Error

The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /history/list<http://iworm.anat.ucl.ac.uk:8080/history/list>.

Reason: Error reading from remote server

I am running through an apache proxy - perhaps the apache settings need tweaking too? (I forget right now where I set these up!).

As for the database itself, I am running PostgreSQL 9.3 and I tweaked the settings in my universe_wsgi.ini as per the instructions on https://wiki.galaxyproject.org/Admin/Config/Performance/ProductionServer#Adv...

So my settings are:

# -- Database

# By default, Galaxy uses a SQLite database at 'database/universe.sqlite'.  You
# may use a SQLAlchemy connection string to specify an external database
# instead.  This string takes many options which are explained in detail in the
# config file documentation.
database_connection = postgresql://*******:*******@localhost:5432/galaxy_prod

# If the server logs errors about not having enough database pool connections,
# you will want to increase these values, or consider running more Galaxy
# processes.
database_engine_option_pool_size = 10
database_engine_option_max_overflow = 20

# If using MySQL and the server logs the error "MySQL server has gone away",
# you will want to set this to some positive value (7200 should work).
#database_engine_option_pool_recycle = -1

# If large database query results are causing memory or response time issues in
# the Galaxy process, leave the result on the server instead.  This option is
# only available for PostgreSQL and is highly recommended.
database_engine_option_server_side_cursors = True

# Create only one connection to the database per thread, to reduce the
# connection overhead.  Recommended when not using SQLite:
database_engine_option_strategy = threadlocal

# Log all database transactions, can be useful for debugging and performance
# profiling.  Logging is done via Python's 'logging' module under the qualname
# 'galaxy.model.orm.logging_connection_proxy'
database_query_profiling_proxy = False

# -- Files and directories

Let me know if you think these settings are appropriate or need further tweaks.

Thanks again for your responses so far,

Richard

On 13 Jul 2015, at 16:31, Carl Eberhard <carlfeberhard@gmail.com<mailto:carlfeberhard@gmail.com>> wrote:

Hi, Richard

How many histories are on your account? How many datasets (roughly)?

Are you using an Admin account to view the histories and does the slow down still occur for regular users with large amounts of data?

One of the exposed attributes of datasets (for admins - not other users generally) is the file_name. I've noticed that retrieving this attribute from the file system can be slow.

Christian also provides good advice.

On Thu, Jul 9, 2015 at 4:12 AM, Christian Brenninkmeijer <christian.brenninkmeijer@manchester.ac.uk<mailto:christian.brenninkmeijer@manchester.ac.uk>> wrote:
Hi Richard,

I am relatively new to galaxy so if you get a different response from one of the core team ignore this.

One thing I would check is the underlying database.
What do you have set for "database_connection" in your galaxy.ini file.

Especially if you are using the default sqlite this could be the issue. As that is store in a single file on disk.

Whichever database you have make sure it has enough resources to handle what will now be a large size.

Christian
________________________________
From: galaxy-dev [galaxy-dev-bounces@lists.galaxyproject.org<mailto:galaxy-dev-bounces@lists.galaxyproject.org>] on behalf of Poole, Richard [r.poole@ucl.ac.uk<mailto:r.poole@ucl.ac.uk>]
Sent: Wednesday, July 08, 2015 9:04 PM
To: galaxy-dev@lists.galaxyproject.org<mailto:galaxy-dev@lists.galaxyproject.org>
Subject: [galaxy-dev] Slow repsonses viewing histories

Hi all,

I am having trouble right now with my own personal account on my production server. Grid refreshes are taking a huge amount of time (e.g. when viewing ‘saved histories’ or even generating the dataset list for a single history). My account is very full of data (1TB), could it be this?

There are no obvious messages in the logs though so I am a bit stumped as to why.I do not have the same trouble when impersonating other users with fairly full accounts. Perhaps a database issue (I do not know how to ‘cleanup’ the database or indeed Galaxy user accounts). Any thoughts?

Thanks,
Richard

Richard J Poole PhD
Wellcome Trust Fellow
Department of Cell and Developmental Biology
University College London
21 University Street, London WC1E 6DE
Office (518 Rockefeller): +44 20 7679 6577<tel:%2B44%C2%A020%207679%206577> (int. 46577)
Lab (529 Rockefeller): +44 20 7679 6133<tel:%2B44%2020%C2%A07679%206133> (int. 46133)

___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/