Hi Pi,
The wiki for deleting datasets is out of date, and I will be updating it shortly. There is a collection of shell scripts included in the scripts/cleanup_datasets directory. To delete datasets that are no longer needed from disk, run the scripts in the following order (assuming you have not used library functions):
delete_userless_histories.sh
purge_histories.sh
purge_datasets.sh
I will send a message after the wiki has been updated.
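As a rough sketch of that ordering, the Python snippet below runs the three scripts one after another and stops at the first failure. It assumes it is launched from the Galaxy root directory and that the scripts need no extra arguments; both are assumptions, and `cleanup_commands` and `run_cleanup` are illustrative names, not part of Galaxy.

```python
import subprocess
from pathlib import Path

# Order taken from the message above; the directory is relative to the Galaxy root.
SCRIPT_DIR = Path("scripts/cleanup_datasets")
SCRIPT_ORDER = [
    "delete_userless_histories.sh",
    "purge_histories.sh",
    "purge_datasets.sh",
]

def cleanup_commands(script_dir=SCRIPT_DIR):
    """Return the shell commands in the order they should be run."""
    return [["sh", str(Path(script_dir) / name)] for name in SCRIPT_ORDER]

def run_cleanup(script_dir=SCRIPT_DIR, dry_run=True):
    """Run each script in order; with dry_run=True only print the commands."""
    for cmd in cleanup_commands(script_dir):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if a script exits non-zero,
            # so later steps are not run after a failure.
            subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run_cleanup(dry_run=True)
```

Starting with `dry_run=True` only prints the commands, which makes it easy to check the order before anything is touched.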
In addition: 1. What if I ran the script without -r and later decide I want to delete the associated files anyway to free up some space? How do I then know what files to delete?
This is an excellent feature for us to add to the script.
2. If I understand correctly, I should be able to remove associated data sets with -r, but even when purging stuff the entries will still remain in the database... How do I really, really, Yes-Ok-I-accept-I-know-what-I'm-doing-Delete outdated stuff :) ?
There are several database tables which Galaxy expects to exist (for Job reporting, etc.) and which should not have entries deleted. Datasets are an example of this: when a Dataset is purged, the purged flag is set to True, but the entry is kept. Deleting entries from the dataset tables is not recommended. Thanks for using Galaxy, Dan
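To illustrate the purged-flag idea (and the earlier question about finding files to remove later), here is a minimal sketch using an in-memory SQLite database. The table layout, column names and file paths are hypothetical and are not Galaxy's actual schema; the point is only that purging updates a flag instead of deleting the row, so rows that are deleted but not yet purged can still be listed.

```python
import sqlite3

def make_demo_db():
    """Create an in-memory table loosely modelled on a dataset table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dataset ("
        " id INTEGER PRIMARY KEY,"
        " file_path TEXT,"
        " deleted INTEGER DEFAULT 0,"
        " purged INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO dataset (id, file_path, deleted, purged) VALUES (?, ?, ?, ?)",
        [
            (1, "files/dataset_1.dat", 0, 0),  # still in use
            (2, "files/dataset_2.dat", 1, 0),  # deleted, file still on disk
            (3, "files/dataset_3.dat", 1, 1),  # deleted and already purged
        ],
    )
    return conn

def files_awaiting_purge(conn):
    """List files of rows marked deleted but not yet purged."""
    rows = conn.execute(
        "SELECT file_path FROM dataset WHERE deleted = 1 AND purged = 0 ORDER BY id"
    )
    return [path for (path,) in rows]

def purge(conn, dataset_id):
    """Mark a row as purged; the row itself is kept for reporting."""
    conn.execute("UPDATE dataset SET purged = 1 WHERE id = ?", (dataset_id,))

conn = make_demo_db()
print(files_awaiting_purge(conn))   # ['files/dataset_2.dat']
purge(conn, 2)
print(files_awaiting_purge(conn))   # []
print(conn.execute("SELECT COUNT(*) FROM dataset").fetchone()[0])  # 3
```

The row count stays at 3 after the purge, which mirrors the behaviour described above: the entry remains and only the flag changes.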
Hi Erick, Greg et alia,
I've set up Galaxy with a MySQL DB too, but I cannot get rid of old stuff. According to the wiki, running the script with ... -1 or -3 or -5 should show me what the script would do with -2, -4 or -6. When I ran with -1 it told me:
--------
# 2009-07-29 14:03:22 - Handling stuff older than 1 days
# Datasets will NOT be removed from disk.
# The following datasets and associated userless histories have been deleted
# Deleted 0 histories.
Elapsed time: 0.21
--------
That was a bit weird, because I know there should be stuff to delete. So I tried my luck with -2 to perform the actual cleanup and voilà:
--------
# 2009-07-29 14:04:25 - Handling stuff older than 1 days
# Datasets will NOT be removed from disk.
# The following datasets and associated deleted histories have been purged
1 4 5 6 7 8 9 10 11 12 13 14
<..cut a lot of white space..>
15 16
# Purged 14 histories.
Elapsed time: 1.17
--------
Running with -3, -4 and -5 all gave me 0 in either purged data sets or folders, but I know there must be stuff associated with user accounts older than 1 day that should be purged... The -6 option does not seem to work at all, as I got this error: "cleanup_datasets.py: error: no such option: -6". Am I missing something?
In addition: 1. What if I ran the script without -r and later decide I want to delete the associated files anyway to free up some space? How do I then know what files to delete? 2. If I understand correctly, I should be able to remove associated data sets with -r, but even when purging stuff the entries will still remain in the database... How do I really, really, Yes-Ok-I-accept-I-know-what-I'm-doing-Delete outdated stuff :) ?
Cheers,
Pi
On 23Jul2009, at 5:17 PM, Erick Antezana wrote:
Greg,
please see in-line:
2009/7/23 Greg Von Kuster <ghv2@psu.edu>
Hi Erick,
Erick Antezana wrote:
Greg,
I managed to set my connection string so that we could use a remote mysql server. Thanks.
w.r.t. the datasets purging, I used the scripts to clean deleted libraries, folders, datasets, userless history ... I've seen that one must specify the span of time in days. What about data that was added mistakenly, for instance today, and that we want to delete immediately? I tried to launch the script with "-d 0" but the data is still there... Am I missing something?
No, I don't think so. It's possible that your system clock is off from your database time.
both servers (mysql and the one where galaxy is running) have the same time.
Is your database storing time as local time?
how can I see that?
The cleanup script uses the update_time for the objects being deleted.
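For what it's worth, an age-based cleanup of this kind generally boils down to comparing each object's update_time with a cutoff of "now minus N days". The sketch below is not the Galaxy code; the function names are made up, and it only shows why the clock (or time zone) used for "now" matters when N is 0.

```python
from datetime import datetime, timedelta

def cutoff_time(now, days):
    """Objects last updated before this moment are eligible for cleanup."""
    return now - timedelta(days=days)

def is_eligible(update_time, now, days):
    """True if update_time is older than the cutoff."""
    return update_time < cutoff_time(now, days)

# An object stored with an update_time of 14:00.
update_time = datetime(2009, 7, 23, 14, 0, 0)

# Script clock agrees with the stored time: with days=0 the object qualifies.
print(is_eligible(update_time, now=datetime(2009, 7, 23, 15, 0, 0), days=0))  # True

# Script clock is behind the stored time (e.g. UTC vs. local time):
# the cutoff falls before update_time, so the object is skipped.
print(is_eligible(update_time, now=datetime(2009, 7, 23, 13, 0, 0), days=0))  # False
```

If the stored timestamps and the script's notion of "now" use different time zones, "-d 0" can therefore skip objects that were updated only a short while ago.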
In which file can I find the SQL command that actually deletes and purges the data?
I am no longer using the sqlite DB created in our first trials. I guess I can safely delete (from the command line) all the files under the directory database?
Maybe. Did you keep any data that refers to them in your tables when you migrated to mysql? If so, you'll need to keep them.
no, I have no data referring to anything... I just deleted (to save space) all those files and I have no problems at all (so far ;-) )
have the purge_*.sh scripts been tested with mysql?
Yes
last question (already asked before): are there any plans to support Oracle?
Not sure why it wouldn't already be supported, although we don't use it here. Just needs a different URL - sqlalchemy supports Oracle.
good to know that, I will try to find some time to test it and let you know.
cheers, Erick
thanks, Erick
2009/7/22 Greg Von Kuster <ghv2@psu.edu>
Erick,
To use a different database than the sqlite that comes with the Galaxy distribution, all that is needed is to change the config setting, providing the URL that points to your mysql database. See the mysql documentation for the connection URL, as the URL differs depending upon whether your database is installed locally or not.
The config setting is the "database_connection" setting, and could look something like this:
database_connection = mysql:///greg_test?unix_socket=/var/run/mysqld/mysqld.sock
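For reference, the pieces of such a URL can be inspected with Python's standard library, as sketched below. The first URL is the one from the example above; the second, with a host name, user and password, is a made-up remote example following the usual `dialect://user:password@host:port/database` layout, so adjust it to your own server. `describe_db_url` is an illustrative helper, not a Galaxy or SQLAlchemy function.

```python
from urllib.parse import urlsplit, parse_qs

def describe_db_url(url):
    """Break a database URL into its parts using only the standard library."""
    parts = urlsplit(url)
    return {
        "dialect": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,      # None means no host was given (local connection)
        "port": parts.port,
        "database": parts.path.lstrip("/"),
        "options": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }

# Local server reached through a Unix socket (URL from the message above).
local = describe_db_url("mysql:///greg_test?unix_socket=/var/run/mysqld/mysqld.sock")
print(local["database"], local["host"], local["options"])

# Remote server; host, user and password here are placeholders.
remote = describe_db_url("mysql://galaxy_user:secret@dbhost.example.org:3306/galaxy")
print(remote["database"], remote["host"], remote["port"])
```

The three slashes in the local form simply mean that the host part is empty, which is why the socket path is passed as a query option instead.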
Greg Von Kuster Galaxy Development Team
Erick Antezana wrote:
Hello,
I would like to use MySQL instead of sqlite to store my data. I couldn't find a HOWTO or any guidelines on the Galaxy web site for doing this. I only found some lines that might need to be changed/enabled in the universe_wsgi.ini file:
#database_file = database/universe.sqlite
database_connection = mysql:///galaxy
#database_engine_option_echo = true
#database_engine_option_echo_pool = true
#database_engine_option_pool_size = 10
#database_engine_option_max_overflow = 20
Could you point me to some docs or briefly describe what I need to do in order to go for mysql?
Are there any plans to support other DBMS's (like Oracle for instance)?
thanks, Erick
------------------------------------------------------------------------
_______________________________________________ galaxy-user mailing list galaxy-user@bx.psu.edu
http://mail.bx.psu.edu/cgi-bin/mailman/listinfo/galaxy-user
------------------------------------------------------------- Biomolecular Mass Spectrometry and Proteomics Utrecht University
Visiting address: H.R. Kruyt building room O607 Padualaan 8 3584 CH Utrecht The Netherlands
Mail address: P.O. box 80.082 3508 TB Utrecht The Netherlands
phone: +31 (0)6-143 66 783 email: pieter.neerincx@gmail.com skype: pieter.online ------------------------------------------------------------