where are my deleted datasets?
Hi,

I managed to enable 'Add datasets' via the 'Upload directory of files' option for non-admin users. It works fine: I can import (as admin and as a non-admin user) specific datasets into my history (copy and/or no-copy) so that I can launch some tools on them, etc. However, if I delete a specific dataset (as admin) that was imported as non-admin and then log in again as non-admin, I can still see the imported datasets in my history pane.

Where is the reference to the (deleted) dataset actually located? I was able to find the copied files at .../database/files/000/* ; these will eventually be cleaned out (if deleted) by the purge scripts. I am using SQLAlchemy with MySQL for the DB connection.

cheers,
Erick
Hello Erick,

On Mar 22, 2010, at 12:32 PM, Erick Antezana wrote:

> Hi,
> I managed to enable 'Add datasets' via the 'Upload directory of files' option for non-admin users. It works fine:

Just to confirm, this option is available only for uploading datasets to a data library (but not to a history).

> I can import (as admin and as a non-admin user) specific datasets into my history (copy and/or no-copy) so that I can launch some tools on them, etc. However, if I delete a specific dataset (as admin) that was imported as non-admin and then log in again as non-admin, I can still see the imported datasets in my history pane.

I assume that when you "delete a specific dataset (as admin) that was imported as non-admin", you are marking the dataset as deleted in the data library. Doing this will not alter (delete) the history item that points to the deleted library dataset. This is why any user who imported the library dataset into their history before it was marked deleted in the library will still see the history item that points to the deleted library dataset.

> Where is the reference to the (deleted) dataset actually located? I was able to find the copied files at .../database/files/000/* ; these will eventually be cleaned out (if deleted) by the purge scripts. I am using SQLAlchemy with MySQL for the DB connection.
When you first upload a file to a data library, an association object (a LibraryDatasetDatasetAssociation object) is created that associates the library dataset with the file on disk (stored in ~/database/files/000/*). At this point, there is only one reference to the disk file. However, when you import the library dataset into a history, another association object (a HistoryDatasetAssociation object) is created that associates the history item with the same file on disk. This, of course, creates a second reference to the same disk file.

In order for the ~/scripts/cleanup_datasets/cleanup_datasets.py script to remove the file from disk, every reference to the disk file must be marked as deleted. In other words, not only must the library dataset be marked as deleted, but every history item that was created by importing the library dataset into a history must be marked as deleted as well. If any of these association objects remains undeleted, the disk file will not be removed.
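The rule above can be illustrated with a small, self-contained Python sketch. This is a toy model only, not Galaxy's actual model classes or the logic of cleanup_datasets.py: the Association and DiskFile classes, their fields, and the file name are all hypothetical and exist purely to show why a disk file stays in place while any reference to it is still undeleted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Association:
    """Toy stand-in for a library or history association (hypothetical)."""
    kind: str              # "library" or "history"; labels for this sketch only
    deleted: bool = False


@dataclass
class DiskFile:
    """Toy stand-in for one file on disk and the references that point to it."""
    path: str
    associations: List[Association] = field(default_factory=list)

    def can_be_removed(self) -> bool:
        # Eligible for removal only when every reference is marked deleted.
        return all(a.deleted for a in self.associations)


# Uploading to a data library creates the first reference to the disk file.
ldda = Association("library")
disk_file = DiskFile("dataset_1.dat", [ldda])  # hypothetical file name

# Importing into a history creates a second reference to the same file.
hda = Association("history")
disk_file.associations.append(hda)

# Marking the library dataset as deleted leaves the history item untouched.
ldda.deleted = True
print(disk_file.can_be_removed())  # False: the history item is still live

# Only once the history item is also marked deleted is the file eligible.
hda.deleted = True
print(disk_file.can_be_removed())  # True
```

In this sketch the first check fails for the same reason described above: the history item created by the import still references the file.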
> cheers,
> Erick
> _______________________________________________
> galaxy-user mailing list
> galaxy-user@lists.bx.psu.edu
> http://lists.bx.psu.edu/listinfo/galaxy-user
Greg Von Kuster
Galaxy Development Team
greg@bx.psu.edu