Question on finding external_filename and _extra_files_path from psql DB.
Dear Galaxy Team,

As part of an in-house automated cleanup of datasets on our production Galaxy service, I am having difficulty getting the Galaxy-assigned file name (galaxy-root/database/001/dataset_001.dat) for each dataset from the corresponding database. We are using PostgreSQL, and when I queried the table named 'dataset', I couldn't find any values for external_filename or _extra_files_path.

For example:

select * from dataset order by id DESC limit 10;

  id   |        create_time         |        update_time         | state | deleted | purged | purgable | external_filename | _extra_files_path | file_size
 70805 | 2011-04-22 20:49:55.319709 | 2011-04-22 20:50:26.643807 | ok    | f       | f      | t        |                   |                   | 421593

Could you please let me know where I can find the file name in other tables, or whether I need to set up any configuration?

Thanks in advance,
Vipin
Friedrich Miescher Laboratory of the Max Planck Society
Spemannstrasse 39, 72076 Tuebingen, Germany
Vipin,

The numerical portion of the file path is not saved in the database; it is generated on the fly by the function directory_hash_id in lib/galaxy/model/__init__.py. The file itself is named in the format dataset_<dataset id>.dat. So, using directory_hash_id, you should be able to reconstruct the full path to the file on disk from just the unencoded dataset id and the root Galaxy files directory prefix.

-Dannon

On Apr 25, 2011, at 4:16 PM, Vipin TS wrote:
> Could you please let me know where I can locate the file name from other tables or do I need to set up any configuration.
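The path reconstruction described in the reply can be sketched in Python as follows. This is a minimal sketch and not a copy of Galaxy's code: the body of directory_hash_id here is a reimplementation based on the commonly described behaviour (zero-pad the id, drop the last three digits, split the rest into groups of three), so it should be checked against the function in your own checkout of lib/galaxy/model/__init__.py. The helper dataset_path and the files root used in the example are hypothetical names introduced only for illustration.

```python
import os


def directory_hash_id(dataset_id):
    """Return the list of sub-directory names for a dataset id.

    Assumption: mirrors the behaviour of Galaxy's directory_hash_id;
    verify against the function in your own Galaxy checkout.
    """
    s = str(dataset_id)
    # Ids 0-999 all live under a single "000" directory.
    if len(s) < 4:
        return ["000"]
    # Left-pad with zeros (a full group of three is added when the
    # length is already a multiple of three).
    padded = ((3 - len(s) % 3) * "0") + s
    # Drop the last three digits: up to 1000 files per directory.
    padded = padded[:-3]
    # Split what is left into chunks of three digits.
    return [padded[i:i + 3] for i in range(0, len(padded), 3)]


def dataset_path(files_root, dataset_id):
    """Hypothetical helper: build the on-disk path of a dataset file."""
    parts = directory_hash_id(dataset_id)
    return os.path.join(files_root, *parts, "dataset_%d.dat" % dataset_id)


if __name__ == "__main__":
    # "/galaxy-root/database/files" is a placeholder; use your own files root.
    print(dataset_path("/galaxy-root/database/files", 70805))
```

Under these assumptions, the dataset with id 70805 from the query above resolves to 070/dataset_70805.dat beneath the files root. The ids themselves can be read straight from the id column of the dataset table.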
Hi Dannon,

Thanks for pointing this out.

Vipin
participants (2)
- Dannon Baker
- Vipin TS