Hello,
I’ve developed a few tools for Galaxy and I think I’ve run into a bug
that still exists in the latest version. As you know, a Galaxy server
keeps the external dataset ID (i.e. the one visible in the web URLs)
identical to the internal filesystem dataset ID (i.e. the number in the
file names under database/files/000/) as long as no user on the server
has shared any histories (and their datasets). Once sharing starts, the
external dataset IDs begin to differ from the internal ones (they are
always higher), and Galaxy handles this mapping transparently. But this
behavior seems to be broken for the output files_path property.
Take a tool that uses the output files_path property, like this one of
mine:
<command interpreter="perl">search.pl $query_list $output1 $output1.files_path</command>
On my test server I’ve shared one history with a single dataset, so my
external-internal offset is 1. The above tool then produces the
following command:
perl /home/hermida/galaxy/galaxy_dist/tools/omics_data_miner/search.pl /home/hermida/galaxy/galaxy_dist/database/files/000/dataset_28.dat /home/hermida/galaxy/galaxy_dist/database/files/000/dataset_31.dat /home/hermida/galaxy/galaxy_dist/database/tmp/dataset_32_files
Galaxy is not generating the correct files_path: it should end in
dataset_31_files, not dataset_32_files. This bug completely breaks the
tool when you try to view its output in the browser. I tried to work
around it with symlinks, but that can’t fix the problem, because the
symlinks start colliding with existing directories once you run the
tool more than once.
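To make the mismatch concrete, here is a minimal Python sketch of the check I have in mind. It is not Galaxy code: the helper names expected_files_dir and files_path_matches are hypothetical, and it simply assumes the naming convention seen in the command above (dataset_N.dat paired with dataset_N_files).

```python
import os
import re


def expected_files_dir(dataset_path):
    """Return the directory name I would expect for a dataset file.

    Assumes the convention dataset_N.dat -> dataset_N_files
    (hypothetical helper, not part of Galaxy).
    """
    base = os.path.basename(dataset_path)
    match = re.fullmatch(r"dataset_(\d+)\.dat", base)
    if match is None:
        raise ValueError("unexpected dataset file name: %s" % base)
    return "dataset_%s_files" % match.group(1)


def files_path_matches(dataset_path, files_path):
    """True if files_path ends in the directory expected for dataset_path."""
    actual = os.path.basename(files_path.rstrip("/"))
    return actual == expected_files_dir(dataset_path)


# The two paths Galaxy passed to my tool in the command above.
output1 = "/home/hermida/galaxy/galaxy_dist/database/files/000/dataset_31.dat"
files_path = "/home/hermida/galaxy/galaxy_dist/database/tmp/dataset_32_files"

print(expected_files_dir(output1))              # dataset_31_files
print(files_path_matches(output1, files_path))  # False
```

With the paths from my test server the check fails, which is exactly the off-by-one I am describing.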
Thanks for any help on how to fix the problem,
Leandro