Hi everyone,

 

I’m throwing this out there for some feedback and recommendations.

 

 

Objective: Make it easy for my clients to transfer large files (> 2 GB) from an HPC cluster (and its associated fast-tier storage) to Galaxy.  I enabled the FTP upload option in Galaxy, but it requires users to learn how to copy files over FTP.

 

So, I created a galaxy folder in each user's home directory on the HPC cluster that symbolically links to the FTP upload folder for Galaxy.  Users can therefore either upload files over FTP (drag and drop in Windows) or simply copy files into this folder from an SSH session on the cluster.

The problem with that strategy is that Galaxy has to be the owner of the file (similar to the ProFTPd configuration that sets the UID and GID of uploaded files to Galaxy's UID/GID).  Otherwise, Galaxy throws errors when it tries to delete the original file from the FTP upload folder.  I could have added the galaxy user to the same group as all users, but then users would have to make sure the correct permissions are set on their files so that Galaxy can read each file and delete it afterwards.  The alternative was to modify the upload.py tool to chown/chmod files as they are uploaded.  upload.py now uses sudo to execute an external script that sets ownership to the galaxy user and corrects the permissions if required (see attachment for code modification).  The galaxy user has sudo rights on this script, and the script restricts chown/chmod to the FTP folder path for security reasons.
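
To illustrate the idea behind the external script, here is a minimal Python 3 sketch. It is not the attached code: the FTP_ROOT path, the "galaxy" account name, the 0640 mode and all function names are assumptions made for the example. It resolves symlinks before comparing paths, so a link placed in the upload folder that points elsewhere is refused.

```python
import os
import pwd
import sys

# Assumed location of the Galaxy FTP upload root; adjust to the real path.
FTP_ROOT = "/srv/galaxy/ftp"


def is_inside(path, root):
    """Return True if path, with symlinks resolved, lies strictly inside root."""
    real_path = os.path.realpath(path)
    real_root = os.path.realpath(root)
    if real_path == real_root:
        return False
    return os.path.commonpath([real_path, real_root]) == real_root


def take_ownership(path, uid, gid, root=FTP_ROOT):
    """chown/chmod a regular file, but only if it lives under root."""
    if not is_inside(path, root):
        raise ValueError("refusing to touch path outside %s: %s" % (root, path))
    real_path = os.path.realpath(path)
    if not os.path.isfile(real_path):
        raise ValueError("not a regular file: %s" % real_path)
    os.chown(real_path, uid, gid)
    # Owner read/write, group read, nothing for others (assumed policy).
    os.chmod(real_path, 0o640)


def main(argv):
    """Entry point when run through sudo: main(sys.argv)."""
    if len(argv) != 2:
        sys.stderr.write("usage: %s FILE\n" % argv[0])
        return 2
    galaxy = pwd.getpwnam("galaxy")  # assumed account name
    take_ownership(argv[1], galaxy.pw_uid, galaxy.pw_gid)
    return 0
```

A real script would end with sys.exit(main(sys.argv)).  Note that this sketch does not guard against hard links or against a file being swapped between the check and the chown, so it should be read as a starting point rather than a hardened tool.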

 

I was planning to clean up the code and make it production ready by adding an option in universe_wsgi.ini for this “feature”, but I thought I would check with the Galaxy developers first.  Am I taking the wrong approach?  Is there a better alternative?

 

As an alternative, I thought about locating the handler code for dataset.type == file and possibly making it support the setuid/setgid bits on folders.  In that case, the FTP upload folder would have the setuid bit set, and Galaxy could assume the role of the user to upload that file.
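
As a rough illustration of what this alternative would depend on, the Python sketch below checks permission bits only. It assumes the galaxy user is a member of the upload folder's group, and the function names are hypothetical. On Linux, files created in a setgid folder inherit the folder's group; Galaxy would then need group read on the file, plus group write and search on the folder (with no sticky bit) to delete it.

```python
import os
import stat


def has_setgid(mode):
    """True if the setgid bit is set in a st_mode value."""
    return bool(mode & stat.S_ISGID)


def group_can_read(mode):
    """True if the group has read permission in a st_mode value."""
    return bool(mode & stat.S_IRGRP)


def group_can_delete_from(folder_mode):
    """True if group members may remove entries from a folder with this mode.

    Removing an entry needs write and search permission on the folder, and
    the sticky bit would limit deletion to the file or folder owner.
    """
    needed = stat.S_IWGRP | stat.S_IXGRP
    return (folder_mode & needed) == needed and not (folder_mode & stat.S_ISVTX)


def galaxy_could_import(file_path):
    """Hypothetical check: could a group member read and then delete the file?"""
    folder = os.path.dirname(os.path.abspath(file_path))
    folder_mode = os.stat(folder).st_mode
    file_mode = os.stat(file_path).st_mode
    return (has_setgid(folder_mode)
            and group_can_delete_from(folder_mode)
            and group_can_read(file_mode))
```

Users would still have to leave group read enabled on their files (for example, a umask of 027 or looser), which is the same burden as the shared-group option above.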

 

Your input is much appreciated.

 

Iyad Kandalaft

 

Bioinformatics Application Developer

Agriculture and Agri-Food Canada | Agriculture et Agroalimentaire Canada

KW Neatby Bldg | éd. KW Neatby 

960 Carling Ave | 960, avenue Carling

Ottawa, ON | Ottawa (ON) K1A 0C6

E-mail Address / Adresse courriel: Iyad.Kandalaft@agr.gc.ca

Telephone | Téléphone 613-759-1228

Facsimile | Télécopieur 613-759-1701

Government of Canada | Gouvernement du Canada