Depending on how you set things up, either Galaxy, Nginx, or Apache is creating a file for the incoming upload - in the above case I imagine it is this file: /home/galaxy/wkdir/galaxy/database/tmp/tmpp6j83l. This file is outside of Galaxy's data model for tools and jobs - it is just a free-floating file in some ways - so the Pulsar client doesn't know it needs to transfer it or how to modify the job description to handle it. It is a very special case for data source tools.

Stepping back a minute, the upload tool is an odd thing to run with Pulsar: Galaxy would basically need to send a file to the Pulsar server, Pulsar would run a very lightweight script that doesn't modify the file at all, and then it would send the same file back to Galaxy with some metadata. I think doing uploads this way will really slow them down - if at all possible you should try to run this tool closer to Galaxy, or at least not transfer the file (if it is on a shared directory or something).

Hacking up Galaxy to find these uploaded files would be possible, and I think I would merge a PR to implement that, but it is such a particular use case (upload probably shouldn't be running on disk that isn't shared with Galaxy) that I don't think it is worth the effort of implementing right now. Is this fair? Can you find some shared disk to use for these paths - /home/galaxy/wkdir/galaxy/database/tmp/tmpp6j83l?

-John

On Thu, Sep 24, 2015 at 9:42 PM, Raj Ayyampalayam <raj76@uga.edu> wrote:
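[Editor's note: one way to act on "run this tool closer to Galaxy" is to pin the upload tool to a local destination in job_conf.xml. This is only a sketch - the plugin and destination ids below are illustrative assumptions, not taken from Raj's gist; only the tool id "upload1" is Galaxy's actual id for the upload tool.]

```xml
<job_conf>
    <plugins>
        <!-- local runner for tools that must see Galaxy's own filesystem -->
        <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner"/>
        <!-- your existing Pulsar plugin stays as already configured -->
    </plugins>
    <destinations default="pulsar_cluster">
        <destination id="local_dest" runner="local"/>
        <!-- your existing pulsar_cluster destination stays as already configured -->
    </destinations>
    <tools>
        <!-- route uploads to the local destination so the incoming tmp
             file is always readable by the job -->
        <tool id="upload1" destination="local_dest"/>
    </tools>
</job_conf>
```

With a mapping like this, only the upload tool runs locally; everything else continues to default to the Pulsar destination.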
Hello,
I am setting up a new instance of Galaxy (based on the latest dev branch of the GitHub repo) configured to submit jobs to a remote cluster via Pulsar. I used the usegalaxy playbook to get the configuration parameters for the Galaxy job_conf.xml and Pulsar app.yml files. I was partially successful in running jobs via Pulsar: the regular tools (analysis tools, etc.) run OK on the remote cluster. However, I am having trouble getting the upload tool to run on the remote cluster.
Here is the error I see in the Galaxy error report for the job. It looks like the upload tool running on the remote machine is trying to access a file in the database/tmp area of the Galaxy server machine.
Traceback (most recent call last):
  File "/escratch4/apps/galaxy_scratch/pulsar/files/staging/46/tool_files/upload.py", line 430, in <module>
    __main__()
  File "/escratch4/apps/galaxy_scratch/pulsar/files/staging/46/tool_files/upload.py", line 405, in __main__
    registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] )
  File "/panfs/pstor.storage/home/qbcglab/galaxy_run/galaxy/lib/galaxy/datatypes/registry.py", line 94, in load_datatypes
    tree = galaxy.util.parse_xml( config )
  File "/panfs/pstor.storage/home/qbcglab/galaxy_run/galaxy/lib/galaxy/util/__init__.py", line 178, in parse_xml
    root = tree.parse( fname, parser=ElementTree.XMLParser( target=DoctypeSafeCallbackTarget() ) )
  File "/usr/local/python/2.7.8/lib/python2.7/xml/etree/ElementTree.py", line 647, in parse
    source = open(source, "rb")
IOError: [Errno 2] No such file or directory: '/home/galaxy/wkdir/galaxy/database/tmp/tmpp6j83l'
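[Editor's note: the missing file in the traceback above lives under database/tmp, which is where Galaxy's new_file_path setting points by default. If the cluster and the Galaxy server mount a common filesystem, one sketch of the "shared disk" workaround is to move that setting onto the shared mount - the path below is hypothetical, not from Raj's setup.]

```ini
# galaxy.ini (Galaxy config of that era)
[app:main]
# Temporary files such as incoming uploads are created here; placing it
# on a filesystem that both Galaxy and the cluster can read means the
# remote job can open the file without any Pulsar transfer.
new_file_path = /panfs/pstor.storage/shared/galaxy/tmp
```

This only helps if the path really is visible under the same name on both sides; otherwise the transfer problem remains.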
I've created a Gist with the contents of job_conf.xml, app.yml, the file actions file, and the error output at https://gist.github.com/raj76/7be86f6a56714deef050 .
Any suggestions on how to debug this issue?
Thanks, -Raj