Importing file via FTP can silently fail, [Errno ftp error] 550
Hi all,

Using either the legacy upload tool, or the new upload widget, the following FTP URL *appears* to load into Galaxy correctly, giving a green history entry. However, on closer inspection it is an empty file with this peek text:

Unable to fetch ftp://ftp.sanger.ac.uk/pub/pathogens/Bursaphelenchus/xylophilus/Assembly-v1.2/BurXv1.2.supercontigs.fa.gz [Errno ftp error] 550 pathogens: No such file or directory

Clearly this should have been treated as a failure (red history entry). Should I file this as a bug on Trello?

It seems there is a problem with the Sanger folder permissions which trips up Galaxy and curl (but does not bother Firefox and wget):

$ curl -O ftp://ftp.sanger.ac.uk/pub/pathogens/Bursaphelenchus/xylophilus/Assembly-v1.2/BurXv1.2.supercontigs.fa.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (9) Server denied you to change to the given directory

Using curl -v for verbose mode, it tries to walk the path (as per the https://www.ietf.org/rfc/rfc1738.txt standard) and fails with the command "CWD pathogens", matching the Galaxy error.

The alternative curl FTP methods work:

$ curl -O ftp://ftp.sanger.ac.uk/pub/pathogens/Bursaphelenchus/xylophilus/Assembly-v1.2/BurXv1.2.supercontigs.fa.gz --ftp-method nocwd
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 21.8M  100 21.8M    0     0  7656k      0  0:00:02  0:00:02 --:--:-- 7654k

and:

$ curl -O ftp://ftp.sanger.ac.uk/pub/pathogens/Bursaphelenchus/xylophilus/Assembly-v1.2/BurXv1.2.supercontigs.fa.gz --ftp-method singlecwd
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 21.8M  100 21.8M    0     0  7767k      0  0:00:02  0:00:02 --:--:-- 7766k

You can also trick curl by replacing (some of the) slashes with the escaped version %2f, which has the result of combining the CWD commands, e.g.

$ curl -O ftp://ftp.sanger.ac.uk/pub%2fpathogens%2fBursaphelenchus%2fxylophilus%2fAssembly-v1.2/BurXv1.2.supercontigs.fa.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 21.8M  100 21.8M    0     0  6827k      0  0:00:03  0:00:03 --:--:-- 6829k

This encoding trick does not seem to work within Galaxy.

I plan to report this to Sanger tomorrow. Should they fix this quickly, additional test cases might be useful (does Galaxy have an FTP server which might be used for testing data importing like this?).

Is this a common issue? Should Galaxy automatically use an FTP CWD workaround in this situation?

Regards,

Peter
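To illustrate why the three --ftp-method settings and the %2f trick behave differently, here is a minimal Python 3 sketch of how a client might map an FTP URL onto CWD and RETR commands. The function name ftp_commands is hypothetical, and this is neither curl's nor Galaxy's actual code. It only mirrors the behaviour described above: split the path on literal slashes first, then decode each segment, which is why an escaped %2f stays inside a single CWD.

```python
from urllib.parse import unquote, urlsplit


def ftp_commands(url, method="multicwd"):
    """List the CWD/RETR commands a client might send for an FTP URL.

    Hypothetical helper for illustration only; the method names follow
    curl's --ftp-method option.
    """
    path = urlsplit(url).path
    # Drop the single slash separating host and path; the remainder is
    # taken relative to the FTP login directory.
    raw_path = path[1:] if path.startswith("/") else path
    # Split on literal slashes BEFORE decoding, so an escaped %2f does
    # not start a new path segment.
    segments = [unquote(part) for part in raw_path.split("/")]
    directories, filename = segments[:-1], segments[-1]

    if method == "multicwd":
        # One CWD per directory segment, then fetch the file.
        return ["CWD " + d for d in directories] + ["RETR " + filename]
    if method == "singlecwd":
        # A single CWD with the whole directory path.
        cwd = ["CWD " + "/".join(directories)] if directories else []
        return cwd + ["RETR " + filename]
    if method == "nocwd":
        # No CWD at all; the full path is given to RETR.
        return ["RETR " + "/".join(segments)]
    raise ValueError("unknown method: %r" % method)
```

For the original URL the default method starts with "CWD pub" followed by "CWD pathogens", and it is that second command which the server rejects with 550. For the %2f form the same method sends one CWD for the whole directory path, followed by the RETR for the file.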
On Wed, Feb 25, 2015 at 5:33 PM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
The reply from Sanger was that "pathogens" is actually an alias for "project/pathogens" and as a workaround this longer URL can be used with curl or Galaxy:

ftp://ftp.sanger.ac.uk/pub/project/pathogens/Bursaphelenchus/xylophilus/Assembly-v1.2/BurXv1.2.supercontigs.fa.gz

The problem with Galaxy failing to detect the FTP error with the shorter URL remains:

ftp://ftp.sanger.ac.uk/pub/pathogens/Bursaphelenchus/xylophilus/Assembly-v1.2/BurXv1.2.supercontigs.fa.gz

Regards,

Peter
Yeah - that is unfortunate - I agree completely that the resulting datasets should be red. I have created a Trello card here: https://trello.com/c/A6LrdjUU

-John

On Thu, Feb 26, 2015 at 5:19 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
On Thu, Feb 26, 2015 at 2:24 PM, John Chilton <jmchilton@gmail.com> wrote:
Thanks John,

Does the new "Download from URL or upload files from disk" interface still end up calling the old Python script for the upload tool? i.e.

https://github.com/galaxyproject/galaxy/blob/dev/tools/data_source/upload.py

Peter
Yes - I believe it does. The top level layer (controller) is a little different because it is coming through the API - but the next 10 or so layers related to uploading are the same with the new method :).

-John

On Thu, Feb 26, 2015 at 9:57 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
On 26.02.2015 15:57 Peter Cock wrote:
Does the new "Download from URL or upload files from disk" interface still end up calling the old Python script for the upload tool? i.e.
https://github.com/galaxyproject/galaxy/blob/dev/tools/data_source/upload.py

Yes, I think so.

Nicola
I just commented on the Trello issue - as John implied with the issue title, this is more general than a rare FTP failure, e.g.

http://www.galaxyproject.org/test_does_not_exist.txt

Using this in Galaxy gives the 404 error page as a "green" HTML file. It should be "red", in the error state (while showing the user-facing 404 message as the file contents is reasonable).

Peter
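The behaviour being asked for could look roughly like the following Python 3 sketch, in which any fetch failure (an FTP 550, an HTTP 404, and so on) is reported as an error state rather than being written into an apparently successful dataset. The name fetch_to_file and the "ok"/"error" return values are hypothetical, chosen for illustration; this is not Galaxy's upload code, and for simplicity it does not keep the 404 page body as the file contents.

```python
import shutil
import urllib.request


def fetch_to_file(url, dest_path):
    """Download url to dest_path and report an explicit state.

    Hypothetical helper: returns ("ok", None) on success, or
    ("error", message) if the fetch failed for any reason.
    """
    try:
        # urlopen runs first, so dest_path is not created when the
        # request itself fails.
        with urllib.request.urlopen(url) as response, \
                open(dest_path, "wb") as out:
            shutil.copyfileobj(response, out)
    except OSError as err:
        # urllib.error.URLError and HTTPError are OSError subclasses,
        # so FTP errors and HTTP 404 responses both end up here.
        return "error", "Unable to fetch %s: %s" % (url, err)
    return "ok", None
```

A caller would then mark the history entry red whenever the first element of the result is "error", and could still show the message text to the user.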
participants (3)
- John Chilton
- Nicola Soranzo
- Peter Cock