Hello

 

Like Matthias (http://gmod.827538.n3.nabble.com/Fatal-Error-while-uploading-a-File-with-special-Characters-td4051157.html), I can’t upload a file whose name contains special characters.

 

My test file is: Frédéric.txt

 

The database itself seems to be OK; I can see that the row is inserted correctly into “history_dataset_association”:

id            history_id           dataset_id          create_time      update_time                 copied_from_history_dataset_association_id hid         name    info

328         1             330         2017-03-01 08:13:56       2017-03-01 08:14:05       NULL     293         Frédéric.txt                 uploaded txt file              1 line

 

However, my history stays broken until I change Frédéric.txt to Frederic.txt in the name column.

 

Errors are the same as described by Matthias.

 

galaxy.web.framework.decorators ERROR 2017-03-01 09:14:18,321 Uncaught exception in exposed API method:
Traceback (most recent call last):
  File "/softs/bioinfo/galaxy-prod/lib/galaxy/web/framework/decorators.py", line 284, in decorator
    rval = _format_return_as_json( rval, jsonp_callback, pretty=trans.debug )
  File "/softs/bioinfo/galaxy-prod/lib/galaxy/web/framework/decorators.py", line 317, in _format_return_as_json
    json = safe_dumps( rval, **dumps_kwargs )
  File "/softs/bioinfo/galaxy-prod/lib/galaxy/util/json.py", line 67, in safe_dumps
    dumped = json.dumps( obj, allow_nan=False, **kwargs )
  File "/usr/local/lib/python2.7/json/__init__.py", line 251, in dumps
    sort_keys=sort_keys, **kw).encode(obj)
  File "/usr/local/lib/python2.7/json/encoder.py", line 209, in encode
    chunks = list(chunks)
  File "/usr/local/lib/python2.7/json/encoder.py", line 431, in _iterencode
    for chunk in _iterencode_list(o, _current_indent_level):
  File "/usr/local/lib/python2.7/json/encoder.py", line 332, in _iterencode_list
    for chunk in chunks:
  File "/usr/local/lib/python2.7/json/encoder.py", line 390, in _iterencode_dict
    yield _encoder(value)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 2: invalid continuation byte
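
For reference, the byte 0xe9 in the last line is how "é" is encoded in Latin-1 (or a similar single-byte encoding), and position 2 matches the third character of Frédéric.txt. The sketch below is standalone Python 3, not Galaxy code. It reproduces the same decode failure under the assumption that the name reaches the JSON encoder as Latin-1 bytes rather than UTF-8. The helper name try_decode_utf8 is hypothetical.

```python
# Assumption: the file name arrives as Latin-1 bytes, not UTF-8.
# "\u00e9" is "é"; escapes are used so the snippet does not depend
# on the encoding of this source file.
name = "Fr\u00e9d\u00e9ric.txt"

latin1_bytes = name.encode("latin-1")  # "é" becomes the single byte 0xE9
utf8_bytes = name.encode("utf-8")      # "é" becomes the two bytes 0xC3 0xA9


def try_decode_utf8(raw):
    """Hypothetical helper: return (text, None) on success, (None, error) on failure."""
    try:
        return raw.decode("utf-8"), None
    except UnicodeDecodeError as err:
        return None, err


text, err = try_decode_utf8(latin1_bytes)
if err is not None:
    # err.start is the offset of the offending byte, err.reason the explanation.
    print("decode failed at byte", err.start, "-", err.reason)

# The same name encoded as UTF-8 decodes without error.
print(try_decode_utf8(utf8_bytes)[0] == name)
```

If this assumption holds, the stored name is not valid UTF-8 by the time it is serialized, which would point at the encoding used somewhere between the upload and the database read rather than at the JSON library itself.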

 

 

I have read a lot of threads about this kind of error (mostly on Stack Overflow).

Is it related to Galaxy, or to my Python installation?

 

On a production server, this means that I will have to update the database manually each time a user tries to upload such a file (find their id, find the broken history, and correct the file name)!
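
The manual rename described above (Frédéric.txt to Frederic.txt) can be sketched as a small Python function. This is only an illustration of the transliteration step, not a fix for the underlying encoding problem and not part of Galaxy. The function name ascii_fallback is hypothetical.

```python
import unicodedata


def ascii_fallback(name):
    """Hypothetical helper: strip accents so the name contains only ASCII."""
    # NFKD splits an accented letter into its base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Encoding to ASCII with "ignore" drops the combining marks
    # (and any other character that has no ASCII equivalent).
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(ascii_fallback("Fr\u00e9d\u00e9ric.txt"))
```

Characters that have no decomposition into an ASCII base letter are dropped entirely, so this only approximates what a person would type by hand.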

 

I’m running Galaxy 17.01 with a MySQL (5.7.11-4, UTF-8) database.

I’ve modified lib/galaxy/datatypes/binary.py (for XLS datatypes) and lib/galaxy/jobs/runners/__init__.py (to truncate the job name for PBS-PRO).

 

# On branch release_17.01
# Your branch is ahead of 'origin/release_17.01' by 3 commits.
#
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   lib/galaxy/datatypes/binary.py
#       modified:   lib/galaxy/jobs/runners/__init__.py

 

Thanks in advance!

 

Fred

---

Frederic Sapet

Bioinformatics Research Engineer

BIOGEMMA - Upstream Genomics Group

Centre de Recherche de Chappes

CS 90126

63720 CHAPPES

FRANCE

Tel : +33 (0)4 73 67 88 54

Fax : +33 (0)4 73 67 88 99

E-mail : frederic.sapet@biogemma.com