Perfect, Galaxy will also need to add the function that was deleted by the merge, in *galaxy/datatypes/sequence.py:206*:

    def get_split_commands_sequential(is_compressed, input_name, output_name, start_sequence, sequence_count):
        """
        Does a brain-dead sequential scan & extract of certain sequences
        >>> Sequence.get_split_commands_sequential(True, './input.gz', './output.gz', start_sequence=0, sequence_count=10)
        ['zcat "./input.gz" | ( tail -n +1 2> /dev/null) | head -40 | gzip -c > "./output.gz"']
        >>> Sequence.get_split_commands_sequential(False, './input.fastq', './output.fastq', start_sequence=10, sequence_count=10)
        ['tail -n +41 "./input.fastq" 2> /dev/null | head -40 > "./output.fastq"']
        """
        start_line = start_sequence * 4
        line_count = sequence_count * 4
        # TODO: verify that tail can handle 64-bit numbers
        if is_compressed:
            cmd = 'zcat "%s" | ( tail -n +%s 2> /dev/null) | head -%s | gzip -c' % (input_name, start_line+1, line_count)
        else:
            cmd = 'tail -n +%s "%s" 2> /dev/null | head -%s' % (start_line+1, input_name, line_count)
        cmd += ' > "%s"' % output_name
        return [cmd]
    get_split_commands_sequential = staticmethod(get_split_commands_sequential)

Best regards

On 25 February 2015 at 15:38, Peter Cock <p.j.a.cock@googlemail.com> wrote:
On Wed, Feb 25, 2015 at 2:07 PM, Roberto Alonso CIPF <ralonso@cipf.es> wrote:
Hello again :),
I have found the problem. The code that merges the files is this:

    galaxy/datatypes/tabular.py:484:
    cmd = 'egrep -v "^@" %s >> %s' % ( ' '.join(split_files[1:]), output_file )

When egrep is given more than one file, it prefixes each output line with the file name, so the file names end up inside the SAM file. Just adding "h" is enough, so it will be like this:

    galaxy/datatypes/tabular.py:484:
    cmd = 'egrep -hv "^@" %s >> %s' % ( ' '.join(split_files[1:]), output_file )
Thanks all for your help, best regards
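To make the failure mode concrete, here is a minimal sketch that emulates in plain Python what egrep -v does with and without -h. The helper name grep_invert, the file names, and the SAM records are made up for illustration; this is not Galaxy code, and it only mimics the prefixing rule described in the egrep man page.

```python
import re


def grep_invert(pattern, files, no_filename=False):
    """Emulate `egrep -v PATTERN FILE...` on in-memory files (hypothetical helper).

    `files` maps a file name to its list of lines.
    """
    regex = re.compile(pattern)
    # grep adds a "name:" prefix only when it searches more than one file
    # and -h (--no-filename) was not given.
    with_prefix = len(files) > 1 and not no_filename
    out = []
    for name, lines in files.items():
        for line in lines:
            if regex.search(line):
                continue  # -v: drop the matching (header) lines
            out.append("%s:%s" % (name, line) if with_prefix else line)
    return out


# Two made-up split parts, each carrying its own copy of the header.
parts = {
    "part1.sam": ["@HD\tVN:1.0", "read3\t0\tchr1\t100"],
    "part2.sam": ["@HD\tVN:1.0", "read4\t16\tchr1\t200"],
}

broken = grep_invert(r"^@", parts)                   # like egrep -v
fixed = grep_invert(r"^@", parts, no_filename=True)  # like egrep -hv
```

In this sketch, broken holds lines that start with "part1.sam:" and "part2.sam:", which is the corruption seen in the merged SAM file, while fixed holds the bare alignment lines.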
Well done :)
It looks like the SAM merge needs fixing then,
$ man egrep
...
       -h, --no-filename
              Suppress the prefixing of file names on output. This is the
              default when there is only one file (or only standard input)
              to search.
I filed a pull request adding the -h option to egrep, crediting you: https://github.com/galaxyproject/galaxy/pull/4
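For comparison, the same merge can be sketched without shelling out to egrep at all, which sidesteps the file-name prefix entirely. This is only an illustration under assumptions: merge_sam is a hypothetical name, not the function in galaxy/datatypes/tabular.py, and it assumes the first part is copied whole (header included) before the remaining parts are appended, with every part ending in a newline.

```python
import os
import shutil
import tempfile


def merge_sam(split_files, output_file):
    """Concatenate SAM parts, keeping only the first part's header (hypothetical helper)."""
    # Assumption: the first part goes in verbatim, header and all.
    shutil.copyfile(split_files[0], output_file)
    with open(output_file, "a") as out:
        for path in split_files[1:]:
            with open(path) as part:
                for line in part:
                    # SAM header lines start with "@"; skip them in later parts.
                    if not line.startswith("@"):
                        out.write(line)


# Small demonstration on three made-up parts in a temporary directory.
workdir = tempfile.mkdtemp()
paths = []
for i, read in enumerate(["read1", "read2", "read3"]):
    path = os.path.join(workdir, "part%d.sam" % i)
    with open(path, "w") as handle:
        handle.write("@HD\tVN:1.0\n%s\t0\tchr1\t%d\n" % (read, 100 * (i + 1)))
    paths.append(path)

merged_path = os.path.join(workdir, "merged.sam")
merge_sam(paths, merged_path)
with open(merged_path) as handle:
    merged = handle.read()
shutil.rmtree(workdir)
```

The merged text then contains a single @HD line followed by the three alignment lines, with no file names mixed in.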
--
Roberto Alonso
Functional Genomics Unit
Bioinformatics and Genomics Department
Prince Felipe Research Center (CIPF)
C./Eduardo Primo Yúfera (Científic), nº 3 (junto Oceanografico)
46012 Valencia, Spain
Tel: +34 963289680 Ext. 1021
Fax: +34 963289574
E-Mail: ralonso@cipf.es
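As a footnote to get_split_commands_sequential quoted at the top of this message: its line arithmetic (four lines per FASTQ record, tail -n +N being 1-based) can be checked with a small pure-Python sketch. The name take_sequences and the sample data are hypothetical; this only mirrors the tail/head pipeline and is not part of Galaxy.

```python
from itertools import islice


def take_sequences(lines, start_sequence, sequence_count, lines_per_record=4):
    """Return the lines that `tail -n +(start+1) | head -count` would keep (hypothetical helper)."""
    start = start_sequence * lines_per_record        # 0-based index of the first kept line
    stop = start + sequence_count * lines_per_record
    return list(islice(lines, start, stop))


# 100 numbered lines stand in for a 25-record FASTQ file.
fastq = ["line%d" % i for i in range(1, 101)]
chunk = take_sequences(fastq, start_sequence=10, sequence_count=10)
```

With start_sequence=10 and sequence_count=10, chunk runs from line41 to line80, matching the 'tail -n +41 ... | head -40' command in the second doctest.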