Re: [galaxy-dev] problems splitting

25 Feb 2015


Hi Roberto,
I'm happy you solved your issue, thanks for sharing the solution!
I'd suggest you open a pull request with the fixes at
https://github.com/galaxyproject/galaxy

Cheers,
Nicola

On 25.02.2015 15:07, Roberto Alonso CIPF wrote:
...
Hello again :),
I have found the problem. The code that merges the files is this:
galaxy/datatypes/tabular.py:484: cmd = 'egrep -v "^@" %s >> %s' % ( ' '.join(split_files[1:]), output_file )
...
When egrep is given more than one input file, it prefixes every matching
line with the file name, so the file name ends up inside the SAM file.
Just adding "h" is enough, so it will look like this:
galaxy/datatypes/tabular.py:484: cmd = 'egrep -hv "^@" %s >> %s' % ( ' '.join(split_files[1:]), output_file )
Thanks all for your help, best regards
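
The fix above also explains why splitting into 2 parts worked while 3 did not: with 2 parts, split_files[1:] holds a single file, and grep only prefixes file names when it is given more than one file. The "-h" option suppresses that prefix, whereas "-H" would force it. As a rough illustration of the same merge without the shell, here is a minimal sketch in plain Python. The function name merge_sam_parts and the sample read names are made up for this example, and unlike the Galaxy line quoted above it also writes the first part itself rather than assuming it is already in the output file.

```python
import os
import tempfile


def merge_sam_parts(split_files, output_file):
    """Write the first part whole, then only the alignment lines of the rest."""
    with open(output_file, "w") as out:
        for index, path in enumerate(split_files):
            with open(path) as part:
                for line in part:
                    # SAM header lines start with "@"; keep them from the first part only.
                    if index > 0 and line.startswith("@"):
                        continue
                    out.write(line)


# Tiny demonstration with three made-up parts (hypothetical read names).
workdir = tempfile.mkdtemp()
parts = []
for i in range(3):
    path = os.path.join(workdir, "part_%d.sam" % i)
    with open(path, "w") as handle:
        handle.write("@HD\tVN:1.0\n")
        handle.write("read%d\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t####\n" % i)
    parts.append(path)

merged_path = os.path.join(workdir, "merged.sam")
merge_sam_parts(parts, merged_path)
with open(merged_path) as handle:
    merged_lines = handle.read().splitlines()
```

Because each line is copied as read, no file name can leak into the alignment records, whatever the number of parts.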
On 25 February 2015 at 12:31, Roberto Alonso CIPF wrote:
...
OK, I think I understand the line:
beginning merge: bwa mem /home/ralonso/BiB/Galaxy/data/Cclementina_v1.0_scaffolds.fa /home/ralonso/galaxy-dist/database/files/000/dataset_8.dat > /home/ralonso/galaxy-dist/database/files/000/dataset_94.dat 2> /dev/null
...
It refers to the original command, so everything is fine with this
line. The other problem still remains.
Regards, and sorry for the confusion
On 25 February 2015 at 11:40, Roberto Alonso CIPF wrote:
...
Hello again,
there is something I consider important. When I look at the log, I see
this output:
galaxy.jobs.runners.tasks DEBUG 2015-02-25 11:33:30,989 execution finished - beginning merge: bwa mem /home/ralonso/BiB/Galaxy/data/Cclementina_v1.0_scaffolds.fa /home/ralonso/galaxy-dist/database/files/000/dataset_8.dat > /home/ralonso/galaxy-dist/database/files/000/dataset_94.dat 2> /dev/null
...
I think the merge should be done with samtools. I don't know how this is
programmed in Galaxy, but I didn't indicate the path to samtools
anywhere. Could the problem be related to this?
Thanks a lot,
Regards
On 25 February 2015 at 11:13, Roberto Alonso CIPF wrote:
...
Hello,
I just switched to the CDATA format, but the problem still remains.
When I split into 2 parts there is no problem, but when I split into 3,
the problem described before appears. Here is the link to the SAM/BAM
file:
https://dl.dropboxusercontent.com/u/1669701/ejemplo_split.bam
Best regards
On 24 February 2015 at 17:49, Peter Cock wrote:
...
On Tue, Feb 24, 2015 at 4:43 PM, Roberto Alonso CIPF wrote:
...
Hello again,
first of all, thanks for your help, it is being very useful.
What I have done up to now is to copy this method to the class Sequence:
    def get_split_commands_sequential(is_compressed, input_name, output_name,
                                      start_sequence, sequence_count):
        ...
        return [cmd]
    ...
    get_split_commands_sequential = staticmethod(get_split_commands_sequential)
...
This is something that you suggested.
Good.
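
For readers following along, the same static method can be declared with the decorator form. The body is elided in the message, so the line-slicing logic below is purely an assumption for illustration (FASTQ-style input with 4 lines per record, cut out with tail and head) and should not be read as Galaxy's actual implementation.

```python
class Sequence(object):
    @staticmethod
    def get_split_commands_sequential(is_compressed, input_name, output_name,
                                      start_sequence, sequence_count):
        # Assumption for this sketch: each record spans 4 lines (FASTQ).
        start_line = start_sequence * 4
        line_count = sequence_count * 4
        if is_compressed:
            # Decompress, slice the wanted lines, and recompress.
            cmd = 'zcat "%s" | tail -n +%d | head -n %d | gzip -c' % (
                input_name, start_line + 1, line_count)
        else:
            cmd = 'tail -n +%d "%s" | head -n %d' % (
                start_line + 1, input_name, line_count)
        cmd += ' > "%s"' % output_name
        return [cmd]


# Hypothetical file names, only to show the shape of the returned command.
commands = Sequence.get_split_commands_sequential(
    False, "in.fastq", "out.fastq", 2, 3)
```

Returning a list of shell command strings matches the `return [cmd]` visible in the quoted snippet.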
...
When I run the tool with this configuration:
map with bwa
> split_mode="number_of_parts">
bwa mem /home/ralonso/BiB/Galaxy/data/Cclementina_v1.0_scaffolds.fa $input > $output 2>/dev/null
...
bwa
...
One minor improvement would be to escape the ">" as "&gt;" in
your XML, or use the CDATA approach documented here:
https://wiki.galaxyproject.org/Tools/BestPractices
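
To make the two options concrete, here is a small sketch using only the Python standard library. It checks that the escaped form and the CDATA form of a command parse back to the same text. The reference file name ref.fa is a placeholder, and the bare <command> element is a simplification of a real tool file.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Placeholder command; ref.fa stands in for the real reference path.
command = "bwa mem ref.fa $input > $output 2>/dev/null"

# Option 1: escape the special characters (">" becomes "&gt;").
escaped_xml = "<command>%s</command>" % escape(command)

# Option 2: wrap the raw command in a CDATA section.
cdata_xml = "<command><![CDATA[%s]]></command>" % command

# Both forms yield the same command text once parsed.
from_escaped = ET.fromstring(escaped_xml).text
from_cdata = ET.fromstring(cdata_xml).text
```

CDATA tends to be easier to read when a command contains several redirections, since the shell syntax stays untouched.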
...
...
Everything finishes OK, but when I check the SAM file, I see that the
alignments are prefixed with the path of the file, e.g. in
example_split.sam:
/home/ralonso/galaxy-dist/database/job_working_directory/000/90/task_2/dataset_91.dat:SRR098409.1113446 4 * 0 0 * * 0 0 TCTGGGTGAGGGAGTGGGGAGTGGGTTTTTGAGGGTGTGTGAGGATGTGTAAGTGGATGGAAGTAGATTGAATGTT ############################################################################ AS:i:0 XS:i:0
...
Do you know what may be going on?
If I don't split the file, everything works correctly.
This sounds to me like there may be a problem with SAM merging?
Could you share the entire example_split.sam file (e.g. as a gist
on GitHub, or via Dropbox)?
Peter
--
Roberto Alonso
Functional Genomics Unit
Bioinformatics and Genomics Department
Prince Felipe Research Center (CIPF)
C./Eduardo Primo Yúfera (Científic), nº 3
(junto Oceanografico)
46012 Valencia, Spain
Tel: +34 963289680 Ext. 1021
Fax: +34 963289574
E-Mail: ralonso@cipf.es

Nicola Soranzo