fasta_to_tabular.py slowness

18 Jul 2009

Hi Galaxy Team,

I've found that fasta_to_tabular.py is very slow with big sequences, e.g. ~4 minutes for a single 5 MB sequence.

The patch below cuts the running time from minutes to seconds for such a sequence. Mind you, this is my first line of Python, so there may be a smarter way.

Best regards,
Rasmus Ory Nielsen

--- fasta_to_tabular.py.orig	2009-07-18 16:25:50.896487000 +0200
+++ fasta_to_tabular.py	2009-07-18 17:22:49.544611000 +0200
@@ -34,7 +34,7 @@
             fasta_seq = ''
         else:
             if line:
-                fasta_seq = "%s%s" % ( fasta_seq, line )
+                fasta_seq += line
 
     if fasta_seq:
         out.write( "%s\t%s\n" %( fasta_title[ 1:keep_first ], fasta_seq ) )
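
For context on why the one-line change helps: the original `"%s%s" % ( fasta_seq, line )` builds a brand-new string from the whole accumulated sequence on every line, so the total copying grows roughly quadratically with sequence length. `+=` can often be done in place by CPython, though that is an implementation detail rather than a language guarantee. A common alternative, not part of the patch above, is to collect the lines in a list and join them once per record. The sketch below is a simplified illustration under that assumption; the function name `fasta_to_tabular_rows` is hypothetical, and it leaves out the `keep_first` title truncation of the real script.

```python
def fasta_to_tabular_rows(lines):
    """Yield (title, sequence) pairs from an iterable of FASTA lines.

    Hypothetical helper for illustration; not the actual Galaxy code.
    """
    title = None
    chunks = []  # sequence lines of the current record, joined once at the end
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if it had any sequence.
            if title is not None and chunks:
                yield title, "".join(chunks)
            title = line[1:]
            chunks = []
        elif line and title is not None:
            # Appending to a list avoids re-copying the sequence built so far.
            chunks.append(line)
    # Emit the last record, mirroring the final "if fasta_seq:" in the script.
    if title is not None and chunks:
        yield title, "".join(chunks)


example = [">seq1 demo\n", "ACGT\n", "TTGA\n", "\n", ">seq2\n", "GG\n"]
rows = list(fasta_to_tabular_rows(example))
for row_title, row_seq in rows:
    print("%s\t%s" % (row_title, row_seq))
```

With this approach each sequence line is copied once into the final string, regardless of which Python implementation runs the script.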

    

