Galaxy Colleagues,

I don't know who is maintaining the Galaxy wiki page at http://bitbucket.org/galaxy/galaxy-central/wiki/NGSLocalSetup but I noticed that the Python script under the Megablast instructions has an error: the "defline" operation after the "line.startswith" should be moved *after* the if length > 0 statement, otherwise, the defline is reset incorrectly before the previous sequence is written out. This results in a frameshift in the FASTA header line identifiers (i.e. the current sequence gets the next sequence identifier).

I've commented out the erroneous defline below and added the right one:

    import sys

    length = 0
    defline = ''
    seq = []

    for line in sys.stdin :
        line = line.rstrip( '\r\n' )
        if line.startswith( '>' ):
            # defline = line.split( "|" )[1] # defline should NOT be here
            if length > 0:
                print ">%s_%s" % ( defline, length )
                print "\n".join( seq )
                length = 0
                seq = []
            defline = line.split( "|" )[1] # defline should be here
        else:
            seq.append( line )
            length += len( line )

    print ">%s_%s" % ( defline, length )
    print "\n".join( seq )

While on the topic of this page, perhaps the software versions need to be revisited. Megablast has been superseded already by Blast+. Perhaps new releases of Galaxy should update this?

BTW, when is the new Galaxy release (cloud man AMI too...) coming out? I heard rumors that it was due this week.

Cheers
Richard Bruskiewich

--
*Richard Bruskiewich, PhD*
Senior Scientist, Computational and Systems Biology
Applications Team for Computational Genomics
T.T. Chang Genetic Resources Center
International Rice Research Institute
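For anyone running this under Python 3, the ordering described above (write out the previous record first, then take the new identifier) might look roughly like the sketch below. The function name `relabel_fasta` and the two-record sample input are illustrative assumptions, not part of the wiki script, and unlike the original it skips the final write when no sequence was read.

```python
import io


def relabel_fasta(lines):
    """Yield FASTA lines with headers rewritten to >ID_LENGTH.

    ID is the second "|"-separated field of the original header;
    LENGTH is the total sequence length of that record.
    """
    defline = ""
    length = 0
    seq = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # Flush the previous record first, while defline still
            # holds that record's identifier.
            if length > 0:
                yield ">%s_%s" % (defline, length)
                yield from seq
                length = 0
                seq = []
            # Only now switch to the new record's identifier.
            defline = line.split("|")[1]
        else:
            seq.append(line)
            length += len(line)
    # Flush the final record, if any sequence was read.
    if length > 0:
        yield ">%s_%s" % (defline, length)
        yield from seq


# Hypothetical two-record input with "gi|ID|..." style headers.
sample = io.StringIO(">gi|111|ref|A\nACGT\nAC\n>gi|222|ref|B\nGGG\n")
result = list(relabel_fasta(sample))
print("\n".join(result))
```

With the assignment placed before the flush instead, the first record in this sample would be written with the identifier 222 rather than 111, which is the shift in header identifiers reported above.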
Hi Richard,

Any registered Bitbucket user can edit the Galaxy wiki. This way the documentation quality should increase as more people can put their experiences/fixes/comments on it. So I would say, feel free to edit. The only real downside is that it is a Bitbucket wiki, which is rather limited.

Cheers,
Jelle

On Fri, Dec 10, 2010 at 4:04 AM, Richard Bruskiewich <r.bruskiewich@irri.org> wrote:
Galaxy Colleagues,
I don't know who is maintaining the Galaxy wiki page ...
_______________________________________________ galaxy-user mailing list galaxy-user@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-user
Richard:

This beauty was mine. Thanks for pointing this out. It is now fixed.

Thanks,

anton

On Dec 9, 2010, at 10:04 PM, Richard Bruskiewich wrote:
Galaxy Colleagues,
I don't know who is maintaining the Galaxy wiki page ...
_______________________________________________ galaxy-dev mailing list galaxy-dev@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-dev
Anton Nekrutenko
http://nekrut.bx.psu.edu
http://usegalaxy.org
On Fri, Dec 10, 2010 at 3:04 AM, Richard Bruskiewich <r.bruskiewich@irri.org> wrote:
Galaxy Colleagues,
I don't know who is maintaining the Galaxy wiki page ...
While on the topic of this page, perhaps the software versions need to be revisited. Megablast has been superseded already by Blast+. Perhaps new releases of Galaxy should update this?
Hi Richard,

Galaxy already has wrappers for the main BLAST+ tools, including blastn, which covers megablast. However, they are not currently available on the public Galaxy instance, in part due to load concerns. You can enable them locally if you are running your own Galaxy; that is what we are doing.

I did also offer to update the old Megablast tool to use blastn from BLAST+ instead of blastall from legacy BLAST, but as I recall the Galaxy team were cautious about this since it could break the reproducibility of existing workflows.

Peter
participants (4)
- Anton Nekrutenko
- Jelle Scholtalbers
- Peter
- Richard Bruskiewich