Python error when running Bowtie for Illumina
Hi all,

I'm new to next-gen sequencing, so please be gentle. I've just received a pair of Illumina FASTQ files from the sequencing facility and intend to map them to the hg19 reference genome. I first used the FASTQ Groomer utility to convert the reads into Sanger reads. However, when running Bowtie for Illumina on the resulting dataset under default settings, I received the following error:

An error occurred running this job: *Error aligning sequence. requested number of bytes is more than a Python string can hold*

Can someone help point out my mistake? My history is accessible at http://main.g2.bx.psu.edu/u/wengkhong_lim/h/chip-seq-pilot-batch

Appreciate the help!

Weng Khong, LIM
Department of Genetics
University of Cambridge
E-mail: wkl24@cam.ac.uk
Tel: +447503225832
I have had the same problem, and am also a newbie to NGS with Illumina. The work-around I found was to download bowtie directly from the website, run it on your local computer, then upload the resulting SAM file for subsequent Galaxy-driven analysis. Not optimal, I know, but if you are in a hurry...

Best,
Dan

On Tue, Apr 6, 2010 at 1:20 PM, Weng Khong Lim <wengkhong@gmail.com> wrote:
> An error occurred running this job: *Error aligning sequence. requested number of bytes is more than a Python string can hold*
_______________________________________________ galaxy-user mailing list galaxy-user@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-user
-- Dan Webster Ph.D. Student - Cancer Biology Laboratory of Paul Khavari CCSR BLDG, Rm 2150 269 Campus Drive Stanford, CA 94305 DanWebster@stanford.edu
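For anyone trying the local work-around, here is a rough sketch of how the Bowtie command line could be assembled from Python. It assumes single-end reads already groomed to Sanger (Phred+33) qualities, a prebuilt Bowtie 1 index whose basename is `hg19`, and a `bowtie` binary on the PATH. The function name and the file names are made up for illustration; adjust them to your own data.

```python
import shlex


def build_bowtie_command(index_basename, reads_fastq, sam_out, threads=1):
    """Build an argument list for a single-end Bowtie 1 run with SAM output.

    All arguments are placeholders supplied by the caller; nothing here
    checks that the index or the reads file actually exists.
    """
    return [
        "bowtie",
        "-q",                # input reads are in FASTQ format
        "-S",                # write alignments in SAM format
        "-p", str(threads),  # number of alignment threads
        index_basename,      # basename of the prebuilt index (assumed: hg19)
        reads_fastq,         # groomed FASTQ file (hypothetical name)
        sam_out,             # SAM file to upload to Galaxy afterwards
    ]


if __name__ == "__main__":
    cmd = build_bowtie_command("hg19", "reads.fastq", "aligned.sam", threads=4)
    # Print the command for inspection; to run it, pass `cmd` to
    # subprocess.run(cmd, check=True) on a machine with Bowtie installed.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if a path contains spaces.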
Hello,

There is a bug in Bowtie right now. We have the fix for this ready, but it won't be available until the main server is updated and restarted. In the meantime, you can use Bowtie on our test server (http://test.g2.bx.psu.edu/), which has the updated code on it (just be aware that you can't transfer history items directly from our test server to main). I am sorry for the inconvenience.

Regards,
Kelly

On 4/6/10 4:33 PM, Dan Webster wrote:
> The work-around I found was to download bowtie directly from the website, run it on your local computer, then upload the resulting SAM file for subsequent Galaxy-driven analysis.
Hi,

I am using the samtools in Galaxy and got exactly the same error as described in the thread above. It is our own local Galaxy server, but the samtools tools are taken as is from the Galaxy production server.

tool_id=sam_to_bam

An error occurred running this job: Error extracting alignments from (/net/bs-gridhome/sw-repo/grid/applications/galaxy_dist/database/files/000/dataset_861.dat), requested number of bytes is more than a Python string can hold

The try/except block throwing this error is:

    try:
        # Extract all alignments from the input SAM file to BAM format ( since no region is specified, all the alignments will be extracted ).
        tmp_aligns_file = tempfile.NamedTemporaryFile( dir=tmp_dir )
        tmp_aligns_file_name = tmp_aligns_file.name
        tmp_aligns_file.close()
        # IMPORTANT NOTE: for some reason the samtools view command gzips the resulting bam file without warning,
        # and the docs do not currently state that this occurs ( very bad ).
        command = samtools_binary_path.SAMTOOLS+' view -bt %s -o %s %s' % ( fai_index_file_path, tmp_aligns_file_name, options.input1 )
        tmp = tempfile.NamedTemporaryFile( dir=tmp_dir ).name
        tmp_stderr = open( tmp, 'wb' )
        proc = subprocess.Popen( args=command, shell=True, cwd=tmp_dir, stderr=tmp_stderr.fileno() )
        returncode = proc.wait()
        tmp_stderr.close()
        # get stderr, allowing for case where it's very large
        tmp_stderr = open( tmp, 'rb' )
        stderr = ''
        buffsize = 1048576000
        try:
            while True:
                stderr += tmp_stderr.read( buffsize )
                if not stderr or len( stderr ) % buffsize != 0:
                    break
        except OverflowError:
            pass
        tmp_stderr.close()
        if returncode != 0:
            raise Exception, stderr
        if len( open( tmp_aligns_file_name ).read() ) == 0:
            raise Exception, 'Initial BAM file empty'
    except Exception, e:
        #clean up temp files
        if os.path.exists( tmp_dir ):
            shutil.rmtree( tmp_dir )
        stop_err( 'Noooo, Error extracting alignments from (%s), %s' % ( options.input1, str( e ) ) )
    try:

So could you provide me the fix you applied?

Kind regards,
Manuel
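One plausible trigger in the block above is the emptiness check `len( open( tmp_aligns_file_name ).read() ) == 0`, which loads the whole BAM file into a single string just to see whether it has any bytes. On a Python build with a limited maximum string size, a large BAM could raise an OverflowError there, and the outer `except Exception` would report it with this message. This is a guess from reading the snippet, not the fix Kelly mentions. A minimal sketch of a check that never reads the file, written for current Python and using hypothetical helper names:

```python
import os

MAX_STDERR_BYTES = 1024 * 1024  # illustrative cap (1 MiB), not a Galaxy setting


def file_is_empty(path):
    """Return True if the file has zero bytes, using only filesystem metadata."""
    return os.path.getsize(path) == 0


def read_head(path, max_bytes=MAX_STDERR_BYTES):
    """Return at most `max_bytes` from the start of a file.

    Useful for captured stderr, where the first part of the output is
    usually enough for an error message.
    """
    with open(path, "rb") as handle:
        return handle.read(max_bytes)
```

In the wrapper, the check would then read something like `if file_is_empty( tmp_aligns_file_name ): raise Exception( 'Initial BAM file empty' )`, and memory use no longer depends on the size of the BAM file.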
participants (4)
- Dan Webster
- Kelly Vincent
- Manuel Kohler
- Weng Khong Lim