I'm new to next-gen sequencing, so please be gentle. I've just received a
pair of Illumina FASTQ files from the sequencing facility and intend to map
them to the hg19 reference genome. I first used the FASTQ Groomer utility to
convert the reads to Sanger quality encoding. However, when running Bowtie for
Illumina on the resulting dataset with default settings, I received the
following error:
An error occurred running this job: *Error aligning sequence. requested
number of bytes is more than a Python string can hold*
Can someone help point out my mistake? My history is accessible at
Appreciate the help!
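For context, the Groomer conversion is about quality-score offsets: Sanger FASTQ encodes qualities with ASCII offset 33, while Illumina 1.3+ files use offset 64. A rough heuristic for telling the two apart (illustrative only, not the FASTQ Groomer's actual code):

```python
def guess_fastq_encoding(quality_lines):
    # Sanger qualities use ASCII 33-73 ('!' to 'I'); Illumina 1.3+
    # uses 64-104 ('@' to 'h'). Any character below '@' therefore
    # strongly suggests Sanger/offset-33 encoding.
    lowest = min(min(line) for line in quality_lines)
    return "sanger" if lowest < "@" else "illumina-1.3+"
```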
Weng Khong, LIM
Department of Genetics
University of Cambridge
It seems there is a (serious) bug when adding datasets via the
"Upload directory of files" option (with or without copying the data into
Galaxy (3545:2590120aed68), i.e. whether or not the 'No' box is ticked) if the
files are in ZIP format: the files actually get erased from the filesystem.
In the "information" column, I get:
Job error (click name for more info)
Then, after clicking on the file name:
Information about 2010.fastq.zip
Uploaded by: erick(a)mydomain.com
Date uploaded: 2010-03-23
Miscellaneous information: The uploaded file contains
We have no problems with adding unzipped files...
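Until the bug is fixed, one workaround is to verify each archive locally and upload the extracted files instead. A minimal integrity check using Python's stdlib zipfile (illustrative; the file name is an example):

```python
import zipfile

def archive_is_intact(path):
    # testzip() re-reads every member and checks its CRC; it returns
    # the first bad member's name, or None when the archive is clean.
    with zipfile.ZipFile(path) as zf:
        return zf.testzip() is None
```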
Hi - I recently discovered Galaxy and am just getting started with it...
I installed a local instance to test with NGS data.
We have a lot of NGS datasets (both SOLiD and Illumina) on a SAN
available via NFS. Each user that has data has an account on the system
in a directory tree such as:
where user1 and user2 are distinct users with their own dataset, and
project1, project2 contain datasets shared by multiple users.
I want to import this data in Galaxy and I came across this thread in
the mailing list (quoted below).
I set up my Galaxy instance to see this NFS share...btw Galaxy is
running as its own user on a virtual machine. The Galaxy user only has
read access to this share and nothing else.
When I went to import the files by specifying a local system path on the
admin user interface, I got errors importing the data. I checked the
paster.log file and I saw errors related to galaxy trying to change the
permissions of the files to 0644.
Does this mean all the files need to be owned by Galaxy?
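The paster.log errors are consistent with the upload code calling os.chmod() on each imported file; chmod only succeeds for the file's owner (or root), so read-only files owned by other users on the NFS share will fail. A miniature reproduction (modern Python, illustrative only, not Galaxy's actual upload code):

```python
import os
import tempfile

def normalize_permissions(path):
    # Roughly what the failing step does: force mode 0644 on the file.
    # os.chmod() raises PermissionError unless the caller owns `path`
    # (or is root) -- exactly the situation with someone else's NFS files.
    try:
        os.chmod(path, 0o644)
        return True
    except PermissionError:
        return False

# On a file we own this succeeds; on another user's file it would not.
with tempfile.NamedTemporaryFile(delete=False) as tf:
    ok = normalize_permissions(tf.name)
```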
The best approach to handle this is to have a Galaxy administrator
upload the files into a library, creating library datasets. Set the
following config setting to point to the NFS accessible directory you
use to contain the files:
# Directories of files contained in the following directory can be
# uploaded to a library from the Admin view
library_import_dir = /var/opt/galaxy/import
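For example (all paths and names here are hypothetical), each subdirectory of library_import_dir then shows up as a selectable server directory in the Admin upload form:

```shell
# Hypothetical layout; adjust IMPORT_DIR to your library_import_dir.
IMPORT_DIR=${IMPORT_DIR:-/tmp/galaxy_import_demo}
mkdir -p "$IMPORT_DIR/user1" "$IMPORT_DIR/project1"
# Stand-ins for datasets staged off the NFS share:
touch "$IMPORT_DIR/user1/run1.fastq" "$IMPORT_DIR/project1/shared.fastq"
```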
You can set the 'access' permission on the library to restrict it to
specific users, or leave it public to allow anyone to access the library.
Users can import the library datasets into their own histories for
analysis - doing this does not create another disk file, but simply a
"pointer" to the library dataset's disk file.
Greg Von Kuster
Galaxy Development Team
Andreas Kuntzagk wrote:
> to reduce traffic when importing big datasets into our local galaxy we
> would like to copy directly from the fileserver (which is accessible
> from the galaxy server via NFS) into galaxy without moving through the
> users desktop computer.
> Is there already such a tool, by chance? At a glance I could only find
> import.py, which imports only from a predefined set of files.
> So if no such tool exists, what would be the best starting point?
> For security reasons I would restrict the import to a certain directory.
> Anything else to keep in mind?
> regards, Andreas
> galaxy-user mailing list
> galaxy-user at bx.psu.edu
From: research pal <workinformatics(a)yahoo.com>
I have a BED file for the chimp genome and I want to retrieve the
corresponding regions from the human genome.
Is there a tool in Galaxy that can help me find these orthologous regions?
Some options in the file 'universe_wsgi.ini' are written 'true' (e.g.
*= true*) and others 'True' (e.g. *static_enabled = True*). I was wondering
whether using either form is fine.
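For what it's worth, ini-style parsers generally treat boolean values case-insensitively. The stdlib configparser (used here purely as an illustration, not Galaxy's actual config-loading code) behaves like this:

```python
import configparser

# Both spellings parse to the same boolean, regardless of case.
cfg = configparser.ConfigParser()
cfg.read_string("[app]\nstatic_enabled = True\ndebug = true\n")
both_true = (cfg.getboolean("app", "static_enabled")
             and cfg.getboolean("app", "debug"))
```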
I am using a local install of Galaxy for processing Illumina data. However,
for large files (about 3 GB) the upload appears to be progressing, but nothing
happens and it eventually fails with an error:
line 818, in str_POST
File "/usr/lib64/python2.6/cgi.py", line 508, in __init__
self.read_multi(environ, keep_blank_values, strict_parsing)
File "/usr/lib64/python2.6/cgi.py", line 635, in read_multi
headers = rfc822.Message(self.fp)
File "/usr/lib64/python2.6/rfc822.py", line 108, in __init__
File "/usr/lib64/python2.6/rfc822.py", line 155, in readheaders
line = self.fp.readline()
File "/usr/lib/python2.6/site-packages/paste/httpserver.py", line 476, in
data = self.file.readline(max_read)
File "/usr/lib64/python2.6/socket.py", line 379, in readline
bline = buf.readline(size)
OverflowError: signed integer is greater than maximum
127.0.0.1 - - [13/Apr/2010:12:14:00 +0600] "GET /history HTTP/1.1" 200 - "
http://localhost:8080/tool_runner/upload_async_message" "Mozilla/5.0 (X11;
U; Linux x86_64; en-US) AppleWebKit/533.2 (KHTML, like Gecko)
What could be going wrong?
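Not a fix for Galaxy itself, but an illustration of the underlying issue: the traceback ends in a readline(size) call being handed more bytes than a signed C integer can describe. Reading an upload in bounded chunks sidesteps that kind of overflow; a minimal sketch (names are my own, not Galaxy's):

```python
import io

def read_in_chunks(stream, chunk_size=1024 * 1024):
    # Read an arbitrarily large stream in fixed-size pieces so no
    # single read() call asks the I/O layer for an oversized buffer.
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk

# Example: reassemble a 5 MB payload from 1 MB chunks.
payload = b"x" * (5 * 1024 * 1024)
total = sum(len(c) for c in read_in_chunks(io.BytesIO(payload)))
```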