I've CC'd the mailing list again (assuming you omitted it by
mistake - it's easily done).
On Tue, Nov 9, 2010 at 8:43 PM, John David Osborne wrote:
>>I've written a BLAST+ wrapper which is now in galaxy-central, and
>>will eventually be in galaxy-dist and thus potentially on the public
>>Galaxy server (assuming it won't tax it too much). This doesn't
>>(yet) offer the option to run the BLAST remotely at the NCBI
>>(the wrapper could do this in theory). However, running a BLAST
>> against the NCBI databases will cause difficulties for reproducibility
>>(since we have no control over the databases, and the NCBI make
>>regular updates).
> Understood, I assume then most people get their sequence data
> and upload it before starting a workflow... That wrapper does sound
> handy though! It just seems odd to me that the Galaxy portal
> doesn't have a generic BLAST tool and the standard databases (nt,
> nr, etc.) to run against.
Well, I'm sure they will consider it in future. But if I were in
their shoes, I'd be a little nervous about the computational
load it might impose.
>>Would you be running your own Galaxy server?
> That's the current plan. We are also looking at Taverna, but I
> can't seem to find much in the way (actually any) of biology
> publications that use it as part of their workflow. I haven't
> looked that hard though.
Within our institute we intend to offer BLAST running on our
departmental server (in Galaxy) for Biologists to run large
searches (e.g. multiple query sequences like a set of contigs)
against both NCBI databases like NR and also in house ones
(e.g. unpublished genomes), and as part of workflows (e.g.
upload a FASTA file, blast against organism X, divide the
FASTA file into those with good matches and those without).
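As a sidebar, the last step of that workflow (dividing a FASTA file by whether each sequence had a good match) is easy to sketch outside Galaxy too. A minimal, hypothetical Python version, assuming you have already collected the set of query IDs with good BLAST hits:

```python
def split_fasta(fasta_lines, hit_ids):
    """Split FASTA records into (matched, unmatched) lists according
    to whether the record ID (first word after '>') is in hit_ids."""
    matched, unmatched = [], []
    record, rid = [], None

    def flush():
        # Append the finished record to the right bucket
        if record:
            (matched if rid in hit_ids else unmatched).append("".join(record))

    for line in fasta_lines:
        if line.startswith(">"):
            flush()
            record = [line]
            rid = line[1:].split()[0]
        else:
            record.append(line)
    flush()
    return matched, unmatched
```

In Galaxy the same split would of course be a workflow step rather than a script, but the logic is the same.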
This is drifting into a discussion more suited for the galaxy
dev mailing list though ;)
We have deployed a local instance of Galaxy and have recently been
experiencing several problems which we believe to be related to our server
being overloaded with user requests. We get the following error message
(from the log files):
raise exc.TimeoutError("QueuePool limit of size %d overflow %d
reached, connection timed out, timeout %d" % (self.size(),
self.overflow(), self._timeout))
TimeoutError: QueuePool limit of size 10 overflow 20 reached, connection
timed out, timeout 30
Could someone explain why we might get this error message and how we might
configure our server settings to solve this problem?
Is universe_wsgi.ini.sample a configuration for production server
instances? If not, could someone post some suggested settings for
production servers (e.g. the settings that you use for the main Galaxy
server)?
Finally, is there a wiki page explaining the settings in universe_wsgi.ini?
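In case it helps while waiting for a fuller answer: that traceback comes from SQLAlchemy's connection pool, and Galaxy exposes the pool sizing in universe_wsgi.ini. The option names below are from the sample file as I remember it - please check them against your universe_wsgi.ini.sample - and the values are illustrative only, not recommendations:

```
# SQLAlchemy connection pool sizing; raising these can help when
# "QueuePool limit ... reached" appears under load
database_engine_option_pool_size = 10
database_engine_option_max_overflow = 20
```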
I am new to galaxy and I am wondering if it is possible to:
1) Fetch data from NCBI from the main Galaxy portal? I assume this is possible from a local installation. For example, run a query, say looking for a disease in rodents using Ratmine, export to Galaxy (did this fine) and then use the returned identifiers to fetch sequence from NCBI. I can't seem to do this last bit - I don't even see NCBI under "Get Data".
2) Run BLAST from the main Galaxy portal? Is this disabled due to too many users? Does this option appear once you have sequence?
The context: our group at UAB is evaluating Galaxy and I am just playing around to see what it can do.
This may not be the 'best' way, but I've found it to be a workable solution.
I'm running a Perl script that takes the tool-xml parameters, does some processing, and calls
R to obtain a p-value calculation. The results are parsed and included in a formatted HTML page.
It's all code based.
You can likely use something similar in a Python script:
# --- create the R command file
open my $rcmd, ">", "rcmd" or die $!;
print $rcmd "phyper($match, $listcount, $all, $qrycount, lower.tail = FALSE, log.p = FALSE)\n";
close $rcmd;
# --- run R in command-line mode
system("R --slave -f rcmd > rpval") == 0 or die "R failed: $?";
# --- open and parse the result
open my $fh, "<", "rpval" or die $!;
my $line = <$fh>;
my ($pval) = $line =~ /\[1\] (.+)/;
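If you do port this to Python, the round-trip through R can be skipped entirely. A minimal stdlib sketch (the function name and the example numbers are mine; in practice scipy.stats.hypergeom.sf is the usual shortcut), interpreting the arguments the way R's phyper does (successes, failures, draws):

```python
from math import comb

def phyper_upper(q, m, n, k):
    """Upper-tail hypergeometric p-value, equivalent to R's
    phyper(q, m, n, k, lower.tail = FALSE): the probability of
    drawing more than q successes when k items are drawn from a
    pool of m successes and n failures."""
    total = comb(m + n, k)
    tail = sum(comb(m, x) * comb(n, k - x)
               for x in range(q + 1, min(m, k) + 1))
    return tail / total

# Hypothetical example: 4 matches between a 10-gene list and a
# 20-gene query, drawn from 100 genes total
p = phyper_upper(4, 10, 90, 20)
```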
Hope you find this of use.
I would appreciate if you could add my email address
(sk123p(a)clinmed.gla.ac.uk) to your mailing list.
sreenivasulu kurukuti PhD
Division of Cancer Sciences and Molecular Pathology
Section of Pathology and Gene Regulation
Tel: 44-(0)141 211 2743
Fax: 44-(0)141 337 2494
Dear Galaxy Users,
Since I've updated my Galaxy version I'm unable to import any dataset from
my history into my data library. It returns a message such as "Invalid item
id (1464) specified".
I then did a complete new test installation to make sure that this problem
wasn't because of the update. But the error still remains. I've included a
screen shot that reports the error.
When I queried the database, the dataset id does exist in the relevant table.
I went through the paster.log file and found the following warning message:
Unknown library item type: <class
'galaxy.model.HistoryDatasetAssociation'>. I'm sure this is the problem and
I need to find a way to fix it.
Does anyone have the same issue?
The reference file is the source genome or any other fasta file that
your data is derived from (could be custom). It is the "reference"
sequence that you will be using for mapping.
There are a few things to try:
If your data will be mapped to a genome already in Galaxy, then use the
pencil icon (for the SAM history item) and alter the attributes to
assign a genome. Next, use the same genome as the reference when running
SAM->BAM. Please note that not all genomes are indexed for use by SAM
tools. If your genome is not here, we are open to requests to add more,
if the data is in our main genome list or publicly available from a
stable source. Please be specific for requests - exact genome name as we
use it, or a link to NCBI, or a link to another public data source is best.
If your data is custom, the database can remain undefined (will display
as a "?"). Load your custom fasta genome/sequence into your history, if
not already there. Then when running SAM->BAM, use the option "locally
cached" and set the reference to be that loaded custom fasta file.
Hopefully this helps to resolve the issue. But, if you continue to have
problems, please feel free to share your history and we can take a
closer look. To do this, at the top of the history pane (right): Options
-> Share or Publish -> Make History Accessible via Link and email to me.
On 11/8/10 12:20 PM, Nripesh Prasad wrote:
> Hi Jennifer,
> I am doing it that way only. I have uploaded my .sam files in history;
> now when I select NGS: SAM Tools -> SAM-to-BAM, it gives me two options
> to choose the source. If I select history and enter the sam file then it
> is asking for a reference file. What is a reference file and where do I
> get a reference file?
> If I do it by the option locally cached as a source then it is giving me
> the following error:
> * No Content-Length: returned in header for
> can't proceed, sorry
> What should I do now?
> On Mon, Nov 8, 2010 at 2:14 PM, Jennifer Jackson <jen(a)bx.psu.edu
> <mailto:firstname.lastname@example.org>> wrote:
> Hi Nripesh,
> Use "NGS: SAM Tools -> SAM-to-BAM". This will create a new BAM data
> history item.
> Hope this helps!
> Galaxy team
> ps. For new data/usage questions, it would be great for us if you
> could send them to the mailing list galaxy-user(a)bx.psu.edu
> <mailto:email@example.com>. We like to publish answers there
> for all to learn from.
> On 11/8/10 11:31 AM, Nripesh Prasad wrote:
> Hi Jennifer,
> How can I convert .sam format to .bam format in Galaxy?
> Jennifer Jackson
> http://usegalaxy.org <http://usegalaxy.org/>
Please send your queries to one of our mailing lists -- galaxy-dev or galaxy-user -- rather than to individuals. That way, more people can see and help you with your questions, you'll likely get a more timely response, and the mailing list archive can serve as a useful repository. Your question about GFF to BED is best addressed to galaxy-user (b/c it's about usage of Galaxy), and your question about Tophat is best addressed to galaxy-dev (b/c it's about setting up a local Galaxy instance). Hence, I've cc'd both lists.
Ok, on to your questions:
> Maybe I can use Tophat alone.
We've got a Galaxy wrapper ready for Tophat version 1.1+ ; it will likely be available in galaxy-central this week. Once you update your version of Galaxy, you'll be able to run Tophat in Galaxy.
> But now I have another question, about the GFF-to-BED function.
> I want to use the GFF-to-BED function; after I run it, it seems I get these results:
> empty, format: bed, database: ?
> Info: 0 lines converted to BED. Skipped 74166 blank/comment/invalid lines starting with line #1.
> I am using a gff3 file, and these are its first few lines:
> ##gff-version 3
> ##genome-build MSU Rice Genome Annotation Project osa1r6
> ##species Oryza sativa spp japonica cv Nipponbare
> ##sequence-region Chr1 1 43268879
> Chr1 MSU_osa1r6 gene 1903 9817 . + . ID=13101.t00001;Name=TBC%20domain%20containing%20protein%2C%20expressed;Alias=LOC_Os01g01010
> Chr1 MSU_osa1r6 mRNA 1903 9817 . + . ID=13101.m00001;Parent=13101.t00001;Alias=LOC_Os01g01010.1
> Chr1 MSU_osa1r6 five_prime_UTR 1903 2268 . + . Parent=13101.m00001
> Chr1 MSU_osa1r6 five_prime_UTR 2354 2448 . + . Parent=13101.m00001
This appears to be a valid GFF3 file and should work fine. We'll look into this and get back to you.
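For what it's worth, the coordinate change a GFF-to-BED conversion has to make is small but easy to get wrong: GFF3 is 1-based with inclusive ends, while BED is 0-based and half-open. A toy sketch of that shift (this is not Galaxy's converter; the feature filter and the ID parsing are my own simplifications):

```python
def gff3_to_bed(lines, feature="gene"):
    """Convert GFF3 lines of the given feature type to 6-column BED.
    GFF3 start is 1-based inclusive; BED start is 0-based, so shift
    the start by -1 and keep the end as-is."""
    out = []
    for line in lines:
        if line.startswith("#"):
            continue  # headers/comments such as ##gff-version
        f = line.rstrip("\n").split("\t")
        if len(f) < 9 or f[2] != feature:
            continue
        attrs = f[8]
        name = attrs.split("ID=")[1].split(";")[0] if "ID=" in attrs else "."
        out.append("\t".join([f[0], str(int(f[3]) - 1), f[4], name, "0", f[6]]))
    return out
```

Note the split on tabs: if a GFF3 file has been mangled into space-separated columns (which can happen in email or copy-paste), a converter like this would skip every line, which is one possible cause of the "0 lines converted" result above.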
Dear Galaxy users,
I would like to do a quite simple operation, in theory: I've configured a
Galaxy pipeline on a local Galaxy server (installed in a Sun Grid Engine
cluster), and I would like to run it on several datasets (several thousands,
in a directory) and get result files in another directory.
With the web interface, using libraries or not, I didn't find any solution.
Does a simple solution exist? Or has anybody experienced the same problem?
Research engineer - ARCAD project
CIRAD - Montpellier - France