Hi...
I have an analysis of 18,000,000+ sequences (X4) blasting against the HTGS database. Is there any way to BLAST against a subset of this database to speed up the analysis? Is there also a way to know whether the analysis is working, i.e. periodic updates? I have had issues before with uploading large files where there is an error but no error message: the data looks like it's uploading when in fact it isn't. I just don't want to wait a week or more and then find out the analysis is never going to end.
Thanks!!
Mersee
Mersee Madison-Villar
PhD Student, UT Arlington
Lab/Office B01
Genomics, Speciation, and Evolutionary Biology
"If evolution is outlawed, only outlaws will evolve"- Jello Biafra
________________________________________
From: galaxy-user-bounces(a)lists.bx.psu.edu [galaxy-user-bounces(a)lists.bx.psu.edu] On Behalf Of Guruprasad Ananda [gua110(a)bx.psu.edu]
Sent: Monday, March 29, 2010 9:46 AM
To: Andigoni Malousi
Cc: galaxy-user(a)bx.psu.edu
Subject: Re: [galaxy-user] problem report!!
Hi Andigoni,
I'm unable to reproduce the issue you reported. I used the BED file you sent as an attachment and fetched sequences for it. The sequences produced were of the same lengths as the intervals in the BED file.
Also, I computed summary statistics for the lengths of your BED intervals - the mean and median lengths were 19790 bp and 3466 bp respectively. Since you mention that you expect the constitutive exons to be 100-200 bp long, I'm guessing there might be a problem in one of the steps prior to the subtraction step in your pipeline.
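For anyone who wants to run the same kind of check locally, here is a minimal sketch of computing interval-length statistics from BED lines. It is not the Galaxy tool used above; the function name `bed_interval_lengths` and the three example intervals are hypothetical, and it assumes standard BED coordinates (0-based start, end-exclusive), so each length is simply end minus start.

```python
import statistics


def bed_interval_lengths(lines):
    """Return the length (end - start) of every interval in BED-formatted lines."""
    lengths = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the optional header/comment lines of a BED file.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        lengths.append(end - start)
    return lengths


# Made-up example intervals (assumption, not data from this thread).
example = [
    "chr1\t100\t250\texonA",
    "chr1\t1000\t1200\texonB",
    "chr2\t500\t20500\texonC",
]

lengths = bed_interval_lengths(example)
mean_len = statistics.mean(lengths)
median_len = statistics.median(lengths)
print(f"n={len(lengths)} mean={mean_len:.1f} median={median_len}")
```

A mean far above the median, as in the numbers reported above, usually means a few very long intervals are mixed in with many short ones.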
If you can share your history with me (guru(a)psu.edu), I can take a look at what might be going on. Here's how to do it:
-go to the Options menu above your current history, select "Saved Histories"
-go to the pull-down menu for the problematic history and select "Share or Publish"
-share the history with me by clicking on "share with a user" and entering my email address: guru(a)psu.edu.
Thanks for using Galaxy,
Guru
Galaxy team.
On Mar 27, 2010, at 5:06 AM, Andigoni Malousi wrote:
Dear Galaxy member,
I'm sending you this e-mail because of a problem I have in fetching sequences. I used Galaxy to fetch sequences corresponding to alternatively spliced exons as well as constitutive exons.
Here are the steps I followed to do that:
First, I extracted the coordinates of the ref genes from the whole human genome:
1. I selected Get Data -> UCSC table browser
2. Genome: "Human", assembly: "Feb. 2009", Group: "Genes and Gene Prediction Tracks", Track: "RefSeq genes".
3. Filter: (+) region: Genome
4. "get output" and "Exons plus 0 bases…"
Second, I extracted the coordinates of alternative splicing events:
1. I selected "Get Data" -> "UCSC Main table browser".
2. Genome: "Human", assembly: "Feb. 2009", Group: "Genes and Gene Prediction Tracks", Track: "Alt Events".
3. Filter: (+) region: Genome
Then, I stored the outcome of these two processes in BED format. To extract the coordinates of the constitutive exons I used the Subtract operation in "Operate on Genomic Intervals"
1. Subtract: Alternative splicing events From: Ref genes
2. "Intervals with no overlap"
3. Stored the constitutive exon coordinates in BED format
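To make the subtraction step concrete, here is a minimal sketch of what an "Intervals with no overlap" subtraction does conceptually: it keeps only those intervals from the first set that do not overlap any interval in the second set. This is an illustration, not Galaxy's actual implementation; the function name `intervals_with_no_overlap` and the example coordinates are hypothetical, and intervals are assumed to be BED-style (0-based start, end-exclusive). The simple pairwise check is fine for small inputs but would be slow on whole-genome tracks.

```python
from collections import defaultdict


def intervals_with_no_overlap(first, second):
    """Keep intervals from `first` that overlap no interval in `second`.

    Each interval is a (chrom, start, end) tuple with end-exclusive coordinates.
    """
    # Group the second set by chromosome so only same-chromosome pairs are compared.
    by_chrom = defaultdict(list)
    for chrom, start, end in second:
        by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end in first:
        # Two half-open intervals overlap when each starts before the other ends.
        overlaps = any(
            start < other_end and other_start < end
            for other_start, other_end in by_chrom.get(chrom, [])
        )
        if not overlaps:
            kept.append((chrom, start, end))
    return kept


# Made-up coordinates (assumption, not data from this thread).
ref_exons = [("chr1", 100, 250), ("chr1", 300, 450), ("chr2", 500, 650)]
alt_events = [("chr1", 200, 260)]

constitutive = intervals_with_no_overlap(ref_exons, alt_events)
print(constitutive)
```

Note that with this mode an exon is dropped entirely if any part of it overlaps an alternative splicing event; it is not trimmed.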