Get flanks (version 1.0.0)
Hi,
I am very new to genomic data analysis and I need to get some upstream and downstream sequence for some chromosome regions of the pig genome. I have about 70 BLAT hits from a query of ca. 100 aa. I need to get 7000 nucleotides both upstream and downstream of this 100 aa region. I have tried to use Get flanks to get the "new" coordinates, but instead of generating coordinates that would correspond to about 14000 nucleotides, it generates one coordinate for the upstream region and then another one for the downstream region. Is there a way of doing what I need using Galaxy?
I would appreciate any help!
Thanks a lot!
All the best, Fabricia.
Hello Fabricia,
You are probably running the tool like this, correct? This lumps the upstream flank and downstream flank ends to create one interval:
"Region:" Whole feature
"Location of the flanking region/s:" Both
"Offset" 0
"Length of the flanking region(s):" 7000
Instead, run the tool twice to extract upstream and downstream regions into distinct intervals:
Run 1
"Region:" Whole feature
"Location of the flanking region/s:" Upstream
"Offset" 0
"Length of the flanking region(s):" 7000
Run 2
"Region:" Whole feature
"Location of the flanking region/s:" Downstream
"Offset" 0
"Length of the flanking region(s):" 7000
If your question has been misunderstood, please let us know.
Best,
Jen
Galaxy team
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
-- Jennifer Jackson http://galaxyproject.org
Hi Jen,
Thanks a lot for your reply. But I think you misunderstood my question. I will reformulate it with examples.
I have initially (because I am doing just a preliminary analysis) 70 BLAT hits corresponding to different coordinates in the pig genome. What I would like to have is the flanking region in both directions around each of these BLAT hits. I am not working with genes (or introns and exons).
For example:
Imagine that this symbol ########## corresponds to my BLAT hit and this symbol -------------------- corresponds to the flanking regions.
I have initially
##########
and I would like to obtain
-------------------- ########## --------------------
In numbers:
I have:
chr14 64969084 64969603
I would like to have:
chr14 64962084 64976603
(This corresponds to the first coordinate minus 7000 and the last coordinate plus 7000.)
What I got using "Get Flanks" with the parameters
"Region:" Whole feature
"Location of the flanking region/s:" Both
"Offset" 0
"Length of the flanking region(s):" 7000
was:
chr14 64962084 64969084
chr14 64969603 64976603
Is there a way of merging the above coordinates to come up with what I need?
Thanks a lot for your help,
All the best, Fabricia.
Hi Fabricia,
To create a merged interval that spans the 7k upstream flank interval, the original interval, and the 7k downstream flank interval, do the following.
Start with the two files you already have:
1 - original intervals (extracted from BLAT hits)
2 - flank results from the query: "Get Flanks" "Region:" Whole feature "Location of the flanking region/s:" Both "Offset" 0 "Length of the flanking region(s):" 7000
Put both datasets into a single dataset using the tool "Operate on Genomic Intervals -> Concatenate", with "Both datasets are same filetype?" checked. On that result file, merge the intervals together using the tool "Operate on Genomic Intervals -> Merge".
If your original BLAT hits have any overlap, or the flanks you are generating have any overlap with any of your other intervals (original or other flanks), then this is probably not going to give you the results you want. In that case, it may be simpler to modify the coordinates directly using the "Text manipulation" tools. Specifically, run "Compute an expression on every row" twice, once with the expression "c2 - 7000" and once with "c3 + 7000" (this subtracts 7000 from the start and adds 7000 to the end). Then use "Cut" to recreate the interval file using the new values as start and end.
Hopefully one of these will work for you.
Jen
Galaxy team
-- Jennifer Jackson http://galaxyproject.org
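For anyone who wants to check the arithmetic outside Galaxy, the two approaches Jen describes (Concatenate followed by Merge, and shifting the start and end coordinates directly) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of the Galaxy tools: the function names `extend`, `flanks`, and `merge` are hypothetical, strand is ignored, start coordinates are clamped at 0, and intervals that merely touch are treated as mergeable.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end)

FLANK = 7000  # flank length used throughout the thread


def extend(iv: Interval, flank: int = FLANK) -> Interval:
    """Shift the coordinates directly: start - flank, end + flank.

    Mirrors the "c2 - 7000" / "c3 + 7000" expressions.
    Clamping the start at 0 is an assumption of this sketch.
    """
    chrom, start, end = iv
    return (chrom, max(0, start - flank), end + flank)


def flanks(iv: Interval, flank: int = FLANK) -> List[Interval]:
    """Return the upstream and downstream flanks as two separate intervals."""
    chrom, start, end = iv
    return [(chrom, max(0, start - flank), start), (chrom, end, end + flank)]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals on the same chromosome.

    Treating touching intervals as mergeable is an assumption of this sketch.
    """
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# The hit from the thread.
hit = ("chr14", 64969084, 64969603)
extended = extend(hit)
merged = merge([hit] + flanks(hit))
print(extended)  # ('chr14', 64962084, 64976603)
print(merged)    # [('chr14', 64962084, 64976603)]

# Two made-up hits closer together than 2 * FLANK: their flanks overlap,
# so concatenate + merge collapses them into one interval, while direct
# extension keeps one interval per hit.
close_hits = [("chr1", 10000, 10100), ("chr1", 20000, 20100)]
per_hit = [extend(h) for h in close_hits]
collapsed = merge(close_hits + [f for h in close_hits for f in flanks(h)])
print(per_hit)    # [('chr1', 3000, 17100), ('chr1', 13000, 27100)]
print(collapsed)  # [('chr1', 3000, 27100)]
```

For the example hit both routes give chr14 64962084 64976603. The second, made-up pair of hits shows the caveat from the reply: when flanks overlap, merging loses the one-interval-per-hit correspondence, whereas shifting the coordinates keeps it.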
participants (2)
- Fabricia Nascimento
- Jennifer Jackson