---------------------------- Original Message ----------------------------
Subject: Re: Randomization etc.
From: "France Denoeud" <fdenoeud@imim.es>
Date: Wed, March 15, 2006 6:08 am
To: "Anton Nekrutenko" <anton@bx.psu.edu>
Cc: "Dan Blankenberg" <djb396@psu.edu>
--------------------------------------------------------------------------

Anton,

I used the randomization tool, and it seems to do exactly what I need (and it is faster than my script!): thanks!

Just two details:
- The links to the UCSC browser are not working because of the "." in the 5th column (probably my mistake: I put "." in the input file because I am more used to GFF format).
- Would it be possible to ask for several runs at the same time? I usually generate 10 (sometimes 100) random sets to calculate the random overlap in a more robust way. It would then be useful to be able to save all the outputs at once: is that possible in Galaxy?

Two more general comments about Galaxy:
- Would it be difficult to make it handle GFF format?
- Would it be difficult to make it handle ENCODE coordinates (and add a tool to do the encode2chr conversion if needed)?

France.
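[Editor's note] The "." placeholder in the 5th column comes from GFF habits; in BED the 5th column is a score, which UCSC documents as an integer from 0 to 1000. A minimal sketch of a pre-processing step, assuming tab-separated input and assuming that "0" is an acceptable default score (neither is stated in the thread; `fix_score_column` is a hypothetical helper, not part of Galaxy):

```python
def fix_score_column(line, placeholder=".", default="0"):
    """Replace a GFF-style placeholder in the BED score column (5th field).

    Assumes tab-separated fields; lines with fewer than five fields,
    or with a real score already present, are returned unchanged.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) >= 5 and fields[4] == placeholder:
        fields[4] = default
    return "\t".join(fields)


if __name__ == "__main__":
    row = "chr7\t116096515\t116096516\tracefrag_15\t.\t+"
    print(fix_score_column(row))
```

Whether this alone restores the UCSC links depends on how the tool passes the file along, so it is worth checking on a single region first.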
France,
Try the new randomizer at http://test.g2.bx.psu.edu, written by Dan Blankenberg (djb396@bx.psu.edu).
It is slower, but it takes strands into account and now returns correct coordinates. However, it is not well tested yet, so consider yourself a beta tester.
anton
Anton Nekrutenko
Assistant Professor
Department of Biochemistry and Molecular Biology
Center for Comparative Genomics and Bioinformatics
505 Wartik Building
PennState University
University Park, PA 16802
814 865-4752
814 863-6699 FAX
anton@bx.psu.edu
http://www.bx.psu.edu/~anton
http://g2.bx.psu.edu
On Mar 14, 2006, at 1:28 PM, France Denoeud wrote:
France,
We can change the tool to take care of the strand information. We are currently rewriting our set operations, and we'll redo the randomizations as well (they use the same underlying library). This will take a few weeks.
More worrying, they do seem to fall inside ENCODE regions (I looked using the "display at UCSC" link).
This is normal as the tool currently only allows you to generate regions within ENCODE. Do you want the whole genome?
Oops, I forgot a "not": they do NOT fall inside ENCODE regions but at the beginning of the chromosomes, as if a shifting step had been forgotten...
Another thing: I am trying to generate a random set mimicking a set of 616 objects and I get only 612 objects. Is that normal?
You have 4 regions with length 1 (not really ranges, but points):
chr7 116096515 116096516 racefrag_15 . + 1.0
chr7 115906408 115906409 racefrag_13 . + 1.0
chr19 60020799 60020800 racefrag_312 . - 1.0
chr11 64079672 64079673 racefrag_561 . + 1.0
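[Editor's note] The four rows above account for the gap between 616 and 612. A minimal sketch for spotting such point-like regions before submitting a file, assuming whitespace-separated BED-style rows and taking length as end minus start (which gives 1 for each row listed); `point_regions` is a hypothetical helper, not a Galaxy function:

```python
# Rows copied from the message above: chrom, start, end, name, score, strand, extra.
ROWS = """\
chr7 116096515 116096516 racefrag_15 . + 1.0
chr7 115906408 115906409 racefrag_13 . + 1.0
chr19 60020799 60020800 racefrag_312 . - 1.0
chr11 64079672 64079673 racefrag_561 . + 1.0
"""


def point_regions(text, max_length=1):
    """Return (name, length) for every region no longer than max_length."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        start, end = int(fields[1]), int(fields[2])
        length = end - start  # assumption: length is end minus start
        if length <= max_length:
            hits.append((fields[3], length))
    return hits


if __name__ == "__main__":
    for name, length in point_regions(ROWS):
        print(name, length)
```

Running the same check over the full 616-region input should list exactly the regions that the randomized output is missing, if length is indeed the only reason they were dropped.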
(by the way, is it possible in galaxy to be using two different "histories" at the same time?)
Yes:
1. Click "share history" and bookmark the link
2. Click "You may also create a new history by clicking here"
3. Get the first query in and click "share history" and bookmark
Now you can simply switch between the two using bookmarks (use browser tabs)
anton