Problem in randomly selecting sequences?
On Thu, Oct 13, 2011 at 10:24 AM, Daniel Sher <dsher@sci.haifa.ac.il> wrote:
Hello again,
I am trying to randomly select sequences from an uploaded fasta file, but only about one-half of the randomly selected sequences actually contain sequence data (see below). The others contain only the name of the sequence. This happens even after making sure that in the initial file all of the sequences indeed have sequence data (by filtering to obtain only sequences with >100bp).
Any suggestions?
Looks like you're randomly picking *lines* from the file, not *records*. Which Galaxy tool are you using?
Peter
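To illustrate the point about lines versus records, here is a small Python sketch (not a Galaxy tool). The function name `split_sampled_lines` and the toy sequences are made up for the example. It assumes a FASTA file with one header line and one sequence line per record, so a line-based sampler is equally likely to return a bare header as a bare sequence.

```python
import random

def split_sampled_lines(lines, n, seed=None):
    """Sample n lines at random and split them into headers and sequence lines."""
    rng = random.Random(seed)
    picked = rng.sample(lines, n)
    headers = [line for line in picked if line.startswith(">")]
    sequences = [line for line in picked if not line.startswith(">")]
    return headers, sequences

# Hypothetical toy input: four records, two lines each.
fasta_lines = [
    ">seq1", "ACGTACGT",
    ">seq2", "GGGTTTAA",
    ">seq3", "CCCCAAAA",
    ">seq4", "TTTTGGGG",
]

headers, sequences = split_sampled_lines(fasta_lines, 4, seed=0)
print(len(headers), "header lines and", len(sequences), "sequence lines sampled")
```

The sampled headers and sequence lines are drawn independently, so a picked header generally does not come with its own sequence.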
On Thu, Oct 13, 2011 at 11:38 AM, Daniel Sher <dsher@sci.haifa.ac.il> wrote:
Hi Peter - I am using the "select random lines"... so I see the problem with fasta format (duh). How can I randomly select records?
Daniel
Assuming you are restricted to the tools on the public Galaxy at Penn State, one solution is FASTA -> Tabular -> Select random lines -> FASTA. Messy, and several extra steps, though.
Peter
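Outside Galaxy, the same FASTA -> Tabular -> select random rows -> FASTA idea can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `read_fasta` and `sample_fasta` are hypothetical helper names, not Galaxy tools, and the whole file is assumed to fit in memory.

```python
import random

def read_fasta(lines):
    """Yield (header, sequence) pairs; wrapped sequence lines are joined."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def sample_fasta(lines, n, seed=None):
    """Return n randomly chosen records as FASTA text (one sequence line each)."""
    records = list(read_fasta(lines))        # FASTA -> one row per record
    rng = random.Random(seed)
    picked = rng.sample(records, min(n, len(records)))  # sample whole records
    return "".join(f">{h}\n{s}\n" for h, s in picked)   # rows -> FASTA

# Usage with a hypothetical file name:
# with open("input.fasta") as handle:
#     print(sample_fasta(handle, 100), end="")
```

Because whole records are sampled, every selected header keeps its sequence. The output writes each sequence on a single line, much like the round trip through a tabular format.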
Hello Daniel, Peter,
I have opened a bitbucket ticket to track a 'select random records' tool request. Please feel free to add in additional datatypes or details that you would like to be considered.
https://bitbucket.org/galaxy/galaxy-central/issue/668/select-random-records-...
Thanks Peter for the work-around in the case of random Fasta sequences!
Best,
Jen
Galaxy team
-- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/Support