Hi, is there a way to manipulate the values in one column of a tab-separated file (e.g. change all chr1@sdfasdfasdfasdf to chr1 using a regexp)? thx, Felix
Hi Felix,
"Text Manipulation -> Convert delimiters to TAB" could split one field into more than one, but the delimiter has to be in the list ("@" is not).
"Text Manipulation -> Cut columns" from a table is similar, but it will not split on a "@" either.
"Text Manipulation -> Trim leading or trailing characters" could be used for this specific case, since you can trim off the end of a column based on a position (but again, not on a specified delimiter). To prep for an entire genome, you would need to break up the starting query so that the chromosome names in any derivative queries are of a consistent length, then merge the results back together.
Perhaps the "@" was just an example and one of these tools will work for you. If you are customizing, additions to the Tool Shed that expand the native tools are always welcome! http://community.g2.bx.psu.edu
Thanks for using Galaxy and for bringing up an interesting use case,
Best,
Jen
Galaxy team
On 2/18/11 4:57 AM, Felix Hammer wrote:
Hi, is there a way to manipulate the different values in the column of a tab separated file (i.e. change all chr1@sdfasdfasdfasdf to chr1 using regexp). thx, Felix
-- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org
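For readers who can run a script outside Galaxy, the substitution asked about in the original question can be sketched in a few lines of Python. This is only an illustration and not a Galaxy tool: the function name `strip_suffix_in_column`, its parameters, and the sample rows are all made up for this example, and it assumes the unwanted text always starts at the first "@" in the chosen column.

```python
import re

def strip_suffix_in_column(lines, column=0, pattern=r"@.*$"):
    """Delete whatever `pattern` matches from one column of tab-separated lines.

    `column` is zero-based. All names here are hypothetical helpers,
    not part of Galaxy.
    """
    regex = re.compile(pattern)
    result = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Leave short rows untouched instead of raising an IndexError.
        if column < len(fields):
            fields[column] = regex.sub("", fields[column])
        result.append("\t".join(fields))
    return result

# Made-up sample rows in the spirit of the question.
rows = ["chr1@sdfasdfasdfasdf\t100\t200", "chr2@qwerty\t300\t400"]
for row in strip_suffix_in_column(rows):
    print(row)
```

Because the pattern is a parameter, the same sketch would cover other delimiters, provided they are escaped correctly for a regular expression.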
On Tue, Feb 22, 2011 at 2:28 AM, Jennifer Jackson <jen@bx.psu.edu> wrote:
Hi Felix,
"Text Manipulation -> Convert delimiters to TAB" could split one field into more than one, but the delimiter has to be in the list ("@" is not).
"Text Manipulation -> Cut columns" from a table is similar, but it will not split on a "@" either.
"Text Manipulation -> Trim leading or trailing characters" could be used for this specific case, since you can trim off the end of a column based on a position (but again, not a specified delimiter). To prep for an entire genome, you would need to break up the starting query so that the chromosome name lengths in any derivative queries are of a consistent length, then merge back together.
Perhaps the "@" was just an example and one of these tools will work for you. If you are customizing, additions to the Tool Shed that expand the native tools are always welcome! http://community.g2.bx.psu.edu
I've been planning to write a Galaxy tool to split a column on a given delimiter (e.g. @ for this example, or | for NCBI style identifiers), which would solve this use case nicely. I haven't done it yet though - so if anyone else wants to write such a tool first, please go ahead.
Specifically I would be aiming to expose the Python split and rsplit string method functionality, so the user would have to specify the number of splits (or perhaps more intuitively the number of columns to make) and if it should start on the left (default) or on the right.
Peter
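A rough idea of what such a tool might do internally, as a minimal sketch only. It relies on the standard `str.split` and `str.rsplit` methods mentioned above; the function `split_column`, its parameter names, and the example identifiers are hypothetical and do not come from any existing Galaxy tool.

```python
def split_column(line, column, delimiter, max_splits=1, from_right=False):
    """Split one tab-separated field into max_splits + 1 new columns.

    `column` is zero-based. With from_right=True the splitting starts at
    the right-hand end of the field (str.rsplit), otherwise at the left
    (str.split). Hypothetical helper for illustration only.
    """
    fields = line.rstrip("\n").split("\t")
    target = fields[column]
    if from_right:
        parts = target.rsplit(delimiter, max_splits)
    else:
        parts = target.split(delimiter, max_splits)
    # Pad with empty strings so every row ends up with the same column count,
    # even when the delimiter is missing from a particular field.
    parts += [""] * (max_splits + 1 - len(parts))
    return "\t".join(fields[:column] + parts + fields[column + 1:])

# Made-up example rows.
print(split_column("chr1@sdfasdfasdfasdf\t100", 0, "@"))
print(split_column("gi|12345|ref|NM_000001.1\t5", 0, "|", from_right=True))
```

Asking the user for the number of columns instead of the number of splits would only change the arithmetic: columns = max_splits + 1.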
Hi Peter, sounds nice, would be a great feature.
For everyone else who is not using a custom server: if you are lucky you can use the trim tool on tabs to solve your problem.
If you want to add text to the beginning or end of a column:
- Use add column and add the text as a new column
- Then use cut to get everything in the right order
- Finally join the new column with the column you want to add the text to
I hope you get what I mean and that this is helpful to someone.
Using these tricks I have created a workflow that serves as my custom SAM to GFF converter. Of course it's terribly inefficient, but it gets the job done.
thx, Felix
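For comparison, the net effect of the add column / cut / join steps above (adding fixed text to the start or end of one column) can be sketched in Python. This does not reproduce the Galaxy tools themselves; the function `add_text_to_column` and the sample row are invented for illustration.

```python
def add_text_to_column(line, column, text, prepend=True):
    """Add fixed text to the start or end of one tab-separated field.

    `column` is zero-based. Hypothetical helper showing the end result of
    the add column, cut, and join steps, not how Galaxy implements them.
    """
    fields = line.rstrip("\n").split("\t")
    if prepend:
        fields[column] = text + fields[column]
    else:
        fields[column] = fields[column] + text
    return "\t".join(fields)

# Made-up example: turn a bare chromosome number into a "chr" name.
print(add_text_to_column("1\t100\t200", 0, "chr"))
```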
participants (3)
- Felix Hammer
- Jennifer Jackson
- Peter Cock