On 8/27/12 9:44 AM, Jennifer Jackson wrote:
Post to mailing list
On 8/26/12 11:38 AM, LeeMSilver@genepeeks.com wrote:
Hi Jen,
The ability to use Python commands with the Compute tool seems to be a very well-hidden gem in Galaxy. The problem with this nearly unmentioned gem is the sheer frustration felt when something that works on the Python command line fails on the Galaxy Compute tool.
What I would like to do is manipulate a column from a UCSC download that lists two or three codons separated by a comma, e.g. AAC,GCG, or GGT, CAC, TAT, . The string-based split command < c5.split(",").pop(1) > fails on this data because Galaxy assigned it a <list> type automatically. All of my attempts to change the data type to "string" have failed.
Any suggestions?
Lee LeeMSilver@genepeeks.com
-- Jennifer Jackson http://galaxyproject.org
On Aug 27, 2012, at 2:04 PM, Jennifer Jackson <jen@bx.psu.edu> wrote:
Hi Lee,
There are a few options .. my guess is you are working with files like snpXXXCodingDbSnp?
1 - Using an expression such as this one will extract individual characters from the column, including the commas (treating the column's data like a string)
c5.pop(1)
= extracts the second character from the column
2 - Or, if you want to work with the entire codons, you could use the tool "Convert delimiters to TAB" to fully expand the data. You might want to use other Text Manipulation tools such as "Cut" with this option to get access to specific columns. Or "Condense consecutive characters" if the extra trailing commas in some of these datasets create empty columns (after converting to tabs).
Hopefully this helps! If there is more feedback from our developers, we'll post more. Others on the list are also welcome to add in comments.
Best,
Jen
Galaxy team
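The difference between Lee's `c5.split(",").pop(1)` and Jen's `c5.pop(1)` can be sketched in plain Python. This is only an illustration under one assumption that the thread does not confirm: that the Compute tool casts a column it has typed as a list with Python's `list()`, which would break the text into single characters. The sample value and the stand-in variable `c5` are illustrative.

```python
# Assumption (not confirmed in the thread): a "list"-typed column is cast
# with list(), so the raw text becomes a list of single characters rather
# than a list of codons.
raw = "AAC,GCG,"   # sample column value taken from Lee's message
c5 = list(raw)     # hypothetical stand-in for the tool's type cast

# A Python list has no split() method, so the string-based expression
# raises AttributeError instead of returning codons.
try:
    c5.split(",")
    split_error = None
except AttributeError as exc:
    split_error = str(exc)

# pop(1) removes and returns the element at index 1, which under this
# assumption is the second character of the column, not the second codon.
second_char = list(raw).pop(1)
```

If the assumption holds, this would explain both the failure Lee reports and why Jen's expression returns one character at a time.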
Thanks Jen,
Yes, I am working with codingDbSnp files. The solutions you suggest do the work, but they're clunky in requiring a bunch of operations to extract a whole codon or to determine whether a particular interval has two or three codons. So if someone has a one or two line solution, it would be much appreciated.
Lee
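One possible shorter route, offered as a sketch and not as a tested Galaxy recipe: if the column really does arrive as a list of single characters (an assumption, as is the helper name `codons_from_chars`), the characters can be joined back into a string and then split on commas. Whether the Compute tool accepts this exact expression is not confirmed in this thread.

```python
def codons_from_chars(chars):
    """Rebuild codons from a column that arrived as a list of characters.

    Hypothetical helper: joins the characters, drops stray spaces, and
    discards the empty field left by a trailing comma.
    """
    text = "".join(chars).replace(" ", "")
    return [codon for codon in text.split(",") if codon]


# Sample value from Lee's message, cast the way the tool is assumed to cast it.
c5 = list("GGT,CAC,TAT,")

codons = codons_from_chars(c5)
second_codon = codons[1]    # a whole codon, not a single character
codon_count = len(codons)   # tells two-codon rows from three-codon rows

# The same idea written as a single expression:
one_liner = "".join(c5).rstrip(",").split(",")[1]
```

The single expression at the end is the form closest to what could be typed into an expression box; the helper only makes the handling of the trailing comma explicit.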
participants (2)
- Jennifer Jackson
- Lee M. Silver