Count characters in column
Dear Galaxy Community,

I frequently use pileup files in Galaxy. The sequence read column contains characters like .,AaCcTtGgNn$* etc. I would like to be able to add a column to my tables that represents the number of instances of a given character in the read column.

I can do this in Excel with this type of formula:

=len(I2)-len(substitute(I2,"A",""))

But of course, Excel can only handle a limited number of total lines. Do we have this functionality in Galaxy? If so, how? If not... please?

Thank you,

Gregory Thyssen, PhD
Molecular Biologist
Cotton Fiber Bioscience
USDA-ARS-Southern Regional Research Center
1100 Robert E Lee Blvd
New Orleans, LA 70124
gregory.thyssen@ars.usda.gov
504-286-4280
Hi Greg,

You can do this using the "Compute" tool in the Text Manipulation section. Your expression in this case, to add a column with the number of "A" nucleotides in column 1, would be something like:

str(c1).count("A")

Thanks for using Galaxy!

-Dannon

On Fri, Mar 15, 2013 at 5:05 PM, Thyssen, Gregory - ARS <Gregory.Thyssen@ars.usda.gov> wrote:
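For readers who want to check the expression outside Galaxy, here is a minimal Python sketch of what str(c1).count("A") computes on one tab-separated pileup line. The helper names (count_char, add_count_column) and the example line are hypothetical and used only for illustration; this is not the Compute tool's own code. Note that str.count is case-sensitive, so "A" and "a" are counted separately, which matches pileup's use of upper case for forward-strand and lower case for reverse-strand bases. A plain character count will also pick up letters that appear inside indel annotations or as the mapping-quality character after "^", so treat the result as approximate on such lines.

```python
def count_char(read_bases, char):
    """Return the number of occurrences of char in read_bases (case-sensitive)."""
    return str(read_bases).count(char)


def add_count_column(line, column, char):
    """Append a count of char found in the given 1-based column of a tab-separated line."""
    fields = line.rstrip("\n").split("\t")
    fields.append(str(count_char(fields[column - 1], char)))
    return "\t".join(fields)


# Hypothetical pileup line: chromosome, position, reference base, depth, read bases, qualities
example = "chr1\t100\tG\t6\t.,AaA$*\tIIIIII"

print(add_count_column(example, 5, "A"))  # appends 2 as a new last column
print(count_char(".,AaA$*", "a"))         # lower-case "a" is counted separately
```

To count both strands together, one option is to upper-case the column first, for example str(c5).upper().count("A") if the read bases are in column 5.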
participants (2)
- Dannon Baker
- Thyssen, Gregory - ARS