Hi Yang,

I am going to give you a method to do this. In short, you'll split the dataset into three parts, alter two of them, then merge the three resulting datasets back together. Once you have completed the method, a workflow can be extracted from the history and saved for future use.

1. Use 'Filter and Sort -> Select'. The default pattern would match all of the lines in your dataset. Alter it to create three files, using "Matching" for all three:

   All chroms, minus ChrM and ChrC:   ^chr([0-9])+
   ChrM:                              ^ChrM
   ChrC:                              ^ChrC

2. For the ChrM and ChrC datasets, use 'Text Manipulation -> Add column' on each file individually. The new column should hold the final desired form, e.g. "chrM" or "chrC".

3. For both results, use 'Text Manipulation -> Cut' to replace column 1 with the new column, that is, output the new column first, followed by the original columns from 2 onward.

4. Use the tool "Concatenate datasets" to combine the three files again, using the new results.

5. Reassign the metadata as needed using the pencil icon.

These tools all work on datatype "tabular" and generally on other text data, but assign a dataset to "tabular" format using the pencil icon if it is not recognized by a tool. This is fine until the last step, where you can set it back to GFF.

On 1/14/14 11:17 AM, Yang Bi wrote:
Hi Jen:
I still have a little problem with the chromosome names. It appears that the mitochondria genes and chloroplast genes are named "ChrC" and "ChrM" in the gff3 file which I need to change to "chrC" and "chrM". How do I change cases specifically for the initial letters and not the entire words?
Thanks,
Yang
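As a side note on the question above: for anyone working outside of Galaxy, the same first-letter case change can be sketched in a few lines of Python. This is only an illustration, not a Galaxy tool. The function names (fix_seqid, fix_lines) are made up for this example, and it assumes a tab-delimited GFF3 file where only the exact names "ChrM" and "ChrC" in column 1 need to change.

```python
import re

# Hypothetical helper, not part of Galaxy: matches only the exact
# sequence names "ChrM" and "ChrC" (assumption: no other names change).
ORGANELLE = re.compile(r"^Chr([MC])$")


def fix_seqid(line):
    """Lowercase the leading 'C' of ChrM/ChrC in column 1 of one GFF3 line."""
    # Leave comments, directives (##...) and blank lines untouched.
    if line.startswith("#") or not line.strip():
        return line
    fields = line.split("\t")
    # Only the first column (the sequence name) is rewritten.
    fields[0] = ORGANELLE.sub(r"chr\1", fields[0])
    return "\t".join(fields)


def fix_lines(lines):
    """Apply fix_seqid to every line, preserving the original order."""
    return [fix_seqid(line) for line in lines]


if __name__ == "__main__":
    sample = [
        "##gff-version 3\n",
        "chr1\tsrc\tgene\t1\t100\t.\t+\t.\tID=g1\n",
        "ChrM\tsrc\tgene\t1\t100\t.\t+\t.\tID=g2\n",
        "ChrC\tsrc\tgene\t1\t100\t.\t-\t.\tID=g3\n",
    ]
    for out in fix_lines(sample):
        print(out, end="")
```

Unlike the split-and-concatenate method, this keeps the lines in their original order, and attribute values in column 9 that happen to contain "ChrM" are left alone because only column 1 is examined.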
-- Jennifer Hillman-Jackson http://galaxyproject.org