On 9/18/13 7:36 AM, D. A. Cowart wrote:
Hello,
I need to perform an action (or series of actions) on a 454
dataset using Galaxy, and have not been able to figure out the
necessary steps, even after looking through the toolbar
expressions and using custom search.
My file is in FASTA format and has the standard layout:
>GNJQDEZ01A940A
CTGAGTCAGGTCAACAATCATAAGATATTGGCACCATGTACCTGTGGTTCTCGTTTCC
ATGTTA
>GNJQDEZ01BJYQZ
CTGAGTCAGGTCAACAATCATAAGACATCGGCTCTCTATATTTAATATTGGT
Each of the 100,000 sequences within this file contains a
specific tag, which is the first 8 nucleotides.
There are 19 tags total. I would like to identify these tags
and add an identifier of the tag to the sequence name.
Therefore, if I am looking for the first tag (CTGAGTCA), the
output would look like:
>GNJQDEZ01A940A_Tag1
CTGAGTCAGGTCAACAATCATAAGATATTGGCACCATGTACCTGTGGTTCTCGTTTCC
ATGTTA
Is it possible to achieve this using Galaxy? If so,
could you kindly suggest tools to use?
Thank you in advance,
Dominique Cowart
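
For reference, the relabeling described above can be sketched outside Galaxy in a few lines of Python. This is only an illustration of the logic, not a Galaxy tool: the function names (parse_fasta, label_by_tag), the TAGS table, and the 60-character output line width are assumptions made for the example. Only the first tag (CTGAGTCA) comes from the message; the other 18 tags would have to be added to the table by hand.

```python
# Hypothetical tag table. Only CTGAGTCA -> Tag1 is given in the message;
# the remaining 18 tags are unknown here and must be filled in by the user.
TAGS = {
    "CTGAGTCA": "Tag1",
}


def parse_fasta(lines):
    """Yield (name, sequence) pairs; a sequence may span several lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def label_by_tag(lines, tags, tag_len=8, width=60):
    """Return FASTA lines with '_<label>' appended to each matching header.

    The tag is taken as the first `tag_len` nucleotides of the sequence.
    Records whose tag is not in `tags` keep their original header.
    """
    out = []
    for name, seq in parse_fasta(lines):
        label = tags.get(seq[:tag_len].upper())
        out.append(f">{name}_{label}" if label else f">{name}")
        # Re-wrap the sequence at a fixed width (an assumed choice).
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return out
```

To apply it to a file, something like label_by_tag(open("reads.fasta"), TAGS) would return the relabeled lines, where reads.fasta is a placeholder file name. Sequences whose first 8 nucleotides match no known tag are passed through unchanged, so they can be inspected afterwards.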