You are correct: by default, the fetch sequence tool writes only
identifiers, not description text, to the fasta output. Most Galaxy
tools that use fasta files do not preserve description text anyway, so
be sure to include key names in the identifier itself. The identifier
can also be modified after the fasta file is created, using tools from
the text manipulation tool set.
Once you have the fasta file, start by converting the file type from
fasta to tabular. From there, alter the sequence identifier to be any
value you want by joining in new columns of data (other identifiers),
merging together existing columns (e.g. converting spaces to
underscores), adding new columns, and similar manipulations. You may
want to cut out the sequence column and set it aside, work on the
identifier, then merge it all back together at the end. The only
requirement is that the end dataset is a two-column tabular file.
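
If it helps to see the idea outside of Galaxy, here is a minimal Python
sketch of the fasta-to-tabular step and an identifier rewrite. It is not
what the Galaxy tools run internally. The function names
(fasta_to_rows, rename_ids), the sample records, and the "geneA" label
are all made up for illustration, and it assumes you want spaces in the
header turned into underscores.

```python
def fasta_to_rows(fasta_text):
    """Turn FASTA text into a list of [identifier, sequence] rows."""
    rows = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # New record: keep the whole header line (minus ">") as the identifier
            rows.append([line[1:], ""])
        elif rows:
            # Sequence line: append to the current record, joining wrapped lines
            rows[-1][1] += line
    return rows


def rename_ids(rows, extra_names=None):
    """Replace whitespace in identifiers with underscores and, when a label
    is supplied for an original identifier, append it (hypothetical helper)."""
    extra_names = extra_names or {}
    renamed = []
    for identifier, sequence in rows:
        new_id = "_".join(identifier.split())
        if identifier in extra_names:
            new_id = new_id + "_" + extra_names[identifier]
        renamed.append([new_id, sequence])
    return renamed


if __name__ == "__main__":
    example = ">chr1 100 200\nACGT\nGGCC\n>chr2 300 400\nTTAA\n"
    rows = rename_ids(fasta_to_rows(example), {"chr1 100 200": "geneA"})
    for identifier, sequence in rows:
        print(identifier + "\t" + sequence)
```

Each printed line is one identifier and one sequence separated by a
single tab, which is the same two-column layout described above.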
It is very important that there is no extra white space - only one tab
between the two columns. The first column is the identifier, the second
the sequence. Next, convert this to fasta format as the final product.
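
To make the whitespace rule concrete, here is a small Python sketch of
the tabular-to-fasta step that refuses any line that is not exactly
"identifier, one tab, sequence". It only illustrates the check and is
not the Galaxy converter itself. The function name tabular_to_fasta and
the 60-character line width are assumptions chosen for the example.

```python
def tabular_to_fasta(tabular_text, width=60):
    """Convert two-column tabular text (identifier<TAB>sequence) to FASTA.

    Raises ValueError if a line does not have exactly two tab-separated
    columns or if either column is empty or contains whitespace.
    """
    records = []
    for number, line in enumerate(tabular_text.splitlines(), start=1):
        if not line:
            continue  # ignore completely empty lines
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError("line %d: expected 2 columns, found %d"
                             % (number, len(fields)))
        identifier, sequence = fields
        if not identifier or not sequence:
            raise ValueError("line %d: empty column" % number)
        if any(ch.isspace() for ch in identifier + sequence):
            raise ValueError("line %d: extra whitespace" % number)
        # Wrap the sequence to a fixed line width
        wrapped = [sequence[i:i + width] for i in range(0, len(sequence), width)]
        records.append(">" + identifier + "\n" + "\n".join(wrapped))
    if not records:
        return ""
    return "\n".join(records) + "\n"


if __name__ == "__main__":
    print(tabular_to_fasta("id_1\tACGTACGT\nid_2\tTT\n", width=4), end="")
```

A stray space in the identifier or a doubled tab raises an error here
instead of quietly producing a malformed fasta file.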
This will take some experimentation, but these are very powerful tools
that can do most of what can be done on the Unix command line or with
simple scripting. Once you work out a process that you like, it can be
saved in a workflow, so that next time you want to do the same thing,
you can just run the workflow instead of running a batch of tedious
steps. Or, at a minimum, a saved workflow will provide you with a
starter set of functions custom to your type of projects.
Best wishes and if you have questions about a particular tool, please
let us know,
On 11/9/11 4:06 AM, Lawrence Mckechnie wrote:
I uploaded a tab-delimited file (this was constructed within R using
write.table) into Galaxy with chr, start, end, and esembl_TSS_name.
Whilst I am able to use the fetch sequence function, I am currently not
able to include the Ensembl ID in the FASTA output. I am able to include
the Ensembl name in the interval format but not in FASTA format.