I'm picturing a select parameter for FASTA output,
Name features using:
* build, reference, co-ordinates and strand (default)
* name from annotation file (if present)
* reference name (useful if working on genes/proteins)
Agreed.
If name is selected, then a conditional text parameter for GFF type files would be shown to ask which tag(s) to use as the name - a comma-separated list might work well: http://lists.bx.psu.edu/pipermail/galaxy-dev/2011-August/006432.html
Yes, this is a limitation of the current Galaxy framework, but it should be possible to implement without too much trouble.
This could default to ID for GFF3, and transcript_id,gene_id for GTF, and whatever else is sensible for GFF2. Or a single default suitable for all: ID,transcript_id,gene_id
Maybe we don't need the tag setting to be optional, just hard code it to something like ID,transcript_id,gene_id?
As a first step, hardcoding is fine.
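To make the hardcoded option concrete, here is a minimal sketch of picking a feature name from column 9 by trying ID, transcript_id and gene_id in order. It is not Galaxy code: the names DEFAULT_NAME_TAGS, parse_attributes and feature_name are hypothetical, and it assumes attributes are separated by semicolons with no semicolons inside quoted values.

```python
import re

# Hypothetical hardcoded default: tags are tried in this order.
DEFAULT_NAME_TAGS = ("ID", "transcript_id", "gene_id")

# GFF3 style: tag=value (the tag is immediately followed by "=").
GFF3_PAIR = re.compile(r'^([^\s=]+)=(.*)$')
# GTF style: tag "value" (tag and value separated by whitespace).
GTF_PAIR = re.compile(r'^(\S+)\s+(.*)$')


def parse_attributes(column9):
    """Parse the attributes column of a GFF3 or GTF line into a dict.

    Simplification: fields are split on ";", so a quoted value that
    itself contains a semicolon is not handled.
    """
    attributes = {}
    for field in column9.split(";"):
        field = field.strip()
        if not field:
            continue
        match = GFF3_PAIR.match(field) or GTF_PAIR.match(field)
        if match is None:
            continue  # bare token with no value (seen in some old GFF2 files)
        tag, value = match.groups()
        attributes[tag] = value.strip().strip('"')
    return attributes


def feature_name(column9, tags=DEFAULT_NAME_TAGS, fallback=None):
    """Return the value of the first tag in `tags` that is present."""
    attributes = parse_attributes(column9)
    for tag in tags:
        if tag in attributes:
            return attributes[tag]
    return fallback
```

With this ordering a GFF3 line is named by its ID, while a GTF line (which normally has no ID tag) falls through to transcript_id and then gene_id. When none of the tags is present the caller supplies the fallback, e.g. the co-ordinate based name.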
Is it acceptable for the file format conversion tools in Galaxy to have parameters? In this case, a list of tags to use as the feature name, e.g. ID, transcript_id, gene_id
Not that I know of because Galaxy assumes conversions can be done automatically as needed.
Finally, note that all changes made to any GFF code must work for GFF, GFF3, and GTF formats.
That makes life interesting... what are the major sources of legacy GFF files within Galaxy (anything not GFF3)?
Perhaps I spoke a bit too strongly here. I think that GTFs are the primary flavor of GFF files used in Galaxy, and these are acquired from UCSC and Ensembl. GFF3 files also seem to be used quite frequently, especially by folks working with bacteria and other simple organisms. GFF 2.2 and earlier aren't seen much, as far as I know. So let me rephrase and say that any changes need to be compatible with GTF and GFF3. Best, J.