I have a tool that takes .gtf input. When I upload a .gtf file, it is automatically recognised as a .gff file, so I have to change the format to gtf manually. I know gff and gtf are very similar, but is it possible to have a gtf sniffer? Out of interest, is there any documentation on writing sniffers for different datatypes? I should probably have a go at writing a few of my own. Thanks. Shaun Webb -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
I know gff and gtf are very similar but is it possible to have a gtf sniffer?
There is a GTF sniffer in Galaxy, and it should detect GTF files as such, assuming your datatypes_conf.xml file is set up appropriately. To correctly sniff GTF files, make sure you have the following Gtf line in your <sniffers> section and that it appears above the GFF entry:

    ...
    <sniffer type="galaxy.datatypes.interval:Gtf"/>
    <sniffer type="galaxy.datatypes.interval:Gff"/>
    <sniffer type="galaxy.datatypes.interval:Gff3"/>
    ...

If you have a GTF file that's still not being recognized, please send it our way and we'll take a look.
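For anyone who wants to check their own configuration, here is a minimal sketch that tests whether the Gtf sniffer is listed before the Gff sniffer. It is a standalone script, not part of Galaxy; the function name and the <datatypes> wrapper element in the sample are assumptions made for illustration, and only the three <sniffer> lines come from the reply above.

```python
import xml.etree.ElementTree as ET

GTF = "galaxy.datatypes.interval:Gtf"
GFF = "galaxy.datatypes.interval:Gff"


def gtf_sniffer_precedes_gff(conf_xml):
    """Return True if the Gtf sniffer is listed before the Gff sniffer.

    Hypothetical helper for illustration; it only looks at <sniffer>
    elements and their "type" attribute, in document order.
    """
    root = ET.fromstring(conf_xml.strip())
    types = [s.get("type") for s in root.iter("sniffer")]
    if GTF not in types:
        return False  # no Gtf sniffer configured at all
    if GFF not in types:
        return True   # nothing for the Gtf sniffer to be shadowed by
    return types.index(GTF) < types.index(GFF)


# Sample fragment; the enclosing <datatypes> element is an assumption.
SAMPLE = """
<datatypes>
  <sniffers>
    <sniffer type="galaxy.datatypes.interval:Gtf"/>
    <sniffer type="galaxy.datatypes.interval:Gff"/>
    <sniffer type="galaxy.datatypes.interval:Gff3"/>
  </sniffers>
</datatypes>
"""

if __name__ == "__main__":
    print(gtf_sniffer_precedes_gff(SAMPLE))
```

To run it against a real file, read the file contents into a string and pass that string to the function.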
Out of interest, is there any documentation relating to writing sniffers for different datatypes? I should probably have a go at writing a few of my own.
I don't see anything in our wiki off hand. However, if you're going to write your own, looking at an existing sniffer should make it clear what needs to happen. For instance, see any of the sniff() functions in /lib/galaxy/datatypes/interval.py Best, J.
I don't see anything in our wiki off hand. However, if you're going to write your own, looking at an existing sniffer should make it clear what needs to happen. For instance, see any of the sniff() functions in /lib/galaxy/datatypes/interval.py
I was mistaken about the lack of documentation about sniffing. Here's some useful documentation within the context of adding a datatype to Galaxy: http://bitbucket.org/galaxy/galaxy-central/wiki/AddingDatatypes J.
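As a rough idea of what a sniffer does, here is a minimal standalone sketch that inspects the first few data lines and decides whether they look GFF-like. This is not Galaxy's code: Galaxy's own sniff() methods have their own signature and helpers, which are not reproduced here, and the function name, the max_check parameter, and the specific checks are assumptions chosen for illustration.

```python
def looks_like_gff(lines, max_check=10):
    """Heuristic check that the inspected data lines have a GFF-like shape.

    Hypothetical example, not Galaxy's sniffer. A data line passes if it has
    9 tab-separated fields, integer start/end columns, and a valid strand.
    """
    checked = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) != 9:
            return False
        start, end, strand = fields[3], fields[4], fields[6]
        if not (start.isdigit() and end.isdigit()):
            return False
        if strand not in ("+", "-", "."):
            return False
        checked += 1
        if checked >= max_check:
            break
    # Require at least one data line, so an empty file does not match.
    return checked > 0


if __name__ == "__main__":
    sample = ['chr1\tsrc\texon\t1\t680\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n']
    print(looks_like_gff(sample))
```

The design choice worth noting is that a sniffer reads only a handful of lines and returns a yes/no answer, which is why the order of sniffers matters when two formats share the same column layout.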
File attached now, thanks.
Thanks, the gtf sniffer was not in the datatypes_conf.xml.sample so I assumed it did not exist. I have updated this to include the gtf sniffer but my file is still not being recognised. I have attached the file, any help would be great. Thanks Shaun
I have attached the file, any help would be great.
Here's the first line in the file:

--
Mito intergenic_50 exon 1 680 . + . gene_id "INT50_3749" ; transcript_name "INT50_3749"; transcript_id "INT50_3749"; gene_name "INT50_3749" ; Note "intergenic regions 50 nt from a coding sequence"
--

As it turns out, the file is not technically a GTF file. The GTF spec is here:

http://genome.ucsc.edu/FAQ/FAQformat#format4

and the requirement broken by this file is:

--
The attribute list must begin with the two mandatory attributes:
• gene_id value - A globally unique identifier for the genomic source of the sequence.
• transcript_id value - A globally unique identifier for the predicted transcript.
--

In this file, transcript_name sits between gene_id and transcript_id, so the attribute list does not begin with the two mandatory attributes.

Where did you get this file? If there are prominent tools that are producing almost-but-not-quite-compliant GTF files, it may be necessary for Galaxy to relax its requirements a bit.

Thanks,
J.
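The requirement quoted above can be checked with a short sketch like the following. It is only an illustration of the rule as stated in the reply, not Galaxy's Gtf sniffer; the function names are hypothetical, and the parsing is deliberately naive (it splits on semicolons, so it would mis-parse a quoted value that itself contains a semicolon).

```python
def attribute_keys(attributes):
    """Return the attribute names from a GTF attribute column, in order.

    Naive parsing for illustration: split on ';' and take the first
    whitespace-delimited token of each non-empty chunk as the key.
    """
    keys = []
    for chunk in attributes.split(";"):
        chunk = chunk.strip()
        if chunk:
            keys.append(chunk.split(None, 1)[0])
    return keys


def starts_with_mandatory_attributes(attributes):
    """True if the list begins with gene_id followed by transcript_id."""
    return attribute_keys(attributes)[:2] == ["gene_id", "transcript_id"]


# Attribute column from the first line of the file discussed above.
ATTRS = ('gene_id "INT50_3749" ; transcript_name "INT50_3749"; '
         'transcript_id "INT50_3749"; gene_name "INT50_3749" ; '
         'Note "intergenic regions 50 nt from a coding sequence"')

if __name__ == "__main__":
    print(attribute_keys(ATTRS))
    print(starts_with_mandatory_attributes(ATTRS))
```

For the line from this thread the check fails, because transcript_name comes second. Moving transcript_id directly after gene_id would satisfy this particular rule.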
participants (2): Jeremy Goecks, Shaun Webb