GFF3 and metagenome data?
Hi all,

Does anybody have an idea how to do the following in Galaxy?

I have short (400 bp) metagenome reads, and I have used MetaGeneMark to find protein-coding regions in the unassembled reads. MetaGeneMark outputs a GFF3 file (a sample is at the bottom of this post). I saw that Galaxy has a tool that fetches sequences from a genome file using a GFF-format file: "Extract Genomic DNA using coordinates from assembled/unassembled genomes" <http://main.g2.bx.psu.edu/tool_runner?tool_id=Extract+genomic+DNA+1>. I would like to use that tool, if possible. The problem, however, is that I get the following error:

Unspecified genome build, click the pencil icon in the history item to set the genome build.

Of course I have no genome, so I am a bit stuck, and I have no clue how to use the coordinates in my GFF file to extract those regions from my metagenome reads. Does anybody have an idea for a proper workflow?

Thomas

GFF3 output:

##source-version GeneMark.hmm_PROKARYOTIC 2.7d
##date Thu Mar 24 06:15:18 2011
##Sequence file name: ghm.mfa
##Model file name: /home/genmark/public_html/metagenome/Prediction/bin_MetaGeneMark/MetaGeneMark_v1.mod
FV4B4XA01C8BBF GeneMark.hmm gene 1 513 . + 0 gene_id 1
FV4B4XA01D6PDN GeneMark.hmm gene 2 334 . + 0 gene_id 2
FV4B4XA01DC6SS GeneMark.hmm gene 1 390 . - 0 gene_id 3
FV4B4XA01AOJUF GeneMark.hmm gene 2 400 . - 0 gene_id 4
FV4B4XA01CMP07 GeneMark.hmm gene 1 465 . + 0 gene_id 5
FV4B4XA01CIPQZ GeneMark.hmm gene 1 228 . + 0 gene_id 6
FV4B4XA01DWJZ1 GeneMark.hmm gene 1 459 . - 0 gene_id 7
FV4B4XA01AUE58 GeneMark.hmm gene 237 488 . + 0 gene_id 8
FV4B4XA01C56SJ GeneMark.hmm gene 1 309 . + 0 gene_id 9
FV4B4XA01C56SJ GeneMark.hmm gene 321 422 . + 0 gene_id 10
FV4B4XA01A3DSA GeneMark.hmm gene 3 143 . + 0 gene_id 11
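To make the goal concrete, here is a rough sketch of the extraction done outside Galaxy in plain Python. It is not a Galaxy workflow and not part of MetaGeneMark. It assumes the first GFF column holds the read name exactly as it appears in the FASTA header, that coordinates are 1-based and inclusive, and that no column before the last contains a space. The function names and the toy data are made up for illustration.

```python
import io

# Translation table for complementing DNA (assumption: only A/C/G/T/N occur).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(handle):
    """Read a (multi-)FASTA file into a dict: read name -> sequence."""
    reads = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                reads[name] = "".join(chunks)
            # Keep only the first word of the header as the read name.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        reads[name] = "".join(chunks)
    return reads


def extract_regions(gff_handle, reads):
    """Yield (read name, attributes, subsequence) for each GFF feature."""
    for line in gff_handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        # Split on any whitespace; the 9th field keeps its internal spaces.
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue
        seqid, _source, _type, start, end, _score, strand, _phase, attrs = fields
        if seqid not in reads:
            continue  # feature refers to a read that is not in the FASTA
        # GFF is 1-based inclusive; Python slices are 0-based, end-exclusive.
        region = reads[seqid][int(start) - 1:int(end)]
        if strand == "-":
            region = reverse_complement(region)
        yield seqid, attrs, region


if __name__ == "__main__":
    # Toy data, invented for this example only.
    toy_reads = read_fasta(io.StringIO(">read1\nACGTACGTAA\n"))
    toy_gff = io.StringIO("read1 toy gene 2 5 . + 0 gene_id 1\n")
    for seqid, attrs, region in extract_regions(toy_gff, toy_reads):
        print(">%s %s\n%s" % (seqid, attrs, region))
```

With real data, the two `io.StringIO` objects would be replaced by open file handles for the reads file and the GFF file. Features whose end coordinate lies past the end of a read are simply truncated by the slice, so it may be worth checking for those separately.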
Participants (1): Thomas Haverkamp