Re: [galaxy-user] Metagenomic filtering
Dear Galaxy,
When running a HiSeq shotgun metagenomics sample from the environment through megablast and taxonomic representation, how do I filter out or remove all the 16S and other conserved sequences?
The problem is that when blasting a single organism that has a fraction of conserved sequence, the results will align with E. coli 10,000 times more than with the possible target organism. This data would be wrong and misleading. For example, a 100 mg sample that was negative for E. coli using the MUG test gives thousands of hits with Galaxy.
1) Is there a "filter conserved sequences" setting?
2) Is there a "remove model organisms" setting?
Scott Tighe
-- Core Laboratory Research Staff, Advanced Genome Technologies Core, Deep Sequencing (MPS) Facility, Vermont Cancer Center, 149 Beaumont Ave, University of Vermont HSRF 303, Burlington Vermont USA 05045, 802-656-AGTC, 802-999-6666 (cell)
Quoting Jennifer Jackson <jen@bx.psu.edu>:
Hello Elwood,
Are you still having connection issues today? Or is this resolved?
Best,
Jen Galaxy team
On 9/13/13 11:36 AM, Elwood Linney wrote:
A message sent earlier this week by me indicated that I could not connect to Galaxy via Fetch to download data.
A reply indicated a glitch was fixed.
I then could connect with Fetch and I tried to transfer 4 x 16gb files and the connection disconnected about 4 times.
Now, once again, I cannot connect with Galaxy online to transfer data.
Is this a problem that can be solved-either at my end or at Galaxy?
Elwood Linney
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
-- Jennifer Hillman-Jackson http://galaxyproject.org
Hi Scott,
The tool "Metagenomic analyses -> Find diagnostic hits" can be used to isolate the conserved sequences. Then use the tool "Join, Subtract and Group -> Compare" with the "Non Matching rows of 1st dataset" option to filter out anything you think is spurious for your analysis (put the original file first and the output of diagnostic hits second) before moving forward with the other summary tools.
You will probably want to run the "Find diagnostic hits" tool more than once. The choice is yours whether to run "Compare" after each run, or to "Text Manipulation -> Concatenate" all the results together first and then "Compare". The first might work faster; it depends on the size of your datasets (how much filtering occurred before this step, etc.). The "Compare" tool sorts and holds data in memory. Even if you need to break the data up and run it in smaller chunks, the results should be the same in the end. None of these jobs requires the data to be in one lump.
Others are welcome to add their own strategies; I am sure there are other ways to do this. Some of the public servers specializing in metagenomics may also have tools or options for this, and some of those may have been donated to the Tool Shed for local or cloud use. It may be worth a look: http://wiki.galaxyproject.org/PublicGalaxyServers
Good question!
Jen
Galaxy team
On 9/18/13 7:03 AM, Scott W. Tighe wrote:
Dear Galaxy
When running a HiSeq shotgun metagenomics sample from the environment through megablast and taxonomic representation, how do I filter out or remove all the 16S and other conserved sequences?
The problem is that when blasting a single organism that has a fraction of conserved sequence, the results will align with E. coli 10,000 times more than with the possible target organism. This data would be wrong and misleading. For example, a 100 mg sample that was negative for E. coli using the MUG test gives thousands of hits with Galaxy.
1) Is there a "filter conserved sequences" setting?
2) Is there a "remove model organisms" setting?
Scott Tighe
-- Jennifer Hillman-Jackson http://galaxyproject.org
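For readers who want to see what the "Compare" step with "Non Matching rows of 1st dataset" amounts to outside Galaxy, here is a minimal Python sketch. It is not Galaxy's implementation. It assumes both datasets are tab-separated and share a read identifier in the first column; the function names, the sample rows, and the three-column layout are hypothetical, chosen only for illustration. It also checks the point made above that running the comparison after each "Find diagnostic hits" run, or concatenating the runs first, should give the same result.

```python
import csv
import io


def read_rows(text):
    """Parse tab-separated text into a list of rows (lists of strings)."""
    return [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]


def non_matching_rows(first, second, key_col=0):
    """Return rows of `first` whose key column value never appears in `second`.

    This mirrors the idea of "Non Matching rows of 1st dataset"; the key
    column (0 here) is an assumption, not a Galaxy default.
    """
    exclude = {row[key_col] for row in second}
    return [row for row in first if row[key_col] not in exclude]


# Hypothetical hit table: read ID, subject ID, percent identity.
all_hits = read_rows(
    "read1\tsubjA\t99.0\n"
    "read2\tsubjB\t97.5\n"
    "read3\tsubjC\t98.2\n"
    "read4\tsubjD\t96.1\n"
)

# Hypothetical outputs of two separate "diagnostic hits" runs.
conserved_run1 = read_rows("read2\tsubjB\t97.5\n")
conserved_run2 = read_rows("read4\tsubjD\t96.1\n")

# Option A: compare after each run.
step = non_matching_rows(all_hits, conserved_run1)
after_each = non_matching_rows(step, conserved_run2)

# Option B: concatenate the runs first, then compare once.
concat_first = non_matching_rows(all_hits, conserved_run1 + conserved_run2)

# Both orders keep the same rows.
assert after_each == concat_first

kept_ids = [row[0] for row in after_each]
print(kept_ids)  # ['read1', 'read3']
```

Because the exclusion set is just a set of identifiers, the same logic applies if the first dataset is split into chunks and each chunk is filtered separately; only the exclusion set has to fit in memory in this sketch.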