Scatter and gather workflow
Hi,
I am a complete newbie investigating Galaxy. I found my question (scatter-gather data) on the mailing list from about 10 months ago, but did not see any action on it, hence I am asking again.
I want to create a sub-workflow in which I:
1. Scatter/split a file into little files.
2. Find the occurrences of each sequence from the little files within a huge reference file, and recreate the little files with the index positions (where each sequence appeared in the huge file) appended against each sequence.
3. Gather the little files back into one big file.
Can you tell me if it is possible to do this in Galaxy? If yes, are there any existing/built-in tools that I can use to do this?
Thank you,
Sonali Amonkar
DISCLAIMER ========== This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.
Hello Sonali,
If both the reference file and the little files contain sequences, then one of the NGS mapping tools would work, without the need to split and merge. This assumes that "index position" means "positional coordinates" with respect to the reference.
It is not exactly clear what you are trying to do, so if you would like to provide more information about the biological task you are building a workflow for, we can offer more help. Perhaps there is another method to obtain the result you want.
Best,
Jen
Galaxy team
(In reply to the message from Sonali Amonkar, 1/10/11 9:23 AM.)
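For readers following the thread, the three steps in the question (scatter, annotate each piece against the reference, gather) can be sketched in plain Python, outside Galaxy. This is only an illustration under loud assumptions: sequences are plain strings, "index position" means the zero-based offset of an exact match in the reference, and the function names (`scatter`, `find_positions`, `annotate_chunk`, `gather`, `scatter_gather`) are hypothetical, not Galaxy tools or APIs. Real reads would normally go through a mapping tool, as suggested in the reply.

```python
from typing import Iterable, Iterator, List, Tuple

Annotated = Tuple[str, List[int]]


def scatter(sequences: List[str], chunk_size: int) -> Iterator[List[str]]:
    """Step 1: split the input into consecutive chunks ("little files")."""
    for start in range(0, len(sequences), chunk_size):
        yield sequences[start:start + chunk_size]


def find_positions(seq: str, reference: str) -> List[int]:
    """Return every zero-based offset where seq occurs exactly in reference.

    Overlapping matches are included. Exact matching is an assumption made
    for this sketch; it is not what an NGS mapper does.
    """
    positions = []
    pos = reference.find(seq)
    while pos != -1:
        positions.append(pos)
        pos = reference.find(seq, pos + 1)
    return positions


def annotate_chunk(chunk: List[str], reference: str) -> List[Annotated]:
    """Step 2: pair each sequence in one chunk with its positions."""
    return [(seq, find_positions(seq, reference)) for seq in chunk]


def gather(chunks: Iterable[List[Annotated]]) -> List[Annotated]:
    """Step 3: concatenate the annotated chunks back in their original order."""
    merged: List[Annotated] = []
    for chunk in chunks:
        merged.extend(chunk)
    return merged


def scatter_gather(sequences: List[str], reference: str,
                   chunk_size: int = 2) -> List[Annotated]:
    """Run the three steps in sequence on in-memory data."""
    return gather(annotate_chunk(chunk, reference)
                  for chunk in scatter(sequences, chunk_size))


if __name__ == "__main__":
    for seq, hits in scatter_gather(["ACG", "TTT", "GGG"], "ACGTACGTTTACG"):
        print(seq, hits)
```

Because each chunk is annotated independently of the others, the per-chunk step is the part that could be run in parallel; the sketch runs it serially to keep the logic visible.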
_______________________________________________ galaxy-user mailing list galaxy-user@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-user
-- Jennifer Jackson http://usegalaxy.org
participants (2): Jennifer Jackson, Sonali Amonkar