________________________________
From: Jennifer Jackson <jen@bx.psu.edu>
To: Milad Bastami <mi.bastami@yahoo.com>
Cc: "galaxy-user@lists.bx.psu.edu" <galaxy-user@lists.bx.psu.edu>
Sent: Tuesday, February 12, 2013 6:56 PM
Subject: Re: [galaxy-user] Galaxy Join takes too long?
Hello Milad,
I am not sure if you are using the public Galaxy Main server (https://main.g2.bx.psu.edu) or your own local computer, or if the jobs are yellow and running or still grey and in the waiting-to-run stage.
If you are using Galaxy Main and the job is actually running (the dataset is yellow), this type of job can sometimes take a while on very large datasets.
To improve the chances of a successful run and to speed up any run, make sure to put the large dataset as the first input and the smaller file as the second input. The interval operations tools use an indexing strategy where the second input file is the portion loaded into memory and the first file is processed against it.
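To illustrate why the input order matters, here is a minimal Python sketch of the general idea: the second input is indexed in memory and the first is streamed against it. This is not Galaxy's actual implementation; the function names `build_index` and `join_overlapping` and the `(chrom, start, end)` tuple layout with half-open coordinates are assumptions made up for this example.

```python
from collections import defaultdict


def build_index(intervals):
    """Hold the second (smaller) input in memory, grouped by chromosome.

    Hypothetical helper: intervals are (chrom, start, end) tuples with
    half-open coordinates, sorted by start within each chromosome.
    """
    index = defaultdict(list)
    for chrom, start, end in intervals:
        index[chrom].append((start, end))
    for chrom in index:
        index[chrom].sort()
    return index


def join_overlapping(first, second):
    """Stream the first (larger) input against the in-memory index."""
    index = build_index(second)
    for chrom, start, end in first:
        for s, e in index.get(chrom, ()):
            if s >= end:
                break  # sorted by start, so no later interval can overlap
            if e > start:
                yield (chrom, start, end), (chrom, s, e)


large = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 10, 20)]
small = [("chr1", 150, 250), ("chr1", 190, 195), ("chr2", 20, 30)]
for pair in join_overlapping(large, small):
    print(pair)
```

In this sketch only `small` is ever held in memory in full, while `large` could just as well be read line by line from a file, which is why the smaller dataset belongs in the second position.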
Hopefully this helps,
Jen
Galaxy team
On 2/12/13 2:58 AM, Milad Bastami wrote:
I'm trying to join two large interval datasets (one with 800,000 intervals and the other with about 350,000 intervals) using the Operate on Genomic Intervals > Join tool. I have no idea how long it should normally take. Two days have passed and it is still running. Is there any file size limit for this tool?
Any help would be appreciated.
Regards,
Milad Bastami
PhD student of Medical Genetics
Department of Medical Genetics
Shahid Beheshti's university of Medical Science
___________________________________________________________
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/

Hi Jen,
Thanks for the information. I'm using the public Galaxy Main server and the job is yellow. That was a good point: I had put the large dataset as the second input. I will wait, and if it does not succeed I will rerun it as you suggested.
Milad Bastami
PhD student of Medical Genetics
Department of Medical Genetics
Shahid Beheshti's university of Medical Science
--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org