I am trying to perform a "join" on two sets of intervals. There are ~20,000 intervals in one dataset and about 13 million in the other. This has been running for about 3 days now, and I'm pretty certain that it's not going to work. Is there a way to know ahead of time whether there is enough memory available for a given function to run? Also, how much storage space does each account have available? I know that one can access cloud space online, through Amazon for example. However, that seems to be fairly complicated and a bit out of reach for me in the short term.
Hello Keith,
There are no quotas set at this time; however, for very large jobs we do recommend using a cloud instance or your own local instance. Getting CloudMan going should not be overly complicated. Did you have trouble with a particular part of the setup or execution that we could help with?
It is also possible that this job was simply run at a time when the load here was high. Have you been able to run the job successfully since you originally wrote? Does the job complete when you break the larger 13 million line file into smaller chunks (you could split the file, run the join on each piece, then merge the results)? The tools in Text Manipulation, and perhaps a Workflow, could be useful for this.
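For anyone working outside the public server, the split, join, and merge idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how the Galaxy Join tool is implemented: it assumes tab-separated, BED-like files with chromosome, start, and end in the first three columns and half-open coordinates, and the function names (`build_index`, `join_in_chunks`, and so on) are hypothetical.

```python
import bisect
import io
from itertools import islice


def parse_interval(line):
    """Read chrom/start/end from the first three tab-separated columns (assumed layout)."""
    chrom, start, end = line.rstrip("\n").split("\t")[:3]
    return chrom, int(start), int(end)


def build_index(small_lines):
    """Hold the small dataset in memory, grouped by chromosome and sorted by start."""
    index = {}
    for line in small_lines:
        if line.strip():
            chrom, start, end = parse_interval(line)
            index.setdefault(chrom, []).append((start, end))
    for intervals in index.values():
        intervals.sort()
    return index


def overlaps(index, chrom, start, end):
    """Yield indexed intervals that overlap the half-open interval [start, end)."""
    intervals = index.get(chrom, [])
    # Only intervals that start before `end` can overlap; bisect finds that cutoff.
    cutoff = bisect.bisect_left(intervals, (end,))
    # Simple linear scan of the candidates; fine for a small index, not an interval tree.
    for s, e in intervals[:cutoff]:
        if e > start:
            yield s, e


def chunks(handle, size):
    """Yield lists of at most `size` lines so the large file is never fully loaded."""
    while True:
        block = list(islice(handle, size))
        if not block:
            return
        yield block


def join_in_chunks(small_lines, large_handle, out_handle, chunk_size=100_000):
    """Join each chunk of the large file against the small index, appending to one output."""
    index = build_index(small_lines)
    written = 0
    for block in chunks(large_handle, chunk_size):
        for line in block:
            if not line.strip():
                continue
            chrom, start, end = parse_interval(line)
            for s, e in overlaps(index, chrom, start, end):
                out_handle.write(f"{chrom}\t{start}\t{end}\t{s}\t{e}\n")
                written += 1
    return written
```

Because every chunk is joined against the same in-memory index and written to a single output handle, the "merge" step is just the concatenation of the per-chunk results. Memory use is bounded by the small dataset plus one chunk of the large one.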
Please let us know how we can help,
Jen
Galaxy team
On 3/18/11 8:22 AM, Keith E. Giles wrote:
I am trying to perform a "join" on two sets of intervals. There are ~20,000 intervals in one dataset and about 13 million in the other. This has been running for about 3 days now, and I'm pretty certain that it's not going to work. Is there a way to know ahead of time whether there is enough memory available for a given function to run? Also, how much storage space does each account have available? I know that one can access cloud space online, through Amazon for example. However, that seems to be fairly complicated and a bit out of reach for me in the short term.

___________________________________________________________
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
galaxy-user@lists.galaxyproject.org