The problem was insufficient memory for the join function in bx.intervals.operations.join. I tested the join on several machines with different amounts of memory, and it succeeded only on the system with 16 GB. I also observed that the tool that joins the two datasets, galaxy-dist/tools/new_operations/gops_join.py, used around 10 GB of memory for the computation. The inputs were 37 MB of exons and 185 MB of repeats. On every system other than the 16 GB one, the job was terminated with 'Killed' when it ran out of memory.

I suspect the join function might have a memory 'leak'. I just wanted to share what I found.

Thanks, Hyungro

On Mon, Jul 1, 2013 at 10:57 AM, Hyungro Lee <hroe.lee@gmail.com> wrote:
Hi,
I am new here and am trying to use the Galaxy toolkit for my project in the Windows Azure cloud environment. I got errors when I ran the 101 example on my local machine, so I looked at 'stderr' via View details, and it showed only 'Killed', which is not much information to debug with. I looked at the web logs and found that the command being executed is:

    python /home/.../galaxy-dist/tools/new_operations/gops_join.py /home/.../galaxy-dist/database/files/000/dataset_1.dat /home/.../galaxy-dist/database/files/000/dataset_2.dat /home/.../galaxy-dist/database/files/000/dataset_3.dat -1 1,2,3,6 -2 1,2,3,6 -m 1 -f none

which joins two datasets. I believe the input datasets and the parameters are okay, since I tested them on the public Galaxy server.
To get more debugging information, I tried running the job from the shell but got the same result: Killed. Is there a way to get more information about the execution of the tool other than reading the source code line by line? That might help me figure out what I am doing wrong in this test.
Thanks, Hyungro
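
One way to get more than 'Killed' out of a run like this is to launch the command from a small wrapper that records which signal ended the process and how much memory it peaked at. The following is only a minimal sketch, not part of Galaxy or bx-python: the function name run_and_report is made up for illustration, it assumes a Unix-like system (the resource module is not available on Windows), and it assumes the wrapper itself runs under Python 3, independently of whatever interpreter the tool uses.

```python
import resource
import signal
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd (a list of arguments) and report how it ended.

    run_and_report is a hypothetical helper name, not a Galaxy API.
    """
    proc = subprocess.run(cmd)

    # Peak resident set size over all child processes this wrapper has
    # waited for so far. Linux reports kilobytes, macOS reports bytes.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    peak_mb = usage.ru_maxrss / divisor

    # A negative return code means the child was ended by that signal.
    sig_name = None
    if proc.returncode < 0:
        sig_name = signal.Signals(-proc.returncode).name

    return {"returncode": proc.returncode, "signal": sig_name, "peak_mb": peak_mb}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to run.
    report = run_and_report(sys.argv[1:])
    print("return code:", report["returncode"])
    print("signal:", report["signal"])
    print("peak memory (MB): %.1f" % report["peak_mb"])
```

Saved as, say, memwrap.py (a hypothetical file name), it would be run as "python3 memwrap.py python /home/.../galaxy-dist/tools/new_operations/gops_join.py ..." followed by the same arguments as before. A reported signal of SIGKILL together with a peak close to the machine's physical memory would be consistent with the kernel ending the process for lack of memory; on Linux the kernel log (for example via dmesg) usually records such kills as well.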