The problem was insufficient memory to run the join function in
bx.intervals.operations.join.
I ran the join on several machines with different amounts of memory. It
succeeded only on the system with 16 GB of memory.
I also observed that joining the two datasets with
galaxy-dist/tools/new_operations/gops_join.py used around 10 GB of memory
for the computation.
The inputs were a 37 MB exons dataset and a 185 MB repeats dataset.
On every system other than the 16 GB one, the process was terminated with
'Killed' when it ran out of memory. I suspect the join function might have
a memory 'leak'.
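
For anyone who wants to reproduce the memory measurement, here is a minimal
sketch of one way to record the peak memory of a child process from Python.
It uses only the standard library (subprocess and resource) and works on
Unix-like systems. The helper name run_and_report_peak_memory and the small
stand-in command are illustrative assumptions, not part of Galaxy or
bx-python; the real gops_join.py command line would be substituted for
demo_cmd.

```python
import resource
import subprocess
import sys


def run_and_report_peak_memory(cmd):
    """Run cmd (a list of arguments) and return (exit_code, peak_rss).

    peak_rss is the largest resident set size among the child processes
    this interpreter has waited for so far. On Linux the unit is
    kilobytes; other platforms may report a different unit.
    """
    exit_code = subprocess.call(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return exit_code, usage.ru_maxrss


# Hypothetical stand-in for the real tool invocation: a short-lived child
# Python process. Replace demo_cmd with the actual command to measure.
demo_cmd = [sys.executable, "-c", "data = bytearray(1024 * 1024)"]
exit_code, peak_rss = run_and_report_peak_memory(demo_cmd)
print("exit code: %d, peak RSS: %d (KB on Linux)" % (exit_code, peak_rss))
```

Because RUSAGE_CHILDREN reports the maximum over all children waited for so
far, it is simplest to run one command per interpreter session.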
I just wanted to share what I found.
Thanks,
Hyungro
On Mon, Jul 1, 2013 at 10:57 AM, Hyungro Lee <hroe.lee(a)gmail.com> wrote:
Hi,
I am new here and am trying to use the Galaxy toolkit for my project in the
Windows Azure cloud environment. I got errors when I tested the 101 example
on my local machine, so I looked at 'stderr' via View details, and it
showed only 'Killed', which is not much information for debugging.
I looked at the web logs and found that the command executed was: 'python
/home/.../galaxy-dist/tools/new_operations/gops_join.py
/home/.../galaxy-dist/database/files/000/dataset_1.dat
/home/.../galaxy-dist/database/files/000/dataset_2.dat
/home/.../galaxy-dist/database/files/000/dataset_3.dat -1 1,2,3,6 -2
1,2,3,6 -m 1 -f none', which joins two datasets. I believe the input
datasets and the parameters are okay, since I tested them on the public
Galaxy server.
To get more debugging information, I tried running the job from the shell
but got the same result: 'Killed'. Is there a way to get more information
about the execution of the tool, other than reading the source code line by
line? That would help me figure out what I am doing wrong in this test.
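
One way to get slightly more detail than the shell's 'Killed' message is to
launch the command from Python and inspect its return code: on POSIX
systems, subprocess reports a negative return code when the child was
terminated by a signal. The sketch below assumes Python 3.5 or later (for
signal.Signals); describe_exit is a hypothetical helper name, and the demo
child simply sends SIGKILL to itself to imitate what a forced kill looks
like. It does not show why the process was killed, only which signal ended
it.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable string."""
    if returncode < 0:
        # A negative value -N means the child was terminated by signal N.
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "signal %d" % signum
        return "terminated by " + name
    return "exited with status %d" % returncode


# Hypothetical stand-in for the real tool: a child that kills itself with
# SIGKILL, the signal a forced kill delivers.
demo_cmd = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGKILL)",
]
demo_returncode = subprocess.call(demo_cmd)
print(describe_exit(demo_returncode))
```

If the real command reports SIGKILL without anyone having killed it by
hand, checking the system log for out-of-memory messages is a reasonable
next step.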
Thanks,
Hyungro