Re: [galaxy-dev] [galaxy-bugs] Problem with indexing a genome through tophat on a local copy of Galaxy...
Hi David,

There should be additional information in the Galaxy database about why the job failed; take a look at the stderr column of the failed job using some SQL like this:

--
select * from job where state='error' and tool_id='tophat' and stderr like '%indexing reference%' order by id desc;
--

If that doesn't work, try something simpler:

--
select * from job where state='error' and tool_id='tophat' order by id desc;
--

What are the errors that you see?

Finally, please direct future questions about local Galaxy installations to galaxy-dev (cc'd) so that the community can learn from and participate in discussions.

Thanks,
J.

On Dec 5, 2011, at 10:09 AM, David Matthews wrote:
Hi,
We have a local version of Galaxy here at Bristol and it is very nice. However, it does not like to run TopHat on large genomes (e.g. human) but is happy with small ones (e.g. viruses). The genome we have is hg19 minus chr Y, and this genome works fine on the PSU version. When we run it here, however, we seem to be hitting a wall, and it seems to relate to the indexing step right at the beginning. Here is the error output we see in the red box:
An error occurred running this job:
Settings:
  Output files: "/tmp/tmppA7Hrk/dataset_942.*.ebwt"
  Line rate: 6 (line is 64 bytes)
  Lines per side: 1 (side is 64 bytes)
  Offset rate: 5 (one in 32)
  FTable chars: 10
  Strings: unpacked
  Max bucket size: default
  Max bucket size, sqrt m
And here is the stuff we see when we click on the green bug (the second set is just the last bit of a long list of stuff):
Error indexing reference sequence
Returning block of 520230381
Getting block 6 of 7
Reserving size (531691900) for bucket
Calculating Z arrays
Calculating Z arrays time: 00:00:00
Entering block accumulator loop:
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Block accumulator loop time: 00:00:59
Sorting block of length 346541282
(Using difference cover)
Sorting block time: 00:06:04
Returning block of 346541283
Getting block 7 of 7
Reserving size (531691900) for bucket
Calculating Z arrays
Calculating Z arrays time: 00:00:00
Entering block accumulator loop:
10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Block accumulator loop time: 00:00:40
Sorting block of length 408062248
(Using difference cover)
Sorting block time: 00:07:08
Returning block of 408062249
Exited Ebwt loop
fchr[A]: 0
fchr[C]: 837200431
fchr[G]: 1417119193
fchr[T]: 1997326330
fchr[$]: 2835690133
Exiting Ebwt::buildToDisk()
Total time for backward call to driver() for mirror index: 01:16:32
TopHat v1.2.0

We are running on our HPC and have directed the job to land on a whole node with 8GB of ram on board. Any ideas why the index is failing?
Best Wishes, David.
__________________________________ Dr David A. Matthews
Senior Lecturer in Virology Room E49 Department of Cellular and Molecular Medicine, School of Medical Sciences University Walk, University of Bristol Bristol. BS8 1TD U.K.
Tel. +44 117 3312058 Fax. +44 117 3312091
D.A.Matthews@bristol.ac.uk
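
A note on the two queries suggested above: they can also be run from a script. The sketch below is only an illustration. It uses an in-memory SQLite table as a stand-in for Galaxy's job table, with just the four columns the queries mention (id, state, tool_id, stderr). The helper name failed_jobs and the sample rows are made up for the example, and a real Galaxy database has more columns and may not be SQLite at all.

```python
import sqlite3


def failed_jobs(conn, tool_id, stderr_fragment=None):
    """Return (id, stderr) rows for failed jobs of one tool, newest first.

    Mirrors the two queries from the thread: with stderr_fragment it is the
    narrower query, without it the simpler fallback.
    """
    sql = "SELECT id, stderr FROM job WHERE state = 'error' AND tool_id = ?"
    params = [tool_id]
    if stderr_fragment is not None:
        sql += " AND stderr LIKE ?"
        params.append(f"%{stderr_fragment}%")
    sql += " ORDER BY id DESC"
    return conn.execute(sql, params).fetchall()


# Toy stand-in for the Galaxy database (assumption: only these four columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job (id INTEGER PRIMARY KEY, state TEXT, tool_id TEXT, stderr TEXT)"
)
conn.executemany(
    "INSERT INTO job (id, state, tool_id, stderr) VALUES (?, ?, ?, ?)",
    [
        (1, "ok", "tophat", ""),
        (2, "error", "tophat", "Error indexing reference sequence"),
        (3, "error", "bowtie", "something else"),
        (4, "error", "tophat", "another failure"),
    ],
)

# Narrow query first, then the simpler fallback.
for job_id, stderr in failed_jobs(conn, "tophat", "indexing reference"):
    print(job_id, stderr)
for job_id, stderr in failed_jobs(conn, "tophat"):
    print(job_id, stderr)
```

Selecting only id and stderr instead of * keeps the output readable, since the stderr text of a failed indexing run can be very long.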
Hi Jeremy,

Many thanks for this. I'll pass it over to our HPC guys to work on (my grasp of this is a bit flimsy).

On a related note: when TopHat is running on one of our nodes it seems to be very slow. When we check the activity on the node, we typically see that it is using about 6.8 GB of RAM and about 16.5 GB of virtual memory with about 1% CPU activity, presumably because there is a lot of I/O slowing down the run. Is this roughly what you would expect on an 8 GB node? Each node has 8 GB of RAM and 1 GB of RAM per core. Is there anything we can do at this end to optimise this a bit?

Best Wishes,
David.

On 5 Dec 2011, at 21:00, Jeremy Goecks wrote:
[snip]
David,

There are many steps in TopHat, only some of which can be parallelized. Hence, utilization appears low whenever TopHat is running a step that cannot be parallelized. I'm not aware of anything that can be done to optimize TopHat further, but you could contact the TopHat authors and ask for their suggestions: tophat.cufflinks@gmail.com

If you find anything that works well, please share it with the list.

Good luck,
J.

On Dec 6, 2011, at 4:55 AM, David Matthews wrote:
[snip]
Hi,

So it turns out that your hunch was right. When our HPC guys looked further, they saw that EBWT was writing its output to /tmp on the nodes, which is tiny (only 2.9 GB). It is now directed to write to /local, which is much bigger. We think that should sort the problem out; I'll let you know if it doesn't. Any hints on optimising the TopHat run would still be great.

Best Wishes,
David.

On 5 Dec 2011, at 21:00, Jeremy Goecks wrote:
[snip]
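
For anyone hitting the same wall, here is a rough sketch of how the scratch directory could be checked before a large indexing job. It rests on assumptions: the temporary path in the error output (/tmp/tmppA7Hrk) looks like a name produced by Python's tempfile module, so the sketch assumes the directory is chosen that way. The thread does not say how the jobs were redirected to /local, and the helper names free_gb, has_room and tempdir_with_override are invented for the example.

```python
import os
import shutil
import tempfile


def free_gb(path):
    """Free space on the filesystem holding path, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_room(path, needed_gb):
    """True if the filesystem holding path has at least needed_gb free."""
    return free_gb(path) >= needed_gb


def tempdir_with_override(new_tmp):
    """Report which directory tempfile would use if TMPDIR pointed at new_tmp.

    tempfile caches its choice in tempfile.tempdir, so the cache is cleared
    before asking again and restored afterwards.
    """
    old_env = os.environ.get("TMPDIR")
    old_cached = tempfile.tempdir
    os.environ["TMPDIR"] = new_tmp
    tempfile.tempdir = None  # force tempfile to re-read TMPDIR
    try:
        return tempfile.gettempdir()
    finally:
        tempfile.tempdir = old_cached
        if old_env is None:
            del os.environ["TMPDIR"]
        else:
            os.environ["TMPDIR"] = old_env


scratch = tempfile.gettempdir()
print(f"{scratch}: {free_gb(scratch):.1f} GB free")
```

On a node like the one described, calling has_room on /tmp with a threshold of a few GB would have flagged the problem before the index build started. The right threshold depends on the genome and is not something this thread establishes.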
participants (2)
- David Matthews
- Jeremy Goecks