Hello,

I am writing some code to enable parallelization for some tool wrappers. First I did it for a simple bwa wrapper, but now I am modifying toolshed.g2.bx.psu.edu/repos/devteam/bwa/c71dd035971e/bwa/bwa-mem.xml to check whether the code also works with this wrapper. I wrote the code that I think is necessary to merge the BAM files, and I added the parallelism tag (in bold) to the config file:

<tool id="bwa_mem" name="BWA-MEM" version="0.1">
  <macros>
    <import>bwa_macros.xml</import>
  </macros>
  <requirements>
    <requirement type="package" version="0.7.10.039ea20639">bwa</requirement>
    <requirement type="package" version="1.1">samtools</requirement>
  </requirements>
  <description>- map medium and long reads (> 100 bp) against reference genome</description>
  *<parallelism method="multi" split_size="3" shared_inputs="ref_file" split_mode="number_of_parts" merge_outputs="bam_output" split_inputs="fastq_input1,fastq_input2"></parallelism>*
  <command> ...

Everything works well, and the resulting BAM is the same with and without parallelization, but the Galaxy log shows an error regarding metadata. It says something like this:

galaxy.jobs.splitters.multi DEBUG 2015-04-17 09:54:58,335 merge finished: /home/ralonso/galaxy/database/files/000/dataset_198.dat
galaxy.jobs.runners.tasks DEBUG 2015-04-17 09:54:58,473 executing external set_meta script for job 200: python /home/ralonso/galaxy/database/tmp/set_metadata_E5fGIE.py /home/ralonso/galaxy/database/tmp/tmpHS8Byo /home/ralonso/galaxy/database/job_working_directory/000/200/galaxy.json /home/ralonso/galaxy/database/tmp/metadata_in_HistoryDatasetAssociation_198_yOGiQG,/home/ralonso/galaxy/database/tmp/metadata_kwds_HistoryDatasetAssociation_198_nAsQoq,/home/ralonso/galaxy/database/tmp/metadata_out_HistoryDatasetAssociation_198_I_cLs4,/home/ralonso/galaxy/database/tmp/metadata_results_HistoryDatasetAssociation_198_qhjzoV,/home/ralonso/galaxy/database/files/000/dataset_198.dat,/home/ralonso/galaxy/database/tmp/metadata_override_HistoryDatasetAssociation_198_ScKLqH
Traceback (most recent call last):
  File "/home/ralonso/galaxy/database/tmp/set_metadata_E5fGIE.py", line 1, in <module>
    from galaxy_ext.metadata.set_metadata import set_metadata; set_metadata()
ImportError: No module named galaxy_ext.metadata.set_metadata
galaxy.jobs.runners.tasks DEBUG 2015-04-17 09:54:58,624 execution of external set_meta finished for job 200
*galaxy.datatypes.metadata DEBUG 2015-04-17 09:54:58,714 setting metadata externally failed for HistoryDatasetAssociation 198: External set_meta() not called*

Without parallelization there is no problem, because Galaxy does not go through this part of the code at all. I see that Galaxy has to do something with the metadata attributes, but what is it trying to do? Is there any way to solve this?

Thank you very much.

Regards,
Roberto
Thanks for the report! I can confirm the bug, and I opened a pull request with the fixes here: https://github.com/galaxyproject/galaxy/pull/139.

-John

(In reply to the message from Roberto Alonso <roalva1@gmail.com> of Fri, Apr 17, 2015 at 4:20 AM.)
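For anyone who hits the same ImportError, the traceback says only that the interpreter running the generated set_metadata script could not locate `galaxy_ext.metadata.set_metadata` on its module search path. The sketch below is a small diagnostic for that condition, not the fix from the pull request. It assumes Python 3 (the log above comes from a Python 2 era Galaxy), and the helper name `module_visible` and the directory `/path/to/galaxy/lib` are hypothetical placeholders.

```python
import importlib
import importlib.util
import sys


def module_visible(name, extra_paths=()):
    """Return True if the dotted module `name` can be located.

    `extra_paths` are directories temporarily placed at the front of
    sys.path for the lookup (for example a Galaxy lib directory; the
    exact location is an assumption and depends on the installation).
    """
    saved = list(sys.path)
    sys.path[:0] = list(extra_paths)
    # Make sure directories created or added recently are rescanned.
    importlib.invalidate_caches()
    try:
        # Note: for a dotted name, find_spec imports the parent packages.
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # A parent package (e.g. galaxy_ext) is itself missing.
        return False
    finally:
        sys.path[:] = saved


if __name__ == "__main__":
    target = "galaxy_ext.metadata.set_metadata"
    # Hypothetical path; replace with the real lib directory.
    print(module_visible(target))
    print(module_visible(target, ["/path/to/galaxy/lib"]))
```

If the first call reports False and the second reports True, the module exists but is not on the search path of the process that runs the script, which matches the symptom in the log.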
Participants (2): John Chilton, Roberto Alonso