Hi again,

I'm now stuck trying to get the samtools apps to work. I've finally gotten the bwa and bowtie tools to run an alignment, but my attempts to convert the resulting SAM file to BAM using sam_to_bam fail. The error I get in the browser is:

    An error occurred running this job: No sequences are available for 'hg18', request them by reporting this error.

After searching through the output for the command, the error seemed to suggest that I was missing an 'egg'. This was the output:

    python /home/perin/galaxy-dist/tools/samtools/sam_to_bam.py --input1=/home/perin/galaxy-dist/database/files/000/dataset_70.dat --dbkey=hg18 --ref_file="None" --output1=/home/perin/galaxy-dist/database/files/000/dataset_72.dat --index_dir=/home/perin/galaxy-dist/tool-data
    Traceback (most recent call last):
      File "/home/perin/galaxy-dist/tools/samtools/sam_to_bam.py", line 17, in ?
        from galaxy import eggs
    ImportError: No module named galaxy

I'm guessing I'm either missing a config parameter that tells samtools how to work, or I need to scramble another egg. I'm not sure, however, and was hoping for some help. I can't seem to find anything on the web. Thanks in advance.

Juan Perin
Hi Juan,

I believe that the problem is actually not the error printed in your output. Do you have the Samtools indices in place (specifically for hg18) and the loc file sam_fa_indices.loc created in the tool-data directory? In my experience, you need both the index (hg18.fa.fai) and the original FASTA file (hg18.fa) in the directory you specify in the loc file.

Let us know if that doesn't help, or if you need more information about setting up the loc file and indices.

Regards, Kelly
_______________________________________________ galaxy-dev mailing list galaxy-dev@bx.psu.edu http://mail.bx.psu.edu/cgi-bin/mailman/listinfo/galaxy-dev
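The check Kelly describes (both the FASTA file and its .fai index present for every loc entry) can be scripted. The following is only a sketch: it assumes each data line of sam_fa_indices.loc has three tab-separated fields (the literal word "index", the build key, and the FASTA path), which is inferred from the entries quoted in this thread rather than from the Galaxy documentation. The function name check_loc_entries is made up for illustration.

```python
import os


def check_loc_entries(loc_text):
    """Return (dbkey, path, problem) tuples for loc entries that look wrong.

    Assumed layout (not confirmed by Galaxy docs): index<TAB>dbkey<TAB>fasta_path
    """
    problems = []
    for line in loc_text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 3 or fields[0] != "index":
            problems.append((None, None, "unexpected layout: %r" % line))
            continue
        dbkey, fasta = fields[1], fields[2]
        # Both the FASTA file and its samtools index must exist and be readable.
        for path in (fasta, fasta + ".fai"):
            if not os.path.isfile(path):
                problems.append((dbkey, path, "missing file"))
            elif not os.access(path, os.R_OK):
                problems.append((dbkey, path, "not readable"))
    return problems
```

Calling check_loc_entries(open("tool-data/sam_fa_indices.loc").read()) and printing the result should show an empty list when every entry is well formed and both files are in place. Note that the readability test reflects the user running the script, which may differ from the user Galaxy runs as.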
The second error may stem from running the tool directly from the command line (not via Galaxy). If so, it can be run with:

    % PYTHONPATH=/path/to/galaxy_dist/lib python ...

--nate
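For anyone who prefers to drive the tool script from Python rather than the shell, Nate's suggestion amounts to prepending the Galaxy lib directory to PYTHONPATH before launching the script. This is a minimal sketch under that assumption; the helper names env_with_galaxy_lib and run_tool are hypothetical and are not part of Galaxy.

```python
import os
import subprocess
import sys


def env_with_galaxy_lib(galaxy_root, base_env=None):
    """Return a copy of the environment with <galaxy_root>/lib first on PYTHONPATH."""
    if base_env is None:
        base_env = os.environ
    env = dict(base_env)
    lib_dir = os.path.join(galaxy_root, "lib")
    existing = env.get("PYTHONPATH")
    if existing:
        # Keep whatever was already on the path, after the Galaxy lib directory.
        env["PYTHONPATH"] = lib_dir + os.pathsep + existing
    else:
        env["PYTHONPATH"] = lib_dir
    return env


def run_tool(galaxy_root, tool_args):
    """Run a tool script with the current interpreter; return its exit status."""
    command = [sys.executable] + list(tool_args)
    return subprocess.call(command, env=env_with_galaxy_lib(galaxy_root))
```

For example, run_tool("/path/to/galaxy_dist", ["tools/samtools/sam_to_bam.py", "--dbkey=hg18"]) would be the Python counterpart of the shell line above, with the remaining tool arguments filled in as needed.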
Unfortunately that's the first thing I tried, since the initial errors from the web interface seemed to suggest it. I added the reference and the indexed file (.fai) to my .loc file. I even went back and restarted the alignment from the beginning, just to be sure that it was associated with the right reference file. I definitely have the two files (.fa and .fai) in the specified location, and they are both readable by all users. I've tried this with two different reference sets as well.

Is there anywhere else I might find error output? It's difficult to traverse the running output from run.sh, as it contains all the SQL etc., and the output that does exist seems to be sparse. The Python error that I sent, which Nate pointed out would fail on the command line without the proper environment variable, was just a bad lead. I haven't gotten that to work on the command line yet, however, to see if it may give me more info.

My only guess at the moment is that maybe the reference that my alignment is associated with (Human Mar. 2006 (hg18)) does not match up with the hg18 that I have listed in my .loc files? My loc file entries look like this:

    index hg18 /share/apps/genome/human/bowtie/hg18/hg18.fa
    index Human_male_hg18 /share/apps/genome/human/bwa/hg18_male/human_b36_male.fa
    index ZebraFish_v8 /share/apps/genome/zebrafish/v8/ZebrafishFull.fa

The dropdown menu that tells Galaxy which build the alignment refers to is based on the original hard-coded hg18 listed above, so I'm thinking either the way I've named it is wrong and Galaxy doesn't know that my hg18 is the same as the hg18 listed, or I'm doing something else totally wrong. Thanks again.

Juan
Hi,

I think I may have spotted the problem. It looks like there are two tabs between "hg18" and "/share/apps/genome/human/bowtie/hg18/hg18.fa", and I think this might cause the loc parsing to behave badly.

Otherwise, your loc file looks fine to me (assuming you mean to have it in a directory called bowtie). And hg18 is the correct way to refer to Human Mar. 2006, so everything should be matching up.

Let me know if that works or not.

Regards, Kelly
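A doubled tab is hard to see in an editor, so it can help to flag it mechanically. The sketch below reports any data line in which splitting on tabs produces an empty field, which is what two adjacent tabs (or a trailing tab) would cause. It assumes the loc file is tab-separated, as the message above implies; the function name find_empty_fields is invented for this example.

```python
def find_empty_fields(loc_text):
    """Return (line_number, line) pairs for data lines with an empty tab-separated field."""
    flagged = []
    for number, line in enumerate(loc_text.splitlines(), 1):
        # Skip blank lines and comment lines.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        # Two adjacent tabs (or a leading or trailing tab) yield an empty string here.
        if "" in line.split("\t"):
            flagged.append((number, line))
    return flagged
```

Running it over the contents of sam_fa_indices.loc and printing the flagged line numbers points straight at entries like the hg18 one discussed here.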
Boy, I feel stupid. The extra tab was definitely throwing it off, and removing it allowed things to seemingly move forward. Unfortunately it then crashed and threw additional errors that seem to be related to PBS:

    Traceback (most recent call last):
      File "/home/perin/galaxy-dist/lib/galaxy/jobs/runners/pbs.py", line 448, in finish_job
        pbs_job_state.job_wrapper.finish( stdout, stderr )
      File "/home/perin/galaxy-dist/lib/galaxy/jobs/__init__.py", line 528, in finish
        dataset.set_meta( overwrite = False )
      File "/home/perin/galaxy-dist/lib/galaxy/model/__init__.py", line 539, in set_meta
        return self.datatype.set_meta( self, **kwd )
      File "/home/perin/galaxy-dist/lib/galaxy/datatypes/images.py", line 261, in set_meta
        except subprocess.CalledProcessError:
    AttributeError: 'module' object has no attribute 'CalledProcessError'

    Tool execution generated the following error message:
    Unable to finish job

This was in an attempt to run sam to bam on a newly aligned SAM file from bowtie. I then tried doing this with bwa instead, wondering if the SAM output was maybe the problem, but got the same error. Other jobs seem to run fine otherwise, so I'm inclined to think it is failing elsewhere, not in PBS. Sam to interval worked fine, for example. If you click the eye to look at the file the sam to bam output creates, the index file downloaded is in what looks to be some sort of binary, which looks a lot like BAM output. It almost seems like it's running OK and creating a BAM file, but something else throws it off. I don't know where the actual jobs are run, so I don't know where to look for possible error output from the submitted job. I'm new to coding in Python, so I'll do my best to debug, but thought I'd continue the conversation until I am able to fully configure our systems anyway. Thanks!

Juan
Hi,

I'm glad that helped get you a bit further. You are right that the BAM file appears to be created; it is. What is happening is that the BAM index file (which is created at basically the same time as the BAM file itself) is not being created successfully, so Galaxy thinks the overall BAM file creation process failed. subprocess.CalledProcessError is available only in Python 2.5 and higher, so I am guessing you are probably running 2.4.

Four commands make up the BAM index creation. They are wrapped in a try/except with subprocess.CalledProcessError specified for the error message (lines 256-262 in the file lib/galaxy/datatypes/images.py). To get around the problem, you should be able to change from subprocess.check_call to os.system. So change the four command lines

    subprocess.check_call(['cd', tmp_dir], shell=True)
    subprocess.check_call('cp %s %s' % (dataset.file_name, tmpf1.name), shell=True)
    subprocess.check_call('samtools index %s' % tmpf1.name, shell=True)
    subprocess.check_call('cp %s %s' % (tmpf1bai, index_file.file_name), shell=True)

to

    os.system('cd %s' % tmp_dir)
    os.system('cp %s %s' % (dataset.file_name, tmpf1.name))
    os.system('samtools index %s' % tmpf1.name)
    os.system('cp %s %s' % (tmpf1bai, index_file.file_name))

and then change the except line

    except subprocess.CalledProcessError:

to

    except:

or

    except Exception, ex:

after which you can print str(ex). If there are still problems, you can try removing the command lines from the try/except block and seeing what happens.

Let us know how things go.

Regards, Kelly
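Two caveats apply to the os.system workaround above. os.system returns the exit status instead of raising, so the surrounding try/except will not notice a failed command; and each os.system call starts its own shell, so the cd in the first call does not carry over to the later ones. Note also that subprocess.check_call, like CalledProcessError, was added in Python 2.5. The following is only a sketch of an alternative that uses subprocess.call (present in 2.4) and checks the status explicitly. It is not the change that went into Galaxy; the names CommandFailed, run_checked and build_bam_index, and the parameter names, are hypothetical.

```python
import shutil
import subprocess


class CommandFailed(Exception):
    """Raised when a command exits with a non-zero status."""


def run_checked(args, cwd=None):
    """Run a command without a shell and raise CommandFailed if it fails."""
    status = subprocess.call(args, cwd=cwd)
    if status != 0:
        raise CommandFailed("%r exited with status %d" % (args, status))


def build_bam_index(bam_path, tmp_bam, tmp_bai, index_path, tmp_dir):
    """Copy the BAM to a temporary file, index it there, and copy the index out."""
    shutil.copy(bam_path, tmp_bam)
    # cwd= replaces the separate 'cd' step, which had no lasting effect.
    run_checked(["samtools", "index", tmp_bam], cwd=tmp_dir)
    shutil.copy(tmp_bai, index_path)
```

Passing the command as a list also avoids quoting problems if a dataset path ever contains spaces.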
Everyone dealing with BAM data,

Note that I will shortly be making these changes to the BAM datatype and its use of subprocess in the distribution, since we still support Python 2.4.

Kelly

Begin forwarded message:
From: Kelly Vincent <kpvincent@bx.psu.edu>
Date: November 10, 2009 1:44:03 AM EST
To: juan perin <juanperin@gmail.com>
Cc: galaxy-dev@bx.psu.edu
Subject: Re: [galaxy-dev] Samtools configuration
Hi,
I'm glad that helped get you a bit further. Now, you are right that it appears that the bam file is being created. It is. What is happening is that the bam index file (which is created basically at the same time the bam file itself is created) is not being created successfully, which means Galaxy thinks the overall bam file creation process failed. subprocess.CalledProcessError is available only in Python 2.5 and higher, so I am guessing you are probably running 2.4. There are four commands that make up the bam index creation, which are wrapped in a try/except with subprocess.CalledProcessError specified for the error message (lines 256-262 in the file lib/galaxy/datatypes/images.py). In order to get around the problem, you should be able to change from subprocess.check_call to os.system. So change the four command lines subprocess.check_call(['cd', tmp_dir], shell=True) subprocess.check_call('cp %s %s' % (dataset.file_name, tmpf1.name), shell=True) subprocess.check_call('samtools index %s' % tmpf1.name, shell=True) subprocess.check_call('cp %s %s' % (tmpf1bai, index_file.file_name), shell=True) to os.system('cd %s' % tmp_dir) os.system('cp %s %s' % (dataset.file_name, tmpf1.name)) os.system('samtools index %s' % tmpf1.name) os.system('cp %s %s' % (tmpf1bai, index_file.file_name)) and then change the except line except subprocess.CalledProcessError: to except: or except Exception, ex: after which you can print str(ex). If there are still problems, you can try removing the command lines from the try/except block and seeing what happens.
Let us know how things go.
Regards, Kelly
On Nov 9, 2009, at 5:36 PM, juan perin wrote:
Boy, i feel stupid. The extra tab was definitely throwing it off, which allowed things to seemingly move forward. Unfortunately it crashed and threw additional errors that seem to be related to PBS:
Traceback (most recent call last): File "/home/perin/galaxy-dist/lib/galaxy/jobs/runners/pbs.py", line 448, in finish_job pbs_job_state.job_wrapper.finish( stdout, stderr ) File "/home/perin/galaxy-dist/lib/galaxy/jobs/__init__.py", line 528, in finish dataset.set_meta( overwrite = False ) File "/home/perin/galaxy-dist/lib/galaxy/model/__init__.py", line 539, in set_meta return self.datatype.set_meta( self, **kwd ) File "/home/perin/galaxy-dist/lib/galaxy/datatypes/images.py", line 261, in set_meta except subprocess.CalledProcessError: AttributeError: 'module' object has no attribute 'CalledProcessError' Tool execution generated the following error message:
Unable to finish job
This was in an attempt to run sam to bam, on a newly aligned sam file from bowtie. I then tried doing this with bwa instead, wondering if the sam output was maybe the problem, but same error. Other jobs seem to run fine otherwise, so I'm inclined to think it is failing elsewhere, not PBS. Sam to interval worked fine, for example. If you click the eye to look at the file the sam to bam output creates, the index file downloaded is in what looks to be some sort of binary, which looks a lot like bam output. It almost seems like its running ok and creating a bam file, but something else throws it off. I don't know where the actual jobs are going to try to look at the possible error output from the submitted job? I'm new to coding in python, so I'll do my best to debug, but thought I'd continue the conversation until I am able to fully configure our systems anyway. Thanks!
Juan
On Mon, Nov 9, 2009 at 3:05 PM, Kelly Vincent <kpvincent@bx.psu.edu> wrote: Hi,
I think I may have spotted the problem. It looks like there are two tabs between "hg18" and "/share/apps/genome/human/bowtie/hg18/ hg18.fa", and I think this might cause the loc parsing to behave badly.
Otherwise, your loc file looks fine to me (assuming you mean to have it in a directory called bowtie). And hg18 is the correct way to refer to Human Mar. 2006, so everything should be matching up.
Let me know if that works or not.
Regards, Kelly
On Nov 9, 2009, at 2:31 PM, Juan Perin wrote:
Unfortunately that's the first thing I tried. The initial errors seemed to suggest that from the web interface. However, I made the addition of the reference and the indexed file (.fai) in my .loc file. I even went back to ensure I started the alignment from the beginning, just to be sure that the alignment was associated with the right reference file. I definitely have the two files (.fa and .fai) in the specified location and they are both fully readable by all users. I've tried this with two different reference sets as well.
Is there anywhere else I might find error output? It's difficult to traverse the running output from run.sh, as it contains all the SQL etc., and the output that does exist seems to be sparse. The Python error that I sent, which Nate pointed out would fail on the command line without the proper env. variable, was just a bad lead. I haven't gotten that to work on the command line yet, however, to see if it may give me more info.
My only guess, at the moment, is that maybe the reference that my alignment is associated with (Human Mar. 2006 (hg18)) does not match up with the hg18 that I have listed in my .loc files? My loc file entry looks like this:
index hg18 /share/apps/genome/human/bowtie/hg18/hg18.fa
index Human_male_hg18 /share/apps/genome/human/bwa/hg18_male/human_b36_male.fa
index ZebraFish_v8 /share/apps/genome/zebrafish/v8/ZebrafishFull.fa
The dropdown menu that tells Galaxy which build the alignment refers to is based on the original hard coded hg18 listed above, so I'm thinking either the way I've named it is wrong and Galaxy doesn't know that my hg18 is the same as the hg18 listed, or I'm doing something else totally wrong. Thanks again.
Juan
_______________________________________________ galaxy-dev mailing list galaxy-dev@bx.psu.edu http://mail.bx.psu.edu/cgi-bin/mailman/listinfo/galaxy-dev
Interesting. I am in fact running 2.4. Would it make more sense to just update my Python version? I can do this with yum easily. I'm fine editing the code, but if in the long run it's better to just update, I'll go for that instead.
Juan
On Tue, Nov 10, 2009 at 1:44 AM, Kelly Vincent <kpvincent@bx.psu.edu> wrote:
Hi,
I'm glad that helped get you a bit further. Now, you are right that it appears that the bam file is being created. It is. What is happening is that the bam index file (which is created basically at the same time the bam file itself is created) is not being created successfully, which means Galaxy thinks the overall bam file creation process failed. subprocess.CalledProcessError is available only in Python 2.5 and higher, so I am guessing you are probably running 2.4. There are four commands that make up the bam index creation, which are wrapped in a try/except with subprocess.CalledProcessError specified for the error message (lines 256-262 in the file lib/galaxy/datatypes/images.py). In order to get around the problem, you should be able to change from subprocess.check_call to os.system. So change the four command lines
subprocess.check_call(['cd', tmp_dir], shell=True)
subprocess.check_call('cp %s %s' % (dataset.file_name, tmpf1.name), shell=True)
subprocess.check_call('samtools index %s' % tmpf1.name, shell=True)
subprocess.check_call('cp %s %s' % (tmpf1bai, index_file.file_name), shell=True)

to

os.system('cd %s' % tmp_dir)
os.system('cp %s %s' % (dataset.file_name, tmpf1.name))
os.system('samtools index %s' % tmpf1.name)
os.system('cp %s %s' % (tmpf1bai, index_file.file_name))

and then change the except line

except subprocess.CalledProcessError:

to

except:

or

except Exception, ex:

after which you can print str(ex). If there are still problems, you can try removing the command lines from the try/except block and seeing what happens.
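For anyone who would rather keep a failure check than drop it, one possible alternative to os.system is subprocess.call, which has been in the subprocess module since Python 2.4 and returns the exit status instead of raising. This is only a sketch under assumptions: run_or_raise is a hypothetical helper name, not something in Galaxy, and the commands are assumed to run under a POSIX shell. Note also that each call starts its own shell, so a cd issued in one call does not carry over to the next one.

```python
import subprocess


def run_or_raise(command):
    """Run a shell command; raise RuntimeError if it exits non-zero.

    Hypothetical helper, not part of Galaxy. It relies only on
    subprocess.call, so it does not need CalledProcessError.
    """
    status = subprocess.call(command, shell=True)
    if status != 0:
        raise RuntimeError("command failed (exit %d): %s" % (status, command))
    return status


# Example: a command that succeeds, then one that fails with status 3.
run_or_raise("exit 0")
try:
    run_or_raise("exit 3")
except RuntimeError as err:
    print("caught: %s" % err)
```

With a helper like this, the except clause can catch RuntimeError, which exists on every Python version, rather than subprocess.CalledProcessError.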
Let us know how things go.
Regards, Kelly
On Nov 9, 2009, at 5:36 PM, juan perin wrote:
Boy, I feel stupid. The extra tab was definitely throwing it off, and removing it allowed things to seemingly move forward. Unfortunately it crashed and threw additional errors that seem to be related to PBS:
Traceback (most recent call last):
  File "/home/perin/galaxy-dist/lib/galaxy/jobs/runners/pbs.py", line 448, in finish_job
    pbs_job_state.job_wrapper.finish( stdout, stderr )
  File "/home/perin/galaxy-dist/lib/galaxy/jobs/__init__.py", line 528, in finish
    dataset.set_meta( overwrite = False )
  File "/home/perin/galaxy-dist/lib/galaxy/model/__init__.py", line 539, in set_meta
    return self.datatype.set_meta( self, **kwd )
  File "/home/perin/galaxy-dist/lib/galaxy/datatypes/images.py", line 261, in set_meta
    except subprocess.CalledProcessError:
AttributeError: 'module' object has no attribute 'CalledProcessError'
Tool execution generated the following error message:
Unable to finish job
Juan, we still support Python 2.4 with Galaxy. We'll adjust this tool to work with either Python 2.4 or Python 2.5, and the change will be available in the distribution soon.
On Nov 10, 2009, at 10:29 AM, juan perin wrote:
Interesting. I am in fact running 2.4. Would it make more sense to just update my Python version? I can do this with yum easily. I'm fine editing the code, but if in the long run it's better to just update, I'll go for that instead.
Juan
Greg Von Kuster Galaxy Development Team greg@bx.psu.edu
participants (5)
- Greg Von Kuster
- Juan Perin
- juan perin
- Kelly Vincent
- Nate Coraor