So I have finally determined the cause of the snpEff/Java problem and found a resolution. The issue was that the version of Java on the Ubuntu Linux VM from AWS is 1.6.x, while snpEff 4.0e requires version 1.7.0 or higher. This is fine, but you will need to upgrade the Java version manually. To do so, complete the following:
SSH to the VM
sudo su - root
cd to /mnt/galaxy/
Complete the following steps:
1. sudo apt-get purge openjdk* (This removes Java 1.6 completely)
2. Modify this file: vi /etc/apt/sources.list.d/cloudbiolinux.list (Remove the leading section of line 11)
3. sudo add-apt-repository ppa:webupd8team/java
4. sudo apt-get update
5. sudo apt-get install oracle-java7-installer
6. java -version
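The final check in the steps above (java -version) can also be scripted, which may help when the same upgrade has to be verified on several nodes. The following is only a minimal sketch: the function names and the sample strings are assumptions made for illustration, and the minimum of 1.7 is taken from the snpEff 4.0e requirement stated above.

```python
import re


def parse_java_version(version_output):
    """Return (major, minor) parsed from `java -version` text, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return (major, minor)


def meets_snpeff_requirement(version_output, minimum=(1, 7)):
    """True if the reported Java version is at least `minimum`."""
    version = parse_java_version(version_output)
    return version is not None and version >= minimum


# Illustrative sample strings (assumed, not captured from a real node).
sample_old = 'java version "1.6.0_32"'
sample_new = 'java version "1.7.0_72"'

print(meets_snpeff_requirement(sample_old))
print(meets_snpeff_requirement(sample_new))
```

Note that java -version writes to standard error rather than standard output, so a wrapper that runs the command has to capture stderr before passing the text to parse_java_version.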
Now, if you plan to add nodes via the Cloudman Console, you will need to perform these tasks for each node you install. I worked with AWS Support to set up an "Auto Scaling Group" to accommodate this process. This required getting my Master instance upgraded and creating an AMI from it.
From that point you can build the group based on the following:
http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/creating-your-auto-scaling-groups.html
This was, in theory, a great idea. However, it did not work for me. Each of the nodes generated by this tool had Java version 1.6.x, and this caused snpEff to fail. My recommendation is that, if you have time, you should play around with this more, but I did not have that luxury for this project.
Iry
On 10/16/14 12:00 PM, "galaxy-dev-request@lists.bx.psu.edu" galaxy-dev-request@lists.bx.psu.edu wrote:
Send galaxy-dev mailing list submissions to galaxy-dev@lists.bx.psu.edu
To subscribe or unsubscribe via the World Wide Web, visit http://lists.bx.psu.edu/listinfo/galaxy-dev or, via email, send a message with subject or body 'help' to galaxy-dev-request@lists.bx.psu.edu
You can reach the person managing the list at galaxy-dev-owner@lists.bx.psu.edu
When replying, please edit your Subject line so it is more specific than "Re: Contents of galaxy-dev digest..."
HEY! This is important! If you reply to a thread in a digest, please
1. Change the subject of your response from "Galaxy-dev Digest Vol ..." to the original subject for the thread.
2. Strip out everything else in the digest that is not part of the thread you are responding to.
Why?
1. This will keep the subject meaningful. People will have some idea from the subject line if they should read it or not.
2. Not doing this greatly increases the number of emails that match search queries, but that aren't actually informative.
Today's Topics:
1. Re: Set output dbkey from parameter value (Daniel Blankenberg)
2. Re: Set output dbkey from parameter value (Nikos Sidiropoulos)
3. Re: snpeff tool for Galaxy extra_files_path (John Chilton)
4. Re: snpeff tool for Galaxy extra_files_path (Björn Grüning)
5. Re: snpeff tool for Galaxy extra_files_path (John Chilton)
6. Re: snpeff tool for Galaxy extra_files_path (Jim Johnson)
7. snpEff and java issue (Iry Witham)
8. Re: HOWTO share tool parameter settings? (Lukasse, Pieter)
9. Help with Galaxy server migration (Sarah Diehl)
10. Re: Help with Galaxy server migration (John Chilton)
11. Login issue with a nginx proxy (Alexandre Loywick)
12. Re: Help with Galaxy server migration (Sarah Diehl)
Message: 1
Date: Wed, 15 Oct 2014 12:14:28 -0400
From: Daniel Blankenberg dan@bx.psu.edu
To: Nikos Sidiropoulos nikos.sidiro@gmail.com
Cc: "galaxy-dev@bx.psu.edu" galaxy-dev@bx.psu.edu
Subject: Re: [galaxy-dev] Set output dbkey from parameter value
Message-ID: E77C8A92-617A-455E-B5EB-209F4505BECA@bx.psu.edu
Content-Type: text/plain; charset=windows-1252
Does removing the 'param_attribute="value"' attribute help?
On Oct 15, 2014, at 11:23 AM, Nikos Sidiropoulos nikos.sidiro@gmail.com wrote:
Hi Daniel,
Thanks for the response.
I've edited the output to:
<data format="bedgraph" name="bedgraph_slograt"
      label="${tool.name} on ${on_string}: Smoot Log2ratio (bedGraph)"
      from_work_dir="output_dir/slograt.bedgraph">
    <filter> bedgraph['check'] == 'yes' and slograt['check'] == 'yes' </filter>
    <actions>
        <conditional name="bedgraph.check">
            <when value="yes">
                <action type="metadata" name="dbkey">
                    <option type="from_param" name="bedgraph.genome" param_attribute="value" />
                </action>
            </when>
        </conditional>
    </actions>
</data>
Now I'm getting a tool execution error.
Error executing tool: 'unicode' object has no attribute 'value'
I've tried to change the param_attribute to "ext", "dbkey" (ones that I know that exist) and got a similar error.
Bests, Nikos
On 15 October 2014 16:58, Daniel Blankenberg dan@bx.psu.edu wrote:
Hi Nikos,
In the very least, you'll want to make sure that you have a bounding <actions></actions> tag set around your actions. It is probably also advisable to add a set of conditional/whens around the action, since you're only setting the dbkey under certain circumstances.
Thanks for using Galaxy,
Dan
On Oct 15, 2014, at 6:37 AM, Nikos Sidiropoulos nikos.sidiro@gmail.com wrote:
Hi all,
I'm trying to set the dbkey of an output file from the value (text) of a parameter.
The parameter I want to use is genome.
<conditional name="bedgraph">
    <param name="check" type="select" label="Produce BedGraph output"
           help="Can be displayed directly on UCSC browser. One file per normalisation method." >
        <option value="no" selected="True">No</option>
        <option value="yes">Yes</option>
    </param>
    <when value="yes">
        <param name="bed_file" type="data" format="bed" label="Transcripts in BED format"
               help="12 column BED file containing transcript definitions." />
        <param name="genome" type="text" label="Genome Build" help="E.g. hg19" />
        <param name="track_name" type="text" label="Track Name" size="20" value="Track Name" />
    </when>
    <when value="no" />
</conditional>
and this is how I've set the output:
<data format="bedgraph" name="bedgraph_slograt"
      label="${tool.name} on ${on_string}: Smoot Log2ratio (bedGraph)"
      from_work_dir="output_dir/slograt.bedgraph">
    <filter> bedgraph['check'] == 'yes' and slograt['check'] == 'yes' </filter>
    <action type="metadata" name="dbkey">
        <option type="from_param" name="bedgraph.genome" param_attribute="value" />
    </action>
</data>
When I run the tool the dbkey isn't set to the output file. Does anyone know a workaround?
Bests, Nikos ___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
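The error reported in this thread ('unicode' object has no attribute 'value') is consistent with an attribute lookup being performed on a plain string, since the value of a text parameter is just a string. The sketch below illustrates that reading only; resolve_option is a hypothetical stand-in written for this example and is not Galaxy's actual implementation.

```python
def resolve_option(param_value, param_attribute=None):
    """Hypothetical stand-in for resolving a from_param option.

    With no param_attribute the parameter value is used directly;
    otherwise the named attribute is looked up on the value.
    """
    if param_attribute is None:
        return param_value
    return getattr(param_value, param_attribute)


genome = "hg19"  # a text parameter's value is a plain string

# Without param_attribute the string itself is used as the dbkey.
dbkey = resolve_option(genome)

# With param_attribute="value" the lookup fails, because a string
# has no attribute named 'value'.
try:
    resolve_option(genome, "value")
    lookup_failed = False
except AttributeError:
    lookup_failed = True
```

Under that assumption, dropping param_attribute (as suggested in Message 1) leaves the string to be used as-is, which matches the outcome reported in Message 2.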
Message: 2
Date: Wed, 15 Oct 2014 18:43:17 +0200
From: Nikos Sidiropoulos nikos.sidiro@gmail.com
To: Daniel Blankenberg dan@bx.psu.edu
Cc: "galaxy-dev@bx.psu.edu" galaxy-dev@bx.psu.edu
Subject: Re: [galaxy-dev] Set output dbkey from parameter value
Message-ID: CAMjao2O1iw6ZHBrmuJ3VMH8kYe+7dVp2wk7sdOMb32OodFNgiw@mail.gmail.com
Content-Type: text/plain; charset=UTF-8
Yes!
Thank you very much Daniel.
On 15 October 2014 18:14, Daniel Blankenberg dan@bx.psu.edu wrote:
Does removing the 'param_attribute="value"' attribute help?
On Oct 15, 2014, at 11:23 AM, Nikos Sidiropoulos nikos.sidiro@gmail.com wrote:
Hi Daniel,
Thanks for the response.
I've edited the output to:
<data format="bedgraph" name="bedgraph_slograt"
      label="${tool.name} on ${on_string}: Smoot Log2ratio (bedGraph)"
      from_work_dir="output_dir/slograt.bedgraph">
    <filter> bedgraph['check'] == 'yes' and slograt['check'] == 'yes' </filter>
    <actions>
        <conditional name="bedgraph.check">
            <when value="yes">
                <action type="metadata" name="dbkey">
                    <option type="from_param" name="bedgraph.genome" param_attribute="value" />
                </action>
            </when>
        </conditional>
    </actions>
</data>
Now I'm getting a tool execution error.
Error executing tool: 'unicode' object has no attribute 'value'
I've tried to change the param_attribute to "ext", "dbkey" (ones that I know that exist) and got a similar error.
Bests, Nikos
On 15 October 2014 16:58, Daniel Blankenberg dan@bx.psu.edu wrote:
Hi Nikos,
In the very least, you'll want to make sure that you have a bounding <actions></actions> tag set around your actions. It is probably also advisable to add a set of conditional/whens around the action, since you're only setting the dbkey under certain circumstances.
Thanks for using Galaxy,
Dan
On Oct 15, 2014, at 6:37 AM, Nikos Sidiropoulos nikos.sidiro@gmail.com wrote:
Hi all,
I'm trying to set the dbkey of an output file from the value (text) of a parameter.
The parameter I want to use is genome.
<conditional name="bedgraph">
    <param name="check" type="select" label="Produce BedGraph output"
           help="Can be displayed directly on UCSC browser. One file per normalisation method." >
        <option value="no" selected="True">No</option>
        <option value="yes">Yes</option>
    </param>
    <when value="yes">
        <param name="bed_file" type="data" format="bed" label="Transcripts in BED format"
               help="12 column BED file containing transcript definitions." />
        <param name="genome" type="text" label="Genome Build" help="E.g. hg19" />
        <param name="track_name" type="text" label="Track Name" size="20" value="Track Name" />
    </when>
    <when value="no" />
</conditional>
and this is how I've set the output:
<data format="bedgraph" name="bedgraph_slograt"
      label="${tool.name} on ${on_string}: Smoot Log2ratio (bedGraph)"
      from_work_dir="output_dir/slograt.bedgraph">
    <filter> bedgraph['check'] == 'yes' and slograt['check'] == 'yes' </filter>
    <action type="metadata" name="dbkey">
        <option type="from_param" name="bedgraph.genome" param_attribute="value" />
    </action>
</data>
When I run the tool the dbkey isn't set to the output file. Does anyone know a workaround?
Bests, Nikos
Message: 3
Date: Wed, 15 Oct 2014 13:05:55 -0400
From: John Chilton jmchilton@gmail.com
To: Jim Johnson jj@umn.edu
Cc: "galaxy-dev@lists.bx.psu.edu" galaxy-dev@lists.bx.psu.edu, Nuwan Goonasekera nuwan.goonasekera@unimelb.edu.au, Andrew Lonie alonie@unimelb.edu.au
Subject: Re: [galaxy-dev] snpeff tool for Galaxy extra_files_path
Message-ID: CANwbokevqaN=G_cDTet7E-s25cRZ2No1At-xaiAJmi_t_tSUTg@mail.gmail.com
Content-Type: text/plain; charset=UTF-8
Hey JJ,
Opened a pull request to stable with my best guess at the right to proceed and hopefully a best practice recommendation we can all get behind. Do you want to try it out and let me know if it fixes snpeff? (It does fix the velvet datatypes you contributed to Galaxy.)
https://bitbucket.org/galaxy/galaxy-central/pull-request/532/fix-for-datatypes-consuming-output-extra/diff
Dan, Bjoern - does this make sense - can we move forward with this approach ($input.extra_files_path for inputs and $output.files_path for outputs) as the best practices for how to reference these directories.
-John
On Wed, Oct 15, 2014 at 11:44 AM, Jim Johnson johns198@umn.edu wrote:
I agree with you about the inadvisable use of: input.dataset.*.
I'm looking at:
lib/galaxy/model/__init__.py

class Dataset( object ):
    ...
    def __init__( self, id=None, state=None, external_filename=None, extra_files_path=None, file_size=None, purgable=True, uuid=None ):
        ...
        self._extra_files_path = extra_files_path
        ...
    @property
    def extra_files_path( self ):
        return self.object_store.get_filename( self, dir_only=True, extra_dir=self._extra_files_path or "dataset_%d_files" % self.id )

I'm trying to see when self._extra_files_path gets set. Otherwise, would this return the path relative to the current file location of dataset?
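To make the fallback in the quoted property easier to follow, here is a small self-contained sketch. The Dataset class below copies only the expression from the excerpt above; FakeObjectStore, its root directory, and the file name it builds are assumptions invented for this example and say nothing about how Galaxy's real object store lays out files.

```python
import os


class FakeObjectStore:
    """Hypothetical stand-in for an object store rooted at one directory."""

    def __init__(self, root):
        self.root = root

    def get_filename(self, dataset, dir_only=False, extra_dir=None):
        if dir_only:
            # Only the directory for extra files is requested.
            return os.path.join(self.root, extra_dir)
        # Assumed file naming, for illustration only.
        return os.path.join(self.root, "dataset_%d.dat" % dataset.id)


class Dataset:
    def __init__(self, id, object_store, extra_files_path=None):
        self.id = id
        self.object_store = object_store
        self._extra_files_path = extra_files_path

    @property
    def extra_files_path(self):
        # `%` binds tighter than `or`, so the default directory name
        # is used only when _extra_files_path is None or empty.
        return self.object_store.get_filename(
            self,
            dir_only=True,
            extra_dir=self._extra_files_path or "dataset_%d_files" % self.id,
        )


store = FakeObjectStore("/data")
default_dir = Dataset(7, store).extra_files_path
custom_dir = Dataset(7, store, extra_files_path="custom").extra_files_path
```

In this sketch an unset _extra_files_path yields a directory named after the dataset id inside the store, so the result depends on where the store keeps the dataset rather than on the job working directory.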
On 10/15/14, 9:36 AM, John Chilton wrote:
Okay - so this is what broke things:
https://bitbucket.org/galaxy/galaxy-central/commits/d781366bc120787e201b73a4dd99b56282169d86
My feeling with the commit was that wrappers and tools should never be explicitly accessing paths explicitly through input.dataset.*. I think this would circumvent options like outputs_to_working_directory and break remote job execution through Pulsar. It also breaks the object store abstraction I think - which is why I made the change for Bjoern I guess.
I did not (and this was stupid on my part) realize that datatype code would be running on the remote host and accessing these model properties directly outside the abstractions setup by the wrappers supplied to cheetah code and so they have become out of sync as of that commit.
I am thinking somehow changing what the datatype code gets is the right approach and not fixing things by circumvent the wrapper and accessing properties directly on the dataset. Since you will find that doing this breaks things for Bjoern object store and could probably never run on usegalaxy.org say for the same reason.
Too many different competing deployment options all being incompatible with each other :(.
Will keep thinking about this and respond again.
-John
On Wed, Oct 15, 2014 at 9:39 AM, John Chilton jmchilton@gmail.com wrote:
JJ,
Arg this is a mess. I am very sorry about this - I still don't understand extra_files_path versus files_path myself. There are open questions on Peter's blast repo and no one ever followed up on my object store questions about this with Bjoern's issues a couple release cycles ago. We need to get these to work - write documentation explicitly declaring best practices we can all agree on and then write some tests to ensure things don't break in the future.
When you say your tools broke recently - can you say for certain which release broke these - the August14, October14, something older?
I'll try to do some more research and get back to you.
-John
On Tue, Oct 14, 2014 at 6:04 AM, Jim Johnson johns198@umn.edu wrote:
Andrew,
Thanks for investigating this. I changed the subject and sent to the galaxy dev list.
I've had a number of tools quit working recently. Particularly tools that inspect the extra_files_path when setting metadata, Defuse, Rsem, SnpEff.
I think there was a change in the galaxy framework: The extra_files_path when referenced from an input or output in the cheetah template sections of the tool config xml will be relative to the job working directory rather than the files location. I've just changed a few of my tools on my server yesterday
from: <param_name>.extra_files_path
to: <param_name>.dataset.extra_files_path
and they now work again.
Dan or John, is that the right way to handle this?
Thanks,
JJ
On 10/13/14, 9:29 PM, Andrew Lonie wrote:
Hi Jim. I am probably going about this the wrong way, but I am not clear on how to report tool errors (if in fact this is a tool error!)
I've been trialling your snpeff wrapper from the test toolshed and getting a consistent error with the SnpEff Download and SnpEff sub tools (the SnpSift dbNSFP works fine). The problem seems to be with an attribute declaration and manifests during database download as:
Traceback (most recent call last):
  File "/mnt/galaxy/galaxy-app/lib/galaxy/jobs/runners/__init__.py", line 564, in finish_job
    job_state.job_wrapper.finish( stdout, stderr, exit_code )
  File "/mnt/galaxy/galaxy-app/lib/galaxy/jobs/__init__.py", line 1107, in finish
    dataset.datatype.set_meta( dataset, overwrite=False ) # call datatype.set_meta directly for the initial set_meta call during dataset creation
  File "/mnt/galaxy/shed_tools/testtoolshed.g2.bx.psu.edu/repos/iuc/snpeff/1938721334b3/snpeff/lib/galaxy/datatypes/snpeff.py", line 21, in set_meta
    data_dir = dataset.files_path
AttributeError: 'HistoryDatasetAssociation' object has no attribute 'files_path'
We fiddled around with the wrapper, eventually replacing 'dataset.files_path' with 'dataset.extra_files_path' in snpeff.py, which fixed the download bug, but then SnpEff subtool itself threw a similar error when I tried to use that database from the history.
I chased up a bit more but cannot understand the various posts on files_path vs extra_files_path
I've shared a history with both of these errors here: http://130.56.251.62/galaxy/u/alonie/h/unnamed-history
Maybe this is a problem with our Galaxy image?
Any help appreciated!
Andrew
A/Prof Andrew Lonie University of Melbourne
-- James E. Johnson Minnesota Supercomputing Institute University of Minnesota
-- James E. Johnson Minnesota Supercomputing Institute University of Minnesota
Message: 4
Date: Wed, 15 Oct 2014 19:47:40 +0200
From: Björn Grüning bjoern.gruening@gmail.com
To: galaxy-dev@lists.bx.psu.edu, John Chilton jmchilton@gmail.com
Subject: Re: [galaxy-dev] snpeff tool for Galaxy extra_files_path
Message-ID: 543EB33C.3070607@gmail.com
Content-Type: text/plain; charset=windows-1252
Hi John,
glad to see this gets some attention!
Am 15.10.2014 um 19:05 schrieb John Chilton:
Hey JJ,
Opened a pull request to stable with my best guess at the right to proceed and hopefully a best practice recommendation we can all get behind. Do you want to try it out and let me know if it fixes snpeff? (It does fix the velvet datatypes you contributed to Galaxy.)
https://bitbucket.org/galaxy/galaxy-central/pull-request/532/fix-for-datatypes-consuming-output-extra/diff
Dan, Bjoern - does this make sense - can we move forward with this approach ($input.extra_files_path for inputs and $output.files_path for outputs) as the best practices for how to reference these directories.
I'm not sure why we need this distinction? Can we not simply choose one for both, inputs and outputs? Otherwise we need to explain it very well, why this is needed and I would vote to rename it to reflect that files_path can be only used by $outputs ...
Salve, Bjoern
-John
On Wed, Oct 15, 2014 at 11:44 AM, Jim Johnson johns198@umn.edu wrote:
I agree with you about the inadvisable use of: input.dataset.*.
I'm looking at:
lib/galaxy/model/__init__.py

class Dataset( object ):
    ...
    def __init__( self, id=None, state=None, external_filename=None, extra_files_path=None, file_size=None, purgable=True, uuid=None ):
        ...
        self._extra_files_path = extra_files_path
        ...
    @property
    def extra_files_path( self ):
        return self.object_store.get_filename( self, dir_only=True, extra_dir=self._extra_files_path or "dataset_%d_files" % self.id )
I'm trying to see when self._extra_files_path gets set. Otherwise, would this return the path relative to the current file location of dataset?
On 10/15/14, 9:36 AM, John Chilton wrote:
Okay - so this is what broke things:
https://bitbucket.org/galaxy/galaxy-central/commits/d781366bc120787e201b73a4dd99b56282169d86
My feeling with the commit was that wrappers and tools should never be explicitly accessing paths explicitly through input.dataset.*. I think this would circumvent options like outputs_to_working_directory and break remote job execution through Pulsar. It also breaks the object store abstraction I think - which is why I made the change for Bjoern I guess.
I did not (and this was stupid on my part) realize that datatype code would be running on the remote host and accessing these model properties directly outside the abstractions setup by the wrappers supplied to cheetah code and so they have become out of sync as of that commit.
I am thinking somehow changing what the datatype code gets is the right approach and not fixing things by circumvent the wrapper and accessing properties directly on the dataset. Since you will find that doing this breaks things for Bjoern object store and could probably never run on usegalaxy.org say for the same reason.
Too many different competing deployment options all being incompatible with each other :(.
Will keep thinking about this and respond again.
-John
On Wed, Oct 15, 2014 at 9:39 AM, John Chilton jmchilton@gmail.com wrote:
JJ,
Arg this is a mess. I am very sorry about this - I still don't understand extra_files_path versus files_path myself. There are open questions on Peter's blast repo and no one ever followed up on my object store questions about this with Bjoern's issues a couple release cycles ago. We need to get these to work - write documentation explicitly declaring best practices we can all agree on and then write some tests to ensure things don't break in the future.
When you say your tools broke recently - can you say for certain which release broke these - the August14, October14, something older?
I'll try to do some more research and get back to you.
-John
On Tue, Oct 14, 2014 at 6:04 AM, Jim Johnson johns198@umn.edu wrote:
Andrew,
Thanks for investigating this. I changed the subject and sent to the galaxy dev list.
I've had a number of tools quit working recently. Particularly tools that inspect the extra_files_path when setting metadata, Defuse, Rsem, SnpEff.
I think there was a change in the galaxy framework: The extra_files_path when referenced from an input or output in the cheetah template sections of the tool config xml will be relative to the job working directory rather than the files location. I've just changed a few of my tools on my server yesterday
from: <param_name>.extra_files_path
to: <param_name>.dataset.extra_files_path
and they now work again.
Dan or John, is that the right way to handle this?
Thanks,
JJ
On 10/13/14, 9:29 PM, Andrew Lonie wrote:
> Hi Jim. I am probably going about this the wrong way, but I am not clear on how to report tool errors (if in fact this is a tool error!)
>
> I've been trialling your snpeff wrapper from the test toolshed and getting a consistent error with the SnpEff Download and SnpEff sub tools (the SnpSift dbNSFP works fine). The problem seems to be with an attribute declaration and manifests during database download as:
>
> Traceback (most recent call last):
>   File "/mnt/galaxy/galaxy-app/lib/galaxy/jobs/runners/__init__.py", line 564, in finish_job
>     job_state.job_wrapper.finish( stdout, stderr, exit_code )
>   File "/mnt/galaxy/galaxy-app/lib/galaxy/jobs/__init__.py", line 1107, in finish
>     dataset.datatype.set_meta( dataset, overwrite=False ) # call datatype.set_meta directly for the initial set_meta call during dataset creation
>   File "/mnt/galaxy/shed_tools/testtoolshed.g2.bx.psu.edu/repos/iuc/snpeff/1938721334b3/snpeff/lib/galaxy/datatypes/snpeff.py", line 21, in set_meta
>     data_dir = dataset.files_path
> AttributeError: 'HistoryDatasetAssociation' object has no attribute 'files_path'
>
> We fiddled around with the wrapper, eventually replacing 'dataset.files_path' with 'dataset.extra_files_path' in snpeff.py, which fixed the download bug, but then SnpEff subtool itself threw a similar error when I tried to use that database from the history.
>
> I chased up a bit more but cannot understand the various posts on files_path vs extra_files_path
>
> I've shared a history with both of these errors here:
> http://130.56.251.62/galaxy/u/alonie/h/unnamed-history
>
> Maybe this is a problem with our Galaxy image?
>
> Any help appreciated!
>
> Andrew
>
> A/Prof Andrew Lonie
> University of Melbourne
-- James E. Johnson Minnesota Supercomputing Institute University of Minnesota
-- James E. Johnson Minnesota Supercomputing Institute University of Minnesota
Message: 5
Date: Wed, 15 Oct 2014 14:06:14 -0400
From: John Chilton jmchilton@gmail.com
To: Björn Grüning bjoern.gruening@gmail.com
Cc: "galaxy-dev@lists.bx.psu.edu" galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] snpeff tool for Galaxy extra_files_path
Message-ID: CANwbokfuD83XLiBR9EWsKLAPD+KEcq6SsAVM6yYeDnqJqqmeUQ@mail.gmail.com
Content-Type: text/plain; charset=UTF-8
On Wed, Oct 15, 2014 at 1:47 PM, Björn Grüning bjoern.gruening@gmail.com wrote:
Hi John,
glad to see this gets some attention!
Am 15.10.2014 um 19:05 schrieb John Chilton:
Hey JJ,
Opened a pull request to stable with my best guess at the right to proceed and hopefully a best practice recommendation we can all get behind. Do you want to try it out and let me know if it fixes snpeff? (It does fix the velvet datatypes you contributed to Galaxy.)
https://bitbucket.org/galaxy/galaxy-central/pull-request/532/fix-for-datatypes-consuming-output-extra/diff
Dan, Bjoern - does this make sense - can we move forward with this approach ($input.extra_files_path for inputs and $output.files_path for outputs) as the best practices for how to reference these directories.
I'm not sure why we need this distinction? Can we not simply choose one for both, inputs and outputs? Otherwise we need to explain it very well, why this is needed and I would vote to rename it to reflect that files_path can be only used by $outputs ...
I sympathize with you that this adds complexity - I really do. But if we do anything else we restrict the range of Galaxy versions these tools can target even further - and we still have to maintain backward compatibility on all of this junk anyway which is really weighing down the wrapper and now metadata code as well.
If you want input.files_path to work - that is fine - I wouldn't be eager for the change given the complexity it would add to the implementation but I would probably accept a pull request for that. If you want $input.input_files_path and $output.output_files_path to work - I would probably accept pull requests for those as well but I would not be excited. Finally, I don't personally really want to put the time in given my reservations and the benefits would not be so great I don't think because I would think it would be awhile before we could really recommend those as best practices anyway - given the range of Galaxy versions people run.
How about we reach an agreement that with a fictitious Tool 2.0 spec (https://trello.com/c/AWVobyv1) where we fix all the problems we will not grant access to $input.dataset directly and we will uniformly only allow $input.files_path and $output.files_path.
-John
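As a rough illustration of the Tool 2.0 idea proposed above (uniform files_path on inputs and outputs, no direct access to the underlying dataset), a wrapper could look something like the sketch below. DatasetWrapper and the example paths are hypothetical names made up for this example; this is not the wrapper Galaxy supplies to Cheetah templates.

```python
class DatasetWrapper:
    """Hypothetical wrapper handed to tool templates.

    It renders as the dataset's file path, exposes files_path, and
    refuses direct access to the underlying dataset object.
    """

    def __init__(self, dataset_path, files_path):
        self._dataset_path = dataset_path
        self.files_path = files_path

    def __str__(self):
        return self._dataset_path

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so
        # files_path above is unaffected.
        if name == "dataset":
            raise AttributeError(
                "direct access to the underlying dataset is not allowed"
            )
        raise AttributeError(name)


# Illustrative paths only.
wrapped = DatasetWrapper("/data/dataset_1.dat", "/job/dataset_1_files")
rendered = str(wrapped)
extra_dir = wrapped.files_path

try:
    wrapped.dataset
    dataset_blocked = False
except AttributeError:
    dataset_blocked = True
```

The point of such a design would be that templates can only reach paths the framework chooses to hand out, so options like outputs_to_working_directory or remote execution could substitute their own locations without tools noticing.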
Salve, Bjoern
-John
On Wed, Oct 15, 2014 at 11:44 AM, Jim Johnson johns198@umn.edu wrote:
I agree with you about the inadvisable use of: input.dataset.*.
I'm looking at:
lib/galaxy/model/__init__.py

class Dataset( object ):
    ...
    def __init__( self, id=None, state=None, external_filename=None, extra_files_path=None, file_size=None, purgable=True, uuid=None ):
        ...
        self._extra_files_path = extra_files_path
        ...
    @property
    def extra_files_path( self ):
        return self.object_store.get_filename( self, dir_only=True, extra_dir=self._extra_files_path or "dataset_%d_files" % self.id )
I'm trying to see when self._extra_files_path gets set. Otherwise, would this return the path relative to the current file location of dataset?
On 10/15/14, 9:36 AM, John Chilton wrote:
Okay - so this is what broke things:
https://bitbucket.org/galaxy/galaxy-central/commits/d781366bc120787e201b73a4dd99b56282169d86
My feeling with the commit was that wrappers and tools should never be explicitly accessing paths explicitly through input.dataset.*. I think this would circumvent options like outputs_to_working_directory and break remote job execution through Pulsar. It also breaks the object store abstraction I think - which is why I made the change for Bjoern I guess.
I did not (and this was stupid on my part) realize that datatype code would be running on the remote host and accessing these model properties directly outside the abstractions setup by the wrappers supplied to cheetah code and so they have become out of sync as of that commit.
I am thinking somehow changing what the datatype code gets is the right approach and not fixing things by circumvent the wrapper and accessing properties directly on the dataset. Since you will find that doing this breaks things for Bjoern object store and could probably never run on usegalaxy.org say for the same reason.
Too many different competing deployment options all being incompatible with each other :(.
Will keep thinking about this and respond again.
-John
On Wed, Oct 15, 2014 at 9:39 AM, John Chilton jmchilton@gmail.com wrote:
JJ,
Arg this is a mess. I am very sorry about this - I still don't understand extra_files_path versus files_path myself. There are open questions on Peter's blast repo and no one ever followed up on my object store questions about this with Bjoern's issues a couple release cycles ago. We need to get these to work - write documentation explicitly declaring best practices we can all agree on and then write some tests to ensure things don't break in the future.
When you say your tools broke recently - can you say for certain which release broke these - the August14, October14, something older?
I'll try to do some more research and get back to you.
-John
On Tue, Oct 14, 2014 at 6:04 AM, Jim Johnson johns198@umn.edu wrote:
> Andrew,
>
> Thanks for investigating this. I changed the subject and sent to the galaxy dev list.
>
> I've had a number of tools quit working recently. Particularly tools that inspect the extra_files_path when setting metadata, Defuse, Rsem, SnpEff.
>
> I think there was a change in the galaxy framework:
> The extra_files_path when referenced from an input or output in the cheetah template sections of the tool config xml will be relative to the job working directory rather than the files location.
> I've just changed a few of my tools on my server yesterday
> from: <param_name>.extra_files_path
> to: <param_name>.dataset.extra_files_path
> and they now work again.
>
> Dan or John, is that the right way to handle this?
> Thanks,
>
> JJ
>
> On 10/13/14, 9:29 PM, Andrew Lonie wrote:
>> Hi Jim. I am probably going about this the wrong way, but I am not clear on how to report tool errors (if in fact this is a tool error!)
>>
>> I've been trialling your snpeff wrapper from the test toolshed and getting a consistent error with the SnpEff Download and SnpEff sub tools (the SnpSift dbNSFP works fine). The problem seems to be with an attribute declaration and manifests during database download as:
>>
>> Traceback (most recent call last):
>>   File "/mnt/galaxy/galaxy-app/lib/galaxy/jobs/runners/__init__.py", line 564, in finish_job
>>     job_state.job_wrapper.finish( stdout, stderr, exit_code )
>>   File "/mnt/galaxy/galaxy-app/lib/galaxy/jobs/__init__.py", line 1107, in finish
>>     dataset.datatype.set_meta( dataset, overwrite=False ) # call datatype.set_meta directly for the initial set_meta call during dataset creation
>>   File "/mnt/galaxy/shed_tools/testtoolshed.g2.bx.psu.edu/repos/iuc/snpeff/1938721334b3/snpeff/lib/galaxy/datatypes/snpeff.py", line 21, in set_meta
>>     data_dir = dataset.files_path
>> AttributeError: 'HistoryDatasetAssociation' object has no attribute 'files_path'
>>
>> We fiddled around with the wrapper, eventually replacing 'dataset.files_path' with 'dataset.extra_files_path' in snpeff.py, which fixed the download bug, but then SnpEff subtool itself threw a similar error when I tried to use that database from the history.
>>
>> I chased up a bit more but cannot understand the various posts on files_path vs extra_files_path
>>
>> I've shared a history with both of these errors here:
>> http://130.56.251.62/galaxy/u/alonie/h/unnamed-history
>>
>> Maybe this is a problem with our Galaxy image?
>>
>> Any help appreciated!
>>
>> Andrew
>>
>> A/Prof Andrew Lonie
>> University of Melbourne
>
> --
> James E. Johnson Minnesota Supercomputing Institute University of Minnesota
-- James E. Johnson Minnesota Supercomputing Institute University of Minnesota
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Message: 6
Date: Wed, 15 Oct 2014 13:07:40 -0500
From: Jim Johnson johns198@umn.edu
To: John Chilton jmchilton@gmail.com
Cc: Jim Johnson jj@umn.edu, "galaxy-dev@lists.bx.psu.edu" galaxy-dev@lists.bx.psu.edu, Nuwan Goonasekera nuwan.goonasekera@unimelb.edu.au, Andrew Lonie alonie@unimelb.edu.au
Subject: Re: [galaxy-dev] snpeff tool for Galaxy extra_files_path
Message-ID: 543EB7EC.2070606@umn.edu
Content-Type: text/plain; charset=UTF-8; format=flowed
Looks good, John.
I tested with: https://testtoolshed.g2.bx.psu.edu/view/jjohnson/snpsift_dbnsfp_datatypes
lib/galaxy/datatypes/converters/tabular_to_dbnsfp.xml
reverting from hack:
<command interpreter="python">tabular_to_dbnsfp.py $input $dbnsfp.dataset.extra_files_path/dbNSFP.gz</command>
back to:
<command interpreter="python">tabular_to_dbnsfp.py $input $dbnsfp.files_path/dbNSFP.gz</command>
On 10/15/14, 12:05 PM, John Chilton wrote:
Hey JJ,
Opened a pull request to stable with my best guess at the right way to proceed and hopefully a best practice recommendation we can all get behind. Do you want to try it out and let me know if it fixes snpeff? (It does fix the velvet datatypes you contributed to Galaxy.)
https://bitbucket.org/galaxy/galaxy-central/pull-request/532/fix-for-datatypes-consuming-output-extra/diff
Dan, Bjoern - does this make sense? Can we move forward with this approach ($input.extra_files_path for inputs and $output.files_path for outputs) as the best practice for how to reference these directories?
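To make the proposed convention concrete, here is a minimal sketch of how a command line would be assembled if inputs expose extra_files_path and outputs expose files_path. The classes FakeInput and FakeOutput and the function build_converter_command are hypothetical stand-ins invented for illustration; they are not Galaxy's wrapper classes, and the paths are made up.

```python
class FakeInput:
    """Hypothetical stand-in for an input wrapper: exposes extra_files_path."""
    def __init__(self, file_name, extra_files_path):
        self.file_name = file_name
        self.extra_files_path = extra_files_path


class FakeOutput:
    """Hypothetical stand-in for an output wrapper: exposes files_path."""
    def __init__(self, files_path):
        self.files_path = files_path


def build_converter_command(tabular_input, dbnsfp_output):
    # Mirrors the shape of the converter command quoted above: the output's
    # directory is reached through files_path, never through .dataset.*
    return "python tabular_to_dbnsfp.py %s %s/dbNSFP.gz" % (
        tabular_input.file_name,
        dbnsfp_output.files_path,
    )


command = build_converter_command(
    FakeInput("/tmp/input.tabular", "/tmp/input_files"),
    FakeOutput("/tmp/job_1/dataset_2_files"),
)
print(command)
```

The point of the sketch is only that the tool never reaches through to the underlying dataset object, so whatever supplies the wrapper remains free to decide where the directory actually lives.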
-John
On Wed, Oct 15, 2014 at 11:44 AM, Jim Johnson johns198@umn.edu wrote:
I agree with you about the inadvisable use of: input.dataset.*.
I'm looking at:
lib/galaxy/model/__init__.py

class Dataset( object ):
    ...
    def __init__( self, id=None, state=None, external_filename=None, extra_files_path=None, file_size=None, purgable=True, uuid=None ):
        ...
        self._extra_files_path = extra_files_path
        ...
    @property
    def extra_files_path( self ):
        return self.object_store.get_filename( self, dir_only=True, extra_dir=self._extra_files_path or "dataset_%d_files" % self.id )
I'm trying to see when self._extra_files_path gets set. Otherwise, would this return the path relative to the current file location of the dataset?
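The fallback in that property can be exercised in isolation. The sketch below copies only the `or` expression from the excerpt; StubObjectStore is a hypothetical stand-in that simply joins a root directory with extra_dir, so it says nothing about how Galaxy's real object store resolves paths.

```python
import os


class StubObjectStore:
    """Hypothetical object store: joins a root with extra_dir, nothing more."""
    def __init__(self, root):
        self.root = root

    def get_filename(self, dataset, dir_only=False, extra_dir=None):
        return os.path.join(self.root, extra_dir)


class Dataset:
    def __init__(self, id, object_store, extra_files_path=None):
        self.id = id
        self.object_store = object_store
        self._extra_files_path = extra_files_path

    @property
    def extra_files_path(self):
        # Same expression as the excerpt: use the stored value if set,
        # otherwise fall back to the conventional "dataset_<id>_files" name.
        return self.object_store.get_filename(
            self,
            dir_only=True,
            extra_dir=self._extra_files_path or "dataset_%d_files" % self.id,
        )


store = StubObjectStore("/tmp/files")
default_path = Dataset(42, store).extra_files_path
explicit_path = Dataset(42, store, extra_files_path="job_work/extra").extra_files_path
print(default_path)
print(explicit_path)
```

Under these assumptions, an unset _extra_files_path yields a directory named after the dataset id, and whatever the object store does with that name decides the final location.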
On 10/15/14, 9:36 AM, John Chilton wrote:
Okay - so this is what broke things:
https://bitbucket.org/galaxy/galaxy-central/commits/d781366bc120787e201b73a4dd99b56282169d86
My feeling with the commit was that wrappers and tools should never be accessing paths explicitly through input.dataset.*. I think this would circumvent options like outputs_to_working_directory and break remote job execution through Pulsar. It also breaks the object store abstraction I think - which is why I made the change for Bjoern I guess.
I did not realize (and this was stupid on my part) that datatype code would be running on the remote host and accessing these model properties directly, outside the abstractions set up by the wrappers supplied to cheetah code, so the two have become out of sync as of that commit.
I am thinking that somehow changing what the datatype code gets is the right approach, rather than fixing things by circumventing the wrapper and accessing properties directly on the dataset. You will find that doing the latter breaks things for Bjoern's object store, and it could probably never run on, say, usegalaxy.org for the same reason.
Too many different competing deployment options all being incompatible with each other :(.
Will keep thinking about this and respond again.
-John
On Wed, Oct 15, 2014 at 9:39 AM, John Chilton jmchilton@gmail.com wrote:
JJ,
Arg, this is a mess. I am very sorry about this - I still don't understand extra_files_path versus files_path myself. There are open questions on Peter's blast repo and no one ever followed up on my object store questions about this with Bjoern's issues a couple of release cycles ago. We need to get these to work, write documentation explicitly declaring best practices we can all agree on, and then write some tests to ensure things don't break in the future.
When you say your tools broke recently - can you say for certain which release broke these - the August14, October14, something older?
I'll try to do some more research and get back to you.
-John
On Tue, Oct 14, 2014 at 6:04 AM, Jim Johnson johns198@umn.edu wrote:
Andrew,
Thanks for investigating this. I changed the subject and sent to the galaxy dev list.
I've had a number of tools quit working recently. Particularly tools that inspect the extra_files_path when setting metadata, Defuse, Rsem, SnpEff.
I think there was a change in the galaxy framework: the extra_files_path, when referenced from an input or output in the cheetah template sections of the tool config xml, will be relative to the job working directory rather than the files location. I've just changed a few of my tools on my server yesterday
from: <param_name>.extra_files_path
to: <param_name>.dataset.extra_files_path
and they now work again.
Dan or John, is that the right way to handle this? Thanks,
JJ
On 10/13/14, 9:29 PM, Andrew Lonie wrote:
> Hi Jim. I am probably going about this the wrong way, but I am not clear on how to report tool errors (if in fact this is a tool error!)
>
> I've been trialling your snpeff wrapper from the test toolshed and getting a consistent error with the SnpEff Download and SnpEff sub tools (the SnpSift dbNSFP works fine). The problem seems to be with an attribute declaration and manifests during database download as:
>
> Traceback (most recent call last):
>   File "/mnt/galaxy/galaxy-app/lib/galaxy/jobs/runners/__init__.py", line 564, in finish_job
>     job_state.job_wrapper.finish( stdout, stderr, exit_code )
>   File "/mnt/galaxy/galaxy-app/lib/galaxy/jobs/__init__.py", line 1107, in finish
>     dataset.datatype.set_meta( dataset, overwrite=False ) # call datatype.set_meta directly for the initial set_meta call during dataset creation
>   File "/mnt/galaxy/shed_tools/testtoolshed.g2.bx.psu.edu/repos/iuc/snpeff/1938721334b3/snpeff/lib/galaxy/datatypes/snpeff.py", line 21, in set_meta
>     data_dir = dataset.files_path
> AttributeError: 'HistoryDatasetAssociation' object has no attribute 'files_path'
>
> We fiddled around with the wrapper, eventually replacing 'dataset.files_path' with 'dataset.extra_files_path' in snpeff.py, which fixed the download bug, but then SnpEff subtool itself threw a similar error when I tried to use that database from the history.
>
> I chased up a bit more but cannot understand the various posts on files_path vs extra_files_path
>
> I've shared a history with both of these errors here: http://130.56.251.62/galaxy/u/alonie/h/unnamed-history
>
> Maybe this is a problem with our Galaxy image?
>
> Any help appreciated!
>
> Andrew
>
> A/Prof Andrew Lonie
> University of Melbourne
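The AttributeError in the traceback above can be reproduced with a toy object that has extra_files_path but no files_path. The sketch below is illustrative only: FakeHDA is a hypothetical stand-in for HistoryDatasetAssociation, and resolve_data_dir is an invented helper, not code from the snpeff wrapper. It is also not the fix discussed elsewhere in this thread, which concerns what the framework hands to datatype code rather than a workaround inside set_meta.

```python
class FakeHDA:
    """Hypothetical stand-in for a HistoryDatasetAssociation.

    Like the object in the traceback, it has extra_files_path but no files_path.
    """
    def __init__(self, extra_files_path):
        self.extra_files_path = extra_files_path


def resolve_data_dir(dataset):
    # Try both attribute names seen in the thread and return the first one
    # the object actually provides.
    for name in ("files_path", "extra_files_path"):
        path = getattr(dataset, name, None)
        if path is not None:
            return path
    raise AttributeError("dataset exposes neither files_path nor extra_files_path")


hda = FakeHDA("/tmp/dataset_1_files")

# Direct access fails in the same way as line 21 of snpeff.py.
try:
    hda.files_path
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

data_dir = resolve_data_dir(hda)
print(error_message)
print(data_dir)
```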
Message: 7
Date: Wed, 15 Oct 2014 20:50:56 +0000
From: Iry Witham Iry.Witham@jax.org
To: "galaxy-dev@lists.bx.psu.edu" galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] snpEff and java issue
Message-ID: D064566E.39080%iry.witham@jax.org
Content-Type: text/plain; charset="us-ascii"
Hi Team,
I had manually installed the latest version of snpEff based on Pablo's recommendation, and after modifying the XML files I had a working version of snpEff that produced a VCF. However, this morning I reran my workflow and it is failing again, this time with the following error:
Fatal error: Exit code 1 (Error)
Exception in thread "main" java.lang.UnsupportedClassVersionError: ca/mcgill/mcb/pcingola/snpEffect/commandLine/SnpEff : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:643)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
Could not find the main class: ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEff. Program will exit.
Nothing has been changed since I had success.
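For readers decoding the "major.minor version 51.0" in that error: a Java class file begins with the magic number 0xCAFEBABE, followed by a two-byte minor version and a two-byte major version, and major version 51 corresponds to Java 7. The sketch below reads those fields from an 8-byte header. The function name class_file_version and the sample header bytes are made up for illustration; to inspect a real class you would read the first 8 bytes of a .class file extracted from the jar.

```python
import struct

# Class-file major versions for the Java releases relevant to this error.
JAVA_BY_MAJOR = {50: "Java 6", 51: "Java 7", 52: "Java 8"}


def class_file_version(header):
    """Return (major, minor) from the first 8 bytes of a Java class file."""
    # Big-endian: 4-byte magic, 2-byte minor version, 2-byte major version.
    magic, minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major, minor


# Made-up header matching the version reported in the error (51.0).
sample_header = bytes.fromhex("cafebabe00000033")
major, minor = class_file_version(sample_header)
required = JAVA_BY_MAJOR.get(major, "unknown")
print("class file version %d.%d needs at least %s" % (major, minor, required))
```

A Java 6 runtime accepts class files only up to major version 50, which is consistent with the error appearing when a class compiled for Java 7 is loaded by an older JVM.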
Regards, Iry