On 24.04.2015 03:18, Keith Suderman wrote:
Hi Björn,

On Apr 22, 2015, at 8:00 AM, Björn Grüning <bjoern.gruening@gmail.com> wrote:

Do you have a beer preference?  

Outing: I'm one of the rare Germans that do not drink alcohol ;)

That must be awkward ;)

:)


This can be done via the ToolShed. I assume your custom command
interpreter is no different from python or perl as an interpreter?

One difference is that my interpreter is a Java program. I likely should have mentioned that little detail... anyone wanting to install our tools would need my interpreter AND Java 1.7+ on their server.  Hopefully that is not an insurmountable problem.

In Galaxy we advise people to have a JRE around if they use the TS. Currently, the TS is not able to install Java. I was not able to compile Java on my own :(
https://wiki.galaxyproject.org/Admin/Config/ToolDependenciesList

So this is ok!

However, does the bioinformatics community really want a bunch of NLP tools in their tool shed?

Yes, Yes, Yes!

The editor also allows me to select output formats that have no converters defined,
so either I am still missing something or the workflow editor does not do what I want.  I can convert formats through the "Edit attributes" menu,
so Galaxy knows about my converters and how to invoke them, just not in the workflow editor.

Ok, I think I understood. Not sure if this is the best way but put your
converters into the toolbox.

By the "toolbox" do you mean adding my converters to the tool_conf.xml file so they are available on the Tools menu?  I have done that and I can add the converters to a workflow manually. I was just hoping the workflow editor could detect when it could perform the conversion and insert the converters as needed; it seems this is not possible.

Maybe someone else can jump in here; I do not see why this shouldn't be possible. Maybe this is just a UI issue?!

Do you have more pointers to tools that use the attached metadata?  In particular tools that set metadata that is consumed by subsequent tools.

The sqlite datatype should be a good example. Keep in mind, we cannot
set metadata from inside a tool. Imho this is not possible yet, but it is a
commonly requested feature. But you can "calculate" such metadata inside
your datatype definition and set it implicitly after your tool is finished.
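
For illustration, here is a minimal Python sketch of "calculating" such a flag from the file contents. The function name has_tokens and the <token> element are assumptions made up for this example; they are not part of Galaxy or of the actual tokenizer output. In a real datatype, a check like this would presumably be called from the datatype's set_meta method, which is not shown here.

```python
import xml.etree.ElementTree as ET


def has_tokens(path, tag="token"):
    """Return True if the XML file at `path` contains at least one <tag> element.

    `tag` is a hypothetical element name; adjust it to whatever the
    tokenizer really writes. Namespaced tags are not handled.
    """
    try:
        with open(path, "rb") as handle:
            # iterparse streams the file, so large documents are not
            # loaded into memory at once.
            for _event, elem in ET.iterparse(handle, events=("end",)):
                if elem.tag == tag:
                    return True
                elem.clear()
    except ET.ParseError:
        # Malformed XML: report "no tokens" rather than failing the job.
        return False
    return False
```

The result could then be stored as the value of a metadata element such as "tokens", so that a downstream tool can test it.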

Setting the metadata in the tool wrapper is fine, and after grepping through some of the other wrappers I think I need something like:

  <!-- Output from a tokenizer -->
  <outputs>
    <data name="output" format="xml" label="Output">
        <actions>
            <action type="metadata" name="tokens">True</action>
        </actions>
    </data>
  </outputs>

  <!-- Input to a part of speech tagger -->
  <inputs>
    <param name="input" type="data" format="xml">
        <validator type="expression" message="Please run a tokenizer first.">metadata.tokens is not None</validator>
    </param>
  </inputs>

That is, the input validator simply checks if some value has been set in the metadata, and the output sets a value in the metadata.  The above does not work, but at least Galaxy stopped complaining about the tool XML with this.  However, the documentation for <option/> <filter/> and <action/> does not match up with what existing wrappers (in the dev branch) are doing so I am having problems with the exact syntax.

Can you try:         <action type="metadata" name="tokens" default="True"/>

You can also filter your inputs in the speech tagger:

 <options options_filter_attribute="metadata.tokens" > <filter type="add_value" value="True" /> </options>



Do you have pointers to any documentation on data collections?  My searches haven't turned up much but tantalizing references [1],
and my experiments trying to return a data collection from a tool have been unsuccessful.

https://wiki.galaxyproject.org/Histories?highlight=%28collection%29#Dataset_Collections

https://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax ->
data_collection

And have a look at:
https://github.com/galaxyproject/galaxy/tree/dev/test/functional/tools

Success!  I was running the code from master, so I suspect that was part of my problem. 

Nice!

However, my browser is still complaining about long running scripts.

Can you put this in a different thread?

A script on this page may be busy, or it may have stopped responding. You can stop the script now, open the script in the debugger, or let the script continue.


I accidentally left visible="true" when creating the dataset collection and ended up with +1500 items in my history; the above message kept popping up while the workflow was running (at least until I selected "Don't show this again").  Deleting +1500 datasets from the history is also very slow, but that is a different issue. On the bright side, at least I had +1500 items in the history to delete.

+1500 different elements is a lot for a history; for usability we should try to use collections here. No one wants to deal with such an amount of history objects :)


I have also been trying John Chilton's blend4j and managed to populate a data library, and this is almost what I want,
but I would like a tool that can be included in a workflow as the data from the library may not necessarily be the first step.   
I have no problem calling the Galaxy API from my tools, except that between the bioinformatics lingo and Python (I'm a Java programmer) it's slow going.

If at all possible you should avoid this, but as a last resort it is
probably an option.

Out of curiosity, what exactly should I avoid; making calls to the Galaxy REST/API from inside a tool, using blend4j, or populating a data library from inside a tool?  I can see myself doing all three in the near future.

* making calls to the Galaxy REST/API from inside a tool

Think big! Your tools will run in large cluster environments, on job schedulers and cloud infrastructures. You don't know whether your job is allowed to connect to your Galaxy instance, security-wise. You also need to authenticate, which raises more issues ...
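
To make the connectivity concern concrete, here is a minimal Python sketch of a check a tool could run before attempting any API call. The function name galaxy_reachable, the base URL, and the /api/version path are assumptions for this example; verify the endpoint against your Galaxy release. Authentication is not covered at all.

```python
import urllib.error
import urllib.request


def galaxy_reachable(base_url, timeout=5, opener=urllib.request.urlopen):
    """Return True if an HTTP GET against the Galaxy instance succeeds.

    `base_url` and the `/api/version` path are assumptions; `opener` is
    injectable so the check can be exercised without a network.
    """
    url = base_url.rstrip("/") + "/api/version"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Firewalled compute node, DNS failure, timeout, refused connection.
        return False
```

A tool that gets False back from a compute node should fail with a clear message rather than hang, since the job may simply not be allowed to reach the Galaxy server.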

Ciao,
Bjoern

Cheers,
Keith



REFERENCES

1. https://wiki.galaxyproject.org/Learn/API#Collections
2. https://wiki.galaxyproject.org/Admin/Tools/Multiple%20Output%20Files#Number_of_Output_datasets_cannot_be_determined_until_tool_run


Oh yes this is supported out of the box!
See here for some brief documentation:
https://github.com/bgruening/galaxytools/tree/master/chemicaltoolbox#supported-filetypes

Here is an example of how you can write your own datatypes:

https://github.com/bgruening/galaxytools/tree/master/chemicaltoolbox/datatypes

I feel like I must be missing the obvious.  Here is the relevant section of my datatypes_conf.xml (you can see the full file at https://github.com/oanc/Galaxy/blob/master/config/datatypes_conf.xml)

<datatype extension="lif" type="galaxy.datatypes.text:Json" display_in_upload="True">
<converter file="convert.json2gate_2.0.0.xml" target_datatype="gate"/>
</datatype>
<datatype extension="gate" type="galaxy.datatypes.xml:GenericXml" mimetype="application/xml" display_in_upload="true">
<converter file="convert.gate2json_2.0.0.xml" target_datatype="lif"/>
</datatype>

Is there anything I need to do beyond defining the datatypes for implicit conversions to take place?

I guess you need to place your converters under
https://github.com/oanc/Galaxy/tree/master/lib/galaxy/datatypes/converters/

And get rid of 'convert.' in your datatypes_conf.xml, at least if you are
not using the TS.
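
One way to check this kind of setup is to list every converter file that datatypes_conf.xml references but that is not present in the converters directory. The following is a minimal Python sketch; the function name missing_converters is made up for this example, and it assumes the <datatype> entries sit under a single root element and that converter files are looked up directly in the given directory.

```python
import os
import xml.etree.ElementTree as ET


def missing_converters(datatypes_xml, converters_dir):
    """Return (extension, target, file) for each <converter> whose file is absent.

    Assumes `datatypes_xml` is a well-formed XML string with one root
    element, and that converter files live directly in `converters_dir`.
    """
    root = ET.fromstring(datatypes_xml)
    missing = []
    for datatype in root.iter("datatype"):
        for converter in datatype.findall("converter"):
            filename = converter.get("file", "")
            if not os.path.isfile(os.path.join(converters_dir, filename)):
                missing.append(
                    (
                        datatype.get("extension"),
                        converter.get("target_datatype"),
                        filename,
                    )
                )
    return missing
```

Called with the contents of datatypes_conf.xml and the path to lib/galaxy/datatypes/converters, an empty list means every referenced converter file was found.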

Hope this helps you a little bit more,
Bjoern

Thanks,
Keith Suderman



4. OAuth 2.0 / OpenID Connect:

I need to be able to fetch documents from data providers that require an OAuth 2.0 access token. Currently, I use a separate service to go
through the OAuth authentication/authorization process and then have the user copy/paste their access token into a text field in Galaxy.   
Is there a way to perform the OAuth authentication dance required by the remote service inside Galaxy itself?   

I don't think so, but maybe someone else has an idea here.
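
For the copy/paste workaround described above, the tool-side part can stay small. Here is a minimal Python sketch of attaching a pasted access token to a request as a bearer token; the function name bearer_request and the example URL are assumptions, and the data provider may expect the token in some other form.

```python
import urllib.request


def bearer_request(url, access_token):
    """Build a GET request carrying an OAuth 2.0 bearer token (RFC 6750)."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Bearer " + access_token)
    return request
```

The document would then be fetched with urllib.request.urlopen(bearer_request(url, token)). This does not replace the authorization flow itself, which still happens outside Galaxy.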

I’ve looked at the Trello site for Galaxy and see that both OAuth 2.0 and OpenID Connect are on the radar, hopefully this use case is being considered as well.

I’m sure to have more questions after working through some visualization examples, but this should keep me busy for now.

Hope you are busy now :)
Cheers and keep us up to date!
Bjoern

Sincerely,
Keith Suderman

REFERENCES

1. https://wiki.galaxyproject.org/Admin/Tools/AddingTools

------------------------------
Research Associate
Department of Computer Science
Vassar College
Poughkeepsie, NY


___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/


