Re: [galaxy-dev] referring to tool_data_tables[] structure (John Chilton)
First, I am not the best person to respond to this - I hope someone like Dan or JJ can follow up. In my past tool development I have tried to limit my use of .loc files because they can be impediments to reproducibility (they are getting better though).

I have a pet peeve - it is when I ask a general question with specific examples and then people just provide alternative ideas to the specific examples. I am about to do that to you - sorry :(. This is because I don't really think tools should be breaking these abstractions. Instantiating app in the previous example would require, for instance, the compute nodes to have access to the database - which they should not have. You could pass $__root__ into your tool and read the XML file $__root__/tool_data_table_conf.xml directly - I would still avoid this, but it is better than trying to instantiate Galaxy internals from your tool.

As mentioned though, I think there are better approaches for your two use cases.

In your first case - that spec is not really something that is going to vary from site to site, right? It is fixed data - so I would just place a copy with your tool and resolve references to it relative to your tool wrapper. Most scripting languages have a way to get the path to the current file - for instance in Python you can do something like os.path.join( os.path.dirname( __file__ ), 'ncbi_columns_spec.txt' ). This should work with manual installs as well as with the tool shed.

In the second case - I think that data managers are going to be the best way to handle this going forward. I believe they can dynamically update these .loc files without restarting Galaxy (... though there are probably some caveats to that). I don't know if they can be driven by the API - but I think I remember Dan mentioning they are just normal tools, so the tools API should work???
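A minimal sketch of the first suggestion, for illustration only: it resolves a spec file stored next to the wrapper script and parses it into a dictionary. The file name comes from the message above; the helper names (spec_path, parse_spec), the ten column names (copied from the header of the spec quoted in this thread), and the assumption that the file is tab-separated with '#' comment lines are illustrative, not part of Galaxy.

```python
import csv
import io
import os

# Column names copied from the header line of the spec quoted in this thread.
COLUMNS = ["value", "type", "subtype", "sort", "filter",
           "default", "min", "max", "choose", "name"]


def spec_path(filename="ncbi_columns_spec.txt"):
    """Return the path of a spec file stored next to this script."""
    return os.path.join(os.path.dirname(os.path.abspath(__file__)), filename)


def parse_spec(handle):
    """Parse a tab-separated spec, skipping blank lines and '#' comments."""
    rows = (line for line in handle if line.strip() and not line.startswith("#"))
    spec = {}
    for fields in csv.reader(rows, delimiter="\t"):
        # Pad short rows so every record carries all ten keys.
        fields += [""] * (len(COLUMNS) - len(fields))
        record = dict(zip(COLUMNS, fields))
        spec[record["value"]] = record
    return spec


if __name__ == "__main__":
    # Inline sample (assumed tab-separated) so the sketch runs without a file.
    sample = io.StringIO(
        "#value\ttype\tsubtype\tsort\tfilter\tdefault\tmin\tmax\tchoose\tname\n"
        "pident\tnumeric\tfloat\t1\t1\t97\t90\t100\t1\tPercentage of identical matches\n"
    )
    print(parse_spec(sample)["pident"])
```

In a real wrapper the call would look like `with open(spec_path()) as handle: spec = parse_spec(handle)`, which keeps the tool independent of Galaxy internals and of the server's working directory.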
https://wiki.galaxyproject.org/Admin/Tools/DataManagers/HowTo/Define
https://github.com/peterjc/galaxy_blast/issues/22

If there is something they do not currently do but need to, Trello cards can be created and community contributions considered, because data managers are going to be the best path forward for dynamically updating .loc files.

Hope this helps!

-John

On Mon, Jan 13, 2014 at 12:18 PM, Dooley, Damion <Damion.Dooley@bccdc.ca> wrote:
Hi John,
Thanks for the feedback. Righto, I saw abundant use of "trans" in core Galaxy code; the problem is that I wanted access from a tool wrapper's Python code. Basically I'm trying to get more out of .loc files, for example this field specification for blast report data:
#value type subtype sort filter default min max choose name
# Remember to edit tool_data_table_conf.xml for column spec!
length numeric int 1 1 1 Alignment length
qstart numeric int 1 1 1 Alignment start in query
qend numeric int 1 1 1 Alignment end in query
sstart numeric int 1 1 1 Alignment start in subject
send numeric int 1 1 1 Alignment end in subject
qseq text atgc 0 1 1 Aligned part of query sequence
sseq text atgc 0 1 1 Aligned part of subject sequence
mseq text atgc 0 1 1 Alignment, matched part
pident numeric float 1 1 97 90 100 1 Percentage of identical matches
...
This data is being accessed in our tool xml code via a tool_data_table entry and is handily providing field lists for sorting, filtering, and searching input data.
In our python code I'd like to just say
blastfieldspec = app.tool_data_tables[ 'blast_report_fields' ]
And then go to town on sorting, filtering, validation etc. as desired in Python, using this spec. I don't want the Python code to be specifying the .loc path directly (which I have to do now); I'd much rather take advantage of what tool_data_tables could provide.
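For illustration, a hedged sketch of the fallback raised in the reply above (reading tool_data_table_conf.xml directly, which John says he would still avoid): it looks up a table by name and returns its column list and .loc file paths without instantiating app. The element layout assumed here (tables/table/columns/file) and the sample path are assumptions about the configuration file, and find_table is a hypothetical helper, not a Galaxy function.

```python
import xml.etree.ElementTree as ET

# Sample configuration, inlined so the sketch runs on its own. The table name
# is from this thread; the column list and file path are placeholders.
SAMPLE_CONF = """\
<tables>
    <table name="blast_report_fields" comment_char="#">
        <columns>value, type, subtype, sort, filter, default, min, max, choose, name</columns>
        <file path="tool-data/blast_report_fields.loc" />
    </table>
</tables>
"""


def find_table(conf_xml, table_name):
    """Return (columns, loc_paths) for table_name, or None if it is not defined."""
    root = ET.fromstring(conf_xml)
    for table in root.findall("table"):
        if table.get("name") == table_name:
            columns = [c.strip() for c in table.findtext("columns", "").split(",")]
            paths = [f.get("path") for f in table.findall("file")]
            return columns, paths
    return None


if __name__ == "__main__":
    print(find_table(SAMPLE_CONF, "blast_report_fields"))
```

A wrapper that receives $__root__ on its command line could read the real file from that directory and pass its contents to find_table, so the .loc path is taken from the configuration rather than hard-coded.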
Our second desired use of tool_data_tables info is a case where some tools would be making use of 3rd party datasets that another tool manages. We want to set up a tool/system that manages 3rd party reference databases (e.g. for particular specialized gene "universal target" databases like Chaperonin cpn60 or Legionella mip). This system would periodically get and process fasta data online from sources listed in a .loc file. We'd process and use these databases in pulldown menus, but each of these reference databases will need management through a Galaxy interface via a plugin tool, I guess. Someone has done a prototype of this for databases specific to NCBI. We'd target other niche data sources. I realise another hurdle is getting modified .loc file info refreshed back into Galaxy without having to stop/restart the server. Hoping an extension to the tool_data_tables class could do this too.
Regards,
Damion
----------------------------------------------------------------------
Message: 1
Date: Fri, 10 Jan 2014 16:56:56 -0800
From: "Dooley, Damion" <Damion.Dooley@bccdc.ca>
To: "galaxy-dev@lists.bx.psu.edu" <galaxy-dev@lists.bx.psu.edu>
Subject: [galaxy-dev] referring to tool_data_tables[] structure
Message-ID: <7891813F3C8F424B97D8BF2E5600E51903301B88EFFD@VEXCCR02.phsabc.ehcnet.ca>
Content-Type: text/plain; charset="us-ascii"
I've seen
$__app__.tool_data_tables[ 'all_fasta' ].get_fields() )[0][-1]
in tool xml templates. Is there a way I can access the tool_data_tables structure from python code too?
I see all the initialization stuff happening in https://bitbucket.org/abrenner/galaxy-central/src/f3e736fe03df3a6dd5438c12ba... But not seeing where one can access any of these app variables?
Regards,
Damion
------------------------------
Message: 2
Date: Fri, 10 Jan 2014 20:42:50 -0600
From: John Chilton <chilton@msi.umn.edu>
To: "Dooley, Damion" <Damion.Dooley@bccdc.ca>
Cc: "galaxy-dev@lists.bx.psu.edu" <galaxy-dev@lists.bx.psu.edu>
Subject: Re: [galaxy-dev] referring to tool_data_tables[] structure
Message-ID: <CANwbokeUZ4KfB0+9Z9RVk0rtdMtE1iwmg+e3uo8yTUOXfbLj9w@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Just to clarify, do you want to access them from Galaxy web server code or from a Galaxy tool wrapper written in Python?
Nearly every part of the Galaxy source code has access to app and everything inside of it, for instance all controller methods take in a trans variable that contains a reference to app (trans.app).
I suspect you want to access it from a tool wrapper though? This is not possible. If this is the case - what are you hoping to accomplish? Do you want access to the all_fasta data table information in the example?
-John