Workflow step dependencies
Hello all,

At my organisation's instance of Galaxy, we have some connectors to talk to Chado (inserting/removing GenBank & GFF files, etc.).

However, for one of the workflows we plan to use, there is a script that must be run which has no input files and no output files (gmod_add_organism.pl, for those curious). This script *must* be run before another script (gmod_bulk_load_gff3.pl), or the database insert will fail.

So my question is: is there an easy way to specify dependencies when they depend on external factors, rather than just input/output files?

If there isn't a generalised mechanism to do this, should I just list input and output files that are never actually created or removed?

Cheers,
Eric

--
Eric Rasche
Programmer II
Center for Phage Technology
Texas A&M University
College Station, TX 77843
404-692-2048
esr@tamu.edu
rasche.eric@yandex.ru
On Mon, Oct 21, 2013 at 12:11 PM, Eric Rasche <rasche.eric@yandex.ru> wrote:
> Hello all,
>
> At my organisation's instance of Galaxy, we have some connectors to talk to Chado (inserting/removing GenBank & GFF files, etc.).

That sounds awesome! Is that open source, or are there plans to make this available?

> However, for one of the workflows we plan to use, there is a script that must be run which has no input files and no output files (gmod_add_organism.pl, for those curious). This script *must* be run before another script (gmod_bulk_load_gff3.pl), or the database insert will fail.
>
> So my question is: is there an easy way to specify dependencies when they depend on external factors, rather than just input/output files?

As far as I am aware, there is no such mechanism in place. I would create a Trello issue if you feel this should be added; it is certainly not an unreasonable request. My sense, however, is that the Galaxy concepts of a tool and a workflow are very tightly coupled to file inputs and outputs at this time, so I am not certain we will be able to get to this request quickly.

For now, my recommendation would be just to use a dummy input and output file. You can even create a file type that is only used for this purpose to limit confusion. Can you find some logging statements or something to populate this file with?

Hope this helps, and let us know if there are any additional questions,
-John

> If there isn't a generalised mechanism to do this, should I just list input and output files that are never actually created or removed?
>
> Cheers,
> Eric
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
John,

On 10/21/2013 12:31 PM, John Chilton wrote:
> On Mon, Oct 21, 2013 at 12:11 PM, Eric Rasche <rasche.eric@yandex.ru> wrote:
>> Hello all,
>>
>> At my organisation's instance of Galaxy, we have some connectors to talk to Chado (inserting/removing GenBank & GFF files, etc.).
>
> That sounds awesome! Is that open source, or are there plans to make this available?

Yes! Parts of it are already open source, and parts of it are scheduled to be. The Perl scripts I use in Galaxy mostly wrap a Chado library I threw together. You can see it/fork it here: https://github.com/erasche/charm

It's in the process of being rewritten, but it handles most of the stuff you'd want to do with Chado, like creating new instances for new users, backing them up, and cloning them.

The gmod*.pl scripts are just what is distributed with Chado. Nothing special there. Eventually I'll rework those and distribute them with Galaxy config files.

>> However, for one of the workflows we plan to use, there is a script that must be run which has no input files and no output files (gmod_add_organism.pl, for those curious). This script *must* be run before another script (gmod_bulk_load_gff3.pl), or the database insert will fail.
>>
>> So my question is: is there an easy way to specify dependencies when they depend on external factors, rather than just input/output files?
>
> As far as I am aware, there is no such mechanism in place. I would create a Trello issue if you feel this should be added; it is certainly not an unreasonable request. My sense, however, is that the Galaxy concepts of a tool and a workflow are very tightly coupled to file inputs and outputs at this time, so I am not certain we will be able to get to this request quickly.
>
> For now, my recommendation would be just to use a dummy input and output file. You can even create a file type that is only used for this purpose to limit confusion. Can you find some logging statements or something to populate this file with?
>
> Hope this helps, and let us know if there are any additional questions,
> -John

Okay. If there's no mechanism in place, that's fine. I can surely find some logging output to put into a file and use that to couple dependencies.

I've created a Trello card, as per your request, but this is really a low-priority issue. https://trello.com/c/h5qZlgU8/1201-allowing-workflow-step-dependencies-when-...

Thanks,
Eric
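
The dummy-file workaround agreed on in this thread could look roughly like the Python sketch below. It is only an illustration under stated assumptions, not Galaxy's API and not the wrappers Eric actually wrote: the function names `run_and_record` and `run_after` are hypothetical, and the two commands are stand-ins, since the thread does not give the arguments for gmod_add_organism.pl or gmod_bulk_load_gff3.pl. The idea is that the first step captures its log output into a marker file, and the second step takes that file as an input, so a file-based workflow engine has something to order the steps by.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_record(command, marker_path):
    """Run `command`; on success, write its output to `marker_path`.

    The marker file is the "dummy" dataset. It is created only when the
    command succeeds, so its existence means the step has completed.
    (Hypothetical helper, not part of Galaxy.)
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the captured log
        text=True,
    )
    if result.returncode != 0:
        # Surface the log and fail without creating the marker file.
        sys.stderr.write(result.stdout)
        raise RuntimeError(f"{command[0]} exited with status {result.returncode}")
    Path(marker_path).write_text(result.stdout)
    return marker_path


def run_after(marker_path, command, log_path):
    """Run `command` only if the upstream marker file exists."""
    if not Path(marker_path).is_file():
        raise FileNotFoundError(f"upstream marker {marker_path} is missing")
    return run_and_record(command, log_path)


if __name__ == "__main__":
    # Stand-in commands (assumption): replace with the real script calls.
    first = [sys.executable, "-c", "print('organism added')"]
    second = [sys.executable, "-c", "print('gff3 loaded')"]
    with tempfile.TemporaryDirectory() as workdir:
        marker = run_and_record(first, str(Path(workdir) / "add_organism.log"))
        run_after(marker, second, str(Path(workdir) / "bulk_load.log"))
```

In a Galaxy setting, the first wrapper's tool definition would presumably declare the marker file as its output dataset and the second would declare it as an extra input, optionally with a dedicated file type as John suggests, so that connecting the two in the workflow editor enforces the order.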
participants (2)
- Eric Rasche
- John Chilton