Repository installation on multiple servers
Hi Everyone,
This is my first time on the mailing list. I am the administrator for a production installation of Galaxy at Agriculture and Agri-Food Canada in Ottawa, Canada. Currently, I have Galaxy configured to start 3 web servers and 3 handlers. When installing a tool from a toolshed, only one of the three web servers sees the installation (the one that processed the installation request). To make the tool visible on the other servers, Galaxy needs to be restarted, which reloads all the tools. Is this the expected behaviour when a proxy server (Apache) load balances 3 Galaxy web servers? If not, where can I start diagnosing the problem?
Your assistance is much appreciated. Thank you.
Regards,
Iyad Kandalaft Bioinformatics Programmer Microbial Biodiversity Bioinformatics Science & Technology Branch Agriculture & Agri-Food Canada Iyad.Kandalaft@agr.gc.ca | (613) 759-1228
I had the same concern, and heard that this is being worked on.
Meanwhile, I’m using a rolling restart method from pjbriggs:
https://github.com/pjbriggs/galaxy-admin-utils/
I’m pretty happy with this (despite the 2-minute restart per worker) because users do not see any interruption during restarts, as they did before.
Brad -- Brad Langhorst, Ph.D. Applications and Product Development Scientist
On May 23, 2014, at 8:46 AM, Kandalaft, Iyad <Iyad.Kandalaft@AGR.GC.CA> wrote:
When installing a tool from a toolshed, only one of the three servers sees the installation (the one that processed the installation request). To resolve the issue, galaxy needs to be restarted, which reloads all the tools.
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
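The rolling restart workaround Brad describes can be sketched as follows. This is only a minimal illustration of the idea, not the code from pjbriggs' galaxy-admin-utils: the function `rolling_restart`, its `restart` and `is_ready` callbacks, and the server names in the usage note are all hypothetical, and the 180-second default timeout is an assumption loosely based on the roughly 2-minute restart per worker mentioned above.

```python
import time
from typing import Callable, Iterable, List


def rolling_restart(
    servers: Iterable[str],
    restart: Callable[[str], None],
    is_ready: Callable[[str], bool],
    timeout: float = 180.0,
    poll_interval: float = 1.0,
) -> List[str]:
    """Restart servers one at a time, waiting for each to come back.

    Only one server is down at any moment, so a load balancer in front
    can keep routing requests to the remaining ones.
    """
    restarted = []
    for name in servers:
        restart(name)  # hypothetical callback: stop and start one process
        deadline = time.monotonic() + timeout
        # Do not touch the next server until this one answers again.
        while not is_ready(name):
            if time.monotonic() > deadline:
                raise TimeoutError(f"{name} did not come back within {timeout}s")
            time.sleep(poll_interval)
        restarted.append(name)
    return restarted
```

In practice `restart` would wrap whatever command stops and starts a single Galaxy process, and `is_ready` would probe that process (for example by connecting to its port). A call might look like `rolling_restart(["web0", "web1", "web2"], restart, is_ready)`, where the names are placeholders for your own server names.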
Hello Iyad and Brad,
Yes, this is the current expected behavior. There is a Trello card that tracks progress on this issue:
http://trello.com/c/B0pV80d0/1281-messaging-and-task-queue
This work is currently in progress, with a baseline implementation available in the upcoming Galaxy release (currently scheduled for June 2). I'm not involved in this work, so I'm not quite sure whether this baseline implementation will handle communication between Galaxy web front-ends like you have (and like we have for our public Galaxy instances), but this should be clarified in the upcoming release notes.
Greg Von Kuster
On May 23, 2014, at 8:56 AM, "Langhorst, Brad" <Langhorst@neb.com> wrote:
I had the same concern, and heard that this is being worked on.
Thank you Greg and Brad,
That is a great tool (rolling restarts) and great news (working on inter-process communication)!
Pardon my ignorance, but the rolling restart command refers to a "manager", web, and handler. What are "manager" Galaxy processes?
I've had issues where restarting all Galaxy web/handler processes at the same time results in workflows being terminated: steps that are currently executing continue until completion, but subsequent steps of the workflow receive an (!) icon and indicate that the "none" job status is unknown. I configured Galaxy with the DRMAA runner to an SGE (OGS) cluster. Does the rolling restart method "fix" this issue?
Regards,
Iyad Kandalaft Bioinformatics Programmer Microbial Biodiversity Bioinformatics | Science & Technology Branch Agriculture & Agri-Food Canada Iyad.Kandalaft@agr.gc.ca | (613) 759-1228
-----Original Message-----
From: Greg Von Kuster [mailto:greg@bx.psu.edu]
Sent: Friday, May 23, 2014 9:08 AM
To: Langhorst, Brad
Cc: Kandalaft, Iyad; galaxy-dev@bx.psu.edu
Subject: Re: [galaxy-dev] Repository installation on multiple servers
Yes, this is the current expected behavior. There is a Trello card here that tracks progress on this issue. http://trello.com/c/B0pV80d0/1281-messaging-and-task-queue
We have a pretty busy cluster, also SGE via DRMAA. I do not notice jobs getting lost. I did not notice that before switching to the rolling restart method either, though.
I have a vague memory of a config option to keep job state in the database … not sure about that.
Brad -- Bradley W. Langhorst, Ph.D. Applications and Product Development Scientist
On May 23, 2014, at 9:11 AM, Kandalaft, Iyad <Iyad.Kandalaft@AGR.GC.CA> wrote:
I configured Galaxy with the DRMAA runner to an SGE (OGS) cluster. Does the rolling restart method "fix" this issue?
On Fri, May 23, 2014 at 9:07 AM, Greg Von Kuster <greg@bx.psu.edu> wrote:
I'm not quite sure whether this baseline implementation will handle communication between Galaxy web front-ends like you have (and like we have for our public Galaxy instances).
Yep, it sure does. There's still a little bit of work to do on it, but it'll be in the release.
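To illustrate why messaging between processes removes the need for a restart, here is a toy sketch of the general idea. It is not Galaxy's implementation, whose details are not described in this thread: the `WebProcess` class, the `"reload_tools"` message, and the `broadcast` helper are all hypothetical, and in-memory queues stand in for a real message broker.

```python
import queue
from typing import List


class WebProcess:
    """Stand-in for one web process holding its own in-memory tool list."""

    def __init__(self, name: str, shared_tools: List[str]) -> None:
        self.name = name
        # Stands in for the tool configuration shared on disk.
        self._shared_tools = shared_tools
        # Snapshot taken at startup; goes stale when tools are installed later.
        self.loaded_tools = list(shared_tools)
        self.inbox = queue.Queue()

    def install_tool(self, tool: str) -> None:
        """Handle an install request: only this process refreshes itself."""
        self._shared_tools.append(tool)
        self.loaded_tools = list(self._shared_tools)

    def drain_inbox(self) -> None:
        """Process pending control messages without blocking."""
        while True:
            try:
                message = self.inbox.get_nowait()
            except queue.Empty:
                return
            if message == "reload_tools":
                self.loaded_tools = list(self._shared_tools)


def broadcast(processes: List[WebProcess], message: str) -> None:
    """Deliver the same control message to every process."""
    for process in processes:
        process.inbox.put(message)
```

Without the broadcast, only the process that handled the install sees the new tool, which matches the behaviour reported at the top of the thread. After `broadcast(processes, "reload_tools")` and a call to `drain_inbox()` on each process, all of them see it.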
participants (4)
- Dannon Baker
- Greg Von Kuster
- Kandalaft, Iyad
- Langhorst, Brad