It would be useful to have access to $__history_id__ in the same way as one accesses $__user_id__. I am running a local instance of Galaxy and have modified the following file to achieve this: lib/galaxy/jobs/__init__.py
At line ~694:
  incoming['__user_name__'] = user_name
+ if job.history and job.history.id:
+     incoming['__history_id__'] = job.history.id
+ else:
+     incoming['__history_id__'] = 'unknown'
I have tested this change and it appears to give me exactly what I want. My question: does this change appear correct and can it be incorporated into the main galaxy code-base?
Thanks, -tim
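For readers who want to try the logic of the patch outside a Galaxy checkout, here is a minimal standalone sketch. The function name `add_history_id` and the `SimpleNamespace` stand-ins for the job and history objects are hypothetical and exist only for illustration; in Galaxy itself the lines sit inline in lib/galaxy/jobs/__init__.py as shown above.

```python
from types import SimpleNamespace


def add_history_id(incoming, job):
    """Mirror the patch: expose the job's history id under '__history_id__'."""
    if job.history and job.history.id:
        incoming['__history_id__'] = job.history.id
    else:
        # Same fallback string as the patch.
        incoming['__history_id__'] = 'unknown'
    return incoming


# Hypothetical stand-ins for Galaxy's job objects, for illustration only.
job_with_history = SimpleNamespace(history=SimpleNamespace(id=42))
job_without_history = SimpleNamespace(history=None)

print(add_history_id({'__user_name__': 'tim'}, job_with_history))
print(add_history_id({'__user_name__': 'tim'}, job_without_history))
```

Note that the truthiness test also sends a history whose id is None (for example, one not yet saved) to the 'unknown' branch.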
At line ~694:
  incoming['__user_name__'] = user_name
  if job.history and job.history.id:
      incoming['__history_id__'] = job.history.id
  else:
      incoming['__history_id__'] = 'unknown'
I have tested this change and it appears to give me exactly what I want. My question: does this change appear correct
Yes, the change is correct.
and can it be incorporated into the main galaxy code-base?
Can you provide a usage scenario for including history_id in the tool dict?
Thanks, J.
Can you provide a usage scenario for including history_id in the tool dict?
A simple example is the creation of a full history-based log file.
My more detailed need is to parallel an existing analysis pipeline in Galaxy, using the same underlying code. Reusing the same code from within Galaxy has the great advantage of ensuring exactly the same process is run, allowing the "official" analysis to be repeated by outsiders. This pipeline works on a number of inputs (paired-end reads, multiple replicates, etc.) and goes through a number of intervening steps where massive temporary files may be reused in several steps, then ultimately deleted after obtaining final results.
The existing pipeline, outside of Galaxy, creates these shared files in a temporary directory exclusive to the "experiment" being run. In Galaxy, however, ensuring that an "experiment" named by one user will not collide with one named by another user, accidentally overwriting existing files, becomes a great deal easier if the temporary experiment directories have history_id-based names. Since the histories could be shared among multiple users, the user_id will not be effective. Within a single history, steps can be rerun, new inputs can be introduced, and overwriting temporary files is entirely appropriate. But collisions between two separate histories would not be desirable at all.
Thanks,
-tim
In reply to Jeremy Goecks (jeremy.goecks@emory.edu), Mon, Sep 23, 2013 at 7:31 PM.
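The directory-naming idea in the usage scenario can be sketched as follows. This is only an illustration under assumptions: the helper `experiment_dir`, the `history_<id>` naming scheme, and the base directory are all hypothetical, and the history id is assumed to reach the tool as a plain value (for example via the proposed $__history_id__).

```python
import os
import tempfile


def experiment_dir(base_dir, history_id, experiment_name):
    """Return (creating it if needed) a scratch directory keyed by history id."""
    # Hypothetical layout: <base_dir>/history_<id>/<experiment_name>
    path = os.path.join(base_dir, 'history_%s' % history_id, experiment_name)
    os.makedirs(path, exist_ok=True)
    return path


# A throwaway base directory, so the sketch can run anywhere.
base = tempfile.mkdtemp()

a = experiment_dir(base, 101, 'rnaseq')
b = experiment_dir(base, 102, 'rnaseq')        # same name, different history
a_again = experiment_dir(base, 101, 'rnaseq')  # rerun within the same history

print(a != b)        # different histories get different directories
print(a == a_again)  # reruns in one history reuse the same directory
```

With this layout, two histories that both name an experiment "rnaseq" never share files, while steps rerun inside one history land in the same place and may overwrite their own temporary files.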
I'm still not fully understanding your usage scenario.
A simple example is the creation of a full history based log file.
I imagine that this would be an ideal use of the history API. Rather than having tools log history, write a script that uses the API to generate a history log.
My more detailed need is to parallel an existing analysis pipeline in galaxy, using the same underlying code. Reusing the same code from within galaxy has the great advantage of ensuring exactly the same process is run, allowing the "official" analysis to be repeated by outsiders. This pipeline works on a number of inputs (paired end reads, multiple replicates, etc.) and goes through a number of intervening steps where massive temporary files may be reused in several steps, then ultimately deleted after obtaining final results. The existing pipeline, outside of galaxy, creates these shared files in a temporary directory, exclusive to the "experiment" being run. In Galaxy however, ensuring that an "experiment" named by one user will not collide with one named by another user, accidentally overwriting existing files becomes a great deal easier if the temporary experiment directories have history_id based names.
Why not use mktemp() to ensure a unique directory that can house an experiment's data?
Since the histories could be shared among multiple users, the user_id will not be effective.
How are Galaxy histories shared amongst users?
J.
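A minimal sketch of the mktemp() suggestion, assuming the tool is written in Python. It uses `tempfile.mkdtemp()`, which creates the directory itself, rather than the deprecated `tempfile.mktemp()`, which only returns a name. The helper name `make_scratch_dir` and the `experiment_` prefix are hypothetical.

```python
import os
import shutil
import tempfile


def make_scratch_dir(prefix='experiment_'):
    """Create and return a new, uniquely named scratch directory."""
    return tempfile.mkdtemp(prefix=prefix)


first = make_scratch_dir()
second = make_scratch_dir()

# Each call yields a distinct directory, so two runs cannot collide.
print(first != second)

# Intermediate files would be written under the scratch directory.
with open(os.path.join(first, 'intermediate.dat'), 'w') as handle:
    handle.write('temporary result\n')

# Remove the scratch directories once final results are in hand.
shutil.rmtree(first)
shutil.rmtree(second)
```

One trade-off relative to the history_id-based naming discussed in this thread: the generated name is different on every call, so a later step can only reuse the directory if its path is recorded and passed along.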
How are Galaxy histories shared amongst users?
My misunderstanding was that histories shared between users remain a single entity. Prompted by your question I have now tried history sharing, and see that the second user makes a distinct copy of the first user's history, thus changing the history_id. The history_id is exclusive to a single user. While a history_id would still provide a level of control that I would desire, I now see I can avoid unintended name collisions with the already available user_id.
Nevertheless... I am not sure I understand the resistance to providing a unique id associated with a history.
Thanks,
-tim
In reply to Jeremy Goecks (jeremy.goecks@emory.edu), Thu, Sep 26, 2013 at 10:22 AM.
galaxy-dev@lists.galaxyproject.org