Can you show me a screenshot or JSON output of the workflow?
Situations where the delete intermediates action wouldn't delete a dataset are:
1) A subsequent job that has not finished still needs that dataset.
2) The dataset is marked as an output.
3) The job that creates the dataset isn't finished at the time the delete intermediates action is processed.
There's a log message that you'd see (in your system Galaxy log) if the dataset state is detected as unfinished, something like "Workflow Intermediates cleanup attempted, but non terminal state..."
On Tue, Sep 23, 2014 at 10:33 AM, julie dubois <dubjulie@gmail.com> wrote:
OK! Thanks, it's clearer to me now. So I have a bug! I've followed your instructions: the DeleteAction is attached to the final step of my workflow. Before it there are 2 steps, each with one output dataset that is not marked as a workflow output. The dataset of my first step is hidden and deleted: good! But the dataset of my second step is hidden but not deleted! I don't understand why.
Thanks
Julie
2014-09-23 15:49 GMT+02:00 Dannon Baker <dannon.baker@gmail.com>:
Hi Julie,
You don't attach the 'Delete Intermediate Datasets' action to a dataset you want to delete -- you attach it to one of the 'final' output steps of your workflow, at which point the action will clean up all datasets up to that point.
So, for a simple example, see this workflow for selecting random FASTA lines from FASTQ.
[image: Inline image 1]
"Select random lines" is flagged as an output step, and the "Delete Intermediates" action is attached to it. When the workflow runs, the output dataset of the FASTQ to FASTA step will be deleted.
More detail from the "Run workflow" page, showing the attached action:
[image: Inline image 2]
Does that help explain things?
-Dannon
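As a rough illustration of the setup described in this message, the sketch below builds the "post_job_actions" entries for a two-step workflow in the shape that the .ga excerpts in this thread use. The step names follow the example above; the output names ("output_file", "out_file1") and the helper post_job_action are assumptions made for illustration, not taken from a real workflow file.

```python
import json


def post_job_action(action_type, output_name):
    """Return a (key, entry) pair shaped like the .ga excerpts in this thread.

    The key is the action type followed by the output name, which is the
    pattern visible in the excerpts (e.g. "HideDatasetActionout_file1").
    """
    key = f"{action_type}{output_name}"
    entry = {
        "action_arguments": {},
        "action_type": action_type,
        "output_name": output_name,
    }
    return key, entry


# Intermediate step: only hidden, no delete action attached to it.
intermediate_step = {
    "name": "FASTQ to FASTA",
    "post_job_actions": dict([post_job_action("HideDatasetAction", "output_file")]),
}

# Final (output) step: the delete-intermediates action is attached here.
final_step = {
    "name": "Select random lines",
    "post_job_actions": dict(
        [post_job_action("DeleteIntermediatesAction", "out_file1")]
    ),
}

print(json.dumps(final_step, indent=2))
```

The point of the layout is that the delete action appears once, on the final step, rather than on every dataset that should be removed.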
On Tue, Sep 23, 2014 at 9:37 AM, julie dubois <dubjulie@gmail.com> wrote:
Hi Dannon,
Thanks for your answer. Sorry, but I don't think I understand all of your explanations.
First: if an output is marked with a highlighted snowflake, it will not be deleted. That's OK. Second: if a dataset is used in several steps of my workflow, it will be deleted once all of those steps have finished. That's OK too.
But: "Perhaps the confusion is that they're also marked as 'hidden' since they are not output steps?" I don't understand. As I see it, when the dataset is marked with a grey snowflake, it is automatically hidden, and this appears in the workflow file as:
    "post_job_actions": {
        "HideDatasetActionout_file1": {
            "action_arguments": {},
            "action_type": "HideDatasetAction",
            "output_name": "out_file1"
        }
    },
But when I add another postJobAction to such a dataset, it appears in the workflow file as:
    "post_job_actions": {
        "HideDatasetActionout_file1": {
            "action_arguments": {},
            "action_type": "HideDatasetAction",
            "output_name": "out_file1"
        },
        "DeleteIntermediatesActionout_file1": {
            "action_arguments": {},
            "action_type": "DeleteIntermediatesAction",
            "output_name": "out_file1"
        }
    },
The result in my history is a hidden dataset (visible when I check the history option "include hidden datasets") but not a deleted dataset (visible when I check the history option "include deleted datasets"). This is the case both for datasets used in several steps of the workflow and for a dataset which is just the output of one job and not used in any other job.
Is a hidden dataset not deletable? I've changed the order of the postJobActions in the file, but with no effect. And if I remove the "HideDatasetAction", the dataset is not removed because it becomes marked with a highlighted snowflake.
So is your final suggestion the only way to delete intermediate datasets? And if I understand correctly, I must create a tool that takes all my deletable datasets as input and removes all of them. In that case, the postJobAction "DeleteIntermediates" is not useful, is it?
Thanks.
Julie
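To check which steps actually carry which post-job actions, it can help to list them straight from the workflow file. The sketch below assumes the .ga file is JSON with a top-level "steps" mapping keyed by numeric step ids, each step holding a "name" and a "post_job_actions" mapping like the excerpts above; the sample data and the function name list_post_job_actions are made up for illustration.

```python
import json


def list_post_job_actions(workflow):
    """Yield (step_id, step_name, action_type, output_name) for each action.

    Assumes step ids are numeric strings ("0", "1", ...) so that they can be
    sorted in workflow order.
    """
    steps = workflow.get("steps", {})
    for step_id in sorted(steps, key=int):
        step = steps[step_id]
        for action in (step.get("post_job_actions") or {}).values():
            yield (
                step_id,
                step.get("name"),
                action.get("action_type"),
                action.get("output_name"),
            )


# Made-up sample standing in for a parsed .ga file. For a real file:
#     with open("my_workflow.ga") as handle:
#         workflow = json.load(handle)
workflow = json.loads("""
{
  "steps": {
    "0": {"name": "Input dataset", "post_job_actions": {}},
    "1": {"name": "FASTQ to FASTA", "post_job_actions": {
      "HideDatasetActionoutput_file": {
        "action_arguments": {},
        "action_type": "HideDatasetAction",
        "output_name": "output_file"}}},
    "2": {"name": "Select random lines", "post_job_actions": {
      "DeleteIntermediatesActionout_file1": {
        "action_arguments": {},
        "action_type": "DeleteIntermediatesAction",
        "output_name": "out_file1"}}}
  }
}
""")

for row in list_post_job_actions(workflow):
    print(row)
```

A listing like this makes it easy to see whether the delete action ended up on the intended step or was saved on only one output.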
2014-09-23 14:15 GMT+02:00 Dannon Baker <dannon.baker@gmail.com>:
Hi Julie!
The action resolves at the completion time of the job it is attached to, deleting datasets of any other already-completed jobs in that workflow which are not marked as an output step (with a highlighted snowflake). This will *not* delete datasets of any job marked as an output.
Perhaps the confusion is that they're also marked as 'hidden' since they are not output steps?
The general use-case for this is to put it on one of the 'final' steps of a workflow (or multiple if you have a workflow with two long parallel tracks) to clean up and delete datasets from intermediate jobs (which have already completed) that you don't want to save as an output.
-Dannon
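The behaviour described in this thread can be modelled with a small toy example. This is only a sketch of the rules as stated here (output steps are kept, unfinished jobs are skipped, datasets still needed by an unfinished job are kept); it is not Galaxy's implementation. The Job class, the set of terminal states, and the step names are all assumptions.

```python
from dataclasses import dataclass, field

# Assumed set of "finished" states for this toy model only.
TERMINAL_STATES = {"ok", "error", "deleted"}


@dataclass
class Job:
    name: str
    state: str
    is_workflow_output: bool = False
    # Names of the jobs whose datasets this job consumes.
    inputs: list = field(default_factory=list)


def deletable_jobs(jobs):
    """Return names of jobs whose datasets the cleanup would delete."""
    result = []
    for job in jobs:
        if job.is_workflow_output:
            continue  # marked as an output: never deleted
        if job.state not in TERMINAL_STATES:
            continue  # producing job has not finished yet
        still_needed = any(
            job.name in other.inputs and other.state not in TERMINAL_STATES
            for other in jobs
        )
        if still_needed:
            continue  # an unfinished downstream job still needs the dataset
        result.append(job.name)
    return result


# Snapshot taken when the final step (carrying the action) has just completed
# while a slower parallel branch is still running.
jobs = [
    Job("step1", "ok"),
    Job("step2", "ok", inputs=["step1"]),
    Job("slow_branch", "running", inputs=["step2"]),
    Job("final", "ok", is_workflow_output=True, inputs=["step2"]),
]

print(deletable_jobs(jobs))
```

In this snapshot only step1 qualifies: step2 is still needed by the running branch, slow_branch has not finished, and final is an output.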
On Tue, Sep 23, 2014 at 4:03 AM, julie dubois <dubjulie@gmail.com> wrote:
Hi all, I'm not sure how this post-action works. I've run some tests and it's very confusing to me.
First, can this post-action be used on an output that is used in other steps of the workflow? If yes, I suppose that Galaxy waits until the output is no longer in use before deleting it.
Second, must I highlight the snowflake of this output to permit its deletion, or must I keep the snowflake grey?
I'm posting this question to galaxy-dev because I've added this post-action to several outputs in my workflow and it has had no effect on them. And when I open the .ga file which describes my workflow, I see this post-action on only one output.
Thanks.
Julie