Extracting the metrics files from a composite file such as Picard
Hi,
Is there a way to extract the individual files from a composite file, such as the HTML files created by the picard tools?
I would like to take the metrics files and use them further down in some workflow, but I only get an HTML file... While these HTML files are nice for quickly looking at the results of one or two files, it becomes a problem if you have 88 samples like I do...
I hate to have to re-write all the picard tools to produce actual files, but maybe there is something I am missing about composite datatype files? The wiki only explains how to CREATE them, not how to use those files downstream....
Regards,
Thon de Boer, Ph.D. Bioinformatics Guru +1-650-799-6839 thondeboer@me.com LinkedIn Profile
Hi, Thon, On Tue, Feb 21, 2012 at 6:47 AM, Thon Deboer <thondeboer@me.com> wrote:
Hi,
Is there a way to extract the individual files from a composite file, such as the HTML files created by the picard tools?
The decision to hide multiple outputs in a single history HTML object has this downside, set against the benefit of less cluttered histories. The results you want can be extracted manually in the usual ways, e.g. by pasting the relevant HTML page URL into an upload box, but that doesn't solve your challenge of automating the process for a very large number of datasets.
One possible solution is to add some complexity to each of the relevant tool forms to allow the user to specifically nominate outputs to be returned as individual new history datasets. It is not a huge task, but as far as I'm aware it is not high on the team's list of priorities.
If you are sufficiently motivated to fix this so it can do what you need, contributions of code to improve Galaxy tools are always very welcome.
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
-- Ross Lazarus MBBS MPH; Associate Professor, Harvard Medical School; Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;
Hi Ross,
Thanks for the reply... It is indeed as I feared... I'll probably refactor the code so the user is able to get the actual output files.
Thon
On Feb 20, 2012, at 01:29 PM, Ross <ross.lazarus@gmail.com> wrote:
Thon,
I just had an idea: write a new tool that takes an Html dataset from the user's history and a file specification (e.g. "foo.xls") and 'promotes' the file(s) in the extra_files_path that match the file specification into new history items. That way you can automate the process. Of course, including the outputs in workflows when their number is not known at execution may be tricky, but at least this is a generic approach and won't require any changes to any of the tools that generate Html outputs.
On Tue, Feb 21, 2012 at 8:45 AM, Anthonius deBoer <thondeboer@me.com> wrote:
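The matching step in Ross's idea might look roughly like the sketch below. It is only an illustration, not Galaxy code: the function name `find_promotable_files` is hypothetical, it assumes the file specification is a shell-style wildcard (or an exact name such as "foo.xls"), and it assumes the files sit directly inside the directory that extra_files_path points to.

```python
import fnmatch
import os


def find_promotable_files(extra_files_path, file_spec):
    """Return sorted paths of regular files in extra_files_path matching file_spec.

    file_spec is treated as a shell-style pattern (an assumption), e.g.
    "*.metrics" or an exact name such as "foo.xls".
    """
    matches = []
    for name in sorted(os.listdir(extra_files_path)):
        full_path = os.path.join(extra_files_path, name)
        # Skip sub-directories; only plain files can become history items.
        if os.path.isfile(full_path) and fnmatch.fnmatchcase(name, file_spec):
            matches.append(full_path)
    return matches
```

Because the function returns a list, it makes the workflow caveat visible: the number of promoted files is only known once the pattern has been applied to a concrete dataset.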
Yeah... I was thinking something like that. I think it is possible to produce a variable number of data files, if I recall correctly:
http://wiki.g2.bx.psu.edu/Admin/Tools/Multiple%20Output%20Files
Thon
On Feb 20, 2012, at 07:11 PM, Ross <ross.lazarus@gmail.com> wrote:
Actually... Is there a way to extract the individual data files from a composite datatype?
http://wiki.g2.bx.psu.edu/Admin/Datatypes/Composite%20Datatypes
There is a wiki page on creating composite datatypes, but not an equivalent one on reading them, is there?
Thon
On Feb 20, 2012, at 07:11 PM, Ross <ross.lazarus@gmail.com> wrote:
Thon,
Others may have additional suggestions, but AFAIK: in the XML for the new tool, if "ht" is the name of the Html data parameter chosen by the user, then you can pass the absolute path of the directory containing all the files the user could choose from as a parameter, e.g. "-efp $ht.extra_files_path", to the tool's executable script. The script can then match the user-supplied file specification and copy the appropriate file to the tool output file path.
As you probably know, there is a tool XML syntax for creating an arbitrary number of history output files, but I would suggest you get the simplest case, a single specified new output, working before building the more complex and general scenario.
On Wed, Feb 22, 2012 at 8:27 AM, <thondeboer@me.com> wrote:
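For the simplest case Ross recommends starting with (one specified file copied to one new output), the executable script could be sketched as follows. Only the `-efp` flag comes from the thread; the `-spec` and `-out` flags and the function names are assumptions made for illustration, and the script uses nothing beyond the Python standard library. It refuses to guess when the specification matches zero or several files.

```python
import argparse
import fnmatch
import os
import shutil


def promote_single_file(extra_files_path, file_spec, output_path):
    """Copy the single file matching file_spec to output_path; return its name."""
    candidates = [
        name for name in sorted(os.listdir(extra_files_path))
        if fnmatch.fnmatchcase(name, file_spec)
        and os.path.isfile(os.path.join(extra_files_path, name))
    ]
    # Simplest case only: exactly one match becomes the one new output.
    if len(candidates) != 1:
        raise ValueError(
            "expected exactly one match for %r, found %d"
            % (file_spec, len(candidates))
        )
    shutil.copyfile(os.path.join(extra_files_path, candidates[0]), output_path)
    return candidates[0]


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Promote one file from an Html dataset's extra files"
    )
    # -efp is the flag named in the thread; it receives $ht.extra_files_path.
    parser.add_argument("-efp", dest="extra_files_path", required=True)
    # Hypothetical flags: the user's file specification and the output path.
    parser.add_argument("-spec", dest="file_spec", required=True)
    parser.add_argument("-out", dest="output_path", required=True)
    args = parser.parse_args(argv)
    promote_single_file(args.extra_files_path, args.file_spec, args.output_path)
```

A real script would call `main()` at the bottom, and the tool XML command line would pass something like `-efp $ht.extra_files_path` plus the user's specification and the path of the tool's output dataset. Once this single-output case works, the same matching logic could be extended to the multiple-output syntax mentioned above.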
participants (4)
- Anthonius deBoer
- Ross
- Thon Deboer
- thondeboer@me.com