Re: How to resolve the dependency of self-compiled C++ programs when publishing a tool?
Hi Devon,

Thank you for your prompt reply. May I ask how I could quickly put a C++ program in either conda-forge or bioconda? I am new to these package repositories. Thank you.

Best regards,
Jin

On Thu, Jul 11, 2019 at 2:03 AM Devon Ryan <dpryan@dpryan.com> wrote:
Put your tool in either conda-forge or bioconda; that will take care of the issue.

--
Devon Ryan, Ph.D.
Email: dpryan@dpryan.com
Data Manager/Bioinformatician
Max Planck Institute of Immunobiology and Epigenetics
Stübeweg 51
79108 Freiburg
Germany
On Thu, Jul 11, 2019 at 8:44 AM Jin Li <lijin.abc@gmail.com> wrote:
Hi all,
I want to publish our own tool to the Galaxy Tool Shed. It is a C++ program we developed ourselves, and the compiled binaries may not run on other machines because of differing OS platforms. How can I resolve the tool dependency for our self-compiled binaries? Thank you.
Best regards,
Jin

___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: %(web_page_url)s
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/
Hi Jin,

if it is a bio-related package, follow these instructions:
https://bioconda.github.io/contributor/index.html

For anything else, these ones:
https://conda-forge.org/#contribute

Cheers,
Bjoern
Hi Bjoern,

Thank you for your reply. Our tool is a bio-related package, so I will follow the instructions to put it in bioconda. Thank you.

Best regards,
Jin
Hi all,

I am not sure whether this mailing list is the right place for a bioconda question; sorry to bother you if not. I would like to ask how to include a large data file when publishing a bioconda package. Our program depends on a pre-computed data file that is too large to include in the source code package, but it can be accessed via a public URL. Can I put the download command in `build.sh` when publishing the bioconda package? If not, is there a convention for handling large dependent data files? Thank you.

Best regards,
Jin
Hi,

I’d be concerned about that file changing or disappearing and causing irreproducibility. If the URL pointed to a permanent location (e.g. NCBI or Zenodo), it might be OK. Could it be re-computed locally if necessary (like a genome index)? Maybe others know of examples where this is done.

Brad

Bradley W. Langhorst, Ph.D.
Development Group Leader
New England Biolabs
Hi Brad,

Thank you for your quick reply. I can put the data file on Zenodo so that it has a permanent location. As for re-computing the data file locally, that would take several days, so it would be quite inefficient. What I am hoping for is an automatic download of the data file when the package is installed. Is there a convention for doing that? Thank you.

Best regards,
Jin
When developing tools for CTAT, we used a Data Manager to do this sort of thing: the admin downloads both the tool and the Data Manager, then uses the Data Manager to download the large file and put it in the desired location on the system.

Cicada Dennis
Hi Jin,

you can use a post-link script in conda, like here:
https://github.com/bioconda/bioconda-recipes/blob/master/recipes/picrust2/po...

This way the data can be fetched during tool installation. See more information here:
https://docs.conda.io/projects/conda-build/en/latest/resources/link-scripts....

Ciao,
Bjoern
Hi Bjoern,

Thank you for the direction and the links. The post-link script is exactly what I was looking for. I am glad I asked the question here. Thank you.

Best regards,
Jin
That seems a good compromise within conda, since bioconda wouldn't want the binary package itself to be too big. (I'm doing something similar with some real sample data for a tool, putting it up on Zenodo. Of course, this is optional for my tool - your use case is different.)

The Galaxy Data Manager route seems more appropriate if there is a choice of large data files that could be used with the tool (not just one).

Peter
participants (5)
- Björn Grüning
- Dennis, H. E. Cicada Brokaw
- Jin Li
- Langhorst, Brad
- Peter Cock