Adding data sources in the main galaxy instance
I have a question regarding data sources currently available on the main instance of Galaxy (https://main.g2.bx.psu.edu). How can one get added as a new data source? I work at the NCBI on the Epigenomics database (http://www.ncbi.nlm.nih.gov/epigenomics). We currently hold a large volume of NGS data in the form of wig files, and some users have expressed interest in using Galaxy for data analysis. We provide an easy-to-use interface for examining these data, as seen here (http://www.ncbi.nlm.nih.gov/epigenomics/browse/), and currently host over 4200 data tracks. I think it would be useful if we could integrate Galaxy functionality into our resource. I also think providing our resource as a "Data Source" would be convenient for current Galaxy users who are less familiar with our database. I have been looking through the Galaxy wiki and I am struggling to find documentation that details, step by step, what exactly needs to be done. One thing to note: I am not a developer. I'm the scientific lead for the project and my programming skills are lacking. I was hoping someone could point me to thorough documentation, mainly to pass on to the developers on my team. Understanding my options for integrating or interfacing with Galaxy would also be very valuable to me. Thank you for any help/suggestions you may have. Ian Ian Fingerman, Ph.D. Staff Scientist NIH/NLM/NCBI Building 45, Room 4AN28D-29 45 Center Drive MSC-6510 Bethesda, MD 20894 Phone: (301) 496-6806 fingerma@ncbi.nlm.nih.gov
Hi Ian, We've tried to make it as simple as possible to support communication with Galaxy. The protocols are described here: http://wiki.galaxyproject.org/Admin/Internals/Data%20Sources Note that page also links to a paper in the journal DATABASE which is entirely about data source integration. -- James Taylor, Assistant Professor, Biology/CS, Emory University
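For developers picking this up, the heart of the exchange described on that wiki page can be sketched in a few lines of Python. This is a rough sketch under stated assumptions, not Galaxy code: it assumes Galaxy sent the user to the external site with a `GALAXY_URL` parameter, and that the site sends the browser back to that address with a `URL` parameter pointing at the selected data, which Galaxy then downloads. The function name `build_return_url`, the example hosts, the tool id, and the track address are all hypothetical; the wiki page remains the reference for the full parameter list.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_return_url(galaxy_url, data_url, extra_params=None):
    """Build the address the external site redirects the browser to.

    galaxy_url   -- value of the GALAXY_URL parameter received from Galaxy
    data_url     -- address from which Galaxy can download the selected data
    extra_params -- optional additional parameters (hypothetical here)
    """
    parts = urlsplit(galaxy_url)
    # Keep any query parameters Galaxy already placed on its own URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Tell Galaxy where to fetch the data from (assumed parameter name: URL).
    query.append(("URL", data_url))
    for key, value in (extra_params or {}).items():
        query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Both addresses below are made-up placeholders.
    print(build_return_url(
        "https://galaxy.example.org/tool_runner?tool_id=epigenomics_source",
        "https://data.example.org/export?track=123&format=wig",
    ))
```

The data address is percent-encoded when it is appended, so its own query string survives the round trip instead of being mixed into Galaxy's parameters.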
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Thanks James. I have looked at the information you linked to. I just would like clarification on one point. If our developers follow the information in the DATABASE paper, will our data source be integrated into the public galaxy server (http://usegalaxy.org) and be included as part of the downloadable package? That was my main concern. Ian
Dear Dev Team, My program that I'm trying to integrate with Galaxy relies on SQLite databases to hold gene annotation information throughout multiple stages of processing. Given that the .db files produced by SQLite are not one of the standard data file types, is there any provision for a misc binary type for either a) data upload or b) intermediate processing? I find a lack of a misc 'binary' format surprising considering I can rename any file as .txt and it will upload as 'txt' format just fine (and presumably be loaded by my program within Galaxy just fine too). Similarly, my program produces image files (data plots) in any of the standard image file formats - PNG, PDF, JPEG. Would users be able to download these from Galaxy once my program has produced them? Any ideas on how to get this to work would be much appreciated. Kind regards, Cameron
On Wed, May 1, 2013 at 4:23 AM, Cameron Jack <cameron.jack@anu.edu.au> wrote:
Dear Dev Team,
My program that I'm trying to integrate with Galaxy relies on SQLite databases to hold gene annotation information throughout multiple stages of processing. Given that the .db files produced by SQLite are not one of the standard data file types, is there any provision for a misc binary type for either a) data upload or b) intermediate processing? I find a lack of a misc 'binary' format surprising considering I can rename any file as .txt and it will upload as 'txt' format just fine (and presumably be loaded by my program within Galaxy just fine too).
There isn't a generic 'binary' datatype as far as I know, and I don't think one would be very useful. Other people have already tried defining their own SQLite3 datatype in Galaxy, but note that for Galaxy's workflows etc. to work you must treat it as a write once, read many (WORM) file: http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-December/012302.html I don't see anything using this in the Galaxy Tool Shed yet though (see the entry "Custom datatypes" on the left hand side of the Tool Shed): http://toolshed.g2.bx.psu.edu/
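To make the two points above concrete, here is a minimal Python sketch of what a custom SQLite datatype would need to do in spirit: recognise the file by its header, and open it in a way that respects the write once, read many rule. It deliberately does not use Galaxy's datatype classes; the helper names `is_sqlite3` and `open_read_only` are hypothetical, and wiring such a check into Galaxy is left to the linked thread. The only fixed facts relied on are that SQLite 3 files begin with the 16-byte string `SQLite format 3\0` and that Python's `sqlite3` module accepts `mode=ro` in a URI.

```python
import sqlite3
from pathlib import Path

# Every SQLite 3 database file starts with these 16 bytes.
SQLITE3_MAGIC = b"SQLite format 3\x00"


def is_sqlite3(path):
    """Return True if the file begins with the SQLite 3 header string."""
    with open(path, "rb") as handle:
        return handle.read(len(SQLITE3_MAGIC)) == SQLITE3_MAGIC


def open_read_only(path):
    """Open an existing database so that any write attempt raises an error."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

A later processing step would then write its results to a new output file rather than updating the input database in place.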
Similarly, my program produces image files (data plots) in any of the standard image file formats - PNG, PDF, JPEG. Would users be able to download these from Galaxy once my program has produced them?
Yes. PDF is already covered by a core datatype, 'pdf'. JPEG is already covered by a core datatype, 'jpg'. PNG is already covered by a core datatype, 'png'. Peter
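As a small illustration of the answer above, a wrapper script could pick the datatype name for each plot from its file extension. This is only a sketch: the datatype names 'pdf', 'jpg' and 'png' come from the reply, while the helper `galaxy_format_for` and the treatment of `.jpeg` as 'jpg' are assumptions made for the example.

```python
import os

# File extension -> core datatype name, as listed in the reply above.
# Mapping ".jpeg" to "jpg" is an assumption of this sketch.
GALAXY_IMAGE_FORMATS = {
    ".pdf": "pdf",
    ".jpg": "jpg",
    ".jpeg": "jpg",
    ".png": "png",
}


def galaxy_format_for(filename):
    """Return the datatype name for a plot file, based on its extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in GALAXY_IMAGE_FORMATS:
        raise ValueError("no known image datatype for %r" % filename)
    return GALAXY_IMAGE_FORMATS[extension]
```

The returned name is the value a tool would declare as the format of the corresponding output, so that Galaxy offers the file for viewing and download with the right type.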
participants (4)
- Cameron Jack
- Fingerman, Ian (NIH/NLM/NCBI) [E]
- James Taylor
- Peter Cock