Comments inline. On Wed, Oct 10, 2012 at 2:35 PM, Mark Johnson <mjohnson@ncbi.nlm.nih.gov> wrote:
Thanks for the quick turnaround. On the contrary, your examples are very helpful. I'll try your approach. I haven't upgraded Galaxy in about 6 months, but was hoping that change had already been made. (I haven't worked on Galaxy in the interim.) Then do I understand right that the wiki is not up to date with respect to creating new datatypes?
I cannot find anything on the wiki about adding new binary datatypes. Does the wiki reference modifying upload.py anywhere?
And does this change mean that new datatypes are something users will be able to distribute through the toolshed?
Hopefully! I haven't tested distributing datatypes via the toolshed at all, let alone binary ones, but my guess is that it will work now. Perhaps Greg has a comment on this issue. It seems like it wouldn't work if you had multiple versions of the same datatype installed, but I don't know how Galaxy handles that even for normal types.
I have a bit of confusion (being a novice Galaxian) about how conversion tools work. The NCBI SRA format has a collection of binary runtime programs (SRA toolkit) for performing various operations on .csra files. So my new Galaxy tools that rely on those programs will somehow have to access them. Is there a formalized way to package Galaxy tools with binaries, so both the Galaxy XML+Python and the binaries are distributed together? Or do you have some way of telling the Galaxy admin what needs to be installed, and where, for Galaxy tools to work?
Greg has been hard at work over the last several months implementing the ability to package scripts for installing dependencies along with your tool repository. http://wiki.g2.bx.psu.edu/ToolShedToolFeatures Again, I haven't played with that feature - I'm waiting on the ability to script interactions with the tool shed and inter-repository dependencies. -John
I tried to follow the link that Nate sent to Pieter on the site I mentioned: http://bitbucket.org/galaxy/galaxy-central/issue/304/make-binary-datatype-up... but BitBucket says
You do not have access to the issues.
Use the links at the top to get back.
:-(
They have disabled the issue tracker, but have promised the data will be transplanted somewhere new at some point. -John
Is there a way to get access to the issues?
Thanks 10e+06
--Mark Johnson NCBI
On 10/10/12 2:45 PM, John Chilton wrote:
Hey Mark,
A few things. Instead of printing stuff out, you could use the global variable log in the binary.py file, maybe something like log.warn("SNIFFING csra"). I think your logging statements will be more likely to show up that way.
I don't know why what you are doing won't work (it looks like it should), but a recent dist update included some changes that I submitted that allow for the addition of binary datatypes without needing to modify upload.py, which I think is a big improvement. I would recommend updating Galaxy and transitioning to this new mechanism.
I have been working with Ira Cooke to gather a lot of generally useful proteomics datatypes into one place. This includes Pieter Neerincx's RAW file datatype from the original dev list e-mail you linked to, so it is a good example of how this has changed. Here is said file:
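As a rough sketch of the logging suggestion above: the snippet below uses Python's standard logging module with a module-level logger named log. The logger name, the sniff_csra helper, and the handler setup are assumptions made for illustration only. In a running Galaxy the server configures the handlers, so only the log call itself would go into binary.py. (log.warning is the current spelling of log.warn in the standard library.)

```python
import io
import logging

# Module-level logger in the style of a `log` global.
# The logger name here is illustrative, not Galaxy's.
log = logging.getLogger("example.datatypes.binary")


def sniff_csra(filename):
    """Hypothetical sniffer stub: log that it was called, then decline."""
    log.warning("SNIFFING csra: %s", filename)
    return False


# Stand-alone demo only: attach a handler so the message is captured.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
log.addHandler(handler)
log.setLevel(logging.DEBUG)

result = sniff_csra("upload.csra")
logged = stream.getvalue()
print(logged, end="")
```

Going through a logger rather than print means the message follows whatever log destination and level the server has been configured with.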
https://bitbucket.org/iracooke/protk-toolshed/src/tip/lib/galaxy/datatypes/p...
Basically, after you define your binary type all you need to do is register it as a sniffable binary format, in your case this would be adding this line:
Binary.register_sniffable_binary_format('csra', 'csra', Csra)
after the last line of this example: http://pastie.org/5030960. Note that there is no indentation; this should be at the top level of the Python code.
As a side note (since this isn't documented anywhere else), in the proteomics.py example I went a step further to allow for backward compatibility with Galaxy versions predating the inclusion of these binary datatype rewrites by using the following construct:
if hasattr(Binary, 'register_sniffable_binary_format'):
    Binary.register_sniffable_binary_format('RAW', 'RAW', RAW)
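To make the registration step and the hasattr guard concrete, here is a self-contained sketch. It does not import Galaxy: the Binary class below is a simplified stand-in, and its parameter names, the sniffable_formats list, the guess_ext helper, the Csra class, and the magic bytes are all made up for illustration. Only the three-argument call shape is taken from the lines above, and the sniffer here inspects header bytes purely to keep the example runnable on its own.

```python
class Binary(object):
    """Simplified stand-in for a Binary base class (NOT Galaxy's real one)."""

    sniffable_formats = []  # hypothetical registry of sniffable types

    @staticmethod
    def register_sniffable_binary_format(data_type, ext, type_class):
        # Parameter names are assumptions; only the arity follows the e-mail.
        Binary.sniffable_formats.append(
            {"type": data_type, "ext": ext, "class": type_class}
        )

    @staticmethod
    def guess_ext(header):
        """Return the extension of the first registered type that matches."""
        for fmt in Binary.sniffable_formats:
            if fmt["class"]().sniff(header):
                return fmt["ext"]
        return None


class Csra(Binary):
    file_ext = "csra"
    MAGIC = b"XXXX"  # placeholder bytes, NOT the real CSRA signature

    def sniff(self, header):
        return header.startswith(self.MAGIC)


# Top level of the module, no indentation. The hasattr guard skips the
# call on versions whose Binary class lacks the registration method.
if hasattr(Binary, "register_sniffable_binary_format"):
    Binary.register_sniffable_binary_format("csra", "csra", Csra)

print(Binary.guess_ext(b"XXXX\x00\x01"))  # matches the placeholder magic
print(Binary.guess_ext(b"plain text"))    # no registered type matches
```

The point of the guard is that the same datatype module can be loaded by both older and newer installations without raising an AttributeError at import time.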
I hope this helps, but I imagine my rambling e-mails only ever serve to confuse issues.
-John
------------------------------------------------
John Chilton
Senior Software Developer
University of Minnesota Supercomputing Institute
Office: 612-625-0917
Cell: 612-226-9223
Bitbucket: https://bitbucket.org/jmchilton
Github: https://github.com/jmchilton
Web: http://jmchilton.net
On Wed, Oct 10, 2012 at 1:21 PM, Mark Johnson <mjohnson@ncbi.nlm.nih.gov> wrote:
I'm trying to add CSRA (NCBI Compressed Sequence Read Archive) as a new datatype for Galaxy.
I've followed the instructions on the wiki, and the module seems to load OK. csra shows up as a datatype in the upload view.
But the upload fails, and the uploaded file size is always 0. The actual file I upload is 156k.
Here are my changes:
In binary.py: http://pastie.org/5030960
In upload.py (following the example at http://dev.list.galaxyproject.org/Binary-datatypes-td4135969.html): http://pastie.org/5030967
datatypes_conf.xml: http://pastie.org/5030970
I also have a very hard time debugging Galaxy. Where can I look for an error stream that explains what it's doing? paster.log only tells me the HTTP traffic. I need to know where it is failing to know where to look. And my code needs to be able to, at the very least, print debug messages. How do people generally do that in Galaxy?
Thanks
--Mark Johnson