December 2012 - galaxy-dev - lists.galaxyproject.org

Main toolshed broken??
by Franco Caramia 04 Dec '12


Hi list,

When I try to upload, delete, or update files in a repository I own in the Tool Shed, I keep getting a Server Error. This is the repo:

http://fcaramia@toolshed.g2.bx.psu.edu/repos/fcaramia/methylation_analysis_bismark

Is anyone else having the same issue? I could just delete the repository, but no option for that is given.

Thanks,
Franco


workflow input param issue
by Marc Logghe 04 Dec '12


Hi,

I have a workflow that basically needs a select parameter as input. Two steps in the workflow need the very same input. I don't think there is an easy way to let the user enter the parameter only once and have it passed to both steps. Currently, as a workaround that is not very user friendly, the user has to enter the same parameter twice, once for each step that requires it.

The first issue is that as soon as the first parameter is set, the second is apparently set as well (they have the same name, which could explain it). That would be fine, except that it is not set to the chosen value: both are reset to the default. There are no errors; the values are simply reset, which makes it impossible to enter the parameter.

Next, I hoped to solve the issue by upgrading Galaxy to the most recent version. That did not help: the workflow no longer runs at all, which brings me to the second issue (see the exception dump below).

Any ideas, anyone?

Thanks and regards,
Marc

Error - <type 'exceptions.AttributeError'>: 'list' object has no attribute 'output_name'
URL: http://smith:8889/workflow/run?id=4b187121143038ff
File '/home/galaxy/galaxy-dev/eggs/Paste-1.6-py2.7.egg/paste/exceptions/errormiddleware.py', line 143 in __call__
  app_iter = self.application(environ, start_response)
File '/home/galaxy/galaxy-dev/eggs/Paste-1.6-py2.7.egg/paste/recursive.py', line 80 in __call__
  return self.application(environ, start_response)
File '/home/galaxy/galaxy-dev/eggs/Paste-1.6-py2.7.egg/paste/httpexceptions.py', line 632 in __call__
  return self.application(environ, start_response)
File '/home/galaxy/galaxy-dev/lib/galaxy/web/framework/base.py', line 160 in __call__
  body = method( trans, **kwargs )
File '/home/galaxy/galaxy-dev/lib/galaxy/webapps/galaxy/controllers/workflow.py', line 1523 in run
  enable_unique_defaults=trans.app.config.enable_unique_workflow_defaults)
File '/home/galaxy/galaxy-dev/lib/galaxy/web/framework/__init__.py', line 836 in fill_template
  return self.fill_template_mako( filename, **kwargs )
File '/home/galaxy/galaxy-dev/lib/galaxy/web/framework/__init__.py', line 847 in fill_template_mako
  return template.render( **data )
File '/home/galaxy/galaxy-dev/eggs/Mako-0.4.1-py2.7.egg/mako/template.py', line 296 in render
  return runtime._render(self, self.callable_, args, data)
File '/home/galaxy/galaxy-dev/eggs/Mako-0.4.1-py2.7.egg/mako/runtime.py', line 660 in _render
  **_kwargs_for_callable(callable_, data))
File '/home/galaxy/galaxy-dev/eggs/Mako-0.4.1-py2.7.egg/mako/runtime.py', line 692 in _render_context
  _exec_template(inherit, lclcontext, args=args, kwargs=kwargs)
File '/home/galaxy/galaxy-dev/eggs/Mako-0.4.1-py2.7.egg/mako/runtime.py', line 718 in _exec_template
  callable_(context, *args, **kwargs)
File '/home/galaxy/galaxy-dev/database/compiled_templates/base.mako.py', line 42 in render_body
  __M_writer(unicode(next.body()))
File '/home/galaxy/galaxy-dev/database/compiled_templates/workflow/run.mako.py', line 171 in render_body
  __M_writer(unicode(do_inputs( tool.inputs, step.state.inputs, errors.get( step.id, dict() ), "", step, None, used_accumulator )))
File '/home/galaxy/galaxy-dev/database/compiled_templates/workflow/run.mako.py', line 40 in do_inputs
  return render_do_inputs(context.locals_(__M_locals),inputs,values,errors,prefix,step,other_values,already_used)
File '/home/galaxy/galaxy-dev/database/compiled_templates/workflow/run.mako.py', line 435 in render_do_inputs
  __M_writer(unicode(row_for_param( input, values[ input.name ], other_values, errors, prefix, step, already_used )))
File '/home/galaxy/galaxy-dev/database/compiled_templates/workflow/run.mako.py', line 338 in row_for_param
  return render_row_for_param(context,param,value,other_values,error_dict,prefix,step,already_used)
File '/home/galaxy/galaxy-dev/database/compiled_templates/workflow/run.mako.py', line 498 in render_row_for_param
  __M_writer(unicode(conn.output_name))
AttributeError: 'list' object has no attribute 'output_name'
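The last frame of the dump suggests that `conn` was a list at a point where the template expected a single object with an `output_name` attribute. The following is only a minimal sketch of that kind of mismatch and one defensive way to handle it; `Connection` and `output_names` are hypothetical names made up for illustration, and this is not the actual Galaxy template code or its fix.

```python
class Connection(object):
    """Hypothetical stand-in for a workflow step connection (not Galaxy's class)."""

    def __init__(self, output_name):
        self.output_name = output_name


def output_names(conn):
    """Return output names whether `conn` is one connection or a list of them."""
    # Normalise to a list first, so a list value never reaches `.output_name`.
    conns = conn if isinstance(conn, (list, tuple)) else [conn]
    return [c.output_name for c in conns]


single = output_names(Connection("out_file1"))
several = output_names([Connection("out_a"), Connection("out_b")])
```

Whether normalising at the call site is appropriate depends on why the value became a list in the first place, which the traceback alone does not show.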


Re: [galaxy-dev] pass more information on a dataset merge
by Alex.Khassapov＠csiro.au 04 Dec '12


Hi John,

My colleague (Neil) has a bit of a problem with the multi file support:

When I try and use the option "Upload Directory of files" I get the error below

Error Traceback:
⇝ AttributeError: 'Bunch' object has no attribute 'multifiles'
URL: http://140.253.78.218/library_common/upload_library_dataset
Module weberror.evalexception.middleware:364 in respond
>> app_iter = self.application(environ, detect_start_response)
Module paste.debug.prints:98 in __call__
>> environ, self.app)
Module paste.wsgilib:539 in intercept_output
>> app_iter = application(environ, replacement_start_response)
Module paste.recursive:80 in __call__
>> return self.application(environ, start_response)
Module paste.httpexceptions:632 in __call__
>> return self.application(environ, start_response)
Module galaxy.web.framework.base:160 in __call__
>> body = method( trans, **kwargs )
Module galaxy.web.controllers.library_common:855 in upload_library_dataset
>> **kwd )
Module galaxy.web.controllers.library_common:1055 in upload_dataset
>> json_file_path = upload_common.create_paramfile( trans, uploaded_datasets )
Module galaxy.tools.actions.upload_common:342 in create_paramfile
>> multifiles = uploaded_dataset.multifiles,
AttributeError: 'Bunch' object has no attribute 'multifiles'

Any ideas? Should we check whether the 'multifiles' attribute is set? Or is some other call missing that should set it to NULL when it is absent?

-Alex

-----Original Message-----
From: jmchilton(a)gmail.com [mailto:jmchilton@gmail.com] On Behalf Of John Chilton
Sent: Wednesday, 17 October 2012 3:21 AM
To: Khassapov, Alex (CSIRO IM&T, Clayton)
Subject: Re: [galaxy-dev] pass more information on a dataset merge

Wow, thanks for the rapid feedback! I have made the changes you have suggested. It seems you must be interested in this idea/implementation. Let me know if you have specific use cases/requirements in mind and/or if you would be interested in write access to the repository.

-John

On Mon, Oct 15, 2012 at 11:51 PM, <Alex.Khassapov(a)csiro.au> wrote:
> Hi John,
>
> I tried your galaxy-central-homogeneous-composite-datatypes implementation, works great thank you (and Jorrit).
>
> A couple of fixes:
>
> 1. Add multi_upload.xml to tool_conf.xml
> 2. lib/galaxy/tools/parameters/grouping.py line 322 (in get_filenames( context )): "if ftp_files is not None:". Remove "is not None", because ftp_files is an empty list [] rather than None; line 331, "user_ftp_dir = os.path.join( trans.app.config.ftp_upload_dir, trans.user.email )", then throws an exception if ftp_upload_dir isn't set.
>
> Alex
>
> -----Original Message-----
> From: galaxy-dev-bounces(a)lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of John Chilton
> Sent: Tuesday, 16 October 2012 1:07 AM
> To: Jorrit Boekel
> Cc: galaxy-dev(a)lists.bx.psu.edu
> Subject: Re: [galaxy-dev] pass more information on a dataset merge
>
> Here is an implementation of the implicit multi-file composite datatypes piece of that idea. I think the implicit parallelism may be harder.
>
> https://bitbucket.org/galaxyp/galaxy-central-homogeneous-composite-datatypes/compare
>
> Jorrit, do you have any objection to me trying to get this included in galaxy-central (this is 95% code I stole from you)? I made the changes against a clean galaxy-central fork and included nothing proteomics specific in anticipation of trying to do that. I have talked with Jim Johnson about the idea and he believes it would be useful for his mothur metagenomics tools, so the idea is valuable outside of proteomics.
>
> Galaxy team, would you be okay with including this, and if so is there anything you would like to see either at a high level or at the level of the actual implementation?
>
> -John
>
> ------------------------------------------------
> John Chilton
> Senior Software Developer
> University of Minnesota Supercomputing Institute
> Office: 612-625-0917
> Cell: 612-226-9223
> Bitbucket: https://bitbucket.org/jmchilton
> Github: https://github.com/jmchilton
> Web: http://jmchilton.net
>
> On Mon, Oct 8, 2012 at 9:24 AM, John Chilton <chilton(a)msi.umn.edu> wrote:
>> Jim Johnson and I have been discussing that approach to handling fractionated proteomics samples as well (composite datatypes, not the specifics of the interface for parallelizing).
>>
>> My perspective has been that Galaxy should be augmented with better native mechanisms for grouping objects in histories, operating over those groups, building workflows that involve arbitrary numbers of inputs, etc... Composite data types are kind of a kludge; I think they are more useful for grouping HTML files together when you don't care about operating on the constituent parts and just want to view pages as a report or something. With this proteomic data we are working with, the individual pieces are really interesting, right? You want to operate on the individual pieces with the full array of tools (not just these special tools that have the logic for dealing with the composite datatypes), you want to visualize the files, etc... Putting these component pieces in the composite data type extra_files path really limits what you can do with the pieces in Galaxy.
>>
>> I have a vague idea of something that I think could bridge some of the gaps between the approaches (though I have no clue on the feasibility). Looking through your implementation on bitbucket, it looks like you are defining your core datatypes (MS2, CruxSequest) as subclasses of this composite data type (CompositeMultifile). My recommendation would be to try to define plain datatypes for these core datatypes (MS2, CruxSequest) and then have the separate composite datatype sort of delegate to the plain datatypes.
>>
>> You could then continue to explicitly declare subclasses of the composite datatype (maybe MS2Set, CruxSequestSet), but also maybe augment the tool xml so you can do implicit data type instances the way you can with tabular data, for instance (instead of defining columns you would define the datatype to delegate to).
>>
>> The next step would be to make the parallelism implicit (i.e. pull it out of the tool wrapper). Your tool wrappers wouldn't reference the composite datatypes, they would reference the simple datatypes, but you could add a little icon next to any input that let you replace a single input with a composite input for that type. It would be kind of like the run workflow page where you can replace an input with multiple inputs. If a composite input (or inputs) are selected, the tool would then produce composite outputs.
>>
>> For the steps that actually combine multiple inputs, I think in your case this is percolator maybe (a tool like interprophet or Scaffold that merges peptide probabilities across runs and groups proteins), you could have the same sort of implicit replacement, but instead of for single inputs it could do that for multi-inputs (assuming the Galaxy powers that be accept my fixes for multi-input tool parameters: https://bitbucket.org/galaxy/galaxy-central/pull-request/76/multi-input-dat…)
>>
>> The upshot of all of that would be that even if these composite datatypes aren't used widely, other people could still use your proteomics tools (my users are definitely interested in Crux, for instance) and you could then use other developers' proteomic tools with your composite datatypes even though they weren't designed with that use case in mind (I have msconvert, myrimatch, idpicker, proteinpilot, Ira Cooke has X! Tandem, OMSSA, TPP, and NBIC has an entire suite of label free quant tools). A third benefit would be that people working in other -omicses could make use of the homogeneous composite datatype implementation without needing to rewrite their wrappers and datatypes.
>>
>> There is probably something that I am missing that makes this very difficult; let me know if you think this is a good idea and what its feasibility might be. I forked your repo and set off to try to implement some of this stuff last week and I ended up with my galaxy pull requests to improve batching workflows and multi-input tool parameters instead, but I hope to eventually get around to it.
>>
>> -John
>>
>> On Mon, Oct 1, 2012 at 8:24 AM, Jorrit Boekel <jorrit.boekel(a)scilifelab.se> wrote:
>>> Dear list,
>>>
>>> I thought I was working with fairly large datasets, but they have recently started to include ~2Gb files in sets of >50. I have run these sort of things before as merged data by using tar to roll them up in one set, but when dealing with >100Gb tarfiles, Galaxy on EC2 seems to get very slow, although that's probably because of my implementation of dataset type detection (untar and read through files).
>>>
>>> Since tarring/untarring isn't very clean, I want to switch from tarring to creating composite files on merge by putting a tool's results into the dataset.extra_files_path. This doesn't seem to be supported yet, because in do_merge we currently pass the output dataset.filename to the respective datatype's merge method.
>>>
>>> I would like to pass more data to the merge method (let's say the whole dataset object) to be able to get the composite files directory and 'merge' the files in there. Good idea, bad idea? If anyone has views on this, I'd love to hear them.
>>>
>>> cheers,
>>> jorrit
>>>
>>> ___________________________________________________________
>>> Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
>>>
>>> http://lists.bx.psu.edu/
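Two of the points raised in this thread come down to small Python idioms: reading an attribute that may not exist (the 'multifiles' error) and testing a list that may be empty rather than None (the ftp_files fix). The sketch below illustrates both under stated assumptions: the `Bunch` class here is a minimal stand-in written for this example, not Galaxy's own class, and `get_multifiles` and `user_ftp_dir` are hypothetical helper names, not functions from the code being discussed.

```python
import os


class Bunch(object):
    """Minimal attribute bag used only for this example (not Galaxy's Bunch)."""

    def __init__(self, **kwds):
        self.__dict__.update(kwds)


def get_multifiles(uploaded_dataset):
    """Read 'multifiles' if present, falling back to None instead of raising."""
    return getattr(uploaded_dataset, "multifiles", None)


def user_ftp_dir(ftp_files, ftp_upload_dir, email):
    """Build the per-user FTP directory only when FTP files were selected."""
    # `if ftp_files:` is False for both None and [], whereas
    # `if ftp_files is not None:` is True for an empty list.
    if ftp_files:
        return os.path.join(ftp_upload_dir, email)
    return None


without_attr = get_multifiles(Bunch(name="plain upload"))
with_attr = get_multifiles(Bunch(multifiles=["a.ms2", "b.ms2"]))
skipped = user_ftp_dir([], None, "user@example.org")
```

With the truthiness test, the `os.path.join` call is never reached for an empty selection, so an unset upload directory no longer matters in that case.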


TCP non-blocking connect() to timed-out in select() after 10000 milliseconds
by dengyongyilong 04 Dec '12


Hi,

When I use "display at UCSC main", I get this error:

"TCP non-blocking connect() to [link] timed-out in select() after 10000 milliseconds - Cancelling! Couldn't open http://[link]/root/display_as?id=9&display_app=ucsc&authz_method=display_at"

How can I solve this problem? I look forward to your help, thanks!

dengyongyilong
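The error text reads as if the remote side gave up after 10 seconds while trying to open a TCP connection back to the Galaxy server. Assuming that reading is right, one way to narrow it down is to test, from a machine outside the local network, whether the Galaxy host and port accept connections at all. The helper below is a minimal sketch for that check; `can_connect` is a hypothetical name, and the host and port to pass in are whatever the Galaxy instance is published under.

```python
import socket


def can_connect(host, port, timeout=10.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers timeouts, refused connections and name resolution failures.
        return False
    sock.close()
    return True
```

If this returns False from outside while the same call succeeds on the server itself, a firewall, a private address, or a host name that does not resolve publicly would be likely candidates.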


FW: pass more information on a dataset merge
by Alex.Khassapov＠csiro.au 04 Dec '12


Thanks John, works fine.

-Alex

-----Original Message-----
From: Burdett, Neil (ICT Centre, Herston - RBWH)
Sent: Tuesday, 4 December 2012 9:57 AM
To: Khassapov, Alex (CSIRO IM&T, Clayton)
Cc: Szul, Piotr (ICT Centre, Marsfield)
Subject: RE: [galaxy-dev] pass more information on a dataset merge

Thanks Alex, seems to work now so I checked in the code to our repository

Neil

________________________________________
From: jmchilton(a)gmail.com [jmchilton(a)gmail.com] On Behalf Of John Chilton [chil0060(a)umn.edu]
Sent: Tuesday, December 04, 2012 4:26 AM
To: Khassapov, Alex (CSIRO IM&T, Clayton)
Cc: Burdett, Neil (ICT Centre, Herston - RBWH); Szul, Piotr (ICT Centre, Marsfield); galaxy-dev(a)lists.bx.psu.edu
Subject: Re: [galaxy-dev] pass more information on a dataset merge

Hey Alex,

Until I have bullied this stuff into galaxy-central, you should probably e-mail me directly and not the dev list. That said, thanks for the heads up; there was definitely a bug. I pushed out this changeset to the bitbucket repository:

https://bitbucket.org/galaxyp/galaxy-central-homogeneous-composite-datatype…

I should mention that I have sort of abandoned the bitbucket repository for this work in favor of github, so that I can rebase as Galaxy changes and keep clean changesets.

https://github.com/jmchilton/galaxy-central/tree/multifiles

Since I am posting this on the mailing list I might as well post a little summary of what has been done:

- For each datatype, an implicit multiple file version of that datatype is created. A new multiple upload tool/ftp directory tool has been implemented to create these.
- For any simple tool input you can choose a multiple file version of that input instead, and then all outputs will become multiple file versions of the outputs. Uses task splitting stuff to distribute jobs across files.
- For multiple input tools, you can choose either multiple individual inputs (no change there) or a single composite version. Consistent interface for file path, display name, extension, etc... in tool wrapper.
- It should work with most existing tools and datatypes without change.
- Everything enabled with a single option in universe.ini

Upshots:

- Makes workflows with arbitrary merging (and to a lesser extent branching) and arbitrary number of input files possible.
- Original base name is saved throughout analysis (when possible), so sample/replicate/fraction/lane/etc tracking is easier.

I started working on the metadata piece last night. Once that is done, I was planning on making a little demo video to post to this list to try to sell the 3 outstanding small pull requests related to this work and the massive one that would follow those up :).

-John
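The behaviour summarised in John's message (a single-file tool applied to every part of a multiple file dataset, base names preserved, and a later step merging the parts) can be pictured with a toy model. This is only an illustrative sketch in plain Python: `run_per_file`, `merge_counts` and `count_lines` are hypothetical names, the dictionaries stand in for datasets, and nothing here reflects how the actual implementation or Galaxy's task splitting is written.

```python
def run_per_file(tool, multifile):
    """Apply a single-file tool to each part, keyed by the original base name."""
    return {name: tool(content) for name, content in multifile.items()}


def merge_counts(per_file_results):
    """Toy multi-input step: combine the per-file results into one value."""
    return sum(per_file_results.values())


def count_lines(text):
    """Toy single-file tool: count the lines in one file's content."""
    return len(text.splitlines())


# Two fractions of one sample, keyed by base name.
fractions = {"fraction_1.ms2": "a\nb\n", "fraction_2.ms2": "c\n"}
per_file = run_per_file(count_lines, fractions)
total = merge_counts(per_file)
```

Keeping the base name as the key is what makes it possible to trace each output back to its sample or fraction after the per-file step.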


Tool Shed - Upload Files To Repository Crash
by Adam Carr (NBI) 03 Dec '12


Hi,

I'm having difficulties with the Tool Shed. I've got a freshly installed setup of galaxy-dist to run the community app from, and I'm trying to upload files to a repository but get an error when I do.

My user requires the "NGS: QC and manipulation" section of tools, as well as others, so I'm trying to import these into a local Tool Shed. I have therefore downloaded the tar.gz file from the public Galaxy Tool Shed for the repository of the first tool listed within "NGS: QC and manipulation", the FASTQ Groomer, which is "sharplabtool" from http://toolshed.g2.bx.psu.edu/repository/view_repository?id=ecc93bc8f0382e9…

Every time I attempt this upload, the following Server Error is reported:

###
Module paste.exceptions.errormiddleware:143 in __call__
>> <http://galaxy-tools.nbi.ac.uk:9009/upload/upload?repository_id=9ed9121ed2b0…> app_iter = self.application(environ, start_response)
Module paste.debug.prints:98 in __call__
>> environ, self.app)
Module paste.wsgilib:539 in intercept_output
>> app_iter = application(environ, replacement_start_response)
Module paste.recursive:80 in __call__
>> return self.application(environ, start_response)
Module paste.httpexceptions:632 in __call__
>> return self.application(environ, start_response)
Module galaxy.web.framework.base:160 in __call__
>> body = method( trans, **kwargs )
Module galaxy.web.framework:94 in decorator
>> return func( self, trans, *args, **kwargs )
Module galaxy.webapps.community.controllers.upload:183 in upload
>> set_repository_metadata_due_to_new_tip( trans, repository, content_alert_str=content_alert_str, **kwd )
Module galaxy.webapps.community.controllers.common:634 in set_repository_metadata_due_to_new_tip
>> error_message, status = set_repository_metadata( trans, repository, content_alert_str=content_alert_str, **kwd )
Module galaxy.webapps.community.controllers.common:583 in set_repository_metadata
>> persist=False )
Module galaxy.util.shed_util_common:711 in generate_metadata_for_changeset_revision
>> invalid_files_and_errors_tups = check_tool_input_params( app, files_dir, name, tool, sample_file_metadata_paths )
Module galaxy.util.shed_util_common:246 in check_tool_input_params
>> if options.tool_data_table or options.missing_tool_data_table_name:
AttributeError: 'str' object has no attribute 'tool_data_table'
###

Can anyone help me understand why this happens and how I can fix it? Any alternative suggestions as to how I can clone repos from one Tool Shed to another would also be appreciated.

Many Thanks,
Adam.

---
Adam Carr
Linux Support & Development
E adam.carr(a)nbi.ac.uk
T +44 1603 450161
W jic.ac.uk ifr.ac.uk tgac.ac.uk tsl.ac.uk
NBI Partnership Ltd
Norwich Research Park
Norwich NR4 7UH
The NBI Partnership Ltd provides non-scientific services to the Institute of Food Research, the John Innes Centre, The Genome Analysis Centre and The Sainsbury Laboratory

December 3, 2012 Distribution & News Brief
by Jennifer Jackson 03 Dec '12

December 3, 2012 Distribution & News Brief <http://wiki.galaxyproject.org/News/2012_12_03_DistributionNewsBrief>

Complete News Brief <http://wiki.galaxyproject.org/DevNewsBriefs/2012_12_03>

Highlights:

* NGS: Mapping tools Bowtie <http://bowtie-bio.sourceforge.net/index.shtml> and Lastz <http://www.bx.psu.edu/%7Ersharris/lastz/> have moved from the Galaxy distribution <https://bitbucket.org/galaxy/galaxy-dist> to the Galaxy Main Tool Shed <http://toolshed.g2.bx.psu.edu/>.
* Improvements in the display of Tool Shed <http://wiki.galaxyproject.org/Tool%20Shed> repository dependencies and contents.
* More Tool Shed <http://toolshed.g2.bx.psu.edu/> updates including details of the Functional test framework <http://wiki.galaxyproject.org/HostingALocalToolShed#Functional_test_framewo…>, a new hgweb.config file and HgWebConfigManager tool, plus other management features.
* Updated UI display and functionality for datasets and histories: new paused state and "resume/paused" toggle, new History menu options, and improved Scatterplot visualizations.
* The SGE job runner has now been fully deprecated and replaced with DRMAA.
* Several enhancements to aid with reproducibility: "Re-run" and "Extract workflow" validate datasets and tools, respectively, and a new data tables registry within the Administration menu, along with associated tools, corrects or warns about tool migration issues.
* Highlights from the new Galaxy CloudMan <http://usegalaxy.org/cloud> release and December 2012 Galaxy Update <http://wiki.galaxyproject.org/GalaxyUpdates/2012_12>.

http://getgalaxy.org
http://bitbucket.org/galaxy/galaxy-dist
http://galaxy-dist.readthedocs.org

new: $ hg clone http://www.bx.psu.edu/hg/galaxy galaxy-dist
upgrade: $ hg pull -u -r f364d992270c

Thanks for using Galaxy!

The Galaxy Team <http://wiki.galaxyproject.org/Galaxy%20Team>

--
Jennifer Jackson
http://galaxyproject.org

How to get the dataset history ID using the Galaxy API
by liram_vardi＠agilent.com 03 Dec '12

Hello all,

I am trying to write a tool that traces dataset creation for a given Galaxy history id. Basically, I am trying to start with one dataset and recursively trace back its ancestors. To this end, I communicate with Galaxy using the great BioBlend Python package. In particular, I'm using the "show_dataset(history_id, dataset_id)" method.

While it looks pretty straightforward, I got stuck on the following situation. Let's say I have a dataset named "25: Reorder SAM/BAM on data 21: reordered bam". Judging by this name, the ancestor of this dataset is "data 21". But which dataset is "data 21"? Is it not just the 20th dataset on the history contents list?

Well, unfortunately, not necessarily: if some history files were deleted (and subsequently purged), say datasets #17-19, they are indeed removed from the history list, but the number in the dataset name (i.e., "21") isn't correspondingly modified. So I cannot just "pull out" the 20th dataset on the history contents list. That is, I am not able to find the history id number of the dataset using "show_dataset(history_id, dataset_id)" or any other API command.

To be clear, when I say "history id" I mean: for a dataset "25: Reorder SAM/BAM on data 21: reordered bam", "25" is the history id of dataset "Reorder SAM/BAM on data 21: reordered bam".

Any help will be much appreciated!

Liram
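
One way to approach the lookup described above is to key the history contents by their history number instead of by list position, so gaps left by purged datasets do not matter. The sketch below is only an illustration under stated assumptions: it assumes each entry returned for the history contents carries an `hid` field next to its `id` (recent Galaxy releases do this; the release used by the poster may not), and the helper names `index_by_hid`, `parent_hids` and `fetch_contents` are hypothetical, not part of BioBlend. Parsing "data N" out of a dataset name is a heuristic that breaks as soon as a dataset is renamed.

```python
import re


def index_by_hid(contents):
    """Map each entry's history number (hid) to its encoded dataset id.

    Assumes every entry is a dict with 'hid' and 'id' keys; entries
    without an 'hid' are skipped.
    """
    return {item["hid"]: item["id"] for item in contents if "hid" in item}


def parent_hids(name):
    """Return the numbers following the word 'data' in a dataset name.

    Heuristic only: it relies on the default 'on data N' naming.
    """
    return [int(n) for n in re.findall(r"\bdata (\d+)", name)]


def fetch_contents(galaxy_url, api_key, history_id):
    """Fetch the contents list of one history through BioBlend.

    Imported lazily so the helpers above work without BioBlend installed.
    """
    from bioblend.galaxy import GalaxyInstance

    gi = GalaxyInstance(url=galaxy_url, key=api_key)
    # With contents=True the call returns a list of content dicts.
    return gi.histories.show_history(history_id, contents=True)


if __name__ == "__main__":
    # Made-up contents with a gap where datasets 17-19 were purged.
    sample = [
        {"hid": 16, "id": "id-16", "name": "input bam"},
        {"hid": 20, "id": "id-20", "name": "reference"},
        {"hid": 21, "id": "id-21", "name": "aligned bam"},
        {"hid": 25, "id": "id-25", "name": "Reorder SAM/BAM on data 21: reordered bam"},
    ]
    by_hid = index_by_hid(sample)
    for hid in parent_hids(sample[-1]["name"]):
        print(hid, by_hid.get(hid))
```

The dataset id found this way could then be passed to "show_dataset(history_id, dataset_id)" to continue the trace one step further back.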

Auto delete dataset after workflow run?
by Praveen Raj Somarajan 03 Dec '12

Hello All,

Is there a way to automatically delete some data files at the end of a workflow run? I'm asking because it is sometimes useful to delete certain outputs (say intermediate files, .logs, etc.), for example to "auto delete" the .sam file after executing the sam-to-bam tool, which would free up a lot of space. This is especially useful when working with a large number of samples.

I look forward to any comments/suggestions.

Best,
Raj.

________________________________

This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely for the use of the addressee(s). If you are not the intended recipient, please notify the sender by e-mail and delete the original message. Further, you are not to copy, disclose, or distribute this e-mail or its contents to any other person and any such actions that are unlawful. This e-mail may contain viruses. Ocimum Biosolutions has taken every reasonable precaution to minimize this risk, but is not liable for any damage you may sustain as a result of any virus in this e-mail. You should carry out your own virus checks before opening the e-mail or attachment.

The information contained in this email and any attachments is confidential and may be subject to copyright or other intellectual property protection. If you are not the intended recipient, you are not authorized to use or disclose this information, and we request that you notify us by reply mail or telephone and delete the original message from your mail system.

OCIMUMBIO SOLUTIONS (P) LTD

Re: [galaxy-dev] Re: Galaxy admin users
by Hans-Rudolf Hotz 03 Dec '12

Please keep all replies on the list by using "reply all"

On 12/03/2012 03:49 PM, 泽蔡 wrote:
> Hi Hans-Rudolf
> Do you mean Ctrl+C and then sh run.sh --reload?
> If so, I did.

If this is the way you kill and start your Galaxy server, then yes.

- maybe the web page is cached?
- maybe there is a typo and/or upper-case lower-case mix up
- try using a space between the "=" and your e-mail address

Regards, Hans-Rudolf

> From: Hans-Rudolf Hotz <hrh(a)fmi.ch>
> To: 泽蔡 <caizexi123(a)yahoo.com.cn>
> Cc: "galaxy-dev(a)bx.psu.edu" <galaxy-dev(a)bx.psu.edu>
> Sent: Monday, 3 December 2012, 10:10 PM
> Subject: Re: [galaxy-dev] Galaxy admin users
>
> Have you restarted your Galaxy server?
>
> Regards, Hans-Rudolf
>
> On 12/03/2012 02:28 PM, 泽蔡 wrote:
> > Hi all,
> > I installed a local instance on a Linux system. I tried to use the
> > admin features of Galaxy, so I added my Galaxy login (email) to the list
> > in the following config setting in the Galaxy configuration file
> > universe_wsgi.ini.
> >
> > # this should be a comma-separated list of valid Galaxy users
> > admin_users =xxxx@xxxx (the email I registered)
> >
> > But when I log in, I do not see the "Admin" menu item on the top
> > Galaxy menu bar. Why? Should I change anything else?
> >
> > ___________________________________________________________
> > Please keep all replies on the list by using "reply all"
> > in your mail client. To manage your subscriptions to this
> > and other Galaxy lists, please use the interface at:
> >
> > http://lists.bx.psu.edu/
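
The typo and upper/lower-case checks suggested in this thread can be run mechanically. The following is a minimal diagnostic sketch, not how Galaxy itself reads the setting: it assumes `admin_users` sits in the `[app:main]` section of universe_wsgi.ini, and the helper names `admin_emails` and `check_login` are hypothetical. Whether Galaxy treats the case of the address as significant is not established here; the sketch only reports the difference.

```python
import configparser


def admin_emails(ini_text, section="app:main"):
    """Return the admin_users entries from the config text, trimmed.

    The section name is an assumption about where the setting lives.
    """
    # interpolation=None keeps '%' sequences elsewhere in the file from
    # being treated as interpolation syntax.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    raw = parser.get(section, "admin_users", fallback="")
    return [entry.strip() for entry in raw.split(",") if entry.strip()]


def check_login(ini_text, login_email):
    """Classify how login_email relates to the configured admin list."""
    admins = admin_emails(ini_text)
    if login_email in admins:
        return "exact match"
    if login_email.lower() in (a.lower() for a in admins):
        return "case mismatch"
    return "not listed"
```

To use it, read the configuration file into a string and pass it together with the address used at login, for example `check_login(open("universe_wsgi.ini").read(), "me@example.org")`, where the address is a placeholder. A result of "not listed" also covers the case where the `admin_users` line is still commented out.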
