August 2008 - galaxy-user - lists.galaxyproject.org

Re: [galaxy-user] local sequence storage]
by Greg Von Kuster 12 Sep '08


workflow support in local instance
by Michael Rusch 28 Aug '08

I checked out Galaxy from Subversion today and installed it on a server here. I'm not seeing any workflow support through the interface. Is there something that needs to be done to enable it? Is the svn repository out of date? Am I missing something?

Michael

newbie questions
by Michael Rusch 21 Aug '08

We're strongly considering switching to Galaxy from a piece of home-built software that we're in the process of developing. So, I have a couple of newbie questions to see what people's experience is.

How does Galaxy scale? Does anybody have experience with scaling to thousands of datasets, or working with datasets in the hundreds of megabytes?

We have traditionally done most of our work using a MySQL backend. I haven't (yet) received the green light from our sysadmin to install Postgres, and I'm wondering if anybody has any experience running on MySQL. Is it possible? Are there pitfalls?

Has anybody by any chance implemented support for Condor as a job scheduler?

I think that's it for now.

Thanks,
Michael

Re: [galaxy-user] galaxy-user Digest, Vol 26, Issue 6
by Ross 20 Aug '08

Michael,

I don't have any experience with Condor, but we're finding that the Galaxy framework scales very well, mostly because it doesn't do any of the computationally intense work itself: it hands that off through the job runner. Our internal Galaxy works fine with very large datasets (6k subjects on Affy 6.0 SNP chips, 9.6k subjects on Affy 5.0 SNP chips...). Tools take a while to run (!) but Galaxy itself is more or less indifferent to the size of files because the database stores only references (e.g. paths) to the files on disk, not the actual gigagobs of data. A collection of 100 GB files takes about the same space in the Galaxy database tables as a collection of 1 KB ones, as far as I can tell.

A user's experience of Galaxy tool operation will obviously be affected by physically shuffling large data files around for the cluster backend when a tool is run. So I suspect that the cluster architecture, and the way datasets are made available to cluster nodes for processing, is a key issue for very large datasets.

On backends, I believe the party line is that both PostgreSQL and MySQL are fully supported. We've used MySQL as our backend for nearly 2 years without any problems with released Galaxy versions. All 3 database backends are now auto-tested before release, AFAIK. Arguably, PostgreSQL might be a better choice technically, and operationally it's what runs the primary Galaxy site, so it is likely to work! My group remain familiar and comfortable with MySQL and don't have the energy to swap over. If you were going to swap, do it before you build a large user base, unless you have a bored DBA available to unload and reload a set of Galaxy history and user tables mid-stream.

On Thu, Aug 21, 2008 at 2:00 AM, <galaxy-user-request(a)bx.psu.edu> wrote:
>
>   1. newbie questions (Michael Rusch)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 19 Aug 2008 16:53:27 -0500
> From: Michael Rusch <mcrusch(a)wisc.edu>
> Subject: [galaxy-user] newbie questions
> To: galaxy-user(a)bx.psu.edu
> Message-ID: <8085BDD01E3A4A40A0F01E5C73BBA505(a)gel.local>
> Content-Type: text/plain; charset="us-ascii"
>
> We're strongly considering switching to Galaxy from a piece of home-built
> software that we're in the process of developing. So, I have a couple of
> newbie questions to see what people's experience is.
>
> How does Galaxy scale? Does anybody have experience with scaling to
> thousands of datasets, or working with datasets in the hundreds of
> megabytes?
>
> We have traditionally done most of our work using a MySQL backend. I
> haven't (yet) received the green light from our sysadmin to install
> Postgres, and I'm wondering if anybody has any experience running on MySQL.
> Is it possible? Are there pitfalls?
>
> Has anybody by any chance implemented support for condor as a job scheduler?
>

--
python -c "foo = map(None,'moc.liamg(a)surazal.ssor'); foo.reverse(); print ''.join(foo)"
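The point in the reply above, that the database holds only references to files and so its size does not track file size, can be illustrated with a small sketch. This is not Galaxy's actual schema or code: the `dataset` table, its columns, and the `register_dataset` helper are hypothetical names chosen for illustration, and SQLite is used only so the example runs without a server.

```python
import os
import sqlite3
import tempfile


def register_dataset(conn, path):
    """Record a dataset by path and size; the file contents stay on disk.

    Hypothetical helper: the table and column names are assumptions,
    not Galaxy's real model.
    """
    size = os.path.getsize(path)
    cur = conn.execute(
        "INSERT INTO dataset (file_path, file_size) VALUES (?, ?)",
        (path, size),
    )
    return cur.lastrowid


def demo():
    """Register one small and one larger file and return their rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dataset ("
        "id INTEGER PRIMARY KEY, file_path TEXT, file_size INTEGER)"
    )
    rows = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, nbytes in (("small.dat", 1024), ("large.dat", 1024 * 1024)):
            path = os.path.join(tmp, name)
            with open(path, "wb") as fh:
                fh.write(b"\0" * nbytes)
            ds_id = register_dataset(conn, path)
            rows[name] = conn.execute(
                "SELECT file_path, file_size FROM dataset WHERE id = ?",
                (ds_id,),
            ).fetchone()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, (path, size) in demo().items():
        # Each row holds a short path string and an integer,
        # whatever the size of the file it points to.
        print(name, len(path), size)
```

Both rows carry a path of the same length and a single integer, although one file is a thousand times larger than the other. Under this kind of design the cost of large files falls on the filesystem and the cluster, not on the database.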

megablast
by Ali Tofigh 18 Aug '08

I have a large set of short reads (from human) that I'm trying to analyze with Galaxy. Specifically, I want to map my nucleotide sequences to the human genome. I found the "short read analysis" section of Galaxy. However, I can't seem to find any information on how to interpret the output from megablast. Could someone tell me what the numbers in the different columns represent? And when choosing target databases, what are "nt" and "wgs"?

Thanks in advance,
/Ali

[GALAXY] Galaxy Workflows are Here
by Anton Nekrutenko 18 Aug '08

Dear Galaxy User:

Galaxy has just been upgraded with the most radical feature since its conception. It now features WORKFLOWS! Now you can:

1. Construct workflows by example
=================================

Suppose you have just performed an analysis and now you want to re-run it, modify its parameters, or use different input datasets. Previously (15 min ago) you would have had to redo it from scratch. But now you can simply convert your existing history into a workflow!

Watch how to construct a workflow from a history here:
http://screencast.g2.bx.psu.edu/galaxy/WorkFlow_SC4/

Watch how to modify a workflow's parameters here:
http://screencast.g2.bx.psu.edu/galaxy/WorkFlow_SC5/

2. Construct workflows from scratch
===================================

Galaxy now features an interactive workflow editor where things can be dragged around and connected. Watch how here:
http://screencast.g2.bx.psu.edu/galaxy/WorkFlow_SC7/

3. Make custom tools out of workflows
=====================================

If you have a workflow that you want to use over and over again, you can now put it on the tool menu. Watch how here:
http://screencast.g2.bx.psu.edu/galaxy/WorkFlow_SC6/

Finally, if you collaborate with someone, just share your workflows with that person. It is as easy as sharing histories.

Workflows are in beta
---------------------

Workflow support is currently in beta testing. Workflows may not work with all tools, may fail unexpectedly, and may not be compatible with future updates to Galaxy. If you encounter abnormal behavior, e-mail us at galaxy-bugs(a)bx.psu.edu.

More on Galaxy
--------------

To learn more about Galaxy see here: http://galaxy.psu.edu
and here: http://galaxy.psu.edu/screencasts.html

anton nekrutenko
on behalf of Galaxy Team (http://g2.trac.bx.psu.edu/wiki/GalaxyTeam)

Anton Nekrutenko
Asst. Professor
Department of Biochemistry and Molecular Biology
Center for Comparative Genomics and Bioinformatics
Penn State University
anton(a)bx.psu.edu
http://nekrut.bx.psu.edu
814.865.4752

New Tool Question - how to copy meta data from input to output?
by Assaf Gordon 18 Aug '08

Hello,

I'm writing a tool which accepts interval files and outputs interval files. The input file has the metadata set (chr/start/end columns). How can I copy the metadata to the generated output file?

Currently, the output file's metadata is always set to chrom=1, start=2, end=3 (which I guess is the default for BED format).

Other Galaxy tools do it, so I'm sure it's possible. Which Python code is responsible for that (to be included in my tool's XML)?

Thanks,
Gordon.

Problems upgrading to mercurial
by Assaf Gordon 06 Aug '08

Hello,

I'm trying to upgrade an existing Galaxy installation to use Mercurial, and I'm experiencing some problems.

In one directory (called "galaxy_devel_old") I've got the old Galaxy code base, updated to SVN revision r2771. In another directory (called "galaxy_devel") I've got the new Galaxy code base, which was created using:

    $ hg clone http://www.bx.psu.edu/hg/galaxy galaxy_devel

Both installations use the same configuration files, the same Postgres database, and the same 'database/files' directory.

The old Galaxy works fine. With the new Galaxy, I get the following SQL error (at the bottom of the text log):

-----------------
URL: http://tango:8081/root/history
File 'build/bdist.solaris-2.11-i86pc/egg/weberror/evalexception/middleware.py', line 364 in respond
File 'build/bdist.solaris-2.11-i86pc/egg/paste/debug/prints.py', line 98 in __call__
File 'build/bdist.solaris-2.11-i86pc/egg/paste/wsgilib.py', line 539 in intercept_output
File 'build/bdist.solaris-2.11-i86pc/egg/beaker/session.py', line 103 in __call__
File 'build/bdist.solaris-2.11-i86pc/egg/paste/recursive.py', line 80 in __call__
File 'build/bdist.solaris-2.11-i86pc/egg/paste/httpexceptions.py', line 632 in __call__
File '/media/sdb1/galaxy/galaxy_devel/lib/galaxy/web/framework/base.py', line 125 in __call__
  body = method( trans, **kwargs )
File '/media/sdb1/galaxy/galaxy_devel/lib/galaxy/web/controllers/root.py', line 66 in history
  return trans.fill_template( "root/history.mako", history=history )
File '/media/sdb1/galaxy/galaxy_devel/lib/galaxy/web/framework/__init__.py', line 497 in fill_template
  return self.fill_template_mako( filename, **kwargs )
File '/media/sdb1/galaxy/galaxy_devel/lib/galaxy/web/framework/__init__.py', line 507 in fill_template_mako
  return template.render( **data )
File 'build/bdist.solaris-2.11-i86pc/egg/mako/template.py', line 114 in render
File 'build/bdist.solaris-2.11-i86pc/egg/mako/runtime.py', line 287 in _render
File 'build/bdist.solaris-2.11-i86pc/egg/mako/runtime.py', line 304 in _render_context
File 'build/bdist.solaris-2.11-i86pc/egg/mako/runtime.py', line 337 in _exec_template
File '/media/sdb1/galaxy/galaxy_devel/database/compiled_templates/root/history.mako.py', line 35 in render_body
  if bool( [ data for data in history.active_datasets if data.state in ['running', 'queued', '', None ] ] ):
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/attributes.py', line 53 in __get__
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/attributes.py', line 208 in get
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/strategies.py', line 226 in lazyload
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/query.py', line 359 in select_whereclause
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/query.py', line 1078 in _select_statement
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/query.py', line 977 in execute
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/session.py', line 195 in execute
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/engine/base.py', line 517 in execute
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/engine/base.py', line 557 in execute_clauseelement
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/engine/base.py', line 568 in execute_compiled
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/engine/base.py', line 581 in _execute_raw
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/engine/base.py', line 599 in _execute
SQLError: (ProgrammingError) column history_dataset_association.copied_from_history_dataset_association_id does not exist
LINE 1: ...ation.blurb AS history_dataset_association_blurb, history_da...
                                                             ^
'SELECT history_dataset_association.designation AS history_dataset_association_designation, anon_d37b.update_time AS anon_d37b_update_time, anon_d37b.deleted AS anon_d37b_deleted, anon_d37b._extra_files_path AS anon_d37b__extra_files_path, anon_d37b.purged AS anon_d37b_purged, anon_d37b.purgable AS anon_d37b_purgable, anon_d37b.state AS anon_d37b_state, anon_d37b.create_time AS anon_d37b_create_time, anon_d37b.file_size AS anon_d37b_file_size, anon_d37b.id AS anon_d37b_id, anon_d37b.external_filename AS anon_d37b_external_filename, history_dataset_association.visible AS history_dataset_association_visible, history_dataset_association.create_time AS history_dataset_association_create_time, history_dataset_association.dataset_id AS history_dataset_association_dataset_id, history_dataset_association.id AS history_dataset_association_id, history_dataset_association.parent_id AS history_dataset_association_parent_id, history_dataset_association.metadata AS history_dataset_association_metadata, history_dataset_association.blurb AS history_dataset_association_blurb, history_dataset_association.copied_from_history_dataset_association_id AS history_dataset_association_copied_from_history_dataset_a_1, history_dataset_association.peek AS history_dataset_association_peek, history_dataset_association.update_time AS history_dataset_association_update_time, history_dataset_association.deleted AS history_dataset_association_deleted, history_dataset_association.history_id AS history_dataset_association_history_id, history_dataset_association.hid AS history_dataset_association_hid, history_dataset_association.info AS history_dataset_association_info, history_dataset_association.name AS history_dataset_association_name, history_dataset_association.extension AS history_dataset_association_extension \nFROM history_dataset_association LEFT OUTER JOIN dataset AS anon_d37b ON anon_d37b.id = history_dataset_association.dataset_id \nWHERE history_dataset_association.history_id = %(lazy_245d)s AND NOT history_dataset_association.deleted ORDER BY history_dataset_association.hid ASC, anon_d37b.id' {'lazy_245d': 37L}
-----------------

I guess I didn't perform some required database alteration, I just don't know which one...

Please advise.

Thanks for your help,
Gordon.
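One way to confirm the guess above (that the shared database lacks a column the new code expects) is to compare the columns the query names against those actually present in the table. The following is a minimal sketch, not part of Galaxy: `missing_columns` and `present_columns` are hypothetical helper names, and `present_columns` assumes a DB-API cursor that uses the `%s` placeholder style (as psycopg2 does) on a PostgreSQL database, where the standard `information_schema.columns` view is available. It only diagnoses the mismatch; it does not say which schema change is required.

```python
# Standard information_schema lookup; %s is the DB-API placeholder
# style assumed here (the one psycopg2 uses).
COLUMNS_QUERY = (
    "SELECT column_name FROM information_schema.columns "
    "WHERE table_name = %s"
)


def present_columns(cursor, table):
    """Return the column names the live database reports for a table."""
    cursor.execute(COLUMNS_QUERY, (table,))
    return [row[0] for row in cursor.fetchall()]


def missing_columns(expected, present):
    """Return the expected column names absent from the table, sorted."""
    return sorted(set(expected) - set(present))
```

For the error in this message, passing `"history_dataset_association"` as the table and including `copied_from_history_dataset_association_id` in the expected list would show whether that column is the only one missing or whether other columns are absent as well.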

Galaxy source repository change
by Nate Coraor 05 Aug '08

Hello,

If you simply use the Galaxy Main or Test sites hosted at PSU, you can safely ignore this message. It pertains to people who've checked out their own copy of Galaxy for development or local use.

Galaxy development has recently moved source control systems, from Subversion to Mercurial. This means that anyone using a local copy of Galaxy will need to make some changes to be able to download future updates.

The preferred method of obtaining Galaxy source is through Mercurial directly (the 'hg' command):

    hg clone http://www.bx.psu.edu/hg/galaxy galaxy_dist

Tarballs are also available via the zip/gz/bz2 links here:

    http://www.bx.psu.edu/hg

And the repository is mirrored in Subversion here:

    svn co http://www.bx.psu.edu/svn/galaxy galaxy_dist

Unfortunately, modifications to local copies of Galaxy will need to be transferred from your old checkout to the new checkout.

Please let us know via <galaxy-bugs(a)bx.psu.edu> of any issues.

Thanks, and thank you for using Galaxy.

--nate

Importing features from BioMart
by Alexandre Gattiker 03 Aug '08

Hello,

I would like to import features from BioMart into Galaxy. However, the 'embedded' BioMart view yields an HTML table. How do I convert that to GFF or BED?

Best regards

--
--------------------------------------------------------
Alexandre Gattiker
Bioinformatics & Biostatistics Core Facility
EPFL School of Life Sciences / Faculté des Sciences de la vie FSV
http://people.epfl.ch/Alexandre.Gattiker
