workflow support in local instance
by Michael Rusch
I checked out Galaxy from Subversion today and installed it on a server
here. I'm not seeing any workflow support through the interface. Is there
something that needs to be done to enable it? Is the svn repo out of date?
Am I missing something?
Michael
14 years, 5 months
newbie questions
by Michael Rusch
We're strongly considering switching to Galaxy from a piece of home-built
software that we're in the process of developing, so I have a few
newbie questions to learn what people's experience has been.
How does Galaxy scale? Does anybody have experience with scaling to
thousands of datasets, or working with datasets in the hundreds of
megabytes?
We have traditionally done most of our work using a MySQL backend. I
haven't (yet) received the green light from our sysadmin to install
Postgres, and I'm wondering if anybody has any experience running on MySQL.
Is it possible? Are there pitfalls?
Has anybody, by any chance, implemented support for Condor as a job scheduler?
I think that's it for now.
Thanks,
Michael
14 years, 5 months
Re: [galaxy-user] galaxy-user Digest, Vol 26, Issue 6
by Ross
Michael, I don't have any experience with Condor, but we're finding
that the Galaxy framework scales very well, mostly because it doesn't
do any of the computationally intense work itself; it hands that off
through the job runner. Our internal Galaxy works fine with very
large datasets (6k subjects on Affy 6.0 SNP chips, 9.6k subjects on
Affy 5.0 SNP chips, ...). Tools take a while to run (!), but Galaxy
itself is more or less indifferent to the size of files because it
stores only references (e.g. paths) to the disk files in the database,
not the actual gigagobs of data. As far as I can tell, a collection of
100 GB files takes about the same space in the Galaxy database tables
as a collection of 1 KB ones. A user's experience of tool operation
will obviously be affected by physically shuffling large datafiles
around for the cluster backend when a tool runs, so the cluster
architecture, and the way datasets are made available to cluster nodes
for processing, is a key issue for very large datasets, I suspect.
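To make the "references, not data" point concrete, here is a toy
SQLAlchemy sketch of the idea. It is nothing like Galaxy's real model
(although an external_filename column does appear in Galaxy's own
dataset table, as the SQL further down this digest shows); it just
illustrates why row size is independent of file size:

# Toy illustration only -- not Galaxy's actual schema.
# The row stores a path to the file on disk, never the file contents,
# so a 100 GB dataset and a 1 KB dataset cost the database the same.
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Dataset(Base):
    __tablename__ = "dataset"
    id = Column(Integer, primary_key=True)
    external_filename = Column(String, nullable=False)  # reference to disk
    file_size = Column(Integer)                         # bookkeeping only

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Registering a 100 GB file writes only this short path string.
    session.add(Dataset(external_filename="/galaxy/files/000/dataset_1.dat",
                        file_size=100 * 1024 ** 3))
    session.commit()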
On backends, I believe the party line is that both PostgreSQL and
MySQL are fully supported. We've used MySQL as our backend for nearly
two years without any problems with released Galaxy versions, and all
three database backends (SQLite, PostgreSQL, MySQL) are now auto-tested
before release, AFAIK. Arguably, PostgreSQL might be the better choice
technically, and operationally it's what runs the primary Galaxy site,
so it's likely to work! My group remains familiar and comfortable with
MySQL and doesn't have the energy to swap over. If you are going to
swap, do it before you build a large userbase, unless you have a bored
DBA available to unload and reload a set of Galaxy history and user
tables mid-stream.
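(For what it's worth, and from memory, so double-check against your
universe_wsgi.ini: pointing Galaxy at MySQL should just be the
database_connection setting, an SQLAlchemy URL along the lines of
database_connection = mysql://galaxy:secret@localhost/galaxy.)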
--
python -c "foo = map(None,'moc.liamg(a)surazal.ssor'); foo.reverse();
print ''.join(foo)"
14 years, 5 months
megablast
by Ali Tofigh
I have a large set of short reads (from human) that I'm trying to analyze
with Galaxy. Specifically, I want to map my nucleotide sequences to the
human genome. I found the "short read analysis" section of Galaxy. However,
I can't seem to find any information on how to interpret the output from
megablast. Could someone tell me what the numbers in the different columns
represent?
And when choosing target databases, what are "nt" and "wgs"?
Thanks in advance,
/Ali
14 years, 5 months
[GALAXY] Galaxy Workflows are Here
by Anton Nekrutenko
Dear Galaxy User:
Galaxy has just been upgraded with the most radical feature since its
conception. It now features WORKFLOWS!
Now you can:
1. Construct workflows by example
=================================
Suppose you have just performed an analysis and now you want to re-
run it, modify its parameters, or use different input datasets.
Previously (15 minutes ago) you would have had to redo everything by
hand. But now you can simply convert your existing history into a workflow!
Watch how to construct a workflow from a history here:
http://screencast.g2.bx.psu.edu/galaxy/WorkFlow_SC4/
Watch how to modify a workflow's parameters here:
http://screencast.g2.bx.psu.edu/galaxy/WorkFlow_SC5/
2. Construct workflows from scratch
===================================
Galaxy now features an interactive workflow editor where things can
be dragged around and connected.
Watch how here:
http://screencast.g2.bx.psu.edu/galaxy/WorkFlow_SC7/
3. Make custom tools out of workflows
=====================================
If you have a workflow that you want to use over and over again, you
can now put it on the tool menu.
Watch how here:
http://screencast.g2.bx.psu.edu/galaxy/WorkFlow_SC6/
Finally, if you collaborate with someone, just share your workflows
with that person. It is as easy as sharing histories.
Workflows are in beta
---------------------
Workflow support is currently in beta testing. Workflows may not work
with all tools, may fail unexpectedly, and may not be compatible with
future updates to Galaxy. If you encounter abnormal behavior, e-mail
us at galaxy-bugs(a)bx.psu.edu.
More on Galaxy
--------------
To learn more about Galaxy, see here:
http://galaxy.psu.edu
and here:
http://galaxy.psu.edu/screencasts.html
anton nekrutenko
on behalf of Galaxy Team (http://g2.trac.bx.psu.edu/wiki/GalaxyTeam)
Anton Nekrutenko
Asst. Professor
Department of Biochemistry and Molecular Biology
Center for Comparative Genomics and Bioinformatics
Penn State University
anton(a)bx.psu.edu
http://nekrut.bx.psu.edu
814.865.4752
14 years, 5 months
New Tool Question - how to copy meta data from input to output?
by Assaf Gordon
Hello,
I'm writing a tool that accepts interval files and outputs interval files.
The input file has its metadata set (chrom/start/end columns).
How can I copy that metadata to the generated output file?
Currently, the output file's metadata is always set to chrom=1,
start=2, end=3 (which I guess is the default for the BED format).
Other Galaxy tools do it, so I'm sure it's possible. Which Python code
is responsible for that (to be included in my tool's XML)?
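Skimming other tools' XML, my best guess (unverified) is the
metadata_source attribute on the output <data> element, something like:
<data format="input" name="output" metadata_source="input1"/>
where "input1" would be the name of my tool's input parameter, and
format="input" seems to copy the datatype the same way. If that's not
the supported mechanism, a pointer to the right code would be great.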
Thanks,
Gordon.
14 years, 5 months
Problems upgrading to mercurial
by Assaf Gordon
Hello,
I'm trying to upgrade an existing Galaxy installation to use Mercurial,
and I'm running into some problems.
In one directory (called "galaxy_devel_old") I've got the old Galaxy
code base, updated to SVN revision r2771.
In another directory (called "galaxy_devel") I've got the new Galaxy
code base, which was created using:
$ hg clone http://www.bx.psu.edu/hg/galaxy galaxy_devel
Both installations use the same configuration files, the same Postgres
database, and the same 'database/files' directory.
The old Galaxy works fine.
With the new Galaxy, I get the following SQL error (at the bottom of the
text log):
-----------------
URL: http://tango:8081/root/history
File
'build/bdist.solaris-2.11-i86pc/egg/weberror/evalexception/middleware.py',
line 364 in respond
File 'build/bdist.solaris-2.11-i86pc/egg/paste/debug/prints.py', line 98
in __call__
File 'build/bdist.solaris-2.11-i86pc/egg/paste/wsgilib.py', line 539 in
intercept_output
File 'build/bdist.solaris-2.11-i86pc/egg/beaker/session.py', line 103 in
__call__
File 'build/bdist.solaris-2.11-i86pc/egg/paste/recursive.py', line 80 in
__call__
File 'build/bdist.solaris-2.11-i86pc/egg/paste/httpexceptions.py', line
632 in __call__
File '/media/sdb1/galaxy/galaxy_devel/lib/galaxy/web/framework/base.py',
line 125 in __call__
body = method( trans, **kwargs )
File
'/media/sdb1/galaxy/galaxy_devel/lib/galaxy/web/controllers/root.py',
line 66 in history
return trans.fill_template( "root/history.mako", history=history )
File
'/media/sdb1/galaxy/galaxy_devel/lib/galaxy/web/framework/__init__.py',
line 497 in fill_template
return self.fill_template_mako( filename, **kwargs )
File
'/media/sdb1/galaxy/galaxy_devel/lib/galaxy/web/framework/__init__.py',
line 507 in fill_template_mako
return template.render( **data )
File 'build/bdist.solaris-2.11-i86pc/egg/mako/template.py', line 114 in
render
File 'build/bdist.solaris-2.11-i86pc/egg/mako/runtime.py', line 287 in
_render
File 'build/bdist.solaris-2.11-i86pc/egg/mako/runtime.py', line 304 in
_render_context
File 'build/bdist.solaris-2.11-i86pc/egg/mako/runtime.py', line 337 in
_exec_template
File
'/media/sdb1/galaxy/galaxy_devel/database/compiled_templates/root/history.mako.py',
line 35 in render_body
if bool( [ data for data in history.active_datasets if data.state in
['running', 'queued', '', None ] ] ):
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/attributes.py',
line 53 in __get__
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/attributes.py',
line 208 in get
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/strategies.py',
line 226 in lazyload
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/query.py', line
359 in select_whereclause
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/query.py', line
1078 in _select_statement
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/query.py', line
977 in execute
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/orm/session.py',
line 195 in execute
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/engine/base.py',
line 517 in execute
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/engine/base.py',
line 557 in execute_clauseelement
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/engine/base.py',
line 568 in execute_compiled
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/engine/base.py',
line 581 in _execute_raw
File 'build/bdist.solaris-2.11-i86pc/egg/sqlalchemy/engine/base.py',
line 599 in _execute
SQLError: (ProgrammingError) column
history_dataset_association.copied_from_history_dataset_association_id
does not exist
LINE 1: ...ation.blurb AS history_dataset_association_blurb, history_da...
^
'SELECT history_dataset_association.designation AS
history_dataset_association_designation, anon_d37b.update_time AS
anon_d37b_update_time, anon_d37b.deleted AS anon_d37b_deleted,
anon_d37b._extra_files_path AS anon_d37b__extra_files_path,
anon_d37b.purged AS anon_d37b_purged, anon_d37b.purgable AS
anon_d37b_purgable, anon_d37b.state AS anon_d37b_state,
anon_d37b.create_time AS anon_d37b_create_time, anon_d37b.file_size AS
anon_d37b_file_size, anon_d37b.id AS anon_d37b_id,
anon_d37b.external_filename AS anon_d37b_external_filename,
history_dataset_association.visible AS
history_dataset_association_visible,
history_dataset_association.create_time AS
history_dataset_association_create_time,
history_dataset_association.dataset_id AS
history_dataset_association_dataset_id, history_dataset_association.id
AS history_dataset_association_id, history_dataset_association.parent_id
AS history_dataset_association_parent_id,
history_dataset_association.metadata AS
history_dataset_association_metadata, history_dataset_association.blurb
AS history_dataset_association_blurb,
history_dataset_association.copied_from_history_dataset_association_id
AS history_dataset_association_copied_from_history_dataset_a_1,
history_dataset_association.peek AS history_dataset_association_peek,
history_dataset_association.update_time AS
history_dataset_association_update_time,
history_dataset_association.deleted AS
history_dataset_association_deleted,
history_dataset_association.history_id AS
history_dataset_association_history_id, history_dataset_association.hid
AS history_dataset_association_hid, history_dataset_association.info AS
history_dataset_association_info, history_dataset_association.name AS
history_dataset_association_name, history_dataset_association.extension
AS history_dataset_association_extension \nFROM
history_dataset_association LEFT OUTER JOIN dataset AS anon_d37b ON
anon_d37b.id = history_dataset_association.dataset_id \nWHERE
history_dataset_association.history_id = %(lazy_245d)s AND NOT
history_dataset_association.deleted ORDER BY
history_dataset_association.hid ASC, anon_d37b.id' {'lazy_245d': 37L}
-----------------
I guess I didn't perform some required database schema update; I just
don't know which one...
Please advise.
Thanks for your help,
Gordon.
14 years, 6 months
Galaxy source repository change
by Nate Coraor
Hello,
If you simply use the Galaxy Main or Test sites hosted at PSU, you can
safely ignore this message. It pertains to people who've checked out
their own copy of Galaxy for development or local use.
Galaxy development has recently moved source control systems, from
Subversion to Mercurial. This means that anyone using a local copy of
Galaxy will need to make some changes to be able to download future updates.
The preferred method of obtaining Galaxy source is through Mercurial
directly (the 'hg' command):
hg clone http://www.bx.psu.edu/hg/galaxy galaxy_dist
Tarballs are also available via the zip/gz/bz2 links here:
http://www.bx.psu.edu/hg
And the repository is mirrored in Subversion here:
svn co http://www.bx.psu.edu/svn/galaxy galaxy_dist
Unfortunately, modifications to local copies of Galaxy will need to be
transferred from your old checkout to the new checkout.
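One straightforward way to carry them over (just a sketch, and your
directory names will differ): in the old checkout, run
svn diff > ~/local_changes.patch
to capture your modifications, then in the new clone run
patch -p0 < ~/local_changes.patch
and review the result before relying on it.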
Please let us know of any issues via <galaxy-bugs(a)bx.psu.edu>, and
thank you for using Galaxy.
--nate
14 years, 6 months
Importing features from BioMart
by Alexandre Gattiker
Hello,
I would like to import features from BioMart into Galaxy. However, the
'embedded' BioMart view yields an HTML table. How do I convert that to
GFF or BED?
Best regards
--
--------------------------------------------------------
Alexandre Gattiker Bioinformatics & Biostatistics Core Facility
EPFL School of Life Sciences / Faculté des Sciences de la vie FSV
http://people.epfl.ch/Alexandre.Gattiker
14 years, 6 months