Collaborating on Galaxy - sharing and re-sharing histories
by Assaf Gordon
Hi all,
I recently had to work closely with someone on our Galaxy server, sharing histories back and forth with them.
Each time, they would run a couple of jobs and share the history with me; I would check their results, perhaps run a couple of jobs of my own, and re-share the new results with them.
The problem is that collaborating like that is quite annoying.
Besides the need to go through so many clicks (of publish/share and list-histories-shared-with-me/clone/switch),
another problem is that my histories list looks like this:
===
Clone of 'Clone of 'Clone of 'XXXX, BLAT' shared by 'gordon(a)cshl.edu' (active items only)' shared by 'xxxxxx(a)cshl.edu' (active items only)' shared by 'gordon(a)cshl.edu' (active items only)
===
Clone of 'Clone of 'xxxxxx, BLAT' shared by 'gordon(a)cshl.edu' (active items only)' shared by 'xxxxxxx(a)cshl.edu' (active items only)
===
Clone of 'xxxxxx, BLAT' shared by 'gordon(a)cshl.edu' (active items only)
===
and so on...
Each history adds just one or two more useful datasets, and it becomes cumbersome to manage all of them (not to mention that my histories list and shared-histories list are littered with the same history over and over).
I think that in order to take Galaxy to the 'next level' as a collaborative framework, a truly shared history mechanism would be useful:
a single history instance that, once shared, immediately shows every change made to it (by any user) to all users sharing it.
This is conceptually similar to http://piratepad.net/ (previously Etherpad), where all participants immediately see all the changes made to the content.
Thanks,
-gordon
12 years, 6 months
XML definition
by Sébastien HARISPE
Hi,
I am trying to integrate some tools into Galaxy but I have encountered some
difficulties in attempting to understand how the XML tool definition works
exactly ... despite the XML tag documentation available at
http://bitbucket.org/galaxy/galaxy-central/wiki/ToolConfigSyntax .
Imagine a case where the presence of an argument on the command line depends on:
- its own value, e.g. we don't want to include it if its value is undefined
- other arguments' values, e.g. if the value of x is greater than the value
of y, we don't want to include the x argument on the command line
How can I specify this using the XML definition?
A simplified case:
A tool needs arguments A and B or an input file F
We want to offer two modes to configure it:
- [1] a basic mode where we can graphically set the A and B values using fields
such as text inputs
- [2] an advanced mode where we can upload a file containing a complex
configuration
Tool command line:
for [1]: tool.py -A arg1 -B arg2
for [2]: tool.py -F confile
I don't want to include the -F command line argument if -A and -B are
defined.
I currently manage these cases using wrappers, which is quite tedious.
Where can we find:
- advanced documentation for XML definition
- commented advanced examples
- related discussions
Best regards
Seb
[sorry for the bad english]
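The wrapper approach mentioned in the message above can be sketched roughly as follows. This is only an illustration in Python, not Galaxy's own mechanism: the function name build_command and its parameters are hypothetical, and only tool.py and the -A, -B and -F flags come from the message.

```python
def build_command(a=None, b=None, conf_file=None):
    """Return the argument list for tool.py (hypothetical helper, not a Galaxy API)."""
    cmd = ["tool.py"]
    if a is not None and b is not None:
        # Basic mode [1]: A and B are both defined, so -F is left out
        # even if a configuration file was also supplied.
        cmd += ["-A", str(a), "-B", str(b)]
    elif conf_file is not None:
        # Advanced mode [2]: fall back to the configuration file.
        cmd += ["-F", str(conf_file)]
    else:
        raise ValueError("need either both A and B, or a configuration file")
    return cmd


print(" ".join(build_command(a="arg1", b="arg2")))
print(" ".join(build_command(conf_file="confile")))
```

Returning a list rather than a single string keeps each argument separate, which avoids quoting problems if the wrapper later passes the list to subprocess.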
12 years, 6 months
Re: [galaxy-dev] [galaxy-user] microbes data in local instance of galaxy
by Daniel Blankenberg
Hi Alex,
This data was obtained using scripts that are found under $GALAXY_ROOT/scripts/microbes/, and there is a README.txt file available there as well. However, it has been some time since these scripts were last used, and it is possible that they have become stale and would require some tweaking to get working properly (IIRC there was some messy webpage scraping involved).
Thanks for using Galaxy,
Dan
On Jul 13, 2010, at 2:30 AM, Bossers, Alex wrote:
> Hi All,
> We have our local Galaxy instance running, and it works fine.
> In the Get Data section, the Microbes tool has no local NCBI data. The public instance has it.
> What is the best/easiest way to get that data into our local instance of Galaxy? I have been browsing the wikis and looked through the library and dataset documentation, but was unable to resolve this at first glance.
> Any help/guidance appreciated.
> Thanks
> Alex
> _______________________________________________
> galaxy-user mailing list
> galaxy-user(a)lists.bx.psu.edu
> http://lists.bx.psu.edu/listinfo/galaxy-user
12 years, 6 months
Workflow copy history item
by Dennis Gascoigne
The copy history item option is available from the Edit Attributes page for a
history item. It would be more useful and a bit more logical if this option
were available from the history drop-down menu on the right and the history
menu in the saved histories page.
The reasons/advantages are:
* The history item copy page offers the capacity to copy one or more items
from the history; it is not limited to the history item you clicked through
from.
* It is much easier to copy items between histories by selecting the option
from the history menu, or from the history menus in the saved histories page,
than by opening a history, clicking on Edit Attributes, scrolling to the
bottom of the page, and clicking copy history item.
My apologies if this seems picky, but I do this a lot.
Cheers
Dennis
12 years, 6 months
History link in main menu
by Dennis Gascoigne
This might seem minor, but a link to the History manager in the main menu at the
top (between Analyze and Workflow) would be really handy. It is a much more
common destination than workflows but is an extra click/select away
through the workflow menu on the right. A couple of users in our group have
brought this up, and I kind of agree.
Cheers
Dennis
12 years, 6 months
Workflow Error
by Sumedha Ganjoo
Hi,
I am using a tool that interacts with Web services via Galaxy. Run on
its own in the Galaxy GUI, it works fine. But when we run it in a workflow
with other tools, it sometimes runs correctly and sometimes
gives the following error:
OperationalError: (OperationalError) database is locked u'UPDATE job
SET update_time=?, command_line=?
WHERE job.id = ?' ['2010-07-12 14:10:18.398960', 'python
${installationDirectory}/tools/${............
.....................................................
I was wondering whether Galaxy has some timeouts implemented in its
workflows, because of which, if the first
step takes a while, the second step of the workflow is executed simultaneously?
Or is there any other explanation for such behavior?
I would really appreciate a reply as soon as possible.
Thanks.
Regards,
Sumedha
--
Sumedha Ganjoo
Graduate Assistant,
Department Of Computer Science,
University Of Georgia,
Athens, GA , USA
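For context, "database is locked" is the message SQLite reports when one connection tries to write while another holds the write lock, so the error may point at concurrent database writes rather than at a workflow timeout. That reading is an assumption: the message above does not say which database backend this instance uses. The sketch below reproduces the same error text with Python's standard sqlite3 module; the file name, table layout and function name are made up for the demonstration and are not Galaxy's.

```python
import os
import sqlite3
import tempfile


def demo_locked_error():
    """Return the error text raised when a second writer hits a locked SQLite file."""
    path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")  # throwaway database file

    # isolation_level=None means autocommit, so transactions are started explicitly.
    writer = sqlite3.connect(path, isolation_level=None)
    writer.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, command_line TEXT)")
    writer.execute("INSERT INTO job (command_line) VALUES ('step 1')")

    # The first connection takes the write lock and holds it.
    writer.execute("BEGIN IMMEDIATE")
    writer.execute("UPDATE job SET command_line = 'step 1 (running)' WHERE id = 1")

    # The second connection waits at most 0.1 s for the lock before giving up.
    other = sqlite3.connect(path, timeout=0.1, isolation_level=None)
    try:
        other.execute("UPDATE job SET command_line = 'step 2' WHERE id = 1")
        message = None
    except sqlite3.OperationalError as exc:
        message = str(exc)
    finally:
        writer.execute("ROLLBACK")
        writer.close()
        other.close()
    return message


print(demo_locked_error())
```

A longer timeout makes the second writer wait for the lock instead of failing quickly; whether that, or a different database backend, is the right remedy here depends on how the instance is configured.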
12 years, 6 months
TopHat and other tools with too many options
by Assaf Gordon
Hi all,
I'm in the process of adapting TopHat to our needs, and there are just too many options...
It's OK if you run it from the command line, but in Galaxy it looks like a big mess.
Similar to the situation with the Bowtie tool, the "common" options are not specific enough (for our needs) and the "full options" mode is too hard to use (which will result in users not using it at all, or using it wrongly).
I'd like to request/propose a change in the way the GUI is rendered based on the tool XML.
Mainly, to create logical parameter "groups": parameters which logically go together and are related to one another.
In the XML file, it could look like:
<inputs>
    <group name="Introns" help="These settings control the intron sensitivity.">
        <param name="min_intron_length" type="integer" value="70" />
        <param name="max_intron_length" type="integer" value="500000" />
    </group>
    <group name="Quality" help="XXXXXX">
        <param name="max_multihits" type="integer" value="40" label="Maximum number of alignments to be allowed" />
        <param name="junction_filter" type="float" value="0.15" label="Minimum isoform fraction:" />
    </group>
    ...
</inputs>
And in the HTML output, the groups would be visually distinct, with some nice hide/expand JavaScript trick; see a (fake) example here:
http://cancan.cshl.edu/labmembers/gordon/files/galaxy_advanced_options.html
IMHO, there are a couple of advantages to this layout:
1. The "big picture" of the available settings is immediately visible (e.g. "introns", "quality", "segments" etc.).
2. Parameters are separated into logical groups, making it easier to understand what's being changed (as opposed to one very long cryptic list of parameters).
3. Advanced vs. simple options are clearly marked.
4. Since parameter groups can be hidden, when they are expanded they can contain a help paragraph. This is much easier for the user than scrolling up/down to see the help section below (also, the relevant help section now appears right next to the parameters).
I guess this is not a trivial change, but without it, it will get harder and harder to integrate complex tools.
Comments are welcome,
-gordon
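To make the proposal above a bit more concrete, here is a rough Python sketch of how a renderer might read the suggested markup and list parameters per group. The <group> tag is the one proposed in this message, not an existing part of the tool syntax, and the function names collect_groups and render_outline are hypothetical. The plain-text outline stands in for the hide/expand HTML.

```python
import xml.etree.ElementTree as ET

# The markup proposed in the message; <group> is the suggested tag, not an existing one.
PROPOSED_XML = """
<inputs>
  <group name="Introns" help="These settings control the intron sensitivity.">
    <param name="min_intron_length" type="integer" value="70" />
    <param name="max_intron_length" type="integer" value="500000" />
  </group>
  <group name="Quality" help="XXXXXX">
    <param name="max_multihits" type="integer" value="40" label="Maximum number of alignments to be allowed" />
    <param name="junction_filter" type="float" value="0.15" label="Minimum isoform fraction:" />
  </group>
</inputs>
"""


def collect_groups(xml_text):
    """Return a list of (group name, help text, [param names]) in document order."""
    root = ET.fromstring(xml_text)
    groups = []
    for group in root.findall("group"):
        params = [param.get("name") for param in group.findall("param")]
        groups.append((group.get("name"), group.get("help"), params))
    return groups


def render_outline(groups):
    """Render one heading per group with its help text, followed by its parameters."""
    lines = []
    for name, help_text, params in groups:
        lines.append(f"[{name}] {help_text}")
        lines.extend(f"  - {param}" for param in params)
    return "\n".join(lines)


print(render_outline(collect_groups(PROPOSED_XML)))
```

Keeping the help text on the group itself is what lets each section show its own explanation when expanded, which is advantage 4 in the list above.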
12 years, 6 months
Cleanup hidden datasets cleanup script
by Dennis Gascoigne
Hi guys;
We have taken advantage of your recent excellent additions to workflows
and have a lot of histories where intermediate datasets are hidden. It is
becoming quite critical for us to catch these hidden history items in the
cleanup scripts and delete them, as they consume a lot of room and we cannot
identify another way of removing them.
The obvious place to remove hidden datasets appears to be an option in
the cleanup procedures. In previous discussions you have mentioned that this
is something you propose doing, and it would at first seem relatively quick.
What is your timing on this, or has it already been done? If it is a while
away, is there anything I should consider in writing/amending a query to
roll our own?
Cheers
Dennis
12 years, 6 months
Re: [galaxy-dev] [galaxy-bugs] help about compress file
by Jeremy Goecks
Eric,
The galaxy-dev mailing list (cc'd) is a good place to ask your question. This isn't my area of expertise, so hopefully someone else can chime in and help you out.
J.
On Jul 9, 2010, at 11:12 AM, Eric Aguiar wrote:
> Jeremy,
>
> I followed all the steps in the Galaxy tutorial, but I didn't have success.
> I'm trying to create a datatype for megabase chromatograms (.esd), very similar to the Ab1 ones.
>
> Here are my configurations.
>
>
> 1 - Creating datatypes in datatypes_conf.xml
>
> <datatype extension="zip" type="galaxy.datatypes.binary:Esd" mimetype="application/zip" display_in_upload="true"/>
> <datatype extension="esd" type="galaxy.datatypes.binary:Esd" mimetype="application/octet-stream" display_in_upload="true"/>
>
> 2 - Defining types in lib/galaxy/datatypes/binary.py
>
> class Esd( Binary ):
>     """Class describing an esd binary sequence file"""
>     file_ext = "esd"
>
>     def set_peek( self, dataset, is_multi_byte=False ):
>         if not dataset.dataset.purged:
>             dataset.peek = "Binary chromatograms sequence file"
>             dataset.blurb = data.nice_size( dataset.get_size() )
>         else:
>             dataset.peek = 'file does not exist'
>             dataset.blurb = 'file purged from disk'
>
>     def display_peek( self, dataset ):
>         try:
>             return dataset.peek
>         except:
>             return "Binary esd sequence file (%s)" % ( data.nice_size( dataset.get_size() ) )
>
>
> class Zip( Binary ):
>     """Class describing a zip archive of binary sequence files"""
>     file_ext = "zip"
>
>     def set_peek( self, dataset, is_multi_byte=False ):
>         if not dataset.dataset.purged:
>             zip_file = zipfile.ZipFile( dataset.file_name, "r" )
>             num_files = len( zip_file.namelist() )
>             dataset.peek = "Archive of %s binary sequence files" % ( str( num_files ) )
>             dataset.blurb = data.nice_size( dataset.get_size() )
>         else:
>             dataset.peek = 'file does not exist'
>             dataset.blurb = 'file purged from disk'
>
>     def display_peek( self, dataset ):
>         try:
>             return dataset.peek
>         except:
>             return "Binary sequence file archive (%s)" % ( data.nice_size( dataset.get_size() ) )
>
>     def get_mime( self ):
>         """Returns the mime type of the datatype"""
>         return 'application/zip'
>
>
> When I try to upload the file in zip format (compressed .esd files), it shows me the following error:
> "An error occurred running this job: Invalid 'File Format' for archive consisting of binary files - use 'Binseq.zip'"
>
> I tried a few things, but I haven't had any success.
>
> Thank you,
>
> On 07/08/2010 05:50 PM, Jeremy Goecks wrote:
>>
>>> I would like to know about the use of compressed files (zip format) in the Get Data app. I'm trying to upload compressed .esd files, but the program shows me the following error message: "The uploaded file contains inappropriate content".
>>
>>
>> Hi Eric,
>>
>> Galaxy can accept zip files with a single compressed file but does not recognize .esd files. You'll need to convert the esd file into a format that Galaxy recognizes or run your own Galaxy instance and write your own datatype for Galaxy:
>>
>> http://bitbucket.org/galaxy/galaxy-central/wiki/AddingDatatypes
>>
>> Thanks,
>> J.
>
>
12 years, 6 months
Adding job
by Filip Balejko
Hi,
I'm working on the project for Google Summer of Code:
https://www.nescent.org/wg_phyloinformatics/Phyloinformatics_Summer_of_Co...
Demo instance can be found here: http://137.110.191.252:8080/ (under
the "Phylogenetics" section)
As you can see, detailed results of the SLAC analysis are generated on demand.
In this version, the computation is done when the request is handled (in the
datamonkey controller).
It bypasses the job queue, and I guess this might not be considered
the best approach.
I was thinking about creating jobs for those on-demand computations.
What do you think about this solution?
I understand that to accomplish this, I have to create another tool
and put my code there.
Is this the only way to use the job queue?
Is it possible to add a job which isn't displayed in the history box?
best regards,
Filip Balejko
12 years, 6 months