July 2011 - galaxy-dev - lists.galaxyproject.org

New <version_string> tag in tool XML
by Peter Cock 08 Jul '11

Hi all,

This new feature looks very exciting for tracking reproducibility. In theory, both the tool wrapper version (explicit in the XML) and the underlying tool version will now be recorded:
https://bitbucket.org/galaxy/galaxy-central/changeset/84b20f29dfdf

I have updated the NCBI BLAST+ wrappers to use the new feature; please merge/transplant this branch/commit:
https://bitbucket.org/peterjc/galaxy-central/src/blast_2011_07_07
https://bitbucket.org/peterjc/galaxy-central/changeset/11f425f4e137

I hope the new tool version string will get its own field in the database soon (and be viewable via the web interface). Note that the string may well be multi-line, e.g.

$ blastp -version
blastp: 2.2.25+
Package: blast 2.2.25, build Apr 6 2011 13:59:28

Unless, that is, you go for more fragile solutions like these to ensure a single line:

$ samtools 2>&1 | grep -i "^Version"
Version: 0.1.16 (r963:234)
$ bwa 2>&1 | grep -i "^Version"
Version: 0.5.9-r16

Thanks,
Peter

P.S. I would say the name <version_string> is a bit confusing; I'd have picked <version_cmd> or <version_command> instead.
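A minimal Python sketch of the single-line concern above, assuming the version output has already been captured as text. The function name `single_line_version` and its fallback rule are hypothetical and are not part of Galaxy: it prefers a line starting with "Version" (mirroring the grep examples) and otherwise falls back to the first non-empty line.

```python
import re


def single_line_version(output, pattern=r"^version"):
    """Reduce possibly multi-line version output to a single line.

    Hypothetical helper: prefer the first line matching `pattern`
    (case-insensitive, like `grep -i "^Version"`); otherwise fall back
    to the first non-empty line.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    for line in lines:
        if re.search(pattern, line, re.IGNORECASE):
            return line
    return lines[0] if lines else ""


# Sample outputs copied from the message above.
blast_output = "blastp: 2.2.25+\nPackage: blast 2.2.25, build Apr 6 2011 13:59:28\n"
bwa_output = "Version: 0.5.9-r16\n"

print(single_line_version(blast_output))
print(single_line_version(bwa_output))
```

The fallback is as fragile as the grep approach: it silently drops the "Package:" line for BLAST+, so storing the full multi-line string may still be preferable for reproducibility.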

Re: [galaxy-dev] [galaxy-user] Fwd: deleting datasets from history
by Hans-Rudolf Hotz 07 Jul '11

Hi Sergei,

This is a question better asked on 'galaxy-dev(a)bx.psu.edu', since you refer to your local Galaxy installation.

In order to remove the data from your file system, you need to run the 'cleanup scripts', as described on this wiki page:
https://bitbucket.org/galaxy/galaxy-central/wiki/Config/PurgeHistoriesAndDa…

Regards,
Hans

On 07/06/2011 03:33 PM, Sergei Ryazansky wrote:
> -------- Original Message --------
> Subject: deleting datasets from history
> Date: Tue, 5 Jul 2011 19:58:45 +0300
> From: Sergei Ryazansky <s.ryazansky(a)gmail.com>
> To: galaxy-user-request(a)lists.bx.psu.edu
>
> Hello all,
>
> After deleting datasets from the history panel in our Galaxy mirror,
> the indicator at the top right corner shows the same amount of used
> space as before the deletion. Also, the files corresponding to the
> datasets remain in the Galaxy database/files/000 directory. It seems
> that deleting datasets from the history only deletes the link to the
> file but not the file itself. How can we configure the Galaxy mirror to
> delete not only the records in the history panel but also the
> corresponding files?
> Thank you in advance!
>
> ___________________________________________________________
> The Galaxy User list should be used for the discussion of
> Galaxy analysis and other features on the public server
> at usegalaxy.org. Please keep all replies on the list by
> using "reply all" in your mail client. For discussion of
> local Galaxy instances and the Galaxy source code, please
> use the Galaxy Development list:
>
> http://lists.bx.psu.edu/listinfo/galaxy-dev
>
> To manage your subscriptions to this and other Galaxy lists,
> please use the interface at:
>
> http://lists.bx.psu.edu/

error showing errors
by Assaf Gordon 07 Jul '11

Hi,

I've encountered a strange error when trying to view a failed job/dataset. I'm using the latest stable revision (June 23rd, 720455407d1c).

A user executed a job and it failed (bad parameters, so no problem here). The program (bowtie) failed and returned information in STDERR. The "peek" field in the "history_dataset_association" table contains the STDERR message from bowtie (it can be seen in the red rectangle in the history pane). So far, so good. But:

1. When the user clicked on the green "bug" icon, the following Python exception occurred:

==========
URL: http://genomics.cshl.edu/galaxy/dataset/errors
File '/localdata1/galaxy/galaxy_prod/eggs/WebError-0.8a-py2.6.egg/weberror/evalexception/middleware.py', line 364 in respond
  app_iter = self.application(environ, detect_start_response)
File '/localdata1/galaxy/galaxy_prod/eggs/Paste-1.6-py2.6.egg/paste/debug/prints.py', line 98 in __call__
  environ, self.app)
File '/localdata1/galaxy/galaxy_prod/eggs/Paste-1.6-py2.6.egg/paste/wsgilib.py', line 539 in intercept_output
  app_iter = application(environ, replacement_start_response)
File '/localdata1/galaxy/galaxy_prod/eggs/Paste-1.6-py2.6.egg/paste/recursive.py', line 80 in __call__
  return self.application(environ, start_response)
File '/localdata1/galaxy/galaxy_prod/lib/galaxy/web/framework/middleware/remoteuser.py', line 112 in __call__
  return self.app( environ, start_response )
File '/localdata1/galaxy/galaxy_prod/eggs/Paste-1.6-py2.6.egg/paste/httpexceptions.py', line 632 in __call__
  return self.application(environ, start_response)
File '/localdata1/galaxy/galaxy_prod/lib/galaxy/web/framework/base.py', line 145 in __call__
  body = method( trans, **kwargs )
TypeError: errors() takes exactly 3 arguments (2 given)
==========

I guess this is because the "ID=XXXX" CGI parameter is missing from the URL. It happened a couple of times (using Chrome), then never happened again.

2. The user shared the history with me (for me to debug it). Cloning the history isn't useful (because job information like STDERR and STDOUT is lost). When viewing the shared history (without cloning), the "bug" icon has the following URL:
http://genomics.cshl.edu/galaxy/dataset/errors?id=2218&use_panels=True
and gives the following exception:

===
URL: http://genomics.cshl.edu/galaxy/dataset/errors?id=2218&use_panels=True
File '/localdata1/galaxy/galaxy_prod/eggs/WebError-0.8a-py2.6.egg/weberror/evalexception/middleware.py', line 364 in respond
  app_iter = self.application(environ, detect_start_response)
File '/localdata1/galaxy/galaxy_prod/eggs/Paste-1.6-py2.6.egg/paste/debug/prints.py', line 98 in __call__
  environ, self.app)
File '/localdata1/galaxy/galaxy_prod/eggs/Paste-1.6-py2.6.egg/paste/wsgilib.py', line 539 in intercept_output
  app_iter = application(environ, replacement_start_response)
File '/localdata1/galaxy/galaxy_prod/eggs/Paste-1.6-py2.6.egg/paste/recursive.py', line 80 in __call__
  return self.application(environ, start_response)
File '/localdata1/galaxy/galaxy_prod/lib/galaxy/web/framework/middleware/remoteuser.py', line 112 in __call__
  return self.app( environ, start_response )
File '/localdata1/galaxy/galaxy_prod/eggs/Paste-1.6-py2.6.egg/paste/httpexceptions.py', line 632 in __call__
  return self.application(environ, start_response)
File '/localdata1/galaxy/galaxy_prod/lib/galaxy/web/framework/base.py', line 145 in __call__
  body = method( trans, **kwargs )
TypeError: errors() got an unexpected keyword argument 'use_panels'
===

regards,
-gordon
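A minimal Python sketch of how both TypeErrors can arise from the `method( trans, **kwargs )` call shown in the tracebacks. The classes below are hypothetical stand-ins, not Galaxy's actual controller code: the strict signature is inferred only from the error messages, and the tolerant variant is one possible fix, not necessarily the one Galaxy adopted.

```python
class StrictController:
    # Hypothetical: requires `id` and accepts nothing else, which matches
    # "takes exactly 3 arguments" (self, trans, id) in the first traceback.
    def errors(self, trans, id):
        return "errors for dataset %s" % id


class TolerantController:
    # Hypothetical fix: `id` is optional and extra query parameters such as
    # use_panels are absorbed by **kwd instead of raising TypeError.
    def errors(self, trans, id=None, **kwd):
        if id is None:
            return "no dataset id given"
        return "errors for dataset %s" % id


def dispatch(method, trans, **kwargs):
    # Mirrors the framework call in base.py: body = method( trans, **kwargs )
    return method(trans, **kwargs)


def try_request(controller, **query):
    """Dispatch a request built from URL query parameters and report failures."""
    try:
        return dispatch(controller.errors, "trans", **query)
    except TypeError as exc:
        return "TypeError: %s" % exc


strict, tolerant = StrictController(), TolerantController()
print(try_request(strict))                               # URL without id
print(try_request(strict, id=2218, use_panels="True"))   # URL with extra parameter
print(try_request(tolerant, id=2218, use_panels="True"))
```

The exact wording of the TypeError messages depends on the Python version, so it will differ from the Python 2.6 messages quoted above.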

Error when viewing data in GeneTrack
by Lewis, Brian Andrew 07 Jul '11

I'm seeing the following error in the log when clicking on the "View in GeneTrack" link for a specific dataset:

Error - <type 'exceptions.Exception'>: A data display parameter is in the error state: genetrack_file
URL: http://localhost:8081/galaxy/display_application/cbbbf59e8f08c98c/genetrack…
File '/usr/local/galaxy-dist/eggs/Paste-1.6-py2.6.egg/paste/exceptions/errormiddleware.py', line 143 in __call__
  app_iter = self.application(environ, start_response)
File '/usr/local/galaxy-dist/eggs/Paste-1.6-py2.6.egg/paste/recursive.py', line 80 in __call__
  return self.application(environ, start_response)
File '/usr/local/galaxy-dist/eggs/Paste-1.6-py2.6.egg/paste/httpexceptions.py', line 632 in __call__
  return self.application(environ, start_response)
File '/usr/local/galaxy-dist/lib/galaxy/web/framework/base.py', line 145 in __call__
  body = method( trans, **kwargs )
File '/usr/local/galaxy-dist/lib/galaxy/web/controllers/dataset.py', line 607 in display_application
  display_link = display_app.get_link( link_name, data, dataset_hash, user_hash, trans )
File '/usr/local/galaxy-dist/lib/galaxy/datatypes/display_applications/application.py', line 176 in get_link
  return PopulatedDisplayApplicationLink( self.links[ link_name ], data, dataset_hash, user_hash, trans )
File '/usr/local/galaxy-dist/lib/galaxy/datatypes/display_applications/application.py', line 116 in __init__
  self.ready, self.parameters = self.link.build_parameter_dict( self.data, self.dataset_hash, self.user_hash, trans )
File '/usr/local/galaxy-dist/lib/galaxy/datatypes/display_applications/application.py', line 57 in build_parameter_dict
  if param.ready( other_values ):
File '/usr/local/galaxy-dist/lib/galaxy/datatypes/display_applications/parameters.py', line 120 in ready
  raise Exception( 'A data display parameter is in the error state: %s' % ( self.name ) )
Exception: A data display parameter is in the error state: genetrack_file

This is happening on a local instance of Galaxy, which has been updated to the most recent revision. Before updating

The specific data came from the UCSC table browser.

~ Brian

Re: [galaxy-dev] New galaxy tool shed
by Greg Von Kuster 07 Jul '11

Hello Marcel,

http://usegalaxy.org/community

On Jul 7, 2011, at 8:14 AM, Marcel Schumann wrote:
> Hi Greg, hi Dave,
>
> just a short & trivial question today: ;-)
>
> I would like to put a link to the Galaxy tool shed on my ISMB poster ... wouldn't it perhaps make sense to set up a (forwarding) address that people can remember more easily than the current one (http://toolshed.g2.bx.psu.edu/)?
> How about toolshed.getgalaxy.org, for example? If you could set up something like this and send me the address, I would put that one on my poster ...
>
> Cheers & thanks,
> Marcel
>
> On 06/30/2011 08:58 PM, Marcel Schumann wrote:
>> Ah, just by the way, I just got an idea for another metadata field: some
>> more detailed author and contact info. That is, some fields where the
>> user (if he/she wants to) can enter his/her full name/affiliation and a
>> contact email address ... the latter perhaps protected in some way to
>> ward off spam (http://www.google.com/recaptcha/mailhide/, or something
>> similar).
>>
>> Just an idea ... and just in case you weren't planning this feature yet ;-)
>>
>> cu,
>> Marcel
>>
>> On 06/30/2011 08:21 PM, Marcel Schumann wrote:
>>> Hi Greg,
>>>
>>> great! Thanks!
>>> The description looks much better this way and the new metadata features
>>> will certainly be very helpful as well ...
>>>
>>> Cheers,
>>> Marcel
>>>
>>> On 06/30/2011 07:55 PM, Greg Von Kuster wrote:
>>>>
>>>> This issue has been fixed - thanks for pointing it out.
>>>>
>>>> Yes, attributes like tool_id, version, etc., fall into the category of
>>>> repository metadata, a new feature which I am currently implementing.
>>>> It should be available fairly soon.
>>>>
>>>> Greg Von Kuster
>>>> Galaxy Development Team
>>>> greg(a)bx.psu.edu
>
> --
> Marcel Schumann
>
> University of Tuebingen
> Wilhelm Schickard Institute for Computer Science
> Division for Applied Bioinformatics
> Room C304, Sand 14, D-72076 Tuebingen
>
> phone: +49 (0)7071-29 70437
> fax: +49 (0)7071-29 5152
> email: schumann(a)informatik.uni-tuebingen.de

Greg Von Kuster
Galaxy Development Team
greg(a)bx.psu.edu

Re: [galaxy-dev] How to set java heap size in Galaxy?
by liu bo 07 Jul '11

Hi Roman,

Thanks for your explanation. I'm sorry for this late reply. I think the problem is in my app. I'll check that further.

Thanks,
Bo

Date: Thu, 30 Jun 2011 14:53:08 +0200
From: Roman Valls <brainstorm(a)nopcode.org>
To: galaxy-dev(a)lists.bx.psu.edu
Subject: Re: [galaxy-dev] How to set java heap size in Galaxy?
Message-ID: <4E0C71B4.2010308(a)nopcode.org>
Content-Type: text/plain; charset=ISO-8859-1

Hello Liu,

I'm keeping our mail thread on the mailing list, as advised by the guidelines in the footer of each mail.

Now, Galaxy runs on top of paster, a minimal Python web server, and it has nothing to do with Java, so it looks like your out-of-memory error comes from your app. Have you tried the settings suggested by Marco? Namely:

java -Xmx512m

I bet there's an environment variable or config setting for Tomcat too, if you're unsure about where your jars are called.

Hope that helps!
Roman

On 2011-06-29 12:02, liu bo wrote:
> Thank you, Roman.
> Our app invokes a workflow server.
> I guess Galaxy has its own container like Tomcat, which is why we can
> access it via 127.0.0.1 from a browser.
> I know that in Tomcat there is a file "catalina.sh" for setting JAVA_OPTS.
> Is there a similar file in Galaxy to set JAVA_OPTS?
> Thank you very much.
>
> Regards,
> Bo
>
> On Tue, Jun 28, 2011 at 4:18 AM, Roman Valls <brainstorm(a)nopcode.org> wrote:
>> Hi Liu,
>>
>> Does your app execute Picard/GATK at some point? In that case those
>> would be the ones triggering your Java OOM error, and Bo's
>> suggestion is the way to go. That's my best guess, since Galaxy itself
>> doesn't use Java either (only Python, AFAIK).
>>
>> Regards,
>> Roman
>>
>> On 2011-06-27 18:44, liu bo wrote:
>>> Hi Marco,
>>>
>>> Thanks for your kind reply.
>>> My app is a Python file, and there's no explicit command to run Java.
>>> I just wonder whether there is a file to configure Galaxy, in order to
>>> set the Java heap size.
>>> Thank you.
>>>
>>> Best wishes,
>>> Bo
>>>
>>> On Mon, Jun 27, 2011 at 7:51 PM, Marco Moretto <marco.moretto(a)gmail.com> wrote:
>>>> Hi Liu,
>>>> if you set up your application in the XML to run with Java, like java -jar
>>>> yourapplication.jar, it is sufficient to add the -Xmx parameter. Something
>>>> like
>>>> java -Xmx512m -jar yourapplication.jar
>>>> Greets
>>>> ---
>>>> Marco
>>>>
>>>> On 27 June 2011 12:15, liu bo <liubo03(a)gmail.com> wrote:
>>>>>
>>>>> Dear all,
>>>>>
>>>>> I have added an application to Galaxy's Tools.
>>>>> Now there occurs an error "java.lang.OutOfMemoryError: Java heap space".
>>>>> Do you know how to set java heap size in Galaxy?
>>>>> Thank you very much.
>>>>>
>>>>> Best regards,
>>>>> Bo
>>>>> ___________________________________________________________
>>>>> Please keep all replies on the list by using "reply all"
>>>>> in your mail client. To manage your subscriptions to this
>>>>> and other Galaxy lists, please use the interface at:
>>>>>
>>>>> http://lists.bx.psu.edu/
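A minimal sketch, assuming the tool is a Python script that itself launches a Java program, of where the -Xmx setting from Marco's suggestion would go. The helper name `java_command` and its default heap size are hypothetical; only the `java -Xmx512m -jar yourapplication.jar` form comes from the thread.

```python
def java_command(jar_path, max_heap="512m", extra_args=()):
    """Build an argument list equivalent to: java -Xmx<max_heap> -jar <jar_path> ...

    Hypothetical helper: the heap limit is a JVM option, so it must be
    placed before -jar, not among the application's own arguments.
    """
    return ["java", "-Xmx%s" % max_heap, "-jar", jar_path] + list(extra_args)


cmd = java_command("yourapplication.jar")
print(" ".join(cmd))

# A wrapper script could then run it, for example with:
#   import subprocess
#   subprocess.check_call(cmd)
```

Because the JVM is started by the tool and not by Galaxy, the heap size is set wherever the `java` command is assembled, which is consistent with Roman's point that Galaxy itself has no Java setting to change.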

From NCBI SRA to UCSC viewer pipeline.
by colin molter 07 Jul '11

Hi all,

I am trying to use a local instance of Galaxy to pre-format data stored at NCBI SRA. Does anyone have a working pipeline that (s)he could share?

Here is the pipeline I am using, with some questions.

1/ Download the SRA files to my server.
2/ Transform them to FASTQ using the SRA toolkit.
3/ Upload them into Galaxy, using 'add to data library'.
4/ Use the FASTQ Groomer so that the FASTQ can be used in Galaxy.
   Note: I guess that the data at SRA are already in the FASTQ Sanger format, so it would be nice to be able to skip this step (it took 10 hours to groom a 25 GB FASTQ file).
5/ Map with Bowtie --> FASTQ to SAM
6/ Filter SAM
7/ SAM to BAM

Problems:

* The SRA data I got are RNA-seq. I heard that Bowtie is not a good choice because it can't deal with splicing (so Bowtie is OK for genomic reads but not for RNA-seq).
  ==> What is the best way to align RNA-seq? TopHat? The problem is that I heard that, although TopHat can deal with gaps, it loses information about deletions. Someone told me that it could be better to use BWA and then add a further step to deal with the splicing and the gaps. Any information?

* To see my data in IGV, an index (BAI) should be created. Normally IGV could create it itself, but it didn't work. I heard that the data should be sorted. The SAM I got from Bowtie is ordered by name, and it should be ordered by chromosome and position. Is that right? In that case I could use the sort tool of Galaxy and apply it to the SAM before transforming it into a BAM. Is that right?

Any other/related hints? Is there no simple tutorial/screencast about this process, which I guess most Galaxy users have already done?

thx
colin

--
Colin Molter
University of Brussels - InSilico Team - http://insilico.ulb.ac.be/
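A minimal Python sketch of the "ordered by chromosome and position" idea from the second problem above, operating on SAM text lines. The function name `sort_sam_lines` is hypothetical, and the sketch makes a simplifying assumption: chromosomes are ordered by plain string comparison, whereas dedicated sorting tools generally follow the order of the reference sequences in the header.

```python
def sort_sam_lines(lines):
    """Return SAM lines with header lines first, then records sorted by
    reference name (column 3, RNAME) and 1-based position (column 4, POS).

    Simplifying assumption: reference names are compared as plain strings.
    """
    headers = [line for line in lines if line.startswith("@")]
    records = [line for line in lines if line and not line.startswith("@")]

    def sort_key(line):
        fields = line.split("\t")
        return (fields[2], int(fields[3]))

    return headers + sorted(records, key=sort_key)


# Tiny made-up example: three records in read-name order.
sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "@SQ\tSN:chr2\tLN:1000",
    "readA\t0\tchr2\t50\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "readB\t0\tchr1\t300\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "readC\t0\tchr1\t20\t255\t4M\t*\t0\t0\tACGT\tIIII",
]
for line in sort_sam_lines(sam):
    print(line.split("\t")[0])
```

For real data, sorting after conversion to BAM with a dedicated tool is usually the safer route; the sketch only illustrates what "coordinate order" means for the records.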

custom builds as tool in toolbox panel?
by Michal Stuglik 07 Jul '11

Hi,

Would it be possible to move the "Custom builds" form from the user menu into the toolbox panel as a tool? At the moment it is not possible to use this feature in a workflow (as part of a pipeline)...

all the best,
michal

suggestions for the SAM-to-BAM tool
by Assaf Gordon 07 Jul '11

Hi,

A couple of things could be slightly improved in the SAM-to-BAM tool:

1. "Reference list" is not informative (it's the technical way to say "list of chromosomes and their sizes based on a FASTA file"). Users do not generally know what a "reference list" is.

2. The "Locally Cached" option is not informative (I had to look in the source code to understand what it means). What it should say is something like: "Get list of chromosomes/sizes based on the dataset's organism/database" (it could be shorter, but should be friendly enough).

3. There's no option to take the chromosome list from the SAM file header. Some SAM files will contain the header (this can even be done in the standard bowtie tool wrapper), which saves the need to specify where to get the "reference list" from.

4. Autodetection in the "set-metadata" step would go a long way here: if the SAM file already has a header, then there is no need to even ask about it. If it doesn't have a header but has a DBKEY, then we're still OK. If there is no DBKEY and no header, then complain or ask for a FASTA file from the current history. (I realize that implementing this feature is hard and annoying; I don't imply that it's easy to do, just that it's needed.)

5. Inside the Python script (sam_to_bam.py) there's a comment that says: "for some reason the samtools view command gzips the resulting bam file without warning". Not sure why one cares about that, but "samtools view -u" will output an uncompressed BAM file.

6. samtools supports piping, so a lot of I/O (and some time) can be saved by piping the two commands together:

   samtools view -u -b -S "INPUT.SAM" | samtools sort - OUTPUT

   instead of running two commands and generating a temporary unsorted BAM file.

-gordon
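A minimal Python sketch of points 4 and 6 above. All function names are hypothetical and nothing here is taken from sam_to_bam.py. The first function encodes the proposed fallback order (header, then DBKEY, then ask for a FASTA file), assuming that an unset dbkey is represented as None or "?". The second shows the piped samtools call using the exact command line from point 6; it is defined but not executed here, since it needs samtools installed.

```python
import subprocess


def choose_reference_source(sam_lines, dbkey=None):
    """Decide where the chromosome list should come from (point 4).

    Assumption: an unset dbkey is None or "?".
    """
    if any(line.startswith("@SQ") for line in sam_lines):
        return "header"        # the SAM header already lists the chromosomes
    if dbkey and dbkey != "?":
        return "dbkey"         # use the locally cached list for this build
    return "ask_for_fasta"     # neither available: ask for a FASTA from the history


def sam_to_sorted_bam(sam_path, out_prefix):
    """Run: samtools view -u -b -S SAM | samtools sort - OUT (point 6).

    Returns True if both processes exit with status 0.
    """
    view = subprocess.Popen(
        ["samtools", "view", "-u", "-b", "-S", sam_path],
        stdout=subprocess.PIPE,
    )
    sort = subprocess.Popen(["samtools", "sort", "-", out_prefix], stdin=view.stdout)
    view.stdout.close()  # so `view` is notified if `sort` exits early
    sort_status = sort.wait()
    view_status = view.wait()
    return view_status == 0 and sort_status == 0


print(choose_reference_source(["@SQ\tSN:chr1\tLN:1000", "read1\t0\tchr1\t1"]))
print(choose_reference_source(["read1\t0\tchr1\t1"], dbkey="hg19"))
print(choose_reference_source(["read1\t0\tchr1\t1"], dbkey="?"))
```

Keeping the decision separate from the conversion means the tool form could hide the "reference list" question entirely whenever the first two cases apply.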

rgWebLogo3 with Probabilities instead of Entropy
by Assaf Gordon 07 Jul '11

Hello Ross and Galaxy team,

May I suggest this small patch, which enables WebLogo to plot either entropy bits (the current default) or just nucleotide probabilities? It uses the standard "-U" parameter of "weblogo3".

-gordon
