October 2012 - galaxy-dev - lists.galaxyproject.org

Galaxy processing
by Scooter Willis 31 Oct '12

I'm getting up to speed on Galaxy, couldn't find examples or discussion of the architecture, and was hoping an expert could give some quick pointers.

Where do I find out whether the installed applications make use of multiple nodes via MPI or similar, which would indicate the benefit of starting up X nodes for faster processing? If a workflow has multiple initial inputs, say NGS exome data from tumor and blood that get compared later in the workflow, will each step without a dependency get sent to a different node, or will the entire workflow run on one node?

If I have NGS data for 20 patients sitting in an S3 bucket and want a specific workflow run against each patient's input(s), does this require manual selection of files by a user, or can the workflow be automated? Can I programmatically start a workflow remotely (via REST) where I have automated the upload of NGS data to S3 and know the input file(s) per workflow?

Is it possible to present credentials in a workflow for downloading a file from S3 where I require authentication before a file can be downloaded? I am working with patient NGS data, so I am trying to understand how to keep security tight. I currently plan to restrict downloads to the cluster's IP address, but that gets a little complicated given what Amazon is doing behind the scenes in its internal network.

I would also like to push results/output back to S3 and didn't see anything obvious for doing this. It gets a little complicated in that you would probably need to put results back in the same S3 bucket (in a new folder) where the original source files came from. I saw mention of using scp to move files, but that doesn't help with putting results back in S3.

So far I really like what I have seen and hope Galaxy becomes the future toolbox for our work. Does a roadmap exist for what is planned? For example, are any additional NGS tools like ABySS going to make it into the build? I am interested in NGS software that handles the dynamics of cancer: gene fusion events, CNVs, etc.

Thanks
Scooter

output name of downloaded datasets
by julie dubois 31 Oct '12

Hello,

My goal is to add, in the XML file of a tool such as MACS, a supplementary command that redirects the output to another directory (and creates a link between that directory and Galaxy's output directory). But I want to give my output the same name that the download tool creates, in this form:

GALAXY-NumOfDatasetInHistory[NameOfInput].bed

I can't find where this download tool is, so I can't find how it creates this name.

Thanks.
julie
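
The naming pattern described in this message can be sketched in Python as follows. This only illustrates the pattern as written above and is not Galaxy's own code: the function name `download_style_name` and the character clean-up rule are assumptions, and the exact format string Galaxy uses may differ.

```python
import re


def download_style_name(hid, input_name, ext):
    """Build a name shaped like GALAXY-<hid>[<input name>].<ext>.

    hid is the dataset's number in the history. Replacing unusual
    characters with "_" is an assumption made here to keep the result
    usable as a file name.
    """
    safe_name = re.sub(r"[^A-Za-z0-9 _.\-()]", "_", input_name)
    return "GALAXY-%d[%s].%s" % (hid, safe_name, ext)


# Hypothetical history number and dataset name.
print(download_style_name(12, "MACS on data 3", "bed"))
```

A tool wrapper that wants matching names would need the history number and the input's display name passed in on its command line; how those values are exposed to the tool XML is not covered by this sketch.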

Re: [galaxy-dev] Accessing Galaxy API from Java
by Brad Chapman 31 Oct '12

Scooter;
(cc'ing the dev list and updating the subject line in case others are interested)

> I have been looking for Java related API's to run workflows externally and
> haven't found anything searching message forums etc. Would like to
> automate data coming off up hiseq uploaded to Amazon S3 and then
> programmatically from external process import the fastq files and kick off
> a workflow to process. If you know of any docs or Java API for doing this
> kind of external control can you point me to it.

John Chilton has a Java library for accessing the API:

https://github.com/jmchilton/blend4j

which should cover lots of this. If you're interested in other JVM languages, I built a small Clojure wrapper around this to simplify some tasks:

https://github.com/chapmanb/clj-blend

We'd definitely love to have more people involved, so if any functionality you need is missing please feel free to submit pull requests.

Brad
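
For readers who want to see what such a client sends over HTTP, here is a minimal Python sketch of a "run this workflow" request. It is not taken from blend4j or clj-blend. The field names (`workflow_id`, `history`, `ds_map`) and the `key` query parameter are modelled on the sample scripts under `scripts/api` in Galaxy and should be treated as assumptions to verify against your Galaxy version. The URL, API key, and ids are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_workflow_request(galaxy_url, api_key, workflow_id, history_name, inputs):
    """Return an unsent POST request that asks Galaxy to run a workflow.

    inputs maps a workflow input step id to (source, dataset_id), for
    example ("hda", "abc123") for a dataset that is already in a history.
    """
    payload = {
        "workflow_id": workflow_id,
        "history": history_name,
        "ds_map": {
            str(step): {"src": src, "id": dataset_id}
            for step, (src, dataset_id) in inputs.items()
        },
    }
    url = "%s/api/workflows?%s" % (galaxy_url.rstrip("/"), urlencode({"key": api_key}))
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values; nothing is sent until urllib.request.urlopen(req) is called.
req = build_workflow_request(
    "http://localhost:8080", "MY_API_KEY", "wf1", "patient-01", {"10": ("hda", "d1")}
)
print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes it easy to loop over many samples and inspect each payload before anything reaches the server.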

Error trying to run functional tests on a single tool
by Dan Tenenbaum 31 Oct '12

Hi,

I'm trying out the functional testing mechanism by running it on an existing Galaxy tool. First I ran

    ./run_functional_tests.sh -list

which produced a list of tools I can test. I chose 'vcf_annotate' and tested it as follows:

    ./run_functional_tests.sh -id vcf_annotate

This produced a lot of output, including an exception trace. The output was not conclusive as to whether the test ran or was successful. It is too long for this mailing list, but you can find it here:

https://gist.github.com/3988398

I am reluctant to excerpt the relevant bits because it's hard for me to know what is relevant and what is not.

I am running the latest Galaxy (just did hg pull/hg update and migrated) on a Mac OS X 10.7.4 machine with Python 2.7. When I run the same command on a Linux machine, it works (though it took me a while to find the test output; it was buried in a lot of output that also contained apparently irrelevant stack traces). So perhaps there is something wrong with my configuration. I hope someone can help me out.

I also have a couple of newbie questions about the functional test framework.

1) Why does it use tool_conf.xml.sample instead of tool_conf.xml? Can I change it to use tool_conf.xml? That way I would not need to add tools in two places in order to test them. (Plus, the name tool_conf.xml.sample suggests that it is just a demo file.)

2) run_functional_tests.sh -list lists tools (such as 'upload1') that do not have functional tests and so cannot (if my understanding is correct) be tested with this script. Perhaps it would make more sense not to list these tools?

Thanks,
Dan
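
One way to find the verdict in a long test log is to look for the summary that Python's unittest-style runners print at the end. The sketch below assumes the functional test run ends with a standard "Ran N tests in ...s" line followed by "OK" or "FAILED (...)". The helper name `summarize_test_log` and the sample log are made up for illustration and are not taken from the gist above.

```python
import re

# Matches the standard unittest summary line, e.g. "Ran 1 test in 42.310s".
RAN_LINE = re.compile(r"^Ran (\d+) tests? in [\d.]+s$")


def summarize_test_log(text):
    """Return (tests_run, verdict) from a unittest-style log, or None."""
    lines = [line.strip() for line in text.splitlines()]
    for index, line in enumerate(lines):
        match = RAN_LINE.match(line)
        if match:
            # The verdict is the first non-empty line after "Ran N tests".
            for later in lines[index + 1:]:
                if later:
                    return int(match.group(1)), later
            return int(match.group(1)), None
    return None


# Made-up log excerpt with noise before the summary.
sample = """Traceback (most recent call last):
  ...unrelated server noise...
----------------------------------------------------------------------
Ran 1 test in 42.310s

FAILED (errors=1)
"""
print(summarize_test_log(sample))
```

If the function returns None, the runner most likely stopped before any test ran, which is itself useful to know when the log is full of unrelated stack traces.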

November 2012 Galaxy Update
by Dave Clements 31 Oct '12

user management problem
by Jordi Vaquero 31 Oct '12

Hello,

I am trying to configure my Galaxy instance and I have two problems.

The first is that I cannot delete users. I created some users for testing and enabled the option in universe_wsgi.ini, and the button appears, but the users are only marked as deleted; they do not disappear from the user list. Is that normal?

The second is that I am trying to set up email confirmation to ensure that each user's email address exists. Is there any way to do that? I have entered the email information in the ini file, but I cannot see any other option for enabling it.

Thanks to everyone for your help,
Jordi

Amazon
by Scooter Willis 31 Oct '12

I started up a cluster on Amazon using Launch a Galaxy Cloud Instance and got the following message. Since I don't have any control over where the instances run, I'm not sure how I can control this. The last 4 or 5 times I started up an existing instance, it worked with no problem.

Messages (CRITICAL messages cannot be dismissed.)
1. [CRITICAL] Volume 'vol-f882ca85' is located in the wrong availability zone for this instance. You MUST terminate this instance and start a new one in zone 'us-east-1a'. (2012-10-31 14:25:20)

Incorrect chain order for SSL certificates on Galaxy main
by Brad Chapman 31 Oct '12

Hi all;

I ran into SSL certification errors when using Java to connect to Galaxy main via the API. My knowledge of this stuff is minimal, but I did some searching and discovered that the certificate chain on Galaxy main is a problem:

https://www.ssllabs.com/ssltest/analyze.html?d=main.g2.bx.psu.edu

Looking at the chain with openssl shows a swap of the AddTrust and Internet2 certificates:

    $ openssl s_client -connect main.g2.bx.psu.edu:443
    CONNECTED(00000003)
    depth=2 C = SE, O = AddTrust AB, OU = AddTrust External TTP Network, CN = AddTrust External CA Root
    verify error:num=19:self signed certificate in certificate chain
    verify return:0
    ---
    Certificate chain
     0 s:/C=US/postalCode=16802/ST=PA/L=University Park/O=The Pennsylvania State University/OU=Center for Comparative Genomics and Bioinformatics/CN=bigsky.bx.psu.edu
       i:/C=US/O=Internet2/OU=InCommon/CN=InCommon Server CA
     1 s:/C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
       i:/C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
     2 s:/C=US/O=Internet2/OU=InCommon/CN=InCommon Server CA
       i:/C=SE/O=AddTrust AB/OU=AddTrust External TTP Network/CN=AddTrust External CA Root
    ---

As a result, more picky verification mechanisms fail because of the self signed certificate in the middle of the chain instead of as the root. It appears you can fix this by adjusting the order of certificates in nginx:

http://webmasters.stackexchange.com/questions/27842/how-to-prevent-ssl-cert…
http://nginx.org/en/docs/http/configuring_https_servers.html#chains

Hope this helps,
Brad
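
The ordering rule at stake can be sketched in a few lines of Python: in a correctly ordered chain, each certificate's issuer is the subject of the certificate that follows it. The helper name `chain_order_errors` is hypothetical, and the subject and issuer strings are shortened to the CN values from the openssl output above. This checks order only and does no cryptographic verification.

```python
def chain_order_errors(chain):
    """List positions where a certificate is not issued by the next one.

    chain is a list of (subject, issuer) pairs in the order the server
    sends them, leaf certificate first.
    """
    errors = []
    for position in range(len(chain) - 1):
        issuer = chain[position][1]
        next_subject = chain[position + 1][0]
        if issuer != next_subject:
            errors.append(position)
    return errors


# Names shortened to the CN component of the openssl output.
LEAF = "CN=bigsky.bx.psu.edu"
INCOMMON = "CN=InCommon Server CA"
ADDTRUST = "CN=AddTrust External CA Root"

as_served = [(LEAF, INCOMMON), (ADDTRUST, ADDTRUST), (INCOMMON, ADDTRUST)]
reordered = [(LEAF, INCOMMON), (INCOMMON, ADDTRUST), (ADDTRUST, ADDTRUST)]

print(chain_order_errors(as_served))  # [0, 1]
print(chain_order_errors(reordered))  # []
```

With the chain as served, positions 0 and 1 both break the rule; swapping the last two entries leaves the self-signed root at the end and clears both errors.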

Galaxy local install
by Yamshchikov, Vladimir 31 Oct '12

Hi,

I took this question to vendor tech support (Dell), but they could not (or did not want to) answer and directed me to the Galaxy developers. I am using RHEL 5.8 and Scientific Linux 5.5 and want to install a local instance of Galaxy. Both systems are based on Python 2.4.

Question: can I install Python 2.6/2.7 locally without messing up the system? I was advised earlier not to do a system-wide install, but being unhealthily curious I did, and ended up reinstalling Scientific Linux 5.5 from scratch. How can I make sure 2.6/2.7 will not interfere with the system's Python?

Thanks,
Vladimir
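
Whichever way a second Python ends up on the machine, it helps to confirm which interpreter a given shell actually runs. The small Python sketch below reports the interpreter's path and whether its version is one of those named in this message. It installs and changes nothing. The function name `interpreter_report` and the example paths are hypothetical.

```python
import sys


def interpreter_report(version_info=sys.version_info, executable=sys.executable,
                       wanted=((2, 6), (2, 7))):
    """Describe an interpreter and whether its version is one of `wanted`."""
    major_minor = (version_info[0], version_info[1])
    return {
        "executable": executable,
        "version": "%d.%d" % major_minor,
        "wanted": major_minor in wanted,
    }


# With no arguments, this reports on the interpreter running the script.
print(interpreter_report())

# Hypothetical examples: the stock system Python and a separately installed one.
print(interpreter_report((2, 4, 3), "/usr/bin/python"))
print(interpreter_report((2, 7, 3), "/opt/python27/bin/python2.7"))
```

Running it once with the system's default `python` and once from the shell used to start Galaxy shows whether the two environments really use different interpreters.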

Trackster and gff file with multiple chromosome annotations
by Yec'han Laizet 31 Oct '12

Hello,

Is it possible to load a single GFF file containing the annotations of several chromosomes for my custom build in one step? With the current version of Galaxy, it seems that I can only load a GFF file that refers to one chromosome. It is pretty tedious to load 43 GFF files separately for my custom build...

If I try, I get this error:

Traceback (most recent call last):
  File "~/galaxy-dist/lib/galaxy/datatypes/converters/interval_to_fli.py", line 91, in <module>
    main()
  File "~/galaxy-dist/lib/galaxy/datatypes/converters/interval_to_fli.py", line 30, in main
    for feature in read_unordered_gtf( open( in_fname, 'r' ) ):
  File "~/galaxy-dist/lib/galaxy/datatypes/util/gff_util.py", line 389, in read_unordered_gtf
    feature = GFFFeature( None, intervals=intervals )
  File "~/galaxy-dist/lib/galaxy/datatypes/util/gff_util.py", line 65, in __init__
    ( interval.chrom, self.chrom ) )
ValueError: interval chrom does not match self chrom: SAGS2 != SAGS1

Thanks
Yec'han

================================================
Yec'han LAIZET
Ingenieur
Plateforme Genome Transcriptome
Tel: 05 57 12 27 75
_________________________________
INRA-UMR BIOGECO 1202
Equipe Genetique
69 route d'Arcachon
33612 CESTAS
================================================
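
To see how features are distributed across chromosomes in a combined file, the lines can be grouped by the first GFF column (the sequence name). The Python sketch below does only that. It does not change or fix Galaxy's converter, the helper name `group_gff_by_chrom` is hypothetical, and the sample lines are made up (the names SAGS1 and SAGS2 are borrowed from the error message).

```python
from collections import defaultdict


def group_gff_by_chrom(lines):
    """Group GFF feature lines by their first column (the sequence name)."""
    groups = defaultdict(list)
    for line in lines:
        stripped = line.rstrip("\n")
        # Skip blank lines and "#" comment or directive lines.
        if not stripped or stripped.startswith("#"):
            continue
        chrom = stripped.split("\t", 1)[0]
        groups[chrom].append(stripped)
    return dict(groups)


# Made-up sample lines in tab-separated GFF layout.
sample = [
    "##gff-version 3\n",
    "SAGS1\tsrc\tgene\t1\t100\t.\t+\t.\tID=g1\n",
    "SAGS2\tsrc\tgene\t5\t80\t.\t-\t.\tID=g2\n",
    "SAGS1\tsrc\texon\t1\t50\t.\t+\t.\tParent=g1\n",
]
for chrom, features in sorted(group_gff_by_chrom(sample).items()):
    print(chrom, len(features))
```

The same grouping could be used to write one file per chromosome as a stopgap, at the cost of still loading each file separately.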
