July 2010 - galaxy-user - lists.galaxyproject.org

Fwd: galaxy tool suggestion
by Anton Nekrutenko 14 Jul '10

Begin forwarded message:

> From: Dan Jones <djones(a)psu.edu>
> Date: July 13, 2010 11:02:50 AM EDT
> To: Anton Nekrutenko <anton(a)bx.psu.edu>
> Subject: galaxy tool suggestion
>
> Hi Anton,
>
> This is Dan (from your bioinformatics class a couple years ago). I have been playing around on galaxy with a couple of new 454 metagenomics datasets. I have been going back and forth between the tools 'Build base quality distribution' and 'filter FASTQ' to assess quality of my data and determine how it is affected by filtering certain length and quality sequences (using FASTQ lets me simultaneously operate on the seq and qual scores). I am mainly trying to understand a systematic decrease in quality that occurs after about 50% sequence length. But, in order to go back and build a base quality distribution boxplot, I need to extract the qual scores from the fastq file, and I currently can't find a way to do this on Galaxy (unless I am missing something obvious, very possible! I see an option to convert fastq to fasta, but I don't get the .qual file with it). I wrote a short py script to do this (attached), and I think that something like it to extract a .qual file from FASTQ would be a nice addition to the galaxy toolbox.
>
> Hope all is well!
>
> Dan
>
> ---
> Daniel Jones
> PhD Candidate, Penn State University
> Department of Geosciences
> 242 Deike Building
> University Park, PA 16802
> cell: 651-245-2775
> lab: 814-865-9340

Anton Nekrutenko
http://nekrut.bx.psu.edu
http://usegalaxy.org
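The attached script is not preserved in this archive, so the following is only a minimal sketch of the kind of FASTQ-to-.qual extraction described in the message, not Dan's actual code. It assumes four-line FASTQ records and Sanger-style quality encoding (ASCII offset 33); the function names `fastq_to_qual` and `format_qual` are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def fastq_to_qual(lines: Iterable[str], offset: int = 33) -> Iterator[Tuple[str, List[int]]]:
    """Yield (read_name, quality_scores) for each four-line FASTQ record.

    Assumes Sanger-style encoding (score = ASCII code - offset, offset 33 by default).
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header line, got {header!r}")
        seq = next(it, None)
        plus = next(it, None)
        qual = next(it, None)
        if seq is None or plus is None or qual is None:
            raise ValueError(f"truncated record for {header!r}")
        seq, plus, qual = (s.rstrip("\r\n") for s in (seq, plus, qual))
        if not plus.startswith("+"):
            raise ValueError(f"expected '+' separator line, got {plus!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch for {header!r}")
        yield header[1:], [ord(c) - offset for c in qual]


def format_qual(records: Iterable[Tuple[str, List[int]]]) -> str:
    """Render records as .qual text: a '>' name line, then space-separated scores."""
    return "".join(
        f">{name}\n{' '.join(str(s) for s in scores)}\n" for name, scores in records
    )


if __name__ == "__main__":
    example = ["@read1\n", "ACGT\n", "+\n", "II5!\n"]
    print(format_qual(fastq_to_qual(example)), end="")
```

If the data use a different quality encoding, the `offset` argument would need to change accordingly.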

get data from SRA
by Brown, Stuart 14 Jul '10

I have a suggestion to make Galaxy even more useful: you could enable direct loading of next-gen sequence data from NCBI/SRA. As it is now, if someone wants to make use of the next-gen tools, they have to upload a big data file from their desktop. You have enabled very rapid data loading from UCSC, so perhaps you can work the same magic to get data from SRA.

Stuart Brown

mac installation help
by Montano, Luis 13 Jul '10

Hi,

I'm working for Dr. Richard McCombie at Cold Spring Harbor Laboratory. I have tried to install Galaxy locally on my Mac, but there are some issues and the installation does not complete. Are there any special instructions for Mac, or could you point out the key things to check during the installation process?

Thanks, and I look forward to your answer.

Luis
URP 2010
Cold Spring Harbor Laboratory

microbes data in local instance of galaxy
by Bossers, Alex 13 Jul '10

Hi All,

We have our local Galaxy instance running, and it works fine. In the Get Data section, the Microbes tool has no local NCBI data, whereas the public instance has it. What is the best/easiest way to get that data into our local instance of Galaxy? I have browsed the wikis and looked through the library and dataset documentation but was unable to resolve this at first glance. Any help/guidance appreciated.

Thanks,
Alex

Rgenetics tool
by Ross 12 Jul '10

Arun, if you have a plink format genotype file (pbed is best - compressed) in your history, it should appear among the input files for the eigenstrat tool - it seems to be ok on Main right now. They can be uploaded using the get data upload tool - you have to manually set the datatype to pbed so the three file parts can be uploaded together.

The eigensoft tool uses a subset since WGA SNPs contain a lot of redundant information - an LD independent set is more efficient and loses little information. If you select a pbed it should be automatically converted - and run more quickly the second time.

On Tue, Jul 13, 2010 at 12:03 AM, <galaxy-user-request(a)lists.bx.psu.edu> wrote:
> Send galaxy-user mailing list submissions to
> galaxy-user(a)lists.bx.psu.edu
>
> Message: 7
> Date: Fri, 9 Jul 2010 13:37:16 -0400
> From: "Arun Tiwari" <Arun_Tiwari(a)camh.net>
> To: <galaxy-user(a)bx.psu.edu>
> Subject: [galaxy-user] Rgenetics tool
> Message-ID: <ACF8150B8AE2E042834D0198F418098040AB99(a)camhems-4.camh.ca>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hello,
>
> I was wondering in what format should I upload my genotype data file obtained from PLINK. I tried uploading the binary file (bed, bim, fam) as well as the ped/map file but I am unable to run any analysis as they never appear in the input file section. I wanted to run eigenstrat using the rgenetics tools for my data.
>
> Thanks a lot for your help,
>
> Arun Tiwari
>
> Centre for Addiction and Mental Health,
> Toronto, Canada

Strong GMOD presence at BOSC/ISMB
by Scott Cain 12 Jul '10

Hello,

I am pleased to point out that GMOD has a large presence at the BOSC and ISMB meetings this year. The following SIGs, posters and talks are related to GMOD projects, and there will likely be an informal GMOD lunch during ISMB (I'm still working on the details).

Talks:

NGS Analysis with Galaxy on the Cloud, Monday 3:30-3:55, TT26, Anton Nekrutenko
Sample Tracking and automated data processing in Galaxy, Monday 4:00-4:25, TT28, Anton Nekrutenko
GBrowse2, Saturday 4:00-4:15, Lincoln Stein, part of BOSC SIG
GMOD Presents GBrowse 2.0 and JBrowse, Tuesday 11:15-11:40, TT32, Scott Cain
Demonstration of the Pathway Tools Software and BioCyc Databases, Tuesday 2:45-3:10, TT38, Peter Karp

Posters:

ISGA - An Intuitive Web Server for Prokaryotic Genome Annotation and other Analysis, Mon 12:40-2:30, I10, Christopher Hemmerich
Online Quantitative Transcriptome Analysis, Mon 12:40-2:30, J60, Regina Bohnert
Galaxy NGS functionality from sample tracking to SNP calling: An interactive poster, Mon 12:40-2:30, U60, Ramakrishna Chakrabarty
AGeS: A Software System for Annotation and Analysis of Genome Sequences, Sun 12:40-2:30, I01, Nela Zavaljevski
GBrowse and Next Generation Sequencing Data, I15, Scott Cain
ZFNGenome: A GBrowse-based tool for identifying Zinc Finger Nuclease target sites in model organisms, Mon 12:40-2:30, E18, Deepak Reyon
Choosing a Genome Browser for a Model Organism Database: Surveying the Maize Community, Mon 12:40-2:30, E30, Taner Sen
WebGBrowse - A Web Server for GBrowse, Mon 12:30-2:30, Z02, Ram Podicheti
An Advanced Web Query Interface for Biological Database, Monday 12:40-2:30, E02, Peter Karp

SIG:

Galaxy: Analyze, Visualize, Communicate, Saturday 1:30-5:30, Galaxy Team

I look forward to seeing all GMODers at the meeting!

Scott

--
Scott Cain, Ph. D.
scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)
216-392-3087
Ontario Institute for Cancer Research

Rgenetics tool
by Arun Tiwari 09 Jul '10

Hello,

I was wondering in what format should I upload my genotype data file obtained from PLINK. I tried uploading the binary file (bed, bim, fam) as well as the ped/map file but I am unable to run any analysis as they never appear in the input file section. I wanted to run eigenstrat using the rgenetics tools for my data.

Thanks a lot for your help,

Arun Tiwari
Centre for Addiction and Mental Health,
Toronto, Canada

Galaxy fails to upload file locally
by Kevin 09 Jul '10

Hi,

I am having problems uploading large files to a local Galaxy install. How can I troubleshoot this? Where are the uploaded files stored? I am on 64-bit Linux with more than ample disk space, so I am not sure what went wrong.

Finding SNP differences between datasets
by Nicola Nadeau 09 Jul '10

Hi,

I am trying to find SNPs and/or indel variants that differ between two groups of samples. The data are 454 sequence capture results spanning a region of interest for a non-model organism without a complete genome. I have mapped these to my reference sequence of the region and generated BAM and pileup files for them. Does anyone know of a tool or method (either in Galaxy or elsewhere) that will allow me to compare datasets and pick out variation between them (as opposed to things that differ from the reference sequence)?

Many thanks,
Nicola
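One way to frame the comparison asked about here is as set arithmetic on variant calls. The sketch below is only an illustration under stated assumptions, not a Galaxy tool or an answer from the list: it assumes each sample's variants have already been extracted from the pileup files as `(position, allele)` pairs (that parsing step is not shown), and the function name `group_specific_variants` is hypothetical. It reports variants found in every sample of one group and in no sample of the other.

```python
from typing import Iterable, Set, Tuple

# A variant is identified by its reference position and the non-reference allele.
Variant = Tuple[int, str]


def group_specific_variants(
    group_a: Iterable[Iterable[Variant]],
    group_b: Iterable[Iterable[Variant]],
) -> Set[Variant]:
    """Return variants present in every sample of group A and in no sample of group B."""
    a_sets = [set(sample) for sample in group_a]
    b_sets = [set(sample) for sample in group_b]
    if not a_sets:
        return set()
    shared_in_a = set.intersection(*a_sets)  # called in all group A samples
    seen_in_b = set().union(*b_sets)         # called in at least one group B sample
    return shared_in_a - seen_in_b


if __name__ == "__main__":
    group_a = [{(10, "T"), (25, "G")}, {(10, "T"), (25, "G"), (40, "A")}]
    group_b = [{(25, "G")}, {(40, "A")}]
    print(sorted(group_specific_variants(group_a, group_b)))
```

A strict all-versus-none rule like this ignores coverage and genotype uncertainty, so with real data a frequency threshold per group would probably be more appropriate.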

Is it possible to run processes in parallel?
by Alex Quezada 09 Jul '10

Hi,

Our metagenomics group is experimenting with splitting up assemblies into several chunks, and then combining the outputs. Is it possible to model in Galaxy a process that splits up, and then converges again after all the branches have finished?

************************************************
Alex Quezada
Software Developer
Genome Science/Joint Genome Institute (B-6)
Bioscience Division
MS M888
Los Alamos National Laboratory
Los Alamos, NM 87545
phone: (505) 606-2153
fax: (505) 665-3024
email: alexq(a)lanl.gov
************************************************
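The split-then-converge pattern described in this question can be sketched outside Galaxy in plain Python. This is not a statement about how Galaxy models workflows; it is only an illustration of the pattern using the standard library's `concurrent.futures`. The helper names `split_into_chunks` and `scatter_gather` are hypothetical, and the per-chunk work here is a stand-in for a real assembly step.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")
C = TypeVar("C")


def split_into_chunks(items: Sequence[T], n_chunks: int) -> List[Sequence[T]]:
    """Split items into at most n_chunks contiguous, nearly equal, non-empty chunks."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    size, extra = divmod(len(items), n_chunks)
    chunks: List[Sequence[T]] = []
    start = 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(items[start:end])
        start = end
    return chunks


def scatter_gather(
    items: Sequence[T],
    n_chunks: int,
    process_chunk: Callable[[Sequence[T]], R],
    combine: Callable[[List[R]], C],
) -> C:
    """Run process_chunk on each chunk concurrently, then combine once all have finished."""
    chunks = split_into_chunks(items, n_chunks)
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        # list() blocks until every branch is done; results keep chunk order.
        results = list(pool.map(process_chunk, chunks))
    return combine(results)


if __name__ == "__main__":
    reads = ["ACGT", "GG", "TTTAA", "C", "AGAG"]
    total = scatter_gather(reads, 2, lambda chunk: sum(len(r) for r in chunk), sum)
    print(total)
```

Threads are used here only to keep the sketch self-contained; for a real assembler, each branch would more likely launch an external program or a cluster job.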
