galaxy-user November 2013

galaxy-user@lists.galaxyproject.org

46 participants
45 discussions

Help with Summary Statistics
by D. A. Cowart 23 May '14

Hello,

I am attempting to use Galaxy to calculate the mean sequence read length and identify the range of read lengths for my 454 data. The data has already been organized and sorted by species. The format of the data is as follows:

>HD4AU5D01BHBCQ
CTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTC
>HD4AU5D01A093M
CTCTGTCGCTCTGTCTCTCTTCTCTCTCTCTCTCTCT
etc... for each species

I have attempted to use the "Summary Statistics" button, however it appears to only be for numerical data and not sequence data. Is this tool/task available via Galaxy?

Thank you,
Dominique Cowart
User name: dac330
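The task described above (mean and range of read lengths from FASTA records) can be sketched outside Galaxy in a few lines of Python. This is only an illustrative sketch, not a Galaxy tool: the function names `read_lengths` and `summarize` and the two example reads are hypothetical, and it assumes plain FASTA input where each record starts with a ">" header line.

```python
from statistics import mean


def read_lengths(fasta_lines):
    """Yield the sequence length of each record in an iterable of FASTA lines."""
    length = None  # None until the first ">" header is seen
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if length is not None:
                yield length  # finish the previous record
            length = 0
        elif length is not None:
            length += len(line)  # sequences may span several lines
    if length is not None:
        yield length  # finish the last record


def summarize(fasta_lines):
    """Return count, mean, min and max read length."""
    lengths = list(read_lengths(fasta_lines))
    if not lengths:
        raise ValueError("no FASTA records found")
    return {
        "count": len(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
        "max": max(lengths),
    }


# Hypothetical two-read example; the second read is wrapped over two lines.
example = [">read_a", "CTCTCTCT", ">read_b", "CTCTGTCG", "CTCT"]
stats = summarize(example)
print(stats)
```

For a real file, `summarize(open("reads.fasta"))` would work the same way, with `reads.fasta` standing in for one species' file.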

6 5

Empty bowtie2 output
by IIHG Galaxy Administrator 10 Dec '13

In follow-up to http://user.list.galaxyproject.org/Empty-bowtie2-output-tp4656137.html, is there:

- an ETA on when the issue with Bowtie2 generating empty output, in the August 2013 distribution, will be fixed (if not already fixed)?
- a suggested workaround (revert to an older version of that particular tool, etc.) in the meantime?

Thank you.

Unrelated: I wasn't able to determine how to update that thread to request status, hence creating a new one.

2 1

SNP finding
by Xiefan Fang 03 Dec '13

Dear Galaxy users,

We have done deep sequencing on some known genomic loci using a HiSeq 2000. I have already mapped the reads to the reference sequences using Galaxy. In the next step, I want to find SNPs and calculate the SNP percentage within the reads. There are 500,000 to 1,000,000 reads per biological sample. Can I do this with Galaxy? If not, are there other programs available for Windows, considering that I am not very familiar with programming?

Thanks,
Xiefan
University of Florida

5 9

FW: [galaxy-bugs] Galaxy tool error report from bsib@leeds.ac.uk
by Irene Bassano 02 Dec '13

Hi Jen,

thanks. I am a bit confused: on Galaxy the only human genome listed is hg_g1k_v37, so when I uploaded the new data from "Get data", I selected hg_g1k_v37 under "Genome". Now, all I want is to get Cufflinks output with gene names: which genome am I supposed to use? The only one I knew was hg19 from iGenomes, but it seems I cannot use it. Do I have to select a genome when I upload raw fastq data? I haven't started doing anything so far; it's just raw data.

Thanks a lot,
Irene

________________________________________
From: Jennifer Jackson [jen(a)bx.psu.edu]
Sent: Wednesday, November 27, 2013 10:08 PM
To: Irene Bassano
Cc: galaxy-bugs(a)bx.psu.edu
Subject: Re: [galaxy-bugs] Galaxy tool error report from bsib(a)leeds.ac.uk

Hello,

iGenomes covers the UCSC build, and this is named "Human Feb. 2009 (GRCh37/hg19) (hg19)" in the full name in the UI. The "hg19" key is the important part, as the name may be abbreviated in some tools, but this key will be in all.

The genome with the "hg_g1k_v37" key is slightly different, and you will have another genome mismatch problem with the RNA-seq (and most other) tools if you combine this genome with data from UCSC (the source of hg19) or the wrong iGenomes file. "hg_g1k_v37" (source: 1000 genomes via GATK) and "hg19" (source: UCSC) are just about the same, but the identifiers are different. If you want to examine the differences, both are available on our rsync server and can be downloaded and compared. On the "Help -> Support" wiki are links for reference genomes.

The iGenomes GTF file for hg19 is on the public Main server, if that is more convenient for you, or should you just want to be sure you have the right one. Look in Shared Data -> Data Libraries -> iGenomes.

Best,
Jen
Galaxy team

ps. please try to send new questions to one of our lists, thanks!

On 11/27/13 12:51 PM, Irene Bassano wrote:
> Hi Jen,
> I uploaded some fastq files and selected as Genome from the drop down list "Homo sapiens b37 (hg_g1k_v37)".
>
> Is this the same as the genome listed in UCSC "February 2009 (GRCh37/hg19)"?
>
> I am using iGenomes to get the gene names rather than annotation such as NM_00xxxx and I used the UCSC website choosing the newest genome, February 2009
>
> Thanks,
>
> best,
> Irene
> ________________________________________
> From: Jennifer Jackson [jen(a)bx.psu.edu]
> Sent: Monday, November 18, 2013 7:26 PM
> To: galaxy-bugs(a)bx.psu.edu; Irene Bassano
> Subject: Re: [galaxy-bugs] Galaxy tool error report from bsib(a)leeds.ac.uk
>
> Hi again,
>
> The database mismatch is a problem here as well for the same reasons. My
> guess is that you intended to run dataset #21 against hg19, and that the
> run against hg18 was a tool form input mistake?
>
> Good luck with the next runs,
>
> Jen
> Galaxy team
>
> On 11/18/13 5:21 AM, galaxy-bugs(a)bx.psu.edu wrote:
>> GALAXY TOOL ERROR REPORT
>> ------------------------
>>
>> This error report was sent from the Galaxy instance hosted on the server
>> "usegalaxy.org"
>> -----------------------------------------------------------------------------
>> This is in reference to dataset id 7081835 from history id 1686699
>> -----------------------------------------------------------------------------
>> You should be able to view the history containing the related history item
>>
>> 37: Cufflinks on data 21 and data 34: assembled transcripts
>>
>> by logging in as a Galaxy admin user to the Galaxy instance referenced above
>> and pointing your browser to the following link.
>>
>> usegalaxy.org/history/view?id=d88c1ef77619eb4f
>> -----------------------------------------------------------------------------
>> The user 'bsib(a)leeds.ac.uk' provided the following information:
>>
>> Same as before but the reference genome is UCSC MAin on Human:refFlat
>> -----------------------------------------------------------------------------
>> job id: 6096851
>> tool id: toolshed.g2.bx.psu.edu/repos/devteam/cufflinks/cufflinks/0.0.6
>> job pid or drm id: 136948
>> -----------------------------------------------------------------------------
>> job command line:
>> python /galaxy/main/migrated_tools/toolshed.g2.bx.psu.edu/repos/devteam/cufflinks/b01956f26c36/cufflinks/cufflinks_wrapper.py --input=/galaxy-repl/main/files/007/074/dataset_7074022.dat --assembled-isoforms-output=/galaxy-repl/main/files/007/081/dataset_7081835.dat --num-threads="8" -I 300000 -F 0.1 -j 0.15 -G /galaxy-repl/main/psufiles/004/831/dataset_4831015.dat -N -b --ref_file="None" --dbkey=hg18 --index_dir=/galaxy/main/server/tool-data -u
>> -----------------------------------------------------------------------------
>> job stderr:
>> Error running cufflinks.
>> return code = 1
>> Command line:
>> cufflinks -q --no-update-check -I 300000 -F 0.100000 -j 0.150000 -p 8 -G /galaxy-repl/main/psufiles/004/831/dataset_4831015.dat -u -N -b /galaxy/data/hg18/sam_index/hg18.fa /galaxy-repl/main/files/007/074/dataset_7074022.dat
>> Error: cannot open reference GTF file /galaxy-repl/main/psufiles/004/831/dataset_4831015.dat for reading
>>
>>
>> -----------------------------------------------------------------------------
>> job stdout:
>> cufflinks v2.1.1
>> cufflinks -q --no-update-check -I 300000 -F 0.100000 -j 0.150000 -p 8 -G /galaxy-repl/main/psufiles/004/831/dataset_4831015.dat -u -N -b /galaxy/data/hg18/sam_index/hg18.fa
>>
>> -----------------------------------------------------------------------------
>> job info:
>> None
>> -----------------------------------------------------------------------------
>> job traceback:
>> None
>> -----------------------------------------------------------------------------
>> (This is an automated message).
> --
> Jennifer Hillman-Jackson
> http://galaxyproject.org

--
Jennifer Hillman-Jackson
http://galaxyproject.org
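The point made in this thread, that "hg19" and "hg_g1k_v37" are nearly the same sequence but use different identifiers, can be checked before running a tool by comparing the sequence names in an annotation file against those of the reference. The sketch below is an assumption-laden illustration, not part of Galaxy: the function names and the tiny example data are hypothetical, and it relies only on the fact that the first tab-separated column of a GTF line is the sequence name (UCSC-style names look like "chr1", while b37-style names look like "1").

```python
def gtf_seqnames(gtf_lines):
    """Collect the sequence names (column 1) used in an iterable of GTF lines."""
    names = set()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        names.add(line.split("\t", 1)[0])
    return names


def missing_from_reference(gtf_lines, reference_names):
    """Return GTF sequence names that do not exist in the reference, sorted."""
    return sorted(gtf_seqnames(gtf_lines) - set(reference_names))


# Hypothetical example: a UCSC-style annotation against b37-style names.
ucsc_style_gtf = [
    'chr1\texample\texon\t100\t200\t.\t+\t.\tgene_id "g1";',
    'chr2\texample\texon\t300\t400\t.\t-\t.\tgene_id "g2";',
]
b37_style_reference = ["1", "2", "MT"]

missing = missing_from_reference(ucsc_style_gtf, b37_style_reference)
print(missing)
```

A non-empty result suggests the annotation and the reference come from different sources, which is the kind of mismatch described in the replies above.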

2 1

tophat issues
by miroslav.sotak 02 Dec '13

To whom it may concern,

I have a problem with TopHat. I can easily load fastq data into the "history", following the RNA-seq Analysis Exercise provided by Jeremy. We checked the type of ASCII offset used for the quality estimation. I even tried the "quality data converter" set to 33 (we have data of this ASCII offset from 2 different sources), but "tophat for Illumina" simply cannot read the data, either before or after the quality format converter. We do not have any idea what is going on. I am logged in to Galaxy with my current email; can you check my data, and is there any converter for the quality offset?

Sincerely,
Miro Sotak

2 1

Re: [galaxy-user] Problem loading BAM into IGV browser - invalid GZIP header error message
by Jim Johnson 02 Dec '13

Is Galaxy returning an HTML page rather than the desired BAM file? Are you using an nginx or Apache proxy server in front of your Galaxy server? I think that may be required in order to view BAM files in IGV directly from Galaxy.

JJ

On 11/29/13, 11:00 AM, galaxy-user-request(a)lists.bx.psu.edu wrote:
> Today's Topics:
>
> 1. Problem loading BAM into IGV browser - invalid GZIP header error message (Vosberg, Sebastian)
> 2. Re: Problem loading BAM into IGV browser - invalid GZIP header error message (Jim Robinson)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 29 Nov 2013 11:04:48 +0100
> From: "Vosberg, Sebastian" <sebastian.vosberg(a)helmholtz-muenchen.de>
> To: "galaxy-user(a)lists.bx.psu.edu" <galaxy-user(a)lists.bx.psu.edu>
> Subject: [galaxy-user] Problem loading BAM into IGV browser - invalid GZIP header error message
> Message-ID: <20854588711E4A489A3AD70C9BA5548A01AE291C4614(a)XCH11.scidom.de>
> Content-Type: text/plain; charset="utf-8"
>
> Dear all,
>
> sometimes I encouter a problem trying to load BAM files directly from Galaxy into the IGV browser. First I am starting the IGV browser locally, then clicking on the appropriate BAM file and on "display with IGV _local_" in Galaxy. In most cases it works, but for some reasons not with specific files. The error message says
>
> "Error loading http://_URL-to-file_/galaxy_example.bam: An error occured while accessing http://_URL-to-file_/galaxy_example.bam
> Invalid GZIP header"
>
> What does it mean? And why am I able to download the BAM file and load it from HDD into the IGV?
> The problem comes with all BAM files of one sample cohort, but not with another (but same sample design and workflow used). Rerunning the workflow doesn't help...
>
> I would be very thankful for every kind of help!
>
> Best,
> Sebastian
>
> Helmholtz Zentrum München
> Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH)
> Ingolstädter Landstr. 1
> 85764 Neuherberg
> www.helmholtz-muenchen.de
> Aufsichtsratsvorsitzende: MinDir´in Bärbel Brumme-Bothe
> Geschäftsführer: Prof. Dr. Günther Wess, Dr. Nikolaus Blum, Dr. Alfons Enhsen
> Registergericht: Amtsgericht München HRB 6466
> USt-IdNr: DE 129521671
>
> ------------------------------
>
> Message: 2
> Date: Fri, 29 Nov 2013 09:42:42 -0500
> From: Jim Robinson <jrobinso(a)broadinstitute.org>
> To: "Vosberg, Sebastian" <sebastian.vosberg(a)helmholtz-muenchen.de>, "galaxy-user(a)lists.bx.psu.edu" <galaxy-user(a)lists.bx.psu.edu>
> Subject: Re: [galaxy-user] Problem loading BAM into IGV browser - invalid GZIP header error message
> Message-ID: <5298A7E2.7000908(a)broadinstitute.org>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Hi Sebastian,
>
> Is it possible to share an example bam that exhibits this problem on a
> Galaxy server I can reach? Also, which version of IGV are you using
> (select Help > About... to see the version).
>
> -- Jim
>
> ------------------------------
>
> End of galaxy-user Digest, Vol 89, Issue 25
> *******************************************

--
James E. Johnson, Minnesota Supercomputing Institute, University of Minnesota
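The question raised in this reply, whether the server is sending an HTML page instead of the BAM file, can be tested by looking at the first bytes of the response. BAM files are BGZF-compressed, so they begin with the gzip magic bytes 0x1f 0x8b, whereas an HTML page normally begins with "<". The sketch below is a hypothetical helper, not part of IGV or Galaxy; the function names and return labels are made up for illustration, and how the first bytes are fetched (for example with a browser download or an HTTP client) is left to the reader.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip/BGZF stream


def looks_like_gzip(first_bytes):
    """True if the payload starts with the gzip magic number."""
    return first_bytes[:2] == GZIP_MAGIC


def describe_payload(first_bytes):
    """Roughly classify a payload from its leading bytes."""
    if looks_like_gzip(first_bytes):
        return "gzip"
    if first_bytes.lstrip()[:1] == b"<":
        return "markup"  # likely an HTML login or error page
    return "unknown"


# Self-contained demonstration with locally generated data.
compressed = gzip.compress(b"example payload")
print(describe_payload(compressed))
print(describe_payload(b"<!DOCTYPE html><html>...</html>"))
```

If the URL that IGV requests yields "markup", the "Invalid GZIP header" message would be consistent with a login or error page being served in place of the data.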

2 1

Problem loading BAM into IGV browser - invalid GZIP header error message
by Vosberg, Sebastian 29 Nov '13

Dear all,

sometimes I encounter a problem trying to load BAM files directly from Galaxy into the IGV browser. First I start the IGV browser locally, then click on the appropriate BAM file and on "display with IGV _local_" in Galaxy. In most cases it works, but for some reason not with specific files. The error message says

"Error loading http://_URL-to-file_/galaxy_example.bam: An error occured while accessing http://_URL-to-file_/galaxy_example.bam
Invalid GZIP header"

What does it mean? And why am I able to download the BAM file and load it from HDD into IGV? The problem occurs with all BAM files of one sample cohort, but not with another (although the same sample design and workflow were used). Rerunning the workflow doesn't help...

I would be very thankful for every kind of help!

Best,
Sebastian

Helmholtz Zentrum München
Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH)
Ingolstädter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir´in Bärbel Brumme-Bothe
Geschäftsführer: Prof. Dr. Günther Wess, Dr. Nikolaus Blum, Dr. Alfons Enhsen
Registergericht: Amtsgericht München HRB 6466
USt-IdNr: DE 129521671

2 1

human genome latest annotation
by Irene Bassano 27 Nov '13

Hi Jen,

I uploaded some fastq files and selected as Genome from the drop down list "Homo sapiens b37 (hg_g1k_v37)".

Is this the same as the genome listed in UCSC "February 2009 (GRCh37/hg19)"?

I am using iGenomes to get the gene names rather than annotation such as NM_00xxxx, and I used the UCSC website, choosing the newest genome, February 2009.

Thanks, (sorry, I think I sent a mail to bugs report by mistake)

best,
Irene

2 1

Problems with Picard and GATK tools
by garzetti 27 Nov '13

Dear all,

I have been trying to analyze some recently acquired WGS reads (re-sequencing with MiSeq), but I am having problems with both Picard and GATK tools and I don't know where the problem is.

My fastq reads are already in the sanger/illumina 1.9 format, as recognized by the FastQC tool. I have modified the attributes of the read files from fastq to fastqsanger and successfully performed a BWA mapping against my reference sequence. I have then filtered the resulting SAM file with "NGS: SAM Tools, Filter SAM" to keep only paired-mapped reads and reordered the file with "NGS: Picard, Reorder SAM/BAM", enabling the option "Truncate sequence names after first whitespace".

Since my reads are highly duplicated (according to the FastQC output), I have run the "NGS: Picard, Mark Duplicate reads" tool, which removed only 2 duplicated reads. I went on to add a Read Group with "NGS: Picard, Add or Replace Groups" and started the SNP calling with GATK using the tool Realigner Target Creator. Here I obtained an empty file and started thinking something is wrong.

So, I have tried to perform the mapping again (as suggested by the GATK wiki when someone got an empty file like me), running the same steps on different sample reads, but I always get the same strange results from the de-duplication step and the Realigner tool. I think there is something wrong during the BWA mapping step, or even in my fastq reads, but I cannot understand what it is. Any idea?

And what is the read quality format accepted by Galaxy tools? I know it's PHRED+33, but what does it look like?

Example 1:
??A????BDDDEDDDDGGGGGGGHHHF##77AEFHIIHIHIIIH##77ACFFHHHIHIIHH#5AEFHHHHHHF#55AFHEAEDHHHHHHFFCFHHH#######64#66=+@DDEGGGGDEDEEBEECCECEEGGEGGGGGGGGEEGGA5C0

or Example 2:
!!"!!!!#%%%&%%%%((((((()))'!!!!"&')**)*)***)!!!!"$'')))*)**))!!"&'))))))'!!!"')&"&%))))))''$')))!!!!!!!!!!!!!!!%%&((((%&%&&#&&$$&$&&((&((((((((&&(("!$!

I did BWA mapping with both types and it worked, but maybe the problem lies somewhere here. I hope someone can help me!

Thank you!!!!
Debora
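On the question of what Phred+33 looks like: in that encoding each quality character has the ASCII code of the score plus 33, so "!" is a score of 0 and "I" is a score of 40. The sketch below decodes a quality string under that assumption; the function names are hypothetical and the input is just the first eleven characters of Example 1 above, so it shows the arithmetic rather than diagnosing the files in question.

```python
def decode_quality(quality, offset=33):
    """Convert a FASTQ quality string to a list of Phred scores."""
    scores = [ord(ch) - offset for ch in quality]
    if any(score < 0 for score in scores):
        raise ValueError("character below the chosen offset; wrong encoding?")
    return scores


def encode_quality(scores, offset=33):
    """Convert a list of Phred scores back to a quality string."""
    return "".join(chr(score + offset) for score in scores)


# First eleven characters of Example 1 from the message above.
prefix = "??A????BDDD"
scores = decode_quality(prefix)
print(scores)
print(min(scores), max(scores))
```

Decoding both example strings this way and comparing the score ranges is one way to see how differently the two encodings would be interpreted by a tool that expects Phred+33.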

1 0

Re: [galaxy-user] Cufflinks returned 0 value in all RPKMs
by Jennifer Jackson 26 Nov '13

Hi Dao,

To run the analysis correctly on SOLiD data, a local or cloud Galaxy would be needed. A cloud Galaxy is web based, and if you follow the links below, you will find exact instructions for getting set up. There are Amazon fees, but they do have various grant programs that you can review at their site for help with that. Galaxy itself is always free!!

About tools on the Test server: these are constantly in flux in terms of dependencies and such, and we don't support them because this is truly a development environment for us. You may find that certain tools, including this one, work on smaller datasets at some point in the future, but for the public this shouldn't be used for serious work.

One last option is the other public Galaxy servers, although I don't know for certain if there is one that will accept your datatype and offer enough quota to do significant work (these also change over time). Each is supported by the hosting group. A list is here and you can look through/review the sites to see what is available:
http://wiki.galaxyproject.org/PublicGalaxyServers

Take care,
Jen
Galaxy team

On 11/26/13 7:43 AM, Ly, Dao wrote:
> Hi
>
> Thank you very much for your reply. For the time being, I just wanted
> to be familiarized with the workflow and the open resource of Galaxy
> Main to analyze NGS. If you could advise me how I can obtain the RPKM
> that will be great. I have tried many ways to map but no luck so far.
> I think I’m at the end of my wits.
>
> I did try TopHat2 with Ion Torrent data and it worked fine. This SOLiD
> SRA format is giving me a hard time and I can only work on a web-based
> program. I also tried TopHat for SOLiD on the Test server but it failed!
>
> Many thanks again
>
> Best regards
>
> Dao
>
> From: Jennifer Jackson [mailto:jen@bx.psu.edu]
> Sent: November 20, 2013 8:46 PM
> To: Ly, Dao; galaxy-user(a)lists.bx.psu.edu
> Subject: Re: [galaxy-user] Cufflinks returned 0 value in all RPKMs
>
> Hello,
>
> If the data is RNA from rat, then you will want to be using Tophat
> instead of Bowtie. Otherwise the data will not be mapped as spliced and
> the results will be off in many ways (the fragment counts are a small
> symptom of a larger problem).
>
> You can use 'Tophat for SOLiD' on a suitable local or cloud Galaxy
> instance. It is available on the Test server, but tools are not
> supported here (we test/break things!) and the quotas are just 10G
> with an account. But maybe it is a place to do a small trial run before
> committing to a cloud server.
> http://getgalaxy.org
> http://usegalaxy.org/cloud
> http://usegalaxy.org/toolshed
>
> More about RNA-seq is in our wiki and public server, including
> link-outs and tutorials, you can get started here:
> Example → RNA-seq analysis tools:
> http://wiki.galaxyproject.org/Support#Interpreting_scientific_results
> See RNA-seq examples: http://wiki.galaxyproject.org/Learn#Other_Tutorials
>
> Best,
>
> Jen
> Galaxy team
>
> On 11/20/13 6:09 AM, Ly, Dao wrote:
> > Hi
> >
> > I have been trying to analyze a rat SOLiD SRA but I encountered a
> > problem: Cufflinks gave me 0 RPKM in all genes. Here is my workflow
> >
> > 1. Get data with EBI SRA: sent the fastq file directly to Galaxy
> > 2. FASTQ Groomer
> > 3. Mapped with Bowtie for SOLiD (paired-end) with the built-in
> >    index rat rn5 as reference genome
> > 4. SAM to BAM on the Bowtie mapping result
> > 5. Cufflinks on the BAM file
> >
> > All RPKMs of gene expression and transcript expression have a 0
> > value even though the RPKM status is OK. I used default settings
> > for all jobs. Am I missing something? Any help, suggestion will be
> > greatly appreciated.
> >
> > Thank you very much
> >
> > Best regards
> >
> > Dao

--
Jennifer Hillman-Jackson
http://galaxyproject.org
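For readers unfamiliar with the RPKM values discussed in this thread, the textbook definition is reads per kilobase of transcript per million mapped reads: RPKM = 10^9 * C / (N * L), where C is the number of reads assigned to the gene, N is the total number of mapped reads, and L is the gene length in base pairs. The sketch below only illustrates that arithmetic with made-up numbers; it is not how Cufflinks computes its estimates (Cufflinks uses a statistical model over fragments), and the function name is hypothetical.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Textbook RPKM: reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Made-up numbers: 500 reads on a 2,000 bp gene, 10 million mapped reads.
value = rpkm(500, 2000, 10_000_000)
print(value)

# With no reads assigned to a gene, the value is zero whatever the length.
print(rpkm(0, 2000, 10_000_000))
```

Under this definition a gene only gets a value of 0 when no reads are assigned to it, which fits the reply's point that zeros everywhere are a symptom of the mapping step rather than of Cufflinks itself.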

1 0
