March 2012 - galaxy-user - lists.galaxyproject.org

Question about aaChange tool.
by Maha Al Kahtani 15 Mar '12

15 Mar '12

Hello,

Recently I used Galaxy to find the amino acids corresponding to a list of SNPs that I have. I used the aaChange tool from the tool panel for this purpose. After obtaining the results, I checked their correctness by searching dbSNP for randomly chosen SNPs. However, some of those SNPs were mapped to an incorrect amino acid. For example, the SNP rs11549096 was mapped by Galaxy to the change Asp:Tyr/His, whereas searching dbSNP for this SNP shows that the change should be Asp:Asn. The dbSNP link for this SNP is: http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?searchType=adhoc_searc…

So, in all the cases I searched (more than a thousand), the reference amino acid is correct and the position of the change in the protein is correct, *but* the alternative amino acid is not always correct. What causes this mistake, and how can I solve this problem?

Note: my list contains hundreds of thousands of SNPs, so I cannot check each one of them. A prompt response would be appreciated.

Best Regards,
Maha
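One way to sanity-check a single case like this by hand is to translate the codon with each allele substituted, once as given and once complemented to the opposite strand. The sketch below uses the standard genetic code; the codon `GAT`, the affected position, and the allele letters `T` and `C` are illustrative assumptions, not values taken from the thread or from dbSNP, and the helper name `substitute` is hypothetical. It only shows that Asp to Tyr/His and Asp to Asn can differ by nothing more than the strand on which the alleles are read; it does not establish what aaChange actually did.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def substitute(codon, position, allele):
    """Translate `codon` after replacing the base at `position` with `allele`."""
    mutated = codon[:position] + allele + codon[position + 1:]
    return CODON_TABLE[mutated]


codon = "GAT"      # assumed Asp codon (hypothetical)
alleles = "TC"     # assumed allele letters as listed in the input (hypothetical)

# Alleles used exactly as listed.
as_given = [substitute(codon, 0, a) for a in alleles]
# Alleles complemented first, as if they were reported on the opposite strand.
complemented = [substitute(codon, 0, a.translate(COMPLEMENT)) for a in alleles]

print(as_given)       # ['Y', 'H']  -> Tyr, His
print(complemented)   # ['N', 'D']  -> Asn, Asp (the reference)
```

If a spot check of a few SNPs shows this pattern, comparing the strand of the alleles in the input with the strand of the gene would be a reasonable next step.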

2 2

Tophat error
by David Matthews 14 Mar '12

14 Mar '12

Hi,

I was just running a TopHat job, which returned the following error:

Executing: /gpfs/cluster/isys/galaxy/Software/bin/bowtie-inspect /local/tmp5Ywx45/dataset_942 > ./tophat_out/tmp/dataset_942.fa
[Tue Mar 13 12:45:08 2012] Checking for Bowtie
    Bowtie version: 0.12.7.0
[Tue Mar 13 12:45:08 2012] Checking for Samtools
    Samtools Version: 0.1.18
[Tue Mar 13 12:45:08 2012] Generating SAM header for /local/tmp5Ywx45/dataset_942
    format: fastq
    quality scale: phred33 (default)
[Tue Mar 13 12:45:21 2012] Preparing reads
    left reads: min. length=56, count=29523921
    right reads: min. length=56, count=29543412
[Tue Mar 13 13:07:54 2012] Mapping left_kept_reads against dataset_942 with Bowtie
[Tue Mar 13 13:45:26 2012] Processing bowtie hits
[Tue Mar 13 14:11:28 2012] Mapping left_kept_reads_seg1 against dataset_942 with Bowtie (1/2)
[Tue Mar 13 14:43:27 2012] Mapping left_kept_reads_seg2 against dataset_942 with Bowtie (2/2)
[Tue Mar 13 14:57:50 2012] Mapping right_kept_reads against dataset_942 with Bowtie
[Tue Mar 13 15:37:46 2012] Processing bowtie hits
[Tue Mar 13 16:04:28 2012] Mapping right_kept_reads_seg1 against dataset_942 with Bowtie (1/2)
[Tue Mar 13 16:37:18 2012] Mapping right_kept_reads_seg2 against dataset_942 with Bowtie (2/2)
[Tue Mar 13 16:50:40 2012] Searching for junctions via segment mapping
Traceback (most recent call last):
  File "/gpfs/cluster/isys/galaxy/Software/bin/tophat", line 3063, in <module>
    sys.exit(main())
  File "/gpfs/cluster/isys/galaxy/Software/bin/tophat", line 3029, in main
    user_supplied_deletions)
  File "/gpfs/cluster/isys/galaxy/Software/bin/tophat", line 2681, in spliced_alignment
    [maps[initial_reads[left_reads]].unspliced_bwt, maps[initial_reads[left_reads]].seg_maps[-1]],
TypeError: list indices must be integers, not str

Does anyone know what this kind of error is?

Best Wishes,
David.
__________________________________
Dr David A. Matthews
Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol. BS8 1TD U.K.
Tel. +44 117 3312058
Fax. +44 117 3312091
D.A.Matthews(a)bristol.ac.uk
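As a general illustration of this kind of error (not a diagnosis of the TopHat script itself), Python raises a `TypeError` whenever a list is indexed with a string instead of an integer. The variable names below are hypothetical and only mimic the shape of the failing expression; the exact wording of the message differs slightly between Python versions.

```python
# A list can only be indexed by integers (or slices), never by a string.
maps = ["unspliced_hits", "segment_hits"]   # hypothetical stand-in for a list
key = "left_kept_reads"                     # hypothetical string used as an index

try:
    maps[key]
except TypeError as exc:
    message = str(exc)
    print("TypeError:", message)

# The same lookup works when the container is a dict keyed by that string.
maps_by_name = {"left_kept_reads": "unspliced_hits"}
print(maps_by_name[key])
```

In other words, the traceback suggests that some value used as an index at that line was a string where the script expected an integer, or that a list was present where a dictionary was expected.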

1 0

Maq consensus calling changes quality score of the read?
by Antony Jose 14 Mar '12

14 Mar '12

Hi,

We used the Generate pileup tool with the consensus base calling option using Maq. In the output, the quality scores of the bases were changed. For example, if the input scores of the bases in the SAM file were 'IIJIH', they were changed to '22321'. Is this a glitch, or is this expected?

Thank you.
Antony
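To compare the two strings numerically, the characters can be decoded to quality values. The sketch below assumes the common Phred+33 ASCII encoding (quality = character code minus 33) for both strings; whether the pileup output column actually uses that encoding, or reports a different kind of score, is not confirmed in this thread. The function name `phred33_scores` is hypothetical.

```python
def phred33_scores(quality_string):
    """Decode a quality string under the assumed Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


before = phred33_scores("IIJIH")   # qualities from the input SAM file
after = phred33_scores("22321")    # qualities seen in the pileup output

print(before)  # [40, 40, 41, 40, 39]
print(after)   # [17, 17, 18, 17, 16]
```

Under that assumption, the values differ by a constant 23 at every position, which may be a useful clue when asking whether the change is an encoding shift or a recalculated score.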

1 0

Question about plotting circos plot
by shamsher jagat 14 Mar '12

14 Mar '12

I wonder whether it is possible to visualize mutation data in a circular plot, known as a Circos plot, e.g. http://www.eurekalert.org/multimedia/pub/31019.php?from=181881

Any suggestion for an alternative tool would also be appreciated.

Thanks,
Shamsher

3 3

Quick question on generated tables
by Carly Hom 14 Mar '12

14 Mar '12

Hello,

I am working on a project that involves extracting a list of promoter regions that contain a sufficiently strong H3K27me3 signal. So far I have produced an output in the ENCODE genome browser, which is great for the visual representations I will need, but I also need to extract the table that was generated. I can see the first few lines in the snapshot of the data that Galaxy provided, but how do I extract the entire table into spreadsheet or txt format? If you could point me to a Galaxy function that allows me to do this, and that I may be overlooking, that would be great. I have provided an image of the data that I want to extract from Galaxy so that there is no confusion about what I am trying to do.

Thank you!
- Cary

--
Caroline Hom
Engineering Residential Community Peer Mentor
School of Biological and Health Systems Engineering Grader
Fulton Undergraduate Teaching Assistant
Fulton Undergraduate Researcher, Synthetic Biology and Bioinformatics
Tempe, AZ
Ph: 602-315-5728
Arizona State University | Ira A. Fulton Schools of Engineering
B.S.E. Biomedical Engineering
OpenWetWare <http://openwetware.org/wiki/User:Caroline_Hom>
LinkedIn <http://www.linkedin.com/profile/view?id=139737398&locale=en_US&trk=tab_pro>

2 1

Error using intersect tool: "problem: 536870912 is larger than the size of this BitSet (536870912)."
by Tim Reddy 14 Mar '12

14 Mar '12

Hello,

I'm getting this problem a lot when I try to intersect datasets. The errors are similar to:

Skipped 4 invalid lines of 1st dataset, 1st line #19614: "chr10 135498863 135499333 chr10.6783 1000 + 0.4037 15.3 -1 236 ", problem: 536870912 is larger than the size of this BitSet (536870912).

This is on the main public server with the hg19 genome, while running "overlapping pieces of intervals" on pairs of datasets in the attached history. I've tried different format types, changing the strand, and confirming that the provided files do not extend past the ends of chromosomes, all to no avail.

Public history: http://main.g2.bx.psu.edu/u/timreddy/h/unnamed-history

It seems there was a report of this error in the archives from 2009, but no response.

Tx,
Tim

--
Timothy E. Reddy, Ph.D.
Assistant Professor
Department of Biostatistics and Bioinformatics
Institute for Genome Sciences and Policy
Duke University School of Medicine
919-684-3286 (lab)
919-564-9536 (cell)
tim.reddy(a)duke.edu
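For anyone wanting to rule out bad coordinates before intersecting, a rough pre-check of an interval file could look like the sketch below. The limit 536870912 (which equals 2**29) is taken from the error message; treating it as an upper bound on coordinates is an assumption, and the function name `find_suspect_intervals` and the column layout (chromosome, start, end in the first three columns) are hypothetical. Note that the line quoted in the error has coordinates far below that limit, so this check would not flag it and does not explain the error.

```python
BITSET_LIMIT = 536870912  # value from the error message; equals 2**29


def find_suspect_intervals(lines, limit=BITSET_LIMIT):
    """Return (line_number, reason) for lines with questionable coordinates."""
    suspects = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("#"):
            continue  # skip comment/header lines
        fields = line.split()
        if len(fields) < 3:
            continue  # not an interval line
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            suspects.append((number, "non-integer coordinates"))
            continue
        if start < 0 or start >= end:
            suspects.append((number, "start is not below end"))
        elif end >= limit:
            suspects.append((number, "end reaches or exceeds limit"))
    return suspects


example = [
    "chr10\t135498863\t135499333\tchr10.6783\t1000\t+",  # line from the error
    "chr1\t10\t5\tbad_order",
    "chr1\t0\t536870912\ttoo_long",
]
print(find_suspect_intervals(example))
```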

2 1

Dataset name in dataset file
by Cittaro Davide 14 Mar '12

14 Mar '12

Hi all,

I have a question about a weird usage of Galaxy. Suppose I have 23 files, one per chromosome, and suppose the dataset name (in the history) contains the chromosome name but the dataset content doesn't (i.e. I only have the chromosome position and 2 associated values). Is there a way to add a column whose content is the dataset name? In an automatic way, I mean... i.e.

data-chr1
1000000 3.1 23
1000100 1.2 23
[…]

data-chr2
1234411 33.1 24
4225211 12.0 44
[…]

I would like them to be

data-chr1 1000000 3.1 23
data-chr1 1000100 1.2 23
[…]
data-chr2 1234411 33.1 24
data-chr2 4225211 12.0 44
[…]

so that I can merge them and strip 'data-'...

Oh, I know I can do it outside Galaxy and upload the final file, but I would like to know if (and how) this is possible within Galaxy.

d

/*
Davide Cittaro, PhD
Head of Bioinformatics Core
Center for Translational Genomics and Bioinformatics
San Raffaele Scientific Institute
Via Olgettina 58
20132 Milano
Italy
Office: +39 02 26439140
Mail: cittaro.davide(a)hsr.it
Skype: daweonline
*/
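For reference, the "outside Galaxy" route mentioned above can be sketched in a few lines of Python. This is only an illustration of the desired transformation, not a Galaxy feature; the function name `add_name_column` and the tab-separated layout are assumptions.

```python
def add_name_column(dataset_name, rows, prefix="data-"):
    """Prepend the dataset name (minus `prefix`) as a first column to each row."""
    if dataset_name.startswith(prefix):
        label = dataset_name[len(prefix):]
    else:
        label = dataset_name
    return [f"{label}\t{row}" for row in rows if row.strip()]


# Hypothetical in-memory stand-ins for two per-chromosome datasets.
datasets = {
    "data-chr1": ["1000000\t3.1\t23", "1000100\t1.2\t23"],
    "data-chr2": ["1234411\t33.1\t24", "4225211\t12.0\t44"],
}

merged = []
for name, rows in datasets.items():
    merged.extend(add_name_column(name, rows))

print("\n".join(merged))
```

Stripping the prefix while labelling, rather than after merging, saves a separate pass over the merged file.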

2 1

Problem with Intersect interval tools
by Mónica Pérez Alegre 14 Mar '12

14 Mar '12

Hi,

I'm getting this problem when I try to use the Intersect intervals tool:

Info: Skipped 6 invalid lines of 1st dataset, 1st line #1258: "chr9 357415 360396 ORF - YIR002C MPH1 S000001441 Active Verified 2982 ", problem: 536870912 is larger than the size of this BitSet (536870912).

Any ideas?

Thanks, regards ☺

If you have used the Services of the Genomics Unit of Cabimer, we would be grateful if you would give us a mention in future publications.

Mónica Pérez Alegre, PhD
Genomics Unit
CABIMER-CSIC
Edif. CABIMER - Avda. Américo Vespucio s/n
Parque Científico y Tecnológico Cartuja 93
41092 Seville-SPAIN
Tlf: +34 954 467 828
Fax: +34 954 461 664
www.cabimer.es <http://www.cabimer.es/>
http://www.cabimer.es/web/es/unidades-apoyo/genomica

2 1

Extracting number of reads from Bowtie analysis
by Jerzy Dyczkowski 14 Mar '12

14 Mar '12

Hello, I am new here!

I am aligning Solexa files to a genome using the tool NGS: Mapping: Map with Bowtie for Illumina.

My question: what is the easiest way to check the number of aligned reads in the output of this tool? I couldn't find this number directly. I found a way, but it looks roundabout and suboptimal: run Bowtie with the option to write unmapped reads to a file, download the file, open it, count the reads, and subtract that from the number of reads in the original file. However, downloading large files is time-consuming, and I am sure there is an easier way somewhere.

A further question: a good link to help on how to improve the number of reads from a ChIP study aligned to the mouse genome would also be appreciated.

Best regards,
Jerzy
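If the Bowtie output is in SAM format, one rough way to count aligned records without the subtraction step is to inspect the FLAG column: in the SAM specification, bit 0x4 marks a record as unmapped. The sketch below assumes SAM output that still contains unaligned records; the function name `count_aligned` is hypothetical. It counts records rather than unique reads, so a read reported at several locations, or the two mates of a pair, would each be counted separately.

```python
def count_aligned(sam_lines):
    """Return (aligned_records, total_records) for an iterable of SAM lines."""
    aligned = total = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])  # second column is the bitwise FLAG
        total += 1
        if not flag & 0x4:               # bit 0x4 set means "unmapped"
            aligned += 1
    return aligned, total


example = [
    "@HD\tVN:1.0",
    "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t16\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\tIIII",
]
print(count_aligned(example))  # (2, 3)
```

For a real file, the same function can be called on an open file handle, e.g. `with open(path) as handle: count_aligned(handle)`, where `path` is the downloaded SAM dataset.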

2 1