August 2013 - galaxy-user - lists.galaxyproject.org

watching command line to a query
by lilach noy 30 Aug '13

Hello, How can I see the command line that a query executes? To be more specific, I am new to Galaxy and plan to use it as a way to learn how to run queries locally. To understand not only the functions available but also how I can write them on my own, I'd be happy to be able to see the command line that executes my queries. How can I see this? Thank you

August 2013 Galaxy Update Newsletter is out
by Dave Clements 30 Aug '13

Subtract Whole Dataset
by lilach noy 30 Aug '13

Hello all, I am having trouble with the Subtract Whole Dataset function. I tried subtracting one BAM file from another and got a bai file as output. I need it to be a BAM file so I can convert it to SAM and see the data in a non-binary form. Does anyone have any thoughts on the issue? Thank you for trying :) Lilach

Which Input FASTQ quality scores type should I choose when run FASTQ Groomer?
by Du, Jianguang 30 Aug '13

Hi All, I downloaded some RNA-seq datasets from NCBI. The datasets were generated on an Illumina HiSeq 2000. I am not sure which "Input FASTQ quality scores type" I should choose when running FASTQ Groomer. Below are the scores of 2 reads from a dataset; I renamed them "read 1" and "read 2".
1) Sequence and quality scores as displayed in Galaxy:
@read 1 length=51
NTGAGATTCTTGACTAGTTATTTCTGCTTTCAGGGAAGAAATCAGCTGGGC
+read 1 length=51
#1=ADADEHHHHHIIGIHJGJJJHJIIJJJH@HEGBFH;FHEH>@HIJJJJ
@read 2 length=51
NGAAGAGTCAGTTTTTTGTTTCCCTCATAACTTGCTAGATTCCGGATTGCT
+read 2 length=51
#1=DDDEDHHFHHJJJJJIJJHIIIJJJIJJJJJJJIJIJJJJJJIJJJJI
2) Sequence and one-channel quality scores shown in SRA at NCBI when I downloaded the dataset:
>gnl|SRA|read 1
NTGAGATTCTTGACTAGTTATTTCTGCTTTCAGGGAAGAAATCAGCTGGGC
One channel quality score
2 16 28 32 35 32 35 36 39 39 39 39 39 40 40 38 40 39 41 38 41 41 41 39 41 40 40 41 41 41 39 31 39 36 38 33 37 39 26 37 39 36 39 29 31 39 40 41 41 41 41
>gnl|SRA|read 2
NGAAGAGTCAGTTTTTTGTTTCCCTCATAACTTGCTAGATTCCGGATTGCT
One channel quality score
2 16 28 35 35 35 36 35 39 39 37 39 39 41 41 41 41 41 40 41 41 39 40 40 40 41 41 41 40 41 41 41 41 41 41 41 40 41 40 41 41 41 41 41 41 40 41 41 41 41 40
It looks like the dataset was generated with an Illumina pipeline later than version 1.8, because some of the bases have a quality score of 41. Can I choose "sanger" as the "Input FASTQ quality scores type" when I run FASTQ Groomer? Thanks. Jianguang Du
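One way to check the encoding is to decode the quality string with a candidate offset and compare the result with the numeric scores that SRA shows for the same read. The following is a minimal sketch, assuming Phred+33 (the "sanger" setting); the helper name `decode_quality` is hypothetical and is not part of Galaxy or FASTQ Groomer.

```python
def decode_quality(quality: str, offset: int = 33) -> list[int]:
    """Convert an ASCII-encoded FASTQ quality string to integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


# Quality string of "read 1" as displayed in Galaxy (copied from the post).
read1_quality = "#1=ADADEHHHHHIIGIHJGJJJHJIIJJJH@HEGBFH;FHEH>@HIJJJJ"

# One-channel quality scores of the same read as shown by SRA (copied from the post).
read1_sra = [
    2, 16, 28, 32, 35, 32, 35, 36, 39, 39, 39, 39, 39, 40, 40, 38, 40,
    39, 41, 38, 41, 41, 41, 39, 41, 40, 40, 41, 41, 41, 39, 31, 39, 36,
    38, 33, 37, 39, 26, 37, 39, 36, 39, 29, 31, 39, 40, 41, 41, 41, 41,
]

scores = decode_quality(read1_quality)          # assume Phred+33
print(scores == read1_sra)                      # True when the offset is right

# With an offset of 64 the first character '#' (ASCII 35) would decode to a
# negative score, which is not a valid Phred value.
print(min(decode_quality(read1_quality, offset=64)))
```

If the decoded list matches the SRA scores, the file is consistent with a +33 offset, which is what the "sanger" input type expects.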

Barcode Splitter Problem
by Priyanka Vengurlekar 30 Aug '13

Hi all, I need some serious help. I got output from the MiSeq machine in FASTQ format. My supervisor asked me to separate barcodes, so since Monday I have been struggling to do this on the command line. I executed it, but there must be some mistake, because it doesn't recognize any barcodes at all and does not give me the output text file; it just puts everything in one file called unmatched. Then, just to try, I ran the convert FASTQ to FASTA command and it said cannot execute the binary file.... I broke my head, so I tried other commands, and all of them said cannot execute binary file... Q1.. So first, kindly tell me why none of these binary files can be executed from the root directory, and what extra software I have to download beyond what was in the installation instructions. So I turned to the Galaxy web-based usage, which was even harder. I could not figure out what "library to split" in the Barcode Splitter program actually requires. At first I thought it was the document with barcodes, so I uploaded it, but nothing shows up in the pulldown for library to split. It's Thursday and I still have to trim the FASTQ sequences, and nothing seems to work at all with what I am working on.... Can somebody please help me...... Sincerely, Piyu

Error "out of memory" when trying to retrieve output
by Delong, Zhou 30 Aug '13

Hello, I wanted to download the accepted junction .bam file from the TopHat output of my local instance, and I get an "out of memory" error. When I examined the server via the command line, I found that a python process used by Galaxy occupied more than 80% of total memory (on a virtual machine with 10G of RAM). After rebooting the virtual machine, I tried the curl command to retrieve the data file, and python was activated again and used up all the memory. The BAM is around 20G in size, but I never had this kind of problem with other TopHat analyses on my local instance, although they are of the same size. The description on the web mentioned some .dat files that I managed to find on the disk, but not the BAM. Can anyone explain what python is doing and how I can solve this, please? Thanks, Delong

Question about expression in Galaxy tools
by 师云 29 Aug '13

Hello everyone, I found that regular expressions are available in the tool (Filter and Sort) -> Filter. I wonder whether the same is true for the tool (Text Manipulation) -> Compute. I have checked that the function "len(c4.split('_'))" returns an error. So, could anyone tell me whether something like this is possible?
File A:
chr1 10 40 NM_1234_exon_1
chr2 50 70 NM_1234_exon_2
Change file A to file B, such as:
chr1 10 40 NM_1234
chr2 50 70 NM_1234
Maybe it is useless in this example, but this approach could solve problems I have met.
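Outside Galaxy, the change from file A to file B can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the input is tab-separated, the name sits in the fourth column, and the function name `strip_exon_suffix` is hypothetical. Whether the Compute tool itself accepts such an expression is not confirmed here.

```python
def strip_exon_suffix(line: str, column: int = 3, keep_parts: int = 2) -> str:
    """Keep only the first `keep_parts` underscore-separated parts of one column."""
    fields = line.rstrip("\n").split("\t")
    # "NM_1234_exon_1".split("_") -> ["NM", "1234", "exon", "1"]; keep "NM_1234".
    fields[column] = "_".join(fields[column].split("_")[:keep_parts])
    return "\t".join(fields)


# The two example rows of file A from the post (tab-separated by assumption).
file_a = [
    "chr1\t10\t40\tNM_1234_exon_1",
    "chr2\t50\t70\tNM_1234_exon_2",
]

file_b = [strip_exon_suffix(line) for line in file_a]
for line in file_b:
    print(line)
```

Splitting on the underscore and rejoining the leading parts avoids a regular expression altogether, which keeps the rule easy to adapt to identifiers with a different number of parts.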

problem with bowtie
by Law, Michael J. 29 Aug '13

Hello, I am completely new to using Galaxy. I have a quick question. I uploaded the FASTQ files generated from my experiment for alignment with Bowtie. When I try to set up the alignment, I receive an error message saying that I don't have any sequences with ASCII-encoded quality scores. However, when I contacted my sequencing facility, they said that the scores are present in the files and that it may be an issue with Galaxy. Any help you can provide would be appreciated! Thanks, Mike Michael Law lawmj(a)rowan.edu

exome-capture sequencing analysis tools?
by Yan He 29 Aug '13

Hi Jen and other Galaxy users, I am working on exome-capture sequencing with NGS. I am wondering if there is a tool in Galaxy to identify SNPs. I would like to get SNP information (position and allele frequency) for each gene. Any information is highly appreciated! Thanks! Best wishes, Yan

How to extract geneID from pileup file?
by Yan He 29 Aug '13

Dear galaxy-users, I am working on a project to identify and genotype SNPs in targeted genes. I did some analysis using Galaxy: first, I mapped the reads to the genome with Bowtie; second, I identified SNPs using MPileup in SAMtools. In the resulting pileup file, each SNP is described by its chromosome and position. I would like to focus on the SNPs within genes. How could I extract the SNP information (SNP position, coverage) for each gene? Is there a tool in Galaxy that does this? Any help is highly appreciated! Best wishes, Yan
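Restricting SNPs to gene regions amounts to intersecting pileup positions with gene coordinates. The following is a minimal sketch, assuming gene coordinates are available as 0-based, half-open BED-style intervals and pileup positions are 1-based. The gene names, coordinates, and pileup rows are made up for illustration, and the helper `snps_by_gene` is hypothetical, not a Galaxy tool.

```python
from collections import defaultdict


def snps_by_gene(pileup_rows, genes):
    """Group (position, coverage) pairs by the gene whose interval contains them.

    pileup_rows: iterable of (chrom, pos_1based, coverage)
    genes:       iterable of (chrom, start_0based, end_exclusive, gene_id)
    """
    hits = defaultdict(list)
    for chrom, pos, coverage in pileup_rows:
        for g_chrom, start, end, gene_id in genes:
            # A 1-based position p lies inside the 0-based half-open
            # interval [start, end) exactly when start < p <= end.
            if chrom == g_chrom and start < pos <= end:
                hits[gene_id].append((pos, coverage))
    return dict(hits)


# Made-up gene intervals (BED-style) and pileup rows, for illustration only.
genes = [("chr1", 100, 200, "geneA"), ("chr1", 500, 600, "geneB")]
pileup_rows = [("chr1", 150, 32), ("chr1", 300, 18), ("chr1", 600, 25)]

result = snps_by_gene(pileup_rows, genes)
print(result)
```

The nested loop is only meant to show the logic; for genome-scale data a dedicated interval-intersection tool or an indexed interval structure would be a more practical choice.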