July 2011 - galaxy-user - lists.galaxyproject.org

scramble pbs_python egg
by Luobin Yang 11 Jul '11


Hi,

I am trying to scramble the pbs_python egg for my Galaxy installation. When I execute the following command:

LIBTORQUE_DIR=/usr/local/lib python scripts/scramble.py pbs_python

I get lots of messages like the following:

scramble(): Egg already exists, remove to force rebuild: /home/galaxy/galaxy-dist/eggs/pysam-0.4.2_kanwei_b10f6e722e9a-py2.6-linux-i686-ucs4.egg
.........

but there is no message about pbs_python, and no pbs_python egg is generated in the eggs directory.

Is anything wrong?

Thanks,
Luobin
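One quick way to confirm whether an egg was actually produced is to list matching files in the eggs directory. The following is only a minimal sketch: the helper name `find_eggs` is hypothetical, and the assumption that a built egg would be named `pbs_python-<version>-....egg` is inferred from the pysam file name in the message above, not from the scramble script itself.

```python
import glob
import os


def find_eggs(eggs_dir, name):
    """Return sorted paths of files in eggs_dir named '<name>-*.egg'.

    The naming pattern is an assumption based on the pysam egg shown
    in the scramble output; it is not taken from Galaxy's own code.
    """
    pattern = os.path.join(eggs_dir, name + "-*.egg")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Path copied from the message above; adjust for your installation.
    for path in find_eggs("/home/galaxy/galaxy-dist/eggs", "pbs_python"):
        print(path)
```

An empty result would agree with the observation that no pbs_python egg was built.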


Fwd: Hi
by YOGESH OSTWAL 11 Jul '11


---------- Forwarded message ----------
From: YOGESH OSTWAL <yogeshfreebird(a)gmail.com>
Date: Mon, Jul 11, 2011 at 1:34 PM
Subject: Re: [galaxy-user] Hi
To: Ido Tamir <tamir(a)imp.ac.at>

Thanks a lot. Sorry for disturbing you again. Before starting with the actual data, can I try this analysis with already available IP and input files from Illumina datasets in an NGS repository?

On Mon, Jul 11, 2011 at 12:29 PM, Ido Tamir <tamir(a)imp.ac.at> wrote:

> On Jul 10, 2011, at 8:00 AM, YOGESH OSTWAL wrote:
>
> > Dear Galaxy users,
> >
> > This is Yogesh, a new galaxy user, very new to programming as well. Can
> > anybody guide me from where to start to learn ChIP-Seq analysis?
>
> Maybe with Galaxy you don't have to program.
> It's difficult to help you without knowing what your input data is.
>
> If you have one IP file and one input file from a TF binding experiment
> from an Illumina machine, you have to:
> 0. Have a look at the screencasts (galactic quickies) for some of the tasks.
> 1. Upload the data (Get Data section).
> 2. Do some quality statistics (FASTX-Toolkit for FASTQ data): Compute Quality Statistics -> Draw ...
> 3. Map the input data files (NGS TOOLBOX BETA -> Map with Bowtie).
> 4. Call the peaks with e.g. MACS (also NGS toolbox).
> 5. Visualize peaks and raw data in a genome browser (e.g. UCSC, IGB or Trackster).
>
> Then it gets more difficult with annotating the peaks etc...
>
> best,
> ido

--
Regards - Yogesh


picard alignment summary metrics failure
by Huge, Andreas 11 Jul '11


Hello,

When I use the NGS Picard tool "SAM/BAM alignment summary metrics" with a BAM file produced with Tophat in Galaxy I get the following message:

INFO:root:## executing java -Xmx4g -jar /galaxy/home/g2main/galaxy_main/tool-data/shared/jars/CollectAlignmentSummaryMetrics.jar VALIDATION_STRINGENCY=LENIENT ASSUME_SORTED=true ADAPTER_SEQUENCE= IS_BISULFITE_SEQUENCED=false MAX_INSERT_SIZE=1000 OUTPUT=/galaxy/main_pool/pool5/tmp/job_working_directory/2368774/dataset_2672529_files/CollectAlignmentSummaryMetrics.metrics.txt R=/galaxy/main_pool/pool5/tmp/job_working_directory/2368774/dataset_2672529_files/mm9.fa_fake.fasta TMP_DIR=/tmp INPUT=/galaxy/main_database/files/002/666/dataset_2666199.dat returned status 1 and stderr:

[Mon Jul 11 03:27:56 EDT 2011] net.sf.picard.analysis.CollectAlignmentSummaryMetrics MAX_INSERT_SIZE=1000 ADAPTER_SEQUENCE=[AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG, AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG, AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG, IS_BISULFITE_SEQUENCED=false] INPUT=/galaxy/main_database/files/002/666/dataset_2666199.dat OUTPUT=/galaxy/main_pool/pool5/tmp/job_working_directory/2368774/dataset_2672529_files/CollectAlignmentSummaryMetrics.metrics.txt REFERENCE_SEQUENCE=/galaxy/main_pool/pool5/tmp/job_working_directory/2368774/dataset_2672529_files/mm9.fa_fake.fasta ASSUME_SORTED=true TMP_DIR=/tmp VALIDATION_STRINGENCY=LENIENT IS_BISULFITE_SEQUENCED=false STOP_AFTER=0 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Mon Jul 11 03:27:57 EDT 2011] net.sf.picard.analysis.CollectAlignmentSummaryMetrics done.
Runtime.totalMemory()=507379712
Exception in thread "main" java.lang.IllegalArgumentException: No enum const class net.sf.samtools.SAMFileHeader$SortOrder.sorted
    at java.lang.Enum.valueOf(Enum.java:196)
    at net.sf.samtools.SAMFileHeader$SortOrder.valueOf(SAMFileHeader.java:58)
    at net.sf.samtools.SAMFileHeader.getSortOrder(SAMFileHeader.java:239)
    at net.sf.picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:85)
    at net.sf.picard.analysis.SinglePassSamProgram.doWork(SinglePassSamProgram.java:54)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:157)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:117)
    at net.sf.picard.analysis.CollectAlignmentSummaryMetrics.main(CollectAlignmentSummaryMetrics.java:106)

If I use the NGS: Picard BAM Index Statistics, the script is running without any failure so I think the BAM format should be okay.

Any help would be gratefully received.

Thanks.
Andreas Huge
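The exception text suggests that the `@HD` line of the BAM header carries `SO:sorted`, a value the SAM format does not define (the defined sort orders are `unknown`, `unsorted`, `queryname` and `coordinate`). The following is a minimal sketch of rewriting such a header as plain text. The function name `fix_sort_order` is hypothetical, and replacing the value with `coordinate` is an assumption that only holds if the file really is coordinate-sorted.

```python
# Sort-order values defined for the SO tag of the @HD header line.
VALID_SORT_ORDERS = {"unknown", "unsorted", "queryname", "coordinate"}


def fix_sort_order(header_text, replacement="coordinate"):
    """Return header_text with any invalid SO: value on @HD lines replaced.

    'replacement' defaults to 'coordinate', which is an assumption about
    how the file was actually sorted; pass 'unsorted' if in doubt.
    """
    fixed_lines = []
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            fields = line.split("\t")
            fields = [
                "SO:" + replacement
                if field.startswith("SO:") and field[3:] not in VALID_SORT_ORDERS
                else field
                for field in fields
            ]
            line = "\t".join(fields)
        fixed_lines.append(line)
    return "\n".join(fixed_lines)
```

On a local installation, the header text could be dumped with `samtools view -H` and written back with `samtools reheader`; whether that is practical on the public Galaxy server is a separate question.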


Hi
by YOGESH OSTWAL 11 Jul '11


Dear Galaxy users,

This is Yogesh, a new Galaxy user who is also very new to programming. Can anybody guide me on where to start learning ChIP-Seq analysis?

--
Regards - Yogesh


Galaxy Has a New Wiki
by Dave Clements 10 Jul '11


Hello all,

It is my pleasure to announce that Galaxy has a new wiki: http://galaxyproject.org/wiki. This wiki contains all the content from the old bitbucket wiki, plus a bunch of new content (most of which is still work in progress).

The new wiki is based on MoinMoin and includes several new or improved features:

* Search!
* an automatically generated list of all pages (click on All Pages in sidebar)
* Ability to upload files and images without using Mercurial.
* plus a lot more

The content, organization, and look and feel haven't entirely settled yet, so expect things to move around for a bit.

You don't need any special knowledge to read the wiki. If you want to update the wiki you'll need to create a login (click on the Login link). Anyone can create a login, but you will need to answer a random (but hopefully easy) question about Galaxy to do so*. You can use either MoinMoin markup or Creole markup (but not both on the same page).

We are hoping that the new wiki will be both much easier to use and to update than the old one. If you have any questions or comments, please send them to me or to the list as appropriate. Look for more emails as more features in the wiki become fully functional.

Thanks,
Dave C.

* And you will be asked to answer questions every time you update pages. If you get tired of this (and you will), please send me your wiki login and I will make those annoying (but spam-preventing) questions go away.

--
http://getgalaxy.org
http://usegalaxy.org/
http://galaxyproject.org/wiki


How to re-use a parameter in a workflow?
by Robert Curtis Hendrickson 08 Jul '11


Galaxy Users,

I have a workflow where I'd like the user to input a value once, say a number of nucleotides. That value would then be used as an input parameter to several different tasks, for example, to two instances of "Operate on Genomic Intervals > Get flanks". In one instance it would be used for both the "offset" and the "length of flanking region(s)"; in the second instance, its *negative* would be used for the offset and its value for the length.

Thus, the user inputs "20", and Get_flanks(20,20) and Get_flanks(-20,20) get run.

For this workflow, it's important that those parameters all be of the same magnitude, or things will get messy later, so I don't want the user to have to input them separately, or to have to remember which one gets negated...

All suggestions welcome,
Curtis
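The intended behaviour can be written down as a small function. This is only a sketch of the desired mapping from one user value to two parameter sets; the name `derive_flank_params` and the dictionary keys are hypothetical and are not part of any Galaxy workflow API.

```python
def derive_flank_params(n):
    """Map a single user-supplied size to the two (offset, length) pairs.

    Mirrors the example in the message: an input of 20 yields the
    arguments for Get_flanks(20, 20) and Get_flanks(-20, 20).
    """
    if n < 0:
        raise ValueError("expected a non-negative number of nucleotides")
    return [
        {"offset": n, "length": n},   # first "Get flanks" instance
        {"offset": -n, "length": n},  # second instance, offset negated
    ]
```

Deriving both pairs in one place is what guarantees that the magnitudes always agree, which is the property the workflow depends on.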


Passing/Referencing arguments from XML to R script while using XML conditional tag
by Uma Saxena 08 Jul '11


Hi,

This may appear to be a very basic question.

I. I have in the past used "name" as the object that allows the argument reference to be passed from XML to the Rscript running in the background. However, I am currently using a conditional tag, which makes this slightly incomprehensible. My command line argument for Rscript is given in the <command> tag. I have also tried to define the Rscript reference by using the XML filter tags "key" or "ref", but perhaps this is the wrong way to go.

Argument $geoID is a text reference that the user writes in. This ID is read into the Rscript and, within the script, I access the data using the R package GEOquery.
Argument $input_cel allows the user to upload a CEL file.
Argument $input_exprs allows the user to upload a tab-delimited text file.

The code is given below. The error I get on the Galaxy interface is "NotFound: cannot find 'geoID'".

<tool id="testtool" name="TEST">
  <description> xyz </description>
  <command>ppgalaxy.r $input $geoID $input_cel $input_exprs $platform $species $exptRecords_dist $exptRecords_consensus $exptRfingerprintTOconsensus $distHistogram</command>
  <inputs>
    <param name="input" type="select" label="User Data Source"/>
    <conditional name="input">
      <param name="input_type" type="select" data_key="input" label="User Data Type">
        <option value="GEO_data" selected="true">GSM ID</option>
        <option value="cel.file" >CEL file Upload</option>
        <option value="data.exprs" >Expression Vector Upload</option>
      </param>
      <when value="GEO_data">
        <param name="geoID" label="GEO id by GSM" type="text" area="TRUE" size="7" />
        <param name="input_cel" type="hidden" label="CEL file" default="0"/>
        <param name="input_exprs" type="hidden" format="tabular" label="Expression file" default="0"/>
      </when>

II. I nest a param tag within another param tag. My code is given below. The problem is that I should be able to read in three arguments: (1) input_exprs, the data file; (2) the platform name; (3) the selected species. On the GUI, the platform and species are not visible.

      <when value="data.exprs">
        <param name="geoID" type="hidden" label="GEO id by GSM" default="0"/>
        <param name="input_cel" type="hidden" label="CEL file" default="0"/>
        <param name="input_exprs" format="tabular" label="Expression file" type="data" >
          <options name="Platform by GPL" type="text" size="7" value="platform" >
            <label>Platform Input - GPL </label>
          </options>
          <param name="species" type="select" format="text">
            <label>Get</label>
            <option value="human">HOMO SAPIENS</option>
            <option value="mouse">MUS MUSCULUS</option>
          </param>
        </param>
      </when>

Any help or suggestion is much appreciated.

Thanks
Uma

--
*Uma Saxena*
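In Galaxy tool templates, parameters declared inside a <conditional> are normally addressed through the conditional's name (here that would be `$input.geoID` rather than `$geoID`), which would explain the "cannot find 'geoID'" error. The sketch below is a toy model of that nested lookup in plain Python, not Galaxy's actual template engine; the function `resolve` and the sample value `GSM12345` are hypothetical.

```python
def resolve(namespace, dotted_name):
    """Look up a dotted name such as 'input.geoID' in nested dictionaries.

    Raises KeyError with a message similar in spirit to the error in the
    question when a name is missing at the level where it is requested.
    """
    value = namespace
    for part in dotted_name.split("."):
        if not isinstance(value, dict) or part not in value:
            raise KeyError("cannot find '%s'" % part)
        value = value[part]
    return value


# Toy picture of the tool state: the conditional named "input" owns its params.
params = {
    "input": {
        "input_type": "GEO_data",
        "geoID": "GSM12345",  # hypothetical placeholder ID
        "input_cel": "0",
        "input_exprs": "0",
    }
}
```

With this model, `resolve(params, "input.geoID")` succeeds while `resolve(params, "geoID")` raises, because `geoID` only exists inside the conditional. Note also that the tool file declares both a param and a conditional named "input"; giving them distinct names would likely avoid a second source of confusion.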


fastx toolkit on solid qual file
by Zheng, Xin (NIH) [C] 08 Jul '11


Hi all,

I noticed that Galaxy can parse a qual file and output various kinds of statistics, which are said to be based on the FASTX-Toolkit. Yet fastx_quality_stats in the FASTX-Toolkit doesn't work on SOLiD qual files. Does anyone know the trick?

Thanks.
Xin Zheng
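For reference, per-position statistics can be computed directly from a qual file without the FASTX-Toolkit. The sketch below is not how Galaxy does it; it assumes the usual layout of `#` comment lines, `>` read headers, and one line of space-separated integer scores per read, and the function name `per_position_means` is hypothetical.

```python
def per_position_means(lines):
    """Return the mean quality score at each read position.

    'lines' is any iterable of strings from a .qual file. Comment lines
    ('#') and header lines ('>') are skipped; each remaining line is
    assumed to hold the space-separated integer scores of one read.
    """
    totals = []
    counts = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or line.startswith(">"):
            continue
        for position, token in enumerate(line.split()):
            if position == len(totals):
                # First read long enough to reach this position.
                totals.append(0)
                counts.append(0)
            totals[position] += int(token)
            counts[position] += 1
    return [total / count for total, count in zip(totals, counts)]
```

Reads of different lengths are handled by counting, per position, only the reads that reach it.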


samtools output
by Alison Gardner 08 Jul '11


Hello,

I am having trouble replicating the great output you get after running samtools "Filter pileup on coverage & snps with ten columns (with consensus)" when I try to run samtools locally on my computer. Unfortunately, we are unable to use Galaxy with our new data because our files are too large to upload to the website. Do you use some other scripts in the background to get such an informative output?

When I run the samtools commands

samtools pileup -i -vcf RefSeq.fa aln_sorted.bam > aln_ivcf.pileup

and then

samtools.pl varFilter aln_ivcf.pileup | awk '($3=="*" && $6>=20 && $7>=20 && $8>=10)' > final_aln_ivcf.pileup

I do not get useful information in the output that tells me how many reads are calling the alternative allele, and what the alternative allele is.

Any help would be gratefully received.

Thank you
Alison Gardner
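The per-allele read counts asked about here can be recovered from the read-bases column of a pileup line. The sketch below is not the script Galaxy runs; it assumes the standard pileup encoding (`.` and `,` for reference matches, letters for mismatches, `^` plus one character at a read start, `$` at a read end, `+N`/`-N` followed by N bases for indels, `*` for a deleted base) and the hypothetical function name `count_read_bases`.

```python
from collections import Counter


def count_read_bases(read_bases):
    """Count reference matches and alternative bases in a pileup base string.

    Returns a Counter keyed by 'ref', 'del', or an upper-case base letter.
    Indel annotations (+N/-N plus their sequence) are skipped, not counted.
    """
    counts = Counter()
    i = 0
    n = len(read_bases)
    while i < n:
        ch = read_bases[i]
        if ch == "^":
            i += 2  # skip the caret and the mapping-quality character
        elif ch == "$":
            i += 1  # end-of-read marker carries no base
        elif ch in "+-":
            i += 1
            digits = ""
            while i < n and read_bases[i].isdigit():
                digits += read_bases[i]
                i += 1
            i += int(digits) if digits else 0  # skip the indel sequence
        elif ch in ".,":
            counts["ref"] += 1
            i += 1
        elif ch == "*":
            counts["del"] += 1
            i += 1
        else:
            counts[ch.upper()] += 1  # mismatch on either strand
            i += 1
    return counts
```

In the ten-column consensus layout the read bases are usually the ninth tab-separated field of a SNP line, so something like `count_read_bases(line.split("\t")[8])` would give the counts per line; indel lines use a different layout and would need separate handling.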


Alternative to FPKM from Cufflinks
by James Chitwood 07 Jul '11


Hi all, Is there any way to find out the number of reads aligning to a transcript rather than the FPKM calculated by Cufflinks? I'm also interested in obtaining summary statistics for mapping analyses or RNA-Seq data, such as % of reads aligned, % uniquely aligned, mapped to exons, introns, etc. Is there a tool that would provide a summary table with this information? Thanks for your help and providing this great resource, James
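As background for the question, FPKM is defined as fragments per kilobase of transcript per million mapped fragments, so an approximate fragment count can be back-calculated from it. The sketch below uses that textbook definition only; Cufflinks works with effective lengths and probabilistic assignment of fragments, so the result is a rough estimate at best, and the function name `fpkm_to_fragments` is hypothetical.

```python
def fpkm_to_fragments(fpkm, transcript_length_bp, total_mapped_fragments):
    """Invert FPKM = fragments * 1e9 / (length_bp * total_mapped_fragments).

    Returns an approximate fragment count for one transcript. This uses
    the plain transcript length, not the effective length Cufflinks uses.
    """
    return fpkm * transcript_length_bp * total_mapped_fragments / 1e9
```

For example, an FPKM of 10 for a 2,000 bp transcript in a library with 5 million mapped fragments corresponds to roughly 10 x 2 x 5 = 100 fragments.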
