I am trying to scramble the pbs_python egg for my Galaxy installation. When
I execute the following command:
"LIBTORQUE_DIR=/usr/local/lib python scripts/scramble.py pbs_python"
I get lots of messages like the following:
scramble(): Egg already exists, remove to force rebuild:
but there is no message about pbs_python, and no pbs_python egg is
generated in the eggs directory.
---------- Forwarded message ----------
From: YOGESH OSTWAL <yogeshfreebird(a)gmail.com>
Date: Mon, Jul 11, 2011 at 1:34 PM
Subject: Re: [galaxy-user] Hi
To: Ido Tamir <tamir(a)imp.ac.at>
Thanks a lot.
Sorry for disturbing you again.
Before starting with the actual data, can I try this analysis with IP and
input files from Illumina datasets that are already available in an NGS repository?
On Mon, Jul 11, 2011 at 12:29 PM, Ido Tamir <tamir(a)imp.ac.at> wrote:
> On Jul 10, 2011, at 8:00 AM, YOGESH OSTWAL wrote:
> > Dear Galaxy users,
> > This is Yogesh, a new Galaxy user who is also very new to programming. Can
> > anybody tell me where to start learning ChIP-Seq analysis?
> Maybe with Galaxy you don't have to program.
> It's difficult to help you without knowing what your input data is.
> If you have one IP file and one Input File from a TF binding experiment
> from an Illumina machine
> you have to:
> 0: have a look at the screencasts (galactic quickies) for some of the
> 1. upload the data. (Get Data section)
> 2. Do some quality statistics (FASTX-Toolkit for FASTQ data): Compute
> Quality Statistics -> Draw ...
> 3. Map the input data files (NGS TOOLBOX BETA: Map with Bowtie)
> 4. call the peaks with e.g. MACS (also NGS toolbox).
> 5. visualize peaks and raw data in a genome browser (e.g UCSC, IGB or
> then it gets more difficult with annotating the peaks etc...
When I use the NGS Picard tool "SAM/BAM alignment summary metrics" with a BAM file produced with Tophat in Galaxy I get the following message:
INFO:root:## executing java -Xmx4g -jar /galaxy/home/g2main/galaxy_main/tool-data/shared/jars/CollectAlignmentSummaryMetrics.jar VALIDATION_STRINGENCY=LENIENT ASSUME_SORTED=true ADAPTER_SEQUENCE= IS_BISULFITE_SEQUENCED=false MAX_INSERT_SIZE=1000 OUTPUT=/galaxy/main_pool/pool5/tmp/job_working_directory/2368774/dataset_2672529_files/CollectAlignmentSummaryMetrics.metrics.txt R=/galaxy/main_pool/pool5/tmp/job_working_directory/2368774/dataset_2672529_files/mm9.fa_fake.fasta TMP_DIR=/tmp INPUT=/galaxy/main_database/files/002/666/dataset_2666199.dat returned status 1 and stderr:
[Mon Jul 11 03:27:56 EDT 2011] net.sf.picard.analysis.CollectAlignmentSummaryMetrics MAX_INSERT_SIZE=1000 ADAPTER_SEQUENCE=[AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG, AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG, AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNNNATCTCGTATGCCGTCTTCTGCTTG, IS_BISULFITE_SEQUENCED=false] INPUT=/galaxy/main_database/files/002/666/dataset_2666199.dat OUTPUT=/galaxy/main_pool/pool5/tmp/job_working_directory/2368774/dataset_2672529_files/CollectAlignmentSummaryMetrics.metrics.txt REFERENCE_SEQUENCE=/galaxy/main_pool/pool5/tmp/job_working_directory/2368774/dataset_2672529_files/mm9.fa_fake.fasta ASSUME_SORTED=true TMP_DIR=/tmp VALIDATION_STRINGENCY=LENIENT IS_BISULFITE_SEQUENCED=false STOP_AFTER=0 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
[Mon Jul 11 03:27:57 EDT 2011] net.sf.picard.analysis.CollectAlignmentSummaryMetrics done.
Exception in thread "main" java.lang.IllegalArgumentException: No enum const class net.sf.samtools.SAMFileHeader$SortOrder.sorted
If I use NGS: Picard BAM Index Statistics, the tool runs without any failure, so I think the BAM format should be okay.
Any help would be gratefully received.
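A note on the Picard error above: the exception says that the BAM header declares a sort order of `sorted`, which Picard's `SortOrder` enum does not accept. The SAM specification allows `unknown`, `unsorted`, `queryname` and `coordinate` in the `SO` field of the `@HD` line. The following is a minimal sketch, not a confirmed fix. It assumes the header has been dumped to text (for example with `samtools view -H`), that the file really is coordinate-sorted, and that the corrected header is written back with something like `samtools reheader`. The function name `fix_sort_order` is hypothetical.

```python
# Sort orders permitted by the SAM specification for the @HD SO field.
VALID_SORT_ORDERS = {"unknown", "unsorted", "queryname", "coordinate"}


def fix_sort_order(header_text, replacement="coordinate"):
    """Return header_text with an invalid SO value on the @HD line replaced.

    Assumption: the caller knows the true sort order of the file; this
    function only edits the header text and does not sort anything.
    """
    if replacement not in VALID_SORT_ORDERS:
        raise ValueError("replacement must be a valid SAM sort order")

    fixed_lines = []
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            fields = line.split("\t")
            # Rewrite only an SO field whose value is not a valid sort order.
            fields = [
                "SO:" + replacement
                if field.startswith("SO:") and field[3:] not in VALID_SORT_ORDERS
                else field
                for field in fields
            ]
            line = "\t".join(fields)
        fixed_lines.append(line)

    trailing_newline = "\n" if header_text.endswith("\n") else ""
    return "\n".join(fixed_lines) + trailing_newline
```

Headers that already carry a valid sort order pass through unchanged, so the function is safe to apply to every file in a batch.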
It is my pleasure to announce that Galaxy has a new wiki:
http://galaxyproject.org/wiki. This wiki contains all the content from the
old bitbucket wiki, plus a bunch of new content (most of which is still work
The new wiki is based on MoinMoin and includes several new or improved
* an automatically generated list of all pages (click on All Pages in
* the ability to upload files and images without using Mercurial
* plus a lot more
The content, organization, and look and feel haven't entirely settled yet,
so expect things to move around for a bit.
You don't need any special knowledge to read the wiki. If you want to
update the wiki you'll need to create a login (click on the Login link).
Anyone can create a login, but you will need to answer a random (but
hopefully easy) question about Galaxy to do so*. You can use either
MoinMoin markup or Creole markup (but not both on the same page).
We hope that the new wiki will be much easier both to use and to
update than the old one. If you have any questions or comments, please send
them to me or to the list as appropriate.
Look for more emails as more features in the wiki become fully functional.
* And you will be asked to answer questions every time you update pages. If
you get tired of this (and you will), please send me your wiki login and I
will make those annoying (but spam-preventing) questions go away.
I have a workflow where I'd like the user to input a value once, say a number of nucleotides. That value would then be used as an input parameter to several different tasks, for example, to two instances of "Operate on Genomic Intervals > Get flanks", where it would be used both for the "offset" and for the "length of flanking region(s)" in one instance, and its value and its *negative* would be used for the second instance.
Thus, the user inputs "20", and Get_flanks(20,20) and Get_flanks(-20,20) get run.
For this workflow, it's important that those parameters all be of the same magnitude, or things will get messy later, so I don't want the user having to input them separately, or to have to remember which one gets negated...
All suggestions welcome,
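The derivation described in the workflow question above can be sketched as follows. This is only an illustration of how one user-supplied value would yield both parameter sets; it does not show a Galaxy workflow feature. The function name `flank_parameters` and the (offset, length) tuple layout are assumptions taken from the Get_flanks(20,20) and Get_flanks(-20,20) notation in the message.

```python
def flank_parameters(n):
    """Return the (offset, length) pairs for the two "Get flanks" steps.

    A single positive value n yields (n, n) for the first step and
    (-n, n) for the second, so both steps always share one magnitude.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of nucleotides")
    return [(n, n), (-n, n)]


if __name__ == "__main__":
    for offset, length in flank_parameters(20):
        print("Get_flanks(%d,%d)" % (offset, length))
```

Deriving both pairs in one place means the user never has to remember which value is negated.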
This may appear as a very basic question.
I: In the past I have used "name" as the object that allows the argument
reference to be passed from the XML to the Rscript running in the background.
However, I am currently using a conditional tag, which makes this
slightly incomprehensible. My command-line argument for Rscript is given in
the <command> tag. I have also tried to define the Rscript reference by using
the XML filter tags "key" or "ref", but perhaps this is the wrong way to go.
Argument $geoID is a text reference that the user writes in. This ID is
read into Rscript and within the script, I access the data using R package
Argument $input_cel allows the user to upload a CEL file
Argument $input_exprs allows the user to upload a tab-delimited text file
The code is given below. The error I get in the Galaxy interface is "NotFound:
cannot find 'geoID'"
<tool id="testtool" name="TEST">
<description> xyz </description>
<command>ppgalaxy.r $input $geoID $input_cel $input_exprs $platform
$species $exptRecords_dist $exptRecords_consensus
<param name="input" type="select" label="User Data Source"/>
<param name="input_type" type="select" data_key="input" label="User
<option value="GEO_data" selected="true">GSM ID</option>
<option value="cel.file" >CEL file Upload</option>
<option value="data.exprs" >Expression Vector Upload</option>
<param name="geoID" label="GEO id by GSM" type="text" area="TRUE"
<param name="input_cel" type="hidden" label="CEL file"
<param name="input_exprs" type="hidden" format="tabular"
label="Expression file" default="0"/>
II. I nest a param tag within another param tag. My code is given below. The
problem is that I should be able to read in three arguments: (1) input_exprs,
the data file; (2) the platform name; (3) the selected species. In the GUI, the
platform and species are not visible.
<param name="geoID" type="hidden" label="GEO id by GSM"
<param name="input_cel" type="hidden" label="CEL file" default="0"/>
<param name="input_exprs" format="tabular" label="Expression file"
<options name="Platform by GPL" type="text" size="7"
<label>Platform Input - GPL </label>
<param name="species" type="select" format="text">
<option value="human">HOMO SAPIENS</option>
<option value="mouse">MUS MUSCULUS</option>
Any help or suggestion is much appreciated.
I noticed that Galaxy can parse a qual file and output various statistics, which is said to be based on the FASTX-Toolkit. Yet fastx_quality_stats in the FASTX-Toolkit doesn't work on SOLiD qual files. Does anyone know the trick? Thanks.
I am having trouble replicating the great output you get after running samtools
"Filter pileup on coverage & snps with ten columns (with consensus)" when I try
to run samtools locally on my computer.
Unfortunately we are unable to use Galaxy with our new data as our files are too
large to upload to the website.
Do you use some other scripts in the background to get such an informative
When I run the samtools commands
samtools pileup -i -vcf RefSeq.fa aln_sorted.bam > aln_ivcf.pileup
samtools.pl varFilter aln_ivcf.pileup | awk '($3=="*" && $6>=20 && $7>=20 &&
$8>=10)' > final_aln_ivcf.pileup
I do not get useful information in the output that tells me how many reads are
calling the alternative allele, and what the alternative allele is.
Any help would be gratefully received.
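A note on the pileup question above: the awk condition as written keeps only lines whose third column is `*`. As far as I know, those are the indel lines in the consensus pileup format, so SNP lines are dropped. For SNP lines, the number of reads supporting each allele can be recovered from the read-bases column (column 9 in the ten-column format, to the best of my knowledge). The following is a minimal sketch, assuming the usual pileup encoding: `.` and `,` for reference matches, letters for mismatches, `^` plus one character for a read start, `$` for a read end, and `+N`/`-N` followed by N bases for indels. The function name `count_alleles` is hypothetical, and this is not the script Galaxy uses.

```python
from collections import Counter


def count_alleles(read_bases):
    """Count reference and mismatch observations in a pileup read-bases string.

    Returns a Counter with the key "ref" for matches and upper-case base
    letters for mismatches. Indel annotations, read start/end markers,
    deletion placeholders and reference skips are not counted.
    """
    counts = Counter()
    i = 0
    n = len(read_bases)
    while i < n:
        c = read_bases[i]
        if c == "^":
            # Read start: skip the marker and the mapping-quality character.
            i += 2
        elif c in "+-":
            # Indel: skip the sign, the length digits and the indel sequence.
            j = i + 1
            while j < n and read_bases[j].isdigit():
                j += 1
            indel_length = int(read_bases[i + 1:j])
            i = j + indel_length
        else:
            if c in ".,":
                counts["ref"] += 1
            elif c.upper() in "ACGTN":
                counts[c.upper()] += 1
            # '$', '*', '<' and '>' carry no base call and are ignored.
            i += 1
    return counts
```

Applied to column 9 of each SNP line, the non-"ref" keys give the alternative alleles and their supporting read counts, with forward- and reverse-strand observations merged.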
Is there any way to find out the number of reads aligning to a transcript
rather than the FPKM calculated by Cufflinks?
I'm also interested in obtaining summary statistics for mapping analyses of
RNA-Seq data, such as % of reads aligned, % uniquely aligned, mapped to
exons, introns, etc. Is there a tool that would provide a summary table
with this information?
Thanks for your help and for providing this great resource,
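The counts asked about above can be approximated directly from a SAM file. The following is a minimal sketch, assuming plain-text SAM input in which bit 0x4 of the FLAG field marks an unmapped record and bit 0x100 marks a secondary alignment, as defined in the SAM specification. It counts alignment records rather than fragments, so the two mates of a pair count separately. It reports counts per reference sequence, which are counts per transcript only if the reads were aligned to a transcriptome. The function name `summarize_sam` is hypothetical, and this is not an existing Galaxy tool.

```python
from collections import Counter


def summarize_sam(lines):
    """Summarize mapped records from an iterable of SAM text lines."""
    total = 0
    mapped = 0
    per_reference = Counter()
    for line in lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x100:
            # Secondary alignment: skip so a read is not counted twice.
            continue
        total += 1
        if not flag & 0x4:
            mapped += 1
            per_reference[fields[2]] += 1
    percent_mapped = 100.0 * mapped / total if total else 0.0
    return {
        "total": total,
        "mapped": mapped,
        "percent_mapped": percent_mapped,
        "per_reference": per_reference,
    }
```

A BAM file would first need converting to SAM text, for example with `samtools view -h`. Breaking the counts down by exon or intron would additionally need an annotation file, which this sketch does not handle.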