Data for hg17 is not yet available for this toolset. You may still be able to use this tool if you upload your own FASTA genome file. Please be sure to respect the GATK genome ordering rules and reorder your BAM file if necessary: http://www.broadinstitute.org/gsa/wiki/index.php/Preparing_the_essential_...
However, because these tools are still experimental and under development, we recommend that you remap your sequences against the "hg_g1k_v37" genome if you want to use them at this time.
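Before reordering a BAM file, it can help to see whether its contig order already agrees with the reference. The following is a minimal sketch, not part of Galaxy or GATK, and it assumes the ordering rule in question is that the @SQ lines of the BAM header list contigs in the same order as the reference FASTA. It works on plain text: the contents of a FASTA index (.fai, as written by "samtools faidx") and a header dump (as printed by "samtools view -H"). The function names are hypothetical.

```python
def contigs_from_fai(fai_text):
    """Return contig names in the order they appear in a .fai index."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def contigs_from_sam_header(header_text):
    """Return contig names from the @SQ lines of a SAM/BAM header, in order."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def first_order_mismatch(reference, alignment):
    """Return the index of the first position where the lists differ, or None."""
    for i, (ref_name, aln_name) in enumerate(zip(reference, alignment)):
        if ref_name != aln_name:
            return i
    if len(reference) != len(alignment):
        # One list is a strict prefix of the other.
        return min(len(reference), len(alignment))
    return None
```

If first_order_mismatch returns None, the two orderings agree under this assumption; otherwise the returned index points at the first contig that is out of place.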
Also, please keep all replies on the mailing list.
Thanks for using Galaxy,
On Aug 18, 2011, at 1:33 PM, Hong, Xiaojing wrote:
> Hi, Dan
> I did set the genome build to HG17 but I still can’t see the select box.
> Xiaojing Hong
> Ph.D candidate
> Department of Biology
> Manak Lab
> 455 Biology Building
> University of Iowa
> Iowa City, IA 52242
> (P) 319-335-0266
> From: Daniel Blankenberg [mailto:email@example.com]
> Sent: Thursday, August 18, 2011 12:19 PM
> To: Hong, Xiaojing
> Cc: galaxy-user(a)lists.bx.psu.edu
> Subject: Re: [galaxy-user] question about the GATK tools
> Hi Xiaojing,
> You'll need to make sure that the dbkey of your input BAM file is set to a genome build that has data available. Click the pencil icon to set the genome build. If you have set the genome build but still have an empty select box, then data may not be available for this build/tool combination; you can request that the needed data be added for this tool. If your reference genome is not available, you can also upload a FASTA file containing the genome and access it directly from the history.
> Thanks for using Galaxy,
> On Aug 17, 2011, at 11:26 AM, Hong, Xiaojing wrote:
> I just uploaded the BAM file for an exome sequencing sample and was trying to use the GATK tools. In the first step, Realigner Target Creator, I can see my uploaded file, but I can’t see any options under “using reference genome” or the choices that follow, so I can’t click Execute. Did I do anything wrong? Thanks!
> Xiaojing Hong
> University of Iowa
> The Galaxy User list should be used for the discussion of
> Galaxy analysis and other features on the public server
> at usegalaxy.org. Please keep all replies on the list by
> using "reply all" in your mail client. For discussion of
> local Galaxy instances and the Galaxy source code, please
> use the Galaxy Development list:
> To manage your subscriptions to this and other Galaxy lists,
> please use the interface at:
I just uploaded the BAM file for an exome sequencing sample and was trying to use the GATK tools. In the first step, Realigner Target Creator, I can see my uploaded file, but I can't see any options under "using reference genome" or the choices that follow, so I can't click Execute. Did I do anything wrong? Thanks!
University of Iowa
I'm trying to determine if Galaxy will work with Xgrid to set up a small
cluster of Macs for next-gen sequencing projects. I saw that
Galaxy supports Torque PBS or Sun Grid. I don't think that Sun Grid
runs on Snow Leopard and I don't know about Torque. Snow Leopard already
comes with Xgrid so I was hoping to use it.
Thank you for your help,
University of Pittsburgh
If you provide a gene annotation to Cufflinks, the transcripts
produced will match those in the annotation exactly. If you assemble
without a gene annotation, the transcripts produced will match the
reference in some cases, but, in others, will not match the reference
due to small and/or large errors. Because '=' denotes an exact match
between an assembled transcript and a reference transcript, more '='
are to be expected when Cufflinks has a gene annotation.
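To see how the class codes are distributed in each approach, the tracking files can be tallied directly. This is a minimal sketch, assuming the transcript FPKM tracking file is tab-delimited with a header row that includes a "class_code" column; the function name is hypothetical.

```python
import csv
from collections import Counter


def count_class_codes(lines):
    """Count class_code values in a tab-delimited tracking table with a header row.

    `lines` can be an open file handle or any iterable of text lines.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    return Counter(row["class_code"] for row in reader)
```

For a file on disk, pass an open handle, for example count_class_codes(open(path, newline="")), and compare the count stored under "=" between the two runs.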
Finally, a couple of procedural issues:
*please send questions about analyses and tool usage to the galaxy-
user mailing list, not galaxy-dev or individual developers;
*please do not send duplicate emails as it can confuse our tracking
system and slow down our response rather than speed it up.
On Aug 17, 2011, at 9:14 AM, Crystal Goh wrote:
> Hi, I am Crystal. I have a problem with Cuffdiff output and hope to
> get some advice. Thanks.
> After aligning RNA-seq reads with Tophat, I used the Tophat output
> for Cufflinks.
> For Cufflinks, I tried two approaches and compared the results:
> 1st approach: Put zebrafish Ensembl GTF as reference annotation
> 2nd approach: without reference annotation.
> From the output of the above 2 approaches, I continued with Cuffcompare
> (with reference annotation) and Cuffdiff.
> The attached Word document shows the workflow and parameters I set for
> these 2 approaches.
> When I compared the output of Cuffdiff between these 2 approaches, a
> total of 48584 tracking IDs with class code '=' were observed in the
> transcript FPKM tracking file from Approach 1, whereas there are only
> 1248 tracking IDs with class code '=' from Approach 2 (I attached the
> transcript FPKM tracking files from Approaches 1 and 2).
> In my opinion, I should observe 48584 tracking IDs with class code
> '=' plus additional tracking IDs with other class codes in the transcript
> FPKM tracking file from Approach 2.
> Can I get advice on this?
> Thank you.
> Best regards,
> <Workflow and parameter for 2 approaches.zip><Approach 1 Transcript
> FPKM tracking (Cufflinks with reference annotation).zip><Approach 2
> Transcript FPKM tracking (Cufflinks without reference
I am trying to map fastq files using bowtie and bwa and the job will not
run. It has remained gray for almost 24 hours now. I am able to do other
operations. I tried aligning to both a genome from history and hg18, same
I have a BED file with reads mapped on the genome. I filtered it to keep
just chr1, and once I had the filtered file, I ran MACS on it to call peaks
(by the way, these are mapped reads for a histone modification). The file was
generated, but when I tried to view it in the UCSC Genome Browser I got an error message:
*Error 500: Internal Server Error*
Is this related to Galaxy or the UCSC Genome Browser? I tried the experiment 4 times and it
is always the case.
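For reference, the chr1 filtering step described above can be reproduced outside Galaxy to confirm that the filtered file contains only the expected lines. This is a minimal sketch with a hypothetical function name; it assumes the chromosome name is the first whitespace-delimited field on each line.

```python
def filter_bed_by_chrom(lines, chrom="chr1"):
    """Yield only the lines whose first field is exactly `chrom`.

    An exact match is used so that "chr1" does not also select "chr10".
    Header lines such as "track ..." and blank lines are dropped as well.
    """
    for line in lines:
        fields = line.split(None, 1)
        if fields and fields[0] == chrom:
            yield line
```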
Bioinformatics Postdoctoral Research Scientist
Institute for Advanced Computer Studies
Center for Bioinformatics and Computational Biology (CBCB)
University of Maryland, College Park
So, I'm trying to set up a smooth pathway for our users to import their
NGS data into Galaxy. I'd hoped for a solution that would require zero
interaction with an administrator, but things seem more awkward than
they should be:
1. We can automagically push produced NGS data into a user import
directory on the host server. No problem.
2. The user can upload the data via "Shared Data / Data Libraries
/ <name of data library> / Add datasets / Upload directory of files"
3. _But_ this data library must be created by an administrator, who must
then assign appropriate permissions (access and add) for the appropriate
Have I got this right? I suppose that the last step only needs to be
done once (create an "import library" for every user that I expect to
want to import), but it seems a little fiddlier than expected. Is there
an easier way, or have I missed something?
Paul Agapow (paul-michael.agapow(a)hpa.org.uk)
Bioinformatics, Centre for Infections, Health Protection Agency
Recently, I ran Cufflinks in Galaxy online. I want to compare two
samples. However, I found that no transcript or gene passed the significance
level, even though many of them have a large FPKM in one sample and 0 FPKM in the other.
Below is my Cufflinks process:
I have four samples belonging to two groups: the test group has three samples, and
the control group has one sample.
First, using accepted_hits.bam from TopHat, I ran Cufflinks without annotation
on each sample.
Then, for the four GTF files from the four samples, I ran Cuffcompare to
combine the transcripts and compare them to the genome annotation. However, at
this step, I found that the transcript accuracy is very low.
See one example:
Missed exons: 10673/11776 ( 90.6%)
Wrong exons: 1254/2007 ( 62.5%)
Missed introns: 8529/8637 ( 98.7%)
Wrong introns: 2/5 ( 40.0%)
Missed loci: 0/504 ( 0.0%)
Wrong loci: 1248/2002 ( 62.3%)
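The percentages in these lines follow directly from the count/total pairs, and they can be recomputed when comparing several runs. This is a minimal sketch that assumes each line has the "label: count/total" shape shown above; the names are hypothetical.

```python
import re

# Matches lines such as "Missed exons: 10673/11776 ( 90.6%)".
STAT_LINE = re.compile(r"^\s*(?P<label>[^:]+):\s*(?P<count>\d+)\s*/\s*(?P<total>\d+)")


def parse_stat(line):
    """Return (label, count, total, percent) for one stats line, or None."""
    match = STAT_LINE.match(line)
    if match is None:
        return None
    count = int(match.group("count"))
    total = int(match.group("total"))
    # Guard against a zero denominator.
    percent = 100.0 * count / total if total else 0.0
    return match.group("label").strip(), count, total, percent
```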
Finally, I ran Cuffdiff between the two sample groups.