Thanks, Daniel, for the reply. I got the hg19 file from the GATK bundle. After your reply I realigned the FASTQ with BWA against the same hg19 I was using from GATK. The error log follows. Please advise.

Thanks,
Umar

INFO 08:47:28,627 HelpFormatter - ---------------------------------------------------------------------------------
INFO 08:47:28,630 HelpFormatter - The Genome Analysis Toolkit (GATK) v1.4-18-g80a4ce0, Compiled 2012/01/23 15:33:58
INFO 08:47:28,630 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 08:47:28,630 HelpFormatter - Please view our documentation at http://www.broadinstitute.org/gsa/wiki
INFO 08:47:28,630 HelpFormatter - For support, please view our support site at http://getsatisfaction.com/gsa
INFO 08:47:28,631 HelpFormatter - Program Args: -T CountCovariates --num_threads 4 -et NO_ET --recal_file /galaxy/main_pool/pool3/tmp/job_working_directory/004/779/4779763/galaxy_dataset_5444683.dat --standard_covs --run_without_dbsnp_potentially_ruining_quality -I /space/g2main/tmp-gatk-Ib7JbA/gatk_input.bam -R /space/g2main/tmp-gatk-Ib7JbA/gatk_input.fasta
INFO 08:47:28,631 HelpFormatter - Date/Time: 2012/12/13 08:47:28
INFO 08:47:28,631 HelpFormatter - ---------------------------------------------------------------------------------
INFO 08:47:28,632 HelpFormatter - ---------------------------------------------------------------------------------
INFO 08:47:28,647 GenomeAnalysisEngine - Strictness is SILENT
INFO 08:47:28,667 ReferenceDataSource - Index file /space/g2main/tmp-gatk-Ib7JbA/gatk_input.fasta.fai does not exist. Trying to create it now.
PROGRESS UPDATE: file is 7 percent complete
PROGRESS UPDATE: file is 15 percent complete
PROGRESS UPDATE: file is 22 percent complete
PROGRESS UPDATE: file is 28 percent complete
PROGRESS UPDATE: file is 33 percent complete
PROGRESS UPDATE: file is 39 percent complete
PROGRESS UPDATE: file is 44 percent complete
PROGRESS UPDATE: file is 49 percent complete
PROGRESS UPDATE: file is 53 percent complete
PROGRESS UPDATE: file is 57 percent complete
PROGRESS UPDATE: file is 62 percent complete
PROGRESS UPDATE: file is 66 percent complete
PROGRESS UPDATE: file is 73 percent complete
PROGRESS UPDATE: file is 79 percent complete
PROGRESS UPDATE: file is 86 percent complete
PROGRESS UPDATE: file is 91 percent complete
PROGRESS UPDATE: file is 96 percent complete
INFO 08:55:15,430 ReferenceDataSource - Dict file /space/g2main/tmp-gatk-Ib7JbA/gatk_input.dict does not exist. Trying to create it now.
INFO 08:56:07,262 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 08:56:07,336 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.06

________________________________________
From: Daniel Blankenberg [dan@bx.psu.edu]
Sent: Wednesday, December 12, 2012 4:29 PM
To: Farooq,Umar (res)
Cc: galaxy-user@lists.bx.psu.edu
Subject: Re: [galaxy-user] GATK Not running

Hi Umar,

Can you click the eye icon to view the contents of the 'log' dataset for the GATK run. The end of the log should have the actual error encountered (the text you provided is a bit of a red herring)

Since you are using hg19, the most likely cause for the error is that the reference fasta file you are using is not ordered properly, or that your alignments were made using a different genome (e.g. alignment with bwa using built-in hg19 [not ordered properly] and then GATK using a different hg19 fasta from your history.)
If you are using a custom genome, make sure that it is GATK-ordered and that the same one is used in all steps; there is an hg19 GATK-ordered fasta file available in a Data library ('GATK') on Main.

Thanks for using Galaxy,

Dan

On Dec 11, 2012, at 12:11 PM, Farooq,Umar (res) wrote:
Hi,
I am trying to incorporate GATK into my pipeline but have not been able to make it work. I aligned my data to hg19, then ran the SAMtools filter and then Picard duplicate removal. I uploaded dbSNP and the reference FASTA file for hg19 to Galaxy to run this pipeline. But for some reason the GATK base recalibration tool will not accept this output file. I wonder if there is a sorting or indexing issue and, if so, how to fix it in Galaxy.
An error occurred running this job:
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/space/g2main
[Mon Dec 10 10:30:42 EST 2012] net.sf.picard.sam.CreateSequenceDictionary REFERENCE=/space/g2main/tmp-gatk-tKp41A/gatk_input.fasta OUTPUT=/space/g2main/tmp-gatk-tKp41A/dict3503196447953523717.tmp
Thanks, Umar
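
Dan's point in the reply above, that the alignments and the reference FASTA may disagree on contig order, can be checked outside Galaxy by comparing the @SQ lines of the BAM header with the FASTA index. The following is only a rough sketch under stated assumptions: the header text is assumed to come from something like `samtools view -H` and the index text from the `.fai` file written by `samtools faidx`; the function names and the toy data are made up for illustration and are not part of Galaxy or GATK.

```python
def contigs_from_fai(fai_text):
    """Return [(name, length), ...] in file order from FASTA index (.fai) text."""
    contigs = []
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        # Columns 1 and 2 of a .fai line are the contig name and its length.
        contigs.append((fields[0], int(fields[1])))
    return contigs


def contigs_from_sam_header(header_text):
    """Return [(name, length), ...] in header order from the @SQ lines."""
    contigs = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:value.
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        contigs.append((tags["SN"], int(tags["LN"])))
    return contigs


def first_mismatch(ref_contigs, bam_contigs):
    """Return the index of the first differing contig, or None if identical."""
    for i, (ref, bam) in enumerate(zip(ref_contigs, bam_contigs)):
        if ref != bam:
            return i
    if len(ref_contigs) != len(bam_contigs):
        return min(len(ref_contigs), len(bam_contigs))
    return None


if __name__ == "__main__":
    # Toy data only: two contigs, with made-up offset columns in the .fai text.
    fai = "chrM\t16571\t6\t50\t51\nchr1\t249250621\t16915\t50\t51\n"
    header = (
        "@HD\tVN:1.0\tSO:coordinate\n"
        "@SQ\tSN:chr1\tLN:249250621\n"
        "@SQ\tSN:chrM\tLN:16571\n"
    )
    idx = first_mismatch(contigs_from_fai(fai), contigs_from_sam_header(header))
    if idx is None:
        print("Contig names, lengths and order match.")
    else:
        print("First difference at contig position", idx)
```

If the two lists differ, the alignments were most likely made against a different or differently ordered hg19 than the FASTA passed to GATK, which is the situation Dan describes.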