puzzle: tool result always "set dataset state to ERROR"
Hi Guys,

I'm running a very simple test of DESeq2, and one weird thing keeps happening: the deseq2 tool appears to execute just fine, without any error, and the result files are created, but after set_metadata, Galaxy always reports 'set dataset state to ERROR'.

The XML file for this test is:

<tool id="deseq2 test" name="DESeq2" version="2">
  <description>Determines differentially expressed transcripts from read alignments</description>
  <command> t.sh $input1 $test $out $log </command>
  <inputs>
    <param format="txt" name="input1" type="data" label="Quant"/>
    <param format="txt" name="input2" type="data" label="Conditions"/>
    <param name="test" type="select" label="please choose control condition">
      <options from_dataset="input2">
        <column name="value" index="0"/>
      </options>
    </param>
  </inputs>
  <outputs>
    <data format="txt" name="out" label="DESeq result"/>
    <data format="txt" name="log" label="DESeq log file"/>
  </outputs>
</tool>

Basically, we have an input file from Partek Flow (input1). input2 is one column from input1 that lists all the conditions; in our test there are three: CTC, LM, and PT. The input named "test" is the dropdown list containing all three conditions, from which we choose one as the control condition; in our case it is CTC.

t.sh is very simple; it just calls the R script:

Rscript /home/bioinfo/app/galaxy-dist/tools/Deseq/workflow.R $1 $2 $3 $4

In workflow.R, the relevant output part is:

for (i in 2:3) {
  res <- results(dds, contrasts[i])
  ## sort the result table by FDR
  res <- res[order(res$padj),]
  ## Output the results
  write.table(as.data.frame(res), file=paste0(args[i+1]), sep = "\t")
}

So the result files are generated as expected, with the correct information, yet the dataset is always set to state ERROR. Am I missing something? Has anyone seen this before?

Any input will be greatly appreciated!

Thanks,
Rui
On Sun, Sep 29, 2013 at 8:47 PM, ruiwang.sz <ruiwang.sz@gmail.com> wrote:
Hi Rui,

Was anything written to stderr, like a warning from R itself? You should be able to check via the "i" icon of the dataset in the history view.

By default, Galaxy treats any output on stderr as an error, see the <stdio> tag information here: http://wiki.galaxyproject.org/Admin/Tools/ToolConfigSyntax

Peter
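To make that default rule concrete, here is a minimal shell sketch of the idea: run a command, capture its stderr separately, and flag the run as failed if anything at all was written there, whatever the exit status. The helper name `stderr_is_error` and the two stand-in commands are hypothetical. This only imitates the behaviour described above and is not Galaxy's actual code. It is relevant to R because `message()` writes to stderr, so ordinary progress output is enough to trip the check.

```shell
#!/usr/bin/env bash
# Hypothetical helper: imitate "any output on stderr means error".
stderr_is_error() {
    local err_file status
    err_file=$(mktemp)
    # Run the given command; stdout passes through, stderr goes to a file.
    "$@" 2>"$err_file"
    status=$?
    if [ -s "$err_file" ]; then
        # Something was written to stderr: report failure even if status is 0.
        rm -f "$err_file"
        return 1
    fi
    rm -f "$err_file"
    return "$status"
}

# Stand-in for a tool that succeeds but prints a progress message to stderr.
noisy_ok() { echo "estimating size factors" >&2; echo "result"; }

# Stand-in for a tool that succeeds silently.
quiet_ok() { echo "result"; }

if stderr_is_error noisy_ok >/dev/null; then echo "noisy: ok"; else echo "noisy: ERROR"; fi
if stderr_is_error quiet_ok >/dev/null; then echo "quiet: ok"; else echo "quiet: ERROR"; fi
```

Run as written, this prints "noisy: ERROR" followed by "quiet: ok", even though both stand-in commands exit with status 0.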
Hi Peter,

Thanks for the reply.

I checked the history view. I don't see anything when I click the "i" icon and then the stderr link. However, here is what I see if I click the little bug icon next to the "i":

Tool execution generated the following error message:

Loading required package: GenomicRanges
Loading required package: methods
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from ‘package:stats’:

xtabs

The following objects are masked from ‘package:base’:

anyDuplicated, as.data.frame, cbind, colnames, duplicated, eval, Filter, Find, get, intersect, lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rep.int, rownames, sapply, setdiff, sort, table, tapply, union, unique, unlist

Loading required package: IRanges
Loading required package: Biobase
Welcome to Bioconductor

Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see 'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: lattice
Loading required package: Rcpp
Loading required package: RcppArmadillo
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting generalized linear model

All of these are the normal messages from loading the DESeq2 package in R and running some of its functions. After I used

suppressMessages(library("DESeq2"))

to suppress the library loading messages, the following lines now appear when I click "i" and then "stderr":

estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting generalized linear model

I guess I need to suppress them as well, but how come they are on stderr and not stdout?

Thanks,
Rui

On Sun, Sep 29, 2013 at 3:36 PM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
Hi Peter,

Thanks! I got it working by putting a suppressMessages() around the function call.

I might have some questions on how to generate output file names on the fly...let me make some attempts first. :-)

Best,
Rui

On Sun, Sep 29, 2013 at 10:54 PM, ruiwang.sz <ruiwang.sz@gmail.com> wrote:
On Mon, Sep 30, 2013 at 7:07 AM, ruiwang.sz <ruiwang.sz@gmail.com> wrote:
Hi Peter,
Thanks! I got it working by putting a suppressMessages() around the function call.
Great, but what I meant was to just add this to your XML file to tell Galaxy to ignore stderr and use only the return code to decide whether there was an error. This way you still get to see any warnings from your tool:

<stdio>
  <!-- Anything other than zero is an error -->
  <exit_code range="1:" />
  <exit_code range=":-1" />
</stdio>
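Once the exit code is the only error signal, the wrapper script has to hand the tool's real exit status back to Galaxy. A rough sketch of that, assuming a wrapper in the spirit of t.sh: the function name `run_and_report` and the `sh -c` stand-in commands are hypothetical, used here in place of the Rscript call so the example runs anywhere.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command and return its exit status unchanged,
# even if extra logging or cleanup steps run afterwards.
run_and_report() {
    local status
    "$@"
    status=$?
    # Any later command would overwrite $?, so the status is saved first.
    echo "tool finished with exit code $status" >&2
    return "$status"
}

# Warnings on stderr with exit code 0: counts as success under exit-code rules.
if run_and_report sh -c 'echo "warning: something minor" >&2; exit 0'; then
    echo "first run: success"
else
    echo "first run: failed ($?)"
fi

# Non-zero exit code: counts as failure.
if run_and_report sh -c 'echo "fatal: bad input" >&2; exit 3'; then
    echo "second run: success"
else
    echo "second run: failed ($?)"
fi
```

On stdout this prints "first run: success" and then "second run: failed (3)"; the warning text still reaches stderr, so it stays visible without marking the job as failed.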
I might have some questions on how to generate output file names on the fly...let me make some attempts first. :-)
Best, Rui
Please read this thread first - there are already lots of people working on wrapping deseq2 into Galaxy: http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-September/016444.html

Peter