On Thu, Apr 25, 2013 at 9:36 AM, Peter Cock <p.j.a.cock(a)googlemail.com> wrote:
On Fri, Apr 5, 2013 at 3:08 PM, Peter Cock <p.j.a.cock(a)googlemail.com> wrote:
> On Thu, Apr 4, 2013 at 7:19 PM, Daniel Blankenberg <dan(a)bx.psu.edu> wrote:
>> Hi Peter,
>>
>> What is the test error given when you do have a value defined for name in output?
>>
>>
>> Can you try using 'empty_file.dat'?
>>
>> e.g.
>>
>> <output name="out_file" file="empty_file.dat" >
>>
>> or
>>
>> <output name="out_file" file="empty_file.dat" compare="contains">
>>
>> etc
>
> Hi Daniel,
>
> That seems to help (plus fixing a typo in one of my child file extensions).
> However there is something else amiss, but my Galaxy is a little out of date:
>
> ...
>
> I will have to checkout the latest Galaxy code and retest in case this
> is something already fixed...
OK, I've updated to the latest galaxy-central default branch. Here's
the slightly revised test for ncbi_makeblastdb.xml:
<tests>
    <test>
        <!-- makeblastdb -in four_human_proteins.fasta -dbtype prot -hash_index -title "Just 4 human proteins" -out four_human_proteins.fasta -->
        <param name="dbtype" value="prot"/>
        <param name="file" value="four_human_proteins.fasta"/>
        <param name="title" value="Just 4 human proteins"/>
        <param name="parse_seqids" value=""/>
        <param name="hash_index" value="-hash_index"/>
        <output name="out_file" file="empty_file.dat" ftype="blastdbp">
            <extra_files type="file" value="four_human_proteins.fasta.phd" name="blastdb.pdb"/>
            <extra_files type="file" value="four_human_proteins.fasta.phi" name="blastdb.phi"/>
            <extra_files type="file" value="four_human_proteins.fasta.phr" name="blastdb.phr"/>
            <extra_files type="file" value="four_human_proteins.fasta.pin" name="blastdb.pin"/>
            <extra_files type="file" value="four_human_proteins.fasta.pog" name="blastdb.pog"/>
            <extra_files type="file" value="four_human_proteins.fasta.psd" name="blastdb.psd"/>
            <extra_files type="file" value="four_human_proteins.fasta.psi" name="blastdb.psi"/>
            <extra_files type="file" value="four_human_proteins.fasta.psq" name="blastdb.psq"/>
        </output>
    </test>
</tests>
Here's some of the Galaxy log when I run this example manually through the
web interface:
galaxy.jobs.handler INFO 2013-04-25 15:17:43,820 (43) Job dispatched
galaxy.jobs.runners.local DEBUG 2013-04-25 15:17:44,377 (43)
executing: makeblastdb -version &>
/mnt/galaxy/galaxy-central/database/tmp/GALAXY_VERSION_STRING_43;
makeblastdb -out
"/mnt/galaxy/galaxy-central/database/files/000/dataset_46_files/blastdb"
-hash_index -in "
/mnt/galaxy/galaxy-central/database/files/000/dataset_45.dat " -title
"Just 4 human proteins" -dbtype prot
galaxy.jobs DEBUG 2013-04-25 15:17:44,436 (43) Persisting job
destination (destination id: local:///)
galaxy.jobs.runners.local DEBUG 2013-04-25 15:17:44,751 execution
finished: makeblastdb -version &>
/mnt/galaxy/galaxy-central/database/tmp/GALAXY_VERSION_STRING_43;
makeblastdb -out
"/mnt/galaxy/galaxy-central/database/files/000/dataset_46_files/blastdb"
-hash_index -in "
/mnt/galaxy/galaxy-central/database/files/000/dataset_45.dat " -title
"Just 4 human proteins" -dbtype prot
galaxy.jobs DEBUG 2013-04-25 15:17:45,456 job 43 ended
Now some snippets from the test run (unedited version at the end of this email):
$ ./run_functional_tests.sh -id ncbi_makeblastdb
...
galaxy.jobs.handler DEBUG 2013-04-25 15:12:44,214 (2) Dispatching to
local runner
galaxy.jobs DEBUG 2013-04-25 15:12:45,599 (2) Persisting job
destination (destination id: local:///)
galaxy.jobs.handler INFO 2013-04-25 15:12:45,711 (2) Job dispatched
galaxy.jobs.runners.local DEBUG 2013-04-25 15:12:46,345 (2) executing:
makeblastdb -version &>
/tmp/tmpovUM3w/database/tmp/GALAXY_VERSION_STRING_2; makeblastdb -out
"/tmp/tmpovUM3w/database/files/000/dataset_2_files/blastdb" -in "
/tmp/tmpovUM3w/database/files/000/dataset_1.dat
/tmp/tmpovUM3w/database/files/000/dataset_1.dat " -title "Just 4
human proteins" -dbtype prot
galaxy.jobs DEBUG 2013-04-25 15:12:46,409 (2) Persisting job
destination (destination id: local:///)
galaxy.jobs.runners.local DEBUG 2013-04-25 15:12:46,897 execution
finished: makeblastdb -version &>
/tmp/tmpovUM3w/database/tmp/GALAXY_VERSION_STRING_2; makeblastdb -out
"/tmp/tmpovUM3w/database/files/000/dataset_2_files/blastdb" -in "
/tmp/tmpovUM3w/database/files/000/dataset_1.dat
/tmp/tmpovUM3w/database/files/000/dataset_1.dat " -title "Just 4
human proteins" -dbtype prot
galaxy.jobs DEBUG 2013-04-25 15:12:47,502 job 2 ended
...
As noted in my last email, for some reason when running the test case,
the input FASTA file is being included on the command line TWICE.
Curiously the -hash_index argument has been omitted. Linked maybe?
Peter,
I have fixed the double listing of the FASTA file. Putting min=1 on a
repeat statement would result in two repeat instances when using
functional tests without this bug fix.
https://bitbucket.org/galaxy/galaxy-central/commits/5e534cc8da856ad598d63...
It is likely also the problem with your mira tests? The missing
hash_index happened because the param value you put in the test tag
should be true or false, not the truevalue/falsevalue attribute
values, as far as I can tell; those are used only by Cheetah, I guess.
Adding the hash_index parameter creates an additional 5 files,
including ones you listed in your test case.
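
For illustration, here is a hedged sketch of how the two points above might look on the tool side. The tool's actual <inputs> section is not shown in this thread, so the layout, labels and format below are assumptions pieced together from the form controls in the test log (in_0|file, "Add new FASTA file", hash_index). Only the min="1" repeat and the use of value="true" in the test come from the discussion itself.

```xml
<!-- Hypothetical excerpt of a tool definition; not the real ncbi_makeblastdb.xml -->
<inputs>
    <!-- min="1" keeps at least one FASTA entry on the form; without the
         bug fix, functional tests ended up with two repeat instances -->
    <repeat name="in" title="FASTA file" min="1">
        <param name="file" type="data" format="fasta" label="Input FASTA file"/>
    </repeat>
    <!-- truevalue/falsevalue are the strings the command template sees -->
    <param name="hash_index" type="boolean" truevalue="-hash_index" falsevalue=""
           checked="false" label="Create sequence hash values"/>
</inputs>
<tests>
    <test>
        <!-- Other params omitted; the test sets the checkbox state (true or
             false), not the truevalue string -->
        <param name="hash_index" value="true"/>
    </test>
</tests>
```

With value="-hash_index" in the test, the checkbox would apparently not be set, which matches the option being absent from the logged command line.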
With these changes, I was able to write working functional tests for
your tool using the template you outlined in the Trello card. The .pin
file doesn't match; I think there is something time-based in there, so
I had to allow two lines of diff. Also, since this e-mail you now have
two parameters named file, which doesn't go over well yet, so I
renamed mask|file to mask|mask_file.
<test>
    <param name="dbtype" value="prot"/>
    <param name="file" value="four_human_proteins.fasta"/>
    <param name="title" value="Just 4 human proteins"/>
    <param name="parse_seqids" value=""/>
    <param name="hash_index" value="true"/>
    <output name="out_file" file="empty_file.dat" ftype="blastdbp">
        <extra_files type="file" value="four_human_proteins.fasta.phr" name="blastdb.phr"/>
        <extra_files type="file" value="four_human_proteins.fasta.pin" name="blastdb.pin" lines_diff="2"/> <!-- All lines different, date based? -->
        <extra_files type="file" value="four_human_proteins.fasta.psq" name="blastdb.psq"/>
        <extra_files type="file" value="four_human_proteins.fasta.pog" name="blastdb.pog"/>
        <extra_files type="file" value="four_human_proteins.fasta.phd" name="blastdb.phd"/>
        <extra_files type="file" value="four_human_proteins.fasta.phi" name="blastdb.phi"/>
        <extra_files type="file" value="four_human_proteins.fasta.psd" name="blastdb.psd"/>
        <extra_files type="file" value="four_human_proteins.fasta.psi" name="blastdb.psi"/>
    </output>
</test>
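
To make the mask|file rename concrete, here is a rough sketch of the kind of name clash involved. The tool's real <inputs> section does not appear in this thread, so everything below is assumed apart from the names file, mask and mask_file (from the message above) and the repeat name in (from the in_0|file control in the test log). In particular, modelling the mask group as a repeat is a guess.

```xml
<!-- Hypothetical excerpt; not the real ncbi_makeblastdb.xml -->
<inputs>
    <repeat name="in" title="FASTA file" min="1">
        <param name="file" type="data" format="fasta" label="Input FASTA file"/>
    </repeat>
    <repeat name="mask" title="Masking data file">
        <!-- Previously name="file", which gave the tool two parameters
             called "file"; renamed so a test can refer to each one
             unambiguously -->
        <param name="mask_file" type="data" label="Masking data"/>
    </repeat>
</inputs>
```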
These changes should work straight out of central; they do not rely on
my API-driven variant on GitHub.
I found no problems with auto_primary versus basic composite
types here, just the things listed above.
-John
And then once this has run, as before, the file comparison is hard to
fathom (it is not comparing the correct files to each other).
The example rgClean.xml which Dave Bouvier pointed me at uses
a composite datatype with a central file ('pbed' which is a subclass
of 'html') while the other examples I've found are 'html'.
i.e. composite_type='auto_primary_file'
It does seem likely at this point that I could be the first person
attempting to write a unit test for a composite datatype without a
primary file (i.e. composite_type='basic').
I'd appreciate being shown an existing working unit test using a
basic composite datatype as an output file - perhaps there is
something on the Tool Shed (which is harder to search than
the main repository where I can use grep)?
Thanks,
Peter
--
$ ./run_functional_tests.sh -id ncbi_makeblastdb
...
galaxy.jobs.handler DEBUG 2013-04-25 15:12:44,214 (2) Dispatching to
local runner
galaxy.jobs DEBUG 2013-04-25 15:12:45,599 (2) Persisting job
destination (destination id: local:///)
galaxy.jobs.handler INFO 2013-04-25 15:12:45,711 (2) Job dispatched
galaxy.jobs.runners.local DEBUG 2013-04-25 15:12:46,345 (2) executing:
makeblastdb -version &>
/tmp/tmpovUM3w/database/tmp/GALAXY_VERSION_STRING_2; makeblastdb -out
"/tmp/tmpovUM3w/database/files/000/dataset_2_files/blastdb" -in "
/tmp/tmpovUM3w/database/files/000/dataset_1.dat
/tmp/tmpovUM3w/database/files/000/dataset_1.dat " -title "Just 4
human proteins" -dbtype prot
galaxy.jobs DEBUG 2013-04-25 15:12:46,409 (2) Persisting job
destination (destination id: local:///)
galaxy.jobs.runners.local DEBUG 2013-04-25 15:12:46,897 execution
finished: makeblastdb -version &>
/tmp/tmpovUM3w/database/tmp/GALAXY_VERSION_STRING_2; makeblastdb -out
"/tmp/tmpovUM3w/database/files/000/dataset_2_files/blastdb" -in "
/tmp/tmpovUM3w/database/files/000/dataset_1.dat
/tmp/tmpovUM3w/database/files/000/dataset_1.dat " -title "Just 4
human proteins" -dbtype prot
galaxy.jobs DEBUG 2013-04-25 15:12:47,502 job 2 ended
galaxy.web.framework DEBUG 2013-04-25 15:12:47,995 This request
returned None from get_history():
http://localhost:8898/history
galaxy.web.framework DEBUG 2013-04-25 15:12:48,097 This request
returned None from get_history():
http://localhost:8898/display
base.twilltestcase INFO 2013-04-25 15:12:48,141 ## files diff on
/mnt/galaxy/galaxy-central/test-data/four_human_proteins.fasta.phd and
/tmp/tmpovUM3w/database/tmp/tmpXjFIurblastdb.pdb lines_diff=0, found
diff = 5
---------------------- >> begin tool stdout << -----------------------
Building a new DB, current time: 04/25/2013 15:12:46
New DB name: /tmp/tmpovUM3w/database/files/000/dataset_2_files/blastdb
New DB title: Just 4 human proteins
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1073741824B
Adding sequences from FASTA; added 4 sequences in 0.000900984 seconds.
Adding sequences from FASTA; added 4 sequences in 0.000420094 seconds.
----------------------- >> end tool stdout << ------------------------
---------------------- >> begin tool stderr << -----------------------
----------------------- >> end tool stderr << ------------------------
FAIL
======================================================================
FAIL: NCBI BLAST+ makeblastdb ( ncbi_makeblastdb ) > Test-1
----------------------------------------------------------------------
Traceback (most recent call last):
File "/mnt/galaxy/galaxy-central/test/functional/test_toolbox.py",
line 171, in test_tool
self.do_it( td, shed_tool_id=shed_tool_id )
File "/mnt/galaxy/galaxy-central/test/functional/test_toolbox.py",
line 102, in do_it
self.verify_dataset_correctness( outfile, hid=elem_hid,
maxseconds=testdef.maxseconds, attributes=attributes,
shed_tool_id=shed_tool_id )
File "/mnt/galaxy/galaxy-central/test/base/twilltestcase.py", line
849, in verify_dataset_correctness
raise AssertionError( errmsg )
AssertionError: History item 2 different than expected, difference (using diff):
( /mnt/galaxy/galaxy-central/test-data/empty_file.dat v.
/tmp/tmpovUM3w/database/tmp/tmpZFzfEJempty_file.dat )
Composite file (blastdb.pdb) of History item 2 different than
expected, difference (using diff):
--- local_file
+++ history_data
@@ -1,4 +1,1 @@
-11117184492
-29249033410
-36665887501
-5392473183
+This is a BLAST protein database.
-------------------- >> begin captured stdout << ---------------------
Uploaded file: four_human_proteins.fasta , ftype: auto , extra:
{'value': 'four_human_proteins.fasta', 'children': []}
button 'in_add' clicked
form 'tool_form' contains the following controls ( note the values )
control 0: <HiddenControl(refresh=refresh) (readonly)>
control 1: <HiddenControl(tool_id=ncbi_makeblastdb) (readonly)>
control 2:
<HiddenControl(tool_state=8002549b010000613665366164613561313035643161303739393466363162623336343338386261633066643736303a3762323235663566373036313637363535663566323233613230333032633230323237343639373436633635323233613230323235633232356332323232326332303232363436323734373937303635323233613230323235633232373037323666373435633232323232633230323236383631373336383566363936653634363537383232336132303232356332323534373237353635356332323232326332303232363936653232336132303232356237623563323235663566363936653634363537383566356635633232336132303330326332303563323236363639366336353563323233613230333137643263323037623563323235663566363936653634363537383566356635633232336132303331326332303563323236363639366336353563323233613230333137643564323232633230323237303631373237333635356637333635373136393634373332323361323032323563323234363631366337333635356332323232376471002e)
(readonly)>
control 3: <RadioControl(dbtype=[*prot, nucl])>
control 4: <SelectControl(in_0|file=[*1])>
control 5: <SubmitControl(in_0_remove=Remove FASTA file 1) (readonly)>
control 6: <SelectControl(in_1|file=[*1])>
control 7: <SubmitControl(in_1_remove=Remove FASTA file 2) (readonly)>
control 8: <SubmitControl(in_add=Add new FASTA file) (readonly)>
control 9: <TextControl(title=)>
control 10: <CheckboxControl(parse_seqids=[true])>
control 11: <HiddenControl(parse_seqids=true) (readonly)>
control 12: <CheckboxControl(hash_index=[*true])>
control 13: <HiddenControl(hash_index=true) (readonly)>
control 14: <SubmitControl(runtool_btn=Execute) (readonly)>
page_inputs (0) {'dbtype': ['prot'], 'hash_index': ['-hash_index'],
'title': ['Just 4 human proteins'], 'parse_seqids': [''],
'in_0|file': ['four_human_proteins.fasta']}
--------------------- >> end captured stdout << ----------------------