
Hello all,

After 7 enjoyable hours of cheetah debugging ... for those (like me) who are not well versed in python or cheetah templates, here are a couple of debugging tips that can help.

=1= RTFM.
It's here: http://www.cheetahtemplate.org/docs/users_guide_html/users_guide.html

=2= Printing something to STDERR, to show the progress of the cheetah template compilation

To run python one-liners in your cheetah template, use "#silent".

Assuming this is my Galaxy XML tool:
=============
<tool id="cshl_bowtie_se1" name="Bowtie" version="0.1.3" >
<command interpreter="sh">bowtie_se1.sh "$output" "$input1"
#if str($report_aligned_fastq) == "True"
    "$output_aligned_fastq_2"
#else
    "no-al"
#end if
#if str($report_nonmappers_fastq) == "True"
    "$output_nonmappers_fastq_2"
#else
    "no-un"
#end if
#if str($report_maxmappers_fastq) == "True"
    "$output_maxmappers_fastq_1"
#else
    "no-max"
#end if
#if $mismatches_strategy.choice == "endtoend"
    -v $mismatches_strategy.end_to_end_limit
#else if $mismatches_strategy.choice == "seed"
    --seedlen $mismatches_strategy.seedlen
    --seedmms $mismatches_strategy.seedmms
    $mismatches_strategy.best
    $mismatches_strategy.strata
#end if
#if $multi_mappers_strategy.choice == "m_and_k"
    -k $multi_mappers_strategy.max_mapping
    -m $multi_mappers_strategy.max_mapping
#else if $multi_mappers_strategy.choice == "unique"
    -k 1 -m 1
#else if $multi_mappers_strategy.choice == "all"
    -a
#else if $multi_mappers_strategy.choice == "random_one":
    -k 1
#end if
#if $input1.ext == "fastqillumina":
    --solexa1.3-quals
#else if $input1.ext == "fasta":
    -f
#else if $input1.ext == "fastqsolexa":
    --solexa-qual
#end if
[ and so on and on and on ...]
=============

Then to do something directly in python, just put "#silent" followed by a python expression. The most useful one prints something to STDERR, so it will appear on the screen when you execute the tool. The "#silent" means the result of the expression will not become part of the generated command line.

Example:
==========
...
...
<command interpreter="sh">bowtie_se1.sh "$output" "$input1"
#silent sys.stderr.write("Hello World from cheetah generated code\n");
#if str($report_aligned_fastq) == "True"
    "$output_aligned_fastq_2"
#else
    "no-al"
#end if
...
...
===========

When the tool is executed (from inside Galaxy), the job is prepared and the command line is built. At that point the cheetah template is filled in, and the message is printed to STDERR.

=3= Printing the value of a galaxy parameter

Same "#silent" as before, but add the variable as a format parameter:
===
#silent sys.stderr.write(" Execute by user: %s\n" % ( $userEmail ) )
===

NOTE: there are several levels of parameter interpolation going on here. "$userEmail" is a cheetah variable (obviously, because "$" doesn't mean anything in python). It is first replaced by cheetah, then executed by python. Getting this to work is sometimes tricky.

If you have some elaborate conditional/repeat/group parameters, it could look like this:
=======
#silent sys.stderr.write(" mismatch strategy parameters: %s\n" % ( str( $mismatches_strategy ) ) )
=======

And the output would look like this (on the terminal showing Galaxy's log output):
=========
mismatches_strategy = {'seedmms': <galaxy.tools.InputValueWrapper object at 0x203adfd0>, 'strata': <galaxy.tools.InputValueWrapper object at 0x203ad150>, 'choice': 'seed', 'seedlen': <galaxy.tools.InputValueWrapper object at 0x203adc90>, '__current_case__': 1, 'best': <galaxy.tools.InputValueWrapper object at 0x203ad590>}
=========

This assumes that you have a valid XML parameter named "mismatches_strategy", which in my case was defined as:
======
<conditional name="mismatches_strategy" >
    <param name="choice" type="select" label="Mismatches strategy">
        <option value="seed">Seed/Extent</option>
        <option value="endtoend">End-to-End mismatches limit (-v)</option>
        <option value="custom">Use custom options (below)</option>
    </param>
    <when value="endtoend">
        <param name="end_to_end_limit" type="integer" size="2" value="0"
            label="Max. number of mismatches allowed in the entire read (-v)" help="" />
    </when>
    <when value="seed">
        <param name="seedlen" type="integer" size="4" value="28"
            label="Seed length (--seedlen)"
            help="Length of the seed fragment in the read. 28 is a good choice for smallRNAs. Use larger values for long reads of DNA/RNA." />
        <param name="seedmms" type="integer" size="4" value="3"
            label="Allowed mismatches in the seed (--seedmms)"
            help="How many mismatches are allowed in the seed fragment of the read. Enter values between 0 and 3." />
        <param name="best" type="boolean" truevalue="--best" falsevalue="" checked="True"
            label="Report best stratum (--best)"
            help="hits guaranteed best stratum; ties broken by quality" />
        <param name="strata" type="boolean" truevalue="--strata" falsevalue="" checked="false"
            label="Suppress hits in sub-optimal strata (--strata)"
            help="hits in sub-optimal strata aren't reported (requires --best, -m, -k)"/>
    </when>
    <when value="custom">
    </when>
</conditional>
======

=4= Show all available parameters in the cheetah template

The safe way that I've found is:
========
#silent sys.stderr.write("!!!! Cheetah Template Variables !!!!\n")
#silent sys.stderr.write(" searchList = '%s'\n" % (str($searchList)) )
#silent sys.stderr.write("!!!! end-of-list !!!!\n")
========

"$searchList" is the cheetah variable containing all the template variables. The output will be ugly, but should always work.

Example:
==============
!!!! Cheetah Template Variables !!!!
searchList = '[{}, <cheetah_DynamicallyCompiledCheetahTemplate_1301603121_44_70792.DynamicallyCompiledCheetahTemplate object at 0x6b70c50>, {'input2': <galaxy.tools.DatasetFilenameWrapper object at 0x6b76fd0>, 'userEmail': 'gordon@cshl.edu', 'input1': <galaxy.tools.DatasetFilenameWrapper object at 0x69902d0>, 'user_custom_options': <galaxy.tools.InputValueWrapper object at 0x6b768d0>, 'userId': '2', 'output_maxmappers_fastq_1': <galaxy.tools.DatasetFilenameWrapper object at 0x2aaaacd33b90>, 'dbkey': 'mm9', 'output_maxmappers_fastq_2': <galaxy.tools.DatasetFilenameWrapper object at 0x62e5890>, 'multi_mappers_strategy': {'__current_case__': 2, 'choice': 'unique'}, 'GALAXY_DATA_INDEX_DIR': '/localdata1/gordon/projects/galaxy_dev/tool-data', 'GALAXY_DATATYPES_CONF_FILE': '/localdata1/gordon/projects/galaxy_dev/datatypes_conf.xml', 'min_insertion': <galaxy.tools.InputValueWrapper object at 0x62e1850>, 'report_nonmappers_fastq': <galaxy.tools.InputValueWrapper object at 0x6b76490>, 'mismatches_strategy': {'seedmms': <galaxy.tools.InputValueWrapper object at 0x6b76710>, 'strata': <galaxy.tools.InputValueWrapper object at 0x6b764d0>, 'choice': 'seed', 'seedlen': <galaxy.tools.InputValueWrapper object at 0x6b76d90>, '__current_case__': 1, 'best': <galaxy.tools.InputValueWrapper object at 0x6b76d10>}, 'output_nonmappers_fastq_1': <galaxy.tools.DatasetFilenameWrapper object at 0x2aaaacd33b10>, 'report_aligned_fastq': <galaxy.tools.InputValueWrapper object at 0x6b76a10>, '__new_file_path__': '/localdata1/gordon/projects/galaxy_dev/database/tmp', 'output_nonmappers_fastq_2': <galaxy.tools.DatasetFilenameWrapper object at 0x2aaaacd33610>, 'database': {'source': 'fromlist', 'databasepath': <galaxy.tools.SelectToolParameterWrapper object at 0x62e1c10>, '__current_case__': 1}, 'report_maxmappers_fastq': <galaxy.tools.InputValueWrapper object at 0x6b76790>, 'max_insertion': <galaxy.tools.InputValueWrapper object at 0x62e1450>, 'suppress_non_mappers': <galaxy.tools.InputValueWrapper object at 0x6b76410>, 'output_aligned_fastq_2': <galaxy.tools.DatasetFilenameWrapper object at 0x6b6fa10>, 'output_aligned_fastq_1': <galaxy.tools.DatasetFilenameWrapper object at 0x6ab15d0>, 'GALAXY_ROOT_DIR': '/localdata1/gordon/projects/galaxy_dev', 'output': <galaxy.tools.DatasetFilenameWrapper object at 0x6b76a90>, 'chromInfo': '/localdata1/gordon/projects/galaxy_dev/tool-data/shared/ucsc/chrom/mm9.len', '__app__': <galaxy.tools.RawObjectWrapper object at 0x699c1d0>}]'
!!!! end-of-list !!!!
==============

A better way (but I'm not sure it's generic enough, or if it's specific to Galaxy, python gurus please chime in):
=====
#silent sys.stderr.write("!!!! Cheetah Template Variables !!!!\n")
#for k,v in $searchList[2].items()
#silent sys.stderr.write(" %s = %s\n" % (str(k), str(v) ))
#end for
#silent sys.stderr.write("!!!! end-of-list !!!!\n")
=====

The output in this case will be:
========
!!!! Cheetah Template Variables !!!!
input2 = /localdata1/gordon/projects/galaxy_dev/database/files/000/dataset_251.dat
userEmail = gordon@cshl.edu
input1 = /localdata1/gordon/projects/galaxy_dev/database/files/000/dataset_183.dat
user_custom_options =
userId = 2
output_maxmappers_fastq_1 = /localdata1/gordon/projects/galaxy_dev/database/files/000/dataset_533.dat
dbkey = mm9
output_maxmappers_fastq_2 = /localdata1/gordon/projects/galaxy_dev/database/files/000/dataset_534.dat
multi_mappers_strategy = {'__current_case__': 2, 'choice': 'unique'}
GALAXY_DATA_INDEX_DIR = /localdata1/gordon/projects/galaxy_dev/tool-data
GALAXY_DATATYPES_CONF_FILE = /localdata1/gordon/projects/galaxy_dev/datatypes_conf.xml
min_insertion = 0
report_nonmappers_fastq = True
mismatches_strategy = {'seedmms': <galaxy.tools.InputValueWrapper object at 0x6a95b10>, 'strata': <galaxy.tools.InputValueWrapper object at 0x6a95e90>, 'choice': 'seed', 'seedlen': <galaxy.tools.InputValueWrapper object at 0x6a95390>, '__current_case__': 1, 'best': <galaxy.tools.InputValueWrapper object at
0x6a95210>}
output_nonmappers_fastq_1 = /localdata1/gordon/projects/galaxy_dev/database/files/000/dataset_531.dat
report_aligned_fastq = True
__new_file_path__ = /localdata1/gordon/projects/galaxy_dev/database/tmp
output_nonmappers_fastq_2 = /localdata1/gordon/projects/galaxy_dev/database/files/000/dataset_532.dat
database = {'source': 'fromlist', 'databasepath': <galaxy.tools.SelectToolParameterWrapper object at 0x6a95350>, '__current_case__': 1}
report_maxmappers_fastq = True
max_insertion = 250
suppress_non_mappers = --quiet
output_aligned_fastq_2 = /localdata1/gordon/projects/galaxy_dev/database/files/000/dataset_530.dat
output_aligned_fastq_1 = /localdata1/gordon/projects/galaxy_dev/database/files/000/dataset_529.dat
GALAXY_ROOT_DIR = /localdata1/gordon/projects/galaxy_dev
output = /localdata1/gordon/projects/galaxy_dev/database/files/000/dataset_528.dat
chromInfo = /localdata1/gordon/projects/galaxy_dev/tool-data/shared/ucsc/chrom/mm9.len
__app__ = galaxy.app:UniverseApplication
!!!! end-of-list !!!!
========

Who knew one can access "GALAXY_ROOT_DIR" as a tool parameter? (except for the galaxy developers, of course... :) )

=5= Divide-and-conquer a long template

If you have strange cheetah compilation problems and can't understand what's wrong (sorry, python/cheetah code makes no sense to me...), use "#breakpoint" to stop the compilation at any given point in the template. If the cheetah template compiles without error, then all the statements above your "#breakpoint" are valid, and you can move the "#breakpoint" further down the template until you pinpoint the offending statement. The tool will obviously not run successfully without the entire command line, but we're debugging the template here, not the tool.

=6= The hazards of an "$input" parameter

With most other parameters, a typo in a variable name will produce a helpful cheetah error message.
Example:
=========
<tool id="cshl_dummy" name="dummy" description="" >
    <command interpreter="sh">echo $bar</command>
    <inputs>
        <param type="integer" name="foo" size="2" value="42"/>
    </inputs>
    <outputs>
        <data format="txt" name="output" />
    </outputs>
</tool>
=========

Since there's no "$bar" parameter (the name is "$foo"), the cheetah error will be:
========
Traceback (most recent call last):
  File "/localdata1/gordon/projects/galaxy_dev/lib/galaxy/jobs/runners/drmaa.py", line 127, in queue_job
    job_wrapper.prepare()
  File "/localdata1/gordon/projects/galaxy_dev/lib/galaxy/jobs/__init__.py", line 371, in prepare
    self.command_line = self.tool.build_command_line( param_dict )
  File "/localdata1/gordon/projects/galaxy_dev/lib/galaxy/tools/__init__.py", line 1527, in build_command_line
    command_line = fill_template( self.command, context=param_dict )
  File "/localdata1/gordon/projects/galaxy_dev/lib/galaxy/util/template.py", line 9, in fill_template
    return str( Template( source=template_text, searchList=[context] ) )
  File "/localdata1/gordon/projects/galaxy_dev/eggs/Cheetah-2.2.2-py2.6-linux-x86_64-ucs4.egg/Cheetah/Template.py", line 1004, in __str__
    return getattr(self, mainMethName)()
  File "cheetah_DynamicallyCompiledCheetahTemplate_1301604373_52_56153.py", line 83, in respond
NotFound: cannot find 'bar'
========

Helpful enough, even without a line number or code fragment. But if your typo is "$input" (and your parameters are named "$input1" or "$input2" or similar), the error will be completely useless.
Example: the following bug (using "$input" when there's no "$input" parameter defined):
==========
<tool id="cshl_dummy" name="dummy" description="" >
    <command interpreter="sh">echo $input</command>
    <inputs>
        <param type="integer" name="foo" size="2" value="42"/>
    </inputs>
    <outputs>
        <data format="txt" name="output" />
    </outputs>
</tool>
===========

will throw the following exception:
===========
Traceback (most recent call last):
  File "/localdata1/gordon/projects/galaxy_dev/lib/galaxy/jobs/runners/drmaa.py", line 127, in queue_job
    job_wrapper.prepare()
  File "/localdata1/gordon/projects/galaxy_dev/lib/galaxy/jobs/__init__.py", line 371, in prepare
    self.command_line = self.tool.build_command_line( param_dict )
  File "/localdata1/gordon/projects/galaxy_dev/lib/galaxy/tools/__init__.py", line 1527, in build_command_line
    command_line = fill_template( self.command, context=param_dict )
  File "/localdata1/gordon/projects/galaxy_dev/lib/galaxy/util/template.py", line 9, in fill_template
    return str( Template( source=template_text, searchList=[context] ) )
  File "/localdata1/gordon/projects/galaxy_dev/eggs/Cheetah-2.2.2-py2.6-linux-x86_64-ucs4.egg/Cheetah/Template.py", line 1004, in __str__
    return getattr(self, mainMethName)()
  File "cheetah_DynamicallyCompiledCheetahTemplate_1301604408_95_71749.py", line 83, in respond
  File "<string>", line 0
    ^
SyntaxError: unexpected EOF while parsing
=============

This is probably because there is a hidden "input" parameter in the galaxy code that is used in the output format detection code, but I haven't found the exact spot.

This 'bug' alone took me two hours to solve :(

Hope this helps others in future debugging sessions...

-gordon