On Nov 2, 2012, at 3:58 PM, Carlos Borroto <carlos.borroto@gmail.com> wrote:
Could you please share your rules related to BLAST? I would love to take a look at them.
Thanks, Carlos
Here is the blastn rule procedure and the relevant snippet of the default runner procedure. I just added the database-based multiplier, so that part is very simple at the moment: as an example, I set a bogus multiplier of 4 for the "nt_*" databases.

    import logging
    import os

    log = logging.getLogger(__name__)

    MB = 1024 * 1024

    def ncbi_blastn(job):
        # Fixed resource requests; only pmem varies with the job.
        nodes = 1
        ppn = 4
        walltime = '167:00:00'

        # Map input names to datasets and look up the size of the query file.
        inp_data = dict([(da.name, da.dataset) for da in job.input_datasets])
        inp_data.update([(da.name, da.dataset) for da in job.input_library_datasets])
        query_file = inp_data["query"].file_name
        query_size = os.path.getsize(query_file)  # bytes

        # db_opts arrives as a string holding a dict. eval() is kept from the
        # original rule; ast.literal_eval would be safer for untrusted input.
        inp_params = dict([(da.name, da.value) for da in job.parameters])
        db_dict = eval(inp_params['db_opts'])
        db = db_dict['database']

        # Example multiplier only: 4x for databases whose name starts with 'nt'.
        db_multiplier = 1
        if db.startswith('nt'):
            db_multiplier = 4

        # Per-process memory, tiered by query size.
        if query_size <= 20 * MB:
            pmem = 500
            pmem_unit = 'mb'
        elif query_size <= 50 * MB:
            pmem = 750
            pmem_unit = 'mb'
        elif query_size <= 100 * MB:
            pmem = 1500
            pmem_unit = 'mb'
        elif query_size <= 500 * MB:
            pmem = 2
            pmem_unit = 'gb'
        elif query_size <= 1000 * MB:
            pmem = 4
            pmem_unit = 'gb'
        elif query_size <= 2000 * MB:
            pmem = 10
            pmem_unit = 'gb'
        else:
            pmem = 20
            pmem_unit = 'gb'
            log.debug('OM: blastn query size is in the bigmem category: %skb\n' % (query_size // 1024))

        if db_multiplier > 1:
            pmem = int(pmem * db_multiplier)
        pmem_str = "%d%s" % (pmem, pmem_unit)
        log.debug('OM: blastn query: %skb, db: %s, pmem: %s\n' % (query_size // 1024, db, pmem_str))
        return {'nodes': nodes, 'ppn': ppn, 'pmem': pmem_str, 'walltime': walltime}

    def default_runner(tool_id, job):
        ...
        elif tool_id_src.startswith('ncbi_blastn_wrapper'):
            request = ncbi_blastn(job)
        ...
        drmaa = 'drmaa://%s%s%s/' % (queue_str, group_str, request_str)
        return drmaa

Regards,
Alex
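
For anyone adapting the rule above, the size tiers may be easier to tune if they live in a table rather than an if/elif chain. The following is only a sketch under that assumption, not part of the original rule: the names PMEM_TIERS, BIGMEM_TIER and blastn_pmem are made up for illustration, as are the database names in the usage lines. The thresholds and the example 4x multiplier are copied from the rule as posted.

```python
MB = 1024 * 1024

# (upper bound on query size in bytes, pmem value, pmem unit).
# Values copied from the posted rule; the table layout itself is hypothetical.
PMEM_TIERS = [
    (20 * MB, 500, 'mb'),
    (50 * MB, 750, 'mb'),
    (100 * MB, 1500, 'mb'),
    (500 * MB, 2, 'gb'),
    (1000 * MB, 4, 'gb'),
    (2000 * MB, 10, 'gb'),
]
# Used for any query larger than the last bound in PMEM_TIERS.
BIGMEM_TIER = (20, 'gb')


def blastn_pmem(query_size, db):
    """Return a pmem string such as '2gb' for a query size given in bytes."""
    pmem, unit = BIGMEM_TIER
    for upper_bound, tier_pmem, tier_unit in PMEM_TIERS:
        if query_size <= upper_bound:
            pmem, unit = tier_pmem, tier_unit
            break
    # Same example multiplier as in the posted rule: 4x when the database
    # name starts with 'nt', otherwise no scaling.
    multiplier = 4 if db.startswith('nt') else 1
    return "%d%s" % (pmem * multiplier, unit)


# Hypothetical database names, for illustration only.
print(blastn_pmem(10 * MB, 'est_human'))     # 500mb
print(blastn_pmem(300 * MB, 'nt_example'))   # 8gb
```

With this layout, adding a tier or changing a threshold is a one-line edit to the table, and the lookup can be tested without a Galaxy job object.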