Assaf Gordon wrote:
Greg Von Kuster wrote, On 01/08/2010 01:02 PM:
> A dataset's set_meta() is done as part of the job, so if you are not
> running jobs on a cluster, set_meta() will be run locally as well, which
> is certainly chewing up cpu on your server.
I don't mind it running locally, I have several CPUs to spare - the problem is that
it seems to be running in a thread inside the main galaxy process - which slows all of
galaxy.
If there's a way to have set_meta be called externally with local runner (as another
local job - with a different process) - this would also solve the issue (I think).
Even if using the local runner, set_metadata_externally will cause the
metadata code to run in a separate process, which (python-wise) would be
a huge help for performance.
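
To illustrate the point about separate processes: a minimal sketch, not Galaxy's actual implementation, of pushing a CPU-bound file scan into a child Python process so the parent interpreter is not tied up by it. The names SCAN_SCRIPT and count_sequences_externally are hypothetical, and the scan (counting lines that start with ">") is only a stand-in for whatever set_meta() does for a given datatype.

```python
import subprocess
import sys

# Hypothetical stand-in for an expensive metadata scan: count FASTA records
# by reading every line of the file. This is NOT Galaxy's set_meta() code.
SCAN_SCRIPT = """
import sys
count = 0
with open(sys.argv[1]) as handle:
    for line in handle:
        if line.startswith(">"):
            count += 1
print(count)
"""


def count_sequences_externally(path):
    """Run the scan in a child Python process and return the record count.

    The child has its own interpreter, so the CPU-bound loop does not
    compete with the parent's threads for the parent's interpreter lock.
    """
    result = subprocess.run(
        [sys.executable, "-c", SCAN_SCRIPT, path],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())
```

The calling thread still waits for the child to finish, but other threads in the parent process keep running while it waits.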
> If running externally,
> set_meta() will run on the cluster when the user does anything in the
> "Edit Attributes" page that calls set_meta(), including "Auto-detect".
This is interesting, but how does Galaxy know to submit an "Edit Attributes"
job to the cluster? Does it do "qsub" with the default runner?
I'm asking because even when/if I switch to using the cluster, the default runner will
still be local, and only some specific jobs will have an "sge://" runner. How
would Galaxy then know to submit a job to the SGE cluster?
It gets a tool id, '__SET_METADATA__', and is submitted through the
regular job runner. I just tested and you can set it in
universe_wsgi.ini as you would any other job runner override.
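
As a rough sketch of how such a per-tool override can be resolved (an illustration, not Galaxy's own lookup code): the snippet below parses an ini fragment and falls back to a default runner when a tool id has no entry. The section name [galaxy:tool_runners], the default_cluster_job_runner option, and the "sge:///" URL are assumptions made for this example; check your own universe_wsgi.ini for the exact names. runner_for_tool is a hypothetical helper.

```python
import configparser

# Hypothetical fragment of universe_wsgi.ini. The section and option names
# and the runner URLs are assumptions for illustration only.
SAMPLE_INI = """
[app:main]
set_metadata_externally = True
default_cluster_job_runner = local:///

[galaxy:tool_runners]
__SET_METADATA__ = sge:///
"""


def runner_for_tool(ini_text, tool_id):
    """Return the runner URL for tool_id, or the default if none is listed."""
    # Only "=" separates keys from values, so the ":" in runner URLs is safe.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    # Keep option names case-sensitive; tool ids such as __SET_METADATA__
    # would otherwise be lowercased.
    parser.optionxform = str
    parser.read_string(ini_text)
    default = parser.get(
        "app:main", "default_cluster_job_runner", fallback="local:///"
    )
    return parser.get("galaxy:tool_runners", tool_id, fallback=default)
```

With the sample fragment, the metadata tool id resolves to the SGE runner while any tool without an entry stays on the local default.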
> As soon as I get a chance, I'll look at enhancing set_meta() to check if
> "set_metadata_externally" is True for those data types that take
> significant processing, and if jobs are running locally, metadata will
> be set differently.
I'll be more than happy to beta-test this feature. Let me know if I can assist.
This is already implemented since auto-detect is run as a job.
--nate
Thanks for all your help!
-gordon
>
> On Jan 8, 2010, at 12:25 PM, Assaf Gordon wrote:
>
>> It is set to "False", but my galaxy runs jobs locally, not on a
>> cluster...
>> (at least, not directly through the SGE Runner).
>>
>> Does this work with local-runner too (i.e. starting a new process to
>> set the metadata) ?
>> Also, does the "external" method work when the user changes the type
>> in the "Edit Attributes" page ?
>>
>>
>>
>> Greg Von Kuster wrote, On 01/08/2010 10:54 AM:
>>> Hello Assaf,
>>>
>>> Is your instance configured to set metadata externally ( on your cluster
>>> nodes )? If not, in your universe_wsgi.ini file, add the following to
>>> the [app:main] section:
>>>
>>> set_metadata_externally = True
>>>
>>>
>>> On Jan 6, 2010, at 5:13 PM, Assaf Gordon wrote:
>>>
>>>> Hello all,
>>>>
>>>> Continuing the search for slowness in my local Galaxy server (see
>>>> http://lists.bx.psu.edu/pipermail/galaxy-dev/2009-December/001549.html
>>>> ),
>>>>
>>>> The datatypes/sequence.py file is also scanning and parsing entire
>>>> files when creating a new FASTA/FASTQ file.
>>>> It's nice and fun and informative for small files, but with a 2.7GB
>>>> FASTA file - the python process stays at 100% CPU for a long long
>>>> time, causing everything else to be very slow.
>>>>
>>>> The offending code is at sequence.py, method "set_meta", lines 30-39.
>>>>
>>>> I think Illumina expects 25x coverage of the human genome in a single
>>>> run by the end of the year - this will roughly translate to 8 FASTQ
>>>> files of more than 8GB each => FASTA files of 4GB each... Galaxy will
>>>> not be able to just casually scan these files.
>>>>
>>>> -gordon
>>>>
>>>>
>>>> _______________________________________________
>>>> galaxy-dev mailing list
>>>> galaxy-dev(a)lists.bx.psu.edu <mailto:galaxy-dev@lists.bx.psu.edu>
>>>> http://lists.bx.psu.edu/listinfo/galaxy-dev
>>> Greg Von Kuster
>>> Galaxy Development Team
>>> greg(a)bx.psu.edu <mailto:greg@bx.psu.edu>
>>>
>>>
>>>