[galaxyproject/galaxy] 0ca140: ToolBuildOptimize - Eliminate check_security on da...
  Branch: refs/heads/release_18.01
  Home:   https://github.com/galaxyproject/galaxy

  Commit: 0ca1401bd8387717edb31a69e11d3adc09258757
      https://github.com/galaxyproject/galaxy/commit/0ca1401bd8387717edb31a69e11d3...
  Author: John Chilton <jmchilton@gmail.com>
  Date:   2018-04-27 (Fri, 27 Apr 2018)

  Changed paths:
    M lib/galaxy/tools/parameters/basic.py
    M lib/galaxy/tools/parameters/dataset_matcher.py

  Log Message:
  -----------
  ToolBuildOptimize - Eliminate check_security on dataset matcher.

It was only used for collections and we need to drop it as prohibitively expensive to calculate. No need to filter collections ahead of time that way anyhow - it is the tool action's job to block the execution of datasets without permission so hopefully we aren't deriving any security value from this filter.


  Commit: c3303939c05ae8af9a2b11a12d5f433ee0d8072e
      https://github.com/galaxyproject/galaxy/commit/c3303939c05ae8af9a2b11a12d5f4...
  Author: John Chilton <jmchilton@gmail.com>
  Date:   2018-04-27 (Fri, 27 Apr 2018)

  Changed paths:
    M lib/galaxy/tools/parameters/dataset_matcher.py

  Log Message:
  -----------
  ToolBuildOptimize - Optimize state checking in dataset matcher.

Pre-calculate valid states, skip now unneeded helper method.


  Commit: 7ca2ef587cd26682e3b4b28c7cdb8907cfeae862
      https://github.com/galaxyproject/galaxy/commit/7ca2ef587cd26682e3b4b28c7cdb8...
  Author: John Chilton <jmchilton@gmail.com>
  Date:   2018-04-27 (Fri, 27 Apr 2018)

  Changed paths:
    M lib/galaxy/tools/parameters/dataset_matcher.py

  Log Message:
  -----------
  DatasetMatcherClean - Skip second, unneeded call to filter().

xref 8f813712f50ca21183d2efe59ee0e2a665520f95 to some degree.


  Commit: 412f03d8446cb362bbfe27d5832bcf37aef2d4d6
      https://github.com/galaxyproject/galaxy/commit/412f03d8446cb362bbfe27d5832bc...
  Author: John Chilton <jmchilton@gmail.com>
  Date:   2018-04-27 (Fri, 27 Apr 2018)

  Changed paths:
    M lib/galaxy/tools/parameters/basic.py
    M lib/galaxy/tools/parameters/dataset_matcher.py

  Log Message:
  -----------
  DatasetMatcherClean - Eliminate DatasetMatcher.value - it is never set.

Also eliminate any logic related to it having a value. This had a purpose originally, but is no longer set. Cleaning this up makes subsequent commits a bit cleaner also.


  Commit: ca80232db3a93a02dcd42f2698b54e7a8d2aeef1
      https://github.com/galaxyproject/galaxy/commit/ca80232db3a93a02dcd42f2698b54...
  Author: John Chilton <jmchilton@gmail.com>
  Date:   2018-04-27 (Fri, 27 Apr 2018)

  Changed paths:
    M lib/galaxy/tools/parameters/basic.py

  Log Message:
  -----------
  DatasetMatcherClean - Remove unused method.

It calls things that I'm changing in subsequent commits so I thought I'd just axe it now to clear things up.


  Commit: efe5d8b973ffdd22963ec75db7820ff1731db916
      https://github.com/galaxyproject/galaxy/commit/efe5d8b973ffdd22963ec75db7820...
  Author: John Chilton <jmchilton@gmail.com>
  Date:   2018-04-27 (Fri, 27 Apr 2018)

  Changed paths:
    M lib/galaxy/tools/__init__.py
    M lib/galaxy/tools/parameters/basic.py
    M lib/galaxy/tools/parameters/dataset_matcher.py

  Log Message:
  -----------
  DatasetMatcherClean - Implement DatasetMatcherFactory to reason for whole tool.

In a subsequent commit I'll use this central store of all the inputs for a tool to determine if summary data about collections can be used instead of processing individual datasets one at a time. Even this commit though uses the abstraction to optimize datatype checking and cache common checks when possible - should lead to a lot fewer objects being created when processing a large history.


  Commit: 2eea566c10eaf2b7dad21cdff82e0a14a1893fde
      https://github.com/galaxyproject/galaxy/commit/2eea566c10eaf2b7dad21cdff82e0...
  Author: John Chilton <jmchilton@gmail.com>
  Date:   2018-04-27 (Fri, 27 Apr 2018)

  Changed paths:
    M lib/galaxy/model/__init__.py
    M lib/galaxy/tools/parameters/dataset_matcher.py
    M test/unit/tools/test_data_parameters.py
    M test/unit/tools/test_dataset_matcher.py

  Log Message:
  -----------
  ToolBuildOptimize - Add SummaryDatasetCollectionMatcher.

Use summary data pulled from the database for collections when possible instead of loading potentially hundreds of thousands of individual datasets.


  Commit: 740c4c93445d5833038e74ccb39413752eeeeb8c
      https://github.com/galaxyproject/galaxy/commit/740c4c93445d5833038e74ccb3941...
  Author: John Chilton <jmchilton@gmail.com>
  Date:   2018-04-27 (Fri, 27 Apr 2018)

  Changed paths:
    M lib/galaxy/model/__init__.py
    M lib/galaxy/tools/parameters/basic.py
    M test/unit/tools/test_data_parameters.py

  Log Message:
  -----------
  ToolBuildOptimize - do not fetch hidden datasets for inclusion.

Only fetch visible datasets into big, cached list of history datasets under consideration. Hidden datasets don't seem to be used by the fetcher or initial value stuff so it seems fine to exclude them.

The advantage should be clear for histories with a large number of datasets hidden below a significantly smaller number of collections.


  Commit: afe938fe04f1b70d77a357afd03df8252a375770
      https://github.com/galaxyproject/galaxy/commit/afe938fe04f1b70d77a357afd03df...
  Author: John Chilton <jmchilton@gmail.com>
  Date:   2018-04-27 (Fri, 27 Apr 2018)

  Changed paths:
    M lib/galaxy/model/__init__.py
    M lib/galaxy/tools/parameters/basic.py

  Log Message:
  -----------
  ToolBuildOptimize - fetch fewer collections, prefetch more of HDCA.

Low hanging fruit is to exclude hidden collections, that probably won't buy much for typical uses.

This eliminates one extra tag query per dataset collection that appears in the rendered result, this probably buys us a bit more than the hidden collection thing but is still probably a good choice (unless there are collections with a large number of tags :(...).

This also eliminates the extra fetch of the first, outer-est collection associated with the history dataset collection - not its elements just the collection. This saves a number of queries roughly equal to the number of HDCAs in the history and unlike the tag thing there is no downside really here - there will always be one collection.

Before and after profiling of a tool form build:

https://gist.github.com/jmchilton/d68565662f7f4b7ee2640f09fbb92962


  Commit: fbc7d49de1b71e7a1a0e544a9c18c5651dad9dd8
      https://github.com/galaxyproject/galaxy/commit/fbc7d49de1b71e7a1a0e544a9c18c...
  Author: John Chilton <jmchilton@gmail.com>
  Date:   2018-04-30 (Mon, 30 Apr 2018)

  Changed paths:
    M lib/galaxy/tools/parameters/dataset_matcher.py

  Log Message:
  -----------
  Fix for data collection parameters in tool state optimization branch.


  Commit: b35340c77f262309d225cf2a9e526ac7e67194cc
      https://github.com/galaxyproject/galaxy/commit/b35340c77f262309d225cf2a9e526...
  Author: Dannon <dannon.baker@gmail.com>
  Date:   2018-05-02 (Wed, 02 May 2018)

  Changed paths:
    M lib/galaxy/model/__init__.py
    M lib/galaxy/tools/__init__.py
    M lib/galaxy/tools/parameters/basic.py
    M lib/galaxy/tools/parameters/dataset_matcher.py
    M test/unit/tools/test_data_parameters.py
    M test/unit/tools/test_dataset_matcher.py

  Log Message:
  -----------
  Merge pull request #5997 from jmchilton/1801_tool_state_opt

[18.01] Fix tool state performance for large collections.


Compare: https://github.com/galaxyproject/galaxy/compare/aebc0da8914f...b35340c77f26

**NOTE:** This service has been marked for deprecation: https://developer.github.com/changes/2018-04-25-github-services-deprecation/

Functionality will be removed from GitHub.com on January 31st, 2019.
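
For readers skimming the log messages above, here is a rough Python sketch of three of the ideas they describe: pre-calculating the set of valid states, caching common datatype checks once per tool instead of once per dataset, and leaving hidden datasets out of consideration. It is an illustration only and not the code from these commits. Every name in it (`VALID_INPUT_STATES`, `Dataset`, `MatcherFactory`, `matching_datasets`, and the state strings) is a hypothetical stand-in, not taken from the Galaxy source.

```python
from dataclasses import dataclass

# Hypothetical set of acceptable states, built once at import time
# instead of being recomputed for every dataset that is checked.
VALID_INPUT_STATES = frozenset({"ok", "queued", "running", "new"})


@dataclass(frozen=True)
class Dataset:
    """Hypothetical stand-in for a history dataset."""
    name: str
    state: str
    extension: str
    visible: bool = True


class MatcherFactory:
    """Hypothetical per-tool helper that caches datatype checks."""

    def __init__(self, accepted_extensions):
        self._accepted = frozenset(accepted_extensions)
        self._cache = {}           # extension -> bool
        self.checks_performed = 0  # counts uncached datatype checks

    def extension_matches(self, extension):
        # Each distinct extension is evaluated once, then served from cache.
        if extension not in self._cache:
            self.checks_performed += 1
            self._cache[extension] = extension in self._accepted
        return self._cache[extension]

    def matches(self, dataset):
        # Cheap checks first; hidden or invalid-state datasets never
        # reach the datatype check at all.
        return (
            dataset.visible
            and dataset.state in VALID_INPUT_STATES
            and self.extension_matches(dataset.extension)
        )


def matching_datasets(factory, datasets):
    """Return the datasets the factory considers valid tool inputs."""
    return [d for d in datasets if factory.matches(d)]


history = [
    Dataset("a", "ok", "fastq"),
    Dataset("b", "error", "fastq"),
    Dataset("c", "ok", "bam"),
    Dataset("d", "ok", "fastq", visible=False),
    Dataset("e", "queued", "fastq"),
]
factory = MatcherFactory(["fastq"])
selected = [d.name for d in matching_datasets(factory, history)]
```

In this toy history the datatype check runs only for the distinct extensions that survive the visibility and state filters, which is the kind of saving the log messages describe for large histories.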