tune_substructure

nvmolkit.autotune.tune_substructure(targets: typing.Sequence[rdkit.Chem.rdchem.Mol], queries: typing.Sequence[rdkit.Chem.rdchem.Mol], *, api: typing.Callable = <function hasSubstructMatch>, maxMatches: int = 0, uniquify: bool = False, gpuIds: typing.Iterable[int] | None = None, calibration_set: typing.Iterable[int] | None = None, calibration_fraction: float = 0.1, calibration_max_size: int = 2000, target_seconds_per_trial: float = 10.0, n_trials: int = 30, search_space_overrides: dict[str, typing.Any] | None = None, cpu_budget: int | None = None, sampler: typing.Any = None, seed: int | None = None, verbose: bool = False) → TuneResult

Tune SubstructSearchConfig for a substructure-search workflow.

Parameters:
  • targets – Library of target molecules. Calibration trials are run on a (possibly auto-subsampled) slice of these targets.

  • queries – Query molecules. The same query set is used for every trial.

  • api – Which substructure-search entry point to tune. One of hasSubstructMatch(), countSubstructMatches(), or getSubstructMatches().

  • maxMatches – maxMatches argument forwarded to the resulting config. Held constant across trials.

  • uniquify – uniquify flag forwarded to the resulting config.

  • gpuIds – GPU device IDs to use. Fixed across the study.

  • calibration_set – Optional explicit indices into targets.

  • calibration_fraction – Fraction of the workload to auto-sample.

  • calibration_max_size – Cap on the auto-sampled calibration size.

  • target_seconds_per_trial – Target wall-clock budget for one trial.

  • n_trials – Number of Optuna trials to run after warm-up.

  • search_space_overrides – Optional overrides for batchSize, workerThreads, or preprocessingThreads ranges.

  • cpu_budget – Optional explicit cap on total CPU threads. The default (None) uses os.cpu_count(). The joint constraint num_gpus * workerThreads + preprocessingThreads <= cpu_budget is enforced when sampling each trial. Set this when normalizing tuning runs across machines with different core counts so the search space stays comparable.

  • sampler – Optional Optuna sampler.

  • seed – Seed for the default sampler.

  • verbose – Print warm-up and trial diagnostics.
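
The joint thread constraint described under cpu_budget can be checked by hand before passing search_space_overrides. The following is a minimal sketch of that inequality only; the helper name fits_cpu_budget is hypothetical and is not part of nvmolkit, and the tuner's own sampling logic may differ in detail.

```python
import os


def fits_cpu_budget(num_gpus, worker_threads, preprocessing_threads, cpu_budget=None):
    """Check num_gpus * workerThreads + preprocessingThreads <= cpu_budget.

    Hypothetical helper for illustration; not an nvmolkit function.
    """
    if cpu_budget is None:
        # Mirrors the documented default: fall back to os.cpu_count().
        # os.cpu_count() can return None, so guard with a minimum of 1.
        cpu_budget = os.cpu_count() or 1
    return num_gpus * worker_threads + preprocessing_threads <= cpu_budget


# Two GPUs with 4 worker threads each plus 8 preprocessing threads use 16 threads.
print(fits_cpu_budget(2, 4, 8, cpu_budget=16))  # within a 16-thread budget
print(fits_cpu_budget(2, 5, 8, cpu_budget=16))  # 18 threads exceeds the budget
```

Fixing cpu_budget to the same value on every machine keeps the set of admissible (workerThreads, preprocessingThreads) pairs identical across runs, which is the normalization use case mentioned above.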

Returns:

A TuneResult whose best_config is a fully populated SubstructSearchConfig instance.
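
A call might look like the sketch below. It relies only on the signature shown above and on the RDKit functions Chem.MolFromSmiles and Chem.MolFromSmarts; the helper names example_tuning_kwargs and run_example, the three-molecule library, and all keyword values are illustrative assumptions. A real target library would be far larger, and running the tuner presumably requires a GPU-enabled nvmolkit installation.

```python
def example_tuning_kwargs():
    """Illustrative keyword arguments for a short, reproducible tuning run."""
    return {
        "n_trials": 10,                   # fewer trials than the default of 30
        "target_seconds_per_trial": 5.0,  # shorter per-trial budget than the default
        "cpu_budget": 16,                 # fixed so runs are comparable across machines
        "seed": 0,                        # seeds the default sampler
        "verbose": True,                  # print warm-up and trial diagnostics
    }


def run_example():
    """Tune on a toy library and return the best config (needs RDKit and nvmolkit)."""
    # Imports are deferred so this module can be loaded without either package.
    from rdkit import Chem
    from nvmolkit.autotune import tune_substructure

    # Toy inputs: targets parsed from SMILES, queries parsed from SMARTS.
    targets = [Chem.MolFromSmiles(s) for s in ("CCO", "c1ccccc1O", "CC(=O)O")]
    queries = [Chem.MolFromSmarts(s) for s in ("[OX2H]", "c1ccccc1")]

    # api is left at its default (hasSubstructMatch, per the signature).
    result = tune_substructure(targets, queries, **example_tuning_kwargs())
    return result.best_config
```

Leaving api, maxMatches, and uniquify at their defaults tunes the hasSubstructMatch entry point; to tune a different entry point, pass the corresponding function as api.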