tune_embed_molecules

nvmolkit.autotune.tune_embed_molecules(
molecules: list[Mol],
params: EmbedParameters,
*,
confsPerMolecule: int = 1,
maxIterations: int = -1,
gpuIds: Iterable[int] | None = None,
calibration_set: Iterable[int] | None = None,
calibration_fraction: float = 0.1,
calibration_max_size: int = 2000,
target_seconds_per_trial: float = 10.0,
n_trials: int = 30,
search_space_overrides: dict[str, Any] | None = None,
cpu_budget: int | None = None,
sampler: Any = None,
seed: int | None = None,
verbose: bool = False,
) → TuneResult

Tune HardwareOptions for EmbedMolecules() on this hardware.

The tuner runs Optuna trials. Each trial works on fresh clones of the calibration molecules, so embedding can be re-run repeatedly without altering the inputs. The HardwareOptions held in the returned TuneResult (best_config) is suitable for direct use on the full workload.

Parameters:
  • molecules – Full workload of RDKit molecules. Calibration trials are run on a (possibly auto-subsampled) slice of these molecules.

  • params – ETKDG EmbedParameters to use during tuning. Must satisfy the same constraints as EmbedMolecules().

  • confsPerMolecule – Conformers per molecule passed to each trial.

  • maxIterations – maxIterations argument forwarded to each trial.

  • gpuIds – GPU device IDs to use. Fixed across the study; the search never varies GPU selection. None lets nvMolKit pick all GPUs.

  • calibration_set – Optional explicit indices into molecules to use for trials. When None, a representative slice is auto-sampled.

  • calibration_fraction – Fraction of the workload to auto-sample.

  • calibration_max_size – Cap on the auto-sampled calibration size.

  • target_seconds_per_trial – Target wall-clock budget for one trial, in seconds. The warm-up phase shrinks the calibration set when a run with the default configuration takes more than twice this value.

  • n_trials – Number of Optuna trials to run after warm-up.

  • search_space_overrides – Optional mapping that overrides the default ranges. Recognized keys: batchSize, batchesPerGpu, preprocessingThreads.

  • cpu_budget – Optional explicit cap on total CPU threads. The default (None) uses os.cpu_count(). Set this when normalizing tuning runs across machines with different core counts so the search space stays comparable.

  • sampler – Optional Optuna sampler to use.

  • seed – Seed for the default sampler when sampler is None.

  • verbose – Print warm-up and trial diagnostics.
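
The calibration parameters above can be illustrated with a small helper that builds an explicit index list for calibration_set. This is a minimal sketch, not nvMolKit's own auto-sampling logic: the helper name pick_calibration_indices is hypothetical, it draws a plain uniform random sample, and it assumes the size is the rounded fraction of the workload capped at the maximum size.

```python
import random


def pick_calibration_indices(n_molecules, fraction=0.1, max_size=2000, seed=0):
    """Return sorted, unique indices for a uniform random calibration subset.

    Hypothetical helper: mirrors the roles of calibration_fraction and
    calibration_max_size, but is not the library's internal sampler.
    """
    if n_molecules <= 0:
        return []
    # Rounded fraction of the workload, capped at max_size and at the
    # workload size, and never smaller than one molecule.
    size = max(1, min(n_molecules, max_size, round(fraction * n_molecules)))
    rng = random.Random(seed)  # local RNG keeps the selection reproducible
    return sorted(rng.sample(range(n_molecules), size))
```

The resulting list can be passed as calibration_set when the same subset should be reused across tuning runs.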

Returns:

TuneResult with best_config set to a fully-populated HardwareOptions instance.
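
A typical workflow would tune once and then reuse the result for the full embedding run. The sketch below is hedged accordingly: the wrapper tune_then_embed is hypothetical, and the import path nvmolkit.embedMolecules, the hardwareOptions keyword of EmbedMolecules(), and the useRandomCoords requirement are assumptions that should be checked against the installed nvMolKit version. Only the tune_embed_molecules keywords and best_config come from the signature documented here.

```python
def tune_then_embed(smiles, confs_per_molecule=5, n_trials=30, seed=0):
    """Tune HardwareOptions on a workload, then embed it with the tuned options."""
    # Imports are deferred so the module loads on machines without RDKit or a GPU.
    from rdkit import Chem
    from rdkit.Chem.rdDistGeom import ETKDGv3
    from nvmolkit.autotune import tune_embed_molecules
    from nvmolkit.embedMolecules import EmbedMolecules  # assumed import path

    # The sketch assumes every SMILES string parses; invalid input yields None.
    mols = [Chem.AddHs(Chem.MolFromSmiles(s)) for s in smiles]

    params = ETKDGv3()
    params.useRandomCoords = True  # assumed constraint of EmbedMolecules()

    # Search for good hardware settings on a calibration slice of the workload.
    result = tune_embed_molecules(
        mols,
        params,
        confsPerMolecule=confs_per_molecule,
        n_trials=n_trials,
        seed=seed,
    )

    # Assumption: EmbedMolecules() takes the tuned options as `hardwareOptions`.
    EmbedMolecules(
        mols,
        params,
        confsPerMolecule=confs_per_molecule,
        hardwareOptions=result.best_config,
    )
    return mols, result
```

Passing the same confsPerMolecule to the tuner and to the final run keeps the tuned settings representative of the real workload.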