tune_embed_molecules#

nvmolkit.autotune.tune_embed_molecules( molecules: list[Mol], params: EmbedParameters, *, confsPerMolecule: int = 1, maxIterations: int = -1, gpuIds: Iterable[int] | None = None, calibration_set: Iterable[int] | None = None, calibration_fraction: float = 0.1, calibration_max_size: int = 2000, target_seconds_per_trial: float = 10.0, n_trials: int = 30, search_space_overrides: dict[str, Any] | None = None, cpu_budget: int | None = None, sampler: Any = None, seed: int | None = None, verbose: bool = False, ) → TuneResult#

Tune HardwareOptions for EmbedMolecules() on this hardware.

The tuner runs Optuna trials, each cloning the calibration molecules fresh so embedding can be re-run repeatedly. The returned HardwareOptions is suitable for direct use on the full workload.

Parameters:

molecules – Full workload of RDKit molecules. Calibration trials are run on a (possibly auto-subsampled) slice of these molecules.
params – ETKDG EmbedParameters to use during tuning. Must satisfy the same constraints as EmbedMolecules().
confsPerMolecule – Conformers per molecule passed to each trial.
maxIterations – maxIterations argument forwarded to each trial.
gpuIds – GPU device IDs to use. Fixed across the study; the search never varies GPU selection. None lets nvMolKit pick all GPUs.
calibration_set – Optional explicit indices into molecules to use for trials. When None, a representative slice is auto-sampled.
calibration_fraction – Fraction of the workload to auto-sample.
calibration_max_size – Cap on the auto-sampled calibration size.
target_seconds_per_trial – Target wall-clock budget for one trial. The warm-up phase shrinks the calibration when the default exceeds twice this value.
n_trials – Number of Optuna trials to run after warm-up.
search_space_overrides – Optional mapping that overrides the default ranges. Recognized keys: batchSize, batchesPerGpu, preprocessingThreads.
cpu_budget – Optional explicit cap on total CPU threads. The default (None) uses os.cpu_count(). Set this when normalizing tuning runs across machines with different core counts so the search space stays comparable.
sampler – Optional Optuna sampler to use.
seed – Seed for the default sampler when sampler is None.
verbose – Print warm-up and trial diagnostics.

Returns:

TuneResult with best_config set to a fully-populated HardwareOptions instance.