How good are these predictions?

Every model we serve has been validated by 5-fold cross-validation on held-out compounds. Here's what that looks like — compound by compound.
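
For readers who want to picture the procedure, here is a minimal sketch of a 5-fold split in which every compound is held out exactly once. It assumes a simple random split; this page does not say whether folds are drawn randomly or grouped by scaffold, and the function name and compound IDs are hypothetical.

```python
import random


def k_fold_splits(compound_ids, k=5, seed=0):
    """Yield (train_ids, held_out_ids) pairs; each compound is held out once."""
    ids = list(compound_ids)
    random.Random(seed).shuffle(ids)          # assumption: plain random split
    folds = [ids[i::k] for i in range(k)]     # k roughly equal folds
    for i, held_out in enumerate(folds):
        train = [c for j, fold in enumerate(folds) if j != i for c in fold]
        yield train, held_out


# Hypothetical compound identifiers, for illustration only
compounds = [f"cmpd_{n:03d}" for n in range(23)]
splits = list(k_fold_splits(compounds))

for train, held_out in splits:
    print(len(train), "training compounds,", len(held_out), "held out")
```

A model evaluated this way is always scored on compounds it did not see during fitting, which is what "held-out" refers to above.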

Headline result

A zero-shot prediction from crystal structure alone

Nilotinib on BCR-Abl: 12% error, zero-shot

Our GKSL (Gorini-Kossakowski-Sudarshan-Lindblad) physics model predicted a residence time of 178 minutes for nilotinib on BCR-Abl, starting only from the crystal structure and using no kinetic training data for this compound. The measured value is 201 minutes (Tiwary et al., 2017).

This is the physics layer working on its own. The ML layer adds compound-specific accuracy across thousands of additional compounds.

Nilotinib → BCR-Abl
Crystal structure input
GKSL physics model
Predicted: 178 min
Measured: 201 min
12% error
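
The error figure depends on the convention used, and this page does not state which one applies. The sketch below computes a few common conventions from the two rounded values quoted above. With those inputs, error relative to the measured value comes out at about 11%, while the symmetric (midpoint) and log-ratio conventions both give about 12%. The function and key names are illustrative.

```python
import math


def error_metrics(predicted, measured):
    """Return several common error conventions for one prediction."""
    diff = abs(predicted - measured)
    return {
        "relative_to_measured": diff / measured,
        "relative_to_midpoint": diff / ((predicted + measured) / 2),
        "abs_log_ratio": abs(math.log(predicted / measured)),
        "fold_error": max(predicted, measured) / min(predicted, measured),
    }


# Residence times in minutes, as quoted on this page
metrics = error_metrics(predicted=178.0, measured=201.0)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```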

How does this compare?

Speed, cost, and accuracy vs. alternative methods

| Method | Speed | Cost per compound | Accuracy | Requires |
| --- | --- | --- | --- | --- |
| SPR experiment | 2–5 days | $1,000–5,000 | Ground truth | Protein, instrument, expertise |
| MD unbinding simulation | 4–48 hours | $500–5,000 GPU | ×3–10 | PhD expertise, GPU cluster |
| Pleco Kineτics | < 1 second | $2 | ×3 (R² 0.78) | SMILES string |
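
To make the accuracy column concrete, here is a small sketch of how a fold error and an R² value could be computed. It assumes that "×3" means predictions typically fall within a factor of 3 of the measured value and that R² is taken on log10 residence times; neither convention is confirmed on this page. The numbers are illustrative only, not validation data.

```python
import math


def fold_errors(predicted, measured):
    """Fold error per compound: ratio of the larger to the smaller value (>= 1)."""
    return [max(p, m) / min(p, m) for p, m in zip(predicted, measured)]


def r_squared_log10(predicted, measured):
    """Coefficient of determination on log10-transformed values."""
    y = [math.log10(m) for m in measured]
    y_hat = [math.log10(p) for p in predicted]
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Illustrative residence times in minutes (made up for this example)
measured = [1.0, 10.0, 100.0, 1000.0]
predicted = [2.0, 5.0, 100.0, 4000.0]

fe = fold_errors(predicted, measured)
within_3x = sum(f <= 3.0 for f in fe) / len(fe)
r2 = r_squared_log10(predicted, measured)

print("fold errors:", fe)
print("fraction within 3-fold:", within_3x)
print("R^2 (log10 scale):", round(r2, 3))
```

Working on a log scale is a common choice for kinetic quantities that span several orders of magnitude, since a 2-fold miss at 1 minute and at 1,000 minutes then count equally.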

Method transparency

Two layers work together to give interpretable, accurate predictions

Layer 1: GKSL physics

The Lindblad (GKSL) master equation captures how binding mode determines the dominant contribution to residence time: whether a drug occupies the ATP site with the DFG motif in place (type I) or binds the flipped, DFG-out conformation (type II). This gives us the mechanism, and explains why type II inhibitors tend to have longer residence times than type I.
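
For reference, the general textbook form of the GKSL (Lindblad) master equation is shown below. How the Hamiltonian H, the jump operators L_k, and the rates γ_k are assigned to binding states is not described on this page, so the symbols here are the generic ones rather than Pleco's specific parameterization. The second line is the standard definition of residence time as the inverse of the dissociation rate constant.

```latex
\frac{d\rho}{dt}
  = -\frac{i}{\hbar}\,[H, \rho]
  + \sum_k \gamma_k \left( L_k \rho L_k^{\dagger}
  - \tfrac{1}{2}\left\{ L_k^{\dagger} L_k,\ \rho \right\} \right)

\tau_{\mathrm{res}} = \frac{1}{k_{\mathrm{off}}}
```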

Layer 2: ML chemistry

The inputs are ECFP4 molecular fingerprints (2048 bits), molecular weight, and 7 physicochemical descriptors. XGBoost and Random Forest models are trained on these features using 1,000+ compounds per target. This layer captures the compound-specific variation that physics alone can't predict, which is where the accuracy comes from.

What the model sees: The molecular graph, with atoms, bonds, and ring systems encoded as a circular fingerprint. This is supplemented with MW and the seven descriptors: logP, TPSA, H-bond donors, H-bond acceptors, rotatable bonds, aromatic rings, and sp3 fraction.

What it doesn't see: No crystal structure. No protein sequence. No binding pose. No explicit 3D coordinates. The model generalizes from the chemical fingerprint alone.
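
As a rough sketch of the resulting input, the fingerprint bits and descriptors can be laid out as one flat vector of 2048 + 8 = 2056 values. The ordering, the descriptor key names, and the `assemble_features` helper are assumptions made for illustration. In practice the bits and descriptors would come from a cheminformatics toolkit such as RDKit, and the values below are placeholders rather than real descriptors for any compound.

```python
N_FP_BITS = 2048  # ECFP4 fingerprint length stated above

# MW plus the seven descriptors listed above (key names are hypothetical)
DESCRIPTOR_NAMES = (
    "mw", "logp", "tpsa", "hbd", "hba",
    "rotatable_bonds", "aromatic_rings", "fraction_sp3",
)


def assemble_features(on_bits, descriptors):
    """Build one flat feature vector from set-bit indices and a descriptor dict."""
    fp = [0.0] * N_FP_BITS
    for bit in on_bits:
        if not 0 <= bit < N_FP_BITS:
            raise ValueError(f"bit index out of range: {bit}")
        fp[bit] = 1.0
    missing = [name for name in DESCRIPTOR_NAMES if name not in descriptors]
    if missing:
        raise KeyError(f"missing descriptors: {missing}")
    return fp + [float(descriptors[name]) for name in DESCRIPTOR_NAMES]


# Placeholder inputs, for illustration only
example_bits = [5, 130, 2047]
example_descriptors = {
    "mw": 300.0, "logp": 2.5, "tpsa": 75.0, "hbd": 2, "hba": 5,
    "rotatable_bonds": 4, "aromatic_rings": 2, "fraction_sp3": 0.25,
}

features = assemble_features(example_bits, example_descriptors)
print(len(features))
```

Nothing in this vector refers to the protein, which is consistent with the list above: the target enters only through which per-target model the vector is passed to.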

The training data is open under CC BY-SA

Browse it. Verify it. Suggest corrections. The prediction engine is the product — the data belongs to the community.

Browse the dataset →

Preprint

Preprint in preparation. Contact hello@pleco.dev for a preview.
