Add your own model architecture#
molflux.modelzoo ships with a large catalogue of model architectures, but at some point you may want to experiment
with your own model architecture and make it available to other users of the package. This guide goes through
several ways of doing this, depending on the level of integration that you would like to achieve.
A model that quacks like a duck#
To integrate with the rest of the ecosystem, your model architecture simply needs to behave in the same way as any
other molflux.modelzoo model architecture. To do this, you need to define a class that implements one or more of the
interfaces advertised by molflux.modelzoo in molflux.modelzoo.protocols.
To define a new estimator, for instance, we simply need to fill in the Estimator protocol:
import os
import pickle
from typing import Any, Dict, Optional

from molflux.modelzoo.typing import Features, DataFrameLike, PredictionResult


class MyEstimator:
    def __init__(
        self,
        tag: Optional[str] = "my_estimator",
        x_features: Optional[Features] = None,
        y_features: Optional[Features] = None,
        alpha: float = 0.5,
        **kwargs: Any
    ) -> None:
        self._tag = tag
        # Everything needed to re-create the model lives in the config
        self._config = {
            "x_features": x_features,
            "y_features": y_features,
            "alpha": alpha,
        }
        # Populated by train() or from_dir()
        self._model = None

    @property
    def metadata(self) -> Dict[str, Any]:
        return {"tag": self._tag, "description": "My supercool model!"}

    @property
    def name(self) -> str:
        return "my_estimator"

    @property
    def tag(self) -> str:
        return self._tag

    @property
    def config(self) -> Dict[str, Any]:
        return self._config

    @property
    def x_features(self) -> Features:
        return self._config.get("x_features")

    @property
    def y_features(self) -> Features:
        return self._config.get("y_features")

    def train(self, train_data: DataFrameLike, **kwargs: Any) -> Any:
        # Placeholder: a real implementation would fit a model here
        self._model = "MODEL"

    def predict(self, data: DataFrameLike, **kwargs: Any) -> PredictionResult:
        # Placeholder: one dummy prediction per input row, keyed as "<tag>::<y_feature>"
        predictions = ["tadaaa!" for _ in range(len(data))]
        return {f"{self._tag}::{y_feature}": predictions for y_feature in self.y_features}

    def as_dir(self, directory: str) -> None:
        # Persist the trained model into the given directory
        pickle_fn = os.path.join(directory, "model.pkl")
        with open(pickle_fn, "wb") as f:
            pickle.dump(self._model, f)

    def from_dir(self, directory: str) -> None:
        # Restore a model previously saved with as_dir()
        pickle_fn = os.path.join(directory, "model.pkl")
        with open(pickle_fn, "rb") as f:
            model = pickle.load(f)
        self._model = model

    def __str__(self) -> str:
        return "This is my supercool model!"
Note
We used some convenience types from molflux.modelzoo.typing to annotate our model, but you could technically
define your class in full without importing molflux.modelzoo at all!
And you can check that your new model architecture does indeed implement the correct protocol as follows:
from molflux.modelzoo.protocols import Estimator
# Check an instance of the class, not the class object itself
ok = isinstance(MyEstimator(), Estimator)
print(ok)
True
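The check above relies on structural typing: a class satisfies a protocol by having the right members, not by inheriting from it. The following self-contained sketch illustrates the idea with a hypothetical SupportsPredict protocol (a stand-in that is not part of molflux.modelzoo), assuming the real protocols are declared in a similar runtime-checkable fashion. Keep in mind that a runtime isinstance check against such a protocol only verifies that the members exist; it does not validate their signatures or return types.

```python
from typing import Any, Dict, List, Protocol, runtime_checkable


@runtime_checkable
class SupportsPredict(Protocol):
    """Hypothetical stand-in protocol, for illustration only."""

    @property
    def tag(self) -> str: ...

    def predict(self, data: Any, **kwargs: Any) -> Dict[str, List[Any]]: ...


class Duck:
    """Implements the protocol without inheriting from it."""

    @property
    def tag(self) -> str:
        return "duck"

    def predict(self, data: Any, **kwargs: Any) -> Dict[str, List[Any]]:
        return {f"{self.tag}::y": ["quack" for _ in data]}


class NotADuck:
    """Has a `tag` but no `predict`, so it does not satisfy the protocol."""

    tag = "goose"


duck_ok = isinstance(Duck(), SupportsPredict)
goose_ok = isinstance(NotADuck(), SupportsPredict)
print(duck_ok, goose_ok)
```

Because only member presence is checked, a class with a wrongly typed predict method would still pass, so it is worth exercising the methods directly as well.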
If everything went well, you can now use your model architecture anywhere a molflux.modelzoo model would be expected!
This first step is useful if you are still prototyping and iterating on your model. You can now easily swap it in
anywhere in the molflux ecosystem, see what works, see what doesn’t, and iterate on your model architecture.
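While iterating, one quick sanity check is a save/load round trip: train, predict, persist with as_dir, restore with from_dir, and confirm that the predictions match. The sketch below uses a hypothetical TinyEstimator and plain dictionaries as data so that it runs without molflux installed; the same pattern should apply to a class like MyEstimator, assuming its as_dir and from_dir methods behave as defined earlier.

```python
import os
import pickle
import tempfile
from typing import Any, Dict, List, Optional


class TinyEstimator:
    """Hypothetical stand-in with the same train/predict/as_dir/from_dir surface."""

    def __init__(self, tag: str = "tiny", y_features: Optional[List[str]] = None) -> None:
        self.tag = tag
        self.y_features = y_features or []
        self._model: Any = None

    def train(self, train_data: Dict[str, List[float]], **kwargs: Any) -> None:
        # "Training" here just memorises the mean of the first target column
        values = train_data[self.y_features[0]]
        self._model = sum(values) / len(values)

    def predict(self, data: Dict[str, List[float]], **kwargs: Any) -> Dict[str, List[Any]]:
        # One prediction per input row, keyed as "<tag>::<y_feature>"
        n_rows = len(next(iter(data.values())))
        return {f"{self.tag}::{y}": [self._model] * n_rows for y in self.y_features}

    def as_dir(self, directory: str) -> None:
        with open(os.path.join(directory, "model.pkl"), "wb") as f:
            pickle.dump(self._model, f)

    def from_dir(self, directory: str) -> None:
        with open(os.path.join(directory, "model.pkl"), "rb") as f:
            self._model = pickle.load(f)


train_data = {"x": [1.0, 2.0, 3.0], "y": [2.0, 4.0, 6.0]}

model = TinyEstimator(y_features=["y"])
model.train(train_data)
before = model.predict({"x": [10.0, 20.0]})

# Save to a temporary directory, then restore into a fresh instance
with tempfile.TemporaryDirectory() as tmp_dir:
    model.as_dir(tmp_dir)
    restored = TinyEstimator(y_features=["y"])
    restored.from_dir(tmp_dir)

after = restored.predict({"x": [10.0, 20.0]})
print(before == after)
```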
Adding your model to your local catalogue#
You may have noticed that while you can now feed your model to any function expecting molflux.modelzoo models, it does
not yet show up in the molflux.modelzoo catalogue of available model architectures.
To add it to the catalogue, you can decorate your class as follows:
from molflux.modelzoo import register_model


@register_model(kind="custom", name="my_estimator")
class MyEstimator:
    ...
Et voilà! The model now appears in the catalogue (as my_estimator, under a custom category), and you can load it
like any other native model architecture:
from molflux.modelzoo import list_models, load_model
catalogue = list_models()
print(catalogue)
# {..., 'custom': ['my_estimator'], ...}
model = load_model("my_estimator")
print(model)
This step is useful once you have tested the low-level behaviour of your model, and you would like to test the
higher-level, config-driven integration with the rest of the molflux ecosystem.
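To build some intuition for what a registration decorator does, here is a minimal sketch of a decorator-based catalogue. It is not the actual molflux.modelzoo implementation: the names register, list_registered, load, and MyToyEstimator are hypothetical, and the real register_model, list_models, and load_model may work quite differently internally.

```python
from typing import Any, Callable, Dict, List, Type

# Hypothetical in-memory catalogue, shaped as {kind: {name: class}}
_REGISTRY: Dict[str, Dict[str, Type[Any]]] = {}


def register(kind: str, name: str) -> Callable[[Type[Any]], Type[Any]]:
    """Return a class decorator that records the class under (kind, name)."""

    def decorator(cls: Type[Any]) -> Type[Any]:
        _REGISTRY.setdefault(kind, {})[name] = cls
        return cls  # the class itself is returned unchanged

    return decorator


def list_registered() -> Dict[str, List[str]]:
    """List the registered names, grouped by kind."""
    return {kind: sorted(names) for kind, names in _REGISTRY.items()}


def load(name: str, **config: Any) -> Any:
    """Instantiate a registered class by name, forwarding its config."""
    for names in _REGISTRY.values():
        if name in names:
            return names[name](**config)
    raise KeyError(f"No model named {name!r} in the catalogue")


@register(kind="custom", name="my_estimator")
class MyToyEstimator:
    def __init__(self, alpha: float = 0.5) -> None:
        self.alpha = alpha


catalogue = list_registered()
model = load("my_estimator", alpha=0.1)
print(catalogue)
```

In this sketch, registration happens as a side effect of the class definition being executed, so the module that defines the class has to be imported before the name can be looked up.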