Add your own representations#
Even though molflux.features
comes with built-in representations, you may want to add your own feature extractor.
In this guide, we will go through two ways you can do this.
Temporary representations#
The first method is by registering your representation temporarily. This is useful in cases when you are still prototyping and would like to have easy access to your representation code and modify it.
For a representation to be available in molflux.features
, it must have a representation class. The basic skeleton looks like this
from typing import Any, Dict, List
from molflux.features.catalogue import register_representation
from molflux.features.bases import RepresentationBase
from molflux.features.info import RepresentationInfo
from molflux.features.typing import MolArray
_DESCRIPTION = """
My new representation.
"""
@register_representation(kind = "custom", name = "your_rep_name")
class YourRepName(RepresentationBase):
def _info(self) -> RepresentationInfo:
return RepresentationInfo(
description=_DESCRIPTION,
)
def _featurise(
self,
samples: MolArray,
**kwargs: Any,
) -> Dict[str, Any]:
""" compute your features!"""
my_computed_features = ["bla, bla, bla"]
feature_key = self.tag
return {feature_key: my_computed_features}
from molflux.features import load_representation
rep = load_representation(name="your_rep_name")
print(rep)
Representation(
name: "your_rep_name",
tag: "your_rep_name",
signature: self.featurise(samples: Union[collections.abc.Iterable[str], collections.abc.Iterable[Any], collections.abc.Iterable[bytes]], **kwargs: Any) -> Dict[str, Any],
description: """
My new representation.
""",
usage: """ compute your features!"""
state: {}
)
Let’s break this down.
Start by creating a class (you can name it whatever you like, the class name has no impact on
how the representation will appear in molflux.features
). This class must inherit RepresentationBase
from molflux.features
.
Next, since we are adding this representation temporarily, you need to register it (basically make it available in molflux.features
) by
adding the decorator register_representation
to the class. Here, you specify the kind and name
of your representation.
This is how it will appear in the molflux.features
catalogue and name
is how it will be loaded.
Next, you should also add a description to your representation. This is done by adding an _info
method which returns
a RepresentationInfo
object.
Finally, you need to implement the _featurise
method. This is the method used to compute the features. There are
two requirements:
Signature: The
_featurise
method must have the following argumentsA
samples
argument of typeMolArray
. This is the list of molecules to be featurised.After that, you can add all the optional arguments for featurisation with default values.
Finally, you need to have
**kwargs: Any
(this is for mypy type checking reasons).
Return type: The
_featurise
method must return a dictionary of the computed features with string identifiers as keys (can be anything but it is recommended that you useself.tag
, the representation tag which is automatically generated) and the features as values. You can return multiple features from the same representation as a dictionary with multiple key, value pairs.
Et Voilà ! As you can see from the python cell above, the representation was loaded using molflux.features
!
Permanent representations#
If you have settled on a representation and would like to have it as part of a package, you can still use the above method
to do this (add a script to your repo with the above code). If you do this, your representation will only be available
in molflux.features
if you have imported your package (and also imported the representation class from your package).
To avoid this, we strongly recommended that you add your representation as a plugin (to learn more about plugins see
here). Using this method, your representation
will appear and be available in molflux.features
automatically (you need to have it pip installed in the same environment).
To begin with, add your representation class to your repo (this can be any repo, not necessarily molflux.features
). Next, remove the
register_representation
decorator (it’s no longer needed).
from typing import Any, Dict, List
from molflux.features.bases import RepresentationBase
from molflux.features.info import RepresentationInfo
from molflux.features.typing import MolArray
_DESCRIPTION = """
My new representation.
"""
class YourRepName(RepresentationBase):
def _info(self) -> RepresentationInfo:
return RepresentationInfo(
description=_DESCRIPTION,
)
def _featurise(
self,
samples: MolArray,
**kwargs: Any,
) -> Dict[str, Any]:
""" compute your features!"""
my_computed_features = ["bla, bla, bla"]
feature_key = self.tag
return {feature_key: my_computed_features}
Now, go to the pyproject.toml
file of your repo and under [project.entry-points.'molflux.features.plugins.pluging_name']
,
add a plugin to your representation class as follows
[project.entry-points.'molflux.features.plugins.plugin_name']
name_of_representation = 'path.to.module.file:YourRepName'
Note
You can also do this in the setup.cfg
file of your repo and under [options.entry_points]
. Add a plugin to your
representation class as follows
[options.entry_points]
molflux.features.plugins.kind_of_representation =
name_of_representation = path.to.module.file:YourRepName
Let’s break this down. This entry point allows molflux.features
to detect your representation (you need to have both
molflux
and your package installed in the same environment, the order of installation does not matter). In the plugin definition,
you can specify the kind of your representation. Next, you specify the path to the class of your representation. Here you
define the name of your representation.
And that’s it! You can now find your representation available in molflux.features
.