relevanceai.operations_new.cluster.sub.transform

Subclustering operation

Module Contents

class relevanceai.operations_new.cluster.sub.transform.SubClusterTransform(model, alias: str, vector_fields, parent_field, filters: Optional[list] = None, cluster_ids: Optional[list] = None, min_parent_cluster_size: Optional[int] = None, model_kwargs: Optional[dict] = None, cluster_field: str = '_cluster_', outlier_value: Union[int, str] = -1, outlier_label: str = 'outlier', **kw)

To write your own operation, you need to add:

- name
- transform
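The two requirements above can be illustrated with a minimal, stand-alone sketch. The class below is hypothetical and deliberately does not inherit from any relevanceai base class, since the base class is not shown on this page; `LowercaseTitleOperation`, the `title` field, and the choice of a plain string for `name` are all assumptions made for illustration.

```python
from typing import Any, Dict, List


class LowercaseTitleOperation:
    """Hypothetical stand-alone operation; not part of relevanceai."""

    # Identifier for the operation (assumed here to be a plain string).
    name = "lowercase_title"

    def transform(self, documents: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        # Return updated copies so the caller's documents are left untouched.
        updated = []
        for doc in documents:
            new_doc = dict(doc)
            if "title" in new_doc:
                new_doc["title"] = str(new_doc["title"]).lower()
            updated.append(new_doc)
        return updated


docs = [{"_id": "a", "title": "Hello World"}, {"_id": "b"}]
result = LowercaseTitleOperation().transform(docs)
```

In a real operation the class would presumably subclass the library's own operation base so that it plugs into the rest of the pipeline; that wiring is outside the scope of this sketch.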

transform(self, documents)

Takes a list of documents, runs each document through every model in the pipeline, and returns the updated documents.

Parameters

documents (List[Dict[str, Any]]) – The documents to transform.

Return type

A list of dictionaries.
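As a rough illustration of what a subclustering transform over a list of dictionaries might look like, the sketch below groups documents by a parent cluster label, clusters each sufficiently large group separately, and writes a combined label back to each document. It is not the library's implementation: the function `subcluster`, the toy model `sign_of_first_component`, the flat field names, and the `"parent-sub"` label format are all assumptions; only the parameter names `parent_field`, `cluster_field`, and `min_parent_cluster_size` are borrowed from the signature above.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional


def subcluster(
    documents: List[Dict[str, Any]],
    parent_field: str,
    vector_field: str,
    cluster_field: str,
    fit_predict: Callable[[List[List[float]]], List[int]],
    min_parent_cluster_size: Optional[int] = None,
) -> List[Dict[str, Any]]:
    # Group document positions by their parent cluster label.
    groups: Dict[Any, List[int]] = defaultdict(list)
    for index, doc in enumerate(documents):
        groups[doc[parent_field]].append(index)

    # Work on shallow copies so the input documents are not modified.
    updated = [dict(doc) for doc in documents]
    for parent_label, indices in groups.items():
        # Skip parent clusters that are too small to split further.
        if min_parent_cluster_size is not None and len(indices) < min_parent_cluster_size:
            continue
        vectors = [documents[i][vector_field] for i in indices]
        sub_labels = fit_predict(vectors)
        for i, sub_label in zip(indices, sub_labels):
            # Assumed label format: "<parent>-<sub>".
            updated[i][cluster_field] = f"{parent_label}-{sub_label}"
    return updated


def sign_of_first_component(vectors: List[List[float]]) -> List[int]:
    # Toy stand-in for a clustering model: split on the sign of the first value.
    return [0 if vector[0] < 0 else 1 for vector in vectors]


docs = [
    {"_id": "a", "parent": "0", "vec": [-1.0, 0.2]},
    {"_id": "b", "parent": "0", "vec": [2.0, 0.1]},
    {"_id": "c", "parent": "1", "vec": [0.5, 0.5]},
]
result = subcluster(
    docs, "parent", "vec", "sub", sign_of_first_component, min_parent_cluster_size=2
)
```

Here documents `a` and `b` share a parent cluster of size two and each receive a sublabel, while `c` sits alone in its parent cluster and is returned without one. Any model exposing a `fit_predict`-style callable could be substituted for the toy function.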

store_subcluster_metadata(self, parent_field: str, cluster_field: str)

Store subcluster metadata