relevanceai.operations_new.cluster.sub.ops
Module Contents
- class relevanceai.operations_new.cluster.sub.ops.SubClusterOps(model, alias: str, vector_fields, parent_field, dataset_id, filters: Optional[list] = None, cluster_ids: Optional[list] = None, min_parent_cluster_size: int = 0, model_kwargs: Optional[dict] = None, cluster_field: str = '_cluster_', outlier_value: Union[int, str] = -1, **kw)
To write your own operation, you need to define two things: name and transform.
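As a rough illustration of the two pieces named above, here is a minimal stand-alone sketch. It deliberately does not import relevanceai: the class `LowercaseTitleOps`, the choice of a class attribute for `name`, and the assumption that `transform` receives and returns a list of document dicts are all hypothetical and may differ from the library's actual base class.

```python
from typing import Any, Dict, List


class LowercaseTitleOps:
    """Hypothetical operation for illustration; not part of relevanceai."""

    # Assumption: `name` is a short identifier for the operation.
    name = "lowercase_title"

    def transform(self, documents: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        # Assumption: transform maps a list of documents to a list of documents.
        # New dicts are built so the input documents are left unmodified.
        return [
            {**doc, "title_lower": str(doc.get("title", "")).lower()}
            for doc in documents
        ]


docs = [{"_id": "1", "title": "Hello World"}]
transformed = LowercaseTitleOps().transform(docs)
```

In this sketch the operation carries no dataset logic of its own; fetching and writing documents is left to whatever calls `transform`.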
- run(self, *args, **kwargs)
Takes a dataset, retrieves all of its documents, transforms them, and upserts the transformed documents.
- Parameters
dataset (Dataset) – The dataset whose documents are retrieved, transformed, and upserted.
select_fields (list) – Used to determine which fields to retrieve for filters
output_fields (list) – Used to determine which output fields are missing, so that the operation can continue running
filters (list) – Optional list of filters. Defaults to None.
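The retrieve, transform, upsert flow described for run can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `InMemoryDataset`, its `get_all_documents` and `upsert_documents` methods, and the free function `run` are hypothetical stand-ins, and the handling of `output_fields` and `filters` is omitted.

```python
from typing import Any, Callable, Dict, List, Optional

Document = Dict[str, Any]


class InMemoryDataset:
    """Hypothetical stand-in for a dataset; not the relevanceai Dataset class."""

    def __init__(self, documents: List[Document]) -> None:
        self._docs = {doc["_id"]: dict(doc) for doc in documents}

    def get_all_documents(self, select_fields: Optional[List[str]] = None) -> List[Document]:
        if select_fields is None:
            return [dict(doc) for doc in self._docs.values()]
        # Always keep _id so transformed documents can be matched on upsert.
        keep = set(select_fields) | {"_id"}
        return [{k: v for k, v in doc.items() if k in keep} for doc in self._docs.values()]

    def upsert_documents(self, documents: List[Document]) -> None:
        # Upsert: update existing documents by _id, insert unknown ones.
        for doc in documents:
            self._docs.setdefault(doc["_id"], {}).update(doc)


def run(
    dataset: InMemoryDataset,
    transform: Callable[[List[Document]], List[Document]],
    select_fields: Optional[List[str]] = None,
) -> int:
    documents = dataset.get_all_documents(select_fields=select_fields)  # 1. retrieve
    updated = transform(documents)                                      # 2. transform
    dataset.upsert_documents(updated)                                   # 3. upsert
    return len(updated)


def add_length(documents: List[Document]) -> List[Document]:
    # Example transform: emit only the _id and the newly computed field.
    return [{"_id": d["_id"], "text_length": len(d["text"])} for d in documents]


dataset = InMemoryDataset([
    {"_id": "a", "text": "Hi", "lang": "en"},
    {"_id": "b", "text": "Hello", "lang": "en"},
])
count = run(dataset, add_length, select_fields=["text"])
result = dataset.get_all_documents()
```

Because the upsert merges by `_id`, fields that were not retrieved (here `lang`) are preserved alongside the new `text_length` field.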
- store_subcluster_metadata(self, parent_field: str, cluster_field: str)
Store subcluster metadata
- get_centroid_documents(self)
- create_subcluster_centroids(self)
- format_subcluster_labels(self, labels: list, parent_labels: str)
- format_subcluster_label(self, label: str, parent_label: str)