relevanceai.operations_new.cluster.sub.ops

Module Contents

class relevanceai.operations_new.cluster.sub.ops.SubClusterOps(model, alias: str, vector_fields, parent_field, dataset_id, filters: Optional[list] = None, cluster_ids: Optional[list] = None, min_parent_cluster_size: int = 0, model_kwargs: Optional[dict] = None, cluster_field: str = '_cluster_', outlier_value: Union[int, str] = -1, **kw)

To write your own operation, you need to add:
  • name

  • transform
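As a rough illustration of that contract, the sketch below defines a standalone operation with a `name` and a `transform`. It does not import relevanceai: `UppercaseOps` and the `text` field are hypothetical, and treating `name` as a property and `transform` as a method that takes and returns a list of documents is an assumption, since the page does not spell out either signature.

```python
from typing import Dict, List


class UppercaseOps:
    """Hypothetical operation for illustration; not part of relevanceai."""

    @property
    def name(self) -> str:
        # Identifier for the operation (assumed to be a plain string).
        return "uppercase"

    def transform(self, documents: List[Dict]) -> List[Dict]:
        # Build new documents with an added field; the inputs are not mutated.
        return [
            {**doc, "text_upper": str(doc.get("text", "")).upper()}
            for doc in documents
        ]


docs = [{"_id": "a", "text": "hello"}]
result = UppercaseOps().transform(docs)
```

Returning new dictionaries rather than editing the inputs in place keeps the transform easy to test in isolation.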

run(self, *args, **kwargs)

Takes a dataset, retrieves all of its documents, transforms them, and upserts the transformed documents back into the dataset.

Parameters
  • dataset (Dataset) – The dataset to run the operation on

  • select_fields (list) – Used to determine which fields to retrieve for filters

  • output_fields (list) – Used to determine which output fields are missing, so that the operation can continue running

  • filters (list) – Filters to apply when retrieving documents. Defaults to None
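The retrieve, transform, upsert flow described for `run` can be sketched without the library. Everything in this block is a stand-in written for illustration: `InMemoryDataset`, its two methods, `run_operation`, and `add_length` are hypothetical, and the real `run` may batch documents, apply filters, or skip documents that already have their output fields.

```python
from typing import Callable, Dict, List, Optional


class InMemoryDataset:
    """Hypothetical stand-in for a dataset; stores documents keyed by `_id`."""

    def __init__(self, documents: List[Dict]):
        self._docs = {doc["_id"]: dict(doc) for doc in documents}

    def get_all_documents(self, select_fields: Optional[List[str]] = None) -> List[Dict]:
        # Return copies so callers cannot mutate the stored documents.
        docs = [dict(doc) for doc in self._docs.values()]
        if select_fields is None:
            return docs
        keep = set(select_fields) | {"_id"}
        return [{k: v for k, v in doc.items() if k in keep} for doc in docs]

    def upsert_documents(self, documents: List[Dict]) -> None:
        # Merge each document into the stored one with the same `_id`,
        # inserting it if no such document exists yet.
        for doc in documents:
            self._docs.setdefault(doc["_id"], {}).update(doc)


def run_operation(
    dataset: InMemoryDataset,
    transform: Callable[[List[Dict]], List[Dict]],
    select_fields: Optional[List[str]] = None,
) -> int:
    documents = dataset.get_all_documents(select_fields=select_fields)  # 1. retrieve
    transformed = transform(documents)                                  # 2. transform
    dataset.upsert_documents(transformed)                               # 3. upsert
    return len(transformed)


def add_length(documents: List[Dict]) -> List[Dict]:
    # Example transform: emit only the `_id` and the new field.
    return [{"_id": d["_id"], "length": len(d["text"])} for d in documents]


dataset = InMemoryDataset([{"_id": "a", "text": "hello"}, {"_id": "b", "text": "hi"}])
count = run_operation(dataset, add_length, select_fields=["text"])
```

Because the upsert merges by `_id`, the transform only needs to return the fields it adds or changes.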

store_subcluster_metadata(self, parent_field: str, cluster_field: str)

Store subcluster metadata.

get_centroid_documents(self)
create_subcluster_centroids(self)
format_subcluster_labels(self, labels: list, parent_labels: str)
format_subcluster_label(self, label: str, parent_label: str)
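The two formatting methods are undocumented here, so the following is only a guess at their intent based on the signatures: combining a parent cluster label with a subcluster label into one string. The standalone functions, the `-` separator, and the parent-first ordering are all assumptions and may not match what the library produces.

```python
from typing import List


def format_subcluster_label(label: str, parent_label: str, separator: str = "-") -> str:
    # Assumed format: "<parent_label><separator><label>".
    return f"{parent_label}{separator}{label}"


def format_subcluster_labels(labels: List[str], parent_labels: str, separator: str = "-") -> List[str]:
    # Following the documented signature, `parent_labels` is a single string
    # that is applied to every label in the list.
    return [format_subcluster_label(label, parent_labels, separator) for label in labels]


formatted = format_subcluster_labels(["0", "1"], "2")
```

Prefixing with the parent label keeps subcluster labels unique across parents, so subcluster `0` of parent `2` cannot be confused with subcluster `0` of parent `3`.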