relevanceai.operations_new.label.transform

Labelling performs a vector search over the label vectors and, for each document, fetches the closest labels, up to max_number_of_labels.

Module Contents

class relevanceai.operations_new.label.transform.LabelTransform(vector_field: str, label_documents: list, expanded: bool = True, max_number_of_labels: int = 1, similarity_metric: str = 'cosine', similarity_threshold: float = 0.1, label_field='label', label_vector_field='label_vector_', output_field: str = '_label_', **kwargs)

To write your own operation, you need to add:
  • name
  • transform
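The two requirements above can be illustrated with a minimal sketch. The base class here, OperationSketch, is a hypothetical stand-in written for this example and is not the library's actual base class; UppercaseTitle and its field names are likewise made up. It only shows the shape of an operation that supplies a name property and a transform method.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class OperationSketch(ABC):
    """Hypothetical stand-in for an operation base class (not the library's)."""

    @property
    @abstractmethod
    def name(self) -> str:
        """Short identifier for the operation."""

    @abstractmethod
    def transform(self, documents: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """Return the documents with the operation applied."""


class UppercaseTitle(OperationSketch):
    """Toy operation: writes an upper-cased copy of a text field."""

    def __init__(self, text_field: str = "title", output_field: str = "title_upper"):
        self.text_field = text_field
        self.output_field = output_field

    @property
    def name(self) -> str:
        return "uppercase_title"

    def transform(self, documents: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        for document in documents:
            # Skip documents that do not carry the text field.
            if self.text_field in document:
                document[self.output_field] = str(document[self.text_field]).upper()
        return documents


op = UppercaseTitle()
result = op.transform([{"title": "red shoes"}, {"price": 3}])
```

Because both members are abstract in the stand-in base class, a subclass that omits either one cannot be instantiated, which mirrors the requirement stated above.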

transform(self, documents) → List[Dict[str, Any]]

Gets the vectors of all documents and searches across the label vectors.

Parameters
  • documents – the documents to be labeled

  • label_documents – the documents that contain the labels

Example

ds = client.Dataset(...)
# label an entire dataset
ds.label(
    vector_field="sample_1_vector_",
    label_documents=[
        {
            "label": "value",
            "price": 0.3,
            "label_vector_": [1, 1, 1]
        },
        {
            "label": "value-2",
            "label_vector_": [2, 1, 1]
        },
    ],
    expanded=True # stored as dict or list
)

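If a label document is missing the “label” field, an error is returned stating that the labels are missing the label field. In that case, write a loop to set the label field on each label document.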
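A loop that sets the label field could look like the sketch below. It assumes the label text currently lives under a different key; the key "category" is hypothetical and chosen only for illustration. The name "label" matches the default label_field in the LabelTransform signature.

```python
# Label documents whose label text is stored under a hypothetical "category" key.
label_documents = [
    {"category": "value", "label_vector_": [1, 1, 1]},
    {"category": "value-2", "label_vector_": [2, 1, 1]},
]

label_field = "label"  # default label_field in the LabelTransform signature

for label_document in label_documents:
    # Only fill in the field where it is absent, so existing labels are kept.
    if label_field not in label_document:
        label_document[label_field] = label_document["category"]

# Sanity check: collect any label documents still lacking the label field.
missing = [d for d in label_documents if label_field not in d]
```

After the loop, missing should be empty and label_documents can be passed to ds.label as in the example above.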

To include every value from a label document along with its similarity, set expanded=True.

Return type

A list of dictionaries.

get_label_document(self, document, *args, **kwargs)

property name(self)

Abstract property for the operation name.

cosine_similarity(self, query_vector, vector_field, documents, reverse=True, score_field: str = '_label_score', max_number_of_labels: int = 1, similarity_threshold: float = 0)

Takes a query vector, a vector field, a list of documents, and a few optional parameters, and returns a list of documents sorted by their cosine similarity to the query vector.

Parameters
  • query_vector – the vector you want to compare against

  • vector_field – the field in the documents that contains the vector

  • documents – list of documents

  • reverse (bool, optional) – True/False; defaults to True

  • score_field (str, optional) – name of the score field; defaults to “_label_score”

  • max_number_of_labels (int, optional) – maximum number of labels to return; defaults to 1

  • similarity_threshold (float, optional) – defaults to 0

Return type

A list of dictionaries.
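As a rough illustration of the behaviour described for cosine_similarity, the sketch below ranks documents by cosine similarity in plain Python. It is not the library's implementation. The function names cosine and rank_by_cosine are made up for this example, and two details are assumptions: documents scoring below similarity_threshold are dropped, and the score is written into a copy of each document under score_field.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_cosine(
    query_vector: Sequence[float],
    vector_field: str,
    documents: List[Dict[str, Any]],
    reverse: bool = True,
    score_field: str = "_label_score",
    max_number_of_labels: int = 1,
    similarity_threshold: float = 0.0,
) -> List[Dict[str, Any]]:
    scored = []
    for document in documents:
        if vector_field not in document:
            continue  # nothing to compare against
        score = cosine(query_vector, document[vector_field])
        # Assumption: scores below the threshold are discarded.
        if score >= similarity_threshold:
            scored.append({**document, score_field: score})
    # reverse=True puts the most similar documents first.
    scored.sort(key=lambda d: d[score_field], reverse=reverse)
    return scored[:max_number_of_labels]


labels = [
    {"label": "value", "label_vector_": [1, 1, 1]},
    {"label": "value-2", "label_vector_": [2, 1, 1]},
]
top = rank_by_cosine([1, 1, 1], "label_vector_", labels, max_number_of_labels=2)
```

With the query [1, 1, 1], the label document "value" ranks first because its vector points in the same direction as the query, and "value-2" follows.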

get_operation_metadata(self) → Dict[str, Any]

Abstract method that returns the metadata for upsertion.