Client#

Client.list_datasets(verbose=False)#

List Datasets

Example

from relevanceai import Client
client = Client()
client.list_datasets()
Client.delete_dataset(dataset_id)#

Delete a dataset

Parameters

dataset_id (str) – The ID of a dataset

Example

from relevanceai import Client
client = Client()
client.delete_dataset("sample_dataset_id")
Client.receive_dataset(dataset_id, sender_project, sender_api_key)#

Receive a dataset from another user.

Example

from relevanceai import Client
client = Client()
client.admin.receive_dataset(
    dataset_id="research",
    sender_project="...",
    sender_api_key="..."
)
Parameters
  • dataset_id (str) – The name of the dataset

  • sender_project (str) – The project name that will send the dataset

  • sender_api_key (str) – The project API key that will send the dataset

Client.send_dataset(dataset_id, receiver_project, receiver_api_key)#

Send an individual a dataset. For this, you must know their API key.

Parameters
  • dataset_id (str) – The name of the dataset

  • receiver_project (str) – The project name that will receive the dataset

  • receiver_api_key (str) – The project API key that will receive the dataset

Example

from relevanceai import Client
client = Client()
client.send_dataset(
    dataset_id="research",
    receiver_project="...",
    receiver_api_key="..."
)
Client.clone_dataset(source_dataset_id, new_dataset_id=None, source_project=None, source_api_key=None, project=None, api_key=None)#

Clone a dataset from another user’s projects into your project.

Parameters
  • source_dataset_id (str) – The dataset to copy

  • new_dataset_id (Optional[str]) – The ID to give the cloned dataset

  • source_project (Optional[str]) – The project to copy from

  • source_api_key (Optional[str]) – The API key of the project to copy from

  • project (Optional[str]) – The project to copy to

  • api_key (Optional[str]) – The API key of the project to copy to

Example

from relevanceai import Client
client = Client()
client.clone_dataset(
    source_dataset_id="research",
    source_project="...",
    source_api_key="..."
)
Client.create_dataset(dataset_id, schema=None)#

A dataset can store documents to be searched, retrieved, filtered and aggregated (similar to Collections in MongoDB, Tables in SQL, Indexes in ElasticSearch). A powerful core feature of RelevanceAI is that you can store both your metadata and vectors in the same document. When specifying the schema of a dataset and inserting your own vectors, use a field name ending with the suffix “_vector_”, and specify the length of the vector in the dataset schema.

These are the field types supported in our datasets: [“text”, “numeric”, “date”, “dict”, “chunks”, “vector”, “chunkvector”].

For example:

{
    "product_text_description" : "text",
    "price" : "numeric",
    "created_date" : "date",
    "product_texts_chunk_": "chunks",
    "product_text_chunkvector_" : 1024
}

You don’t have to specify the schema of every single field when creating a dataset, as RelevanceAI will automatically detect the appropriate data type for each field (vectors are automatically identified by their “_vector_” suffix). In fact, you don’t always have to use this endpoint to create a dataset, as /datasets/bulk_insert will infer and create the dataset and schema as you insert new documents.

Note

  • A dataset name/id can only contain lowercase letters, dashes, underscores and numbers.

  • “_id” is reserved as the key and id of a document.

  • Once a schema is set for a dataset it cannot be altered. If it has to be altered, utilise the copy dataset endpoint.

Parameters
  • dataset_id (str) – The unique name of your dataset

  • schema (dict) – Schema specifying which fields are vectors and their lengths

Example

from relevanceai import Client
client = Client()
client.create_dataset("sample_dataset_id")
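Building on the schema description above, a schema dictionary can also be constructed explicitly before creating the dataset. The sketch below uses the illustrative field names from the earlier example; it simply shows how regular fields carry a type string while vector fields (identified by their name suffix) carry an integer vector length.

```python
# Illustrative schema following the format above: type strings for regular
# fields, and an integer vector length for fields ending in "_vector_" or
# "_chunkvector_".
schema = {
    "product_text_description": "text",
    "price": "numeric",
    "created_date": "date",
    "product_texts_chunk_": "chunks",
    "product_text_chunkvector_": 1024,
}

# Vector fields are distinguished purely by their name suffix.
vector_fields = {
    name: length
    for name, length in schema.items()
    if name.endswith("_vector_") or name.endswith("_chunkvector_")
}
print(vector_fields)  # {'product_text_chunkvector_': 1024}
```

This dictionary can then be passed as the schema argument, e.g. client.create_dataset("sample_dataset_id", schema=schema).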
Client.search_datasets(query)#

Search through your list of datasets.

Parameters

query (str) – The search query

Note

This function was introduced in 1.1.3.