πββοΈ Relevance AI for analyzing unstructured data#
Use Relevance AI for clustering and gaining meaning from your unstructured data.
β¨ Example#
An example cluster app that showcases meaning amongst each group of unstructured data With just a few lines of code, youβll get rich, interactive, shareable dashboards

Dashboard screenshot#
π Data & Privacy#
We take security very seriously, and our cloud-hosted dashboard uses industry standard best practices for encryption. Our team adhere to our strict privacy policy.
πͺ Install RelevanceAI
library and authenticate the client#
Start by installing the library and logging in to your account.
!pip install RelevanceAI -qqq
In [1]: %load_ext autoreload
In [2]: %autoreload 2
from relevanceai import Client
# Instantiate the client and authenticate
client = Client()
# This will prompt a link to collect your API token which includes your project and API key
π© Upload Some Data#
1οΈβ£. Open a new Dataset
2οΈβ£. Insert some documents
from relevanceai.utils import example_documents
documents = example_documents("retail_reviews_small", number_of_documents=100)
dataset_id = "retail_reviews"
# The dataset name that we have decided, this can be whatever you want for your own data
dataset = client.Dataset(dataset_id=dataset_id)
# Instantiate the dataset
dataset.insert_documents(documents)
while inserting, you can visit monitor the dataset at https://cloud.tryrelevance.com/dataset/retail_reviews/dashboard/monitor/
β
All documents inserted/edited successfully.
You can view your dataset quickly using dataset.head
just like in
Pandas!
# dataset.head()
π¨βπ¬ Vectorizing And Bringing AI In#
πͺ In order to better visualise clusters within our data, we must
vectorise the unstructured fields in a our clusters. In this dataset,
there are two important text fields, both located in the review body;
These are the reviews.text
and reviews.title
. For the purposes
of this tutorial, we will be vectorizing reviews.text
only.
π€ Choosing a Vectorizer#
An important part of vectorizing text is around choosing which vectorizer to use. Relevance AI allows for a custom vectorizer from vectorhub, but if you canβt decide, the default models for each type of unstructured data are listed below.
Text:
SentenceTransformers
Images:
CLIP
# !pip install -q sentence-transformers
π€© Vectorize in one line#
We support vectorizing text in just 1 line.
# The text fields here are the ones we wish to construct vector representations for
text_fields = ["reviews.text"]
dataset.vectorize_text(fields=text_fields)
Search Application#
You can also build a search application in just 1 line of code.
This search application can be built by using
dataset.launch_search_app()
https://cloud.tryrelevance.com/dataset/retail_reviews/deploy/recent/search
You can view an example of our text search below.

Text Search#
β¨ Cluster#
In one line of code, we can create a cluster application based on our new vector field. This application is how we will discover insights about the semantic groups in our data.
First, let us see what vector fields are availbale in the dataset.
dataset.list_vector_fields()
['reviews.text_all-mpnet-base-v2_vector_']
model = "kmeans"
number_of_clusters = 20
alias = "my_clustering"
vector_fields = dataset.list_vector_fields()
dataset.cluster(vector_fields=vector_fields, model=model, alias=alias)
π The above step will produce a link to your first cluster app!#
Click the link provided to view your newly generated clusters in a dashboard app.
π€ Choosing the Number of Clusters#
Most clustering algorithms require you choose the number clusters you wish to find. This can be tricky if you donβt know what the expect. Luckily, RelevanceAI uses a clustering algorithm called community detection that does not require the number of clusters to be set. Instead, the algorithm will decide how many is right for you. To discover more about other clustering methods, read more in Cluster Report.
π·οΈ Add Labels To Your Dataset#
Labelling refers to when you apply a vector search from one tag to another.
labels = [{"label": "Furniture", "label": "Home office", "label": "Electronics"}]
label_dataset.insert_documents(labels)
while inserting, you can visit monitor the dataset at https://cloud.tryrelevance.com/dataset/retail-label/dashboard/monitor/
β
All documents inserted/edited successfully.
# Vectorize like you would with a normal dataset
label_dataset.vectorize_text(
fields=['label'],
output_fields=["label_vector_"]
)
dataset.label_from_dataset(
vector_fields=dataset.list_vector_fields(),
label_dataset=label_dataset
)
You can now see the labels on your dataset on Relevance AI.

Labels#
πΉ Extract Sentiment#
You can add sentiment to your dataset - whicih will label sentiment as
one of neutral
, positive
, negative
.
dataset.extract_sentiment(text_fields=["reviews.text"]
Want to quickly create some example applications with Relevance AI? Check out some other guides below! - Text-to-image search with OpenAIβs CLIP - Hybrid Text search with Universal Sentence Encoder using Vectorhub - Text search with Universal Sentence Encoder Question Answer from Google