☀️ Cluster Centroid Heat Maps
To interpret your clusters, it can help to visualise them as a heatmap. The heatmap shows which clusters lie closest to one another.
%load_ext autoreload
%autoreload 2
Installation
!pip install -q jsonshower
from relevanceai import Client
client = Client()
You can retrieve the ecommerce dataset from https://relevanceai.readthedocs.io/en/development/core/available_datasets.html#relevanceai.utils.datasets.get_ecommerce_1_dataset.
ds = client.Dataset("ecommerce")
Centroid Heatmap
from relevanceai.operations.viz.cluster import ClusterVizOps
cluster_ops = ClusterVizOps.from_dataset(
ds, alias="main-cluster", vector_fields=["product_image_clip_vector_"]
)
cluster_ops.centroid_heatmap()
Your closest centroids are:
0.74 cluster-5, cluster-1
0.73 cluster-5, cluster-4
0.71 cluster-4, cluster-1
0.65 cluster-4, cluster-2
0.65 cluster-7, cluster-2
0.64 cluster-7, cluster-4
0.64 cluster-7, cluster-5
0.63 cluster-5, cluster-2
[Text(0.5, 1.0, 'cosine plot')]
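The listing above ranks pairs of centroids by a similarity score, and the plot title suggests that score is cosine similarity. As a rough illustration of that idea (not the SDK's actual implementation), the sketch below ranks pairs of centroids by cosine similarity in plain Python. The helper names `cosine_similarity` and `closest_centroid_pairs` and the toy centroids are hypothetical and only serve the example.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_centroid_pairs(centroids, top_n=8):
    """Rank every pair of centroids by cosine similarity, highest first.

    `centroids` maps a cluster id to its centroid vector.
    """
    pairs = [
        (cosine_similarity(centroids[a], centroids[b]), a, b)
        for a, b in combinations(centroids, 2)
    ]
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return pairs[:top_n]


# Hypothetical toy centroids, not taken from the ecommerce dataset.
toy_centroids = {
    "cluster-1": [1.0, 0.0],
    "cluster-2": [0.0, 1.0],
    "cluster-3": [1.0, 1.0],
}

for score, first, second in closest_centroid_pairs(toy_centroids):
    print(f"{score:.2f} {first}, {second}")
```

A heatmap of the same data would simply colour each cell of the full pairwise similarity matrix, so the highest-ranked pairs in this listing correspond to the brightest off-diagonal cells.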

To check whether these clusters are useful, we can inspect them in the dashboard:
closest = cluster_ops.closest()["results"]
You can now visit the dashboard at https://cloud.tryrelevance.com/sdk/cluster/centroids/closest
Below, we can compare two separate clusters, one for boots and one for shoes, and decide whether we need that level of granularity.
cluster_ops.show_closest(
cluster_ids=["cluster-1", "cluster-5"], image_fields=["product_image"]
)
You can now visit the dashboard at https://cloud.tryrelevance.com/sdk/cluster/centroids/closest
| | product_image | cluster_id | _id |
|---|---|---|---|
| 0 | ![]() | cluster-1 | 931f907b-13f1-41e5-92fe-c8007cdedada |
| 1 | ![]() | cluster-1 | 93734870-b304-4426-9cd4-d906fea340b8 |
| 2 | ![]() | cluster-1 | 6416c33d-3287-446c-90d3-ea220bf6312b |
| 3 | ![]() | cluster-5 | 8f5dfc61-6fd1-422e-9682-7df039b8c099 |
| 4 | ![]() | cluster-5 | 65082728-720b-4604-8ee4-f7d0ecab0e7f |
| 5 | ![]() | cluster-5 | 7ace5350-1487-44d3-9840-2b89183f3117 |
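The table above lists a few documents for each of the two clusters. One plausible way to pick such representatives is to rank documents by their similarity to the cluster centroid. The sketch below illustrates that idea under the assumption that cosine similarity is used; it is not how `closest()` or `show_closest()` are implemented in the SDK, and the helper `closest_documents`, the field name `toy_vector_`, and the toy documents are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_documents(centroid, documents, vector_field, top_k=3):
    """Return the top_k documents whose vector is most similar to the centroid."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(centroid, doc[vector_field]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical toy documents; the real dataset stores image vectors instead.
toy_docs = [
    {"_id": "a", "toy_vector_": [0.9, 0.1]},
    {"_id": "b", "toy_vector_": [0.1, 0.9]},
    {"_id": "c", "toy_vector_": [0.7, 0.3]},
]
toy_centroid = [1.0, 0.0]

top_docs = closest_documents(toy_centroid, toy_docs, "toy_vector_", top_k=2)
print([doc["_id"] for doc in top_docs])
```

Looking at the nearest documents for two similar centroids side by side, as in the table, is a quick way to judge whether the clusters capture a meaningful distinction.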