Cluster Visualisations

Visualisations for your clustering.

class relevanceai.operations.viz.cluster.ClusterVizOps

Cluster visualisations. These may require additional visualisation dependencies to be installed.

centroid_heatmap(metric='cosine', vmin=0, vmax=1, print_n=8, round_print_float=2)

Heatmap visualisation of how close the cluster centroids are to one another. Also prints the closest cluster pairs, ranked from most to least similar (cosine similarity by default).
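The ranking that accompanies the heatmap can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the `closest_cluster_pairs` helper and the `centroids` dictionary are hypothetical, and the parameter names `print_n` and `round_print_float` are borrowed from the signature above on the assumption that they control how many pairs are shown and how the scores are rounded.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_cluster_pairs(centroids, print_n=8, round_print_float=2):
    """Rank pairs of clusters by centroid similarity, most similar first.

    `centroids` is a hypothetical dict mapping cluster ID -> centroid vector.
    """
    pairs = [
        (id_a, id_b, cosine_similarity(centroids[id_a], centroids[id_b]))
        for id_a, id_b in combinations(sorted(centroids), 2)
    ]
    # Highest similarity first; ties keep their original order.
    pairs.sort(key=lambda pair: pair[2], reverse=True)
    return [(a, b, round(sim, round_print_float)) for a, b, sim in pairs[:print_n]]


# Toy centroids, invented for illustration only.
example_centroids = {
    "cluster-0": [1.0, 0.0],
    "cluster-1": [1.0, 1.0],
    "cluster-2": [0.0, 1.0],
}
top_pairs = closest_cluster_pairs(example_centroids, print_n=2)
for id_a, id_b, similarity in top_pairs:
    print(id_a, id_b, similarity)
```

The same pairwise similarities, arranged as a square matrix, are what a heatmap would colour, with `vmin` and `vmax` presumably bounding the colour scale.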

plot_basic_distributions(numeric_field, top_indices=10, dataset_id=None)

Plot the distribution of a numeric field (for example, sentence length) across each cluster.

Example

from relevanceai import Client
client = Client()

cluster_ops = client.ClusterVizOps(
    dataset_id="sample_dataset",
    vector_fields=["sample_vector_"],
    alias="kmeans-5"
)
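# numeric_field is required; "sample_numeric_field" is a placeholder field name
cluster_ops.plot_basic_distributions(numeric_field="sample_numeric_field")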
Parameters
  • numeric_field (str) – The numeric field to plot

  • top_indices (int) – The top indices in the plotting

  • dataset_id (Optional[str]) – The dataset ID

plot_distributions(numeric_field, measure_function=None, top_indices=10, dataset_id=None, asc=True, measurement_name='measurement')

Plot the distributions across each cluster. measure_function is run on each cluster's values and the resulting measurements are plotted.

Example

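from scipy.stats import skew

# `ops` is a ClusterVizOps instance; `numeric_field` and `dataset_id` are defined by the caller
ops.plot_distributions(numeric_field, measure_function=skew, dataset_id=dataset_id)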
Parameters
  • numeric_field (str) – The numeric field to plot the distribution by

  • measure_function (callable) – The function used to compute a measurement for each cluster

  • top_indices (int) – The number of top indices to include in the plot

  • dataset_id (str) – The dataset ID to use

  • asc (bool) – If True, the distributions are plotted in ascending order of the measurement

  • measurement_name (str) – The name of what should be plotted for the graphs
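To make the role of measure_function concrete, here is a minimal sketch of the general pattern: group a numeric field by cluster, apply a measurement to each group, and rank the clusters by the result. It is not the library's code. The `measure_by_cluster` helper, the `cluster_field` argument, and the sample documents are hypothetical, and the reading of `top_indices` and `asc` as "how many clusters to keep" and "sort ascending" is an assumption.

```python
from collections import defaultdict


def skewness(values):
    """Population skewness (third standardised moment); 0.0 if values do not vary."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def measure_by_cluster(documents, numeric_field, cluster_field, measure_function,
                       top_indices=10, asc=True):
    """Apply measure_function to each cluster's values and rank the clusters."""
    grouped = defaultdict(list)
    for doc in documents:
        grouped[doc[cluster_field]].append(doc[numeric_field])
    measurements = {cid: measure_function(vals) for cid, vals in grouped.items()}
    # asc=True puts the smallest measurement first (an assumed interpretation).
    ranked = sorted(measurements.items(), key=lambda item: item[1], reverse=not asc)
    return ranked[:top_indices]


# Toy documents, invented for illustration only.
values_by_cluster = {"a": [1, 1, 1, 10], "b": [1, 2, 3], "c": [1, 10, 10, 10]}
sample_documents = [
    {"cluster": cid, "length": value}
    for cid, values in values_by_cluster.items()
    for value in values
]
ranked = measure_by_cluster(sample_documents, "length", "cluster", skewness)
for cluster_id, measurement in ranked:
    print(cluster_id, round(measurement, 2))
```

Any callable that maps a list of numbers to a single number could be passed in place of `skewness`, which is the flexibility the measure_function parameter suggests.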

plot_most_skewed(numeric_field, top_indices=10, dataset_id=None, asc=True)

Plot the clusters in which the given numeric field is most skewed.

show_closest(cluster_ids=None, text_fields=None, image_fields=None)

Show the documents closest to the centre of each cluster, displaying the given text and image fields.
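Assuming this method lists the documents nearest to each cluster centroid, the underlying lookup might resemble the sketch below. Everything here is illustrative: `closest_documents`, its arguments, the Euclidean distance, and the sample data are hypothetical and are not taken from the library.

```python
import math


def closest_documents(documents, centroids, vector_field, text_field, n_closest=2):
    """Return, per cluster, the text of the documents nearest to its centroid."""
    closest = {}
    for cluster_id, centroid in centroids.items():
        # Sort all documents by Euclidean distance to this centroid.
        scored = sorted(documents, key=lambda doc: math.dist(doc[vector_field], centroid))
        closest[cluster_id] = [doc[text_field] for doc in scored[:n_closest]]
    return closest


# Toy documents and centroids, invented for illustration only.
sample_docs = [
    {"text": "apple", "sample_vector_": [0.0, 0.1]},
    {"text": "banana", "sample_vector_": [0.2, 0.0]},
    {"text": "car", "sample_vector_": [5.0, 5.1]},
    {"text": "truck", "sample_vector_": [4.8, 5.0]},
]
sample_centroids = {"cluster-0": [0.0, 0.0], "cluster-1": [5.0, 5.0]}

nearest = closest_documents(sample_docs, sample_centroids, "sample_vector_", "text")
for cluster_id, texts in nearest.items():
    print(cluster_id, texts)
```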