Visualisations for your clustering.

Module Contents

class relevanceai.operations.viz.cluster.ClusterVizOps(credentials, dataset_id: str, vector_fields: List[str], alias: Optional[str] = None, **kwargs)

Cluster Visualisations. May contain additional visualisation dependencies.

plot_basic_distributions(self, numeric_field: str, top_indices: int = 10, dataset_id: Optional[str] = None)

Plot the distribution of a numeric field (for example, sentence length) across each cluster.


from relevanceai import Client
client = Client()

cluster_ops = client.ClusterVizOps(

  • numeric_field (str) – The numeric field to plot

  • top_indices (int) – The number of top indices to include in the plot

  • dataset_id (Optional[str]) – The dataset ID
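
As a rough illustration of what a per-cluster distribution involves, the sketch below groups a numeric field by cluster label and keeps only the largest clusters. It does not call relevanceai: the document layout, the `cluster` and `sentence_length` field names, and the `group_by_cluster` helper are all hypothetical, and reading `top_indices` as "number of largest clusters kept" is an assumption rather than documented behaviour.

```python
from collections import defaultdict

# Hypothetical documents: each carries a cluster label and a numeric field.
docs = [
    {"cluster": "cluster-0", "sentence_length": 12},
    {"cluster": "cluster-0", "sentence_length": 15},
    {"cluster": "cluster-1", "sentence_length": 40},
    {"cluster": "cluster-0", "sentence_length": 9},
    {"cluster": "cluster-1", "sentence_length": 38},
    {"cluster": "cluster-2", "sentence_length": 3},
]


def group_by_cluster(documents, numeric_field, cluster_field="cluster", top_indices=10):
    """Collect numeric_field values per cluster, keeping the largest clusters."""
    groups = defaultdict(list)
    for doc in documents:
        # Skip documents that lack either field.
        if numeric_field in doc and cluster_field in doc:
            groups[doc[cluster_field]].append(doc[numeric_field])
    # Largest clusters first (assumed meaning of top_indices).
    ranked = sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)
    return dict(ranked[:top_indices])


grouped = group_by_cluster(docs, "sentence_length", top_indices=2)
for cluster_id, values in grouped.items():
    print(cluster_id, sorted(values))
```

Each list in `grouped` is the raw material for one histogram; any plotting library could draw them side by side.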

plot_distributions(self, numeric_field: str, measure_function: Callable = None, top_indices: int = 10, dataset_id: str = None, asc: bool = True, measurement_name: str = 'measurement')

Plot the distributions across each cluster. measure_function is run on each cluster and the results are plotted.


from scipy.stats import skew
ops.plot_distributions(numeric_field, skew, dataset_id=dataset_id)

  • numeric_field (str) – The numeric field to plot the distribution by

  • measure_function (callable) – The function used to compute a measurement for each cluster

  • top_indices (int) – The top indices

  • dataset_id (str) – The dataset ID to use

  • asc (bool) – If True, the distributions are plotted in ascending order of the measurement

  • measurement_name (str) – The name of what should be plotted for the graphs
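
The interplay of `measure_function`, `asc` and `top_indices` can be sketched in plain Python. This is an assumption about the general pattern (measure each cluster, sort, truncate), not the library's actual implementation; `rank_clusters` and the sample data are hypothetical.

```python
import statistics

# Hypothetical numeric values, already grouped by cluster label.
values_by_cluster = {
    "cluster-0": [1.0, 2.0, 3.0],
    "cluster-1": [10.0, 10.0, 40.0],
    "cluster-2": [5.0, 5.0, 5.0],
}


def rank_clusters(values_by_cluster, measure_function, top_indices=10, asc=True):
    """Apply measure_function to each cluster's values and rank the clusters."""
    measurements = {
        cluster_id: measure_function(values)
        for cluster_id, values in values_by_cluster.items()
    }
    # asc=True puts the smallest measurement first; asc=False the largest.
    ranked = sorted(measurements.items(), key=lambda kv: kv[1], reverse=not asc)
    return ranked[:top_indices]


ranked = rank_clusters(values_by_cluster, statistics.mean, top_indices=2, asc=False)
print(ranked)
```

Any callable that maps a list of numbers to a single number can stand in for `statistics.mean` here, which is what makes a measure-function parameter flexible.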

plot_most_skewed(self, numeric_field: str, top_indices: int = 10, dataset_id: str = None, asc: bool = True)

Plot the most skewed numeric fields
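
For readers unfamiliar with the measure, the sketch below computes the population (Fisher-Pearson) skewness, which is the quantity `scipy.stats.skew` returns with its default settings, and ranks some sample clusters by it. The `skewness` helper, the sample data, ranking by absolute value, and returning 0.0 for constant data are choices made for this illustration, not documented behaviour of `plot_most_skewed`.

```python
def skewness(values):
    """Population (Fisher-Pearson) skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data: treated as unskewed in this sketch
    return m3 / m2 ** 1.5


# Hypothetical numeric values, already grouped by cluster label.
values_by_cluster = {
    "cluster-0": [1, 2, 3, 4, 5],   # symmetric
    "cluster-1": [1, 1, 1, 10],     # long right tail
    "cluster-2": [10, 9, 8, 7, 1],  # long left tail
}

# Rank by magnitude so that left- and right-skewed clusters both surface.
most_skewed = sorted(
    values_by_cluster,
    key=lambda cluster_id: abs(skewness(values_by_cluster[cluster_id])),
    reverse=True,
)
print(most_skewed)
```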

centroid_heatmap(self, metric: str = 'cosine', vmin: float = 0, vmax: float = 1, print_n: int = 8, round_print_float: int = 2)

Heatmap visualisation of the closest clusters. Prints the closest clusters, ranked from highest to lowest cosine similarity.
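
The ranking described above can be sketched without the library: compute the cosine similarity between every pair of centroids, round it, and keep the highest-scoring pairs. The centroid vectors and the `cosine_similarity` and `closest_pairs` helpers are hypothetical; the parameter names `print_n` and `round_print_float` are borrowed from the signature, but how the library applies them internally is an assumption.

```python
import math
from itertools import combinations

# Hypothetical 2-D centroids keyed by cluster label.
centroids = {
    "cluster-0": [1.0, 0.0],
    "cluster-1": [0.8, 0.6],
    "cluster-2": [0.0, 1.0],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)  # raises ZeroDivisionError for a zero vector


def closest_pairs(centroids, print_n=8, round_print_float=2):
    """Return the print_n most similar centroid pairs, most similar first."""
    pairs = [
        (a, b, round(cosine_similarity(centroids[a], centroids[b]), round_print_float))
        for a, b in combinations(centroids, 2)
    ]
    pairs.sort(key=lambda pair: pair[2], reverse=True)
    return pairs[:print_n]


for first, second, similarity in closest_pairs(centroids, print_n=2):
    print(first, second, similarity)
```

A heatmap is the same pairwise matrix drawn as a colour grid, with `vmin` and `vmax` presumably bounding the colour scale.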

show_closest(self, cluster_ids: Optional[List] = None, text_fields: Optional[List] = None, image_fields: Optional[List] = None)

Show the clusters with the closest.