Functions to manipulate and sort clusters

aggregate_network_by_cluster(temporal_network, clusters, sort_clusters=None, output='averaged')

Aggregates the temporal network over each cluster in a cluster set.

  • temporal_network (phasik.TemporalNetwork) – Temporal network to aggregate

  • clusters (array of int) – Cluster labels

  • sort_clusters (bool) – If True, sort cluster labels based on ascending times

  • output ({‘weighted’, ‘averaged’, ‘binary’, ‘normalised’}, optional) – Determines the type of output edge weights


aggregates – Dict in which each key is a cluster label and each value is a tuple of the form (networkx.Graph, list of time indices of cluster).

Return type

dict
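The per-cluster aggregation described above can be illustrated with a minimal, library-free sketch. It does not use phasik or networkx: the helper name `average_weights_by_cluster` and the `edge_series` input (a dict of edge weight time series) are hypothetical, and reading 'averaged' as the mean weight over the time indices of a cluster is an assumption, not a statement about the library's implementation.

```python
from collections import defaultdict


def average_weights_by_cluster(edge_series, clusters):
    """Average each edge's weight over the time indices of each cluster.

    edge_series: dict mapping an edge (i, j) to its list of weights over time
                 (hypothetical stand-in for a temporal network).
    clusters: list of cluster labels, one per time index.
    """
    # Group time indices by cluster label
    times_by_cluster = defaultdict(list)
    for t, label in enumerate(clusters):
        times_by_cluster[label].append(t)

    aggregates = {}
    for label, times in times_by_cluster.items():
        # Assumed meaning of 'averaged': mean weight over the cluster's times
        averaged = {
            edge: sum(weights[t] for t in times) / len(times)
            for edge, weights in edge_series.items()
        }
        # Mirrors the documented (graph, time indices) tuple, with a plain
        # dict of edge weights standing in for the networkx.Graph
        aggregates[label] = (averaged, times)
    return aggregates


edge_series = {
    ("a", "b"): [1.0, 3.0, 0.0, 2.0],
    ("b", "c"): [0.0, 0.0, 4.0, 2.0],
}
clusters = [1, 1, 2, 2]
aggregates = average_weights_by_cluster(edge_series, clusters)
```

Here `aggregates[1]` holds the averaged weights for time indices 0 and 1, and `aggregates[2]` those for time indices 2 and 3.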

cluster_sort(clusters, final_labels=None)

Sorts an array of cluster labels in order of appearance, and returns the sorted array while leaving the original clusters unchanged.

  • clusters (numpy.ndarray) – An array of cluster labels.

  • final_labels (list or None, optional) – A list of final labels (as integers) to replace the original cluster labels, by default None.


An array of cluster labels sorted in order of appearance. If final_labels is not None, it will return a list of final labels with the same length as clusters.

Return type

numpy.ndarray or list


>>> clusters = np.array([2, 2, 2, 3, 3, 1, 1, 1])
>>> cluster_sort(clusters)
array([1, 1, 1, 2, 2, 3, 3, 3])
>>> final_labels = [4, 5, 6]
>>> cluster_sort(clusters, final_labels)
[4, 4, 4, 5, 5, 6, 6, 6]
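To make the relabelling rule concrete, here is a minimal sketch of sorting labels in order of appearance. The function `relabel_by_appearance` is hypothetical and works on plain lists rather than numpy arrays. It assumes that `final_labels` is indexed by rank of first appearance, which is consistent with the doctest above but is not confirmed to be how cluster_sort is implemented.

```python
def relabel_by_appearance(clusters, final_labels=None):
    """Relabel clusters 1, 2, 3, ... in the order they first appear."""
    # Map each original label to its rank of first appearance (1-based)
    rank_of = {}
    for label in clusters:
        if label not in rank_of:
            rank_of[label] = len(rank_of) + 1

    # Build a new list, leaving the input untouched
    ranks = [rank_of[label] for label in clusters]
    if final_labels is None:
        return ranks
    # Assumption: the k-th cluster to appear receives final_labels[k - 1]
    return [final_labels[rank - 1] for rank in ranks]


clusters = [2, 2, 2, 3, 3, 1, 1, 1]
sorted_labels = relabel_by_appearance(clusters)
custom_labels = relabel_by_appearance(clusters, final_labels=[4, 5, 6])
```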

Returns a dictionary where each key is a cluster label and each value is a list of the time indices composing the cluster.


clusters (list of int) – List of cluster labels
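The mapping from labels to time indices can be sketched in a few lines. The name `cluster_time_indices` is hypothetical and the code only illustrates the documented output shape, assuming time indices are the 0-based positions in the label list.

```python
def cluster_time_indices(clusters):
    """Map each cluster label to the list of time indices carrying it."""
    indices = {}
    for t, label in enumerate(clusters):
        # Time indices are appended in increasing order
        indices.setdefault(label, []).append(t)
    return indices


labels = [2, 2, 3, 1, 1, 3]
by_cluster = cluster_time_indices(labels)
```

A cluster need not be contiguous in time: in this example, label 3 covers time indices 2 and 5.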

rand_index_over_methods_and_sizes(valid_cluster_sets, reference_method='ward')

Compute the Rand Index to compare each clustering method to a reference method, for all combinations of methods and numbers of clusters.

  • valid_cluster_sets (list) – List of tuples (cluster_object, method_name) representing the clustering object and the name of the clustering method used to obtain it.

  • reference_method (str, optional) – The name of the reference method to compare against. The default is “ward”.


rand_scores – Array of dimension (n_sizes, n_methods) with Rand Index scores.

Return type



The Rand Index is a measure of the similarity between two clusterings. It is based on the number of pairs of samples that are assigned to the same or different clusters in the two clusterings. The adjusted Rand Index is a modification of the Rand Index that takes into account chance agreements.
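As an illustration of the pair-counting definition, the following sketch computes the plain (unadjusted) Rand Index between two label sequences. The function `rand_index` is a hypothetical helper written for this example. Whether the library function reports the plain or the adjusted index is not stated here, so this should not be read as its implementation.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two clusterings agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    # A pair counts as an agreement when both clusterings put the two
    # samples together, or both put them apart
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


reference = [1, 1, 2, 2]
relabelled = [5, 5, 7, 7]   # same partition, different label names
different = [1, 1, 1, 2]

score_same = rand_index(reference, relabelled)
score_diff = rand_index(reference, different)
```

Because only co-membership of pairs matters, the score is unaffected by how the clusters are named.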


Utility functions for static graphs


Return basic size info about the graph

weighted_edges_as_df(network, keep_static=True, temporal_edges=None)

Returns a pandas.DataFrame of weighted edges sorted by weight, from a networkx.Graph.

Columns are [‘i’, ‘j’, ‘weight’] and each row represents a different edge

  • network (networkx.Graph) – A network from which to get weighted edges

  • keep_static (bool or np.nan, optional) – If True (default), keep all edges. If False, discard the static edges (those not in temporal_edges). If np.nan, keep the static edges, but set their weight to np.nan. If keep_static is not True, temporal_edges must be provided.

  • temporal_edges (list of tuples) – List of edges for which there is temporal information.

Return type

pandas.DataFrame


When the static network is derived from a temporal network, some edges might be static (no temporal info) and have a default constant edge weight. That is when the arguments keep_static and temporal_edges are useful.
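The three keep_static modes can be sketched on a plain list of (i, j, weight) tuples. This is a simplified illustration, not the library code: `select_edges` is a hypothetical helper, it skips the DataFrame construction and the sorting by weight, and it matches edges by exact tuple, so (i, j) and (j, i) are treated as different. In this sketch, temporal_edges is only consulted when keep_static is False or NaN.

```python
import math


def select_edges(weighted_edges, keep_static=True, temporal_edges=None):
    """Apply the keep_static rule to a list of (i, j, weight) tuples."""
    if keep_static is True:
        # Keep every edge unchanged; temporal_edges is not needed
        return list(weighted_edges)

    temporal = set(temporal_edges)
    rows = []
    for i, j, weight in weighted_edges:
        if (i, j) in temporal:
            rows.append((i, j, weight))
        elif keep_static is not False:
            # keep_static is NaN: keep the static edge but mask its weight
            rows.append((i, j, math.nan))
        # keep_static is False: the static edge is dropped
    return rows


edges = [("a", "b", 0.5), ("b", "c", 1.0)]
temporal_edges = [("a", "b")]

all_edges = select_edges(edges)
temporal_only = select_edges(edges, keep_static=False, temporal_edges=temporal_edges)
masked = select_edges(edges, keep_static=math.nan, temporal_edges=temporal_edges)
```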


Functions to deal with system paths

slugify(text, keep_characters=None)

Turn any text into a string that can be used in a filename

  • text (str) – text to slugify

  • keep_characters (list of str) – characters in this iterable will be kept in the final string. Defaults to [‘_’]. Any other non-alphanumeric characters will be removed.
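A minimal sketch of the documented behaviour follows. The name `slugify_sketch` is hypothetical, and the handling of whitespace (removed, like any other non-alphanumeric character) is inferred from the parameter description rather than confirmed against the library.

```python
def slugify_sketch(text, keep_characters=None):
    """Keep alphanumeric characters and any explicitly allowed ones."""
    if keep_characters is None:
        keep_characters = ["_"]  # documented default
    # Drop every character that is neither alphanumeric nor allowed
    return "".join(c for c in text if c.isalnum() or c in keep_characters)


default_slug = slugify_sketch("my file: v1.0_final")
dash_slug = slugify_sketch("a-b c", keep_characters=["-"])
```

With the default setting, the first call yields "myfilev10_final". Passing keep_characters=["-"] replaces the default, so underscores would then be removed too.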



Return type