Drawing
Clusters
Functions to visualize the results of temporal clusters.
- plot_average_silhouettes(cluster_sets, ax=None, c='k', marker='o', ls='-', **kwargs)[source]
Draw the average silhouette score for each cluster set in cluster_sets.
The silhouette score is a measure of the quality of a clustering.
- Parameters
cluster_sets (ClusterSets) – Cluster sets for which to draw the silhouette scores
ax (matplotlib.Axes, optional) – Axes on which to plot
c (color, optional) – Color to use for the curve. Default: black.
marker (str, optional) – Markers to use for the curve. Default: “o”.
ls (str, optional) – Linestyle to use for the curve. Default: “-”.
**kwargs – Other parameters to pass to matplotlib’s plot.
- Returns
The axis object to draw on
- Return type
matplotlib.axes.Axes
Examples
>>> import phasik as pk
>>> distance_matrix = pk.DistanceMatrix.from_temporal_network(
...     temporal_network, "euclidean"
... )
>>> cluster_sets = pk.ClusterSets.from_distance_matrix(
...     distance_matrix, "maxclust", range(2, 12), "ward"
... )
>>> pk.plot_average_silhouettes(cluster_sets)
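As a rough illustration of what the plotted quantity measures, the sketch below computes an average silhouette score in pure Python for one-dimensional points. It is not phasik's implementation: the function name `average_silhouette` is hypothetical, distances are plain absolute differences, at least two clusters are assumed, and singleton clusters are given a score of 0 by convention.

```python
from statistics import mean


def average_silhouette(points, labels):
    """Average silhouette score for 1D points (illustrative sketch only)."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to the other points of the same cluster
        same = [abs(p - q) for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:
            scores.append(0.0)  # convention: singleton clusters score 0
            continue
        a = mean(same)
        # b: smallest mean distance to the points of any other cluster
        b = min(
            mean(abs(p - q) for q, l in zip(points, labels) if l == other)
            for other in set(labels) if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


points = [0.0, 1.0, 10.0, 11.0]
good = average_silhouette(points, [1, 1, 2, 2])  # well-separated clusters
bad = average_silhouette(points, [1, 2, 1, 2])   # clusters that mix distant points
print(good > bad)  # True
```

Scores close to 1 indicate compact, well-separated clusters, while negative scores suggest that points sit closer to another cluster than to their own.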
- plot_cluster_set(cluster_set, colors=None, cmap='tab10', vmin=None, vmax=None, y_height=0, ax=None, **kwargs)[source]
Visualize the clusters in cluster_set.
For each time point, a marker is drawn with a color corresponding to the cluster to which it belongs.
- Parameters
cluster_set (ClusterSet) – ClusterSet object
colors (list of int, optional) – If None (default), cluster label 0 is assigned its automatic color “C0” and so on. If colors is a list (e.g. [3,1,2]), it relabels the clusters in that order and assigns them the new corresponding colors.
cmap (colormap, optional) – Desired colormap (default ‘tab10’).
vmin/vmax (float, optional) – Min and max values to use for the color mapping. If None (default), computed from the data in colors.
y_height (int or float, optional) – Vertical value at which to draw the markers (default 0). If a single cluster is drawn this value does not matter.
ax (matplotlib.Axes, optional) – Axes on which to plot
**kwargs – Other parameters to pass to matplotlib’s scatter.
- Returns
The axis object to draw on
- Return type
matplotlib.axes.Axes
Examples
>>> import phasik as pk
>>> distance_matrix = pk.DistanceMatrix.from_temporal_network(
...     temporal_network, "euclidean"
... )
>>> cluster_set = pk.ClusterSet.from_distance_matrix(
...     distance_matrix, "maxclust", 5, "ward"
... )
>>> pk.plot_cluster_set(cluster_set)
- plot_cluster_sets(cluster_sets, axs=None, cmap='tab10', vmin=None, vmax=None, coloring='consistent', translation=None, with_silhouettes=False, with_n_clusters=False, **kwargs)[source]
Visualize the clusters in cluster_sets.
For each time point, a marker is drawn with a color corresponding to the cluster to which it belongs. Clusterings for different numbers of clusters are drawn at different heights on the vertical axis.
- Parameters
cluster_sets (phasik ClusterSets) – ClusterSets object containing partitions to plot
axs (matplotlib.Axes, optional) – Matplotlib axes on which to plot. If None (default), creates a single axis.
cmap (colormap, optional) – Desired colormap (default ‘tab10’).
vmin/vmax (float, optional) – Min and max values to use for the color mapping. If None (default), computed from the data in colors.
coloring ({‘ascending’, ‘consistent’, None}, optional) – The method to use to obtain consistent coloring across cluster sets. See relabel_clustersets for details. By default, “consistent”
translation (dict, optional) – If None (default), has no effect. Otherwise, a dictionary that determines which label is replaced by which other label, for example {1: 2, 2: 3, 3: 1}. It is applied after the relabeling set by coloring.
with_silhouettes (bool, optional) – Whether to draw the corresponding silhouette scores on a second axis. See plot_average_silhouettes for details. Default: False.
with_n_clusters (bool, optional) – Whether to draw the corresponding number of clusters on a third axis. See plot_ns_clusters for details. Default: False.
- Returns
The axis object to draw on
- Return type
tuple of matplotlib.axes.Axes
Examples
>>> import phasik as pk
>>> distance_matrix = pk.DistanceMatrix.from_temporal_network(
...     temporal_network, "euclidean"
... )
>>> cluster_sets = pk.ClusterSets.from_distance_matrix(
...     distance_matrix, "maxclust", range(2, 12), "ward"
... )
>>> pk.plot_cluster_sets(cluster_sets)
- plot_dendrogram(cluster_set, ax=None, distance_threshold=None, leaf_rotation=90, leaf_font_size=6)[source]
Draw the results of hierarchical clustering as a dendrogram.
The particular clustering passed as argument is the result of choosing a specific threshold in this dendrogram.
- Parameters
cluster_set (ClusterSet) – Cluster set for which to draw a dendrogram
ax (matplotlib.Axes, optional) – Axes on which to plot
distance_threshold (float, optional) – Threshold at which to draw a horizontal line and above which to use different colors for different branches.
leaf_rotation (int or float, optional) – Rotation to apply to the x-axis (leaf) labels (default 90)
leaf_font_size (int or str, optional) – Desired size of the x-axis (leaf) labels (default 6)
- Returns
The axis object to draw on
- Return type
matplotlib.axes.Axes
Examples
>>> import phasik as pk
>>> distance_matrix = pk.DistanceMatrix.from_temporal_network(
...     temporal_network, "euclidean"
... )
>>> cluster_set = pk.ClusterSet.from_distance_matrix(
...     distance_matrix, "maxclust", 5, "ward"
... )
>>> pk.plot_dendrogram(cluster_set)
- plot_ns_clusters(cluster_sets, ax=None, c='k', marker='o', ls='-', **kwargs)[source]
Plot the actual number of clusters against the requested number of clusters.
These numbers are plotted for each cluster set in cluster_sets.
- Parameters
cluster_sets (ClusterSets) – Cluster sets information to plot
ax (matplotlib.Axes, optional) – Axes on which to plot
c (color, optional) – Color of the markers and line. Default: “k”.
marker (string, optional) – Marker to use, default: “o”.
ls (str, optional) – Style of the line. Default: “-”.
**kwargs – Other parameters to pass to matplotlib’s plot.
- Returns
The axis object to draw on
- Return type
matplotlib.axes.Axes
Examples
>>> import phasik as pk
>>> distance_matrix = pk.DistanceMatrix.from_temporal_network(
...     temporal_network, "euclidean"
... )
>>> cluster_sets = pk.ClusterSets.from_distance_matrix(
...     distance_matrix, "maxclust", range(2, 12), "ward"
... )
>>> pk.plot_ns_clusters(cluster_sets)
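The difference between the requested and the actual number of clusters can be illustrated without phasik. In the sketch below, the `partitions` dictionary and the helper `actual_vs_requested` are hypothetical stand-ins: each key is a requested number of clusters, and the actual number is taken to be the count of distinct labels in the resulting partition.

```python
def actual_vs_requested(partitions):
    """Map each requested number of clusters to the number of distinct labels found."""
    return {requested: len(set(labels)) for requested, labels in partitions.items()}


# Hypothetical partitions of six time points, keyed by the requested number of clusters
partitions = {
    2: [1, 1, 1, 2, 2, 2],
    3: [1, 1, 1, 2, 2, 3],
    4: [1, 1, 1, 2, 2, 3],  # 4 requested, but only 3 distinct labels
}
print(actual_vs_requested(partitions))  # {2: 2, 3: 3, 4: 3}
```

Points that fall off the diagonal in such a plot flag cluster sets where the clustering could not produce as many clusters as requested.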
- plot_randindex_bars_over_methods_and_sizes(valid_cluster_sets, reference_method='ward', ax=None, plot_ref=False, **kwargs)[source]
Plot Rand Index as bars, to compare any method to a reference method.
This compares all combinations of methods and number of clusters.
- Parameters
valid_cluster_sets (list) – A list of tuples representing valid cluster sets. Each tuple contains the ClusterSet and the clustering method name.
reference_method (str, optional) – The reference method to compare other methods to. Defaults to “ward”.
ax (matplotlib.axes.Axes, optional) – The axes to plot the bars on. If not provided, the current axes will be used.
plot_ref (bool, optional) – Determines whether to plot the reference method bars (will have height one). Defaults to False.
**kwargs – Other parameters to pass to matplotlib’s bar.
- Returns
The axis object to draw on
- Return type
matplotlib.axes.Axes
Examples
>>> import phasik as pk
>>> clustering_methods = ["k_means", "centroid", "average", "ward"]
>>> valid_cluster_sets = []
>>> for clustering_method in clustering_methods:
...     distance_matrix = pk.DistanceMatrix.from_temporal_network(
...         temporal_network, "euclidean"
...     )
...     cluster_sets = pk.ClusterSets.from_distance_matrix(
...         distance_matrix, "maxclust", range(2, 12), clustering_method
...     )
...     valid_cluster_sets.append((cluster_sets, clustering_method))
...
>>> ax = pk.plot_randindex_bars_over_methods_and_sizes(
...     valid_cluster_sets, reference_method="ward"
... )
>>> ax.set_ylabel("Rand index")
>>> ax.set_xlabel("# clusters")
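For readers unfamiliar with the Rand index, the following is a minimal pure-Python sketch of the (unadjusted) definition: the fraction of pairs of items on which two labelings agree, that is, both put the pair in the same cluster or both put it in different clusters. The function name `rand_index` is hypothetical, and whether phasik uses this plain variant or an adjusted one is not stated here.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which the two labelings agree (illustrative sketch)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("Both labelings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


reference = [1, 1, 2, 2]
print(rand_index(reference, [2, 2, 1, 1]))  # 1.0: same partition, different label names
print(rand_index(reference, [1, 2, 1, 2]))  # 2 of 6 pairs agree
```

Because the index depends only on the partition and not on the label names, comparing a method with itself always gives 1, which is why the reference bars have height one.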
- relabel_clusters_sorted(clusters, final_labels=None)[source]
Returns array of cluster labels sorted in order of appearance, with clusters unchanged
- Parameters
clusters (array of int) – Cluster labels
final_labels (array of int) – Cluster labels in expected order (has size of the number of clusters)
- Returns
arr – Resulting clusters
- Return type
np.ndarray
Examples
>>> import numpy as np
>>> clusters = np.array([2, 2, 2, 3, 3, 1, 1, 1])
>>> relabel_clusters_sorted(clusters)
[ 1 1 1 2 2 3 3 3 ]
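The relabeling shown in the example can be sketched in a few lines of pure Python. This is an illustration of the idea only, not phasik's code: the helper name `relabel_by_appearance` is hypothetical, labels are assumed to start at 1, and the optional `final_labels` argument is not covered.

```python
def relabel_by_appearance(clusters):
    """Rename labels so that the first label seen becomes 1, the next new one 2, and so on."""
    mapping = {}
    relabeled = []
    for label in clusters:
        if label not in mapping:
            mapping[label] = len(mapping) + 1  # next unused label
        relabeled.append(mapping[label])
    return relabeled


print(relabel_by_appearance([2, 2, 2, 3, 3, 1, 1, 1]))  # [1, 1, 1, 2, 2, 3, 3, 3]
```

Only the names of the clusters change; which time points are grouped together stays the same.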
- relabel_clustersets(cluster_sets, method='consistent', translation=None)[source]
Relabels clusters in each cluster set, for consistency across different numbers of clusters.
This is especially useful when plotting cluster sets, to have consistent colouring. This function iterates over the different partitions in the cluster set and relabels them using relabel_next_clusterset_sorted or relabel_clusters_sorted depending on the method.
- Parameters
cluster_sets (ClusterSets)
method ({‘consistent’, ‘ascending’}, optional)
translation (dict, optional) – If None (default), has no effect. Otherwise, a dictionary that determines which label is replaced by which other label, for example {1: 2, 2: 3, 3: 1}. It is applied after the relabeling set by method.
- Returns
cluster_sets_sorted
- Return type
Examples
>>> print(cluster_sets.clusters)
[[1 1 1 2 2 2]
 [1 1 1 2 2 3]
 [2 1 1 3 3 4]]
>>> clusterset_sorted = pk.relabel_clustersets(cluster_sets, method="consistent")
>>> print(clusterset_sorted.clusters)  # last partition relabeled to match the previous one
[[1 1 1 2 2 2]
 [1 1 1 2 2 3]
 [4 1 1 2 2 3]]
>>> clusterset_sorted = pk.relabel_clustersets(cluster_sets, method="ascending")
>>> print(clusterset_sorted.clusters)
[[1 1 1 2 2 2]
 [1 1 1 2 2 3]
 [1 2 2 3 3 4]]
- relabel_clustersets_from_dict(cluster_sets, translation)[source]
Relabels clusters in each cluster set, so that clusters are labeled according to the translation dictionary
This is especially useful when plotting cluster sets, to have consistent colouring between different figures with cluster sets.
- Parameters
cluster_sets (ClusterSets)
translation (dict) – Dictionary that determines which label should be replaced by which other label For example {1: 2, 2: 3, 3: 1}
- Returns
cluster_sets_sorted
- Return type
Examples
>>> print(cluster_sets.clusters)
[[1 1 1 2 2 2]
 [1 1 1 2 2 3]
 [2 1 1 3 3 4]]
>>> translation = {1: 2, 2: 3, 3: 4, 4: 1}
>>> clustersets_new = pk.relabel_clustersets_from_dict(cluster_sets, translation)
>>> print(clustersets_new.clusters)
[[2 2 2 3 3 3]
 [2 2 2 3 3 4]
 [3 2 2 4 4 1]]
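The effect of a translation dictionary can be reproduced on plain nested lists. The helper `translate_labels` below is hypothetical and works on lists rather than on a ClusterSets object. Leaving labels that are missing from the dictionary unchanged is an assumption of this sketch, not documented behaviour.

```python
def translate_labels(partitions, translation):
    """Replace each label by its translation; labels without an entry are kept (assumption)."""
    return [[translation.get(label, label) for label in labels] for labels in partitions]


partitions = [
    [1, 1, 1, 2, 2, 2],
    [1, 1, 1, 2, 2, 3],
    [2, 1, 1, 3, 3, 4],
]
translation = {1: 2, 2: 3, 3: 4, 4: 1}
for row in translate_labels(partitions, translation):
    print(row)
# [2, 2, 2, 3, 3, 3]
# [2, 2, 2, 3, 3, 4]
# [3, 2, 2, 4, 4, 1]
```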
- relabel_next_clusterset_sorted(cluster_sets, cluster_sets_sorted, i)[source]
Relabels the clusters in the (i+1)-th cluster set so that they are consistent with the i-th cluster set.
This is especially useful when plotting cluster sets, to have consistent colouring.
- Parameters
cluster_sets (ClusterSets) – Original cluster sets
cluster_sets_sorted (ClusterSets) – Cluster sets being sorted, already sorted up to i-1
i (int) – Index of reference cluster set
- Returns
cluster_sets_sorted
- Return type
Examples
>>> from copy import deepcopy
>>> print(clusterset.clusters)
[[1 1 1 2 2 2]
 [1 1 1 2 2 3]
 [2 1 1 3 3 4]]
>>> clusterset_sorted = deepcopy(clusterset)
>>> pk.relabel_next_clusterset_sorted(clusterset, clusterset_sorted, 0)
>>> print(clusterset_sorted.clusters)  # unchanged because consistent
[[1 1 1 2 2 2]
 [1 1 1 2 2 3]
 [2 1 1 3 3 4]]
>>> pk.relabel_next_clusterset_sorted(clusterset, clusterset_sorted, 1)
>>> print(clusterset_sorted.clusters)  # the clusters at index 2 were relabeled
[[1 1 1 2 2 2]
 [1 1 1 2 2 3]
 [4 1 1 2 2 3]]
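One plausible way to obtain this kind of consistency is to match each cluster of the next partition to the reference cluster it overlaps most, and to give new labels to clusters left unmatched. The sketch below does this greedily on plain lists. It is an assumption-laden illustration, not phasik's algorithm, and the name `relabel_to_match` is hypothetical. It happens to reproduce the outputs of the example above.

```python
from collections import Counter


def relabel_to_match(reference, current):
    """Relabel `current` so clusters reuse the label of the reference cluster they overlap most."""
    # (current label, reference label) -> number of shared time points
    overlaps = Counter(zip(current, reference))
    mapping, used = {}, set()
    # Largest overlaps first; ties are broken by label values for determinism
    for (cur, ref), _ in sorted(overlaps.items(), key=lambda kv: (-kv[1], kv[0])):
        if cur not in mapping and ref not in used:
            mapping[cur] = ref
            used.add(ref)
    # Clusters that could not be matched get fresh labels
    next_label = max(reference) + 1
    for cur in sorted(set(current)):
        if cur not in mapping:
            mapping[cur] = next_label
            next_label += 1
    return [mapping[label] for label in current]


print(relabel_to_match([1, 1, 1, 2, 2, 2], [1, 1, 1, 2, 2, 3]))  # [1, 1, 1, 2, 2, 3]
print(relabel_to_match([1, 1, 1, 2, 2, 3], [2, 1, 1, 3, 3, 4]))  # [4, 1, 1, 2, 2, 3]
```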
Networks
Functions to visualize networks and temporal networks
- animate_temporal_network(temporal_network, color_temporal='red', color_constant='silver', width_scale=1.5, with_labels=True, pos=None, ax=None, interval=20, frames=None)[source]
Return animation of the temporal network evolving over time
- Parameters
temporal_network (phasik.TemporalNetwork) – Temporal network to visualise
color_temporal (str) – Color of the time-varying edges, defaults to ‘red’
color_constant (str) – Color of the constant edges (defaults to ‘silver’), i.e. for which we have no temporal information
width_scale (float) – Scale factor for width of the temporal edges compared to the constant ones
with_labels (bool, optional) – Whether to draw node labels
pos (dict) – Dictionary of node positions
ax (matplotlib.axis) – Axes to plot the animation on
interval (int) – Interval of time between frames, in ms.
frames (int) – Number of frames of the animation. Should be at most the number of time points, which is the default.
- Return type
matplotlib.animation
Examples
>>> import phasik as pk
>>> pk.animate_temporal_network(temporal_network)
- draw_graph(graph, ax=None, pos=None, color='mediumseagreen', edge_widths=None, edge_colors=None, edge_cmap=None, edge_vmin=None, edge_vmax=None, label_nodes=True, colorbar=True)[source]
Basic graph drawing function
- Parameters
graph (networkx.Graph) – Graph to visualise
ax (matplotlib.Axes, optional) – Axes on which to draw the graph
pos (dict) – Dictionary of node positions of the form {node_id : (x, y)}
color (str, optional) – Color to use for the graph nodes and edges (default ‘mediumseagreen’)
edge_widths (float or array of floats) – Line width of edges
edge_colors (color or array of colors) – Edge color. Can be a single color or a sequence of colors with the same length as edgelist. Color can be string or rgb (or rgba) tuple of floats from 0-1. If numeric values are specified they will be mapped to colors using the edge_cmap and edge_vmin,edge_vmax parameters.
edge_cmap (Matplotlib colormap, optional) – Colormap for mapping intensities of edges
edge_vmin,edge_vmax (floats, optional) – Minimum and maximum for edge colormap scaling
label_nodes (bool, optional) – Whether to label the nodes or just leave them as small circles (default True)
colorbar (bool, optional) – Whether to draw a colorbar
- Returns
The axis object to draw on
- Return type
matplotlib.axes.Axes
Examples
>>> import networkx as nx
>>> import phasik as pk
>>> G = nx.fast_gnp_random_graph(10, 0.5)
>>> pk.draw_graph(G)
- highlight_subgraphs(graphs, colors, ax=None, pos=None, label_nodes=True)[source]
Draw multiple nested subgraphs on the same axes
- Parameters
graphs (list of networkx.Graph)
colors (list of str) – List of colors, one for each of the graphs in ‘graphs’
ax (matplotlib.Axes, optional) – Axes to plot on
pos (dict) – Dictionary of node positions
label_nodes (bool, optional) – Whether or not to label the graph nodes or leave them as circles
- Return type
None
Examples
>>> import networkx as nx
>>> import phasik as pk
>>> graphs = [nx.fast_gnp_random_graph(10, 0.5) for i in range(3)]
>>> pk.highlight_subgraphs(graphs, ["red", "blue", "green"])
- standard_edge_params(color)[source]
Returns a dictionary containing standard values of edge plotting parameters
- Parameters
color (color) – Color to use for edges
- Return type
dict
- standard_label_params(color)[source]
Returns a dictionary containing standard values of label plotting parameters
- Parameters
color (color) – Color to use for labels
- Return type
dict
Drawing
Useful drawing functions
- plot_edge_series(temporal_network, edges, ax=None, **kwargs)[source]
Draw time series of edge weights, for the specified edges
- Parameters
temporal_network (pk.TempNet) – Temporal network
edges (list of str) – List of edges to plot
ax (matplotlib.Axes, optional) – Axes to use
**kwargs – Other parameters to pass to matplotlib’s plot
- Returns
The axis object to draw on
- Return type
matplotlib.axes.Axes
Examples
>>> import phasik as pk
>>> pk.plot_edge_series(temporal_network, ["A-B", "B-C"])
- plot_events(events, ax=None, text_y_pos=None, text_x_offset=0, period=None, n_periods=1, add_labels=True, orientation='vertical', zorder=-1, alpha=1, va='bottom')[source]
Visualize the occurrence of events as vertical lines.
This function was designed to be used in complement to another function plot_cluster_sets that draws objects over time (horizontal axis). The vertical lines are drawn at the horizontal value corresponding to the time of occurrence of the event.
- Parameters
events (list of tuples (time, name, line_style)) –
time - time at which the event occurred
name - the name of the event
line_style - any string accepted by matplotlib.lines.Line2D.set_linestyle
ax (matplotlib.Axes, optional) – Axes on which to plot the events
text_y_pos (float, optional) – Height at which to place the name of the event (default None)
text_x_offset (float, optional) – Distance along x-axis by which to offset the placement of the event name (default 0)
period (float or None, optional) – Length of time of one period, if events repeat periodically.
n_periods (int, optional) – Number of periods to draw, if events repeat periodically.
add_labels (bool, optional) – Whether to display the label of each vertical line, True by default.
orientation ({“vertical”, “horizontal”}, optional) – Orientation of the lines marking the events. Default: “vertical”.
zorder (float, optional) – Zorder of the lines marking the events. Default: -1.
alpha (float) – Transparency of the lines marking the events. Default: 1.
va (str, optional) – Vertical alignment of the event labels. Default: “bottom”.
- Returns
The axis object to draw on
- Return type
matplotlib.axes.Axes
See also
plot_phases, plot_cluster_sets
Examples
>>> import matplotlib.pyplot as plt
>>> import phasik as pk
>>> fig, (ax1, ax2) = plt.subplots(2, 1)
>>> events = [(5, "START", "dashed"), (33, "bud", "dashed"), (36, "ori", "dashed")]
>>> cluster_sets.plot(axs=(ax1, ax2), with_silhouettes=True)
>>> pk.plot_events(events, ax=ax1)
- plot_interval(binary_series, times, y=0, peak=None, color='k', ax=None, zorder=None)[source]
Plot a binary series as a sequence of coloured intervals
Specifically, draw rectangles to mark intervals where the binary series has value 1 (where it has value 0, do nothing).
- Parameters
binary_series (ndarray) – 2D array of binary data to plot
times (ndarray) – 1D array consisting of the corresponding time points
y (float, optional) – Height (y-axis value) at which to plot the interval (default 0)
peak (float, optional) – Time at which to mark the presence of a peak with a red star. By default (None), no peak is drawn.
color (str, optional) – Color to use for the intervals (default ‘k’)
ax (matplotlib.Axes, optional) – Axes to plot on
zorder (int, optional) – Height of the z-axis on which to plot the interval (default None)
- Returns
The axis object to draw on
- Return type
matplotlib.axes.Axes
Examples
>>> import phasik as pk
>>> binary_series = [1, 1, 1, 0, 0, 1, 1, 0]
>>> times = list(range(8))
>>> pk.plot_interval(binary_series, times, peak=2)
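To see which intervals such a plot would mark, the runs of 1s in a binary series can be extracted in pure Python. The helper `active_intervals` is hypothetical, and the sketch assumes that an interval ends at the last time point whose value is 1. How phasik sizes the rectangles exactly is not specified here.

```python
def active_intervals(binary_series, times):
    """Return (start_time, end_time) for each run of consecutive 1s (illustrative sketch)."""
    intervals = []
    start = None      # time at which the current run of 1s began, if any
    previous = None   # last time point visited
    for value, t in zip(binary_series, times):
        if value == 1 and start is None:
            start = t
        elif value != 1 and start is not None:
            intervals.append((start, previous))
            start = None
        previous = t
    if start is not None:  # series ends while a run is still open
        intervals.append((start, previous))
    return intervals


binary_series = [1, 1, 1, 0, 0, 1, 1, 0]
times = list(range(8))
print(active_intervals(binary_series, times))  # [(0, 2), (5, 6)]
```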
- plot_phases(phases, ax=None, y_pos=None, ymin=0, ymax=1, t_offset=0, color='k')[source]
Visualize temporal phases as shaded intervals
This function was designed to be used in complement to another function plot_cluster_sets that draws objects over time (horizontal axis). The phases are drawn as shaded regions spanning the time interval corresponding to the phases.
- Parameters
phases (list of tuples (start_time, end_time, name)) – The start time, end time, and name of each phase to visualize
ax (matplotlib.Axes) – Axes on which to plot the phases
y_pos (float or None, optional) – Height at which to place the name of the phase
ymin (float, optional) – Height at which to start shaded region (default 0)
ymax (float, optional) – Height at which to stop shaded region (default 1)
t_offset (float, optional) – Offset of phase on the time axis
color (color) – Color to draw the intervals in.
- Returns
The axis object to draw on
- Return type
matplotlib.axes.Axes
See also
plot_events, plot_cluster_sets
Examples
>>> import matplotlib.pyplot as plt
>>> import phasik as pk
>>> fig, (ax1, ax2) = plt.subplots(2, 1)
>>> phases = [(0, 35, "G1"), (35, 70, "S"), (70, 78, "G2")]
>>> cluster_sets.plot(axs=(ax1, ax2), with_silhouettes=True)
>>> pk.plot_phases(phases, ax=ax1, y_pos=0.05, ymax=0.1)
- threshold_plot(x, y, threshold, color_below_threshold, color_above_threshold, ax=None)[source]
Plot values above a certain threshold in a particular colour
- Parameters
x (array) – 1D array of values to plot along x-axis
y (array) – 1D array of values to plot along y-axis
threshold (float) – Only plot in colour the points (x,y) with y >= threshold
color_below_threshold (color) – Colour to use for points below threshold
color_above_threshold (list colors) – Colour to use for points above threshold
ax (matplotlib.Axes, optional) – Axes to use
- Returns
line_collection
- Return type
matplotlib LineCollection
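The thresholding rule described in the parameters (points with y >= threshold are drawn in colour) can be sketched as a per-point colour assignment. The helper `colors_by_threshold` is hypothetical, and the sketch stops short of building the line collection that the function returns.

```python
def colors_by_threshold(y, threshold, color_below_threshold, color_above_threshold):
    """Pick a colour for each value: 'above' when y >= threshold, 'below' otherwise."""
    return [
        color_above_threshold if value >= threshold else color_below_threshold
        for value in y
    ]


y = [0.2, 0.5, 0.9, 0.4]
print(colors_by_threshold(y, 0.5, "grey", "red"))  # ['grey', 'red', 'red', 'grey']
```

Note that a value exactly equal to the threshold counts as above it.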
Utils
General utility functions for plots
- adjust_margin(ax=None, top=0, bottom=0, left=0, right=0)[source]
Extend the margin of a plot by a percentage of its original width/height
- Parameters
ax (matplotlib.Axes, optional) – Axes whose margins to adjust
top (float, optional) – Percentage (as decimal) by which to increase top margin. Default: 0.
bottom (float, optional) – Percentage (as decimal) by which to increase bottom margin. Default: 0.
left (float, optional) – Percentage (as decimal) by which to increase left margin. Default: 0.
right (float, optional) – Percentage (as decimal) by which to increase right margin. Default: 0.
- Returns
ax – Axes with adjusted margins
- Return type
matplotlib.Axes
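The arithmetic behind extending a margin by a percentage of the original width or height can be sketched independently of matplotlib. The helper `extended_limits` is hypothetical and handles one axis at a time; it assumes that the percentage is applied to the original span of the axis limits.

```python
def extended_limits(lower, upper, before=0.0, after=0.0):
    """Extend an axis range by fractions of its original span (illustrative sketch)."""
    span = upper - lower
    return lower - before * span, upper + after * span


# x-limits (0, 10) with left=0.1 and right=0.2 grow by 10% and 20% of the width
print(extended_limits(0, 10, before=0.1, after=0.2))  # (-1.0, 12.0)
```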
- configure_sch_color_map(cmap)[source]
Set SciPy’s colour palette to use a particular color map
- Parameters
cmap (colormap) – Colormap to set
- Return type
None
- display_name(key)[source]
Get a more user-friendly name for certain keywords.
This function takes a keyword key and returns a more user-friendly name for that keyword if it exists in the predefined names dictionary. If the keyword is not found in the dictionary, it is returned as is.
- Parameters
key (str) – The keyword for which a display name is needed.
- Returns
The display name for the given keyword, or the original keyword if not found.
- Return type
str
Examples
>>> display_name('maxclust')
'Max # clusters'
>>> display_name('distance')
'Distance threshold'
>>> display_name('unknown')
'unknown'
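The lookup-with-fallback behaviour described above amounts to a dictionary `get` with the key as default. In the sketch below, the dictionary `DISPLAY_NAMES` and the function `display_name_sketch` are hypothetical, and only the two entries shown in the examples are included; the real dictionary may contain more.

```python
# Only the two entries visible in the documented examples (assumption: there may be more)
DISPLAY_NAMES = {
    "maxclust": "Max # clusters",
    "distance": "Distance threshold",
}


def display_name_sketch(key):
    """Return the user-friendly name for `key`, or `key` itself if none is defined."""
    return DISPLAY_NAMES.get(key, key)


print(display_name_sketch("maxclust"))  # Max # clusters
print(display_name_sketch("unknown"))   # unknown
```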
Remove unused axes in grid.
If the number of subplots does not fill a rectangular grid, there will be unused axes at the end of the grid. This removes those axes and adds axis ticks.
- Parameters
axes (list of matplotlib.Axes) – Axes containing the subplots
n_subplots (int) – Number of subplots in the grid; need not be a ‘rectangular’ number
xlabel (str) – Label of the x-axis
ylabel (str) – Label of the y-axis
- Returns
axes – Axes containing the subplots
- Return type
list of matplotlib.Axes
- palette_20_ordered(as_map=False)[source]
Create an ordered color palette of 20 colors.
The function uses the ‘tab20’ color palette from seaborn and rearranges the colors in an ordered pattern. By default, the colors are returned as a list, but if as_map is set to True, a ListedColormap object is returned.
- Parameters
as_map (bool, optional) – Whether to return the colors as a ListedColormap object (default is False).
- Returns
The ordered color palette. If as_map is True, a ListedColormap object is returned.
- Return type
list or ListedColormap
Examples
>>> cmap = pk.palette_20_ordered(as_map=True)