Clustering belief systems
Clustering is a precondition for some diversity and polarisation measures. Intuitively, a cluster is a group of agents whose belief systems lie at a low distance from one another. In the theory of dialectical structures, belief systems are clustered by comparing the distances between agents' positions, so every clustering depends on the choice of a distance function.
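To make the role of the distance function concrete, here is a minimal sketch of a normalised Hamming distance. It assumes that positions are complete truth-value assignments over the same set of sentences, represented as plain dictionaries. The helper `normalised_hamming` and the three agents are hypothetical and are not taupy's own `normalised_hamming_distance`.

```python
def normalised_hamming(pos_a, pos_b):
    """Share of sentences on which two complete positions disagree.

    Hypothetical helper for illustration; assumes both positions
    assign a truth value to exactly the same sentences.
    """
    if pos_a.keys() != pos_b.keys():
        raise ValueError("positions must cover the same sentences")
    if not pos_a:
        return 0.0
    disagreements = sum(pos_a[s] != pos_b[s] for s in pos_a)
    return disagreements / len(pos_a)


# Three hypothetical agents over four sentences.
anna = {"p": True, "q": True, "r": False, "s": False}
ben = {"p": True, "q": True, "r": False, "s": True}
cleo = {"p": False, "q": False, "r": True, "s": True}

print(normalised_hamming(anna, ben))   # 0.25
print(normalised_hamming(anna, cleo))  # 1.0
```

Under this measure, anna and ben would plausibly end up in the same cluster, while cleo sits far from both.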
Generate clustering matrices
The clustering algorithms accept a clustering_matrix as input, which is generated by clustering_matrix():
- taupy.analysis.clustering.clustering_matrix(positions, *, measure=<function normalised_hamming_distance>, scale=-4, distance_threshold=0.2)
Converts a difference matrix to a sparse clustering (adjacency) matrix that can be input to community structuring algorithms. This is necessary because many clustering algorithms are designed for sparse social networks.
The default scale of -4 means that agents with a normalised δ > 0.4 will be flattened.
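The general idea of turning a dense difference matrix into a sparse adjacency matrix can be sketched as follows. This is an illustration under stated assumptions, not taupy's implementation: the hypothetical helper `sparse_adjacency` uses `1 - distance` as the edge weight and drops every pair whose distance exceeds `distance_threshold`. The role of the `scale` parameter is not modelled here.

```python
def sparse_adjacency(distances, distance_threshold=0.2):
    """Turn a symmetric distance matrix into a thresholded similarity matrix.

    Assumption for illustration: the edge weight is 1 - distance, and pairs
    farther apart than distance_threshold get no edge (weight 0).
    """
    n = len(distances)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and distances[i][j] <= distance_threshold:
                adjacency[i][j] = 1.0 - distances[i][j]
    return adjacency


# Hypothetical normalised distances between three agents.
distances = [
    [0.0, 0.125, 0.875],
    [0.125, 0.0, 0.75],
    [0.875, 0.75, 0.0],
]

for row in sparse_adjacency(distances):
    print(row)
```

Only the close pair of agents 0 and 1 keeps an edge. The third agent becomes an isolated node, which is the kind of sparse graph that community detection algorithms are designed for.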
Clustering algorithms
taupy implements four clustering functions. These functions serve as frontends to the clustering algorithms in igraph and sklearn.
- taupy.analysis.clustering.leiden(positions, *, clustering_settings={})
Return the community structure obtained by the Leiden clustering algorithm (see [Traag2019]).
- taupy.analysis.clustering.affinity_propagation(positions, *, clustering_settings={})
Return the community structure obtained by clustering with Affinity Propagation ([Frey2007]).
- taupy.analysis.clustering.agglomerative_clustering(positions, *, distance_threshold=0.75, base_measure=<function normalised_hamming_distance>)
Return the community structure obtained by Agglomerative Clustering. Note that Agglomerative Clustering accepts an ordinary difference matrix, not an adjacency matrix as Leiden and Affinity Propagation do. Do not pass the output of clustering_matrix() to this function; use difference_matrix() with a normalised distance measure as input instead.
- taupy.analysis.clustering.density_based_clustering(positions, *, min_cluster_size=3, max_neighbour_distance=0.2, base_measure=<function normalised_hamming_distance>)
Return the community structure obtained from density-based clustering on a distance (not adjacency) matrix. This is the only clustering algorithm implemented in this module that allows for noise. Points labelled -1 are noise.
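How a density-based method arrives at noise labels can be sketched with a simplified DBSCAN-style procedure on a precomputed distance matrix. This is a stand-in written for illustration, not the library routine that taupy wraps. It assumes that `max_neighbour_distance` acts as the neighbourhood radius and that `min_cluster_size` is the minimum number of points (including the point itself) within that radius for a point to count as a core point.

```python
NOISE = -1


def dbscan_precomputed(distances, max_neighbour_distance=0.2, min_cluster_size=3):
    """Simplified DBSCAN on a symmetric distance matrix (illustrative sketch)."""
    n = len(distances)
    labels = [None] * n  # None means "not visited yet"
    cluster = -1

    def neighbours_of(p):
        # The point itself is included because its distance to itself is 0.
        return [q for q in range(n) if distances[p][q] <= max_neighbour_distance]

    for point in range(n):
        if labels[point] is not None:
            continue
        neighbours = neighbours_of(point)
        if len(neighbours) < min_cluster_size:
            labels[point] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[point] = cluster
        queue = [q for q in neighbours if q != point]
        while queue:
            q = queue.pop()
            if labels[q] == NOISE:
                labels[q] = cluster  # border point reached from a core point
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbours = neighbours_of(q)
            if len(q_neighbours) >= min_cluster_size:
                queue.extend(q_neighbours)  # q is a core point: keep expanding
    return labels


# Hypothetical distances: agents 0-2 agree closely, agents 3 and 4 form a pair
# that is too small to count as a cluster when min_cluster_size is 3.
distances = [
    [0.0, 0.1, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.1, 0.9, 0.8],
    [0.1, 0.1, 0.0, 0.9, 0.8],
    [0.9, 0.9, 0.9, 0.0, 0.1],
    [0.8, 0.8, 0.8, 0.1, 0.0],
]

print(dbscan_precomputed(distances))  # [0, 0, 0, -1, -1]
```

The first three agents form cluster 0. The remaining two are close to each other but too few to be dense, so both receive the noise label -1.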
Comparing clusterings
The adjusted Rand index (ARI, [HubertArabie1985]) measures the similarity between two clusterings. For simulations that contain many debate stages, the ARI can indicate whether the clustering in subsequent debate stages has completely changed or is somewhat stable. A reasonably high ARI can support the reliability of the clustering method for the analysed debate stages.
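The ARI can be computed from pair counts alone, as the following standard-library sketch shows. The function `adjusted_rand_index` and the stage labels are hypothetical and written for illustration. In practice a library routine such as `sklearn.metrics.adjusted_rand_score` computes the same quantity.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index of two clusterings given as label sequences."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same agents")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two agents: trivially identical

    # Number of agent pairs placed together in both, in A, and in B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every agent is its own cluster in both
    return (together_both - expected) / (maximum - expected)


# Hypothetical cluster labels for six agents at three debate stages.
stage_1 = [0, 0, 0, 1, 1, 1]
stage_2 = [1, 1, 1, 0, 0, 0]  # same grouping, different label names
stage_3 = [0, 0, 1, 1, 2, 2]  # the grouping has partly changed

print(adjusted_rand_index(stage_1, stage_2))            # 1.0
print(round(adjusted_rand_index(stage_1, stage_3), 3))  # 0.242
```

The index ignores how clusters are named, so stages 1 and 2 count as identical. The drop to roughly 0.24 between stages 1 and 3 signals that the cluster structure has shifted considerably.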