Community detection¶
Using library you can easily use state-of-the-art methods for community detection.
SCAN (PSCAN)¶
Implementation is based on [Zhao]. PSCAN bject implements the whole logic of algorithm. Method computeConnectedComponents(<graph>,<epsilon>), takes two parameters:
- graph - on with algorithm will be executed
- \(\epsilon\) - used for graph pruning based on similarity measure of edges.
Mentioned similarity is computed as follows:
\(sim(v,u)=\frac{|N(v)\cap{} N(u)|}{\sqrt{|N(v)||N(u)|}}\)
where \(N(v)\) is neighbours set of vertex \(v\) . Edeges with similarity lower than \(\epsilon\) (\(sim(v,u)<\epsilon\)) are removed from graph before main part of community detection.
Main part is based on label propagation and is implemented using apropriate data structures and PREGEL operator
import ml.sparkling.graph.operators.OperatorsDSL._
import ml.sparkling.graph.operators.algorithms.pscan.PSCAN.ComponentID
import org.apache.spark.SparkContext
import org.apache.spark.graphx.Graph
implicit ctx:SparkContext=???
// initialize your SparkContext as implicit value
val graph =???
// load your graph (for example using Graph loading API)
val components: Graph[ComponentID, Int] = PSCAN.computeConnectedComponents(graph)
// Graph where each vertex is associated with its component identifier
You can also use more readable method using DSL
import ml.sparkling.graph.operators.OperatorsDSL._
import ml.sparkling.graph.operators.algorithms.pscan.PSCAN.ComponentID
import org.apache.spark.SparkContext
import org.apache.spark.graphx.Graph
implicit ctx:SparkContext=???
// initialize your SparkContext as implicit value
val graph =???
// load your graph (for example using Graph loading API)
val components: Graph[ComponentID, Int] = graph.PSCAN(epsilon=0.5)
// Graph where each vertex is associated with its component identifier
References:
[Zhao] | Zhao, W., Martha, V., & Xu, X. (2013, March). PSCAN: a parallel Structural clustering algorithm for big networks in MapReduce. In Advanced Information Networking and Applications (AINA), 2013 IEEE 27th International Conference on (pp. 862-869). IEEE. PDF |