# Community detection¶

Using library you can easily use state-of-the-art methods for community detection.

## SCAN (PSCAN)¶

Implementation is based on [Zhao]. PSCAN bject implements the whole logic of algorithm. Method computeConnectedComponents(<graph>,<epsilon>), takes two parameters:

• graph - on with algorithm will be executed
• $$\epsilon$$ - used for graph pruning based on similarity measure of edges.

Mentioned similarity is computed as follows:

$$sim(v,u)=\frac{|N(v)\cap{} N(u)|}{\sqrt{|N(v)||N(u)|}}$$

where $$N(v)$$ is neighbours set of vertex $$v$$ . Edeges with similarity lower than $$\epsilon$$ ($$sim(v,u)<\epsilon$$) are removed from graph before main part of community detection.

Main part is based on label propagation and is implemented using apropriate data structures and PREGEL operator

import ml.sparkling.graph.operators.OperatorsDSL._
import ml.sparkling.graph.operators.algorithms.pscan.PSCAN.ComponentID
import org.apache.spark.SparkContext
import org.apache.spark.graphx.Graph

implicit ctx:SparkContext=???
// initialize your SparkContext as implicit value
val graph =???

val components: Graph[ComponentID, Int] = PSCAN.computeConnectedComponents(graph)
// Graph where each vertex is associated with its component identifier


You can also use more readable method using DSL

import ml.sparkling.graph.operators.OperatorsDSL._
import ml.sparkling.graph.operators.algorithms.pscan.PSCAN.ComponentID
import org.apache.spark.SparkContext
import org.apache.spark.graphx.Graph

implicit ctx:SparkContext=???
// initialize your SparkContext as implicit value
val graph =???