Betweenness Centrality

Description, inputs and outputs of the Betweenness Centrality algorithm

URI and scalar arguments

<http://cray.com/graphAlgorithm.betweenness_centrality> (st_vx_ct, normalize)
In the above URI, st_vx_ct and normalize are used as examples.

Description

This is the CGE specific implementation of the classical vertex-betweenness-centrality algorithm. This algorithm assigns each vertex a numerical score. Take a given vertex V. In full generality, its betweenness score is defined to be the sum (over all other pairs of vertices) of the ratio of the number of shortest paths between that pair that go through V, over the total number of shortest paths between that pair. Thus it measures a sort of “importance” of each vertex, in terms of the shortest paths to other vertices that pass through it.  

Inputs and Default Values

ParameterDescriptionDefault Value
st_vx_ctThe st_vx_ct parameter can either be an integer or a decimal.
  • If the starting_vertex_ctl parameter is an integer, it represents how many starting vertices should be used when approximating the betweenness score of every vertex in the graph.
  • If the starting_vertex_ctl parameter is a decimal, it should be between 0.0 and 1.0. If a decimal argument is used, the decimal value will represent the fraction of the graph's vertices, randomly chosen, that will be used as starting vertices for approximating the betweenness scores. A value of 1.0 (the default) specifies that every vertex in the graph will be used as a starting vertex.
1.0
normalizeThe normalize parameter specifies whether or not the betweenness scores should be normalized. The acceptable values for this parameter are 0 and 1, where 1 specifies that betweenness scores should be normalized.

Normalizing the scores means to subtract from the betweenness score of each vertex the minimum betweenness score and then divide that partial result by the difference between the maximum and minimum betweenness scores found among all the vertices. Normalized scores will be between 0.0 and 1.0.

1

Outputs

A call to the Betweenness Centrality function returns a two-column intermediate result set. The first column contains the vertex identifier (URI), whereas the second column contains the centrality score of the vertex. In other words, each row of the output result set pairs a vertex’s ID with a double-precision floating-point value representing the centrality score for that vertex.

Example: Betweenness Centrality

PREFIX cray: <http://cray.com/>
SELECT ?vertices ?scores
WHERE {
  CONSTRUCT {
     ?sub ?pred ?obj .
  } WHERE{
      ?sub ?pred ?obj .
  }
    INVOKE cray:graphAlgorithm.betweenness_centrality(.01,1)
    PRODUCING ?vertices ?scores
}
ORDER BY DESC(?scores)

Special Consideration for Graphs with Very Large Diameter

The value of the cge.server.BCmaxActiveLevels NVP parameter can be used to better handle graphs with large diameters. The default setting for this parameter is 100 and can be increased if needed.

If the value of this parameter is set to a value that is too low, the database will remain up and running, but the query will be halted, and the system will return a message indicating that the cge.server.BCmaxActiveLevels parameter's value (i.e. the allocation size) needs to be increased.

The error message is written to a CGE log file as well as to the front end.

The message written to the CGE log file will be similar to the following:
Warning, graph diameter is larger than current allocation for LevelSet data structure.
Use NVP parameter BCmaxActiveLevels to increase the size of the allocation currently set to X levels.
Here X is used as an example for the current value of the BCmaxActiveLevels parameter.
Similarly, the following message will be returned to the CGE front end:
graph diameter is larger than current allocation for LevelSet data structure. 
Use NVP parameter BCmaxActiveLevels to increase the size of the allocation