Bad Rank

Description, inputs and outputs of the Bad Rank algorithm

URI

<http://cray.com/graphAlgorithm.badrank>

Description

The Bad Rank algorithm assigns a “badness” score to all vertices in the graph based on their nearness to known bad vertices.

Inputs and Default Values


Input	Default Value
The threshold for the maximum difference between per-vertex Bad Rank results from successive iterations of the algorithm. Once the difference falls below this threshold, the algorithm terminates.	`0.0001`
The probability that the next step in a (random) walk will be followed.	`0.84`
The probability that a random walk will take a next step to a bad vertex.	`0.01`
The URI that designates the object field of a triple that identifies a spam vertex.	<http://cray.com/spamVertex>
The URI that designates the object field of a triple that identifies a non-spam, or trusted vertex.	<http://cray.com/nonspamVertex>
The URI that designates the predicate field of a triple that identifies either a spam or a non-spam vertex.	Defaults to the standard RDF type predicate, <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, which can be abbreviated in a SPARQL query as `a`.
The indicator that specifies whether or not normalization should be applied to results. Acceptable values for this parameter are `0` and `1`.	`1`. If the default value is used, all scores are mapped to floating point numbers between `0.0` and `1.0`: the maximum score found maps to `1.0`, the minimum score found maps to `0.0`, and other scores map proportionately between those values. If the value is set to `0`, results are not normalized and are presented as Bad Rank computed them.
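
The normalization and termination behavior described in the table can be illustrated with a minimal Python sketch. This is not the Cray implementation: the function names `normalize_scores` and `has_converged`, the dictionary representation of scores, and the handling of the case where all scores are equal are assumptions made for illustration only.

```python
def normalize_scores(scores):
    """Map raw scores proportionately into [0.0, 1.0] (min-max normalization).

    `scores` is a hypothetical dict of vertex identifier -> raw score.
    """
    lowest = min(scores.values())
    highest = max(scores.values())
    span = highest - lowest
    if span == 0.0:
        # Assumption: when every score is identical the proportional mapping
        # is undefined, so this sketch simply returns 0.0 for every vertex.
        return {vertex: 0.0 for vertex in scores}
    return {vertex: (score - lowest) / span for vertex, score in scores.items()}


def has_converged(previous, current, threshold=0.0001):
    """Return True when the largest per-vertex change between two successive
    iterations is below the threshold (the table's default is 0.0001)."""
    largest_change = max(abs(current[v] - previous[v]) for v in current)
    return largest_change < threshold
```

With normalization enabled, raw scores of 2.0, 4.0 and 3.0 would map to 0.0, 1.0 and 0.5 respectively under this sketch.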

Outputs

Bad Rank produces a two-column intermediate result that can be thought of as a set of pairs. The first item in each pair is the identifier of a vertex, whereas the second is the double-precision Bad Rank value of the vertex.
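
As a rough illustration of working with such a result, the sketch below treats the output as a list of (vertex identifier, score) pairs and picks out the highest-scoring vertices. The vertex URIs, the score values, and the helper name `worst_vertices` are all hypothetical; how the pairs are actually retrieved from the intermediate result is not covered here.

```python
# Hypothetical result rows: (vertex identifier, Bad Rank value).
results = [
    ("<http://example.org/v1>", 0.12),
    ("<http://example.org/v2>", 1.0),
    ("<http://example.org/v3>", 0.0),
    ("<http://example.org/v4>", 0.57),
]


def worst_vertices(pairs, limit=2):
    """Return the `limit` pairs with the highest Bad Rank value, worst first."""
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:limit]


for vertex, score in worst_vertices(results):
    print(vertex, score)
```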