Support for Simple GraphML Files
CGE's support/limitations related to GraphML
CGE enables importing simple GraphML files and generating the corresponding quads for the given graph(s). To enable importing a GraphML file, the user can either list a GraphML file in a graph.info file as part of a database build, or load a GraphML file. When CGE processes an input file, any file that ends with the .graphml extension will be treated as a GraphML file.
The syntax supported for GraphML files is based on the DTD specification provided at: http://graphml.graphdrawing.org/
The following is an example of a simple GraphML file:
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns
    http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
  <graph id="G" edgedefault="undirected">
    <node id="n0"/>
    <node id="n1"/>
    <node id="n2"/>
    <node id="n3"/>
    <node id="n4"/>
    <edge id="e1" source="n0" target="n2"/>
    <edge id="e2" source="n1" target="n2"/>
    <edge id="e3" source="n2" target="n3"/>
    <edge id="e4" source="n3" target="n4"/>
  </graph>
</graphml>
Limitations
There are multiple limitations in the current support for GraphML files, including the following:
- The xml and graphml elements are parsed, but otherwise ignored.
- Edge data is currently ignored.
- Default edge direction for a graph is ignored.
- Edge direction attribute is ignored.
- Default values for data are ignored.
- Elements in a graph are limited to descriptions, data, nodes and edges.
- Nodes and edges can only contain descriptions or data as subelements.
- Nested graphs are not supported.
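The structural limits listed above can be checked before handing a file to CGE. The following Python sketch is not part of CGE; it is a hypothetical pre-check that assumes "descriptions" and "data" correspond to the GraphML desc and data elements, and that the file uses the standard GraphML namespace. The function name unsupported_constructs and its message wording are illustrative only.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace, as used in the example file above.
NS = "{http://graphml.graphdrawing.org/xmlns}"

# Assumption: "descriptions" and "data" map to the <desc> and <data> elements.
GRAPH_CHILDREN = {"desc", "data", "node", "edge"}
NODE_EDGE_CHILDREN = {"desc", "data"}


def local(tag):
    """Strip the '{namespace}' part from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def unsupported_constructs(graphml_text):
    """Return a list of messages for constructs outside the documented limits."""
    problems = []
    root = ET.fromstring(graphml_text)
    for graph in root.findall(NS + "graph"):
        for child in graph:
            name = local(child.tag)
            if name not in GRAPH_CHILDREN:
                problems.append(
                    f"graph {graph.get('id')}: unsupported element <{name}>")
                continue
            if name in ("node", "edge"):
                for sub in child:
                    sub_name = local(sub.tag)
                    if sub_name == "graph":
                        problems.append(
                            f"{name} {child.get('id')}: nested graph is not supported")
                    elif sub_name not in NODE_EDGE_CHILDREN:
                        problems.append(
                            f"{name} {child.get('id')}: unsupported element <{sub_name}>")
    return problems


SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="n0"><graph id="inner" edgedefault="undirected"/></node>
    <node id="n1"/>
    <hyperedge/>
    <edge id="e1" source="n0" target="n1"><data key="d0">1.0</data></edge>
  </graph>
</graphml>"""

if __name__ == "__main__":
    for message in unsupported_constructs(SAMPLE):
        print(message)
```

The sketch only reports structure that falls outside the list. Ignored items such as edge data or edge direction are not flagged, because they do not stop a file from being processed.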
When translating an edge to a quad, CGE will convert the edge identifier as well as the source and target identifiers to URIs.
For example, the following edge in graph G:
<edge id="e1" source="n0" target="n2"/>
is translated to the quad:
<urn:n0> <urn:e1> <urn:n2> <urn:G> .
Note that when converting an identifier to a URI, CGE will insert the urn: prefix by default. Also, if any error is found when parsing an edge, no quad will be generated for that edge. For example, if a node referred to by an edge does not exist in the given graph, or if there was an error when parsing the node declaration, no quad will be generated for that edge.
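The edge-to-quad translation described above can be sketched in Python. This is a minimal illustration, not CGE's implementation: the function name edges_to_quads is hypothetical, the output is assumed to be N-Quads style lines in the order source, edge, target, graph, and every graph is assumed to carry an id attribute. Edges that refer to a node missing from the graph are skipped, mirroring the rule that no quad is generated for such an edge.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace, as used in the example file above.
NS = "{http://graphml.graphdrawing.org/xmlns}"


def edges_to_quads(graphml_text, prefix="urn:"):
    """Return one quad line per valid edge: <source> <edge> <target> <graph> ."""
    root = ET.fromstring(graphml_text)
    quads = []
    for graph in root.findall(NS + "graph"):
        graph_id = graph.get("id")  # assumption: each graph has an id
        # Collect the identifiers of all nodes declared directly in this graph.
        node_ids = {n.get("id") for n in graph.findall(NS + "node")
                    if n.get("id") is not None}
        for edge in graph.findall(NS + "edge"):
            eid = edge.get("id")
            src = edge.get("source")
            dst = edge.get("target")
            # No quad is produced for an edge with a missing id or endpoint.
            if eid is None or src not in node_ids or dst not in node_ids:
                continue
            quads.append(
                f"<{prefix}{src}> <{prefix}{eid}> <{prefix}{dst}> <{prefix}{graph_id}> .")
    return quads


SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="n0"/>
    <node id="n2"/>
    <edge id="e1" source="n0" target="n2"/>
    <edge id="e9" source="n0" target="n7"/>
  </graph>
</graphml>"""

if __name__ == "__main__":
    for quad in edges_to_quads(SAMPLE):
        print(quad)
```

In the sample, edge e9 refers to node n7, which is not declared in the graph, so only the quad for e1 is produced.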
NVPs for GraphML Support
cge.server.ExportGMLRDFEnable - Setting this NVP to 1 will cause CGE to export the quads generated for a given GraphML file to an nt file with the same name as the input GraphML file but with the nt extension. For example, if a graph.info file includes the line:
/my/path/to/file_name.graphml
and the given NVP is set to 1, then CGE will write the quads produced by the GraphML file to an nt file named:
/my/path/to/file_name.nt
Exporting the quads to an nt file can be useful if the quads will be loaded multiple times, since loading quads is faster and uses less memory than loading from a GraphML file. This NVP is off by default.
cge.server.GMLInsertPrefix - Setting this to 1 will cause CGE to insert the urn: prefix when converting identifiers for graphs, nodes, and edges to URIs. For example, the following edge:
<edge id="e1" source="n0" target="n2"/>
would result in URIs of <urn:e1>, <urn:n0> and <urn:n2> for the edge, source and target identifiers, respectively. This NVP is on by default.
cge.server.GMLCheckPrefix - Setting this to 1 will cause CGE to check an identifier for a known prefix before inserting the default urn: prefix. The prefixes that CGE will check for are:
- urn:
- http:
- https:
An identifier that already begins with one of these prefixes is used as is, without the urn: prefix. For example, given the following edge:
<edge id="http://www.mysite.com/e1" source="n0" target="n2"/>
having this NVP set will result in the following URIs:
<http://www.mysite.com/e1>
<urn:n0>
<urn:n2>
Otherwise, CGE inserts the urn: prefix by default.
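The combined effect of these NVPs can be modeled with a short Python sketch. It is an illustration under stated assumptions, not CGE code: the names to_uri and nt_export_path are hypothetical, the two boolean parameters stand in for cge.server.GMLInsertPrefix and cge.server.GMLCheckPrefix, and the behavior with prefix insertion turned off (identifier used unchanged) is an assumption, since the text does not describe it.

```python
from pathlib import PurePosixPath

# Prefixes that are checked for when prefix checking is enabled.
KNOWN_PREFIXES = ("urn:", "http:", "https:")


def to_uri(identifier, insert_prefix=True, check_prefix=False):
    """Convert a GraphML identifier to a URI in angle brackets.

    insert_prefix models cge.server.GMLInsertPrefix (on by default).
    check_prefix models cge.server.GMLCheckPrefix.
    Assumption: with insert_prefix off, the identifier is used unchanged.
    """
    has_known_prefix = identifier.startswith(KNOWN_PREFIXES)
    if insert_prefix and not (check_prefix and has_known_prefix):
        identifier = "urn:" + identifier
    return f"<{identifier}>"


def nt_export_path(graphml_path):
    """Model the export file name: same path with the nt extension."""
    return str(PurePosixPath(graphml_path).with_suffix(".nt"))


if __name__ == "__main__":
    print(to_uri("e1"))
    print(to_uri("http://www.mysite.com/e1", check_prefix=True))
    print(to_uri("n0", check_prefix=True))
    print(nt_export_path("/my/path/to/file_name.graphml"))
```

With prefix checking enabled, the http: identifier from the example is kept as is, while n0 still receives the urn: prefix.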