Cray Graph Engine (CGE) Quick Reference

Provides a quick start reference for using CGE.

The order in which CGE operations should be performed is:

Step 1: Set up SSH keys

If the following command logs back in to the login node without prompting for a password, the SSH keys are set up sufficiently for using CGE.
$ ssh localhost
If the previous command fails and SSH keys already exist that use neither pass-phrases nor the ssh-agent, try the following:
$ cat ~/.ssh/id_*.pub >> ~/.ssh/authorized_keys
If the test above now succeeds, that is, it is possible to log back in to the login node without a password, pass-phrase, or ssh-agent, this step is complete. If the test still fails and no SSH keys are defined yet, the following commands can be used to set them up:
CAUTION: Ensure that there are no existing SSH keys, because these commands will overwrite any existing keys. Also, do not specify a pass-phrase when running ssh-keygen.
$ mkdir -p ~/.ssh
$ chmod 700 ~/.ssh
$ ssh-keygen
$ cat ~/.ssh/id_*.pub >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/id_*
$ chmod 600 ~/.ssh/authorized_keys

If the existing SSH key(s) use pass-phrase(s) or the ssh-agent, or if a more complex SSH key configuration is required, see Cray Graph Engine (CGE) Security Mechanisms. That section also contains information about fine-tuning access to CGE instances.
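
The login test in this step can also be scripted. The following is a minimal sketch, not part of CGE: the function name can_ssh_without_prompt is hypothetical, and it assumes an OpenSSH client, whose BatchMode option makes ssh fail instead of prompting for a password or pass-phrase.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with CGE): succeeds only if ssh can log in
# to the given host (default: localhost) without any interactive prompt.
can_ssh_without_prompt() {
    host="${1:-localhost}"
    # BatchMode=yes disables password and pass-phrase prompts, so a missing or
    # unusable key results in a non-zero exit status instead of a prompt.
    # ConnectTimeout keeps the check from hanging on an unreachable host.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true >/dev/null 2>&1
}

# Example usage:
#   if can_ssh_without_prompt; then
#       echo "SSH keys are set up sufficiently for CGE"
#   else
#       echo "password-less SSH login failed; revisit Step 1" >&2
#   fi
```

Two caveats apply to this sketch. It also fails when the host key has not been accepted yet, because BatchMode suppresses that prompt as well. It cannot tell a plain key from one served by a running ssh-agent, so a success does not rule out ssh-agent use.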

Step 2: Start the CGE Server

The cge-launch command launches the CGE query engine and enables creating and building a database in a single step.

The following is an example of using the cge-launch command:
$ cge-launch -o pathtoResultsDir -d path -l logfile
In the preceding example:
  • -o - Specifies the path to a directory where you want the result files produced by queries to be placed.
    CAUTION: This path MUST be a directory.
  • -d - Specifies the path to the directory containing the data set to be loaded into the server. This directory must contain all input data files for the data set.
    Note: This directory MUST contain at least one of the following if the data set is being built for the first time with CGE (only one of these will actually be used):
    • dataset.nt - This file contains triples and must be named dataset.nt
    • dataset.nq - This file contains quads and must be named dataset.nq
    • graph.info - This file contains a list of pathnames or URLs to files containing triples or quads and must be named graph.info.
  • -l - Specifies a log file to capture the command output from the run. If the database server is logging to stderr, this log file will capture that information as well. There are two special argument values for this: ':1' and ':2', which refer to stdout and stderr, respectively, so that the log can be directed to either of those. If the -l option is specified, the cge-launch command runs silently, producing no output to the terminal stdout/stderr.
For more information, see Launch the CGE Server Using the cge-launch Command and The CGE Database Build Process.
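
The requirements on the -o and -d arguments can be checked before launching. The following is a minimal sketch under stated assumptions: check_cge_inputs is a hypothetical helper, not a CGE command, and the paths in the usage comment are placeholders. It only tests the conditions listed above for a data set that is being built for the first time.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of CGE): verifies the -o and -d
# arguments described above before cge-launch is invoked.
check_cge_inputs() {
    results_dir="$1"
    data_dir="$2"

    # The -o path must be a directory.
    if [ ! -d "$results_dir" ]; then
        echo "results path is not a directory: $results_dir" >&2
        return 1
    fi
    # The -d path must be a directory as well.
    if [ ! -d "$data_dir" ]; then
        echo "data set path is not a directory: $data_dir" >&2
        return 1
    fi
    # A first-time build needs dataset.nt, dataset.nq, or graph.info.
    for name in dataset.nt dataset.nq graph.info; do
        if [ -f "$data_dir/$name" ]; then
            return 0
        fi
    done
    echo "no dataset.nt, dataset.nq, or graph.info in $data_dir" >&2
    return 1
}

# Example usage (paths are placeholders):
#   check_cge_inputs "$HOME/results" "$HOME/mydata" &&
#       cge-launch -o "$HOME/results" -d "$HOME/mydata" -l "$HOME/cge.log"
```

Because the input-file requirement applies only to a first-time build, this check may be too strict for a directory that already holds a built database.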

Step 3: Execute CGE CLI Commands (Optional)

CGE CLI commands can be executed after the CGE query engine has been launched. The following is an example of using the CGE nvp-info CLI command:
$ cge-cli nvp-info
CGE CLI features a number of commands, which are documented in the CGE CLI section.

Step 4: Start up the CGE Front End Server to Connect with the CGE Server (Optional)

The CGE graphical user interface and SPARQL endpoints can be accessed once the database has been launched. This can be accomplished by launching the web server that provides the user interface on a login node of the system where CGE is running.
$ cge-cli fe --ping
The --ping option in the preceding example verifies that the database can be reached as soon as the web server launches, so that any connection failure is reported immediately. Without it, failures may be delayed or hidden. If the ping operation does not succeed, and it is certain that the user executing this command is the only user running CGE and that everything else is set up correctly, go back to Step 1 and make sure that the SSH keys are set up correctly. The system may prompt to trust the host key when the fe command is run for the first time.
The default URL to access the UI is http://<hostname>:3756/dataset, where <hostname> stands for the name of the host running the web server. For more information, see Launch the CGE Web Server Using the fe Command.
Alternatively, the following command can be used to have the web server continue running in the background with its logs redirected, even if disconnected from the terminal session:
$ nohup cge-cli fe > web-server.log 2>&1 &

Step 5: Access and Use the CGE Front End (Optional)

For more information, see CGE GUI.
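
Before opening the front end in a browser, it can help to confirm that the web server is answering. The following is a minimal sketch, not part of CGE: the function names and the host name login1 are hypothetical, the port and path are the defaults given in this document, and it assumes that curl is installed and that the page answers a plain GET request with a success status.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CGE).

# Prints the default UI URL for a web server host, using the default port
# and path given in this document.
cge_ui_url() {
    printf 'http://%s:3756/dataset\n' "$1"
}

# Succeeds if the URL answers an HTTP request within 5 seconds.
# Assumes curl is installed on the machine running the check.
cge_ui_reachable() {
    curl --silent --fail --max-time 5 --output /dev/null "$(cge_ui_url "$1")"
}

# Example usage (login1 is a placeholder for the node running "cge-cli fe"):
#   cge_ui_reachable login1 && echo "front end is up at $(cge_ui_url login1)"
```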

Shutdown the CGE Server

Additional Information

Cancelling a query - To cancel a query, press Ctrl-C in the window where the CGE server was launched, or locate the CGE server instance's PID on the login node and use kill -INT <PID>. After that, re-launch CGE.
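
The kill -INT route can be wrapped with a couple of sanity checks. The following is a minimal sketch: cancel_cge_query is a hypothetical helper, not a CGE command, and it assumes the PID of the CGE server instance has already been looked up by the user.

```shell
#!/bin/sh
# Hypothetical helper (not part of CGE): sends SIGINT to a CGE server
# process whose PID has already been looked up on the login node.
cancel_cge_query() {
    pid="$1"
    # Accept only a non-empty string of digits.
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: cancel_cge_query <PID>" >&2
            return 2
            ;;
    esac
    # kill -0 sends no signal; it only tests that the process exists and
    # that the current user is allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no signalable process with PID $pid" >&2
        return 1
    fi
    kill -INT "$pid"
}

# Example usage (12345 is a placeholder PID):
#   cancel_cge_query 12345
```

As noted above, CGE has to be re-launched after the query is cancelled this way.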