Logging and Troubleshooting
CGE logging and troubleshooting tips and techniques
CGE produces a text log that traces program execution during query or update processing. The log can be viewed with a text editor (such as vi) or, more typically, with the Linux less command. It can be searched for messages of interest with the grep command.
INFO messages will be deposited into the log during normal operation. CGE can also generate ERROR and WARN messages. All of these messages can yield information about activity that takes place during command execution.
System error messages can also appear in the log when CGE exits or shuts down improperly.
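As a rough first look at a log, the messages can be tallied by severity. The sketch below is illustrative only: it assumes each CGE log line begins with a date, a time, a time zone, and then the severity, as in the sample lines in this section. The function name count_severities is hypothetical and not part of CGE.

```python
from collections import Counter

# Assumed layout, based on the sample lines in this section:
# <date> <time> <timezone> <severity> ...
SEVERITIES = {"INFO", "WARN", "ERROR"}

def count_severities(lines):
    """Count log lines per severity; lines in any other layout are skipped."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # The severity is expected in the fourth whitespace-separated field.
        if len(fields) > 3 and fields[3] in SEVERITIES:
            counts[fields[3]] += 1
    return counts

sample = [
    "2015-Feb-10 19:34:26.513 CST INFO [][7720] 0x43 parser/parseAndBuildSM.cpp@374 "
    "allocQueryGlobals [] [QRY ] <OT> now starting query # 1",
    "Segmentation fault",  # a system message without the CGE prefix is skipped
]
print(count_severities(sample))
```

To run it against a real log, pass an open file object instead of the sample list. System messages that lack the CGE prefix are not counted, so a clean tally does not by itself rule out a problem.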
When queries or updates are executed, INFO messages with “now starting query #” are written to the log. For example:
2015-Feb-10 19:34:26.513 CST INFO [][7720] 0x43 parser/parseAndBuildSM.cpp@374 allocQueryGlobals [] [QRY ] <OT> now starting query # 1
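The query-start messages can be used to locate where each query begins in the log. The following sketch assumes the message text and timestamp layout shown in the example above. The name query_starts is hypothetical.

```python
import re

# Matches the text shown in the example above and captures the query number.
QUERY_START = re.compile(r"now starting query #\s*(\d+)")

def query_starts(lines):
    """Return (timestamp_text, query_number) for each query-start message."""
    found = []
    for line in lines:
        match = QUERY_START.search(line)
        if match:
            # The first two fields are assumed to be the date and the time.
            timestamp = " ".join(line.split()[:2])
            found.append((timestamp, int(match.group(1))))
    return found

sample = [
    "2015-Feb-10 19:34:26.513 CST INFO [][7720] 0x43 parser/parseAndBuildSM.cpp@374 "
    "allocQueryGlobals [] [QRY ] <OT> now starting query # 1",
]
print(query_starts(sample))
```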
INFO messages can also reveal long processing times, which show up as a large gap between the timestamps of one INFO message and the next. For example:
2015-Feb-13 14:44:45.500 CST INFO [][9448] 0xb utils/malloc/cqe_malloc.cpp@901 LogRequest [] [QRY |MEM ] image 0 : request by "file: parser/qengine/database.cpp, func: readFromDisk line: 989" of 69.849 MiB (0x45d9688) was filled. (0x10005200c80)
2015-Feb-13 14:49:31.099 CST INFO [][9448] 0xc parser/qengine/database.cpp@1141 readFromDisk [] [QRY |STRT] time to read in db of size 139.698 GiB (0x22ecb28000): 285.679279
2014-Dec-18 14:40:37.428 CST INFO [][25977] 0x5b parser/dbServer.cpp@1259 main [] [QRY |STRT|PERF] Total startup time: 1434.489315 seconds
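Gaps between consecutive messages can be found by parsing the leading timestamps. The sketch below is one possible approach, not a CGE tool. It assumes the timestamp layout of the sample lines, an English locale for the month abbreviation, and that all lines share one time zone (the time zone field is ignored). The names parse_stamp and long_gaps and the 60-second default threshold are hypothetical.

```python
from datetime import datetime

# Assumed timestamp layout, taken from the sample lines: 2015-Feb-13 14:44:45.500
# %b parses the month abbreviation and assumes an English locale.
STAMP_FORMAT = "%Y-%b-%d %H:%M:%S.%f"

def parse_stamp(line):
    """Return the datetime at the start of a log line, or None if absent."""
    fields = line.split()
    if len(fields) < 2:
        return None
    try:
        return datetime.strptime(" ".join(fields[:2]), STAMP_FORMAT)
    except ValueError:
        return None

def long_gaps(lines, threshold_seconds=60.0):
    """Return (gap_seconds, earlier_line, later_line) for gaps above the threshold."""
    gaps = []
    previous_time, previous_line = None, None
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue  # skip lines that do not start with a timestamp
        if previous_time is not None:
            gap = (stamp - previous_time).total_seconds()
            if gap > threshold_seconds:
                gaps.append((gap, previous_line, line))
        previous_time, previous_line = stamp, line
    return gaps

sample = [
    '2015-Feb-13 14:44:45.500 CST INFO [][9448] 0xb utils/malloc/cqe_malloc.cpp@901 '
    'LogRequest [] [QRY |MEM ] image 0 : request by "file: parser/qengine/database.cpp, '
    'func: readFromDisk line: 989" of 69.849 MiB (0x45d9688) was filled. (0x10005200c80)',
    '2015-Feb-13 14:49:31.099 CST INFO [][9448] 0xc parser/qengine/database.cpp@1141 '
    'readFromDisk [] [QRY |STRT] time to read in db of size 139.698 GiB (0x22ecb28000): '
    '285.679279',
]
for gap, earlier, later in long_gaps(sample):
    print(f"{gap:.3f} s between messages")
```

For the two sample lines, the timestamp difference is close to the read time reported in the second message, which suggests the gap corresponds to the database read.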
The following are examples of ERROR messages that CGE can produce when query or update processing has failed:
- No such file or directory
- No space left on device
- Exiting because malloc of
- Lookup failure for HURI
- Invalid graph algorithm name
- Exiting with status
- Bad entry
- Short read
- Assertion
- Realloc of
- Error detected in Dispatcher
It is recommended to search the log for the text "ERROR" and to contact Cray Support if problems are encountered in query or update processing.
- huri was not found
- directory not specified
- not found in IRA
- No valid quads in database
- Invalid object for quad
- Number of warnings found
- Unsupported datatype
- not in the dictionary
- IRA huris not allocated
- DUE TO TIME LIMIT
- terminate called without an active exception
- srun: error
- Segmentation fault
- Bus error
- free invalid pointer
- Out of memory
- Unable to terminate gracefully
- Floating point exception
- Aborted
- Killed
- Unable to allocate resources
- Exited with exit code
- Requested nodes are busy
- transaction completed with an error state
- LIBDMAPP ERROR
- IRI Resolution Error
- rpn not found for
- Trapped with SIGINT
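The search for "ERROR" recommended in this section can be combined with a scan for the message fragments listed here. The sketch below is a minimal illustration. The fragment list is a small subset copied from the lists in this section, and the name find_problems and the demo lines are made up for the example.

```python
# A few fragments copied from the lists in this section; extend as needed.
FRAGMENTS = [
    "No space left on device",
    "Lookup failure for HURI",
    "Segmentation fault",
    "Out of memory",
    "DUE TO TIME LIMIT",
]

def find_problems(lines, fragments=FRAGMENTS):
    """Return (line_number, reason, text) for each line that needs attention."""
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if "ERROR" in text:
            hits.append((number, "ERROR", text))
            continue
        for fragment in fragments:
            if fragment in text:
                hits.append((number, fragment, text))
                break  # report each line once
    return hits

# Made-up placeholder lines, not real CGE output.
demo = ["nothing unusual here", "Segmentation fault", "placeholder ERROR placeholder"]
for number, reason, text in find_problems(demo):
    print(number, reason, text)
```

Substring matching is deliberately simple. It mirrors what a grep for the same text would find, including any line that merely mentions "ERROR".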