Serialization Formats in Neptune Streams
Amazon Neptune uses two different formats for serializing graph-change data to log streams, depending on whether the graph is a property graph (queried with Gremlin or openCypher) or an RDF graph (queried with SPARQL).
Both formats share a common record serialization format, as described in Neptune Streams API Response Format, that contains the following fields:
- commitTimestamp – The time at which the commit for the transaction was requested, in milliseconds from the Unix epoch.
- eventId – The sequence identifier of the stream change record.
- data – The serialized Gremlin, SPARQL, or openCypher change record. The serialization formats of each record are described in more detail in the next sections.
- op – The operation that created the change.
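As a rough illustration of the common record fields, the following Python sketch checks that a record carries all four fields and converts commitTimestamp to a UTC datetime. Only the four field names come from the description above; the sample values (including the shape of eventId and the op string) and the helper name commit_time_utc are assumptions made up for this example.

```python
from datetime import datetime, timezone

# Hypothetical record: only the four field names come from the text above.
# The values (including the shape of eventId and the op string) are placeholders.
record = {
    "commitTimestamp": 1700000000000,
    "eventId": "placeholder-sequence-id",
    "data": {},
    "op": "placeholder-op",
}

REQUIRED_FIELDS = ("commitTimestamp", "eventId", "data", "op")


def commit_time_utc(rec: dict) -> datetime:
    """Convert commitTimestamp (milliseconds since the Unix epoch) to an aware UTC datetime."""
    missing = [name for name in REQUIRED_FIELDS if name not in rec]
    if missing:
        raise KeyError(f"record is missing fields: {missing}")
    # The timestamp is in milliseconds, so divide by 1000 to get seconds.
    return datetime.fromtimestamp(rec["commitTimestamp"] / 1000, tz=timezone.utc)


when = commit_time_utc(record)
print(when.isoformat())
```

Using an aware datetime avoids any dependence on the local time zone of the machine that reads the stream.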
PG_JSON Change Serialization Format
Note
As of engine release 1.1.0.0,
the Gremlin stream output format (GREMLIN_JSON) output by the Gremlin
stream endpoint (https://Neptune-DNS:8182/gremlin/stream)
is being deprecated. It is replaced by PG_JSON, which is currently identical to
GREMLIN_JSON.
A Gremlin or openCypher change record, contained in the data field
of a log stream response, has the following fields:
- id – String, required. The ID of the Gremlin or openCypher element.
- type – String, required. The type of this Gremlin or openCypher element. Must be one of the following:
    - vl – Vertex label for Gremlin; node label for openCypher.
    - vp – Vertex properties for Gremlin; node properties for openCypher.
    - e – Edge and edge label for Gremlin; relationship and relationship type for openCypher.
    - ep – Edge properties for Gremlin; relationship properties for openCypher.
- key – String, required. The property name. For element labels, this is "label".
- value – value object, required. This is a JSON object that contains a value field for the value itself, and a dataType field for the JSON data type of that value.
    "value": {
      "value": "the new value",
      "dataType": "the JSON datatype of the new value"
    }
- from – String, optional. If this is an edge (type="e"), the ID of the corresponding from vertex or source node.
- to – String, optional. If this is an edge (type="e"), the ID of the corresponding to vertex or target node.
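The field rules above can be sketched as a small validating parser. This is a minimal illustration, not part of any Neptune SDK: the PGChange class, the parse_pg_change function, and the sample IDs are hypothetical names chosen for this example, and the sketch assumes the nested value object uses the "value" and "dataType" keys shown in the examples in this section.

```python
from dataclasses import dataclass
from typing import Any, Optional

# The four type codes listed in the text above.
VALID_TYPES = {"vl", "vp", "e", "ep"}


@dataclass(frozen=True)
class PGChange:
    """Hypothetical container for one change record; not part of any Neptune SDK."""
    id: str
    type: str
    key: str
    value: Any
    data_type: str
    from_id: Optional[str] = None  # "from" is a Python keyword, hence the rename
    to_id: Optional[str] = None


def parse_pg_change(data: dict) -> PGChange:
    """Validate the required fields and flatten the nested value object."""
    for name in ("id", "type", "key", "value"):
        if name not in data:
            raise ValueError(f"missing required field: {name}")
    if data["type"] not in VALID_TYPES:
        raise ValueError(f"unknown type code: {data['type']!r}")
    value_obj = data["value"]
    return PGChange(
        id=data["id"],
        type=data["type"],
        key=data["key"],
        value=value_obj["value"],
        data_type=value_obj["dataType"],
        from_id=data.get("from"),  # optional: only expected on edges
        to_id=data.get("to"),      # optional: only expected on edges
    )


# Placeholder IDs and label, made up for illustration.
edge = parse_pg_change({
    "id": "e1",
    "type": "e",
    "key": "label",
    "value": {"value": "knows", "dataType": "String"},
    "from": "v1",
    "to": "v2",
})
print(edge)
```

Rejecting unknown type codes early keeps a consumer from silently mishandling a record if the format gains new element types later.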
Gremlin Examples
- The following is an example of a Gremlin vertex label.
    {
      "id": "an ID string",
      "type": "vl",
      "key": "label",
      "value": {
        "value": "the new value of the vertex label",
        "dataType": "String"
      }
    }
- The following is an example of a Gremlin vertex property.
    {
      "id": "an ID string",
      "type": "vp",
      "key": "the property name",
      "value": {
        "value": "the new value of the vertex property",
        "dataType": "the datatype of the vertex property"
      }
    }
- The following is an example of a Gremlin edge.
    {
      "id": "an ID string",
      "type": "e",
      "key": "label",
      "value": {
        "value": "the new value of the edge",
        "dataType": "String"
      },
      "from": "the ID of the corresponding from vertex",
      "to": "the ID of the corresponding to vertex"
    }
openCypher Examples
- The following is an example of an openCypher node label.
    {
      "id": "an ID string",
      "type": "vl",
      "key": "label",
      "value": {
        "value": "the new value of the node label",
        "dataType": "String"
      }
    }
- The following is an example of an openCypher node property.
    {
      "id": "an ID string",
      "type": "vp",
      "key": "the property name",
      "value": {
        "value": "the new value of the node property",
        "dataType": "the datatype of the node property"
      }
    }
- The following is an example of an openCypher relationship.
    {
      "id": "an ID string",
      "type": "e",
      "key": "label",
      "value": {
        "value": "the new value of the relationship",
        "dataType": "String"
      },
      "from": "the ID of the corresponding source node",
      "to": "the ID of the corresponding target node"
    }
SPARQL NQUADS Change Serialization Format
Neptune logs changes to SPARQL quads in the graph using the Resource Description
Framework (RDF) N-QUADS language defined in the W3C RDF 1.1 N-Quads specification.
The data field in the change record simply contains a stmt field
that holds an N-QUADS statement expressing the changed quad, as in the following
example.
"stmt" : "<https://test.com/s> <https://test.com/p> <https://test.com/o> .\n"