Loading data into Amazon Neptune using queries
Neptune supports writing data directly through query-language operations. You can use standard write operations such as CREATE and MERGE in openCypher, INSERT in SPARQL, or mergeV() and mergeE() in Gremlin to add or modify data in your graph. These operations are well suited to incremental updates and transactional writes.
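For example, an idempotent upsert in openCypher might look like the following sketch; the Person label and property names are illustrative, not part of any fixed schema:

```
// Create the Person node if it does not exist, then set or update its name
MERGE (p:Person {id: '123'})
SET p.name = 'Alice'
```

MERGE is generally preferable to CREATE for repeatable loads, because re-running the statement does not create duplicate nodes.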
For large datasets that require optimized loading performance, use the Amazon Neptune bulk loader to ingest data from Amazon S3. For smaller datasets stored in one or a few Amazon S3 files, you can use query-based loading functions to read and process the data directly within your queries.
The following query-based loading functions are available:
openCypher: neptune.read()
The neptune.read() function reads CSV or Parquet files from Amazon S3 within a CALL subquery, allowing you to process and load data at query time.
CALL neptune.read({ source: "s3://bucket/data.csv", format: "csv" }) YIELD row CREATE (n:Person {id: row.id, name: row.name})
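The same pattern applies to Parquet input. The following sketch also uses MERGE so the load can be re-run safely; the format value and column names are assumptions modeled on the CSV example above:

```
CALL neptune.read({ source: "s3://bucket/data.parquet", format: "parquet" })
YIELD row
MERGE (n:Person {id: row.id})
SET n.name = row.name
```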
For complete documentation, see neptune.read().
SPARQL: LOAD and UNLOAD
SPARQL LOAD operations import RDF data from a URI into a named graph. UNLOAD exports data from a graph to Amazon S3.
LOAD <s3://bucket/data.ttl> INTO GRAPH <http://example.org/graph>
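Under standard SPARQL Update semantics, omitting the INTO GRAPH clause places the loaded triples into the default graph:

```
LOAD <s3://bucket/data.ttl>
```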
For complete documentation, see Using SPARQL UPDATE LOAD to import data into Neptune.
Gremlin: io() step
You can also use Gremlin's g.io(URL).read() step to read data files in GraphML format.
g.io("s3://bucket/data.graphml").read().iterate()
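The io() step can also be told explicitly which reader to use via TinkerPop's with() modulator, rather than inferring the format from the file extension. This is a standard TinkerPop option; confirm which formats Neptune supports in its documentation:

```
g.io("s3://bucket/data.graphml").
  with(IO.reader, IO.graphml).
  read().iterate()
```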
For complete documentation, see the io() step reference in the TinkerPop documentation.