Creating an Amazon Q custom connector
To use a custom data source, create an application environment that is responsible for updating your Amazon Q index. The application environment depends on a crawler that you create. The crawler reads the documents in your repository and determines which documents should be sent to Amazon Q. Your application environment should perform the following steps:
- Crawl your repository and make a list of the documents that have been added, updated, or deleted.
- Call the StartDataSourceSyncJob API operation to signal that a sync job is starting. You provide a data source ID to identify the data source that is synchronizing. Amazon Q returns an execution ID to identify a particular sync job.
Note
After you end a sync job, you can start a new sync job. There can be a period of time before all of the submitted documents are added to the index. To see the status of the sync job, use the ListDataSourceSyncJobs operation. If the Status returned for the sync job is SYNCING_INDEXING, some documents are still being indexed. You can start a new sync job when the status of the previous job is FAILED or SUCCEEDED.
- To remove documents from the index, use the BatchDeleteDocument operation. You provide the data source ID and execution ID to identify the data source that is synchronizing and the sync job that this update is associated with.
- To signal the end of the sync job, use the StopDataSourceSyncJob operation. After you call the StopDataSourceSyncJob operation, the associated execution ID is no longer valid.
Note
After you call the StopDataSourceSyncJob operation, you can't use a sync job identifier in a call to the BatchPutDocument or BatchDeleteDocument operations. If you do, all of the submitted documents are returned in the FailedDocuments response message from the API.
- To list the sync jobs for the data source and to see metrics for the sync jobs, use the ListDataSourceSyncJobs operation with the index and data source identifiers.
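The steps above can be sketched as a single sync routine. This is a minimal illustration, not a verified implementation: it assumes a client object that exposes the operations named above (for example, a boto3 client for Amazon Q), and the snake_case method names, camelCase parameter names, and response field names shown here follow boto3 conventions but are assumptions you should check against the current API reference.

```python
import time

def sync_custom_data_source(client, application_id, index_id, data_source_id,
                            docs_to_put, doc_ids_to_delete, poll_seconds=5):
    """Run one full sync job against a custom data source.

    `client` must expose the sync operations described above (e.g. a boto3
    client for Amazon Q); all parameter and field names are illustrative.
    """
    # 1. Signal that a sync job is starting; Amazon Q returns an execution ID
    #    that identifies this particular sync job.
    execution_id = client.start_data_source_sync_job(
        applicationId=application_id, indexId=index_id,
        dataSourceId=data_source_id)["executionId"]
    try:
        # 2. Submit added/updated documents found by the crawler, associating
        #    them with this sync job via the execution ID.
        if docs_to_put:
            client.batch_put_document(
                applicationId=application_id, indexId=index_id,
                dataSourceSyncId=execution_id, documents=docs_to_put)
        # 3. Remove documents the crawler found to be deleted.
        if doc_ids_to_delete:
            client.batch_delete_document(
                applicationId=application_id, indexId=index_id,
                dataSourceSyncId=execution_id,
                documents=[{"documentId": d} for d in doc_ids_to_delete])
    finally:
        # 4. End the sync job; the execution ID is no longer valid after this,
        #    so no further BatchPutDocument/BatchDeleteDocument calls may use it.
        client.stop_data_source_sync_job(
            applicationId=application_id, indexId=index_id,
            dataSourceId=data_source_id)
    # 5. Indexing continues after the job ends. Poll ListDataSourceSyncJobs
    #    until the job leaves SYNCING_INDEXING; a new sync job may start only
    #    once the previous job reports FAILED or SUCCEEDED.
    while True:
        jobs = client.list_data_source_sync_jobs(
            applicationId=application_id, indexId=index_id,
            dataSourceId=data_source_id)["history"]
        status = next(j["status"] for j in jobs
                      if j["executionId"] == execution_id)
        if status in ("SUCCEEDED", "FAILED"):
            return status
        time.sleep(poll_seconds)
```

The try/finally ensures StopDataSourceSyncJob is called even if a document batch fails partway through, so the data source is never left with a dangling sync job.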