Google Drive - Amazon Bedrock

Google Drive

Google Drive is a cloud-based file storage and collaboration service. You can connect Google Drive as a data source for your managed knowledge base to crawl personal drives, shared drives, and files shared with the authenticated identity.

Supported features

  • Crawl personal drives, shared drives, and files shared with the authenticated identity

  • Automatic detection of common document fields (such as title, author, and created or modified dates)

  • Inclusion and exclusion content filters by shared drive IDs and MIME types

  • Granular inclusion of specific folders and files (with OAuth 2.0 authentication only)

  • Date-based filtering for content modified before or after a specific date

  • Incremental content syncs for added, updated, and deleted content

  • Service account and OAuth 2.0 authentication

  • Document-level access control (ACLs), available with service account authentication only
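
The inclusion, exclusion, and date filters listed above can be pictured as a predicate applied to each candidate file. The following is a minimal sketch only, not the connector's implementation: the `DriveFile` and `ContentFilter` types and all of their field names are hypothetical, and the semantics shown (exclusion wins over inclusion, an empty inclusion set means "include everything", date bounds are exclusive) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional, Set


@dataclass(frozen=True)
class DriveFile:
    """Hypothetical, minimal view of a Drive file for filtering purposes."""
    file_id: str
    mime_type: str
    modified: datetime
    shared_drive_id: Optional[str] = None  # None: file is not in a shared drive


@dataclass
class ContentFilter:
    """Hypothetical filter settings; not the connector's real schema."""
    include_mime_types: Set[str] = field(default_factory=set)
    exclude_mime_types: Set[str] = field(default_factory=set)
    include_shared_drive_ids: Set[str] = field(default_factory=set)
    exclude_shared_drive_ids: Set[str] = field(default_factory=set)
    modified_after: Optional[datetime] = None
    modified_before: Optional[datetime] = None

    def matches(self, f: DriveFile) -> bool:
        # Assumption: an exclusion always wins over an inclusion.
        if f.mime_type in self.exclude_mime_types:
            return False
        # Assumption: an empty inclusion set means "no restriction".
        if self.include_mime_types and f.mime_type not in self.include_mime_types:
            return False
        # Assumption: shared drive filters only apply to files in a shared drive.
        if f.shared_drive_id is not None:
            if f.shared_drive_id in self.exclude_shared_drive_ids:
                return False
            if (self.include_shared_drive_ids
                    and f.shared_drive_id not in self.include_shared_drive_ids):
                return False
        # Assumption: both date bounds are exclusive.
        if self.modified_after is not None and f.modified <= self.modified_after:
            return False
        if self.modified_before is not None and f.modified >= self.modified_before:
            return False
        return True


crawl_filter = ContentFilter(
    include_mime_types={"application/pdf"},
    exclude_shared_drive_ids={"drive-archive"},
    modified_after=datetime(2023, 1, 1, tzinfo=timezone.utc),
)
recent = datetime(2024, 6, 1, tzinfo=timezone.utc)
stale = datetime(2020, 1, 1, tzinfo=timezone.utc)
candidates = [
    DriveFile("a", "application/pdf", recent),
    DriveFile("b", "application/pdf", stale),
    DriveFile("c", "application/pdf", recent, shared_drive_id="drive-archive"),
    DriveFile("d", "application/vnd.google-apps.document", recent),
]
selected = [f.file_id for f in candidates if crawl_filter.matches(f)]
print(selected)  # ['a']
```

Only file `a` passes: `b` is older than the date bound, `c` sits in an excluded shared drive, and `d` has a MIME type outside the inclusion set.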

Authentication methods

A Google Drive data source supports two authentication methods. Choose one before you begin, because it determines the credentials you create, what content the connector can crawl, and whether you can use document-level access control. We recommend service account authentication for new data sources.

Google Drive authentication methods

Service account (SERVICE_ACCOUNT), recommended

  • How it authenticates: A Google Cloud service account with domain-wide delegation authenticates with a private key, then impersonates a Google Workspace admin user to access Drive content for any user in your domain.

  • When to use: Most data sources, and any data source that uses document-level access control. Requires a Google Workspace administrator to set up.

  • Setup: Set up service account authentication

OAuth 2.0 (OAUTH2)

  • How it authenticates: An OAuth 2.0 client ID and secret together with a delegated refresh token that you obtain from a single Google user's sign-in. The connector accesses only the content that user can access.

  • When to use: Use when a single Google user has access to all the Drive content you want to crawl, or when you cannot configure domain-wide delegation. Granular folder and file ID filters are available only with this method. Not supported with document-level access control.

  • Setup: Set up OAuth 2.0 authentication

Important

Service account authentication requires Google Workspace administrator access. For the OAuth 2.0 method, if your organization restricts third-party app access in Google Workspace, your administrator might also need to allow OAuth 2.0 access on your behalf.
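
The trade-offs between the two methods can be summarized as a small decision function. This is a sketch under stated assumptions: `choose_auth_method` and its parameter names are hypothetical helpers for reasoning about the choice, not part of any AWS or Google API. Only the `SERVICE_ACCOUNT` and `OAUTH2` identifiers come from this page.

```python
from enum import Enum


class AuthMethod(Enum):
    SERVICE_ACCOUNT = "SERVICE_ACCOUNT"
    OAUTH2 = "OAUTH2"


def choose_auth_method(needs_document_acls: bool,
                       needs_folder_or_file_filters: bool,
                       has_workspace_admin: bool) -> AuthMethod:
    """Hypothetical helper that encodes the constraints described on this page."""
    # Document-level access control needs a service account, while granular
    # folder and file filters need OAuth 2.0, so the two cannot be combined.
    if needs_document_acls and needs_folder_or_file_filters:
        raise ValueError(
            "Document-level access control and granular folder/file "
            "filters require different authentication methods."
        )
    if needs_document_acls:
        # Service account setup requires a Google Workspace administrator.
        if not has_workspace_admin:
            raise ValueError(
                "Document-level access control requires service account "
                "authentication, which needs Google Workspace admin access."
            )
        return AuthMethod.SERVICE_ACCOUNT
    if needs_folder_or_file_filters:
        return AuthMethod.OAUTH2
    # Otherwise prefer the recommended method whenever it can be set up.
    return AuthMethod.SERVICE_ACCOUNT if has_workspace_admin else AuthMethod.OAUTH2


print(choose_auth_method(False, False, True).value)  # SERVICE_ACCOUNT
```

The function raises an error for combinations that this page rules out, so that a conflict surfaces before any credentials are created.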

Prerequisites

In Google Workspace and Google Cloud, make sure you:

  • Have access to a Google Cloud project where you can enable APIs and create credentials.

  • Have a Google Workspace account with an email domain that matches the content you want to crawl. For service account authentication, you also need administrator access to Google Workspace.

In your AWS account, make sure you:

How to set up a Google Drive data source

Setting up a Google Drive data source involves the following steps:

  1. Set up authentication. Follow the page for your chosen method to configure Google Cloud and Google Workspace, generate credentials, and store them in AWS: Set up service account authentication for Google Drive or Set up OAuth 2.0 authentication for Google Drive.

  2. Connect the data source. Create the Google Drive data source in the knowledge base using the AWS Management Console or the API. See Connect a Google Drive data source.

  3. (Optional) Enable document-level access control. Filter query results by each user's Google Drive permissions. Requires service account authentication. See Document-level access controls.