Web Crawler connector overview
The following table gives an overview of the Amazon Q Business Web Crawler connector and its supported features.
| Category | Feature | Support |
|---|---|---|
| Security | Authentication type | Note: You don't need authentication to crawl public websites you have permission to crawl. |
| | Authentication credentials | Basic authentication, NTLM/Kerberos authentication, Form authentication, SAML authentication |
| | Access Control List (ACL) crawling | No |
| | Identity crawling | No |
| Crawl features | Custom metadata | Yes |
| | Visual content processing | Yes. Amazon Q Business can extract and index content from images embedded in webpages and the following supported document types: PDF, PowerPoint, Microsoft Word (DOCX), Google Slides, Google Docs |
| | Entities | Yes. The following entities are supported: See What is a document? for more details on what each connector crawls as a document. |
| | Field mappings | Yes. For more information, see Field mappings. |
| | Filters | Yes. The following filters are supported: |
| | Sync mode | Supports full and new, modified, or deleted content sync |
| | File types | Supports all files supported by Amazon Q. |
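
The sync mode row distinguishes a full sync from a sync limited to new, modified, or deleted content. The sketch below illustrates that distinction only; it is not how Amazon Q Business implements syncing, and the `fingerprint` and `diff_snapshots` helpers, the URL-to-content dictionaries, and the example URLs are all hypothetical names chosen for illustration. It assumes each crawl produces a snapshot mapping a URL to its page content, and it compares two snapshots by content hash.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Return a SHA-256 hex digest used to detect content changes (hypothetical helper)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def diff_snapshots(previous: dict, current: dict) -> dict:
    """Classify URLs as new, modified, or deleted between two crawl snapshots.

    Both arguments map URL -> page content. This is an illustrative
    assumption about what a crawl snapshot looks like, not a documented format.
    """
    prev_urls = set(previous)
    curr_urls = set(current)
    return {
        # Present now, absent in the previous crawl.
        "new": curr_urls - prev_urls,
        # Present in both crawls, but the content hash differs.
        "modified": {
            url
            for url in prev_urls & curr_urls
            if fingerprint(previous[url]) != fingerprint(current[url])
        },
        # Present in the previous crawl, absent now.
        "deleted": prev_urls - curr_urls,
    }


previous = {
    "https://example.com/a": "v1",
    "https://example.com/b": "v1",
    "https://example.com/c": "v1",
}
current = {
    "https://example.com/a": "v1",
    "https://example.com/b": "v2",
    "https://example.com/d": "v1",
}

changes = diff_snapshots(previous, current)
for kind in ("new", "modified", "deleted"):
    print(kind, sorted(changes[kind]))
# new ['https://example.com/d']
# modified ['https://example.com/b']
# deleted ['https://example.com/c']
```

In this picture, a full sync would reprocess every URL in `current`, while the second mode would touch only the URLs in the three change sets.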