

# What Is Amazon CloudSearch?


**Important**  
Amazon CloudSearch is no longer available to new customers. Existing customers of Amazon CloudSearch can continue to use the service as usual. [Learn more](https://aws.amazon.com/blogs/big-data/transition-from-amazon-cloudsearch-to-amazon-opensearch-service/).

Amazon CloudSearch is a fully managed service in the cloud that makes it easy to set up, manage, and scale a search solution for your website or application.

 With Amazon CloudSearch you can search large collections of data such as web pages, document files, forum posts, or product information. You can quickly add search capabilities without having to become a search expert or worry about hardware provisioning, setup, and maintenance. As your volume of data and traffic fluctuates, Amazon CloudSearch scales to meet your needs. 

**Note**  
This document describes the Amazon CloudSearch 2013-01-01 API. If you have 2011-02-01 search domains and need to reference the old documentation, you can download a PDF of the [2011-02-01 Developer Guide](https://s3.amazonaws.com/awsdocs/cloudsearch/2011-02-01/cloudsearch-dg-2011-02-01.pdf). 

You can use Amazon CloudSearch to index and search both structured data and plain text. Amazon CloudSearch features:
+ Full text search with language-specific text processing
+ Boolean search
+ Prefix searches
+ Range searches
+ Term boosting
+ Faceting
+ Highlighting
+ Autocomplete suggestions

You can get search results in JSON or XML, sort and filter results based on field values, and sort results alphabetically, numerically, or according to custom expressions. 

 To build a search solution with Amazon CloudSearch, you take the following steps:
+ **Create and configure a search domain.** A search domain includes your searchable data and the search instances that handle your search requests. If you have multiple collections of data that you want to make searchable, you can create multiple search domains.
+ **Upload the data you want to search to your domain.** Amazon CloudSearch indexes your data and deploys the search index to one or more search instances. 
+ **Search your domain.** You send a search request to your domain's search endpoint as an HTTP/HTTPS GET request. 

**Topics**
+ [Are You New to Amazon CloudSearch?](#new-to-cloudsearch)
+ [How Search Works](how-search-works.md)
+ [Automatic Scaling in Amazon CloudSearch](concepts-scaling.md)
+ [Accessing Amazon CloudSearch](#accessing-cloudsearch)
+ [Frequently asked questions](#faq)

## Are You New to Amazon CloudSearch?

For a high-level overview of Amazon CloudSearch, service highlights, and pricing information, see the [Amazon CloudSearch detail page](http://aws.amazon.com/cloudsearch/). If you are ready to start using Amazon CloudSearch, you should begin with [Getting Started with Amazon CloudSearch](getting-started.md). 

You can interact with Amazon CloudSearch through the AWS Management Console, AWS SDKs, or AWS CLI. While you can also submit API requests directly to Amazon CloudSearch, the SDKs and AWS CLI automatically sign your requests as needed and provide centralized tools for interacting with Amazon CloudSearch domains in conjunction with other AWS services. For information about the AWS SDKs, see [Tools for Amazon Web Services](http://aws.amazon.com/tools/). For information about installing and using the AWS CLI, see the [AWS Command Line Interface User Guide](https://docs.aws.amazon.com/cli/latest/userguide/). 

For more information about configuring and managing your search domains, getting your data into Amazon CloudSearch, submitting search requests, and processing the responses, see:
+ [Preparing Your Data](preparing-data.md)—how to format your data so you can upload it to an Amazon CloudSearch domain for indexing
+ [Configuring Index Fields](configuring-index-fields.md)—how to configure indexing options for an Amazon CloudSearch domain
+ [Searching Your Data with Amazon CloudSearch](searching.md)—how to use the Amazon CloudSearch query language
+ [Controlling Search Results](controlling-search-results.md)—how to sort, filter, and paginate search results

# How Search Works

The collection of data that you want to search (sometimes referred to as your *corpus*) can consist of unstructured full-text documents, semi-structured documents such as those formatted in mark-up languages like XML, or structured data that conforms to a strict data model. Each item that you want to be able to search (such as a forum post or web page) is represented as a document. Every document has a unique ID and one or more fields that contain the data that you want to search and include in results. 

To make your data searchable, you represent it as a batch of documents in either JSON or XML and upload the batch to your search domain. Amazon CloudSearch then generates a search index from your document data according to your domain's configuration options. You submit queries against this index to find the documents that meet specific search criteria. 

As your data changes, you submit updates to add, change, or delete documents from your index. Updates are applied continuously in the order they are received.

For information about how to format your data, see [Preparing Your Data](preparing-data.md).
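As a rough illustration of the add and delete updates described above, the following Python sketch assembles a JSON document batch. It assumes the batch layout covered in [Preparing Your Data](preparing-data.md): a list of operations, each with a `type`, an `id`, and, for adds, a `fields` object. The helper name `build_batch`, the document IDs, and the field names are hypothetical.

```python
import json


def build_batch(adds, delete_ids):
    """Return a JSON document batch containing add and delete operations.

    adds: dict mapping a document ID to its fields (hypothetical data).
    delete_ids: list of document IDs to remove from the index.
    """
    operations = []
    for doc_id, fields in adds.items():
        operations.append({"type": "add", "id": doc_id, "fields": fields})
    for doc_id in delete_ids:
        operations.append({"type": "delete", "id": doc_id})
    return json.dumps(operations, indent=2)


# Hypothetical forum-post data: add one document and delete another.
batch = build_batch(
    adds={"post-101": {"title": "Scaling search", "body": "How domains grow", "year": 2013}},
    delete_ids=["post-007"],
)
print(batch)
```

You would then upload the resulting batch to your domain's document service endpoint.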

## Indexing in Amazon CloudSearch

To build a search index from your data, Amazon CloudSearch needs the following information:
+ Which document fields do you want to search?
+ Which document field values do you want to retrieve with the search results?
+ Which document fields represent categories that you want to use to refine and filter search results?
+ How should the text within a particular field be processed?

You define this metadata in your domain configuration by configuring indexing options. You use indexing options to specify the fields included in the search index and control how you can use those fields. 

You must configure a corresponding index field for each document field that occurs in your data—there's a one-to-one mapping between document fields and the fields in your Amazon CloudSearch index. In addition to the index field name, you specify the following:
+ The index field type
+ Whether the field is searchable (`text` and `text-array` fields are always searchable)
+ Whether the field can be used as a category (facet)
+ Whether the field value can be returned with the search results
+ Whether the field can be used to sort the results
+ Whether highlights can be returned for the field
+ A default value to use if no value is specified in the document data
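The one-to-one mapping between document fields and index fields can be checked locally before you upload data. The following sketch is only an illustration: the `IndexField` class and the `unmapped_fields` helper are hypothetical names that model the options listed above. They are not part of the Amazon CloudSearch API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexField:
    """Hypothetical local model of one index field and its options."""
    name: str
    field_type: str          # for example "text", "literal", "int", or "date"
    facet: bool = False      # can be used as a category
    returnable: bool = True  # value can be returned with results
    sortable: bool = False   # can be used to sort results
    highlight: bool = False  # highlights can be returned
    default: object = None   # value used when the document omits the field


def unmapped_fields(documents, index_fields):
    """Return the document field names that have no configured index field."""
    configured = {field.name for field in index_fields}
    seen = set()
    for doc in documents:
        seen.update(doc["fields"])
    return sorted(seen - configured)


index_fields = [
    IndexField("title", "text", highlight=True),
    IndexField("genre", "literal", facet=True),
    IndexField("year", "int", facet=True, sortable=True),
]
documents = [
    {"id": "m1", "fields": {"title": "Alien", "genre": "sci-fi", "year": 1979}},
    {"id": "m2", "fields": {"title": "Heat", "genre": "crime", "rating": 8.3}},
]
missing = unmapped_fields(documents, index_fields)
print(missing)  # ['rating']
```

In this example, the `rating` field would need its own index field before the second document could be indexed.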

For information about how to configure index fields for Amazon CloudSearch, see [Configuring Index Fields](configuring-index-fields.md).

## Facets in Amazon CloudSearch

A facet is an index field that represents a category that you want to use to refine and filter search results. When you submit search requests to Amazon CloudSearch, you can request facet information to find out how many hits share the same value in a facet. You can display this information along with the search results and use it to enable users to interactively refine their searches. (This is often referred to as faceted navigation or faceted search.)

A facet can be any date, literal, or numeric field that has faceting enabled in your domain configuration. For each facet, Amazon CloudSearch calculates the number of hits that share the same value. You can define buckets to calculate facet counts for particular subsets of the facet values. Only buckets that have matches are included in the facet results.
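To make the counting concrete, the following sketch computes facet counts and bucket counts over a small list of hits in plain Python. It only simulates the behavior described above and is not how the service is implemented. The field names, bucket labels, and helper functions are hypothetical.

```python
from collections import Counter


def facet_counts(hits, field):
    """Count how many hits share each value of the given facet field."""
    return Counter(hit[field] for hit in hits if field in hit)


def bucket_counts(hits, field, buckets):
    """Count hits per bucket; buckets are (label, low, high), high exclusive.

    Buckets without any matching hit are left out of the result.
    """
    counts = {}
    for label, low, high in buckets:
        matches = sum(1 for hit in hits if field in hit and low <= hit[field] < high)
        if matches:
            counts[label] = matches
    return counts


hits = [
    {"genre": "sci-fi", "year": 1979},
    {"genre": "sci-fi", "year": 1986},
    {"genre": "crime", "year": 1995},
]
genre_facet = facet_counts(hits, "genre")
year_buckets = bucket_counts(
    hits,
    "year",
    [("1970s", 1970, 1980), ("1980s", 1980, 1990), ("2000s", 2000, 2010)],
)
print(dict(genre_facet))
print(year_buckets)
```

The `2000s` bucket has no matches, so it does not appear in `year_buckets`.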

For information about configuring facets, see [Configuring Index Fields](configuring-index-fields.md). For information about using facet information to support faceted navigation, see [Getting and Using Facet Information in Amazon CloudSearch](faceting.md).

## Text Processing in Amazon CloudSearch

During indexing, Amazon CloudSearch processes the contents of `text` and `text-array` fields according to the language-specific analysis scheme configured for the field. An analysis scheme controls how the text is normalized, tokenized, and stemmed, and specifies any stopwords or synonyms to take into account during indexing. Amazon CloudSearch provides default analysis schemes for each supported language. For information about configuring custom analysis schemes, see [Configuring Analysis Schemes](configuring-analysis-schemes.md). For information about how Amazon CloudSearch normalizes and tokenizes text and applies configured text options when indexing text fields and processing search requests, see [Text Processing in Amazon CloudSearch](text-processing.md).

## Sorting Results in Amazon CloudSearch

You can customize how search results are ranked by defining expressions that calculate custom values for every document that matches your search criteria. For example, you might define an expression that takes into account the value in a document's `popularity` field as well as the default relevance score calculated by Amazon CloudSearch. Expressions are simply numeric expressions that use standard numeric operators and functions. Expressions can reference `int` and `double` fields, other expressions, a document's relevance score (`_score`), and the epoch time (`_time`). When you submit search requests, you specify the expressions you want to use to sort the search results. You can also reference expressions within your search criteria. 

A document's relevance `_score` indicates how relevant a particular search hit is to the search request. To calculate the relevance score, Amazon CloudSearch takes into account how many times the search terms appear in a document relative to the other documents in the index.
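The following sketch simulates, in plain Python, what sorting by an expression amounts to. The weights, the expression itself, and the sample hits are hypothetical. A real expression is defined in your domain configuration rather than in client code.

```python
def popularity_boost(doc):
    """Hypothetical expression: (0.3 * popularity) + (0.7 * _score)."""
    return 0.3 * doc["popularity"] + 0.7 * doc["_score"]


# Hypothetical hits with a popularity field and a relevance score.
hits = [
    {"id": "a", "popularity": 10, "_score": 2.0},
    {"id": "b", "popularity": 2, "_score": 5.0},
    {"id": "c", "popularity": 7, "_score": 4.0},
]

# Sort in descending order of the expression value.
ranked = sorted(hits, key=popularity_boost, reverse=True)
print([hit["id"] for hit in ranked])
```

Document `b` has the highest relevance score, but it ranks last here because the expression also weighs popularity.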

For information about how to configure expressions for your domain, see [Configuring Expressions](configuring-expressions.md).

## Search Requests in Amazon CloudSearch

You submit search requests to your domain's search endpoint as HTTP/HTTPS GET requests. You can specify a variety of options to constrain your search, request facet information, control ranking, and specify what you want to be returned in the results. You can get search results in either JSON or XML. By default, Amazon CloudSearch returns results in JSON.

When you submit a search request, Amazon CloudSearch performs text processing on the search string. The search string is processed to:
+ Convert all characters to lowercase
+ Split the string into separate terms on whitespace and punctuation boundaries 
+ Remove terms that are on the stopword list for the field being searched
+ Map stems and synonyms according to the stemming and synonym options configured for the field being searched
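The steps above can be sketched as a small pipeline. This is a toy approximation under stated assumptions: the stopword list, stem dictionary, and synonym map are hypothetical stand-ins for a field's analysis scheme, and the regular expression only roughly mimics splitting on whitespace and punctuation.

```python
import re

# Hypothetical stand-ins for the options in a field's analysis scheme.
STOPWORDS = {"the", "a", "an", "of"}
STEMS = {"running": "run", "runs": "run", "movies": "movie"}
SYNONYMS = {"film": "movie"}


def preprocess(search_string):
    """Lowercase, tokenize, drop stopwords, then map stems and synonyms."""
    lowered = search_string.lower()
    # Split on runs of non-word characters (whitespace and punctuation).
    terms = [term for term in re.split(r"[\W_]+", lowered) if term]
    terms = [term for term in terms if term not in STOPWORDS]
    terms = [STEMS.get(term, term) for term in terms]
    return [SYNONYMS.get(term, term) for term in terms]


print(preprocess("Runs, the FILM!"))  # ['run', 'movie']
```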

After this preprocessing is complete, Amazon CloudSearch looks up the search terms in the index and identifies all of the documents that match the request. To generate a response, Amazon CloudSearch processes this list of search hits to filter and sort the matching documents and compute facets. Amazon CloudSearch then returns the response in JSON or XML. 

By default, Amazon CloudSearch returns search results ranked according to the hits' relevance scores (`_score`). Alternatively, your request can specify the index field or expression that you want to use to sort the hits. For example, you might want to sort hits by an index field that contains the price or an expression that calculates popularity. 
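As a minimal sketch of such a request, the following Python code builds a search URL that sorts by a field. It assumes the `/2013-01-01/search` resource path and the `q`, `sort`, `size`, and `format` request parameters of the 2013-01-01 search API. The endpoint is the placeholder form used in this guide, and the `price` field and the `build_search_url` helper are hypothetical.

```python
from urllib.parse import urlencode


def build_search_url(endpoint, query, sort=None, size=10, response_format="json"):
    """Build a GET URL for a search request against a domain's search endpoint."""
    params = {"q": query, "size": size, "format": response_format}
    if sort:
        params["sort"] = sort  # for example "price asc" or "_score desc"
    return f"{endpoint}/2013-01-01/search?{urlencode(params)}"


url = build_search_url(
    "http://search-domainname-domainid.us-east-1.cloudsearch.amazonaws.com",
    "star wars",
    sort="price asc",
)
print(url)
```

Unless your domain allows anonymous search requests, the request must also be signed. The AWS SDKs and the AWS CLI do this for you.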

For more information about searching, ranking, and paginating results, see [Searching Your Data with Amazon CloudSearch](searching.md). 

# Automatic Scaling in Amazon CloudSearch

A search domain has one or more search instances, each with a finite amount of RAM and CPU resources for indexing data and processing requests. How many search instances a domain needs depends on the documents in your collection and the volume and complexity of your search requests.

Amazon CloudSearch can determine the size and number of search instances required to deliver low latency, high throughput search performance. When you upload your data and configure your index, Amazon CloudSearch builds an index and picks the appropriate initial search instance type. As you use your search domain, Amazon CloudSearch can scale to accommodate the amount of data uploaded to the domain and the volume and complexity of search requests.

When you create a search domain, a single instance is deployed for the domain. As the following illustration shows, you always have at least one instance for your domain. Amazon CloudSearch automatically scales the domain by adding instances as the volume of data or traffic increases. 

![\[Scaling for Data and Traffic\]](http://docs.aws.amazon.com/cloudsearch/latest/developerguide/images/cloudsearch-scaling-diagram.png)


## Scaling for Data


When the amount of data you add to your domain exceeds the capacity of the initial search instance type, Amazon CloudSearch scales your search domain to a larger search instance type. After a domain exceeds the capacity of the largest search instance type, Amazon CloudSearch partitions the search index across multiple search instances. (The number of search instances required to hold the index partitions is sometimes referred to as the domain's *width*.) 

When the volume of data in your domain shrinks, Amazon CloudSearch scales down your domain to fewer search instances or a smaller search instance type to minimize costs.

**Note**  
If your domain has scaled up to accommodate your index size and you delete a large number of documents, the domain scales down the next time the full index is rebuilt. Although the index is automatically rebuilt periodically, to scale down as quickly as possible you can explicitly [run indexing](indexing.md) when you are done deleting documents. 

## Scaling for Traffic


As your search request volume or complexity increases, it takes more processing power to handle the load. A high volume of document uploads also increases the load on a domain's search instances. When a search instance nears its maximum load, Amazon CloudSearch deploys a duplicate search instance to provide additional processing power. (The number of duplicate search instances is sometimes referred to as the domain's *depth*.) 

When traffic drops, Amazon CloudSearch removes search instances to minimize costs. For example, a new domain might scale up to handle the initial influx of documents, and scale back down after you have finished uploading your data and are only submitting updates.

If your domain experiences a sudden surge in traffic, Amazon CloudSearch deploys additional search instances. It takes a few minutes to set up the new instances, however, so you might see an increase in 5xx errors until the new instances can start processing requests. For more information about handling 5xx errors, see [Handling Errors](error-handling.md). 
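One common way for a client to tolerate transient 5xx responses while new instances start is to retry with exponential backoff. The following sketch shows that general technique. It is not a documented Amazon CloudSearch client: the `send_with_retry` helper, its parameters, and the simulated responses are hypothetical.

```python
import random
import time


def send_with_retry(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-5xx status or attempts run out.

    send: callable returning a (status_code, body) tuple.
    sleep: injectable so the example can run without actually waiting.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status < 500:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter: about 0.5s, 1s, 2s, and so on.
            sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.0))
    return status, body


# Simulated endpoint: two 503 responses, then a success.
responses = iter([(503, ""), (503, ""), (200, '{"hits": {"found": 0}}')])
delays = []
result = send_with_retry(lambda: next(responses), sleep=delays.append)
print(result[0], len(delays))
```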

Keep in mind that the type and complexity of your search requests affect overall search performance and in some cases increase the number of search instances required to operate your domain. Submitting a high volume of small or single-document batches can affect your search domain's performance. For more information, see [Tuning Search Request Performance in Amazon CloudSearch](tuning-search.md).
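To avoid sending many small or single-document batches, you can group pending operations into larger batches before uploading. The following sketch shows one simple way to do that. The `chunk_documents` helper and the batch size of 1,000 are hypothetical. Choose a size that respects the batch limits documented for the service.

```python
def chunk_documents(operations, batch_size):
    """Split a list of document operations into batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        operations[start:start + batch_size]
        for start in range(0, len(operations), batch_size)
    ]


# Hypothetical pending updates: 2,500 add operations.
operations = [
    {"type": "add", "id": f"doc-{n}", "fields": {"title": f"Title {n}"}}
    for n in range(2500)
]
batches = chunk_documents(operations, batch_size=1000)
print([len(batch) for batch in batches])  # [1000, 1000, 500]
```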

## Accessing Amazon CloudSearch

You can access Amazon CloudSearch through the Amazon CloudSearch console, the AWS SDKs, or the AWS CLI. 
+ The [Amazon CloudSearch console](https://console.aws.amazon.com/cloudsearch/home?region=us-west-2) enables you to easily create, configure, and monitor your search domains, upload documents, and run test searches. Using the console is the easiest way to get started with Amazon CloudSearch and provides a central command center for the ongoing management of your search domains. 
+ The [AWS SDKs](http://aws.amazon.com/code) support all of the Amazon CloudSearch API operations, making it easy to manage and interact with your search domains using your preferred technology. The SDKs automatically sign requests as needed using your AWS credentials.
+ The [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/) wraps all of the Amazon CloudSearch API operations to provide a simple way to create and configure search domains, upload the data you want to search, and submit search requests. The AWS CLI automatically signs requests as needed using your AWS credentials. 

### Regions and Endpoints for Amazon CloudSearch

 Amazon CloudSearch provides regional endpoints for accessing the configuration service and domain-specific endpoints for accessing the search and document services. 

You use the configuration service to create and manage your search domains. The region-specific configuration service endpoints are of the form: `cloudsearch.region.amazonaws.com`. For example, `cloudsearch.us-east-1.amazonaws.com`. For a current list of supported regions, see [Regions and Endpoints](https://docs.aws.amazon.com/general/latest/gr/rande.html#cloudsearch_region) in the AWS General Reference.

 To access the Amazon CloudSearch search and document services, you use separate domain-specific endpoints:
+ `http://doc-domainname-domainid.us-east-1.cloudsearch.amazonaws.com`—a domain's document service endpoint is used to upload documents
+ `http://search-domainname-domainid.us-east-1.cloudsearch.amazonaws.com`—a domain's search endpoint is used to submit search requests
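The endpoint patterns above can be expressed as a pair of small helper functions. This sketch only illustrates the naming pattern. In practice, read the actual endpoints from your domain's details rather than constructing them. The function names and the sample domain name and ID are hypothetical.

```python
def configuration_endpoint(region):
    """Return the regional configuration service endpoint."""
    return f"cloudsearch.{region}.amazonaws.com"


def domain_endpoints(domain_name, domain_id, region):
    """Return the document and search endpoints for one domain."""
    suffix = f"{domain_name}-{domain_id}.{region}.cloudsearch.amazonaws.com"
    return {
        "doc": f"http://doc-{suffix}",
        "search": f"http://search-{suffix}",
    }


print(configuration_endpoint("us-east-1"))
endpoints = domain_endpoints("movies", "abc123", "us-east-1")  # hypothetical values
print(endpoints["doc"])
print(endpoints["search"])
```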

### Signing Amazon CloudSearch Requests

If you're using a language for which AWS provides an SDK, we recommend that you use the SDK to submit Amazon CloudSearch requests. All of the AWS SDKs greatly simplify the process of signing requests and save you a significant amount of time when compared to using the Amazon CloudSearch APIs directly. The SDKs integrate easily with your development environment and provide easy access to related commands. You can also use the Amazon CloudSearch console and AWS CLI to submit signed requests with no additional effort.

If you choose to call the Amazon CloudSearch APIs directly, you must sign your own requests. Configuration service requests must always be signed. Upload, search, and suggest requests must be signed, unless you configure anonymous access for those services. To sign a request, you calculate a digital signature using a cryptographic hash function, which returns a hash value based on the input. The input includes the text of your request and your secret access key. The hash function returns a hash value that you include in the request as your signature. The signature is part of the Authorization header of your request. After receiving your request, Amazon CloudSearch recalculates the signature using the same hash function and input that you used to sign the request. If the resulting signature matches the signature in the request, Amazon CloudSearch processes the request. Otherwise, the request is rejected.

Amazon CloudSearch supports authentication using AWS Signature Version 4. For more information, see [Signature Version 4 Signing Process](https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html).
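The following sketch illustrates the hash-based signing idea using the Signature Version 4 key derivation chain with HMAC-SHA256. It is deliberately incomplete: the string to sign is a placeholder, whereas a real request requires building a canonical request as described in the Signature Version 4 documentation. The secret key is the sample key used in AWS documentation, and the `cloudsearch` service name in the credential scope is an assumption. For production use, rely on an AWS SDK or the AWS CLI.

```python
import hashlib
import hmac


def _hmac(key, message):
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def signing_key(secret_key, date_stamp, region, service):
    """Derive a Signature Version 4 signing key from the secret access key."""
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


def sign(secret_key, date_stamp, region, service, string_to_sign):
    """Return the hex signature for an already-built string to sign."""
    key = signing_key(secret_key, date_stamp, region, service)
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


SECRET = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"  # sample key, not a real credential
string_to_sign = "placeholder string to sign"        # a real one comes from the canonical request

# The client signs the request; the service recomputes and compares.
client_signature = sign(SECRET, "20130101", "us-east-1", "cloudsearch", string_to_sign)
server_signature = sign(SECRET, "20130101", "us-east-1", "cloudsearch", string_to_sign)
matches = hmac.compare_digest(client_signature, server_signature)
print(matches)  # True
```

Because both sides derive the signature from the same inputs, any change to the request text or the secret key produces a different signature and the request is rejected.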

## Frequently asked questions

What is the cutoff point for “current customers”?

We created an allowlist of the account IDs that were already using Amazon CloudSearch. We will also add to the allowlist any new account that belongs to a customer who previously used Amazon CloudSearch. If you are having difficulties, please submit a support ticket.

What do we mean by “access” to the service?

Current customers can do anything they could previously. The only change is that non-current customers cannot access Amazon CloudSearch.

Can existing Amazon CloudSearch customers create new repositories if they were already using Amazon CloudSearch?

Yes. If you are having difficulties, please submit a support ticket.