

 This whitepaper is for historical reference only. Some content might be outdated and some links might not be available.

# The AWS advantage in big data analytics
<a name="the-aws-advantage-in-big-data-analytics"></a>

Analyzing large datasets requires significant compute capacity that can vary in size, based on the amount of input data and the type of analysis. This characteristic of big data workloads is ideally suited to the pay-as-you-go cloud computing model, where applications can easily scale up and down based on demand. As requirements change, you can easily resize your environment (horizontally or vertically) on AWS to meet your needs, without having to wait for additional hardware or over-investing to provision enough capacity.

For mission-critical applications on more traditional infrastructure, system designers have no choice but to over-provision, because the system must be able to handle a surge in data when business needs increase. By contrast, on AWS, you can provision more capacity and compute in a matter of minutes, so your big data applications grow and shrink as demand dictates, and your system runs as close to optimal efficiency as possible.
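The difference between sizing for peak and sizing for demand can be made concrete with a little arithmetic. The following is a minimal sketch, not a pricing model: the hourly load figures, the per-instance throughput, and the function names are all hypothetical values chosen for illustration.

```python
import math


def instance_hours_fixed(hourly_demand, capacity_per_instance):
    """Instance-hours when the fleet is sized once for the peak hour."""
    peak_instances = math.ceil(max(hourly_demand) / capacity_per_instance)
    return peak_instances * len(hourly_demand)


def instance_hours_elastic(hourly_demand, capacity_per_instance):
    """Instance-hours when the fleet is resized every hour to match demand."""
    return sum(math.ceil(d / capacity_per_instance) for d in hourly_demand)


# Hypothetical load: GB to process in each hour of a day, with a 4-hour surge.
hourly_demand = [100] * 20 + [900] * 4
capacity_per_instance = 100  # hypothetical GB processed per instance per hour

fixed = instance_hours_fixed(hourly_demand, capacity_per_instance)
elastic = instance_hours_elastic(hourly_demand, capacity_per_instance)
print(fixed, elastic)  # prints "216 56"
```

Under these assumed numbers, the peak-sized fleet consumes 216 instance-hours while the demand-following fleet consumes 56. The gap depends entirely on how spiky the workload is; a flat workload shows no difference.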

In addition, you get flexible computing on a global infrastructure with access to the many different [geographic Regions](https://aws.amazon.com/about-aws/globalinfrastructure/) that AWS offers, along with other scalable services that you can combine to build sophisticated big data applications. These other services include:
+ [Amazon Simple Storage Service](https://aws.amazon.com/s3/) (Amazon S3) to store data
+ [AWS Glue](https://aws.amazon.com/glue/) to orchestrate jobs to move and transform the data easily
+ [AWS IoT](https://aws.amazon.com/iot/), which lets connected devices interact with cloud applications and other connected devices

As the amount of data being generated continues to grow, AWS has many options to get that data to the cloud, including secure devices like [AWS Snow Family](https://aws.amazon.com/snow/) to accelerate petabyte-scale data transfers, delivery streams with [Amazon Data Firehose](https://aws.amazon.com/kinesis/data-firehose/) to load streaming data continuously, database migration with [AWS Database Migration Service](https://aws.amazon.com/dms/), and scalable private connections through [AWS Direct Connect](https://aws.amazon.com/directconnect/).
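To see why petabyte-scale transfers are often handled differently from continuous streaming, it helps to estimate how long a bulk transfer would take over a network link. The sketch below is back-of-the-envelope arithmetic only; the link speeds and the 80% utilization figure are assumptions for illustration, not measurements of any AWS service.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes, link_bits_per_second, utilization=0.8):
    """Days needed to move data_bytes over a link at the given utilization.

    utilization is a hypothetical fraction of the link's nominal bandwidth
    that the transfer actually achieves.
    """
    seconds = data_bytes * 8 / (link_bits_per_second * utilization)
    return seconds / SECONDS_PER_DAY


one_petabyte = 10**15  # decimal petabyte, in bytes

for label, bps in [("1 Gbps", 10**9), ("10 Gbps", 10**10)]:
    print(f"{label}: {transfer_days(one_petabyte, bps):.1f} days")
```

With these assumptions, one petabyte takes roughly 116 days at 1 Gbps and roughly 12 days at 10 Gbps. Estimates like this are one input when weighing a network transfer against a physical transfer device.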

As mobile usage continues to grow rapidly, you can use the suite of services within [AWS Mobile Hub](https://aws.amazon.com/mobile/) to collect and measure app usage and data, or export that data to another service for further custom analysis.

These capabilities of AWS make it an ideal fit for solving big data problems, and many customers have implemented successful big data analytics workloads on AWS. For more information about case studies, see [Big Data Customer Success Stories](https://aws.amazon.com/solutions/case-studies/big-data/). 

The following services for collecting, processing, storing, and analyzing big data are described in order: 
+ [Amazon Kinesis](https://aws.amazon.com/kinesis/)
+ [Amazon Managed Streaming for Apache Kafka](https://aws.amazon.com/msk/) (Amazon MSK)
+ [AWS Lambda](https://aws.amazon.com/lambda/)
+ [Amazon Elastic MapReduce](https://aws.amazon.com/emr/) (Amazon EMR)
+ [AWS Glue](https://aws.amazon.com/glue/)
+ [AWS Lake Formation](https://aws.amazon.com/lake-formation/)
+ [Amazon Machine Learning](https://aws.amazon.com/machine-learning/)
+ [Amazon DynamoDB](https://aws.amazon.com/dynamodb/)
+ [Amazon Redshift](https://aws.amazon.com/redshift/)
+ [Amazon OpenSearch Service](https://aws.amazon.com/opensearch-service/) (OpenSearch Service)
+ [Amazon QuickSight](https://aws.amazon.com/quicksight/)
+ [Amazon Compute Services](https://aws.amazon.com/products/compute/): [Amazon Elastic Compute Cloud](https://aws.amazon.com/ec2/) (Amazon EC2) instances, [Amazon Elastic Container Service](https://aws.amazon.com/ecs/) (Amazon ECS), and [Amazon Elastic Kubernetes Service](https://aws.amazon.com/eks/) (Amazon EKS) are available for self-managed big data applications
+ [Amazon Athena](https://aws.amazon.com/athena/)

 