Easily run interactive queries directly against data in Amazon S3. Pay only for the queries you run.
Easily deploy popular open source, big data frameworks like Apache Hadoop, Spark, Presto, HBase, and Flink.
A fast, fully managed, petabyte-scale data warehouse that makes it easy to run even complex queries on massive collections of structured data.
Easily query data in Amazon S3 using standard SQL
Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to set up or manage, and you can start analyzing your data immediately.
You can use Athena to process logs, perform ad hoc analysis, and run interactive queries, and you pay only for the queries you run.
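As a rough illustration of what querying S3 data with standard SQL can look like from code, here is a minimal sketch that assembles a request for the `start_query_execution` call on the boto3 Athena client. The database name `weblogs`, the table `access_logs`, and the bucket `example-bucket` are hypothetical placeholders, and the helpers `build_athena_request` and `submit_query` are illustrative names, not part of any AWS SDK.

```python
from typing import Any, Dict


def build_athena_request(sql: str, database: str, output_location: str) -> Dict[str, Any]:
    """Assemble keyword arguments for the Athena start_query_execution call."""
    # Athena writes query results to an S3 location, so require an s3:// URI.
    if not output_location.startswith("s3://"):
        raise ValueError("output_location must be an s3:// URI")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def submit_query(athena_client: Any, request: Dict[str, Any]) -> str:
    """Submit the query and return the ID used to look up its status and results."""
    response = athena_client.start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical example: count requests per HTTP status in a table of access logs.
EXAMPLE_SQL = """
SELECT status, COUNT(*) AS requests
FROM access_logs
GROUP BY status
ORDER BY requests DESC
""".strip()

request = build_athena_request(
    EXAMPLE_SQL,
    database="weblogs",  # assumed database name
    output_location="s3://example-bucket/athena-results/",  # assumed bucket
)
```

With AWS credentials configured and boto3 installed, the request could then be submitted with `submit_query(boto3.client("athena"), request)`. Keeping the request building separate from the submission makes the first part easy to check without an AWS account.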
"Athena has proven to be fast, easy to use, and cost effective."
Build massively scalable applications for data transformation and for real-time and predictive analytics.
Amazon EMR is a managed service that lets you process and analyze extremely large data sets using the latest versions of big data processing frameworks such as Apache Hadoop, Spark, HBase, and Presto on fully customizable clusters.
Amazon EMR goes far beyond SQL. You can run custom applications and code for workloads such as machine learning, graph analytics, data transformation, streaming data, and more. You can define specific compute, memory, storage, and application parameters to suit your analytic requirements.
Redfin provides real estate listings and recommendations to millions of homebuyers. Every day, Redfin uses Amazon EMR with Spot Instances, dynamically spinning Apache Hadoop clusters up and down, to perform large data transformations and deliver data to internal and external customers.
Analyze all your data using your existing business intelligence tools. Run complex business reports with data from multiple sources.
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. The query engine in Redshift has been optimized to run SQL queries quickly, including complex queries that join large numbers of database tables.
You can use Amazon Redshift when you need to pull together data from many different sources (such as inventory systems, financial systems, retail sales systems, and even log data) into a common format and store it for long periods, so that you can build sophisticated reports with very high query performance.
"Nasdaq achieved faster, richer analytics and data warehousing capabilities while reducing costs by 57% by shifting to Amazon Redshift."
| If you need | Consider using |
| Run ad hoc queries on data stored in S3 | Athena |
| Interactively analyze data in S3 before loading it into Redshift | Athena |
| Run custom code on Spark, Hive, Pig, Presto clusters | EMR |
| Build and train a predictive model using Spark | EMR |
| Build a custom application for real-time recommendations | EMR |
| Enterprise reports that join data from multiple structured data sources | Redshift |
| Run complex queries that join large numbers of database tables on an ongoing basis | Redshift |
| Support business intelligence workloads | Redshift |
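
The guidance in the table can also be kept as a small lookup, for example in an internal script that points teams at a starting service. The sketch below is only an illustration: the `RECOMMENDATIONS` mapping restates the table rows, and the helper `services_for` is a hypothetical name that does simple keyword matching, not an AWS tool or API.

```python
from typing import Dict, List

# The table rows above, grouped by the service they point to.
RECOMMENDATIONS: Dict[str, List[str]] = {
    "Athena": [
        "Run ad hoc queries on data stored in S3",
        "Interactively analyze data in S3 before loading it into Redshift",
    ],
    "EMR": [
        "Run custom code on Spark, Hive, Pig, Presto clusters",
        "Build and train a predictive model using Spark",
        "Custom application for real-time recommendations",
    ],
    "Redshift": [
        "Enterprise reports that join data from multiple structured data sources",
        "Run complex queries that join large numbers of database tables on an ongoing basis",
        "Support business intelligence workloads",
    ],
}


def services_for(keyword: str) -> List[str]:
    """Return the services whose listed needs mention the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        service
        for service, needs in RECOMMENDATIONS.items()
        if any(needle in need.lower() for need in needs)
    ]


if __name__ == "__main__":
    for term in ("Spark", "S3", "join"):
        print(term, "->", services_for(term))
```

Plain substring matching is deliberately crude: a term such as "queries" appears under more than one service, so the result is a list of candidates to compare rather than a single answer.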