Introduction
Analysts predict that the Hadoop market will reach $50.2 billion USD by 2020. Applications driving these large expenditures are some of the most important workloads for businesses today.
As users adopted Hadoop, many discovered performance and complexity limited Hadoop’s use for broad predictive analytics use. In response, the Hadoop community has focused on the Apache Spark platform to provide Hadoop with significant performance improvements. With Spark atop Hadoop, users can leverage Hadoop’s big-data management capabilities while achieving new performance levels by running analytics in Apache Spark.
In this white paper, we’ll describe how Microsoft R Server helps data scientists, actuaries, risk analysts, quantitative analysts, product planners, and other R users to capture the benefits of Apache Spark on Hadoop by providing a straightforward platform that eliminates much of the complexity of using Spark and Hadoop to conduct analyses on large data assets.