Data Lake Store

A hyper-scale repository for big data analytics workloads

HDFS for the cloud

Microsoft Azure Data Lake Store is an Apache Hadoop® file system that’s compatible with Hadoop Distributed File System (HDFS) and works with the Apache Hadoop ecosystem. Data Lake Store is integrated with Azure Data Lake Analytics and Azure HDInsight and will be integrated with Microsoft offerings like Revolution-R Enterprise; industry-standard distributions like Hortonworks, Cloudera, and MapR; and individual Hadoop projects like Apache Spark, Storm, Flume, Sqoop, and Kafka.

Apache Hadoop® and associated open source project names are trademarks of the Apache Software Foundation.
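
Because the store presents itself as a Hadoop-compatible file system, Hadoop tools typically address files in it by URI, much as they would an HDFS path. The sketch below is only an illustration of that addressing, assuming the `adl://` scheme and the `azuredatalakestore.net` host suffix; the account name `contoso` and the helper functions `adl_uri` and `split_adl_uri` are hypothetical and not part of any SDK.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Assumed host suffix for Data Lake Store accounts.
ADLS_HOST_SUFFIX = "azuredatalakestore.net"


def adl_uri(account: str, path: str) -> str:
    """Build an adl:// URI for a file or folder in the given account."""
    clean_path = "/" + path.lstrip("/")
    return f"adl://{account}.{ADLS_HOST_SUFFIX}{clean_path}"


def split_adl_uri(uri: str) -> Tuple[str, str]:
    """Split an adl:// URI into (account name, path inside the store)."""
    parts = urlsplit(uri)
    if parts.scheme != "adl" or not parts.hostname:
        raise ValueError(f"not an adl:// URI: {uri!r}")
    account = parts.hostname.split(".")[0]
    return account, parts.path


# "contoso" is a made-up account name used only for illustration.
uri = adl_uri("contoso", "logs/2016/events.csv")
print(uri)
print(split_adl_uri(uri))
```

A URI built this way could then be passed wherever a Hadoop-ecosystem tool expects a file system path, provided the cluster is configured with access to the account.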

Ultra-high capacity

Data Lake Store has no fixed limits on account size or file size. While other cloud storage offerings might restrict individual file sizes to a few terabytes, Data Lake Store can store files that are hundreds of times larger. At the same time, it provides very low latency read/write access and high throughput for scenarios like high-resolution video, scientific and medical data, large backups, event streams, web logs, and Internet of Things (IoT) data. Collect and store everything in Data Lake Store without restriction or prior understanding of business requirements.

Optimized for massive throughput

Data Lake Store is built for running large analytic systems that require massive throughput to query and analyze petabytes of data. Other cloud storage solutions aren’t always optimized for parallel computation, creating more work for application developers. With Data Lake Store, you need only focus on the application logic, and we automatically optimize the store for any throughput level.

High frequency, low latency, real-time analytics

Data Lake Store handles high volumes of small writes at low latency, so it’s optimized for near real-time scenarios like website analytics, IoT, and analytics from sensors. NoSQL databases like columnar and key-value stores can also integrate with Data Lake Store.

Stores data in its native format without prior transformation

Data Lake Store is a distributed file store that lets you store relational and non-relational data without transformation or schema definition. You can keep all of your data and analyze them in their native format.
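
Storing data without a schema means the structure is applied only when the data are read. The sketch below illustrates that idea under stated assumptions: the two in-memory strings stand in for raw files kept in the store, and the field names `device` and `temp_c` and both reader functions are hypothetical, chosen only for this example.

```python
import csv
import io
import json

# Raw payloads kept exactly as they arrived (stand-ins for files in the store).
raw_csv = "device,temp_c\nsensor-1,21.5\nsensor-2,19.0\n"
raw_jsonl = '{"device": "sensor-3", "temp_c": 22.25}\n'


def read_csv_readings(text):
    """Apply a schema at read time: parse CSV rows and cast temp_c to float."""
    return [
        {"device": row["device"], "temp_c": float(row["temp_c"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def read_jsonl_readings(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Both formats are combined only at analysis time; nothing was converted on write.
readings = read_csv_readings(raw_csv) + read_jsonl_readings(raw_jsonl)
average = sum(r["temp_c"] for r in readings) / len(readings)
print(len(readings), average)
```

The stored bytes never change; a different analysis could read the same files with a different schema.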

Durable and highly available

Data Lake Store automatically replicates your data to help guard against unexpected hardware failures and make sure it's available when you need it. We keep three copies within a single region.

Rich management and security features

All of your data are assets that have both present and future value. Data Lake Store provides rich capabilities to help manage and secure your data assets. Monitor performance, receive alerts, and audit usage—and gain greater peace of mind. Data Lake Store uses Azure Active Directory, providing a robust identity and access management solution for all of your data.

Related products and services

Data Lake Analytics

Distributed analytics service that makes big data easy

HDInsight

Provision cloud Hadoop, Spark, R Server, HBase, and Storm clusters

Get started with Data Lake Store

Try Data Lake Store for free