Spring Cloud Data Flow is a cloud-native orchestration service for composable microservice applications on modern runtimes. With Spring Cloud Data Flow, developers can create and orchestrate data pipelines for common use cases such as data ingest, real-time analytics, and data import/export.
Spring Cloud Data Flow is the cloud-native redesign of Spring XD – a project that aimed to simplify the development of Big Data applications. The stream and batch modules from Spring XD have been refactored into Spring Boot-based stream and task/batch microservice applications, respectively. These applications are now autonomous deployment units that can run "natively" in modern runtimes such as Cloud Foundry, Apache YARN, Apache Mesos, and Kubernetes.
Spring Cloud Data Flow offers a collection of patterns and best practices for microservices-based distributed streaming and task/batch data pipelines.
Step 1 - Download the Spring Cloud Data Flow Local Server and Shell apps:
wget http://repo.spring.io/release/org/springframework/cloud/spring-cloud-dataflow-server-local/1.1.2.RELEASE/spring-cloud-dataflow-server-local-1.1.2.RELEASE.jar
wget http://repo.spring.io/release/org/springframework/cloud/spring-cloud-dataflow-shell/1.1.2.RELEASE/spring-cloud-dataflow-shell-1.1.2.RELEASE.jar
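Both download URLs in Step 1 follow the same repository layout, so they can be derived from an artifact name and a version. The sketch below is a minimal illustration of that pattern; the helper name `artifact_url` and the variables `REPO_BASE` and `VERSION` are hypothetical, and the script only prints the `wget` commands instead of running them.

```shell
# Base of the release repository, copied from the wget commands in Step 1.
REPO_BASE="http://repo.spring.io/release/org/springframework/cloud"

# artifact_url ARTIFACT VERSION  (hypothetical helper)
# Prints the jar URL using the layout seen in Step 1:
#   <base>/<artifact>/<version>/<artifact>-<version>.jar
artifact_url() {
  printf '%s/%s/%s/%s-%s.jar\n' "$REPO_BASE" "$1" "$2" "$1" "$2"
}

VERSION="1.1.2.RELEASE"

# Dry run: show the download command for the Local Server and the Shell.
for artifact in spring-cloud-dataflow-server-local spring-cloud-dataflow-shell; do
  echo "wget $(artifact_url "$artifact" "$VERSION")"
done
```

Changing `VERSION` only makes sense for versions that are actually published under the same layout, which this sketch does not check.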
Step 2 - Download and start Kafka 0.10, which serves as the messaging middleware
Step 3 - Launch the Data Flow Local Server
java -jar spring-cloud-dataflow-server-local-1.1.2.RELEASE.jar
Step 4 - Launch the Shell on the same machine where the Data Flow Local Server is running
java -jar spring-cloud-dataflow-shell-1.1.2.RELEASE.jar
Step 5 - Import all the out-of-the-box application coordinates in bulk
dataflow:>app import --uri http://bit.ly/Avogadro-GA-stream-applications-kafka-10-maven
Step 6 - Create ‘ticktock’ Stream
dataflow:>stream create ticktock --definition "time | log" --deploy
You'll see output like the following in the Local Server console:
2016-07-18 22:08:24.777 INFO 73058 --- [nio-9393-exec-9] o.s.c.d.spi.local.LocalAppDeployer : deploying app ticktock.log instance 0
Logs will be in /var/folders/c3/ctx7_rns6x30tq7rb76wzqwr0000gp/T/spring-cloud-dataflow-5011521526937452211/ticktock-1468904904769/ticktock.log
2016-07-18 22:08:25.081 INFO 73058 --- [nio-9393-exec-9] o.s.c.d.spi.local.LocalAppDeployer : deploying app ticktock.time instance 0
Logs will be in /var/folders/c3/ctx7_rns6x30tq7rb76wzqwr0000gp/T/spring-cloud-dataflow-5011521526937452211/ticktock-1468904905074/ticktock.time
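The console output above shows each application in the `time | log` definition being deployed under the name `<stream>.<app>`. The following sketch reproduces that naming for simple definitions. The function `stream_app_names` is a hypothetical helper, not part of Data Flow, and it assumes a plain pipe-delimited definition without labels or quoted `|` characters.

```shell
# stream_app_names STREAM DEFINITION  (hypothetical helper)
# Splits a pipe-delimited stream definition and prints one
# "<stream>.<app>" name per line, in definition order.
# Assumption: no labels and no '|' inside quoted option values.
stream_app_names() {
  printf '%s\n' "$2" | tr '|' '\n' | while read -r app _rest; do
    # "app" is the first word of each segment; any options are ignored.
    if [ -n "$app" ]; then
      printf '%s.%s\n' "$1" "$app"
    fi
  done
}

stream_app_names ticktock "time | log"
# prints:
# ticktock.time
# ticktock.log
```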
Step 7 - Verify the ‘ticktock’ output by tailing the stdout file of the log application:
tail -f /var/folders/ ... /ticktock.log/stdout_0.log
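The directory to tail comes from the "Logs will be in" line that the Local Server prints for the `ticktock.log` application. As a rough sketch, the path can be extracted from saved console output instead of copied by hand. The function `stdout_log_for` is a hypothetical helper, and it assumes the "Logs will be in" text starts its own line, as in the output shown in Step 6.

```shell
# stdout_log_for APP  (hypothetical helper)
# Reads Local Server console output on stdin and prints the stdout log
# file for the first "Logs will be in <dir>" line whose directory ends
# in /APP. Note: dots in APP are treated as regex wildcards here.
stdout_log_for() {
  sed -n "s|^Logs will be in \(.*/$1\)\$|\1/stdout_0.log|p" | head -n 1
}

# Example using the console line shown in Step 6.
stdout_log_for ticktock.log <<'EOF'
Logs will be in /var/folders/c3/ctx7_rns6x30tq7rb76wzqwr0000gp/T/spring-cloud-dataflow-5011521526937452211/ticktock-1468904904769/ticktock.log
EOF
```

If the server console output were saved to a file (say, a hypothetical `server.out`), the result could be passed directly to tail: `tail -f "$(stdout_log_for ticktock.log < server.out)"`.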
Step 8 - Launch Dashboard at: http://localhost:9393/dashboard
| Server Type | Stable Release | Snapshot Release |
|---|---|---|
| Local Server | 1.1.2.RELEASE | 1.2.0.BUILD-SNAPSHOT |
| Cloud Foundry Server | 1.1.0.RELEASE | 1.2.0.BUILD-SNAPSHOT |
| Apache YARN Server | 1.1.0.RELEASE | 1.1.1.BUILD-SNAPSHOT |
| Kubernetes Server | 1.1.1.RELEASE | 1.1.2.BUILD-SNAPSHOT |
| Apache Mesos Server | 1.0.0.RELEASE | 1.1.0.BUILD-SNAPSHOT |
Spring Cloud Data Flow for HashiCorp Nomad
Spring Cloud Data Flow for Red Hat OpenShift
Spring Cloud Data Flow builds upon several projects; the top-level building blocks of the ecosystem are laid out below. Each project represents a core capability and evolves in isolation, with its own release cadence.
| Building blocks |
|---|
| REST-APIs / Shell / DSL, Dashboard, Flo for Spring Cloud Data Flow, Spring Flo |
| Spring Cloud Data Flow - Core |
| ↓ Uses ↓ |
| Spring Cloud Deployer - Service Provider Interface (SPI) |
| ↑ Implements ↑ |
| Spring Cloud Deployer Local, Spring Cloud Deployer Cloud Foundry, Spring Cloud Deployer Yarn, Spring Cloud Deployer Kubernetes, Spring Cloud Deployer Mesos |
| ↓ Deploys ↓ |
| Spring Cloud Stream App Starters, Spring Cloud Task App Starters |
| Spring Cloud Stream, Spring Cloud Task |
| ↓ Uses ↓ |
| Spring Integration, Spring Boot, Spring Batch |