Spark is a very popular framework for processing data and doing machine learning in a distributed environment.
During development you will often work on a single node, typically your local PC or laptop, since not everyone has access to a multi-node distributed environment.
But what if you could spin up some Docker images, thereby creating additional nodes on which to test the scalability of your Spark code?
Here are links to some Docker images that may help you do this; a quick way to check that such a cluster is wired up correctly is sketched after the list.
- Mesosphere - Docker repository for Spark image
- Big Data Europe - Spark Docker images on GitHub
- GettyImages - Spark Docker image on GitHub, also available on Docker Hub
- SequenceIQ - Docker repository Spark image
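Once a cluster is up, it is worth verifying that your Spark session is really talking to it rather than silently running in local mode. The snippet below is a minimal smoke test, assuming a standalone master from one of the images above is reachable on localhost:7077; the exact image names, tags, and published ports differ between these projects, so check each project's README for the `docker run` or docker-compose commands it expects.

```python
from pyspark.sql import SparkSession

# Assumes a standalone Spark master is running in Docker with port 7077
# published on localhost -- adjust the URL to match your setup.
spark = (
    SparkSession.builder
    .master("spark://localhost:7077")
    .appName("docker-cluster-smoke-test")
    .getOrCreate()
)

# A trivially parallel job: if the workers are reachable, the count is
# computed across their executors instead of on the driver.
rdd = spark.sparkContext.parallelize(range(1_000_000), numSlices=8)
print("count:", rdd.count())
print("defaultParallelism:", spark.sparkContext.defaultParallelism)

spark.stop()
```

If the job hangs or fails to acquire executors, the usual culprits are unpublished worker ports or the master URL pointing at local mode; the master's web UI (commonly on port 8080) shows which workers have registered.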
Or simply create a free account on the Databricks Community Edition website to spin up your own Spark environment to play with and learn.