Monday, August 13, 2018

Spark docker images

Spark is a very popular framework for processing data and doing machine learning in a distributed environment.

When working in a development environment, you might work on a single node. This can be your local PC or laptop, as not everyone will have access to a multi-node distributed environment.
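For example, a minimal single-node PySpark session in local mode might look like the sketch below; it assumes only that the pyspark package is installed (e.g. via pip) and uses all the cores on your own machine.

# A minimal sketch of a single-node Spark session in local mode.
# Assumes the pyspark package is installed (e.g. pip install pyspark).
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("local[*]")          # use every core on this one machine
    .appName("local-dev")
    .getOrCreate()
)

# A tiny DataFrame just to confirm the session works.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
df.show()

spark.stop()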

But what if you could spin up some Docker containers, thereby creating additional nodes that let you test the scalability of your Spark code?
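As a rough sketch of the idea: if you run a Spark standalone master and a few worker containers in Docker and expose the master on its default port 7077, the only change to your code is the master URL. The host name and port below are assumptions about how the containers were set up, so adjust them to match your own environment.

# Sketch: pointing the same code at a standalone cluster running in Docker.
# "spark://localhost:7077" assumes the master container publishes the
# default standalone port 7077 on the local host; adjust to your setup.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("spark://localhost:7077")   # standalone master in a container
    .appName("docker-cluster-test")
    .getOrCreate()
)

# The same job now runs across however many worker containers you started,
# which lets you watch how your code behaves as you add nodes.
rdd = spark.sparkContext.parallelize(range(1000000), numSlices=8)
print(rdd.map(lambda x: x * x).sum())

spark.stop()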

There are links to some Docker images that may help you do this.

Or simply sign up for a free account on the Databricks Community Edition website to get your own cloud-hosted Spark environment to play and learn in.
