MapR made it a design goal for this release to be both portable and extensible, in order to support all types of data science teams. This means that, while we don't ship every possible tool that users will want, the right structure is in place for them to install those tools and have them work seamlessly, with direct access to the data in their MapR Converged Data Platform.
One of the great advantages of MapR-FS is the ability to mount your global file namespace as a Direct NFS mount on your local file system. What this means for deep learning libraries is that they can interact directly with the data in the cluster without needing to be distributed via an execution engine like Spark or limited by compatibility with HDFS.
The MapR Data Science Refinery container includes a FUSE-based MapR POSIX Client, optimized for containers, that allows deep learning libraries to read and write data directly to MapR-FS.
So, when you run TensorFlow, the compute occurs on the host where the container resides, but each container has full access to the persistent storage provided by the MapR Converged Data Platform. When you kill the container off, the data remains.
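Because the POSIX client exposes MapR-FS as an ordinary directory, plain file I/O is enough to persist results from inside the container. The following is a minimal sketch, assuming Python 3; the function names, the file name metrics.json, and the example mount path in the usage note are hypothetical and depend on how your cluster is mounted:

```python
import json
from pathlib import Path


def save_metrics(base_dir, run_name, metrics):
    """Write metrics as JSON under base_dir/run_name and return the file path."""
    run_dir = Path(base_dir) / run_name
    # Create the run directory (and any missing parents) on the mounted file system
    run_dir.mkdir(parents=True, exist_ok=True)
    out_path = run_dir / "metrics.json"
    with out_path.open("w") as f:
        json.dump(metrics, f)
    return out_path


def load_metrics(path):
    """Read a JSON metrics file back into a Python object."""
    with Path(path).open() as f:
        return json.load(f)
```

For example, calling save_metrics("/mapr/my.cluster.com/user/mapr/experiments", "run-001", {"loss": 0.42}) would write through the mount, where my.cluster.com stands in for your own cluster name. A file written this way should still be there after the container is removed, since it lives on the cluster rather than in the container's file system.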
Installing TensorFlow in your Data Science Refinery container is as simple as running:
sudo -u root pip install tensorflow
TensorFlow is then immediately available to you via the Python interpreter in Apache Zeppelin, and you can test it with the following script in a Zeppelin paragraph:
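import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
# Evaluate the constant in the session and print its value
print(sess.run(hello))
You should see the greeting printed as the paragraph output: Hello, TensorFlow! (shown as b'Hello, TensorFlow!' under Python 3).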
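Note that the script above uses the TensorFlow 1.x session API. If pip installs a TensorFlow 2.x release instead, tf.Session is no longer available at the top level and the equivalent lives under tf.compat.v1. The following is a hedged sketch of a version-tolerant check; the helper name hello_tensorflow is hypothetical, and it builds the constant inside an explicit graph so that it runs in graph mode on either major version:

```python
def hello_tensorflow():
    """Run the hello-world constant through a session on TF 1.x or 2.x."""
    import tensorflow as tf  # imported here so the helper can be defined anywhere

    # TF 2.x (and late 1.x) expose the session API under tf.compat.v1;
    # older 1.x releases only have it at the top level.
    v1 = getattr(getattr(tf, "compat", None), "v1", tf)

    graph = tf.Graph()
    with graph.as_default():
        # Ops created inside an explicit graph are graph-mode ops
        hello = tf.constant("Hello, TensorFlow!")

    with v1.Session(graph=graph) as sess:
        value = sess.run(hello)

    # String tensors come back as bytes; decode for display
    return value.decode("utf-8") if isinstance(value, bytes) else value
```

Calling print(hello_tensorflow()) in a Zeppelin paragraph doubles as a quick confirmation that the install succeeded.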
To access TensorBoard, simply add the port mapping to your docker run command:
docker run -p 6006:6006 ...
Then launch TensorBoard inside the container, binding it to all interfaces so that it can be reached through the external host IP:
tensorboard --logdir /tmp/ --host 0.0.0.0
TensorBoard 1.6.0 at http://0.0.0.0:6006 (Press CTRL+C to quit)
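If the TensorBoard page does not load, it can help to confirm that the mapped port is actually reachable from your workstation. The following is a minimal sketch using only the Python standard library; the helper name port_is_open is hypothetical, and the host you pass in is whatever external IP or hostname your Docker host uses:

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures
        return False
```

For example, port_is_open("<docker-host-ip>", 6006) returning False suggests that either the -p 6006:6006 mapping is missing or TensorBoard is not listening on 0.0.0.0.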
And that's all there is to it!
These blog posts and tutorials show how to put this into action:
- Announcing: MapR Data Science Refinery
- MapR Data Science Refinery Library
- How To: Run the MapR Data Science Refinery from an Edge Node
- How To: Collaborate & Share Notebooks using the MapR Data Science Refinery
- How To: Using R Studio with the MapR Data Science Refinery
- How To: Leveraging Python Environments from DSR (Conda)