docker pull hivemall/latest:20180924
docker run -p 8088:8088 -p 50070:50070 -p 19888:19888 -it hivemall/latest:20180924
Consider wrapping this command in a shell script so that you do not have to retype the port mappings each time you run the image.
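One way to do that is sketched below. It is a minimal example, not the only layout: the function names `hivemall_cmd` and `hivemall_run` and the `HIVEMALL_IMAGE` override variable are made up for illustration, while the image tag and the three ports are copied from the command above. Save it to a file, source it from your shell, and call `hivemall_run`.

```shell
# Hypothetical helper file: source it, then call hivemall_run.

# Image tag and ports are taken from the docker run command above.
# HIVEMALL_IMAGE is an assumed override variable, not something the image defines.
HIVEMALL_IMAGE="${HIVEMALL_IMAGE:-hivemall/latest:20180924}"
HIVEMALL_PORTS="8088 50070 19888"

# Print the docker command without running it, so it can be inspected first.
hivemall_cmd() {
  local cmd="docker run"
  local port
  for port in $HIVEMALL_PORTS; do
    cmd="$cmd -p $port:$port"
  done
  echo "$cmd -it $HIVEMALL_IMAGE"
}

# Start an interactive container. docker run pulls the image if it is
# not already present locally.
hivemall_run() {
  # Word splitting of the printed command is intentional here.
  $(hivemall_cmd)
}
```

Calling `hivemall_cmd` on its own prints the command, which is a quick way to check the port mappings before starting the container.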
Now seed Hive with some data. The typical example uses the IRIS data set. Run the following command, which downloads the IRIS data set, creates a number of directories, and then creates an external table in Hive that points to the data.
cd $HOME && ./bin/prepare_iris.sh
Now open Hive and list the databases.
hive -S
hive> show databases;
OK
default
iris
Time taken: 0.131 seconds, Fetched: 2 row(s)
Connect to the IRIS database and list the tables within it.
hive> use iris;
hive> show tables;
iris_raw
Now query the data. The table holds 150 records.
hive> select * from iris_raw;
1 Iris-setosa [5.1,3.5,1.4,0.2]
2 Iris-setosa [4.9,3.0,1.4,0.2]
3 Iris-setosa [4.7,3.2,1.3,0.2]
4 Iris-setosa [4.6,3.1,1.5,0.2]
5 Iris-setosa [5.0,3.6,1.4,0.2]
6 Iris-setosa [5.4,3.9,1.7,0.4]
7 Iris-setosa [4.6,3.4,1.4,0.3]
8 Iris-setosa [5.0,3.4,1.5,0.2]
9 Iris-setosa [4.4,2.9,1.4,0.2]
10 Iris-setosa [4.9,3.1,1.5,0.1]
11 Iris-setosa [5.4,3.7,1.5,0.2]
12 Iris-setosa [4.8,3.4,1.6,0.2]
13 Iris-setosa [4.8,3.0,1.4,0.1 ...
Find the min and max values for each feature.
hive> select
    > min(features[0]), max(features[0]),
    > min(features[1]), max(features[1]),
    > min(features[2]), max(features[2]),
    > min(features[3]), max(features[3])
    > from
    > iris_raw;
4.3 7.9 2.0 4.4 1.0 6.9 0.1 2.5
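The query above repeats the same min/max pair for every feature index, so it can also be generated rather than typed. The sketch below assumes a hypothetical helper named `minmax_query` that takes a table name and a feature count. It only builds the query text; running it relies on the Hive CLI's `-e` option, which executes a quoted query string.

```shell
# Hypothetical helper: build the min/max query for feature indexes 0..n-1.
minmax_query() {
  local table="$1"
  local n="$2"
  local cols=""
  local i=0
  while [ "$i" -lt "$n" ]; do
    if [ -n "$cols" ]; then
      cols="$cols, "
    fi
    cols="${cols}min(features[$i]), max(features[$i])"
    i=$((i + 1))
  done
  echo "select $cols from $table;"
}

# Usage inside the container, assuming the iris database created by
# prepare_iris.sh (table name qualified with the database name):
#   hive -S -e "$(minmax_query iris.iris_raw 4)"
```

Generating the text this way keeps the query in step with the number of features if you later try a data set with a different feature count.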
You are now up and running with HiveMall on Docker.