I have been working with Hadoop for quite a few years now and frequently find myself needing to try bits of code out on multiple distributions. During these times, the single-node virtual editions of the various Hadoop distributions have always been my go-to resource. Of all the VMs available, I believe the most seamless and polished is the Hortonworks Sandbox. In fact, in the work I am starting now to build a new PHD 3.0 and HAWQ virtual playground, I view the Hortonworks Sandbox as the bar that needs to be exceeded.
Earlier this week, I had a request to do a live customer demonstration of installing HAWQ on HDP 2.2.4 leveraging Ambari. This activity kicked off those Sandbox thoughts again and I decided to leverage the Sandbox for the demo. Now, this was a bit of a risky proposition considering that I had just about 5 hours to figure out how to make it work. My fallback was a demo video that we built to show during the original announcement. Luckily, the process was fairly straightforward and I had it working in about an hour. I took the rest of my allotted time to work through some additional functionality that I knew would be needed in any follow-on efforts. One nice feature I spent a good deal of time on was automating the Ambari piece of the install via the extremely robust Ambari REST API.
One challenge that I ran into immediately was a versioning issue. Hortonworks provides an HDP 2.2-based VM that runs Ambari 1.7 and an HDP 2.2.4-based VM that runs the just-released Ambari 2.0. When our developers built the plugin that allows HAWQ to be installed as a Service within Ambari, they were working with the then-newest release, Ambari 1.7, so at release the Pivotal HAWQ installation requires Ambari 1.7. So, I had 2 options:
- Update the HDP stack in the 2.2 based VM
- Give the HAWQ installation a whirl on the 2.2.4 VM and Ambari 2.0
I decided to move forward with #2 just to see what would happen... and it worked. What you find below are the results of that first hour or so of work. Please keep in mind that this installation on the Sandbox results in what would be considered an unsupported configuration because it's leveraging Ambari 2.0, but for playing around with HAWQ it works just fine. I decided to go this direction because I was unsure how upgrading all of the Hadoop stack might affect some of the other tutorials that Hortonworks provides.
Here is the Step by Step Guide for the Installation:
- Download the Hortonworks Sandbox 2.2.4 and install it according to the instructions on the Hortonworks site.
- Boot the VM and once it's booted you will see the ssh command needed to login to the Sandbox. The default root password is: hadoop. Using a terminal, SSH into the VM.
- Outside of the VM: Download the Pivotal HAWQ package and the HAWQ plugin for Ambari on HDP. Then move the files into the VM. This can be accomplished via a shared drive or scp. As an example: scp /User/dbaskette/Downloads/hawq-plugin-hdp-1.0-103.tar.gz root@192.168.9.131:/opt
  The two files are:
  - hawq-plugin-hdp-1.0-103.tar.gz
  - PADS-1.3.0.0-12954.tar
- Untar and uncompress the files.
- Change directories into the new hawq-plugin directory. Inside will be a file named hawq-plugin-hdp-X.Y.Z. (substitute the correct version numbers).
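The copy and unpack steps above can be sketched as a small shell script. This is only a sketch under a few assumptions: it reuses the IP address, the /opt destination, and the file names from the example above, it assumes both archives sit in the current directory on the host, and it runs the unpack step over ssh rather than from an interactive session inside the VM. The `run` helper and the `copy_and_unpack` function are names made up for illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Assumed values, taken from the example in this post; substitute your own.
SANDBOX_HOST="192.168.9.131"
DEST_DIR="/opt"
PLUGIN_TGZ="hawq-plugin-hdp-1.0-103.tar.gz"
PADS_TAR="PADS-1.3.0.0-12954.tar"

# Hypothetical helper: print the command unless DRY_RUN=0, then execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "+ $*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the copy and unpack steps.
copy_and_unpack() {
  # Copy both archives from the current directory into the VM.
  run scp "$PLUGIN_TGZ" "$PADS_TAR" "root@${SANDBOX_HOST}:${DEST_DIR}"
  # Unpack both archives on the VM; GNU tar detects gzip compression on extract.
  run ssh "root@${SANDBOX_HOST}" \
    "cd ${DEST_DIR} && tar -xf ${PLUGIN_TGZ} && tar -xf ${PADS_TAR}"
}

copy_and_unpack
```

Run it once as-is to review the printed commands, then run it again with DRY_RUN=0 to actually copy and unpack the files. You will be prompted for the root password twice, once for scp and once for ssh.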