Posts

Using chef to build out a Hadoop cluster

After not posting for a while, I have about 3-4 posts that I'd like to get out there. The first is about using Chef to build a Hadoop cluster. Chef is a configuration management tool that automates the process of provisioning servers. I had to create a Hadoop cluster of 4-5 servers, and I wanted to use this opportunity to automate the process with Chef. I had to perform the same series of steps on each of these Linux nodes: install Ruby and Chef; install Sun Java; install VMware Tools; install NTP; add its hostname to a shared /etc/hosts file; configure passwordless ssh login. Installing Chef and Ruby: I followed the steps in this link. The first step is to sign up for a Hosted Chef account on the Opscode site. An account is free for 5 nodes or fewer. Perform the following steps: create a new organization; select "Generate knife config" to download knife.rb; select "Regenerate validation key" to download (validator).pem; Click on your acc...
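As a rough illustration of the "add its hostname to a shared /etc/hosts file" step, here is a minimal plain-shell sketch. It is not the Chef recipe from the post; the function name add_host_entry, the node names, and the IP addresses are all hypothetical, and the demo writes to a temporary file rather than the real /etc/hosts.

```shell
#!/bin/sh
# Sketch only: idempotently add an "IP<TAB>hostname" line to a hosts-format file.
# add_host_entry and the demo values below are hypothetical, not from the post.

add_host_entry() {
    ip="$1"
    name="$2"
    hosts_file="${3:-/etc/hosts}"

    # Look for an existing line whose first field is this IP and which
    # already lists this hostname; if found, do nothing.
    if awk -v ip="$ip" -v name="$name" '
            $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit !found }' "$hosts_file" 2>/dev/null; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
}

# Demo against a temporary file so the real /etc/hosts is untouched.
demo_hosts="$(mktemp)"
add_host_entry 10.0.0.11 hadoop-node1 "$demo_hosts"
add_host_entry 10.0.0.11 hadoop-node1 "$demo_hosts"   # repeated call appends nothing
add_host_entry 10.0.0.12 hadoop-node2 "$demo_hosts"
cat "$demo_hosts"
```

Making the step idempotent matters because a configuration management run is normally repeated on every node; a plain append would duplicate the entry on each run.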

Building a Hadoop cluster

I recently had to build a Hadoop cluster for my final project in a class on information retrieval. Here are some of my notes on configuring the nodes in the cluster. These links on configuring a single node cluster and multi node cluster were the most helpful. I downloaded the latest Hadoop distribution, then moved it into /hadoop. I had problems with this latest distribution (0.21), so I used 0.20 instead. Here are the configuration files I changed:

core-site.xml: fs.default.name is set to hdfs://master:9000, and hadoop.tmp.dir is set to /hadoop/tmp ("A base for other temporary directories.").

hadoop-env.sh:
# Variables required by Mahout
export HADOOP_HOME=/hadoop
export HADOOP_CONF_DIR=/hadoop/conf
export MAHOUT_HOME=/Users/rpark/mahout
PATH=/hadoop/bin:/Users/rpark/mahout/bin:$PATH
# The java implementation to use. Required.
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home

hdfs-site...
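The core-site.xml values in the excerpt above lost their XML tags. As a sketch, assuming the standard Hadoop property layout (name, value, optional description), the file could be written like this. CONF_DIR is a hypothetical scratch directory used so the example runs anywhere; in the post the configuration directory is /hadoop/conf.

```shell
#!/bin/sh
# Sketch only: write the two core-site.xml properties mentioned in the notes.
# CONF_DIR is a hypothetical scratch location; the post uses /hadoop/conf.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/core-site.xml"
```

To try it against a real install, set CONF_DIR=/hadoop/conf before running the script, after backing up the existing file.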

Working with VMware vShield REST API in perl

Here is an overview of how to use Perl code to work with VMware's vShield API. vShield App and Edge are two security products offered by VMware. vShield Edge has a broad range of functionality such as firewall, VPN, load balancing, NAT, and DHCP. vShield App is a NIC-level firewall for virtual machines. We'll focus today on how to use the API to programmatically make firewall rule changes. Here are some of the things you can do with the API: list the current firewall ruleset; add new rules; get a list of past firewall revisions; revert to a previous ruleset revision. vShield API documentation is available here. Before we get into the API itself, let's look at what the firewall ruleset looks like. It's formatted as XML:

1.1.1.1/32 10.1.1.1/32 datacenter-2 ANY 1023 High 1 ANY LDAP over SSL 636 TCP ALLOW deny 1020 Low 3 ANY IMAP 143 TCP ALLOW false

Here are some notes about the XML configuration: The API works mainly ...

Using multiple versions of Ruby on the same host

I've recently come across a tool called RVM, or Ruby Version Manager. It enables you to run different versions of Ruby on the same host. RVM uses git, so my first step was to install git with the Homebrew package manager. Homebrew is an increasingly popular alternative to MacPorts and Fink. Note that you'll need to install Xcode first.

/usr/bin/ruby -e "$(curl -fsSL https://raw.github.com/gist/323731)"
brew install git

Then I just followed the instructions available here.

bash < <( curl http://rvm.beginrescueend.com/releases/rvm-install-head )
source ~/.rvm/scripts/rvm
rvm install jruby,1.9.2-head

Here is what the output looks like:

info: Downloading jruby-bin-1.5.1, this may take a while depending on your connection...
info: Extracting jruby-bin-1.5.1 ...
info: Building Nailgun
info: Installing JRuby to /Users/rpark/.rvm/rubies/jruby-1.5.1
info: Importing initial gems...
info: Installing rake
info: Installing Ruby from source to: /Users/rp...

Connecting to JDBC data source from OS X perl

Here's another blog post on a similar topic. I recently had to figure it out and I documented it already, so I thought I would share it here too so I don't forget. Here are instructions on connecting to a JDBC data source from OS X perl. I used DBD::JDBC. The process mainly involves setting up a local Java server that provides the front end for a JDBC driver. The Java code implements the JDBC connection to the data source. The perl code talks to the Java server on localhost via DBI so it can access the database. Here are the steps:

1. Check out any documentation about required setup for your JDBC data source. For example, you may need to install an SSL certificate from the server to your client. You may also need to ensure that your server permits database access at all from your client or IP address.

2. Download your JDBC driver and put any .jar files into a lib directory. For example, VJDBC is a JDBC driver that enables you to establish JDBC connections over Java ...

Connecting to SQL Server from OS X perl

I've been spending my off-hours coding time on Perl instead of Ruby. My coding time in general has been very limited, which is part of the reason for the length of time between updates. :) My latest project is to pull data out of a Microsoft SQL Server database for analysis. I'm using perl for two reasons: I need a cross-platform environment, and I need certain libraries that only work on perl. Some of the target users for my code run on Windows. I know that Ruby runs on Windows, but it's not the platform of choice for Ruby developers. The vast majority seem to develop on either OS X or Linux, so Ruby on Windows isn't as mature as ActiveState perl on Windows. In fact, I don't even run native perl anymore on my MacBook Pro. I've switched over to ActiveState perl because I don't need to compile anything every time I want to install new CPAN libraries. And because it's ActiveState, I'm that much more confident it will w...

Thoughts on the iPad

There are many, many posts on the iPad already. I was quite surprised by the intensity of the debate over its merits, or lack thereof. I've had my iPad for about 5 months now, and I love it. I preordered it and got it on April 30. Since then it has subtly become part of my daily routine. Here is a list of my uses for it, ordered from most frequent to least frequent: reading books (iBooks, GoodReader, Bible); e-mail; looking at RSS feeds; watching movies; games. I thought it would be interesting to put together a "collage" of interesting blog posts about the iPad. Here is an analysis of the claim that the iPad is nothing new. Its critics seize on this as a point of weakness, but in reality it is probably a point of strength:

Instead of praising the iPad, critics express their disappointment, because they expected more. They expected a genre buster. They expected something they’d never seen before, something beyond their imagination. Something revolutionary....