<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>thattommyhall.com</title>
	<atom:link href="http://www.thattommyhall.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.thattommyhall.com</link>
	<description>A Random Walk Through Idea Space</description>
	<lastBuildDate>Sun, 08 Jan 2012 11:42:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Forward Vegas 2011</title>
		<link>http://www.thattommyhall.com/2012/01/08/forward-vegas-2011/</link>
		<comments>http://www.thattommyhall.com/2012/01/08/forward-vegas-2011/#comments</comments>
		<pubDate>Sun, 08 Jan 2012 11:42:23 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[travel]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=772</guid>
		<description><![CDATA[Once again Forward took all its staff to Las Vegas for the Christmas party, cheers Neil! We stayed at the Wynn again. Beautiful view from my room On the first day we went karting. The final day I went to do a skyjump off the Stratosphere (have a DVD I&#8217;ll upload when I find it) [...]]]></description>
			<content:encoded><![CDATA[<p>Once again Forward took all its staff to Las Vegas for the Christmas party, cheers Neil!</p>
<p>We stayed at the Wynn again.<br />
<a href="http://www.flickr.com/photos/thattommyhall/6593575777/" title="IMAG0171.jpg by thattommyhall, on Flickr"><img src="http://farm8.staticflickr.com/7001/6593575777_4233933ed9_z.jpg" width="640" height="383" alt="IMAG0171.jpg"></a></p>
<p>Beautiful view from my room<br />
<a href="http://www.flickr.com/photos/thattommyhall/6593572247/" title="IMAG0163.jpg by thattommyhall, on Flickr"><img src="http://farm8.staticflickr.com/7167/6593572247_a24d4ce8b4_z.jpg" width="640" height="383" alt="IMAG0163.jpg"></a></p>
<p>On the first day we went karting.<br />
<a href="http://www.flickr.com/photos/thattommyhall/6593571835/" title="IMAG0162.jpg by thattommyhall, on Flickr"><img src="http://farm8.staticflickr.com/7160/6593571835_ccbfa9f464_z.jpg" width="640" height="383" alt="IMAG0162.jpg"></a></p>
<p>The final day I went to do a skyjump off the Stratosphere (have a DVD I&#8217;ll upload when I find it)<br />
<a href="http://www.flickr.com/photos/thattommyhall/6593573621/" title="IMAG0166.jpg by thattommyhall, on Flickr"><img src="http://farm8.staticflickr.com/7027/6593573621_d46a8f09b0_z.jpg" width="383" height="640" alt="IMAG0166.jpg"></a><br />
<a href="http://www.flickr.com/photos/thattommyhall/6593575417/" title="IMAG0170.jpg by thattommyhall, on Flickr"><img src="http://farm8.staticflickr.com/7005/6593575417_34eec4a941_z.jpg" width="640" height="383" alt="IMAG0170.jpg"></a></p>
<p>In between, some drinking and gambling&#8230;</p>
<p>Random head coming out of the lake in the Wynn<br />
<a href="http://www.flickr.com/photos/thattommyhall/6593576049/" title="IMAG0172.jpg by thattommyhall, on Flickr"><img src="http://farm8.staticflickr.com/7007/6593576049_ca8e5ec96b_z.jpg" width="640" height="383" alt="IMAG0172.jpg"></a></p>
<p>Then home!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2012/01/08/forward-vegas-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Day 700 of 101 goals in 1001 days</title>
		<link>http://www.thattommyhall.com/2011/12/22/day-700/</link>
		<comments>http://www.thattommyhall.com/2011/12/22/day-700/#comments</comments>
		<pubDate>Thu, 22 Dec 2011 15:44:53 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[101]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=762</guid>
		<description><![CDATA[This update is late as I have been busy. 84 &#8211; Revisit Met Museum Went while I was at Hadoop World (pic is actually the Natural History museum but what the hell) During my recent Africa Trip: 24 &#8211; Swim with sharks 33 &#8211; Safari Visited Addo Elephant Park during my recent South Africa Trip [...]]]></description>
			<content:encoded><![CDATA[<p>This update is late as I have been busy.</p>
<p><strong>84 &#8211; Revisit Met Museum</strong><br />
Went while I was at Hadoop World (pic is actually the Natural History museum but what the hell)<br />
<a href="http://www.flickr.com/photos/thattommyhall/6554507049/" title="IMAG0110 by thattommyhall, on Flickr"><img src="http://farm8.staticflickr.com/7018/6554507049_256e46c2a9_z.jpg" width="383" height="640" alt="IMAG0110"></a><br />
During my recent Africa Trip:<br />
<strong>24 &#8211; Swim with sharks</strong><br />
<a href="http://www.flickr.com/photos/thattommyhall/6454747759/" title="IMG_7285.JPG by thattommyhall, on Flickr"><img src="http://farm8.staticflickr.com/7168/6454747759_c021578cee_z.jpg" width="640" height="480" alt="IMG_7285.JPG"></a><br />
<strong>33 &#8211; Safari</strong><br />
Visited <a href="http://www.addoelephant.com/parks/addo/">Addo Elephant Park</a> during my recent South Africa Trip<br />
<a href="http://www.flickr.com/photos/thattommyhall/6459069725/" title="DSC00594.JPG by thattommyhall, on Flickr"><img src="http://farm8.staticflickr.com/7168/6459069725_8f49012b77_z.jpg" width="640" height="480" alt="DSC00594.JPG"></a><br />
<strong>34 &#8211; Vineyard tour</strong><br />
I visited the <a href="http://www.boekenhoutskloof.co.za/">Boekenhoutskloof</a> vineyard in Franschhoek, South Africa<br />
<a href="http://www.flickr.com/photos/thattommyhall/6459115581/" title="DSC00659.JPG by thattommyhall, on Flickr"><img src="http://farm8.staticflickr.com/7022/6459115581_cdc1a2bc5c_z.jpg" width="640" height="480" alt="DSC00659.JPG"></a><br />
I&#8217;ll do a full write-up of the Africa Trip soon, it was amazing.</p>
<p>So now out of the 101 I have done 24 with 18 on track and only 300 days to go! Need to get a shift on and tick off as much as I can.</p>
<p>The dayzeroproject site is back up, I am on there as <a href="http://dayzeroproject.com/user/thattommyhall" target="_blank">thattommyhall</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/12/22/day-700/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Reith Lecture by Aung San Suu Kyi</title>
		<link>http://www.thattommyhall.com/2011/06/28/reith-lecture-by-aung-san-suu-kyi/</link>
		<comments>http://www.thattommyhall.com/2011/06/28/reith-lecture-by-aung-san-suu-kyi/#comments</comments>
		<pubDate>Tue, 28 Jun 2011 15:59:21 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[random]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=748</guid>
		<description><![CDATA[Today I woke up to Radio 4 as usual and was surprised to hear this year&#8217;s Reith Lecture by Aung San Suu Kyi on Securing Freedom. Very interesting, looking forward to hearing the remainder. The archive is available and I thought I would share my highlights: Vilayanur Ramachandran: The Emerging Mind was the first one I [...]]]></description>
			<content:encoded><![CDATA[<p>Today I woke up to Radio 4 as usual and was surprised to hear this year&#8217;s Reith Lecture by Aung San Suu Kyi on <a href="http://www.bbc.co.uk/programmes/b0126d29">Securing Freedom</a>. Very interesting, looking forward to hearing the remainder.</p>
<p>The archive is available and I thought I would share my highlights:</p>
<ul>
<li><a href="http://www.bbc.co.uk/programmes/p00ghvck">Vilayanur Ramachandran: The Emerging Mind</a> was the first one I ever listened to, great peek into how the brain works</li>
<li><a href="http://www.bbc.co.uk/programmes/p00gmx4c">Edward Said: The representation of the Intellectual</a></li>
<li><a href="http://www.bbc.co.uk/programmes/p00gq1fk">John Searle: Minds Brains And Science</a>. I don&#8217;t agree with Searle but find him very engaging and thought-provoking.</li>
<li><a href="http://www.bbc.co.uk/programmes/p00h9lz3">Bertrand Russell: Authority and the Individual.</a> The first Reith lecture was by a personal hero.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/06/28/reith-lecture-by-aung-san-suu-kyi/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Walking The Great Glen Way</title>
		<link>http://www.thattommyhall.com/2011/06/26/walking-the-great-glen-way/</link>
		<comments>http://www.thattommyhall.com/2011/06/26/walking-the-great-glen-way/#comments</comments>
		<pubDate>Sun, 26 Jun 2011 21:22:53 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[101]]></category>
		<category><![CDATA[hiking]]></category>
		<category><![CDATA[scotland]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=731</guid>
		<description><![CDATA[Over Easter, while we had all the extra days off because some chinless wonder married a model in an old church in London I went with two of my best friends and walked the 73 miles from Inverness to Fort William along the Caledonian Canal. (picture from Wikipedia) We did it ultralight, using kit I [...]]]></description>
			<content:encoded><![CDATA[<p>Over Easter, while we had all the extra days off because some chinless wonder married a model in an old church in London I went with two of my best friends and walked the 73 miles from Inverness to Fort William along the Caledonian Canal.</p>
<p><a href="http://www.thattommyhall.com/wp-content/uploads/2011/06/500px-Great_Glen_Way_map-en.svg_.png"><img src="http://www.thattommyhall.com/wp-content/uploads/2011/06/500px-Great_Glen_Way_map-en.svg_.png" alt="" title="500px-Great_Glen_Way_map-en.svg" width="500" height="674" class="alignleft size-full wp-image-732" /></a><br />
(picture from <a href="http://en.wikipedia.org/wiki/Great_Glen_Way">Wikipedia</a>)</p>
<p>We did it ultralight, using kit I have <a href="http://www.thattommyhall.com/2008/06/15/summer-fun/">blogged about before</a>. My mate Ben got well into expedition planning mode and prepared an optimal food mix for the trip and introduced us to <a href="http://">SCROGIN</a> (Sultanas Chocolate Raisins Orange Ginger Imagination Nuts) and <a href="http://en.wikipedia.org/wiki/ANZAC_biscuit">ANZAC biscuits</a> (his lovely other half is a Kiwi).<br />
<a href="http://www.flickr.com/photos/thattommyhall/5701461234/" title="IMAG0005.jpg by thattommyhall, on Flickr"><img src="http://farm3.static.flickr.com/2010/5701461234_b4670f3e59.jpg" width="500" height="299" alt="IMAG0005.jpg"></a><a href="http://www.flickr.com/photos/thattommyhall/5700892919/" title="IMAG0006.jpg by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5269/5700892919_3e04b42a3c.jpg" width="299" height="500" alt="IMAG0006.jpg"></a><br />
I was pleased to fit it all in a 30L sack, made the walking much easier than it might have been.</p>
<p>As I had just been in Lisbon for a stag do the weekend before, I was not feeling 100% when we got the sleeper to Inverness on the Monday night, but we arrived somewhat fresh and started walking immediately. The sleeper is really nice and I would definitely recommend it over flying if you need an early start in Scotland, see <a href="http://www.scotrail.co.uk/caledoniansleeper/index.html">ScotRail</a>. By the end of Tuesday we had got most of the way to Invermoriston (nearly 30 miles) but were all exhausted. We wild camped with some stunning views.<br />
<a href="http://www.flickr.com/photos/thattommyhall/5700894217/" title="IMAG0007.jpg by thattommyhall, on Flickr"><img src="http://farm3.static.flickr.com/2255/5700894217_273fb005d1_z.jpg" width="640" height="383" alt="IMAG0007.jpg"></a></p>
<p>On the Wednesday we walked to Fort Augustus and decided to take a B&#038;B for the night as none of us had slept well and our legs and feet were killing us. We were fortunate enough to stay at <a href="http://www.oldpierhouse.com/">Old Pier House</a>, which was lovely, and we got moving again on the Thursday with much more enthusiasm than we ended the day before.</p>
<p>Thursday night we got past Laggan and camped at a campsite on the north of Loch Lochy.<br />
<a href="http://www.flickr.com/photos/thattommyhall/5701465328/" title="IMAG0008.jpg by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5150/5701465328_538f5d8b8f_z.jpg" width="640" height="383" alt="IMAG0008.jpg"></a></p>
<p>Friday was an epic day, taking in the 2 Munros (<a href="http://en.wikipedia.org/wiki/Meall_na_Teanga">Meall na Teanga</a> and <a href="http://en.wikipedia.org/wiki/Sr%C3%B2n_a%27_Choire_Ghairbh">Sròn a&#8217; Choire Ghairbh</a>), walking about 25 miles and then (we thought) finishing the walk.</p>
<p>We had actually just reached <a href="http://en.wikipedia.org/wiki/Neptune%27s_Staircase">Neptune&#8217;s Staircase</a> and we wound up bivvying at the start line of <a href="http://www.maggiescentres.org/eventsfundraising/events/monsterhike/about.html">Maggies Monster Bike and Hike</a>. We must have looked quite odd&#8230;</p>
<p>We spent the first few hours of the Saturday finishing it off and arriving at Fort William where we ate the biggest amount of food we could.</p>
<p><a href="http://www.flickr.com/photos/thattommyhall/5700897633/" title="IMAG0009.jpg by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5107/5700897633_06e77c3546_z.jpg" width="640" height="383" alt="IMAG0009.jpg"></a></p>
<p>A great hike with 2 great guys and, as it is a UK long distance path, it is another of my <a href="http://www.thattommyhall.com/category/101/">101 goals in 1001 days</a> ticked off.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/06/26/walking-the-great-glen-way/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Berlin Buzzwords</title>
		<link>http://www.thattommyhall.com/2011/06/09/berlin-buzzwords/</link>
		<comments>http://www.thattommyhall.com/2011/06/09/berlin-buzzwords/#comments</comments>
		<pubDate>Thu, 09 Jun 2011 12:07:50 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[conf]]></category>
		<category><![CDATA[hadoop]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=717</guid>
		<description><![CDATA[I have just returned from Berlin Buzzwords. It was a great conference and well organised so thanks to the organisers. As all the talks will be online soon I will just mention a few things that I enjoyed. The two keynotes were excellent, Doug Cutting on the history of Hadoop and Ted Dunning on the [...]]]></description>
			<content:encoded><![CDATA[<p>I have just returned from Berlin Buzzwords. It was a great conference and well organised so thanks to the organisers.</p>
<p>As all the talks will be online soon I will just mention a few things that I enjoyed.</p>
<p>The two keynotes were excellent, Doug Cutting on the history of Hadoop and Ted Dunning on the future. Both were very interesting and had a great feel for the community aspect of Open Source software. Ted works for <a href="http://www.mapr.com/">MapR technologies</a> but the talk was not a sales pitch. Ted spoke about how Hadoop currently fails to get the most out of the components and what we might get if we could. MapR are used by EMC for their new Hadoop distro; among other things I think they have reimplemented HDFS. An interesting number of companies have got some pretty big amounts of funding to build front-ends to Hadoop; <a href="http://www.datameer.com/">DataMeer</a> have an Excel-like web frontend that looks interesting.</p>
<p>Talks I enjoyed were:</p>
<p><a href="http://berlinbuzzwords.de/content/nodejs-heavy-io">NODE.JS FOR HEAVY I/O</a><br />
A superb intro to Node.js, with an example small enough to fit on a slide but not completely trivial.</p>
<p><a href="http://berlinbuzzwords.de/content/time-series-or-causal-analysis-without-limits">TIME SERIES OR CAUSAL ANALYSIS WITHOUT LIMITS!</a><br />
Shivek was awesome, engaging and enthusiastic. The topic itself was fascinating, using<br />
<a href="http://en.wikipedia.org/wiki/%CE%A0-calculus">Pi Calculus</a> to reason about and design map/reduce algorithms. He made the point that most Hadoop jobs are datacentric but showed how to do some more mathscentric algorithms like FFTs</p>
<p><a href="http://berlinbuzzwords.de/content/oh-leonhard-where-art-thou">OH LEONHARD, WHERE ART THOU?</a><br />
Jim Webber on graph databases in general and <a href="http://neo4j.org/">Neo4J</a> in particular. Quite a nice reference to Euler in the title. If your data is a graph, why not have a database that is too?</p>
<p><a href="http://berlinbuzzwords.de/content/realtime-big-data-facebook-hadoop-and-hbase">REALTIME BIG DATA AT FACEBOOK WITH HADOOP AND HBASE</a><br />
From  Jonathan Gray, this talk was really interesting &#8211; amazing the throughput they are getting from HBase. I think Forward are more like Facebook than Google (more freedom within teams, choice of tech/roll your own vs Google wanting everything on BigTable. I cringed a bit at the thought of loads of servers running random C++ apps all over the place though&#8230;)</p>
<p><a href="http://berlinbuzzwords.de/content/newer-developments-large-data-techniques">NEWER DEVELOPMENTS IN LARGE DATA TECHNIQUES</a><br />
Joseph Turian from <a href="http://metaoptimize.com/">MetaOptimize</a> gave a great overview of recent academic work on Machine Learning and Natural Language Processing. Buzzwords to look out for are: Deep Learning, Semantic Hashing and Semantic Parsing. Also look at <a href="http://www.graphlab.ml.cmu.edu/">GraphLab</a>, Machine Learning on graph databases.</p>
<p><a href="http://berlinbuzzwords.de/content/digitised-dutch-cultural-heritage-mahout-hadoop">DIGITISED DUTCH CULTURAL HERITAGE, MAHOUT &#038; HADOOP</a><br />
<a href="http://berlinbuzzwords.de/content/composing-mahout-clustering-jobs">COMPOSING MAHOUT CLUSTERING JOBS</a><br />
Two good talks on using Mahout, the first is on a Dutch Gov project, <a href="http://imagesforthefuture.com/en/project">Images for the future</a> to archive and categorise AV heritage resources. The second had a nice demo of categorising stack-overflow.</p>
<p>Lightning Talks:<br />
Eric Barton of <a href="http://www.whamcloud.com/">Whamcloud</a> talked about the Lustre filesystem, how his company are developing Lustre outside Sun/Oracle and where it could fit in with Hadoop. Lustre is at the other end of the spectrum from HDFS/Hadoop, really quick but assuming fast, highly available storage behind it. I would love to see some integration with <a href="http://wiki.lustre.org/index.php/Main_Page">Lustre</a> or <a href="http://ceph.newdream.net/">Ceph</a> in a Hadoop-like system.</p>
<p>I gave a talk on the Flume Firehose Abs and I made at Forward last week, it was OK (though I still think no-one has done a good job of selling ZeroMQ in 10 minutes!). Slides are <a href="http://tinyurl.com/tthbbuzz">here</a> (I&#8217;ll do another post about it as well, quite an entertaining fallout from it over twitter.)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/06/09/berlin-buzzwords/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Compressing Text Tables In Hive</title>
		<link>http://www.thattommyhall.com/2011/06/01/compressing-text-tables-in-hive/</link>
		<comments>http://www.thattommyhall.com/2011/06/01/compressing-text-tables-in-hive/#comments</comments>
		<pubDate>Wed, 01 Jun 2011 10:29:36 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[hadoop]]></category>
		<category><![CDATA[hive]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=690</guid>
		<description><![CDATA[At Forward we have been using Hive for a while and started out with the default table type (uncompressed text) and wanted to see if we could save some space and not lose too much performance. The wiki page HiveCompressedStorage lists the possibilities. Basically you have 3 decisions: TextFile or SequenceFile tables TextFile Can be [...]]]></description>
			<content:encoded><![CDATA[<p>At Forward we have been using Hive for a while and started out with the default table type (uncompressed text) and wanted to see if we could save some space and not lose too much performance.</p>
<p>The wiki page <a href="http://wiki.apache.org/hadoop/Hive/CompressedStorage">HiveCompressedStorage</a> lists the possibilities. </p>
<p>Basically you have 3 decisions:<br />
<strong>TextFile or SequenceFile tables</strong><br />
TextFile</p>
<ul>
<li>Can be compressed in place. </li>
<li>Can gzip/bzip before you LOAD DATA into your table</li>
<li>Only gzip/bzip are supported</li>
<li>Gzip is not splittable</li>
</ul>
<p>SequenceFile</p>
<ul>
<li>Need to create a SequenceFile table and do a SELECT/INSERT into it</li>
<li>Can use any supported compression codec</li>
<li>All compression codecs are splittable. All the cool kids use <a href="https://github.com/toddlipcon/hadoop-lzo">LZO</a> or <a href="http://code.google.com/p/hadoop-snappy/">Snappy</a></li>
<li><strong>Does not work</strong>- At least <a href="http://mail-archives.apache.org/mod_mbox/hive-user/201105.mbox/%3CBANLkTim_VG92dnG+fxC89NTSKAJBVvKgMw@mail.gmail.com%3E">for me</a> (help appreciated!)</li>
</ul>
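<p>The &#8220;gzip before you LOAD DATA&#8221; option for TextFile tables can be sketched in Ruby with nothing but the standard library. This is a minimal illustration rather than something we ran at Forward: <code>gzip_file</code> and the sample file name are made up for the example, and the resulting <code>.gz</code> file would then be loaded with an ordinary <code>LOAD DATA</code> statement.</p>

```ruby
require 'zlib'
require 'tmpdir'

# Gzip the file at src_path into "src_path.gz" and return the new path.
# (gzip_file is a hypothetical helper name, not part of Hive or rbhive.)
def gzip_file(src_path)
  gz_path = "#{src_path}.gz"
  Zlib::GzipWriter.open(gz_path) do |gz|
    File.open(src_path, 'rb') do |src|
      # Stream in 64 KB chunks so a large export never sits in memory.
      while (chunk = src.read(64 * 1024))
        gz.write(chunk)
      end
    end
  end
  gz_path
end

# Demo on a throwaway file of repetitive tab-separated rows (made-up data).
Dir.mktmpdir do |dir|
  src = File.join(dir, 'keywords.tsv')
  File.write(src, "account_1\tcampaign_1\tsome keyword\n" * 10_000)
  gz = gzip_file(src)
  puts "raw bytes:     #{File.size(src)}"
  puts "gzipped bytes: #{File.size(gz)}"
end
```

<p>Because gzip is not splittable, each <code>.gz</code> file ends up being read by a single mapper, so it is worth keeping the individual files reasonably small.</p>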
<p><strong>Which compression algorithm</strong></p>
<ul>
<li>gzip &#8211; Quite slow, good compression, not splittable, supported in TextFile table</li>
<li>bzip &#8211; Slowest, best compression, splittable, supported in TextFile table</li>
<li><a href="https://github.com/toddlipcon/hadoop-lzo">LZO</a> &#8211; Not in standard distro (licensing issues), fast, splittable</li>
<li><a href="http://code.google.com/p/hadoop-snappy/">Snappy</a> &#8211; New from Google, not in standard distro (but licence-compatible), very fast</li>
</ul>
<p><strong>Block or Record compression (for SequenceFile tables) </strong><br />
The docs say </p>
<blockquote><p>The value for io.seqfile.compression.type determines how the compression is performed. If you set it to RECORD you will get as many output files as the number of map/reduce jobs. If you set it to BLOCK, you will get as many output files as there were input files. There is a tradeoff involved here &#8212; large number of output files => more parellel map jobs => lower compression ratio.</p></blockquote>
<p>But I got the same number of files regardless of what I selected, and the total size suggested they were not even compressed, so I don&#8217;t know what is going on.</p>
<p>For simplicity I chose gzipped TextFile tables because:</p>
<ul>
<li>It worked (always criterion zero)</li>
<li>Most of our files were not huge anyway and the technique described below keeps some of the parallelism</li>
<li>Can be done on the table in place</li>
<li>Each partition can be compressed separately </li>
<li>The space is saved incrementally and realised immediately </li>
<li>Testing showed for our load it was not much of a performance hit</li>
<li>We are feeling more pain on space than query performance at the moment (our hourly runs complete in ~20 mins)</li>
</ul>
<p><div id="gist-1000863" class="gist">

        <div class="gist-file">
          <div class="gist-data gist-syntax">
              <div class="highlight"><pre><div class='line' id='LC1'><span class="nb">require</span> <span class="s1">&#39;rubygems&#39;</span></div><div class='line' id='LC2'><span class="nb">require</span> <span class="s1">&#39;date&#39;</span></div><div class='line' id='LC3'><span class="nb">require</span> <span class="s1">&#39;rbhive&#39;</span></div><div class='line' id='LC4'><br/></div><div class='line' id='LC5'><span class="n">countrys</span> <span class="o">=</span> <span class="sx">%w[at au br de dk es fr in it jp mx nl no pl pt ru se uk us za]</span></div><div class='line' id='LC6'><span class="n">dates</span> <span class="o">=</span> <span class="p">(</span><span class="no">Date</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s1">&#39;2011-01-01&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">.</span><span class="no">Date</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s1">&#39;2011-04-30&#39;</span><span class="p">))</span></div><div class='line' id='LC7'><br/></div><div class='line' id='LC8'><span class="no">RBHive</span><span class="o">.</span><span class="n">connect</span><span class="p">(</span><span class="s1">&#39;hiveserver&#39;</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">con</span><span class="o">|</span></div><div class='line' id='LC9'>&nbsp;&nbsp;<span class="n">dates</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">date</span><span class="o">|</span></div><div class='line' id='LC10'>&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">countrys</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">country</span><span class="o">|</span></div><div class='line' id='LC11'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">query</span> <span 
class="o">=</span> <span class="s2">&quot;insert overwrite table keywords partition (dated=&#39;</span><span class="si">#{</span><span class="n">date</span><span class="si">}</span><span class="s2">&#39;, country = &#39;</span><span class="si">#{</span><span class="n">country</span><span class="si">}</span><span class="s2">&#39;)</span></div><div class='line' id='LC12'><span class="s2">              select account,campaign,ad_group,keyword_id,keyword,match_type,status,</span></div><div class='line' id='LC13'><span class="s2">              first_page_bid,quality_score,distribution,max_cpc,destination_url,ad_group_status,</span></div><div class='line' id='LC14'><span class="s2">              campaign_status,currency_code,impressions,clicks,ctr,cpc,</span></div><div class='line' id='LC15'><span class="s2">              cost,avg_position,account_id,campaign_id,adgroup_id </span></div><div class='line' id='LC16'><span class="s2">              from keywords where dated=&#39;</span><span class="si">#{</span><span class="n">date</span><span class="si">}</span><span class="s2">&#39; and country=&#39;</span><span class="si">#{</span><span class="n">country</span><span class="si">}</span><span class="s2">&#39;&quot;</span></div><div class='line' id='LC17'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">begin</span></div><div class='line' id='LC18'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">con</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="s1">&#39;mapred.output.compression.codec&#39;</span><span class="p">,</span><span class="s1">&#39;org.apache.hadoop.io.compress.GzipCodec&#39;</span><span class="p">)</span></div><div class='line' id='LC19'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">con</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="s1">&#39;hive.exec.compress.output&#39;</span><span class="p">,</span><span 
class="s1">&#39;true&#39;</span><span class="p">)</span></div><div class='line' id='LC20'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">con</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="s1">&#39;mapred.output.compress&#39;</span><span class="p">,</span><span class="s1">&#39;true&#39;</span><span class="p">)</span></div><div class='line' id='LC21'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">con</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="s1">&#39;mapred.compress.map.output&#39;</span><span class="p">,</span><span class="s1">&#39;true&#39;</span><span class="p">)</span></div><div class='line' id='LC22'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">con</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="s1">&#39;hive.merge.mapredfiles&#39;</span><span class="p">,</span><span class="s1">&#39;true&#39;</span><span class="p">)</span></div><div class='line' id='LC23'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">con</span><span class="o">.</span><span class="n">set</span><span class="p">(</span><span class="s1">&#39;hive.merge.mapfiles&#39;</span><span class="p">,</span><span class="s1">&#39;true&#39;</span><span class="p">)</span></div><div class='line' id='LC24'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">con</span><span class="o">.</span><span class="n">execute</span><span class="p">(</span><span class="n">query</span><span class="p">)</span></div><div class='line' id='LC25'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">rescue</span> <span class="o">=&gt;</span> <span class="n">e</span></div><div class='line' id='LC26'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="nb">puts</span> <span class="s2">&quot;#########################&quot;</span></div><div class='line' 
id='LC27'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="nb">puts</span> <span class="n">e</span><span class="o">.</span><span class="n">message</span></div><div class='line' id='LC28'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="nb">puts</span> <span class="s2">&quot;#########################&quot;</span></div><div class='line' id='LC29'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">end</span></div><div class='line' id='LC30'>&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">end</span></div><div class='line' id='LC31'>&nbsp;&nbsp;<span class="k">end</span> </div><div class='line' id='LC32'><span class="k">end</span> </div><div class='line' id='LC33'><br/></div></pre></div>
          </div>

          <div class="gist-meta">
            <a href="https://gist.github.com/raw/1000863/6ddbf406275e09a95b30a1203f522e493e0ea9e8/compress_keywords.rb" style="float:right;">view raw</a>
            <a href="https://gist.github.com/1000863#file_compress_keywords.rb" style="float:right;margin-right:10px;color:#666">compress_keywords.rb</a>
            <a href="https://gist.github.com/1000863">This Gist</a> brought to you by <a href="http://github.com">GitHub</a>.
          </div>
        </div>
</div>
<br />
This will loop through the partitions (date/country) and do an INSERT OVERWRITE from/to that partition using our <a href="https://github.com/forward/rbhive">rbhive</a> gem. This is good because Hive reads the old data via map/reduce jobs, writes the output to /tmp, deletes the old folder and then imports the new compressed version. You need to select the columns out explicitly as the target partition has 2 fewer fields (date and country are missing). As we had 2 levels of partitioning and lots of big files this ran within a day on a 2TB table, saving us around 5TB (replication factor is 3).</p>
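<p>To get a feel for how many statements that loop issues, here is a small sketch using the same date range and country list as the script above. Nothing here talks to Hive; it only counts the (date, country) partitions, and the variable names are just for illustration.</p>

```ruby
require 'date'

countries = %w[at au br de dk es fr in it jp mx nl no pl pt ru se uk us za]
dates = (Date.parse('2011-01-01')..Date.parse('2011-04-30')).to_a

# One INSERT OVERWRITE is issued per (date, country) partition.
partitions = dates.product(countries)

puts "days:       #{dates.size}"        # 120
puts "countries:  #{countries.size}"    # 20
puts "statements: #{partitions.size}"   # 2400

# The partition spec for the first statement, as it appears in the query.
date, country = partitions.first
puts "dated='#{date}', country = '#{country}'"
```

<p>At one query per partition the number of jobs grows quickly, which is worth bearing in mind for tables with more partitions than this one.</p>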
<p>You can actually download and compress the data directly to HDFS as Hive does not know what data is inside the folders on HDFS, just their layout but I thought better to do it via hive and let Hadoop parallelise it. I would have carried on doing it this way but with other tables it was too slow (too many partitions, difficult to parallelise hive server). I stopped using rbhive, dropped to using hive -e to execute the querys and used the lovely autopartitioning in later hive versions. Notice you can SELECT * now and it automatically does what it needs to to insert results into the correct partitions. </p>
<p><div id="gist-1002077" class="gist">

        <div class="gist-file">
          <div class="gist-data gist-syntax">
              <div class="highlight"><pre><div class='line' id='LC1'><span class="nb">require</span> <span class="s1">&#39;rubygems&#39;</span></div><div class='line' id='LC2'><span class="nb">require</span> <span class="s1">&#39;date&#39;</span></div><div class='line' id='LC3'><br/></div><div class='line' id='LC4'><span class="n">countrys</span> <span class="o">=</span> <span class="sx">%w[at au br de dk es fr in int it jp kr mx nl no pl pt ru se uk us za]</span></div><div class='line' id='LC5'><br/></div><div class='line' id='LC6'><span class="n">dates</span> <span class="o">=</span> <span class="p">(</span><span class="no">Date</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s1">&#39;2010-12-02&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">.</span><span class="no">Date</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span><span class="s1">&#39;2011-05-01&#39;</span><span class="p">))</span></div><div class='line' id='LC7'><br/></div><div class='line' id='LC8'><span class="n">dates</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">date</span><span class="o">|</span></div><div class='line' id='LC9'>&nbsp;&nbsp;<span class="n">query</span> <span class="o">=</span> <span class="s2">&quot;&quot;</span></div><div class='line' id='LC10'>&nbsp;&nbsp;<span class="n">query</span> <span class="o">+=</span> <span class="s2">&quot;SET hive.exec.compress.output=true;&quot;</span></div><div class='line' id='LC11'>&nbsp;&nbsp;<span class="n">query</span> <span class="o">+=</span> <span class="s2">&quot;SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;&quot;</span></div><div class='line' id='LC12'>&nbsp;&nbsp;<span class="n">query</span> <span class="o">+=</span> <span class="s2">&quot;set mapred.job.priority=VERY_LOW;&quot;</span> </div><div class='line' 
id='LC13'>&nbsp;&nbsp;<span class="n">query</span> <span class="o">+=</span> <span class="s2">&quot;set hive.exec.dynamic.partition=true;&quot;</span></div><div class='line' id='LC14'>&nbsp;&nbsp;<span class="n">query</span> <span class="o">+=</span> <span class="s2">&quot;set mapred.output.compress=true;&quot;</span></div><div class='line' id='LC15'>&nbsp;&nbsp;<span class="n">query</span> <span class="o">+=</span> <span class="s2">&quot;set mapred.compress.map.output=true;&quot;</span></div><div class='line' id='LC16'>&nbsp;&nbsp;<span class="n">query</span> <span class="o">+=</span> <span class="s2">&quot;set hive.merge.mapredfiles=true;&quot;</span></div><div class='line' id='LC17'>&nbsp;&nbsp;<span class="n">query</span> <span class="o">+=</span> <span class="s2">&quot;set hive.merge.mapfiles=true;&quot;</span></div><div class='line' id='LC18'>&nbsp;&nbsp;<span class="n">query</span> <span class="o">+=</span> <span class="s2">&quot;insert overwrite table hourly_clicks </span></div><div class='line' id='LC19'><span class="s2">            partition (dated=&#39;</span><span class="si">#{</span><span class="n">date</span><span class="si">}</span><span class="s2">&#39;, country, hour) </span></div><div class='line' id='LC20'><span class="s2">            select * from hourly_clicks where dated=&#39;</span><span class="si">#{</span><span class="n">date</span><span class="si">}</span><span class="s2">&#39;&quot;</span></div><div class='line' id='LC21'>&nbsp;&nbsp;<span class="n">query</span> <span class="o">=</span> <span class="s2">&quot;hive -e </span><span class="se">\&quot;</span><span class="si">#{</span><span class="n">query</span><span class="si">}</span><span class="se">\&quot;</span><span class="s2">&quot;</span></div><div class='line' id='LC22'>&nbsp;&nbsp;<span class="nb">puts</span> <span class="s2">&quot;running </span><span class="si">#{</span><span class="n">query</span><span class="si">}</span><span class="s2">&quot;</span></div><div class='line' 
id='LC23'>&nbsp;&nbsp;<span class="sb">`</span><span class="si">#{</span><span class="n">query</span><span class="si">}</span><span class="sb">`</span></div><div class='line' id='LC24'><span class="k">end</span></div><div class='line' id='LC25'><br/></div></pre></div>
          </div>

          <div class="gist-meta">
            <a href="https://gist.github.com/raw/1002077/2a819261336a38866a1a9f505b9adef42ebd64c6/compress_hive_cli.rb" style="float:right;">view raw</a>
            <a href="https://gist.github.com/1002077#file_compress_hive_cli.rb" style="float:right;margin-right:10px;color:#666">compress_hive_cli.rb</a>
            <a href="https://gist.github.com/1002077">This Gist</a> brought to you by <a href="http://github.com">GitHub</a>.
          </div>
        </div>
</div>
<br />
The key difference is partition (dated=&#8217;#{date}&#8217;, country, hour): we have not specified a country or hour partition, so Hive will work them out automatically. This ran loads faster than looping over the partitions, as it lets Hive schedule lots more mapreduce jobs at once. If you also set hive.exec.dynamic.partition.mode=nonstrict you can leave out the partition values entirely (I did this as a test but kept the WHERE clause, I was scared to do it all at once!)</p>
<p>The reason I am not (very) worried about losing parallelism (a gzipped file cannot be split across mappers) is that some of our partitions contained big .csv&#8217;s and the output of INSERT OVERWRITE was multiple .gz files (it looked to me like as many as there were mappers, for example a 700M text file became ~10 .gz files), so they will still be read in parallel by mappers as the original CSV was.</p>
<p>I am open to suggestions about better ways to achieve this; doing it this way does not preclude doing something better later.</p>
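<p>To make the nonstrict variant concrete, here is a minimal sketch of how the script for one day could be built when every partition column is left dynamic. The table and column names come from the gist above; the method name nonstrict_recompress is made up for this example, and it assumes SELECT * returns the partition columns last and in partition order, which is what the SELECT * in the gist relies on too.</p>

```ruby
# Build a Hive script in which every partition column is dynamic.
# This needs hive.exec.dynamic.partition.mode=nonstrict, otherwise Hive
# insists on at least one static partition value.
# nonstrict_recompress is a hypothetical helper, not part of rbhive or Hive.
def nonstrict_recompress(table, partition_columns, where_clause)
  settings = [
    'hive.exec.compress.output=true',
    'mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec',
    'hive.exec.dynamic.partition=true',
    'hive.exec.dynamic.partition.mode=nonstrict'
  ]
  preamble = settings.map { |setting| "SET #{setting};" }.join(' ')
  statement = "INSERT OVERWRITE TABLE #{table} " +
              "PARTITION (#{partition_columns.join(', ')}) " +
              "SELECT * FROM #{table} WHERE #{where_clause}"
  "#{preamble} #{statement}"
end

puts nonstrict_recompress('hourly_clicks', %w[dated country hour], "dated='2011-05-01'")
```

<p>The resulting string can be handed to hive -e in the same way as in the loop above; keeping the WHERE clause limits each run to a single day.</p>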
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/06/01/compressing-text-tables-in-hive/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Finding information on Hive tables from HDFS</title>
		<link>http://www.thattommyhall.com/2011/05/16/hive-size-hdfs/</link>
		<comments>http://www.thattommyhall.com/2011/05/16/hive-size-hdfs/#comments</comments>
		<pubDate>Mon, 16 May 2011 16:42:07 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[hadoop]]></category>
		<category><![CDATA[hive]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=686</guid>
		<description><![CDATA[I was curious about our Hive tables&#8217; total usage on HDFS and what the average file size was with the current partitioning scheme, so I wrote this Ruby script to calculate it. Lots of our files were small so I am going to experiment with different partitioning and compression schemes.]]></description>
			<content:encoded><![CDATA[<p>I was curious about our Hive tables&#8217; total usage on HDFS and what the average file size was with the current partitioning scheme, so I wrote this Ruby script to calculate it.</p>
<p><div id="gist-974792" class="gist">

        <div class="gist-file">
          <div class="gist-data gist-syntax">
              <div class="highlight"><pre><div class='line' id='LC1'><span class="n">current</span> <span class="o">=</span> <span class="s1">&#39;&#39;</span></div><div class='line' id='LC2'><span class="n">file_count</span> <span class="o">=</span> <span class="mi">0</span></div><div class='line' id='LC3'><span class="n">total_size</span> <span class="o">=</span> <span class="mi">0</span></div><div class='line' id='LC4'><br/></div><div class='line' id='LC5'><span class="n">output</span> <span class="o">=</span> <span class="no">File</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="s1">&#39;output.csv&#39;</span><span class="p">,</span><span class="s1">&#39;w&#39;</span><span class="p">)</span></div><div class='line' id='LC6'><br/></div><div class='line' id='LC7'><span class="no">IO</span><span class="o">.</span><span class="n">popen</span><span class="p">(</span><span class="s1">&#39;hadoop fs -lsr /user/hive/warehouse&#39;</span><span class="p">)</span><span class="o">.</span><span class="n">each_line</span> <span class="k">do</span> <span class="o">|</span><span class="n">line</span><span class="o">|</span></div><div class='line' id='LC8'>&nbsp;&nbsp;<span class="nb">split</span> <span class="o">=</span> <span class="n">line</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="sr">/\s+/</span><span class="p">)</span></div><div class='line' id='LC9'>&nbsp;&nbsp;<span class="c1">#permissions,replication,user,group,size,mod_date,mod_time,path</span></div><div class='line' id='LC10'>&nbsp;&nbsp;<span class="k">next</span> <span class="k">unless</span> <span class="nb">split</span><span class="o">.</span><span class="n">size</span> <span class="o">==</span> <span class="mi">8</span></div><div class='line' id='LC11'>&nbsp;&nbsp;<span class="n">path</span> <span class="o">=</span> <span class="nb">split</span><span class="o">[</span><span class="mi">7</span><span 
class="o">]</span></div><div class='line' id='LC12'>&nbsp;&nbsp;<span class="n">size</span> <span class="o">=</span> <span class="nb">split</span><span class="o">[</span><span class="mi">4</span><span class="o">]</span></div><div class='line' id='LC13'>&nbsp;&nbsp;<span class="n">permissions</span> <span class="o">=</span> <span class="nb">split</span><span class="o">[</span><span class="mi">0</span><span class="o">]</span></div><div class='line' id='LC14'>&nbsp;&nbsp;<span class="n">tablename</span><span class="o">=</span><span class="n">path</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s1">&#39;/&#39;</span><span class="p">)</span><span class="o">[</span><span class="mi">4</span><span class="o">]</span></div><div class='line' id='LC15'>&nbsp;&nbsp;<span class="k">if</span> <span class="n">tablename</span> <span class="o">!=</span> <span class="n">current</span></div><div class='line' id='LC16'>&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">average_size</span> <span class="o">=</span> <span class="n">file_count</span> <span class="o">==</span> <span class="mi">0</span> <span class="o">?</span> <span class="mi">0</span> <span class="p">:</span> <span class="n">total_size</span><span class="o">/</span><span class="n">file_count</span></div><div class='line' id='LC17'>&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">result</span> <span class="o">=</span> <span class="s2">&quot;</span><span class="si">#{</span><span class="n">current</span><span class="si">}</span><span class="s2">,</span><span class="si">#{</span><span class="n">file_count</span><span class="si">}</span><span class="s2">,</span><span class="si">#{</span><span class="n">total_size</span><span class="si">}</span><span class="s2">,</span><span class="si">#{</span><span class="n">average_size</span><span class="si">}</span><span class="s2">&quot;</span></div><div class='line' id='LC18'>&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">unless</span> <span 
class="n">current</span><span class="o">==</span><span class="s1">&#39;&#39;</span></div><div class='line' id='LC19'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="nb">puts</span> <span class="n">result</span></div><div class='line' id='LC20'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">output</span><span class="o">.</span><span class="n">puts</span> <span class="n">result</span></div><div class='line' id='LC21'>&nbsp;&nbsp;&nbsp;&nbsp;<span class="k">end</span></div><div class='line' id='LC22'>&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">total_size</span> <span class="o">=</span> <span class="mi">0</span></div><div class='line' id='LC23'>&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">current</span> <span class="o">=</span> <span class="n">tablename</span></div><div class='line' id='LC24'>&nbsp;&nbsp;&nbsp;&nbsp;<span class="n">file_count</span> <span class="o">=</span> <span class="mi">0</span></div><div class='line' id='LC25'>&nbsp;&nbsp;<span class="k">end</span></div><div class='line' id='LC26'>&nbsp;&nbsp;<span class="n">file_count</span> <span class="o">+=</span> <span class="mi">1</span> <span class="k">unless</span> <span class="n">permissions</span><span class="o">[</span><span class="mi">0</span><span class="o">]</span> <span class="o">==</span> <span class="s1">&#39;d&#39;</span></div><div class='line' id='LC27'>&nbsp;&nbsp;<span class="n">total_size</span> <span class="o">+=</span> <span class="n">size</span><span class="o">.</span><span class="n">to_i</span></div><div class='line' id='LC28'><span class="k">end</span></div></pre></div>
          </div>

          <div class="gist-meta">
            <a href="https://gist.github.com/raw/974792/f07207bc776f669d93e9d669f7fee539057d5613/hive_info.rb" style="float:right;">view raw</a>
            <a href="https://gist.github.com/974792#file_hive_info.rb" style="float:right;margin-right:10px;color:#666">hive_info.rb</a>
            <a href="https://gist.github.com/974792">This Gist</a> brought to you by <a href="http://github.com">GitHub</a>.
          </div>
        </div>
</div>
</p>
<p>Lots of our files were small so I am going to experiment with different partitioning and compression schemes.</p>
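<p>One thing to watch in the script above: a table&#8217;s line is only printed when the next table starts, so the last table in the listing never gets written out. A minimal sketch of an alternative that collects the totals in a hash first is below. The method name table_stats and the sample listing are made up for illustration; the field layout is the one noted in the script&#8217;s comment.</p>

```ruby
# Aggregate `hadoop fs -lsr` output into per-table file counts and sizes.
# table_stats is a hypothetical helper; it takes any enumerable of lines.
def table_stats(lines)
  stats = Hash.new { |hash, table| hash[table] = { files: 0, bytes: 0 } }
  lines.each do |line|
    fields = line.split(/\s+/)
    # permissions, replication, user, group, size, mod_date, mod_time, path
    next unless fields.size == 8
    permissions, size, path = fields[0], fields[4], fields[7]
    # start_with? gives the same answer on every Ruby version, whereas
    # permissions[0] returns a character code on Ruby 1.8.
    next if permissions.start_with?('d')
    table = path.split('/')[4]   # /user/hive/warehouse/TABLE/...
    next if table.nil?
    stats[table][:files] += 1
    stats[table][:bytes] += size.to_i
  end
  stats
end

sample = [
  'drwxr-xr-x   - hive supergroup          0 2011-05-16 16:42 /user/hive/warehouse/clicks',
  '-rw-r--r--   3 hive supergroup       1000 2011-05-16 16:42 /user/hive/warehouse/clicks/part-00000',
  '-rw-r--r--   3 hive supergroup       3000 2011-05-16 16:42 /user/hive/warehouse/clicks/part-00001',
  '-rw-r--r--   3 hive supergroup        500 2011-05-16 16:42 /user/hive/warehouse/views/part-00000'
]

table_stats(sample).each do |table, s|
  puts "#{table},#{s[:files]},#{s[:bytes]},#{s[:bytes] / s[:files]}"
end
# prints:
# clicks,2,4000,2000
# views,1,500,500
```

<p>To run it against the real warehouse, pass IO.popen('hadoop fs -lsr /user/hive/warehouse').each_line in place of the sample array.</p>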
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/05/16/hive-size-hdfs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Running &#8211;repair on MongoDB via Upstart</title>
		<link>http://www.thattommyhall.com/2011/05/13/running-repair-on-mongodb-via-upstart/</link>
		<comments>http://www.thattommyhall.com/2011/05/13/running-repair-on-mongodb-via-upstart/#comments</comments>
		<pubDate>Fri, 13 May 2011 18:46:01 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[devops]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[mongodb]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=672</guid>
		<description><![CDATA[One of our servers running MongoDB crashed today and we encountered the typical old lock file: /var/lib/mongodb/mongod.lock. probably means unclean shutdown recommend removing file and running &#8211;repair see: http://dochub.mongodb.org/core/repair for more information As the docs do not seem to have much of an alternative to running &#8211;repair I looked for a way to automate it [...]]]></description>
			<content:encoded><![CDATA[<p>One of our servers running MongoDB crashed today and we encountered the typical </p>
<blockquote><p>old lock file: /var/lib/mongodb/mongod.lock.  probably means unclean shutdown<br />
recommend removing file and running &#8211;repair<br />
see: http://dochub.mongodb.org/core/repair for more information
</p></blockquote>
<p>As the docs do not seem to have much of an alternative to running &#8211;repair I looked for a way to automate it from upstart. Mongo creates a mongod.lock file in the data directory containing the process&#8217;s PID and, on a safe shutdown, removes the PID but leaves the file in place. </p>
<p>This upstart script includes a pre-start script that checks if the lock file exists, reads it, makes sure there is a PID in it, makes sure no mongod process exists with that PID, then performs the repair as the mongodb user. </p>
<p><div id="gist-971056" class="gist">

        <div class="gist-file">
          <div class="gist-data gist-syntax">
              <div class="highlight"><pre><div class='line' id='LC1'>limit nofile 20000 20000</div><div class='line' id='LC2'><br/></div><div class='line' id='LC3'>kill timeout 300</div><div class='line' id='LC4'><br/></div><div class='line' id='LC5'>env MONGO_DATA=/var/lib/mongodb/</div><div class='line' id='LC6'>env MONGO_LOGS=/var/log/mongodb/</div><div class='line' id='LC7'>env MONGO_EXE=/usr/bin/mongod</div><div class='line' id='LC8'>env MONGO_CONF=/etc/mongodb.conf</div><div class='line' id='LC9'><br/></div><div class='line' id='LC10'>pre-start script</div><div class='line' id='LC11'>&nbsp;&nbsp;mkdir -p $MONGO_DATA</div><div class='line' id='LC12'>&nbsp;&nbsp;mkdir -p $MONGO_LOGS</div><div class='line' id='LC13'>&nbsp;&nbsp;if [ -f $MONGO_DATA/mongod.lock ]; then</div><div class='line' id='LC14'>&nbsp;&nbsp;&nbsp;&nbsp;mongo_pid=`cat $MONGO_DATA/mongod.lock`</div><div class='line' id='LC15'>&nbsp;&nbsp;&nbsp;&nbsp;if [ ! -z $mongo_pid ]; then</div><div class='line' id='LC16'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if [ ! 
`pgrep mongo | grep &quot;$mongo_pid&quot; | wc -l` -gt 0 ]; then</div><div class='line' id='LC17'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;rm $MONGO_DATA/mongod.lock</div><div class='line' id='LC18'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sudo -u mongodb /usr/bin/mongod --config /etc/mongodb.conf --repair</div><div class='line' id='LC19'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;touch $MONGO_DATA/repaired-`date &quot;+%Y%m%d-%H%M%S&quot;`</div><div class='line' id='LC20'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fi</div><div class='line' id='LC21'>&nbsp;&nbsp;&nbsp;&nbsp;fi</div><div class='line' id='LC22'>&nbsp;&nbsp;fi  </div><div class='line' id='LC23'>end script</div><div class='line' id='LC24'><br/></div><div class='line' id='LC25'>start on runlevel [2345]</div><div class='line' id='LC26'>stop on runlevel [06]</div><div class='line' id='LC27'><br/></div><div class='line' id='LC28'>script</div><div class='line' id='LC29'>&nbsp;&nbsp;if [ -f /etc/default/mongodb ]; then . /etc/default/mongodb; fi</div><div class='line' id='LC30'>&nbsp;&nbsp;exec start-stop-daemon --start --quiet --chuid mongodb --exec  $MONGO_EXE -- --config $MONGO_CONF</div><div class='line' id='LC31'>end script</div></pre></div>
          </div>

          <div class="gist-meta">
            <a href="https://gist.github.com/raw/971056/6fae5ce2e92c63e4b1a1f0d820a8072c861dd2f3/mongodb.conf" style="float:right;">view raw</a>
            <a href="https://gist.github.com/971056#file_mongodb.conf" style="float:right;margin-right:10px;color:#666">mongodb.conf</a>
            <a href="https://gist.github.com/971056">This Gist</a> brought to you by <a href="http://github.com">GitHub</a>.
          </div>
        </div>
</div>
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/05/13/running-repair-on-mongodb-via-upstart/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>We are all DevOps</title>
		<link>http://www.thattommyhall.com/2011/04/04/we-are-all-devops/</link>
		<comments>http://www.thattommyhall.com/2011/04/04/we-are-all-devops/#comments</comments>
		<pubDate>Mon, 04 Apr 2011 17:12:15 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[devops]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=662</guid>
		<description><![CDATA[I gave a talk recently at the Forward Tech away day entitled We Are All DevOps and it went down quite well. Forward is an unusual environment, the devs are trusted to do lots of the typical sysadmin role and the boundary between Dev and Ops is very blurred. During my first few months in [...]]]></description>
			<content:encoded><![CDATA[<p>I gave a talk recently at the Forward Tech away day entitled We Are All DevOps and it went down quite well. Forward is an unusual environment, the devs are trusted to do lots of the typical sysadmin role and the boundary between Dev and Ops is very blurred. During my first few months in the search team I kept mindmapping stuff I wanted to talk about but only got round to making the slides the day before so it was a bit underprepared but I hope useful for people. </p>
<p>I borrowed ideas from John Leach&#8217;s excellent <a href="http://video2010.scottishrubyconference.com/show_video/6/1">Ruby: Reinventing the Wheel</a> talk, this <a href="http://www.slideshare.net/jedi4ever/devops-the-war-is-over-if-you-want-it">DevOps: The War Is Over</a> presentation and rambled incoherently about a talk I just saw at the UKUUG Spring Conference from the author of cfengine, see <a href="http://www.cfengine.org/pages/science">here</a> for a nice description of the project (you can see how it has influenced <a href="http://www.puppetlabs.com/">Puppet</a>). </p>
<p>Here are the slides (first time I have used Scribd, it is excellent. Much better than slideshare)<br />
<a title="View DevOps on Scribd" href="http://www.scribd.com/doc/52256587/DevOps" style="margin: 12px auto 6px auto; font-family: Helvetica,Arial,Sans-serif; font-style: normal; font-variant: normal; font-weight: normal; font-size: 14px; line-height: normal; font-size-adjust: none; font-stretch: normal; -x-system-font: none; display: block; text-decoration: underline;">DevOps</a><iframe class="scribd_iframe_embed" src="http://www.scribd.com/embeds/52256587/content?start_page=1&#038;view_mode=slideshow&#038;access_key=key-20ddphiwt0x32nqhp6br" data-auto-height="true" data-aspect-ratio="1.33333333333333" scrolling="no" id="doc_72344" width="100%" height="600" frameborder="0"></iframe><script type="text/javascript">(function() { var scribd = document.createElement("script"); scribd.type = "text/javascript"; scribd.async = true; scribd.src = "http://www.scribd.com/javascripts/embed_code/inject.js"; var s = document.getElementsByTagName("script")[0]; s.parentNode.insertBefore(scribd, s); })();</script></p>
<p>I like the <a href="http://blog.websages.com/2010/12/10/jameswhite-manifesto/">James White Manifesto</a>, it chimes really strongly with me.</p>
<p>In particular</p>
<blockquote><p> On Infrastructure<br />
 &#8212;&#8212;&#8212;&#8212;&#8212;&#8211;<br />
 There is one system, not a collection of systems.<br />
 The desired state of the system should be a known quantity.<br />
 The &#8220;known quantity&#8221; must be machine parseable.<br />
 The actual state of the system must self-correct to the desired state.<br />
 The only authoritative source for the actual state of the system is the system.<br />
 The entire system must be deployable using source media and text files.</p></blockquote>
<p>Soon they will post videos and I will get to see myself give a talk  for the first time.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/04/04/we-are-all-devops/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>101 goals in 1001 days &#8211; Day 400 Update</title>
		<link>http://www.thattommyhall.com/2011/02/27/101in1001-day400/</link>
		<comments>http://www.thattommyhall.com/2011/02/27/101in1001-day400/#comments</comments>
		<pubDate>Sun, 27 Feb 2011 18:11:37 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[101]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=632</guid>
		<description><![CDATA[Well, day 400 of my 101 goals in 1001 days was Feb 5th and I was in the midst of moving house so delayed doing this. Completed &#8211; 16 1, Teetotalitarianism for 3 months 2, Cheeseless for 3 months 9, Read GEB 11, Reread all Dennett books 15, Proofread for Project Gutenberg 48, Create a Backblaze storage [...]]]></description>
			<content:encoded><![CDATA[<p>Well, day 400 of my 101 goals in 1001 days was Feb 5th and I was in the midst of moving house so delayed doing this.</p>
<p><strong>Completed &#8211; 16</strong><br />
<em>1, Teetotalitarianism for 3 months</em><br />
<em>2, Cheeseless for 3 months</em><br />
<em>9, Read <a href="http://en.wikipedia.org/wiki/G%C3%B6del,_Escher,_Bach">GEB</a></em><br />
<em>11, Reread all <a href="http://en.wikipedia.org/wiki/Daniel_Dennett">Dennett</a> books</em><br />
<em>15, Proofread for <a href="http://www.gutenberg.org/wiki/Main_Page">Project Gutenberg</a></em><br />
<em>48, <a href="http://ukblazers.com/2010/08/25/test-build-easier-than-i-thought/">Create a Backblaze storage pod</a></em><br />
<em>53, <a href="http://www.thattommyhall.com/2010/04/15/making-lemon-curd/">Make Jam</a></em><br />
<em>66, Via Ferrata in Italy</em><br />
<em>78, Learn to use Emacs</em><br />
I suppose you can never fully learn it but I do use it for my development now<br />
<em>82, <a href="http://www.thattommyhall.com/2010/12/16/egypt-trip/">Visit Egypt</a></em><br />
<em>83, Re-visit Louvre</em><br />
<em>85, Visit Pergamon Museum</em><br />
<em>86, Give Carrie a British Museum Tour</em><br />
<em>92, Read &#8220;<a href="http://www.amazon.co.uk/Ode-Less-Travelled-Unlocking-Within/dp/009179661X" rel="nofollow">An Ode Less Travelled</a>&#8220;, do the exercises (but not share them!)</em><br />
Read it while in Egypt.<br />
<em>97, Be 1/3 through in 2010<br />
100, Set success criteria / progression metrics for each goal</em></p>
<p><strong>On Track &#8211; 16</strong><br />
<em>5, Lose 2 stone</em><br />
<em>10, Write book reviews for each book I read</em><br />
Where I haven&#8217;t yet, I have added a task to rememberthemilk to do so<br />
<em>13, Release 303 books on bookcrossing.com</em><br />
88 available <a href="http://www.bookcrossing.com/mybookshelf/thattommyhall/available">here</a>, let me know if you like any and I will post them to you.<br />
<em>19, Blog on average once a week<br />
50, Move 10 people to FreeAgent</em><br />
<em>68, Complete Pimsleur German</em><br />
Changed from Spanish as I now live with a lovely German lady.<br />
<em>72, Read &#8220;Winning Ways&#8221;</em><br />
read 1/2 of part 1 (of 4)<br />
<em>74, Read AI: A Modern Approach</em><br />
<em>75, Watch SICP, do exercises from book</em><br />
Started a book club in work, seems to have stalled but I&#8217;ll start banging the drum again now I&#8217;ve settled in my new house.<br />
<em>76, Do on average 1 <a href="http://projecteuler.net/index.php?section=about">Project Euler</a> problem per week</em><br />
<em>77, Complete &#8220;Real World Haskell&#8221;</em><br />
<em>88, Go to the theatre on average once a month</em><br />
Way ahead on this, started a monthly theatre club but we managed to schedule a dozen things for the first few months of 2011<br />
<em>91, Memorise 10 poems</em><br />
Not quite settled on the 10 but between listening to Jorge Luis Borges, This Craft Of Verse and The Ode Less Travelled I have quite a list to choose from.<br />
<em>95, Pay off all credit cards<br />
96, Let loans run their course and don&#8217;t get any more<br />
101, Do 100 day updates</em><br />
This is one right <img src='http://www.thattommyhall.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p><strong>Behind &#8211; 4</strong><br />
8, Read all the VSIs<br />
12, Read all PG Wodehouse<br />
81, Watch all TTC Art history DVDs<br />
90, See all world heritage sites in the UK</p>
<p><strong>Changing &#8211; 5</strong><br />
Lots of the work-related ones don&#8217;t make sense any more now that I have gone full time and moved into development, so I am making the following changes.<br />
<em>43, Visit the <a href="http://www.rijksmuseum.nl/">Rijksmuseum</a></em> (was Get CCNP)<br />
<em>44, Visit The <a href="http://www.uffizi.com/">Uffizi</a> in Florence</em> (was Get CCEE)<br />
<em>45, Give blood every 20 weeks</em> (was Get MCITP &#8211; Enterprise Admin)<br />
<em>46, Listen to Radio 4 / British Museum &#8211; <a href="http://www.bbc.co.uk/ahistoryoftheworld/">A History of the World in 100 Objects</a> and view each of them</em> (was Get VCAP)<br />
The above are all taken from a mate who just did his own 101 list.<br />
<em>47, Make a Munro bagging site in Rails</em> (was Say to a recruiter &#8220;I don&#8217;t work &lt;MonthName&gt;&#8221; and turn down work)</p>
<p><strong>Planning &#8211; 12 </strong><br />
<em>60, Hike on average once a month<br />
61, Do a UK long distance path<br />
67, Do another alpine 4000m peak<br />
62, Do a big hike in Europe<br />
64, Climb a continental highest mountain<br />
33, Safari<br />
20, Organise a big bash for my 30th</em><br />
The fitness aspect of these goals is where I am behind the most (though I am still a stone lighter than when I started), so I am concentrating the next six months on these goals, ending with summiting Kilimanjaro for my 30th then returning to a big party.<br />
<em>35, Visit 5 Michelin 3* restaurants</em><br />
<em>37, Visit Porto</em><br />
Will go with Petra in the spring<br />
<em>84, Revisit Met Museum</em><br />
A good mate has just moved to NYC so this should happen as soon as he is settled.<br />
<em>89, Return to the Theatre by the lake</em><br />
My first trip with Petra was to here and we loved it. Will be going in the spring.<br />
<em>94, Go to Edinburgh festival</em><br />
Will go at the beginning of August.</p>
<p><em><strong>Not Started &#8211; 48</strong><br />
3, Do a marathon<br />
4, Do a triathlon<br />
6, Attend martial arts classes for 3 months<br />
7, Write an article for Plus new writers<br />
14, Read a short story for librivox<br />
16, Send Dennett a letter<br />
17, Send Dawkins a letter<br />
18, Read Joyce<br />
21, Read GTD<br />
22, Spend 3 months in another country<br />
23, Organise all my DVDs<br />
24, Swim with sharks<br />
25, Paraglide<br />
26, Learn to play bongos<br />
27, Skydive<br />
28, Drive Offroad<br />
29, Do a banger rally<br />
30, Have a track day<br />
31, Hire the whole of Salvos Salumeria for an evening<br />
32, Bungee Jump<br />
34, Vineyard tour<br />
36, See Northern Lights<br />
38, Take dad to an opera<br />
39, Take Mum, Dad and Carrie to the Welsh Mountain Zoo<br />
40, Do 1000 things in London<br />
41, Do a standup comedy course<br />
42, Visit Japan<br />
49, Work only 100 days in a year<br />
51, Investigate Visa situation for Australia<br />
52, Investigate Visa situation for US<br />
54, Grow mushrooms<br />
55, Paint a water colour<br />
56, Make beer<br />
57, Make wine<br />
58, Cook a 4 course meal for 20 friends<br />
59, Do a photography course<br />
63, Attend NIM<br />
69, Learn to dance<br />
70, Learn to play golf<br />
71, Learn 10 magic tricks<br />
73, Make a Dots and Boxes program<br />
79, Raise £5005 for charity<br />
80, Talk about Free Software at a school<br />
87, Go on wine tasting course<br />
93, Go to Melbourne Comedy Festival<br />
99, Have a completion party<br />
65, Volunteer for the Mountain Bothies Association<br />
98, Have done 2/3 by day 666<br />
</em></p>
<p>I am quite heartened by the progress to be honest, considering that I spent half of last year working outside the UK. Now that things are settling down I should be able to churn through them faster.</p>
<p>If you want to join in on some, let me know!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/02/27/101in1001-day400/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Signals In Ruby / &#8220;rescue Exception&#8221; considered harmful</title>
		<link>http://www.thattommyhall.com/2011/02/24/rescue-exception-harmful-signals-in-ruby/</link>
		<comments>http://www.thattommyhall.com/2011/02/24/rescue-exception-harmful-signals-in-ruby/#comments</comments>
		<pubDate>Thu, 24 Feb 2011 18:35:43 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=611</guid>
		<description><![CDATA[Yesterday we had an issue with the different behaviour of &#8220;kill &lt;OUR_APP&gt;&#8221; and &#8220;kill -9 &lt;OUR_APP&gt;&#8221; and in the process I had to refresh my knowledge of Unix signals, learn how you handle them in Ruby and properly learn Ruby&#8217;s exception hierarchy. To -9 or not to -9? The Unix kill command is perhaps strangely [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday we had an issue with the different behaviour of &#8220;kill &lt;OUR_APP&gt;&#8221; and &#8220;kill -9 &lt;OUR_APP&gt;&#8221;. In the process I had to refresh my knowledge of Unix signals, learn how to handle them in Ruby and properly learn Ruby&#8217;s exception hierarchy.</p>
<p><strong>To -9 or not to -9?</strong><br />
The Unix kill command is perhaps strangely named, as what it actually does is send signals to processes (see &#8220;man signal&#8221; for a full list). It defaults to sending SIGTERM, and the application writer can decide how to treat that by &#8220;trapping&#8221; it, allowing for a safe shutdown, debug dumps etc. &#8220;kill -9&#8221; sends SIGKILL, and SIGKILL and SIGSTOP cannot be caught, blocked or ignored by your program.<br />
I think in the first instance you should just use &#8220;kill&#8221; and give the app the chance to do the right thing, then get -9 on its ass if you need to.</p>
<p><strong>Catching signals in Ruby</strong></p>
<pre class="brush: ruby; title: ; notranslate">puts &quot;I have PID #{Process.pid}&quot;

Signal.trap(&quot;USR1&quot;) {puts &quot;prodded me&quot;}

loop do
  sleep 5
  puts &quot;doing stuff&quot;
end</pre>
<p>This is about the simplest code that will trap the &#8220;USR1&#8221; signal (which you can send with &#8220;kill -USR1 &lt;PID&gt;&#8221;, using the PID the script prints). The USR1 and USR2 signals are left free for you to use however you wish in your applications.</p>
<p>If you look at the image below you can see that it responds to the USR1 signal I send it and kill (ie sending SIGTERM) works also.<br />
<a href="http://www.thattommyhall.com/wp-content/uploads/2011/02/1-simple-small.png"><img src="http://www.thattommyhall.com/wp-content/uploads/2011/02/1-simple-small.png" alt="" title="1-simple-small" width="661" height="168" class="alignleft size-full wp-image-618" /></a></p>
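<p>The same mechanism covers the &#8220;safe shutdown&#8221; case mentioned earlier. This is only a minimal sketch, assuming a Unix-like system: the flag name and the two second deadline are made up for the demo, and the script sends SIGTERM to itself so it can run unattended.</p>

```ruby
# Flag flipped by the handler; the main code decides when to act on it.
shutdown_requested = false

# Keep the handler tiny: it only records that SIGTERM arrived.
Signal.trap("TERM") { shutdown_requested = true }

# Stand-in for someone running "kill" against this process.
Process.kill("TERM", Process.pid)

# Wait for the handler to run, giving up after a (made up) 2 second deadline.
deadline = Time.now + 2
sleep 0.05 until shutdown_requested || Time.now > deadline

puts "cleaning up before exit" if shutdown_requested
```

<p>Setting a flag and doing the real cleanup in the main loop keeps the work done inside the handler to a minimum.</p>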
<p>The following two code snippets are the same except that the first uses a bare rescue, which catches StandardError and its subclasses, and the second rescues Exception (ie <strong>any</strong> exception).</p>
<pre class="brush: ruby; title: ; notranslate">#sig-rescue.rb
puts &quot;I have PID #{Process.pid}&quot;

Signal.trap(&quot;USR1&quot;) {puts &quot;prodded me&quot;}

loop do
  begin
    puts &quot;doing stuff&quot;
    sleep 10
  rescue =&gt; e # bare rescue: StandardError and its subclasses only
    puts e.inspect
  end
end</pre>
<p><a href="http://www.thattommyhall.com/wp-content/uploads/2011/02/2-rescue-small.png"><img src="http://www.thattommyhall.com/wp-content/uploads/2011/02/2-rescue-small.png" alt="" title="2-rescue-small" width="662" height="140" class="alignleft size-full wp-image-619" /></a><br />
So that still works as before and errors in our &#8220;do stuff&#8221; loop would get caught.</p>
<pre class="brush: ruby; title: ; notranslate">#sig-rescue-E.rb
puts &quot;I have PID #{Process.pid}&quot;

Signal.trap(&quot;USR1&quot;) {puts &quot;prodded me&quot;}

loop do
  begin
    puts &quot;doing stuff&quot;
    sleep 10
  rescue Exception =&gt; e # rescues everything, including SignalException
    puts e.inspect
  end
end</pre>
<p><a href="http://www.thattommyhall.com/wp-content/uploads/2011/02/3-rescue-E-small.png"><img src="http://www.thattommyhall.com/wp-content/uploads/2011/02/3-rescue-E-small.png" alt="" title="3-rescue-E-small" width="664" height="212" class="alignleft size-full wp-image-620" /></a><br />
This fails though. You can see that SIGTERM no longer works, and CTRL-C from the terminal does not work either. This is because Ruby delivers both to the main thread as exceptions (SignalException and its subclass Interrupt), and &#8220;rescue Exception&#8221; swallows them. Kill -9 still worked, as SIGKILL cannot be caught by any application.</p>
<p><strong>Ruby&#8217;s Exception Hierarchy</strong><br />
The full exception hierarchy (from the excellent <a href="http://blog.nicksieger.com/articles/2006/09/06/rubys-exception-hierarchy">cheat gem</a>) is:</p>
<pre class="brush: plain; title: ; notranslate">Tom-Halls-MacBook-Pro:signal tomh$ cheat exceptions
exceptions:
  Exception
   NoMemoryError
   ScriptError
     LoadError
     NotImplementedError
     SyntaxError
   SignalException
     Interrupt
       Timeout::Error    # require 'timeout' for Timeout::Error
   StandardError         # caught by rescue if no type is specified
     ArgumentError
     IOError
       EOFError
     IndexError
     LocalJumpError
     NameError
       NoMethodError
     RangeError
       FloatDomainError
     RegexpError
     RuntimeError
     SecurityError
     SocketError
     SystemCallError
     SystemStackError
     ThreadError
     TypeError
     ZeroDivisionError
   SystemExit
   fatal
</pre>
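<p>I think you should only rescue StandardError or its children, possibly some of its siblings, and avoid rescuing Exception, as you probably don&#8217;t want to change how the process deals with signals (you can trap them explicitly if you need to). Note that the listing above reflects Ruby 1.8; from Ruby 1.9 onwards Timeout::Error is a subclass of RuntimeError, so a bare rescue catches it.</p>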
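<p>You can check which rescue clause sees what without sending any real signals. In this small sketch caught_by is a hypothetical helper name of mine; it raises the exception it is given and reports which clause handled it. &#8220;rescue StandardError&#8221; is what a bare rescue amounts to.</p>

```ruby
# Reports which rescue clause ends up handling the given exception.
def caught_by(exception)
  begin
    raise exception
  rescue StandardError
    # Same reach as a bare "rescue".
    "rescue StandardError"
  end
rescue Exception
  # Only gets here if the inner clause did not match.
  "rescue Exception"
end

puts caught_by(ArgumentError.new("bad input"))  # an ordinary error
puts caught_by(Interrupt.new("ctrl-c"))         # what CTRL-C raises
puts caught_by(SignalException.new("TERM"))     # what an untrapped SIGTERM raises
puts caught_by(SystemExit.new)                  # what Kernel#exit raises
```

<p>Only the first call is handled by the StandardError clause; the other three fall through to &#8220;rescue Exception&#8221;, which is exactly why that clause stops kill and CTRL-C from ending the process.</p>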
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/02/24/rescue-exception-harmful-signals-in-ruby/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Ruby On Windows &#8211; Forking other processes</title>
		<link>http://www.thattommyhall.com/2011/02/20/ruby-on-windows-running-other-executables/</link>
		<comments>http://www.thattommyhall.com/2011/02/20/ruby-on-windows-running-other-executables/#comments</comments>
		<pubDate>Sun, 20 Feb 2011 23:08:11 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[VMware]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=540</guid>
		<description><![CDATA[While moving our VM deployment site written in Sinatra to a Windows machine with the VMware PowerCLI toolkit installed the only snag was where we forked a process to do the preparation of the machines. Both Kernel.fork and Process.detach seemed to have issues. Original MRI on Linux IronRuby We tried IronRuby and the same bit [...]]]></description>
			<content:encoded><![CDATA[<p>While moving our VM deployment site, written in Sinatra, to a Windows machine with the VMware PowerCLI toolkit installed, the only snag was where we forked a process to do the preparation of the machines. Kernel.fork is not implemented on Windows, and Process.detach seemed to have issues too.</p>
<p><strong>Original MRI on Linux<br />
</strong></p>
<pre class="brush: ruby; title: ; notranslate">
  def build
    pid = fork { run_command }
    Process.detach(pid)
  end

  def run_command
    `sudo /opt/script/deployserver/setupnewserver.sh -p #{poolserver} -i #{ip} -s #{@size} -v #{@vlan} -a &quot;#{@owner}&quot; -n #{@name} -e &quot;#{@email}&quot;`
  end
</pre>
<p><strong>IronRuby</strong><br />
We tried IronRuby and the same bit of the script broke as on win32 MRI (though I was pleased and surprised that Sinatra worked), so we started PowerShell through .NET instead:</p>
<pre class="brush: ruby; title: ; notranslate">
  def build
    WindowsProcess.start &quot;C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe&quot;,
&quot;-PSConsoleFile \&quot;C:\\Program Files (x86)\\VMware\\Infrastructure\\vSphere PowerCLI\\vim.psc1\&quot; \&quot;&amp; C:\\script\\DataStoreUsage.ps1\&quot;&quot;
  end
</pre>
<p>WindowsProcess is a small wrapper around .NET&#8217;s System.Diagnostics.Process class:</p>
<pre class="brush: ruby; title: ; notranslate">
class WindowsProcess
  def self.start(file, arguments)
    process = System::Diagnostics::Process.new
    process.StartInfo.FileName = file
    process.StartInfo.CreateNoWindow = true
    process.StartInfo.Arguments = arguments
    process.Start
  end
end
</pre>
<p><strong>Workaround using Windows &#8220;start&#8221; command</strong><br />
I had hoped the module at <a href="http://win32utils.rubyforge.org/">win32utils</a> would let me just use the original script, but fork still did not work properly.</p>
<pre class="brush: ruby; title: ; notranslate">
def build
  commandstr = &quot;C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe -PSConsoleFile \&quot;C:\\Program Files (x86)\\VMware\\Infrastructure\\vSphere PowerCLI\\vim.psc1\&quot; \&quot;&amp; C:\\Sites\\vmdeploy\\PrepNewMachine.ps1 -type #{@type} -machinename #{@name} -size #{@size} -vlan #{@vlan} -creator #{@owner} -creatoremail #{@email} -ipaddress #{ip}&quot;

  system(&quot;start #{commandstr} &gt; ./log/#{@name}.log 2&gt;&amp;1&quot;)
end
</pre>
<p>This uses the Windows &#8220;start&#8221; command, which launches the program in a separate process and returns straight away, and it works pretty well.</p>
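<p>For what it is worth, Ruby 1.9 and later also have Process.spawn, which starts a child process without relying on fork. I have not tried it in this app, so treat the following as a sketch of an alternative, not what we deployed. run_in_background is a hypothetical helper name and the child command is just a stand-in that runs the current Ruby interpreter.</p>

```ruby
require "rbconfig"

# Starts a command without waiting for it. Returns the thread that
# Process.detach creates to reap the child once it exits.
def run_in_background(*command)
  pid = Process.spawn(*command)
  Process.detach(pid)
end

# Stand-in for the long-running machine preparation script.
waiter = run_in_background(RbConfig.ruby, "-e", "puts 'preparing machine'")
puts "the request can return while the child carries on"
```

<p>Passing the program and its arguments as separate strings avoids going through a shell, so there is no quoting to get right.</p>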
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/02/20/ruby-on-windows-running-other-executables/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Measuring Disk Usage In Linux (%iowait vs IOPS)</title>
		<link>http://www.thattommyhall.com/2011/02/18/iops-linux-iostat/</link>
		<comments>http://www.thattommyhall.com/2011/02/18/iops-linux-iostat/#comments</comments>
		<pubDate>Fri, 18 Feb 2011 15:43:20 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[linux]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=572</guid>
		<description><![CDATA[This occurred to me when looking at our Hadoop servers today, lots of our devs use IOWait as an indicator of IO performance but there are better measures. IOWait is a CPU metric, measuring the percent of time the CPU is idle, but waiting for an I/O to complete. Strangely &#8211; It is possible to [...]]]></description>
			<content:encoded><![CDATA[<p>This occurred to me when looking at our Hadoop servers today: lots of our devs use IOWait as an indicator of IO performance, but there are better measures. IOWait is a CPU metric, measuring the percentage of time the CPU is idle while waiting for an I/O to complete. Strangely &#8211; <strong>it is possible to have a healthy system with nearly 100% iowait, or a disk bottleneck with 0% iowait.</strong> A much better approach is to look at disk IO directly, and the number you want is IOPS (IO Operations Per Second). </p>
<p><strong>Measuring IOPS</strong><br />
On Linux I like the iostat command, though there are a few ways to get at the info. On Debian/Ubuntu it is in the sysstat package (ie: sudo apt-get install sysstat)</p>
<pre class="brush: plain; title: ; notranslate">
root@MACHINENAME:/home/deploy# iostat 1
Linux 2.6.24-28-server (MACHINENAME.forward.co.uk) 	18/02/11
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          45.51    0.00    1.85    0.62    0.00   52.03

Device:          tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
cciss/c0d0      4.00         0.00        40.00          0         40
cciss/c0d1      4.00         0.00        64.00          0         64
cciss/c0d2     12.00         0.00       248.00          0        248
cciss/c0d3      0.00         0.00         0.00          0          0
cciss/c0d4     25.00         0.00       320.00          0        320
cciss/c0d5      0.00         0.00         0.00          0          0
cciss/c0d6     30.00         0.00       344.00          0        344
cciss/c0d7     42.00      3144.00         0.00       3144          0
</pre>
<p>&#8220;iostat 1&#8221; prints a new report every second; with a longer interval each report is the average over that interval (and the very first report covers the time since boot). tps is what you are interested in: transfers per second (ie IOPS). -x gives more detailed output, separating reads from writes and showing how much data is going in and out per second.</p>
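<p>If you want to keep an eye on tps from a script, the device section is easy to pick apart. A rough sketch, assuming the column layout shown above (device name first, tps second); tps_by_device is a hypothetical helper and the sample rows are copied from my output.</p>

```ruby
# A few rows copied from the iostat report above.
SAMPLE = [
  "Device:        tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn",
  "cciss/c0d0     4.00       0.00       40.00          0       40",
  "cciss/c0d2    12.00       0.00      248.00          0       248",
  "cciss/c0d7    42.00    3144.00        0.00         3144     0",
].join("\n")

# Returns a hash of device name => tps, skipping blank lines and the header.
def tps_by_device(report)
  report.lines.each_with_object({}) do |line, result|
    fields = line.split
    next if fields.empty? || fields.first == "Device:"
    result[fields[0]] = Float(fields[1])
  end
end

tps = tps_by_device(SAMPLE)
busiest = tps.max_by { |_device, value| value }
puts "busiest: #{busiest[0]} at #{busiest[1]} tps"
```

<p>Splitting on whitespace means the ragged column alignment in the raw output does not matter.</p>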
<p><strong>What is a good or bad number though?</strong><br />
As with most metrics, if the first time you look at it is when you are in trouble then it&#8217;s less helpful. You should have an idea of how much IO you typically do, then if you experience issues and are doing 10x that or only getting 1/10 from the disks then you have a good candidate explanation for them.</p>
<p><strong>How much can I expect from my storage?</strong><br />
It depends how fast the disks are spinning, and how many there are.<br />
As a rule of thumb I assume for a single disk:<br />
7.2k RPM -> ~100 IOPS<br />
10k RPM -> ~150 IOPS<br />
15k RPM -> ~200 IOPS<br />
Our Hadoop servers were pushing about 70 IOPS to each disk at peak, and they are 7.2k ones, so that is in line with this estimate.</p>
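<p>One way to sanity-check figures like these is to treat a random IO as one average seek plus half a revolution of the platter. This is a back-of-envelope sketch only: the seek times below are assumed typical values, not measurements from our disks, and it ignores caching and queueing.</p>

```ruby
# Average rotational latency in ms: on average the platter has to turn
# half a revolution before the wanted sector passes under the head.
def rotational_latency_ms(rpm)
  0.5 * 60_000.0 / rpm
end

# Rough random-IO ceiling for one disk: one IO per (seek + rotation).
def estimated_iops(rpm, average_seek_ms)
  1000.0 / (average_seek_ms + rotational_latency_ms(rpm))
end

# RPM => assumed average seek time in ms (illustrative figures only).
assumed_seek_ms = { 7_200 => 8.5, 10_000 => 4.5, 15_000 => 3.5 }

assumed_seek_ms.each do |rpm, seek_ms|
  puts "#{rpm} RPM: ~#{estimated_iops(rpm, seek_ms).round} IOPS"
end
```

<p>With those assumptions it comes out at roughly 79, 133 and 182 IOPS, a little under the rule of thumb but in the same ballpark.</p>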
<p>See <a href="http://www.zdnetasia.com/calculate-iops-in-a-storage-array-62061792.htm">here</a> for a breakdown of why these are good estimates for random IOs from a single disk. Interestingly a large amount of it comes from the latency of the platter spinning, which is why SSDs do so well for random IO (Compared to a 15k disk, ~50x for writes, ~200x reads)<br />
<strong>See also:</strong><br />
A concrete example of faster CPU causing higher %iowait while actually doing more transactions <a href="http://www.ee.pw.edu.pl/~pileckip/aix/iowait.htm">here</a></p>
<p>Extreme Linux Performance Monitoring and Tuning: <a href="http://www.ufsdump.org/papers/uuasc-june-2006.pdf">Part 1 (pdf)</a> and <a href="http://www.ufsdump.org/papers/io-tuning.pdf">Part 2 (pdf)</a> from <a href="http://www.ufsdump.org/">ufsdump.org/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/02/18/iops-linux-iostat/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Running Any Executable As A Windows Service (Ruby / Sinatra)</title>
		<link>http://www.thattommyhall.com/2011/02/14/srvany-sinatra-ruby-windows-service/</link>
		<comments>http://www.thattommyhall.com/2011/02/14/srvany-sinatra-ruby-windows-service/#comments</comments>
		<pubDate>Mon, 14 Feb 2011 13:06:52 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[Ruby]]></category>
		<category><![CDATA[sinatra]]></category>
		<category><![CDATA[VMware]]></category>
		<category><![CDATA[windows]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=541</guid>
		<description><![CDATA[While migrating an automated VM deployment page using a combination of Sinatra on Linux and Bash scripts using the Perl toolkit with a simpler script using the VMWare PowerCLI that I love so much I needed to create a windows service from the Sinatra App and had to do some googleing so I thought I [...]]]></description>
			<content:encoded><![CDATA[<p>While replacing an automated VM deployment page, which used a combination of <a href="http://www.sinatrarb.com/">Sinatra</a> on Linux and Bash scripts using the Perl toolkit, with a simpler script using the VMware PowerCLI that I <a href="http://www.thattommyhall.com/index.php?s=powercli&#038;submit=Search">love so much</a>, I needed to create a Windows service from the Sinatra app and had to do some googling, so I thought I would share how I did it.</p>
<p>You only need two things &#8211; the built-in &#8220;sc&#8221; command and an executable from <a href="https://www.microsoft.com/downloads/en/details.aspx?FamilyID=9d467a69-57ff-4ae7-96ee-b18c4790cffd&#038;displaylang=en">Windows Server 2003 Resource Kit Tools</a> called srvany (works with 2008 too). Get just that exe <a href="http://dl.dropbox.com/u/2039069/srvany.exe">here</a> (if you trust me of course <img src='http://www.thattommyhall.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />  )</p>
<p><strong>Creating the service</strong><br />
<a href="http://www.thattommyhall.com/wp-content/uploads/2011/02/1-CreateService.png"><img src="http://www.thattommyhall.com/wp-content/uploads/2011/02/1-CreateService.png" alt="" title="1-CreateService" width="669" height="78" class="alignleft size-full wp-image-550" /></a><br />
<strong>Check it exists</strong><br />
<a href="http://www.thattommyhall.com/wp-content/uploads/2011/02/2-service.png"><img src="http://www.thattommyhall.com/wp-content/uploads/2011/02/2-service.png" alt="" title="2-service" width="537" height="25" class="alignleft size-full wp-image-552" /></a><br />
<strong>Set Parameters In The Registry</strong><br />
Configure it at HKLM\SYSTEM\CurrentControlSet\Services\APPNAME\Parameters<br />
<a href="http://www.thattommyhall.com/wp-content/uploads/2011/02/Screen-shot-2011-02-14-at-12.54.15.png"><img src="http://www.thattommyhall.com/wp-content/uploads/2011/02/Screen-shot-2011-02-14-at-12.54.15.png" alt="" title="Screen shot 2011-02-14 at 12.54.15" width="725" height="165" class="alignleft size-full wp-image-561" /></a></p>
<pre class="brush: plain; title: ; notranslate">Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\VMdeploy\Parameters]
&quot;Application&quot;=&quot;C:\\Ruby192\\bin\\ruby&quot;
&quot;AppParameters&quot;=&quot;C:\\Sites\\vmdeploy\\server.rb -p 80&quot;
&quot;AppDirectory&quot;=&quot;C:\\Sites\\vmdeploy&quot;
&quot;AppEnvironment&quot;=hex(7):65,00,78,00,61,00,6d,00,70,00,6c,00,65,00,3d,00,32,00,\
  37,00,00,00,62,00,6c,00,61,00,68,00,3d,00,63,00,3a,00,5c,00,74,00,65,00,6d,\
  00,70,00,66,00,69,00,6c,00,65,00,73,00,00,00,00,00</pre>
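<p>Note that AppEnvironment is a multi-string value (REG_MULTI_SZ, exported as hex(7)); the rest are plain strings. Here it decodes to the two entries example=27 and blah=c:\tempfiles.</p>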
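<p>If you would rather generate that hex(7) value than type it into regedit, it is just the strings in UTF-16LE, each ending in a NUL, with one more NUL closing the list. A small sketch; reg_multi_sz_hex is a hypothetical helper name, and the output is one unwrapped line instead of the backslash-continued form regedit exports.</p>

```ruby
# Encodes strings as a REG_MULTI_SZ value in .reg notation: UTF-16LE,
# each string NUL-terminated, plus one extra NUL to close the list.
def reg_multi_sz_hex(strings)
  raw = strings.map { |s| s + "\0" }.join + "\0"
  raw.encode("UTF-16LE").bytes.map { |byte| format("%02x", byte) }.join(",")
end

# The two environment variables stored in the export above.
puts "hex(7):" + reg_multi_sz_hex(["example=27", "blah=c:\\tempfiles"])
```

<p>For those two entries it produces the same bytes as the AppEnvironment value in the export.</p>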
<p>This lets you run any executable file, change the directory you run it from and pass any arguments or environment variables so should cover most use cases.</p>
<p>I will be sharing the code for both the Sinatra app and the PowerShell deploy script in later posts.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/02/14/srvany-sinatra-ruby-windows-service/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>2010 Retrospective</title>
		<link>http://www.thattommyhall.com/2011/01/08/2010-retrospective/</link>
		<comments>http://www.thattommyhall.com/2011/01/08/2010-retrospective/#comments</comments>
		<pubDate>Sat, 08 Jan 2011 17:53:56 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[101]]></category>
		<category><![CDATA[Life]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=524</guid>
		<description><![CDATA[Well, 2010 was a great year &#8211; thanks to all the great people who made it so. Just a brief outline: Work: Started the year down in Kent working on a XenApp deployment on VMware at the beginning of the year, then over in Dubai building a VMware View solution, went to Libya and built [...]]]></description>
			<content:encoded><![CDATA[<p>Well, 2010 was a great year &#8211; thanks to all the great people who made it so. </p>
<p><strong>Just a brief outline:</strong></p>
<p><strong>Work</strong>: Started the year down in Kent working on a XenApp deployment on VMware, then went over to Dubai to build a VMware View solution, went to Libya and built the corporate infrastructure for their biggest telco, spent 2 months working on ING&#8217;s next gen datacentre in The Hague and ended in London for Forward, working on merging USwitch&#8217;s infrastructure after the acquisition.<br />
While this was fun, I got fatigued with travelling and living out of a suitcase and was very happy to settle down a bit in London (which I think is the greatest city in the world) and work for a great company, which I am very pleased and proud to say I have joined permanently. It is exciting to work somewhere using so much great tech and with so many sound people (with huge brains).</p>
<p><strong>Jollys</strong>: I managed to go to Russia, visiting Moscow and St Petersburg (OK, that was Dec 2009 but what the hell), the Czech Republic, Italy, Oklahoma for a friend&#8217;s wedding followed by a visit to two <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Ancient_Pueblo_Peoples">ancestral puebloan</a> sites, <a href="http://www.thattommyhall.com/2010/10/03/beautiful-berli/">Berlin</a>, Paris, <a href="http://www.thattommyhall.com/2010/12/17/forward-to-vegas/">Vegas</a> and <a href="http://www.thattommyhall.com/2010/12/16/egypt-trip/">Egypt</a>. I feel privileged to have had the opportunity to do all this and am still a bit amazed at just how much happened.</p>
<p><strong>Life: </strong>I have always thought there was an embarrassment of riches when I look at the great people I get to call my friends. I don&#8217;t get to see any of them enough and no amount is too much. Thanks to you all, I always say you are what makes the universe great for me; you conspire to make it so even though you don&#8217;t all know one another. The biggest change this year was finding someone patient enough to pair up with me full time, thanks Petra &#8211; I love you. We are moving in together soon and I look forward to having a permanent home in the greatest city in the world, with a spare room &#8211; please come visit! Thanks to my family too, we had some bad news in 2010 but you all dealt with it with the usual aplomb, you are all ace.</p>
<p><strong>Resolutions</strong> There are no new resolutions for 2010, I will be reporting on the <a href="http://www.thattommyhall.com/2010/04/15/101-goals-100-day-update/">101 goals in 1001 days</a> soon.</p>
<p><strong>2011: </strong>Looks set to be the best year ever, thanks in advance for helping <a href="http://www.coachbarrow.com/blog/wp-content/uploads/2009/06/make-it-so.gif">make it so</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2011/01/08/2010-retrospective/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Donating To Wikipedia</title>
		<link>http://www.thattommyhall.com/2010/12/31/donating-to-wikipedia/</link>
		<comments>http://www.thattommyhall.com/2010/12/31/donating-to-wikipedia/#comments</comments>
		<pubDate>Fri, 31 Dec 2010 03:56:47 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=514</guid>
		<description><![CDATA[I realised when auditing my delicious bookmarks recently how much I rely on Wikipedia to look things up and today donated for the first time. I had previously moaned about seeing Jimmy Wales&#8217;s face every time I logged in, like this and laughed my head off at this piss take from The Daily What: I [...]]]></description>
			<content:encoded><![CDATA[<p>I realised when auditing my Delicious bookmarks recently how much I rely on Wikipedia to look things up, and today I donated for the first time. </p>
<p>I had previously moaned about seeing Jimmy Wales&#8217;s face every time I logged in, like this<br />
<img alt="" src="http://static02.mediaite.com/geekosystem/uploads/2010/11/jimmy-wales-wikipedia-appeal.png" title="face" class="alignnone" width="550" height="284" /><br />
 and laughed my head off at this piss take from <a href="http://thedailywh.at/post/1620254631/this-looks-shopped-of-the-day-the-next-logical">The Daily What:</a><br />
<img alt="JimmyFace" src="http://27.media.tumblr.com/tumblr_lc50dmfjCt1qzrlhgo1_r1_500.png" title="JimmyFace" class="alignnone" width="500" height="459" /></p>
<p>Today, though, I found the following demonstration at <a href="http://www.informationisbeautiful.net/2010/the-science-behind-wikipedias-jimmy-appeal/">Information Is Beautiful</a> of just how effective the campaign has been.<br />
<img alt="" src="http://infobeautiful2.s3.amazonaws.com/wikipedia_jimmy_appeal.png" title="WhyJimFace" class="alignnone" width="550" height="700" /></p>
<p>Wikimedia have done some <a href="https://secure.wikimedia.org/wikipedia/meta/wiki/Fundraising_2010/Banner_testing">nice analysis of the campaign</a> on the Meta Wiki if you are interested.</p>
<p><a href="http://wikimediafoundation.org/wiki/WMFJT002/GB?utm_medium=sitenotice&#038;utm_campaign=20101229JT001_UK&#038;utm_source=20101229_JAT002_UK&#038;country_code=GB">Give to Wikipedia here</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2010/12/31/donating-to-wikipedia/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Load Based Nic Teaming vs Link Aggregation</title>
		<link>http://www.thattommyhall.com/2010/12/21/load-based-nic-teaming-vs-link-aggregation/</link>
		<comments>http://www.thattommyhall.com/2010/12/21/load-based-nic-teaming-vs-link-aggregation/#comments</comments>
		<pubDate>Tue, 21 Dec 2010 22:26:29 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[VMware]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=488</guid>
		<description><![CDATA[I remembered seeing Simon Long&#8217;s comment on twitter a few weeks ago and it was rattling around in the back of my mind. Will #VMware Load-Based Teaming remove the need for #Cisco EtherChannel? Discuss&#8230;. I long ago investigated NIC Teaming algorithms and settled on IP Hash with Cisco Etherchannels for most environments, only really using [...]]]></description>
			<content:encoded><![CDATA[<p>I remembered seeing Simon Long&#8217;s <a href="https://twitter.com/#!/SimonLong_/status/14599422625193984">comment on twitter</a> a few weeks ago and it was rattling around in the back of my mind.</p>
<blockquote><p>Will #VMware Load-Based Teaming remove the need for #Cisco EtherChannel? Discuss&#8230;.</p></blockquote>
<p>I long ago investigated NIC Teaming algorithms and settled on IP Hash with <a href="https://secure.wikimedia.org/wikipedia/en/wiki/EtherChannel">Cisco EtherChannels</a> for most environments, only really using something else if the client happened not to have stacked switches. Thanks to Scott Lowe for <a href="http://blog.scottlowe.org/2006/12/04/esx-server-nic-teaming-and-vlan-trunking/">this superb article</a> on the matter.</p>
<p>When vSphere 4.1 came out with <a href="http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&#038;cmd=displayKC&#038;externalId=1022590">Load Based Teaming</a>, I was pleased that at last we had an algorithm that would have a go at proper load balancing and not just load distribution but had not got round to investigating much more.</p>
<p>At Forward we have just updated to 4.1, Enterprise Plus and have bought some shiny new Extreme <a href="http://www.extremenetworks.com/products/summit-x650.aspx">Summit X650 Series</a> 10G switches; so Simon&#8217;s comment was particularly apropos. </p>
<p>I had decided I wanted to try and use LBT but was unsure if I should port-channel the uplink ports. It turns out you can&#8217;t. To be honest I thought maybe you should; the dvSwitch guide does not mention it as far as I can see, but the <a href="http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&#038;cmd=displayKC&#038;externalId=1001938">ESX host requirements for link aggregation</a> KB (updated today) is very clear: </p>
<blockquote><ul>
<li>The switch must be set to perform 802.3ad link aggregation in static mode ON and the virtual switch must have its load balancing method set to Route based on IP hash.</li>
<li>Enabling either Route based on IP hash without 802.3ad aggregation or vice-versa disrupts networking</li>
</ul>
</blockquote>
<p>ie IP Hash and EtherChannel have to be used together: neither will work without the other, so LBT cannot be combined with a port-channel.</p>
<p>In answer to Simon&#8217;s question, my feeling is you may still get better performance from EtherChannel and IP based hash for some workloads but would guess &#8220;usually&#8221; LBT wins. I think the case where you may get better utilisation is when certain VMs have very high bandwidth requirements to different IPs. As described <a href="https://kensvirtualreality.wordpress.com/2009/04/05/the-great-vswitch-debate%E2%80%93part-3/">here</a> IP Hash is the only way to allow traffic from one vNIC to leave over different pNICs at the same time.</p>
<p>It is interesting that even with LBT, bandwidth for an individual VM or vmkernel port is still limited to what a single pNIC can provide. IP hash will not get higher than a single pNIC for a vMotion or other point to point connection either. So 10G is going to perform better for these operations than 10x1G, however you team them.</p>
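<p>To see why one source and destination pair always lands on the same pNIC, here is a simplified model of an IP hash: XOR the two addresses and take the result modulo the number of uplinks. The exact hash ESX uses is an assumption on my part, and uplink_for and the addresses are made up for illustration.</p>

```ruby
require "ipaddr"

# Simplified model of IP-hash uplink selection: XOR the two addresses
# and take the result modulo the number of uplinks.
def uplink_for(source_ip, destination_ip, uplink_count)
  (IPAddr.new(source_ip).to_i ^ IPAddr.new(destination_ip).to_i) % uplink_count
end

vm = "10.0.0.10"  # hypothetical VM address
["10.0.0.20", "10.0.0.21"].each do |destination|
  puts "#{vm} to #{destination} leaves via uplink #{uplink_for(vm, destination, 2)}"
end
```

<p>In this model the two destinations come out on uplinks 0 and 1, so one vNIC can use both pNICs at once, but any single conversation is pinned to one uplink however much it sends.</p>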
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2010/12/21/load-based-nic-teaming-vs-link-aggregation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hyper9 Saves The Day</title>
		<link>http://www.thattommyhall.com/2010/12/20/hyper9-saves-the-day/</link>
		<comments>http://www.thattommyhall.com/2010/12/20/hyper9-saves-the-day/#comments</comments>
		<pubDate>Mon, 20 Dec 2010 14:43:16 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[VMware]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=468</guid>
		<description><![CDATA[We recently bought the Hyper9 capacity planning, reporting and monitoring solution for our VMware infrastructure and I quite soon made use of it to troubleshoot some problems reported to us like backups taking longer and databases being slower than normal. In the 3par storage I could see that IO was unusually high of late. Then [...]]]></description>
			<content:encoded><![CDATA[<p>We recently bought the <a href="http://www.hyper9.com/product_overview.aspx">Hyper9</a> capacity planning, reporting and monitoring solution for our VMware infrastructure and I quite soon made use of it to troubleshoot some problems reported to us like backups taking longer and databases being slower than normal.</p>
<p>In the 3par storage I could see that IO was unusually high of late.<br />
<a href="http://www.thattommyhall.com/wp-content/uploads/2010/12/1-3par.png"><img src="http://www.thattommyhall.com/wp-content/uploads/2010/12/1-3par-300x282.png" alt="" title="1-3par" width="300" height="282" class="alignleft size-medium wp-image-469" /></a></p>
<p>Then I looked at the top-n datastores by IOPS and graphed them<br />
<a href="http://www.thattommyhall.com/wp-content/uploads/2010/12/2-datastores.png"><img src="http://www.thattommyhall.com/wp-content/uploads/2010/12/2-datastores-300x190.png" alt="" title="2-datastores" width="300" height="190" class="alignleft size-medium wp-image-470" /></a><br />
A huge jump for sharedstorage8, so I looked at its VMs<br />
<a href="http://www.thattommyhall.com/wp-content/uploads/2010/12/3-vms.png"><img src="http://www.thattommyhall.com/wp-content/uploads/2010/12/3-vms-300x187.png" alt="" title="3-vms" width="300" height="187" class="alignleft size-medium wp-image-472" /></a><br />
and found the culprit VM.</p>
<p>Here it is against our big &#8220;Superhero&#8221; database and the vCenter server with the DB.<br />
<a href="http://www.thattommyhall.com/wp-content/uploads/2010/12/4-VSvCenter.png"><img src="http://www.thattommyhall.com/wp-content/uploads/2010/12/4-VSvCenter-300x224.png" alt="" title="4-VSvCenter" width="300" height="224" class="alignleft size-medium wp-image-473" /></a><br />
<a href="http://www.thattommyhall.com/wp-content/uploads/2010/12/5-top3.png"><img src="http://www.thattommyhall.com/wp-content/uploads/2010/12/5-top3-300x233.png" alt="" title="5-top3" width="300" height="233" class="alignleft size-medium wp-image-474" /></a></p>
<p>A lot of IO from a machine the owner thought was doing nothing!</p>
<p>Hyper9 is a pretty good tool for reporting, alerting and troubleshooting your VMware infrastructure. The query language is Lucene based, which gives you lots of options for creating custom views and alerts.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2010/12/20/hyper9-saves-the-day/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Visualising Tommy</title>
		<link>http://www.thattommyhall.com/2010/12/19/visualising-tommy/</link>
		<comments>http://www.thattommyhall.com/2010/12/19/visualising-tommy/#comments</comments>
		<pubDate>Sun, 19 Dec 2010 16:18:03 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[random]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=415</guid>
		<description><![CDATA[Here are two Wordle visualisations, first the words used in my blog. I like reading obviously but it seems to be quite heavy on stuff from the single article I wrote on Rhipe &#8211; unusual words I suppose. The second is my del.icio.us tags, when is later going to happen? With delicious closing I am [...]]]></description>
			<content:encoded><![CDATA[<p>Here are two <a href="http://www.wordle.net">Wordle</a> visualisations, first the words used in my blog. I like reading obviously but it seems to be quite heavy on stuff from the single <a href="http://www.thattommyhall.com/2010/11/20/using-r-on-hadoop-with-rhipe/">article I wrote on Rhipe</a> &#8211; unusual words I suppose.<br />
<a href="http://www.wordle.net/show/wrdl/2892354/thattommyhall_Blog"><img src="http://www.thattommyhall.com/wp-content/uploads/2010/12/blogTagCloud-300x194.png" alt="" title="blogTagCloud" width="300" height="194" class="alignleft size-medium wp-image-418" /></a></p>
<p>The second is my <a href="http://www.delicious.com/thattommyhall">del.icio.us</a> tags, when is later going to happen?<br />
<a href="http://www.wordle.net/show/wrdl/2892361/thattommyhall_delicious"><img src="http://www.thattommyhall.com/wp-content/uploads/2010/12/DeliciousTagCloud-300x197.png" alt="" title="DeliciousTagCloud" width="300" height="197" class="alignleft size-medium wp-image-419" /></a></p>
<p>With delicious closing I am looking at alternatives, send recommendations if you have any.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2010/12/19/visualising-tommy/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Forward To Vegas</title>
		<link>http://www.thattommyhall.com/2010/12/17/forward-to-vegas/</link>
		<comments>http://www.thattommyhall.com/2010/12/17/forward-to-vegas/#comments</comments>
		<pubDate>Fri, 17 Dec 2010 18:10:50 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[travel]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=456</guid>
		<description><![CDATA[Well, I guess I need to say something about my company&#8217;s Christmas present to all ~150 of its staff: a three day trip to Vegas. We flew out last Thursday and stayed for 3 nights at the Wynn, which is a great hotel. Thursday: Arrive and sleep (forgive me I only just got back from [...]]]></description>
			<content:encoded><![CDATA[<p>Well, I guess I need to say something about my company&#8217;s Christmas present to all ~150 of its staff: a three day trip to Vegas.</p>
<p>We flew out last Thursday and stayed for 3 nights at the Wynn, which is a great hotel.<br />
<a href="http://www.flickr.com/photos/thattommyhall/5264466446/" title="Vegas2010-46.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5165/5264466446_4b4467e4cb.jpg" width="500" height="375" alt="Vegas2010-46.JPG" /></a></p>
<p>Thursday: Arrive and sleep (forgive me I only just got back from Egypt after flying to Manchester instead of London and having to sit all night on a freezing cold coach!). </p>
<p>Friday: Flew in a helicopter into the Grand Canyon which was rather awesome.<br />
<a href="http://www.flickr.com/photos/thattommyhall/5264403368/" title="Vegas2010-12.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5083/5264403368_143969f962.jpg" width="500" height="375" alt="Vegas2010-12.JPG" /></a><br />
<a href="http://www.flickr.com/photos/thattommyhall/5264433930/" title="Vegas2010-29.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5164/5264433930_b05b534bba.jpg" width="500" height="375" alt="Vegas2010-29.JPG" /></a></p>
<p>Saturday: Went to see <a href="http://www.cirquedusoleil.com/en/shows/zumanity/home.aspx">Zumanity</a> by Cirque du Soleil. It was amazing, I must see them again.<br />
<a href="http://www.thattommyhall.com/wp-content/uploads/2010/12/zumanity_1.jpg"><img src="http://www.thattommyhall.com/wp-content/uploads/2010/12/zumanity_1.jpg" alt="" title="zumanity_1" width="329" height="381" class="alignleft size-full wp-image-458" /></a></p>
<p>Sunday: A few of us went shooting,<br />
An M16<br />
<a href="http://www.flickr.com/photos/thattommyhall/5263882651/" title="Vegas2010-61.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5167/5263882651_5984f6f4a0.jpg" width="500" height="375" alt="Vegas2010-61.JPG" /></a></p>
<p>An H&#038;K MP5<br />
<a href="http://www.flickr.com/photos/thattommyhall/5264488850/" title="Vegas2010-59.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5289/5264488850_b5e2857f97.jpg" width="500" height="375" alt="Vegas2010-59.JPG" /></a></p>
<p>A Mac-10<br />
<a href="http://www.flickr.com/photos/thattommyhall/5263893221/" title="Vegas2010-66.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5008/5263893221_8f3ef7c17d.jpg" width="500" height="375" alt="Vegas2010-66.JPG" /></a><br />
not firing it gangster style unfortunately.</p>
<p>A Tommygun<br />
<a href="http://www.flickr.com/photos/thattommyhall/5264494024/" title="Vegas2010-62.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5169/5264494024_e099283521.jpg" width="500" height="375" alt="Vegas2010-62.JPG" /></a></p>
<p>A Desert Eagle<br />
<object width="480" height="385"><param name="movie" value="http://www.youtube.com/v/nRvP-AZrLqw?fs=1&amp;hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/nRvP-AZrLqw?fs=1&amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"></embed></object></p>
<p>And I did a Shotgun<br />
<object width="480" height="385"><param name="movie" value="http://www.youtube.com/v/C6p4n_xv1oE?fs=1&amp;hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/C6p4n_xv1oE?fs=1&amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"></embed></object></p>
<p>And I&#8217;ve got the T-shirt to prove it<br />
<a href="http://www.flickr.com/photos/thattommyhall/5264508628/" title="Vegas2010-69.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5127/5264508628_d193d46813.jpg" width="500" height="375" alt="Vegas2010-69.JPG" /></a><br />
Peace Through Superior Firepower indeed</p>
<p><a href="http://www.flickr.com/photos/thattommyhall/sets/72157625605862860/">Pics On Flickr</a></p>
<p><a href="http://www.youtube.com/view_play_list?p=BD0C4777569C2E46">Vids On Youtube</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2010/12/17/forward-to-vegas/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Egypt Trip</title>
		<link>http://www.thattommyhall.com/2010/12/16/egypt-trip/</link>
		<comments>http://www.thattommyhall.com/2010/12/16/egypt-trip/#comments</comments>
		<pubDate>Thu, 16 Dec 2010 16:26:05 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[101]]></category>
		<category><![CDATA[archaeology]]></category>
		<category><![CDATA[travel]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=430</guid>
		<description><![CDATA[It has been a crazy few weeks: only 2 days after getting back to the UK from the Egypt trip I went to Las Vegas with Forward, and I am just getting my head back together now. You know you are a huge geek when you take as much space for books as clothes. The trip [...]]]></description>
			<content:encoded><![CDATA[<p>It has been a crazy few weeks: only 2 days after getting back to the UK from the Egypt trip I went to Las Vegas with Forward, and I am just getting my head back together now.</p>
<p>You know you are a huge geek when you take as much space for books as clothes.<br />
<a href="http://www.flickr.com/photos/thattommyhall/5266864589/" title="Egypt2010-00.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5202/5266864589_d74ceef239.jpg" width="500" height="375" alt="Egypt2010-00.JPG" /></a></p>
<p>The trip was ace, a week on the <a href="http://www.traveline-eg.com/CSP/Default.aspx?A=2&#038;P=52">M/S Hamees</a>, stopping at:<br />
<a href="https://secure.wikimedia.org/wikipedia/en/wiki/Valley_of_the_Kings">Valley of the Kings</a><br />
<a href="https://secure.wikimedia.org/wikipedia/en/wiki/Mortuary_Temple_of_Hatshepsut">Temple of Queen Hatshepsut</a><br />
<a href="https://secure.wikimedia.org/wikipedia/en/wiki/Colossi_of_Memnon">The Colossi of Memnon</a><br />
<a href="https://secure.wikimedia.org/wikipedia/en/wiki/Temple_of_Edfu">Edfu Temple</a><br />
<a href="http://www.flickr.com/photos/thattommyhall/5267493582/" title="Egypt2010-32.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5288/5267493582_12fbc49a7d.jpg" width="500" height="375" alt="Egypt2010-32.JPG" /></a><br />
<a href="https://secure.wikimedia.org/wikipedia/en/wiki/Kom_Ombo">Kom Ombo</a>, which I had not heard of but which has some extraordinarily vibrant original colour remaining<br />
<a href="http://www.flickr.com/photos/thattommyhall/5266868123/" title="Egypt2010-05.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5128/5266868123_ea8178f8c6.jpg" width="375" height="500" alt="Egypt2010-05.JPG" /></a><br />
<a href="https://secure.wikimedia.org/wikipedia/en/wiki/Abu_Simbel_temples">Abu Simbel</a><br />
<a href="http://www.flickr.com/photos/thattommyhall/5266882081/" title="Egypt2010-26.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5283/5266882081_e25b402be5.jpg" width="500" height="375" alt="Egypt2010-26.JPG" /></a><br />
<a href="https://secure.wikimedia.org/wikipedia/en/wiki/Karnak">Karnak Temple</a><br />
<a href="http://www.flickr.com/photos/thattommyhall/5267499736/" title="Egypt2010-39.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5168/5267499736_3c58bc0df6.jpg" width="500" height="375" alt="Egypt2010-39.JPG" /></a><br />
<a href="https://secure.wikimedia.org/wikipedia/en/wiki/Luxor_Temple">Luxor Temple</a> (we had the place to ourselves at night, it was amazing)<br />
<a href="http://www.flickr.com/photos/thattommyhall/5267502590/" title="Egypt2010-42.JPG by thattommyhall, on Flickr"><img src="http://farm6.static.flickr.com/5041/5267502590_5e0c32d674.jpg" width="500" height="375" alt="Egypt2010-42.JPG" /></a></p>
<p>5 nights in the <a href="http://www.moevenpick-hotels.com/en/pub/your_hotels/worldmap/el_gouna/overview.cfm">Movenpick El Gouna</a> on the Red Sea and 2 nights in the <a href="http://www.steigenberger.com/en/Luxor">Nile Palace</a>.</p>
<p>It was incredibly relaxing. It was nice to be incommunicado for 2 weeks, catch up on some reading and just mince around the beach. I managed to read Gödel, Escher, Bach at last and also ticked off The Ode Less Travelled from my <a href="http://tinyurl.com/thattommyhall101">101 Goals</a>. It was weird not having my phone to distract me in spare moments; it made me daydream more and think about people and events I have not thought about in ages. I should do it more often.</p>
<p>Pictures are <a href="http://www.flickr.com/photos/thattommyhall/sets/72157625613600066/">on Flickr</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2010/12/16/egypt-trip/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Off To Egypt</title>
		<link>http://www.thattommyhall.com/2010/11/21/off-to-egypt/</link>
		<comments>http://www.thattommyhall.com/2010/11/21/off-to-egypt/#comments</comments>
		<pubDate>Sun, 21 Nov 2010 18:01:21 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[101]]></category>
		<category><![CDATA[archaeology]]></category>
		<category><![CDATA[jolly]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=413</guid>
		<description><![CDATA[I am off to Egypt for 2 weeks, a one week Nile cruise with Voyages Jules Verne starting at Luxor and working down to the Aswan High Dam via The Valley Of The Kings and ending in the Moevenpick Hotel in El Gouna. No Cairo, Pyramids or the Cairo Museum this trip, another time though. [...]]]></description>
			<content:encoded><![CDATA[<p>I am off to Egypt for 2 weeks, a one week Nile cruise with <a href="http://www.vjv.com/destinations/africa/egypt-tours/gift-nile/index.html">Voyages Jules Verne</a> starting at <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Luxor">Luxor</a> and working down to the <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Aswan_High_Dam">Aswan High Dam</a> via <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Valley_of_the_Kings">The Valley Of The Kings</a> and ending in the <a href="http://www.moevenpick-hotels.com/en/pub/your_hotels/worldmap/el_gouna/overview.cfm">Moevenpick Hotel</a> in El Gouna. No Cairo, Pyramids or the Cairo Museum this trip, another time though.</p>
<p>I&#8217;m particularly excited about <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Abu_Simbel_temples">Abu Simbel</a>, <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Chapelle_Rouge">The Red Chapel of Hatshepsut</a> and the <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Mortuary_Temple_of_Amenhotep_III">Temple of Amenhotep III</a>, among tons of others.</p>
<p>Also I am very much looking forward to <em>No Phone</em> and <em>No Computers</em> for the whole trip, the longest I will have been without either for a good few years. I am taking the chance to catch up on some reading and take a break. I will have my Kindle with loads of books on it but am also taking <a href="https://www.amazon.co.uk/Dots-boxes-Game-Sophisticated-Childs/dp/1568811292">The Dots-and-boxes Game: Sophisticated Child&#8217;s Play</a> (one of my 101 goals is to make a dots and boxes program), <a href="https://secure.wikimedia.org/wikipedia/en/wiki/G%C3%B6del,_Escher,_Bach">Gödel, Escher, Bach</a> (another 101, started 3 times and never finished, and it has intrigued me since my first year in university) and <a href="https://secure.wikimedia.org/wikipedia/en/wiki/Winning_Ways_for_your_Mathematical_Plays">Winning Ways for your Mathematical Plays</a>. Pen&#8217;n&#8217;paper geekery for the win!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2010/11/21/off-to-egypt/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using R on Hadoop with Rhipe</title>
		<link>http://www.thattommyhall.com/2010/11/20/using-r-on-hadoop-with-rhipe/</link>
		<comments>http://www.thattommyhall.com/2010/11/20/using-r-on-hadoop-with-rhipe/#comments</comments>
		<pubDate>Sat, 20 Nov 2010 12:42:34 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[hadoop]]></category>
		<category><![CDATA[r]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=395</guid>
		<description><![CDATA[I spent a while this week getting Rhipe, a Java package that integrates the R environment with Hadoop, to work. Forward are pretty heavy users of Hadoop and its supporting ecosystem so R will be another way for the devs to interrogate the huge (and rapidly growing!) datasets we have. Installing R Adding the repository [...]]]></description>
			<content:encoded><![CDATA[<p>I spent a while this week getting <a href="http://www.stat.purdue.edu/~sguha/rhipe/">Rhipe</a>, a Java package that integrates the R environment with Hadoop, to work. Forward are pretty heavy users of Hadoop and its supporting ecosystem so R will be another way for the devs to interrogate the huge (and rapidly growing!) datasets we have.</p>
<p><strong>Installing R</strong><br />
<em>Adding the repository</em><br />
Create a new file at /etc/apt/sources.list.d/R.list</p>
<pre class="brush: plain; title: ; notranslate">
# R repository
deb http://rh-mirror.linux.iastate.edu/CRAN/bin/linux/ubuntu hardy/
</pre>
<p>(we are still using hardy, with the Cloudera packages)</p>
<p>Add the gpg keys for the repository</p>
<pre class="brush: plain; title: ; notranslate">
gpg --keyserver pgp.mit.edu --recv-key E2A11821
gpg -a --export E2A11821 | sudo apt-key add -
</pre>
<p><em>Install and update R</em><br />
Easy:</p>
<pre class="brush: plain; title: ; notranslate">$ sudo apt-get install r-base r-base-dev pkg-config littler
$ sudo R
&gt; update.packages()
</pre>
<p><em>Set environment variables for Rhipe</em><br />
Add to bottom of /etc/environment</p>
<pre class="brush: plain; title: ; notranslate">HADOOP=/usr</pre>
<p>Set it for the current session</p>
<pre class="brush: plain; title: ; notranslate">$ export HADOOP=/usr</pre>
<p><em>Install protobuf</em></p>
<pre class="brush: plain; title: ; notranslate">
# wget http://protobuf.googlecode.com/files/protobuf-2.3.0.tar.bz2
# tar jxf protobuf-2.3.0.tar.bz2
# cd protobuf-2.3.0
# ./configure
# make
# make install
# ldconfig
</pre>
<p><em>Install Rhipe</em></p>
<pre class="brush: plain; title: ; notranslate">
# wget http://www.stat.purdue.edu/~sguha/rhipe/dn/Rhipe_0.64.tar.gz
# R CMD INSTALL Rhipe_0.64.tar.gz
</pre>
<p>So all is well except that the test code <a href="http://www.stat.purdue.edu/~sguha/rhipe/doc/html/installation.html">here</a> is a bit off.</p>
<p>For me today,</p>
<pre class="brush: plain; title: ; notranslate">&gt; library(Rhipe)</pre>
<p>only works as root.</p>
<p>It seems that</p>
<pre class="brush: plain; title: ; notranslate">&gt; rhwrite(list(1,2,3),&quot;/tmp/x&quot;)</pre>
<p>should be:</p>
<pre class="brush: plain; title: ; notranslate">&gt; rhwrite(list(1,2,3),&quot;/tmp/x&quot;,1)</pre>
<p>then </p>
<pre class="brush: plain; title: ; notranslate">&gt; rhread(&quot;/tmp/x&quot;)</pre>
<p>works properly.</p>
<p>Also in the longer example</p>
<pre class="brush: plain; gutter: true; title: ; notranslate">map &lt;- expression({
  lapply(seq_along(map.values),function(r){
    x &lt;- runif(map.values[[r]])
    rhcollect(map.keys[[r]],c(n=map.values[[r]],mean=mean(x),sd=sd(x)))
  })
})

## Create a job object
z &lt;- rhmr(map, ofolder=&quot;/tmp/test&quot;, inout=c('lapply','sequence'),
          N=10,mapred=list(mapred.reduce.tasks=0),jobname='test')

## Submit the job
rhex(z)

## Read the results
res &lt;- rhread('/tmp/test/p*')
colres  &lt;- do.call('rbind', lapply(res,&quot;[[&quot;,2))

colres
       n      mean        sd
 [1,]  1 0.4983786        NA
 [2,]  2 0.7683017 0.2937688
 [3,]  3 0.5936899 0.3425441
 [4,]  4 0.3699087 0.2666379
 [5,]  5 0.5179839 0.4060244
 [6,]  6 0.6278925 0.2952608
 [7,]  7 0.4920088 0.2785893
 [8,]  8 0.4592598 0.2674592
 [9,]  9 0.5734197 0.1928496
[10,] 10 0.4942676 0.2989538
</pre>
<p>Where line 16 has been changed from the original</p>
<pre class="brush: plain; title: ; notranslate">res &lt;- rhread('/tmp/test')
</pre>
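<p>If you want to sanity-check the shape of that output without a cluster, the map step can be mimicked locally. This is only a plain Python stand-in written for illustration, not Rhipe code: it assumes, as the output above suggests, that the lapply input with N=10 hands the map keys and values 1 to 10, and <em>map_one</em> is a made-up name.</p>
<pre class="brush: python; title: ; notranslate">
import random
import statistics


def map_one(key, n, rng):
    """Local mimic of the map expression: draw n uniform samples, emit (key, summary)."""
    xs = [rng.random() for _ in range(n)]
    # R's sd() gives NA for a single value; None plays that role here.
    sd = statistics.stdev(xs) if n != 1 else None
    return key, {'n': n, 'mean': statistics.mean(xs), 'sd': sd}


# Assumption: keys and values both run from 1 to N, as in the job above.
rng = random.Random(0)
results = [map_one(i, i, rng) for i in range(1, 11)]

for key, row in results:
    print(key, row)
</pre>
<p>The numbers will differ from the Rhipe run because the samples are random, but there should be ten rows with n running from 1 to 10 and no standard deviation for the first.</p>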
<p>Thanks to <a href="http://www.stat.purdue.edu/~sguha/">Saptarshi Guha</a>, the author of Rhipe, for so quickly responding to my query in the <a href="https://groups.google.com/group/rhipe?pli=1">group</a>, and also to the authors of <a href="https://stat.ethz.ch/pipermail/r-help/2009-February/187626.html">this discussion</a> on setting up R in Ubuntu.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2010/11/20/using-r-on-hadoop-with-rhipe/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Finding Primes In SICP</title>
		<link>http://www.thattommyhall.com/2010/11/02/finding-primes-in-sicp/</link>
		<comments>http://www.thattommyhall.com/2010/11/02/finding-primes-in-sicp/#comments</comments>
		<pubDate>Tue, 02 Nov 2010 16:20:21 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[lisp]]></category>
		<category><![CDATA[SICP]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=390</guid>
		<description><![CDATA[I was reading SICP over lunch and found this lovely footnote on probabilistic methods for deciding if a number is prime. (it is #47) Numbers that fool the Fermat test are called Carmichael numbers, and little is known about them other than that they are extremely rare. There are 255 Carmichael numbers below 100,000,000. The [...]]]></description>
			<content:encoded><![CDATA[<p>I was reading <a href="http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-11.html#%_sec_1.2.6">SICP</a> over lunch and found this lovely footnote on probabilistic methods for deciding if a number is prime. (it is #47)</p>
<blockquote><p>Numbers that fool the Fermat test are called Carmichael numbers, and little is known about them other than that they are extremely rare. There are 255 Carmichael numbers below 100,000,000. The smallest few are 561, 1105, 1729, 2465, 2821, and 6601. In testing primality of very large numbers chosen at random, the chance of stumbling upon a value that fools the Fermat test is less than the chance that cosmic radiation will cause the computer to make an error in carrying out a &#8220;correct&#8221; algorithm. Considering an algorithm to be inadequate for the first reason but not for the second illustrates the difference between mathematics and engineering. </p></blockquote>
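<p>To see how such impostors slip through, here is a minimal Python sketch of the Fermat test described in that section of SICP. The book uses Scheme, and the names <em>fermat_test</em>, <em>is_prime</em> and <em>fools_fermat</em> are my own rather than SICP&#8217;s. It searches for composites that pass the test for every base, which should recover the six numbers listed in the footnote.</p>
<pre class="brush: python; title: ; notranslate">
import random
from math import isqrt


def fermat_test(n, times=20):
    """Probabilistic check for n of at least 2: try `times` random bases a in 1..n-1."""
    for _ in range(times):
        a = random.randint(1, n - 1)
        # Fermat's little theorem: for prime n, a**n is congruent to a mod n.
        if pow(a, n, n) != a:
            return False
    return True


def is_prime(n):
    """Exact answer by trial division, used only to unmask the impostors."""
    if n in (0, 1):
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


def fools_fermat(n):
    """True if n is composite yet passes the test for every base."""
    return not is_prime(n) and all(pow(a, n, n) == a for a in range(1, n))


carmichael = [n for n in range(2, 7000) if fools_fermat(n)]
print(carmichael)
</pre>
<p>Trial division is only there as a referee. The point of the footnote is that for very large random numbers you would trust <em>fermat_test</em> on its own.</p>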
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2010/11/02/finding-primes-in-sicp/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Day 300 Update</title>
		<link>http://www.thattommyhall.com/2010/10/28/day-300-update/</link>
		<comments>http://www.thattommyhall.com/2010/10/28/day-300-update/#comments</comments>
		<pubDate>Thu, 28 Oct 2010 18:00:23 +0000</pubDate>
		<dc:creator>tom</dc:creator>
				<category><![CDATA[101]]></category>
		<category><![CDATA[Life]]></category>

		<guid isPermaLink="false">http://www.thattommyhall.com/?p=369</guid>
		<description><![CDATA[Well, it is day 300 of my 101 goals in 1001 days. So as per meta-goal 101 &#8220;Do 100 day updates&#8221; here is a quick report on my progress. Completed 66 &#8211; Via Ferrata in Italy 85 &#8211; Visit Pergamon Museum 1 &#8211; Teetotalitarianism for 3 months 2 &#8211; Cheeseless for 3 months 11 &#8211; Reread [...]]]></description>
			<content:encoded><![CDATA[<p>Well, it is day 300 of my <a href="http://www.thattommyhall.com/2010/04/15/101-goals-100-day-update/">101 goals in 1001 days</a>. So as per meta-goal 101 &#8220;Do 100 day updates&#8221; here is a quick report on my progress.</p>
<p><strong>Completed</strong><br />
66 &#8211; Via Ferrata in Italy<br />
85 &#8211; Visit Pergamon Museum<br />
1 &#8211; Teetotalitarianism for 3 months<br />
2 &#8211; Cheeseless for 3 months<br />
11 &#8211; Reread all Dennett books<br />
15 &#8211; <a href="http://www.pgdp.net/c/">Proofread for Project Gutenberg</a><br />
53 &#8211; Make Jam<br />
86 &#8211; Give Carrie a British Museum Tour<br />
48 &#8211; Create a Backblaze storage pod<br />
100 &#8211; Set success criteria / progression metrics for each goal</p>
<p><strong>Changing</strong><br />
18 &#8211; Read epic literature<br />
A bit vague (in spite of meta-goal 100) and I have loads of reading goals, so I&#8217;m changing it to &#8220;Read Joyce&#8221;, in particular:
<ul>
<li><a href="http://en.wikipedia.org/wiki/Dubliners">Dubliners</a></li>
<li><a href="http://en.wikipedia.org/wiki/A_Portrait_of_the_Artist_as_a_Young_Man">A Portrait of the Artist as a Young Man</a></li>
<li><a href="http://en.wikipedia.org/wiki/Ulysses_%28novel%29">Ulysses</a></li>
<li><a href="http://en.wikipedia.org/wiki/Finnegans_Wake">Finnegans Wake</a></li>
</ul>
<p>26 &#8211; Scuba<br />
My ears are bad and I worry this may destroy them. I have not settled on a replacement goal yet.</p>
<p><strong>Sustained Effort &#8211; On Track</strong><br />
19 &#8211; Blog on average once a week<br />
50 &#8211; Move 10 people to FreeAgent<br />
68 &#8211; Complete Pimsleur Spanish<br />
88 &#8211; Go to the theatre on average once a month<br />
95 &#8211; Pay off all credit cards<br />
96 &#8211; Let loans run their course and don&#8217;t get any more<br />
101 &#8211; Do 100 day updates<br />
5 &#8211; Lose 2 stone<br />
13 &#8211; Release 303 books on bookcrossing.com<br />
76 &#8211; Do on average 1 Project Euler problem per week<br />
<img alt="" src="http://projecteuler.net/profile/thattommyhall.png" title="thattommyhall" class="alignnone" width="200" height="60" /></p>
<p><strong>Sustained Effort &#8211; Behind</strong><br />
8 &#8211; Read all the VSIs<br />
12 &#8211; Read all PG Wodehouse<br />
60 &#8211; Hike on average once a month<br />
79 &#8211; Raise £5005 for charity<br />
81 &#8211; Watch all TTC Art history DVDs<br />
90 &#8211; See all world heritage sites in the UK</p>
<p><strong>In Progress</strong><br />
91 &#8211; Memorise 10 poems<br />
92 &#8211; Read &#8220;The Ode Less Travelled&#8221;, do the exercises (but not share them!)<br />
72 &#8211; Read &#8220;Winning Ways&#8221;<br />
75 &#8211; Watch SICP, do exercises from book</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thattommyhall.com/2010/10/28/day-300-update/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

