Egypt Trip

December 16th 2010

I has been a crazy few weeks, after being in the UK only 2 days after the Egypt trip I went to Las Vegas with Forward and I am just getting my head back together now.

You know you are a huge geek when you take as much space for books as clothes.
Egypt2010-00.JPG

The trip was ace, a week on the M/S Hamees, Stopping at:
Valley of the Kings
Temple of Queen Hatshepsut
The Colossi of Memnon
Edfu Temple
Egypt2010-32.JPG
Kom Ombo, which I had not heard of but has some extraordinarily vibrant original colour remaining
Egypt2010-05.JPG
Abu Simbel
Egypt2010-26.JPG
Karnak Temple
Egypt2010-39.JPG
Luxor Temple (we had the place to ourselves at night, it was amazing)
Egypt2010-42.JPG

5 Nights in Movenpick El Gouna on the Red Sea and 2 nights in the Nile Palace

It was incredibly relaxing, nice to be incommunicado for 2 weeks and catch up on some reading and just mince around the beach. I managed to read Godel Escher Bach at last and also ticked off The Ode Less Travelled from my 101 Goals. It was weird not having my phone to distract me in spare moments, making me daydream more and think about people and events I have not thought about it ages, I should do it more often.

Pictures are on flickr

Posted by tom under 101 & archaeology & travel | No Comments »

Off To Egypt

November 21st 2010

I am off to Egypt for 2 weeks, a one week Nile cruise with Voyages Joules Verne starting at Luxor and working down to the Aswan High Dam via The Valley Of The Kings and ending in the Moevenpick Hotel in El Gouna. No Cairo, Pyramids or the Cairo Museum this trip, another time though.

Im particularly excited about Abu Simbel,, The Red Chapel of Hatshepsut, Temple of Amenhotep III among tons of others.

Also I am very much looking forward to No Phone and No Computers for the whole trip, the longest I will have been without either for for a good few years. I am taking the chance to catch up on some reading and take a break, I will have my Kindle with loads of books on it but am taking The Dots-and-boxes Game: Sophisticated Child’s Play (one of my 101 goals is to make a dots and boxes program), Gödel, Escher, Bach (another 101, started 3 times and never finished and it has intrigued me since my first year in university) and Winning Ways for your Mathematical Plays. Pen’n'paper geekery for the win!

Posted by tom under 101 & archaeology & jolly | No Comments »

Using R on Hadoop with Rhipe

November 20th 2010

I spent a while this week getting Rhipe, a java package that integrates the R environment with Hadoop, to work. Forward are pretty heavy users of Hadoop and it’s supporting ecosystem so R will be another way for the devs to interrogate the huge (and rapidly growing!) datasets we have.

Installing R
Adding the repositry
Create a new file at /etc/sources.list.d/R.list

#R repositry
deb http://rh-mirror.linux.iastate.edu/CRAN/bin/linux/ubuntu hardy/

(we are still using hardy, with the Cloudera packages)

Add the gpg keys for the repository

gpg --keyserver pgp.mit.edu --recv-key E2A11821
gpg -a --export E2A11821 | sudo apt-key add -

Install and update R
Easy:

$ sudo apt-get install r-base r-base-dev pkg-config littler
$ sudo R
> update.packages()

Set environment variables for Rhipe
Add to bottom of /etc/environment

HADOOP=/usr

create it for current session

$ export HADOOP=/usr

install protobuff

# wget http://protobuf.googlecode.com/files/protobuf-2.3.0.tar.bz2
# tar jxf protobuf-2.3.0.tar.bz2
# cd protobuf-2.3.0
# ./configure
# make
# make install
# ldconfig

install Rhipe

# wget http://www.stat.purdue.edu/~sguha/rhipe/dn/Rhipe_0.64.tar.gz
# R CMD INSTALL Rhipe_0.64.tar.gz

So all is well except that the test code here is a bit off.

For me today

> library(Rhipe)

Only works as root

It seems that

> rhwrite(list(1,2,3),"/tmp/x")

should be:

> rhwrite(list(1,2,3),"/tmp/x",1)

then

> rhread("/tmp/x")

works properly.

Also in the longer example

map <- expression({
  lapply(seq_along(map.values),function(r){
    x <- runif(map.values[[r]])
    rhcollect(map.keys[[r]],c(n=map.values[[r]],mean=mean(x),sd=sd(x)))
  })
})

## Create a job object
z <- rhmr(map, ofolder="/tmp/test", inout=c('lapply','sequence'),
          N=10,mapred=list(mapred.reduce.tasks=0),jobname='test')

## Submit the job
rhex(z)

## Read the results
res <- rhread('/tmp/test/p*')
colres  <- do.call('rbind', lapply(res,"[[",2))

colres
       n      mean        sd
 [1,]  1 0.4983786        NA
 [2,]  2 0.7683017 0.2937688
 [3,]  3 0.5936899 0.3425441
 [4,]  4 0.3699087 0.2666379
 [5,]  5 0.5179839 0.4060244
 [6,]  6 0.6278925 0.2952608
 [7,]  7 0.4920088 0.2785893
 [8,]  8 0.4592598 0.2674592
 [9,]  9 0.5734197 0.1928496
[10,] 10 0.4942676 0.2989538

Where line 16 has been changed from the original

res <- rhread('/tmp/test')

Thanks to Saptarshi Guha, the author of Rhipe for so quickly responding to my query in the group and also the authors of this discussion on setting up R in Ubuntu

Posted by tom under hadoop & r | 1 Comment »

Finding Primes In SICP

November 2nd 2010

I was reading SICP over lunch and found this lovely footnote on probabilistic methods for deciding if a number is prime. (it is #47)

Numbers that fool the Fermat test are called Carmichael numbers, and little is known about them other than that they are extremely rare. There are 255 Carmichael numbers below 100,000,000. The smallest few are 561, 1105, 1729, 2465, 2821, and 6601. In testing primality of very large numbers chosen at random, the chance of stumbling upon a value that fools the Fermat test is less than the chance that cosmic radiation will cause the computer to make an error in carrying out a “correct” algorithm. Considering an algorithm to be inadequate for the first reason but not for the second illustrates the difference between mathematics and engineering.

Posted by tom under lisp & SICP | No Comments »

Day 300 Update

October 28th 2010

Well, it is day 300 of my 101 goals in 1001 days. So as per meta-goal 101 “Do 100 day updates” here is a quick report on my progress.

Completed
66 – Via Feratta in Italy
85 – Visit Pergamon Museum
1 – Teetotalitarianism for 3 months
2 – Cheeseless for 3 months
11 -Reread all Dennett books
15 -Proofread for Project Guttenburg
53 – Make Jam
86 – Give Carrie a British Museum Tour
48 – Create a Backblaze storage pod
100 Set success criteria / progression metrics for each goal

Changing
18 – Read epic literature
A bit vague (in spite of meta-goal 100) and I have loads of reading goals so im changing it to “Read Joyce”, in particular:

26 -Scuba
My ears are bad, I worry this may destroy them, not settled on a replacement goal yet.

Sustained Effort – On Track
19 – Blog on average once a week
50 – Move 10 people to FreeAgent
68 – Complete Pimsleur Spanish
88 – Go to the theatre on average once a month
95 – Pay off all credit cards
96 – Let loans run course and dont get any more
101 – Do 100 day updates
5 – Lose 2 stone
13 – Release 303 books on bookcrossing.com
76 -Do on average 1 Project Euler problem per week

Sustained Effort – Behind
8 – Read all the VSIs
12 – Read all PG Wodehouse
60 – Hike on average once a month
79 – Raise £5005 for charity
81 – Watch all TTC Art history DVDs
90 – See all world heritage sites in the UK

In Progress
91 – Memorise 10 poems
92 – Read “An Ode Less Travelled”, do the exercises (but not share them!)
72 – Read “Winning Ways”
75 – Watch SICP, do exercises from book

Posted by tom under 101 & Life | No Comments »

Borges on Poetry

October 26th 2010

I really enjoy the work of Jorge Luis Borges. If you have not heard of him, check out In Our Time – Borges.

I found some audio of some wonderful lectures on poetry he gave at Harvard in the late 60s. The interesting thing about them is that he composes them from memory as he was blind by that point. He demonstrates the breadth of his reading with examples from Middle German, Old Norse, Spanish and English, tracing down the sources of some of the quotes was quite a challenge as they come thick and fast during the lectures. I have listened to them a lot and he has become like an old friend, I can hear his voice when I read his essays or stories.

My two favorite quotes are:

Awake! for Morning in the Bowl of Night
Has flung the Stone that puts the Stars to Flight:
And Lo! the Hunter of the East has caught
The Sultan’s Turret in a Noose of Light.

Dreaming when Dawn’s Left Hand was in the Sky
I heard a Voice within the Tavern cry,
“Awake, my Little ones, and fill the Cup
Before Life’s Liquor in its Cup be dry.”

From The Rubaiyat Of Omar Khayyam, translated by Edward Fitzgerald and

“Great wine like blood from Burgundy,
Cloaks like the clouds from Tyre,
And marble like solid moonlight,
And gold like frozen fire.

“Smells that a man might swill in a cup,
Stones that a man might eat,
And the great smooth women like ivory
That the Turks sell in the street.”

from G. K. Chesterton’s The Ballad Of The White Horse (which was actually the first poem of this length I have ever read, I got it on my kindle from Project Gutenberg)

Posted by tom under 101 & poetry | No Comments »

Creating Something (In fact everything) out of nothing, Twice

October 25th 2010

I am reading through SICP with some of the guys in work and got to a bit that fascinated me and reminded me of my previous life as a mathematician.

The basic data structure in Scheme is the cons cell, essentially representing a ordered pair. So

> (cons 2 3)
(2 . 3)

creates the pair (2 . 3)

Lets call that a,

> (define a (cons 2 3))

There are 2 other functions that act on these cons cells, car returns the first, cdr returns the second

> (car a)
2

> (cdr a)
3

You can build lists by chaining these cons cells and unpick the contents by using car and cdr

> (define b (cons 3 a))
> b
(3 2 . 3)

> (car b)
3

> (cdr b)
(2 . 3)

> (cdr (cdr b))
3

The interesting bit is at p91 of the book where it breaks down the distinction between procedures and data

(define (cons x y)
  (define (dispatch m)
    (cond ((= m 0) x)
          ((= m 1) y)
          (else (error "arg no 0 or 1" m))))
  dispatch)

So here cons is a function of x and y as you would expect. The trick is that its return value is a function (internally called dispatch) of one argument that returns x if passed 0 and y if passed 1, otherwise it throws an error.

Now car and cdr can be defined

(define (car z) (z 0))

(define (cdr z) (z 1))

So car takes z as an argument and tries to apply z to 0, while cdr applys it to 1. While car and cdr will accept any function as an argument it makes sense if z is the cons of something as before.

As an example

(car (cons 1 2))

car will pass 0 to (cons 1 2)
(cons 1 2) returns 1,
similarly

(cdr (cons 1 2))

returns 2 as cdr will pass 1 to (cons 1 2)

This is not how the interpreter actually works, just the introduction to the idea of code as data (as code etc) in Lisp. This is the feature that sets it aside from other languages, in particular code is not parsed, jut loaded up as it is a valid data structure itself and allows for the powerful macro systems usually seen in Lisps.

Now for twice, it reminded me of some Set Theory, in particular forming the Natual Numbers in terms of sets.

We define 0 to be the empty set
.

Then we can define 1 to be

(this has 1 element – we say has cardinality 1)

Then 2 is

Notice this has cardinality 2 and is formed as the set containing all the previous numbers.

So we have all positive whole numbers. You can then define arithmetic and number theory in terms of these sets.

See this site for a nice introduction to set theory (it also begins by defining ordered pairs)

Loving SICP, got to do some Induction for the first time in ages last night!

Posted by tom under lisp & SICP | 1 Comment »

Krapp’s Last Tape

October 23rd 2010

A few weeks ago I went to see Krapp’s Last Tape at the Duchess Theatre in London with some friends. I use “with” in a weak sense as as ever I was running late and had to get a jog on to arrive something like on time, just arriving as it started and pleading to be let in. I came in, trying not to pant from running, to a very oppressive silence; the entire audience watching Michael Gambon slumped in a chair and just sat at the first free seat I saw. The Silence went on for quite some time and really built up a lot of tension, so much in me that I did not take my coat or jacket off throughout the entire performance. I loved the premise: a (senile?) old man listening to tapes of his younger self that he recorded on his birthday, as there is only one actor he is engaged in dialogue with himself via the tapes he listens to. I wont pretend to have any more insight than the Wikipedia article I’ve already linked to but the other thing that resonated with me was the shifting accent of Krapp, sometimes sounding quite RP, sometimes Irish and sometimes just a lilt. A wonderful production, go see it if you can.

Posted by tom under Life & theatre | No Comments »

Infinite Prime Number Generator In Python

October 18th 2010

I was looking at this article (The Genuine Sieve of Eratosthenes / PDF) about a common functional prime number generator that is mistakenly called the Sieve of Eratosthenes.

It is quite complicated, section 2 deals with the performance of the naive algorithm and 3 implements the sieve properly. I could not quite grok the Haskell but the following quote helped me know what was going on

Whereas the original algorithm crosses off all multiples of a prime at once, we perform these “crossings off” in a lazier way: crossing off just-in-time. For this purpose, we will store a table in which, for each prime p that we have discovered so far, there is an “iterator” holding the next multiple of p to cross off. Thus, instead of crossing off all the multiples of, say, 17, at once (impossible, since there are infinitely many for our limit-free algorithm), we will store the first one (at 17 × 17; i.e., 289) in our table of upcoming composite numbers. When we come to consider whether 289 is prime, we will check our composites table and discover that it is a known composite with 17 as a factor, remove 289 from the table, and insert 306 (i.e., 289+17). In essence, we are storing “iterators” in a table keyed by the current value of each iterator.

Here it is using a dictionary for each composite number with a list of increments it was reached by:

from itertools import count

def gen_primes():
    yield 2
    def update_composites(number,increment):
        try:
            composites[number] += [increment]
        except:
            composites[number] = [increment]

    composites = {4:[2]}

    c = count(3)
    while 1:
        #print composites
        next = c.next()
        smallest = min(composites.keys())
        incs = composites[smallest]
        del composites[smallest]
        for inc in incs:
            update_composites(smallest+inc,inc)
        if next < smallest:
            yield next
            c.next()
            update_composites(next**2,next)

g=gen_primes()

for i in range(100):
    print g.next()

(On Github)

They mention a speedup moving to a heapqueue as the data structure (as we just take the lowest each lookup) so I implemented that too, here is the code

from itertools import count
from heapq import heappush,heappop

def gen_primes():
    yield 2

    def update_composites(number,increment):
        heappush(composites,(number,increment))
    composites = [(4,2)]

    c = count(3)

    while 1:
        #print composites
        smallest,increment = heappop(composites)
        update_composites(smallest+increment,increment)
        if composites[0][0] == smallest:
            continue
        next = c.next()
        if next < smallest:
            yield next
            c.next()
            update_composites(next**2,next)

g=gen_primes()

for i in range(100):
    print g.next()

(On github)

Posted by tom under euler & Mathematics & Python | No Comments »

Learning Ruby: methods vs procs (or Ruby vs Python?)

October 4th 2010

I have been meaning to learn ruby for a while and the place I am working now uses a lot so I had another look at it. I read Learn To Program, a simple but good book and found the bit on blocks and procs etc pretty good and wanted to see if I could do the stuff in Python as well. Python has anonymous “lambda” functions but they are limited to one line a subset of the syntax which is a bit annoying sometimes. My worry with methods in Ruby is that they are not first class, I think because you can omit parenthesis and so you have no way of referring to them without invoking them.

I remembered this while reading the SICP book, the question was about the difference between this program in applicative and normal order evaluation

(define (p) (p))

(define (test x y)
  (if (= x 0)
      0
      y))

It rang a bell as (define (p) p) does not go into an infinite loop if you invoke p. In lisp (p) calls the procedure p with no arguments whereas p is just a reference to the function. In python someinstance.method refers to the method, someinstance.method() calls it, Ruby seems to need Proc objects to get around this (IMHO as a beginner!, see the end for John Leach’s lovely response via email at the time)

I redid all the examples from the book in Python

Eg 1
Ruby

def maybe_do some_proc
  if rand(2) == 0
    some_proc.call
  end
end

def twice_do some_proc
  some_proc.call
  some_proc.call
end

wink = Proc.new do
  puts '<wink>'
end

glance = Proc.new do
  puts '<glance>'
end

Python

import random

def maybe_do(some_proc):
    if random.choice(range(2)) == 0:
        some_proc()

def twice_do(some_proc):
    some_proc()
    some_proc()

def wink():
    print 'wink'

def glance():
    print 'glance'

for i in range(5):
    print 'running for i=',i
    maybe_do(wink)

Eg2
Ruby

def do_until_false first_input, some_proc
  input = first_input
  output = first_input
  while output
    input = output
    output = some_proc.call input
  end
  input
end

build_array_of_squares = Proc.new do |array|
  last_number = array.last
  if last_number <= 0
    false
  else
    # Take off the last number...
    array.pop
    # ...and replace it with its square...
    array.push last_number*last_number
    # ...followed by the next smaller number.
    array.push last_number-1
  end
end

always_false = Proc.new do |just_ignore_me|
  false
end

puts do_until_false([5], build_array_of_squares).inspect

yum = 'lemonade with a hint of orange blossom water'
puts do_until_false(yum, always_false)

Python

def do_untill_false(first_input, some_proc):
    input = first_input
    output = first_input
    while output:
        input = output
        output = some_proc(input)
    return input

def build_array_of_squares(array):
    last_number = array.pop()
    if last_number <= 0:
        return False
    else:
        array.append(last_number * last_number)
        array.append(last_number - 1)
        return array

def always_false(just_ignore_me):
    return False

def just_ignore_me():
    pass

print do_untill_false([5], build_array_of_squares)
yum = 'lemonade with a hint of orange blossom water'
print do_untill_false(yum, always_false)

Eg3
Ruby

def compose proc1, proc2
  Proc.new do |x|
    proc2.call(proc1.call(x))
  end
end

square_it = Proc.new do |x|
  x*x
end

double_it = Proc.new do |x|
  x+x
end

double_then_square = compose double_it, square_it 

square_then_double = compose square_it, double_it

puts double_then_square.call(5) puts square_then_double.call(5)

Python

def compose(proc1,proc2):
    def composed(x):
        return proc2(proc1(x))
    return composed

def square_it(x):
    return x**2

def double_it(x):
    return x*2

double_then_square = compose(double_it,square_it)
square_then_double = compose(square_it,double_it)

print double_then_square(5)
print square_then_double(5)

Eg4

class Array
  def each_even(&was_a_block__now_a_proc)
    # We start with "true" because
    # arrays start with 0, which is even.
    is_even = true
    self.each do |object|
      if is_even
        was_a_block__now_a_proc.call object
      end
      # Toggle from even to odd, or odd to even.
      is_even = !is_even
    end
  end
end

fruits = ['apple', 'bad apple', 'cherry', 'durian']
fruits.each_even do |fruit|
  puts "Yum! I just love #{fruit} pies, don't you?"
end

[1, 2, 3, 4, 5].each_even do |odd_ball|
  puts "#{odd_ball} is NOT an even number!"
end

Python

class MyArray(list):
    def each_even(self):
        for i in range(len(self)):
            if i % 2 == 0:
                yield self[i]

fruits = MyArray(['apple', 'bad apple', 'cherry', 'durian'])

for fruit in fruits.each_even():
    print 'yum! I love %s pies, dont you?' % fruit

for odd_ball in MyArray([1,2,3,4,5]).each_even():
    print '%s is NOT an even number' % odd_ball

Eg5
Ruby

def profile block_description, &block
  start_time = Time.new
  block.call
  duration = Time.new - start_time
  puts "#{block_description}: #{duration} seconds"
end

profile '25000 doublings' do
  number = 1
  25000.times do
    number = number + number
  end

  puts "#{number.to_s.length} digits"
  # That's the number of digits in this HUGE number.
end

profile 'count to a million' do
  number = 0 1000000.times do
    number = number + 1
  end
end

Python

def profile(description, function):
    import time
    start_time = time.time()
    function()
    duration = time.time() - start_time
    print '%s: %s seconds' % (description, duration)
    print function.__name__
    print 'see, "function.__name__" can be used in place of description in python'

def count_to_a_million():
    number = 0
    for i in range(1000000):
        number = number+1

profile('count to a million', count_to_a_million)

def profiled(function):
    def new_function(*args, **kwargs):
        import time
        start_time = time.time()
        result = function(*args, **kwargs)
        print function.__name__, 'took', time.time() - start_time, 'secs'
        return result
    return new_function

@profiled
def count_to_a_million_again():
    number = 0
    for i in range(1000000):
        number = number + 1

count_to_a_million_again()

This uses decorators, a nice Python feature that uses higher order functions (and the fact functions are first class in python).

In Conclusion
IMHO, at this point in my experience of Ruby, with all the disclaimers about my non expert status etc.
Like:

  • No restriction on complexity of anonymous functions

Dont Like:

  • Methods being different from Procs/Blocs, non-uniform syntax
  • Leaving out parenthesis (though I await DSL goodness later!)
  • “end” everywhere (I know the indentation thing in python is contentious!)

John Leach’s thought provoking tuppence

Young padawan, you look but you do not see, you will learn

or rather

Yeah, but blocks are closures Tom

Tom goes to google and comes back with http://www.artima.com/intv/closures2.html
Matz

I think it’s not that useful in the daily lives of programmers. It doesn’t matter that much.

Then john came back with

I can think of one example in Rails right away where it's useful, transactions:

r = Record.new params[:record]

Record.transaction do
 r.save
 RecordLog.create(:text => "created a new record")
end


that code takes some input from a browser (in params), instantiates a
new Record object, then writes it and a RecordLog entry to the database
atomically.
All the Record.transaction does is sends a BEGIN to the db server,
executes the block, and sends a COMMIT (or a ROLLBACK if the block
errors for any reason).
The block needs access to the r object. We could have created that
inside the block, but then it'd need access to the params object. So
without real closure support, Record.transaction would have had to
support passing in arbitrary variables.
Remember, that interview with Matz was in 2003 - more people are using
Ruby for more things nowadays, for uses beyond the imagination of it's
creator I'm sure :)

Final Thoughts
I am waiting to be blown away by Ruby and Rails

Posted by tom under Python & Ruby | No Comments »

« Prev - Next »