Everything is a Ghetto

While reading this controversial link bait, consider buying my product/service

How Many IOPS Can a Single Disk Provide?

[edit] I wish I could retract my comment about adding the two times. Once the seek is done, you still need to wait for the platter to spin into place, on average half a turn (the latency time). The two therefore do happen in sequence! (If it were not for those pesky disks always spinning, I’d have been right!) Lesson learned: don’t let intuition lead you astray, don’t blog in haste, and realise that sometimes oft-repeated advice is true. I am keeping the post up out of intellectual honesty, but will be blogging furiously to get it off the front page. [/edit]

I have just read an article on roughly how many IOPS you can expect from a single disk and encountered what I consider to be a frequently repeated mistake in the calculation.

Before I begin, I want to point out that this is only an approximation, and caching in enterprise storage systems perhaps makes it a moot point anyway.

The article is here if you want to go and see it. He says there are 3 factors that influence the number of random IOPS you can do with one disk (the assumption throughout is that the IOs are random):

  • Rotational speed

  • Average latency

  • Average seek time

and I agree, but would like to point out that the first two are related:

Average latency = how long 1/2 a turn takes

So, a 15000 RPM disk:

  • Turns 15000/60 = 250 times a second

  • One turn takes 1000/250 = 4ms

  • 1/2 a turn takes 2ms (your average latency)

And let's say, for argument's sake, your seek time is 5ms.
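The arithmetic in the list above can be sketched in a few lines of Python. This is only an illustration of the half-a-turn rule; the function name `rotational_latency_ms` is made up for this example, and the 7200 and 10000 RPM values are just other common spindle speeds, not figures from the article.

```python
def rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: the time for half a revolution."""
    revolutions_per_second = rpm / 60.0                 # 15000 RPM -> 250 turns/s
    ms_per_revolution = 1000.0 / revolutions_per_second  # 250 turns/s -> 4 ms/turn
    return ms_per_revolution / 2.0                       # half a turn on average


if __name__ == "__main__":
    for rpm in (7200, 10000, 15000):
        print(f"{rpm} RPM -> {rotational_latency_ms(rpm):.2f} ms average latency")
```

For the 15000 RPM disk this gives the same 2ms as the hand calculation.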

How do you work out IOPS (IO operations per second)? There are 1000ms in a second, so

IOPS = 1000 / t

where t is the average time in ms taken to do an IO operation.

For example: if the average IO takes 5ms, you would have 200 IOPS, as you could do 200 of them in 1 second (200 x 5 = 1000).
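As a quick sanity check, the conversion from average IO time to IOPS can be written as a one-line function. This is a minimal sketch; the name `iops_from_io_time` is hypothetical and exists only for this example.

```python
def iops_from_io_time(avg_io_time_ms: float) -> float:
    """IOs per second, given the average time one IO takes in milliseconds."""
    if avg_io_time_ms <= 0:
        raise ValueError("average IO time must be positive")
    return 1000.0 / avg_io_time_ms  # 1000 ms in a second


if __name__ == "__main__":
    print(iops_from_io_time(5.0))  # the 5ms example from the text
```

With t = 5ms this reproduces the 200 IOPS figure above.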

EVERYONE AGREES UP TO HERE

**What is more important, latency or seek time?**

Well, you have to do both: wait for the disk to spin into position and move the head. These two things happen in parallel and both need to happen, so I would say your average IO takes the worse of the two, i.e.

t = max{seek time, average latency}

I have seen people say that

t = seek time

which matches my thoughts with all disks I have ever seen.

Or that t = latency, as a maximum, since you can get heads to match any latency you like, but it is expensive. (Jeff Bonwick says so here.)

The thing that I hate seeing is:

t = latency + seek time

By adding those two times together, you are saying that your average IO time is the time it takes to do both bits added together. The only physical interpretation of this is that they happen one after the other with no overlap, which is clearly not true.
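To see how far apart the two candidate formulas land, here is a small sketch that plugs the example figures from this post (5ms seek, 2ms average latency) into both. The function names are made up for this illustration; "overlapped" is the max{} model and "sequential" is the sum model.

```python
def iops_overlapped(seek_ms: float, latency_ms: float) -> float:
    """Model where seek and rotation overlap: t = max{seek time, latency}."""
    return 1000.0 / max(seek_ms, latency_ms)


def iops_sequential(seek_ms: float, latency_ms: float) -> float:
    """Model where rotation follows the seek: t = latency + seek time."""
    return 1000.0 / (seek_ms + latency_ms)


if __name__ == "__main__":
    seek_ms, latency_ms = 5.0, 2.0  # example figures used earlier in the post
    print(f"max model: {iops_overlapped(seek_ms, latency_ms):.0f} IOPS")
    print(f"sum model: {iops_sequential(seek_ms, latency_ms):.0f} IOPS")
```

For these figures the max model gives 200 IOPS and the sum model roughly 143, so the choice of formula changes the estimate by a little under a third.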

This fallacy is made explicit here on ZDNet:

Since the overall access time is determined by the sum of the average rotational latency (2ms) and the average seek time (3.7ms), this high-end 15000 RPM hard drive has an average access time of 5.7 milliseconds.

Why the sum?

It comes up unchallenged in Duncan's otherwise excellent article about the RAID IOPS penalty (interestingly, he references the same article), saying:

In short; It is based on “average seek time” and the half of the time a single rotation takes. These two values added up result in the time an average IO takes.

Again, why the sum?

I think it does not really make all that much difference, so long as you have a rule of thumb for roughly how many IOPS a disk has and understand how combining disks in RAID impacts it.
