I started writing a contiguous read/write filesystem the other day for fun. I want to use it to benchmark against other filesystems to see how much the placement of a file matters on current storage devices.
As part of this test, I started writing a userspace utility to figure out which blocks are closest to which other blocks. The first step of that was to try and determine block sizes by statistics. And that's when everything came to a grinding halt. No matter what I did, there seemed to be no strict correlation between the size of a request and the speed at which it was filled.
I started off using gettimeofday(), but I've since switched to rdtsc. I started with single reads; now I do 100 and average the times. I started by reading from the same sector; now I read from random sectors.
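For reference, the measurement loop described above might look roughly like the sketch below. This is not the linked code: `ticks_now` and `avg_read_ticks` are hypothetical names, the clock_gettime() fallback for non-x86 machines is my own addition, and rand() is only a stand-in for a better random offset source. Reads that fail or come back short are counted separately instead of being averaged in.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   /* __rdtsc(), GCC/Clang on x86 */
#endif

/* Timestamp: TSC cycles on x86, nanoseconds elsewhere (assumed fallback). */
static uint64_t ticks_now(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}

/* Hypothetical helper: mean ticks per successful pread() of `size` bytes,
 * issued at `iters` random offsets that are multiples of `align` and lie
 * within the first `dev_bytes` bytes of `fd`. Failed or short reads are
 * counted in *failed and excluded from the mean.
 * Returns 0 if no read succeeded. */
uint64_t avg_read_ticks(int fd, void *buf, size_t size, uint64_t dev_bytes,
                        size_t align, int iters, int *failed)
{
    assert(buf != NULL && failed != NULL);
    uint64_t total = 0;
    int ok = 0;

    *failed = 0;
    if (align == 0 || dev_bytes < size) {
        *failed = iters;
        return 0;
    }
    /* Number of aligned start offsets that leave room for a full read. */
    uint64_t slots = (dev_bytes - size) / align + 1;

    for (int i = 0; i < iters; i++) {
        /* rand() covers only RAND_MAX + 1 distinct slots; fine for a sketch. */
        off_t off = (off_t)(((uint64_t)rand() % slots) * align);
        uint64_t t0 = ticks_now();
        ssize_t n = pread(fd, buf, size, off);
        uint64_t t1 = ticks_now();

        if (n == (ssize_t)size) {
            total += t1 - t0;
            ok++;
        } else {
            (*failed)++;
        }
    }
    return ok ? total / (uint64_t)ok : 0;
}
```

A call such as `avg_read_ticks(fd, buf, 4096, dev_bytes, 512, 100, &failed)` would give the mean for 4096-byte reads; a nonzero `failed` means some of the reads never actually transferred the requested data.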
I've switched through a number of techniques for disabling caching. I open the device with O_DIRECT, set block readahead and filesystem readahead to 0 using ioctl()s, use an ioctl() to flush buffers, call sync(), and write 3 to /proc/sys/vm/drop_caches.
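Spelled out, those steps could be sketched as follows on Linux. The helper names `open_uncached` and `drop_caches` are made up for illustration, and I'm assuming the ioctls in question are BLKRASET, BLKFRASET and BLKFLSBUF from <linux/fs.h>; the post itself doesn't name them.

```c
#define _GNU_SOURCE          /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>        /* BLKRASET, BLKFRASET, BLKFLSBUF */
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: open a block device for reads that bypass the page
 * cache, with block and filesystem readahead set to 0 and the device's
 * buffers flushed. Returns the fd, or -1 if any step fails. */
int open_uncached(const char *dev)
{
    assert(dev != NULL);
    int fd = open(dev, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;

    if (ioctl(fd, BLKRASET, 0UL) < 0 ||   /* block readahead = 0 */
        ioctl(fd, BLKFRASET, 0UL) < 0 ||  /* filesystem readahead = 0 */
        ioctl(fd, BLKFLSBUF, 0) < 0) {    /* flush the buffer cache */
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: sync(), then write 3 to /proc/sys/vm/drop_caches
 * (page cache plus dentries and inodes). Needs root.
 * Returns 0 on success, -1 on failure. */
int drop_caches(void)
{
    sync();
    int fd = open("/proc/sys/vm/drop_caches", O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, "3\n", 2);
    close(fd);
    return n == 2 ? 0 : -1;
}
```

One thing to keep in mind with this setup: O_DIRECT typically imposes alignment restrictions on the buffer address, the file offset and the request length, so a read that violates them can fail right away rather than reach the disk.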
Presently, it looks like this:
optime:1407374890344 for size 1
optime:6778 for size 2
optime:6950 for size 4
optime:6952 for size 8
optime:6699 for size 16
optime:6765 for size 32
optime:7063 for size 64
optime:7030 for size 128
optime:6720 for size 256
optime:6713 for size 512
optime:6696 for size 1024
optime:7043 for size 2048
optime:7190 for size 4096
optime:7006 for size 8192
optime:7137 for size 16384
optime:6586 for size 32768
optime:6977 for size 65536
optime:6846 for size 131072
optime:6819 for size 262144
optime:6718 for size 524288
optime:6930 for size 1048576
optime:6797 for size 2097152
optime:7254 for size 4194304
optime:6922 for size 8388608
optime:7412 for size 16777216
optime:7838 for size 33554432
optime:6854 for size 67108864
optime:7125 for size 134217728
optime:6937 for size 268435456
optime:6753 for size 536870912
optime:6988 for size 1073741824
optime:7051 for size 2147483648
optime:6869 for size 4294967296
optime:6687 for size 8589934592
optime:6645 for size 17179869184
optime:6758 for size 34359738368
optime:8503 for size 68719476736
optime:6971 for size 137438953472
optime:6802 for size 274877906944
(Times are in TSC cycles on a 2 GHz Core 2 Duo; sizes are in bytes.)
Interestingly, size 1 accesses always end up taking a huge number of cycles. I haven't figured that one out yet.
Code available here.
I'm not sure why I'm seeing this effect or how I could avoid OS effects.
EDIT: And then I realized that I should really check the return value of read() once the request size gets above SSIZE_MAX. That's what I get for being lazy.
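A minimal sketch of the kind of checking the EDIT is talking about. `read_full` is a hypothetical wrapper, not something from the linked code: it rejects requests larger than SSIZE_MAX up front, loops over short reads, and hands errors back to the caller instead of silently timing a call that did nothing.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>   /* SSIZE_MAX */
#include <unistd.h>

/* Hypothetical helper: read up to `want` bytes from `fd`, retrying on
 * short reads and EINTR. Returns the number of bytes read (less than
 * `want` only at end of file), or -1 with errno set on error. */
ssize_t read_full(int fd, void *buf, size_t want)
{
    assert(buf != NULL || want == 0);

    /* A count above SSIZE_MAX cannot be represented in the return value. */
    if (want > (size_t)SSIZE_MAX) {
        errno = EINVAL;
        return -1;
    }

    size_t done = 0;
    while (done < want) {
        ssize_t n = read(fd, (char *)buf + done, want - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted before any data: retry */
            return -1;         /* real error, errno set by read() */
        }
        if (n == 0)
            break;             /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

In a benchmark, a timing would only be recorded when `read_full(fd, buf, size) == (ssize_t)size`; anything else means the request was not actually served in full.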