I started writing a contiguous read/write filesystem the other day… - CERisE's Testing for L


April 10th, 2010


02:25 pm


I started writing a contiguous read/write filesystem the other day for fun. I want to use it to benchmark against other filesystems to see how much the placement of a file matters on current storage devices.

As part of this test, I started writing a userspace utility to figure out which blocks are closest to which other blocks. The first step of that was to try to determine block sizes statistically. And that's when everything came to a grinding halt. No matter what I did, there seemed to be no strict correlation between the size of a request and the speed at which it was filled.

I started off using gettimeofday(); I've since switched to rdtsc. I started with single reads; now I'm doing 100 and averaging the times. I started doing reads from the same sector; now I'm doing reads from random sectors.
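
Roughly, the timing loop looks something like this (a simplified sketch, not the actual code linked below; time_reads, NRUNS, and nsectors are stand-in names):

#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Read the time-stamp counter. */
static inline uint64_t rdtsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t)hi << 32) | lo;
}

/* Average cycles to read `size` bytes from `fd` at random sector offsets.
   With O_DIRECT, `buf` must be suitably aligned (e.g. via posix_memalign). */
uint64_t time_reads(int fd, void *buf, size_t size, uint64_t nsectors)
{
    const int NRUNS = 100;
    uint64_t total = 0;

    for (int i = 0; i < NRUNS; i++) {
        off_t off = (off_t)(rand() % nsectors) * 512;  /* random sector */
        lseek(fd, off, SEEK_SET);

        uint64_t start = rdtsc();
        read(fd, buf, size);
        total += rdtsc() - start;
    }
    return total / NRUNS;
}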

I've switched through a number of techniques for disabling caching. I open the device with O_DIRECT, set block readahead and filesystem readahead to 0 using ioctl()s, use an ioctl() to flush buffers, call sync(), and write 3 to /proc/sys/vm/drop_caches.
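
For reference, that setup looks roughly like this (error checking omitted; open_uncached is just an illustrative name, and writing to drop_caches requires root):

#define _GNU_SOURCE           /* for O_DIRECT */
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>         /* BLKRASET, BLKFRASET, BLKFLSBUF */

int open_uncached(const char *dev)
{
    int fd = open(dev, O_RDONLY | O_DIRECT);  /* bypass the page cache */

    ioctl(fd, BLKRASET, 0);   /* block device readahead -> 0 */
    ioctl(fd, BLKFRASET, 0);  /* filesystem readahead -> 0 */
    ioctl(fd, BLKFLSBUF, 0);  /* flush this device's buffers */
    sync();                   /* flush dirty data everywhere */

    /* Equivalent of `echo 3 > /proc/sys/vm/drop_caches`:
       drop the pagecache plus dentries and inodes. */
    int dc = open("/proc/sys/vm/drop_caches", O_WRONLY);
    write(dc, "3", 1);
    close(dc);

    return fd;
}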

Presently, it looks like this:
optime:1407374890344 for size 1
optime:6778 for size 2
optime:6950 for size 4
optime:6952 for size 8
optime:6699 for size 16
optime:6765 for size 32
optime:7063 for size 64
optime:7030 for size 128
optime:6720 for size 256
optime:6713 for size 512
optime:6696 for size 1024
optime:7043 for size 2048
optime:7190 for size 4096
optime:7006 for size 8192
optime:7137 for size 16384
optime:6586 for size 32768
optime:6977 for size 65536
optime:6846 for size 131072
optime:6819 for size 262144
optime:6718 for size 524288
optime:6930 for size 1048576
optime:6797 for size 2097152
optime:7254 for size 4194304
optime:6922 for size 8388608
optime:7412 for size 16777216
optime:7838 for size 33554432
optime:6854 for size 67108864
optime:7125 for size 134217728
optime:6937 for size 268435456
optime:6753 for size 536870912
optime:6988 for size 1073741824
optime:7051 for size 2147483648
optime:6869 for size 4294967296
optime:6687 for size 8589934592
optime:6645 for size 17179869184
optime:6758 for size 34359738368
optime:8503 for size 68719476736
optime:6971 for size 137438953472
optime:6802 for size 274877906944

(Times are in TSC cycles on a 2 GHz Core 2 Duo; sizes are in bytes.)

It's interesting that size 1 accesses always end up taking a huge number of cycles. I haven't figured that one out yet.

Code available here.

I'm not sure why I'm seeing this effect or how I could avoid OS effects.


EDIT: And then I realized that I should really watch my return values from read() when the request sizes get above SSIZE_MAX. That's what I get for being lazy.
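
The gist of the fix is roughly this (a sketch only; checked_read is an illustrative name, and the real change may differ):

#include <limits.h>
#include <stdio.h>
#include <unistd.h>

/* Clamp the request and check what read() actually returned,
   instead of assuming the whole `size` was transferred. */
ssize_t checked_read(int fd, void *buf, size_t size)
{
    if (size > SSIZE_MAX)     /* read() results are implementation-defined above this */
        size = SSIZE_MAX;

    ssize_t got = read(fd, buf, size);
    if (got < 0)
        perror("read");
    else if ((size_t)got < size)
        fprintf(stderr, "short read: %zd of %zu bytes\n", got, size);
    return got;
}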


Comments:


From:knight3d
Date:April 11th, 2010 09:07 pm (UTC)
... of course, this is only relevant for as long as we're using rotating discs with heads.

but cool. What does fixing your code show?
From:testing4l
Date:April 11th, 2010 09:17 pm (UTC)
... of course, this is only relevant for as long as we're using rotating discs with heads.

Not so! Even flash memory isn't truly random access in the sense of instantaneous access!

but cool. What does fixing your code show?

Annoyingly similar constant times of access. I believe the problem may be a lazy transition of memory from the kernel to userspace.
From:knight3d
Date:April 11th, 2010 09:24 pm (UTC)
Not so! Even flash memory isn't truly random access in the sense of instantaneous access!

True, but you're not exactly worrying about where on the disk everything is. Pretty much all data will take a more consistent amount of time to access (if I understand it correctly.)

Still, you'd expect to see larger memory sizes take more time to access. Maybe it has to do with the disk drive's cpu's ability to multi-task the search.
From:testing4l
Date:April 11th, 2010 09:36 pm (UTC)
Pretty much all data will take a more consistent amount of time to access (if I understand it correctly.)

In the case of flash memory, you're correct. Still, improvements such as multiple controllers have started to appear in flash memory. The end result is that it'll be possible to handle simultaneous requests to different banks -- and locality of access becomes something to optimize (or, more appropriately, to pessimize) for all over again.

MRAM has always seemed more promising to me, even if it hasn't really made an appearance in the 10 or so years since I worked with it.

Still, you'd expect to see larger memory sizes take more time to access. Maybe it has to do with the disk drive's cpu's ability to multi-task the search.

That's why I made the reads from random starting points.

It doesn't seem to be a lazy transition of memory -- access times to the read memory seem pretty quick. I'm investigating the lseek() right before it. It's possible that's giving the kernel a head start.
From:babe_of_beyazit
Date:April 24th, 2010 09:00 am (UTC)
I don't think I ever understand these posts, but I would like to.
