Friday, January 20. 2012
Rusty Russell & Matt Evans
Three Cool Projects
- Spark - command-line tool that generates sparklines.
- Plover - An open-source stenography alternative.
- Homebrew Cray-1A http://chrisfenton/homebrew-cray-1a/
In The Beginning
- “The year was 1976, the hair was long, the shoes were tall.”
- It’s a multi-user machine, it has two teletype machines in front of it!
- Bring on the PDP-11 simulator!
Comparisons
- cat, grep, and ls are the punching bags.
- “Bigger is better. grep is 20 times bigger.”
- “cd only came in in V7.” “I edited the shell so I could have cd.”
- cat was written in assembler in 1976.
- In V6 arguments were in-line.
- The first thing you notice is that they are memory-conscious. Rusty points out, though, that they don’t bother with system calls; they also just use assembler because it’s natural.
- Rusty re-implemented cat in C, keeping the same bugs and behaviour, and it was only twice as big. Modern cat is big in part because of more features, error messages, and so on.
- We pay a 30% memory penalty if we use -O2 instead of -Os.
- But -Os is slower by about 6% for these simple utilities.
- Automated runtime analysis tells us 99% of the instructions are used at some point, with only one instruction never being used. 1% bloat!
- Even going to V7 in 1979 ls has doubled in size. cat only uses 57% of its instructions.
- ...but if you build a static cat instead of using shared libraries it pulls in another 700KB of glibc dependencies!
- There’s a dependency graph. It looks like scribble.
- It includes TLS (in case you need to fetch from Reddit).
- When we instrument cat on x86 we find that we use... um... 2% of it. Bugger.
- On a whole-system analysis there’s 33 MB of wasted RAM. Not much compared to all the memory.
- But there may be a TLB hit.
- Of course, 16-bit vs 64-bit is unfair. So Rusty guesstimated the change in the text and data segments. There’s some big growth, around 50%.
- By way of comparison 32-bit to 64-bit Ubuntu is only 9%.
- If you pull the old code to an Ubuntu system you actually cut the text segment (if you’ve got a stripped-down glibc). ELF, on the other hand, adds stuff, mostly to force on-page alignment. Even so, cat is only marginally bigger than 64-bit PDP cat. Not too bad.
- grep embiggened significantly.
- ls was already complex, with 10 flags. We also have to grow buffers for moving from 14-byte filenames to 255 bytes. It doesn’t use malloc, doing funky magic to grow the program when it starts running out. You kind of need it to use malloc() nowadays. So you grow 120%, because of a combination of the changes.
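A rough sketch of how the "percentage of instructions used" figures above could be computed. This is not the tooling from the talk: it assumes you already have a list of the program's instruction addresses and a trace of the addresses that actually executed. The function names and the sample program are made up for illustration.

```python
def instruction_coverage(all_addresses, executed_addresses):
    """Return the percentage of a program's instructions that ever ran."""
    all_set = set(all_addresses)
    if not all_set:
        raise ValueError("program has no instructions")
    # Ignore trace entries that fall outside the program (e.g. library code).
    used = all_set & set(executed_addresses)
    return 100.0 * len(used) / len(all_set)


def unused_instructions(all_addresses, executed_addresses):
    """Return the sorted addresses that never showed up in the trace."""
    return sorted(set(all_addresses) - set(executed_addresses))


# Hypothetical program: 100 two-byte instructions starting at 0x1000,
# with a trace that touches every one of them except 0x1010.
program = range(0x1000, 0x1000 + 100 * 2, 2)
trace = [addr for addr in program if addr != 0x1010]

print(instruction_coverage(program, trace))
print(unused_instructions(program, trace))
```

The same two functions work for the "2% of it" case: only the trace changes, the arithmetic doesn't.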
Backporting
- Turns out GNU ls has 60-odd options. A survey of Rusty’s friends says 11 of them were never used.
- So some of this size is the extra options.
- For cat it’s easy: remove all the options and error reporting.
- Cat does some odd malloc() behaviour to have aligned, page-size buffers.
- Backported cat is still bigger than forward-ported cat.
- ls required Vast Surgery.
- It grabs the system time to the nanosecond so it can show entries more than 6 months old differently.
- The backported ls is much faster, although that’s probably down to LOCALE complexity slowing down the non-backported version.
- There’s a 60% penalty. But that’s for portability, 64-bit and so on.
- 400-odd% bloat? That’s the extra features.
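To give a feel for what "remove all the options and error reporting" leaves behind, here is a minimal sketch of a cat with nothing but the copy loop. It is not the code from the talk (that was C and assembler); the name `cat`, the `CHUNK` size and the choice of Python are all assumptions made for illustration.

```python
CHUNK = 512  # hypothetical buffer size, not taken from any real cat


def cat(paths, out):
    """Copy each named file to `out` in order: no options, no error messages."""
    for path in paths:
        # A missing or unreadable file is not reported nicely; the exception
        # from open() simply propagates.
        with open(path, "rb") as src:
            while True:
                block = src.read(CHUNK)
                if not block:
                    break  # end of this file, move on to the next one
                out.write(block)
```

Calling `cat(sys.argv[1:], sys.stdout.buffer)` from a script would turn this into a command-line tool. Everything a modern cat adds on top (flags, diagnostics, aligned buffers) is what the size comparison above is measuring.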
Conclusions
- Most people aren’t prepared to go to the same lengths to keep things small.
- asmutils - reimplementation of *ix utils, but it’s not actually that efficient: it loses all the gains in BSS bloat that they don’t bother measuring. Bummer.
- Features are the reason for growth.
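The BSS point is easy to miss if you only look at file sizes, so here is a small worked sketch. BSS is zero-filled at load time and takes no room in the file, but it still has to be backed by memory when the program runs. The `Segments` type, the helper names and every number below are invented for illustration; none of them are measurements from the talk.

```python
from typing import NamedTuple


class Segments(NamedTuple):
    """Section sizes in bytes, as a tool such as size(1) would report them."""
    text: int
    data: int
    bss: int


def on_disk(seg):
    # BSS is not stored in the file, so it is invisible in a file-size comparison.
    return seg.text + seg.data


def in_memory(seg):
    # At run time the BSS needs real memory as well.
    return seg.text + seg.data + seg.bss


def growth_percent(old, new):
    # 100.0 means "twice as big"; a negative result means it shrank.
    return (new - old) / old * 100.0


# Made-up numbers: a tiny hand-written binary with a large static buffer,
# versus a larger compiled one with almost no BSS.
hand_written = Segments(text=400, data=50, bss=8192)
compiled = Segments(text=1200, data=200, bss=64)

print(on_disk(hand_written), on_disk(compiled))
print(in_memory(hand_written), in_memory(compiled))
```

With these invented figures the hand-written binary wins on disk and loses in memory, which is the kind of effect you only see if BSS is part of the measurement.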