A lot of us in India learn OS concepts from textbooks. The concepts really go directly from the textbooks into examination papers for most, including me. We would grab on to key phrases like "semaphores", "paging", "segmentation", "stack" and so on, and never really stop to wonder how this all is *really* implemented. Some of us aren't interested since all we care about is getting a well paying job while others find goofing around to be a more attractive alternative. Either way, very few really get it.
Four years since I finished college, six since I last took an OS class, and I can say now that I finally got it. Somewhat.
Recently there was a very interesting case I hit upon, which got me wondering how memory allocation was managed by the workhorse of memory allocation, the malloc() function. The malloc function is implemented in the glibc library for Linux (and also for other Unix systems if you want). Its major responsibility (along with its friends, free, mallopt, etc.) is to do accounting of memory allocated to a process. The actual *giving* and *taking* of memory is done by the kernel.
Now the fun part is that on a typical *nix system, there are two ways to request memory from the OS. One is using the brk() system call and the other is by using the mmap() system call. So which one should glibc use to allocate memory? The answer is, both. What malloc does is that it uses brk() to allocate memory for small requests and mmap() for large requests.
So what is the difference between mmap and brk you ask? Well, every process has a block of contiguous memory called the data area. The brk() system call simply increases one end of the data area and hence increases size, allocating memory to the process. To free, all it does is decrease the same end of the data area. This operation is quite fast most of the time. On the other hand, the mmap system call picks up a completely different portion of memory and maps it into the address space of the process, so that the process can see it. Additionally, the mmap call also has to put zeroes in the entire memory area it is about to allocate so that it does not end up leaking information of some old process to this one. This makes mmap quite slow.
So why have mmap at all if it is so slow? The reason is the inherent limitation that brk() has due to the fact that it only grows one way and is always contiguous. Take a case where I allocate 10 objects using brk and then free the one that I had allocated first. Despite the fact that the location is now free, it cannot be given back to the OS since it is locked by the other 9 objects. One way that malloc works around this is by trying to reuse these free spaces. But what if the size of the object I am about to allocate next is larger than any of these freed "holes"? Those holes remain and the process ends up using more memory than it really needs. This is "internal fragmentation".
So to minimize the effect of this internal fragmentation, glibc limits allocation of small objects to brk(). Larger objects are allocated with mmap(). A threshold was set at 128KB, so objects smaller than it are allocated using brk and anything larger is allocated using mmap. The assumption is that smaller object requests would come more often, so the little fragmentation is worth the improvement in speed. Oh, and as for the reuse of the memory holes, it does the "best fit algorithm" -- remember that phrase? ;)
But with more recent versions of glibc (around 2006 actually, so not *very* recent), this threshold limit is dynamic. glibc now tries to adjust to the memory allocation pattern of your program. If it finds that you are allocating larger objects and freeing them soon, it will then increase the threshold, expecting you to allocate larger objects and free them more often. There is of course an upper limit of 32 MB to this. So anything larger than 32 MB will *always* be allocated using mmap. This is quite awesome since it speeds up malloc quite a bit. But it obviously comes with the price of potentially larger memory holes.
There is so much more to this, like the actual details of the way accounting of the brk'ed memory is done, obstacks, arenas. The fun seems to be only beginning.