Segmentation
Crux: How to support a large address space?
Segmentation: Generalized Base/Bounds
Instead of having just one base and bounds pair in our MMU, we have a base and bounds pair per logical segment of the address space.
A segment is a contiguous portion of the address space of a particular length, and in our canonical address space, we have three logically-different segments: code, stack and heap.Segmentation allows the OS to place each one of those segments in different parts of physical memory, and thus avoid filling physical memory with unused virtual address space.
Consider the address space of a program given in Figure 16.1. With a base and bounds pair per segment, we can place each segment independently in physical memory like in Figure 16.2.
Only used memory is allocated space in physical memory, and thus large address spaces with large amounts of unused address can be accommodated.
For more details about address translation, take a look at P143 of the book.
Which Segment Are We Referring To?
1. Explicit Approach
Chop up the address space into segments based on the top few bits of the virtual address. In the example above, we have three segments, thus we need two bits to accomplish this task. We use the top two bits of the 14-bit virtual address to select the segment. The virtual address looks like this:
- If the top two bits are 00, the hardware knows the virtual address is in the code segment, and thus uses the code base and bounds pair to relocate the address to the correct physical location.
- If the top two bits are 01, the hardware knows the address is in the heap, and thus uses the heap base and bounds.
Consider there is an access to virtual address 4200, which in binary form looks like:
The top two bits (01) tell the hardware which segment we are referring to. The bottom 12 bits are the offset into the segment: 0000 0110 1000, or 104 in decimal. Thus, the hardware simply takes the first two bits to determine which segment register
to use, and then takes the next 12 bits as the offset into the segment. By adding the base register to the offset, the hardware arrives at the final physical address.
If base and bounds were arrays, the hardware would be doing something like this to obtain the desired physical address:
In the example above,SEG_MASK
would be0x3000
,SEG_SHIFT
would be12
andOFFSET_MASK
would be0xFFF
.
2. Implicit Approach
The hardware determines the segment by noticing how the address was formed. If, for example, the address was generated from the program counter (i.e., it was an instruction fetch), then the address is within the code segment; if the address is based off of the stack or base pointer, it must be in the stack segment; any other address must be in the heap.
What About The Stack?
In the example above, the stack has been relocated to physical address 28KB, but stack grows backwards in physical memory, it starts at 28KB and grows back to 26KB, corresponding to virtual addresses 16KB to 14KB. Therefore, translation has to proceed differently.
What we need is an extra hardware (one bit, for example) to know which way the segment grows:
In this example, assume we wish to access virtual address 15KB, which should map to physical address 27KB. Our virtual address, in binary form, thus looks like this: 11 1100 0000 0000 (hex0x3C00). The hardware uses the top two bits (11) to designate the segment, but then we are left with an offset of 3KB. To obtain the correct negative offset, we must subtract the maximum segment size from 3KB: in this example, a segment can be 4KB, and thus the correct negative offset is 3KB - 4KB which equals -1KB. We simply add the negative offset (-1KB) to the base (28KB) to arrive at the correct physical address: 27KB. The bounds check can be calculated by ensuring the absolute value of the negative offset is less than the segment’s size.
Support for Sharing
To save memory, sometimes it is useful to share certain memory segments between address spaces. To support sharing, we need a little extra support from the hardware, in the form of protection bits. Basic support adds a few bits per segment,
indicating whether or not a program can read or write a segment, or perhaps execute code that lies within the segment.
With protection bits, the hardware must also check whether a particular access is permissible.
Fine-grained vs. Coarse-grained Segmentation
Most of our examples thus far have focused on systems with just a
few segments (i.e., code, stack, heap); we can think of this segmentation as coarse-grained, as it chops up the address space into relatively large, coarse chunks.
Supporting many segments requires even further hardware support, with a segment table of some kind stored in memory.
OS Support
Segmentation raises a number of new issues:
- What should the OS do on a context switch?
- How to manage free space in physical memory?
The physical memory quickly becomes full of little holes of free space, making it difficult to allocate new segments, or to grow existing ones. This is called external fragmentation, as in Figure 16.3 (left).
One solution to this problem is to compact physical memory by rearranging the existing segments. The OS could stop whichever processes are running, copy their data to one contiguous region of memory, change their segment register values to point to the new physical locations, and thus have a large free extent of memory with which to work. Figure 16.3 (right) shows a diagram of compacted physical memory.
However, compaction is expensive, as copying segments is memory-intensive and thus would use a fair amount of processor time.
A simpler approach is to use a free-list management algorithm that tries to keep large extents of memory available for allocation, such as best-fit, worst-fit, first-fit and buddy algorithm. But no matter how smart the algorithm, external fragmentation still exists.
Actually, the only one solution is to avoid the problem altogether, which is to never allocate memory in variable-sized chunks.
Summary
Benefits of Segmentation
- Dynamic relocation
- Better support for sparse address spaces
- Fast and well-suited to hardware
- Code sharing
Problems
- External fragmentation
- If the model of how the address space is being used doesn't exactly match how the underlying segmentation has been designed to support it, segmentation doesn't work very well