Using this as a catch-all for “everything going on under the hood of a program,” so it’ll include lower-level process and OS-level stuff as well. May split off if there’s a good boundary, but easier not to at first.
Stack / heap
From the program’s perspective, there’s stack vs heap for where data is stored. The CPU doesn’t—in general—care about stack vs heap, or the cache vs RAM memory hierarchy. It just reads/writes from/to addresses.
The stack stores one top-level block of OS-written program/system metadata (args, env, name, page size, some random bytes), plus a frame for each function call. Each frame holds: stack metadata (return and previous frame addrs), function-local vars and args, and saved register values. (Interestingly: some registers are saved by the caller (volatile), and some by the callee (non-volatile; only if it wants to use them).)
The heap stores larger data (allocated, arbitrary lifetime). (Interestingly, malloc is typically in userspace!)
In virtual address space, the stack grows down from high addresses (0xFFFFFFFF) and the heap grows up from low ones. Stack and heap are typically read/write, but not executable (exception: JITs execute on the heap). (Code lives in its own region (“text”), which is read-only and executable.) The CPU doesn’t really care where the stack/heap are; it just sees memory addrs.
Both stack and heap support read/write memory operations. But many operations do explicitly operate on the stack. E.g., push/pop/call/ret read/write the stack pointer, and read/write the memory there. The OS sets up the stack, and the compiler emits code for managing the stack.
TODO: Add later:
- “Virtual Memory Areas (VMAs) - kernel data structures tracking each mapped region (stack, heap, libraries, etc.)”
- `/proc/[pid]/maps` - mapped regions for a process
- `/proc/[pid]/status` - memory usage summaries
Memory hierarchy
The memory hierarchy has different access speeds: registers fastest (no delay), then L1, L2, L3 caches, RAM, disk, network. Furthermore, memory moves into faster tiers in chunks: cache lines (typically 64 bytes) between RAM and the caches, and pages (usually 4KB, though huge pages can be 2MB or 1GB) between disk and RAM. So accessing sequential values in a loaded chunk is fast — of course, must be done while it’s still there. (Spatial and temporal locality.)
The cache system (L1/L2/L3) intercepts and accelerates RAM lookups if possible. (Registers, though, are different in that they’re directly referenced in machine code, and compilers try to keep hot values in them.) So from both the programmer’s and CPU’s perspective, the caches are invisible and not referenced in code. But, of course, knowing their behavior helps optimize code.
Interaction with stack/heap: Are caches big enough to fit the whole stack? Sometimes yes, but it likely wouldn’t make sense. Only the “hot” portion of the stack is likely to be kept in the cache, because other frequently-accessed memory will want to be cached too: other processes’ stacks, hot heap data structures, program code, kernel data.
TODO: Add later:
- hardware MMU
- full page tables in RAM
- fast TLB
C Memory Sizes
`char` is almost always 8 bits. C std: `char` is the smallest addressable unit of memory.
`int` is usually 32 bits, even on 64-bit systems (x86-64, arm64), for back-compat reasons. The C standard guarantees:
- short, int >= 16 bits
- long >= 32 bits
- long long >= 64 bits
Pointers are (of course) 64 bits on 64-bit systems and 32 bits on 32-bit ones; this can be checked with `sizeof(void*)`.
Word size (CPU register size) is almost always == pointer size (memory address length), but they are technically different concepts. (Rare past exception: old 16-bit x86 could use 16- or 32-bit pointers.)
Here’s a summary table of typical values on modern 64-bit systems:
| C Type | Unix-like (bytes) | Windows (if different) |
|---|---|---|
| `bool` | 1 | |
| `char` | 1 | |
| `short` | 2 | |
| `int` | 4 | |
| `long` | 8 | 4 |
| `long long` | 8 | |
| `float` | 4 | |
| `double` | 8 | |
| `long double` | 16 | 8 |
| `void *` | 8 | |
| `size_t` | 8 | |
Basic virtual memory layout
```
High Addresses
--------------------------------------------
|           Stack (grows down)             |
|------------------------------------------|
|                                          |
|------------------------------------------|
|        mmap region (shared libs)         |  (e.g. libc, ld.so)
|------------------------------------------|
|                                          |
|------------------------------------------|
|            Heap (grows up)               |  (starts empty)
|------------------------------------------|
|        Uninitialized data (BSS)          |
|------------------------------------------|
|        Initialized data segment          |
|------------------------------------------|
|          Code / Text segment             |
|------------------------------------------|
|               NULL (0x0)                 |
--------------------------------------------
Low Addresses
```
The mmaped region is new to me. Interesting Q/A about mmap region:
- location change from legacy (below heap) to modern systems (where it is now)
- that (on modern systems) it grows downwards from stack towards heap
TODO:
- mmaped region
  - always present? (e.g., if statically linked?)
  - does this affect growing the heap with mmap, or is it separate?
- sample addrs
  - I noticed stack starts lower than expected; may be b/c only 48 bits of addr space used
  - investigating on macOS may be harder (no `/proc/...`, but `vmmap`)
Getting memory
At program start, the heap address space is “reserved” (naturally via the memory layout) but begins empty.
`malloc` is implemented in userspace. This is cool, because it can be replaced. It’s in the C std library, part of a suite: `malloc`, `realloc`, `calloc`, `aligned_alloc`, and `free`.
An example of a `malloc` replacement is mimalloc, which Python adopted. (Unsure yet whether Python uses it to replace `malloc`, or as a higher-level allocator once they have a private heap, or both.)
Under the hood, `malloc` et al. use system calls to grow the heap’s size.
- `brk` and `sbrk` are legacy methods to increase/decrease the “program break” (i.e., end of the heap). From '70s Unix. They were removed from POSIX, but still exist in Linux, and are still used by older malloc implementations.
- `mmap` and `munmap` are more modern alternatives to `brk`/`sbrk`, added with virtual memory in the '80s. They’re preferred by modern allocators, even for small allocations. Unlike `brk`/`sbrk`, they’re thread-safe and return page-aligned chunks.
- Other methods:
  - `mprotect` lets you change read/write/execute permissions on memory. Cool for, e.g., JITs, so you can write code to the heap then execute it.
  - `madvise` lets you give hints about anticipated memory usage to help the OS optimize caching; e.g., “I’m going to keep reading sequentially,” “I’m only going to read this once,” or “I don’t need this anymore.”
  - `mremap` seems like the `mmap` equivalent of `realloc`, resizing an existing allocation instead of having to copy, but it’s Linux-only and less widely used.
Manual sections
Note that some names exist in multiple sections, so there’s `man 1 read` (the default) and `man 2 read`.
Some of these are likely wrong, but it’s a useful overview.
| Section | Content | Examples | Frequency in Practice |
|---|---|---|---|
| 1 | General programs (including shell builtins) | `ls`, `grep`, `find`, `vim`, `cd` | Very High - Most commonly referenced |
| 2 | System calls (kernel functions) | `open`, `read`, `write`, `fork`, `execve` | High - Essential for system programming |
| 3 | Library functions (C standard library, etc.) | `printf`, `malloc`, `strcmp`, `signal`, `regex` | High - Critical for C/C++ development |
| 4 | Kernel interfaces (and (?) device files) | `null`, `zero`, `random`, `tcp`, `ip` | Low - Specialized hardware/driver work |
| 5 | File formats | `passwd`, `hosts`, `fstab`, `crontab`, `resolv.conf` | Medium - Useful for system administration |
| 6 | Games | `fortune`, `cowsay`, `sl`, `banner`, `factor` | Very Low - Rarely encountered |
| 7 | Miscellaneous (conventions, protocols, etc.) | `ascii`, `environ`, `hier`, `intro`, `time` | Medium - Helpful for understanding concepts |
| 8 | System admin commands | `mount`, `ifconfig`, `iptables`, `cron`, `systemctl` | Medium-High - Important for sysadmin work |
| 9 | Kernel developer’s manual (Linux-specific) | `kmalloc`, `copy_to_user`, `request_irq`, `mutex_lock` | Very Low - Kernel development only |