General CS Notes

Python Memory Management

Editor’s note: This is an extremely rough page of observations from blundering through snippets of Python’s source code and some docs. It’s even less of a “cliff notes” than other pages.

This was a deep dive spawned by looking into Python’s data structures.

Summary

Python manages its own memory. It has multiple layers of allocators, which the docs call domains. Lower-level allocators handle grabbing memory for Python to use. Higher-level allocators split this memory into chunks (“pools” and “arenas”) to get good performance for the memory that Python programs use. My guess is that the design is tuned for the common case of many small, short-lived objects.

Software called “mimalloc” now seems to play a large role in the code. It’s billed as a drop-in replacement for libc’s malloc. I believe that Python can optionally use it as that. But I think Python may also use it as an alternative to one of its higher-level allocators called “pymalloc.” (Both I and LLMs are unclear on this point and I think it’s not worth the time to clarify.)

This diagram from pycore_obmalloc.h is great. (Also repeated below).

    Object-specific allocators
    _____   ______   ______       ________
   [ int ] [ dict ] [ list ] ... [ string ]       Python core         |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
    _______________________________       |                           |
   [   Python's object allocator   ]      |                           |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
    ______________________________________________________________    |
   [          Python's raw memory allocator (PyMem_ API)          ]   |
+1 | <----- Python memory (under PyMem manager's control) ------> |   |
    __________________________________________________________________
   [    Underlying general-purpose allocator (ex: C library malloc)   ]
 0 | <------ Virtual memory allocated for the python process -------> |

   =========================================================================
    _______________________________________________________________________
   [                OS-specific Virtual Memory Manager (VMM)               ]
-1 | <--- Kernel dynamic storage allocation & management (page-based) ---> |
    __________________________________   __________________________________
   [                                  ] [                                  ]
-2 | <-- Physical memory: ROM/RAM --> | | <-- Secondary storage (swap) --> |

Source: https://github.com/python/cpython/blob/main/Include/internal/pycore_obmalloc.h

Notebook

There’s a whole rabbit hole of figuring out: where does Python’s source code actually call malloc() and friends?

Editor’s note: This page is written linearly as an investigation. Reading the docs helped clarify the different allocator domains (levels of abstraction), what mimalloc is, and what it’s probably replacing. It’s an interesting example of introducing complexity to a codebase. mimalloc was added for (what I would consider to be) relatively small gains on some benchmarks, and as a result there’s a lot more code. As a newcomer who’d never heard of mimalloc, I didn’t realize that was a term I should be looking for in file paths. So when I first stumbled on files like alloc-override.c and page.c, I thought these were core to Python’s memory management, not a separate allocator’s implementation that’s dropped in to replace a lower-level allocator used by Python’s memory system.

I’ll “unwind the stack,” so to speak, starting from where my search ended.

mimalloc/page.c

My search path ended in page.c. https://github.com/python/cpython/blob/main/Objects/mimalloc/page.c

It looks like all this code was written at Microsoft Research (in… 2018?) by Daan Leijen. Interesting. Everything has mi as a prefix, which I presumed stood for “Microsoft.” (Update: it looks like it’s short for “mimalloc,” the name of the project. Other Microsoft things start with “MS,” so a Microsoft link is unlikely.)

mi_find_free_page(heap, size) seems to find a free page from a queue and return it.

mi_page_queue_t* pq = mi_page_queue(heap,size);
mi_page_t* page = pq->first;
// (...)
return page;
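To make the “queue per size” idea concrete, here is a minimal sketch of looking up a page by size class. Everything in it (page_t, heap_t, BIN_SIZES, queue_for_size, find_free_page) is a hypothetical stand-in I made up for illustration. It is not mimalloc’s actual data layout, which is far more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for a page and a page queue. */
typedef struct page_s {
    struct page_s *next;
    size_t block_size;   /* every block in this page has this size */
    size_t free_blocks;  /* how many blocks are still unused */
} page_t;

typedef struct {
    page_t *first;
} page_queue_t;

#define NUM_BINS 4
static const size_t BIN_SIZES[NUM_BINS] = {16, 32, 64, 128};

typedef struct {
    page_queue_t queues[NUM_BINS];  /* one queue per size class */
} heap_t;

/* Map a request size to the smallest bin that fits, or NULL if too large. */
page_queue_t *queue_for_size(heap_t *heap, size_t size) {
    assert(heap != NULL);
    for (size_t i = 0; i < NUM_BINS; i++) {
        if (size <= BIN_SIZES[i]) {
            return &heap->queues[i];
        }
    }
    return NULL;
}

/* Walk the queue for this size class and return the first page with room. */
page_t *find_free_page(heap_t *heap, size_t size) {
    page_queue_t *pq = queue_for_size(heap, size);
    if (pq == NULL) {
        return NULL;
    }
    for (page_t *page = pq->first; page != NULL; page = page->next) {
        if (page->free_blocks > 0) {
            return page;
        }
    }
    return NULL;
}
```

The point of the design, as far as I can tell, is that a request never has to search pages whose blocks are the wrong size: the size picks the queue, and the queue only holds pages for that size.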

Many more functions there like _mi_malloc_generic. (TODO: which all call into it?)

I’m not sure how it’s all initialized, but it does seem like, yes, Python (its… “runtime?”) is doing a big honking layer of memory management, not just for GC, but tracking chunks of memory from the heap and sending pages into its own internal memory management system.

I’d be curious to see where the “startup” is and whether it initially grabs a pool of memory.

mimalloc/alloc.c

alloc.c has a ton of wrappers. Everything seems to call into functions that (can?) zero out memory. Maybe this is for ease/safety? https://github.com/python/cpython/blob/main/Objects/mimalloc/alloc.c

Most functions are just one line that calls another function with the passed args.

The central bad boy here looks like _mi_heap_malloc_zero_ex(heap, size, zero, huge_alignment).
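Here is a rough sketch of that wrapper pattern: several thin public functions that all funnel into one central routine with a zero flag. The names (heap_malloc_zero, my_malloc, my_zalloc, my_calloc) are hypothetical, and the central routine just calls the C library’s malloc, which is an assumption for the sake of a runnable example and not what mimalloc does internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical central routine: every public wrapper ends up here. */
void *heap_malloc_zero(size_t size, bool zero) {
    void *p = malloc(size == 0 ? 1 : size);
    if (p != NULL && zero) {
        memset(p, 0, size);
    }
    return p;
}

/* One-line wrappers that only differ in the flag they pass down. */
void *my_malloc(size_t size) { return heap_malloc_zero(size, false); }
void *my_zalloc(size_t size) { return heap_malloc_zero(size, true); }

/* calloc-style wrapper: check for overflow, then reuse the zeroing wrapper. */
void *my_calloc(size_t count, size_t size) {
    if (size != 0 && count > SIZE_MAX / size) {
        return NULL;
    }
    return my_zalloc(count * size);
}

/* Usage sketch: zeroed memory really is zeroed. */
void wrappers_demo(void) {
    unsigned char *buf = my_calloc(4, 2);
    assert(buf != NULL);
    for (size_t i = 0; i < 8; i++) {
        assert(buf[i] == 0);
    }
    free(buf);
}
```

One function that takes a flag keeps the real logic in a single place, at the cost of a pile of near-identical one-liners around it.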

mimalloc/alloc-override.c

There’s a bunch of spooky C shit going on. (Just kidding, it’s not spooky, just terrifying.) I believe they literally override the standard library functions (malloc() and friends) with their own versions. https://github.com/python/cpython/blob/main/Objects/mimalloc/alloc-override.c

// On all other systems forward to our API
mi_decl_export void* malloc(size_t size) MI_FORWARD1(mi_malloc, size)
mi_decl_export void* calloc(size_t size, size_t n) MI_FORWARD2(mi_calloc, size, n)
// (... etc.)

There are macros to do this stuff. I forget how macros work, other than that they’re text substitution done by the preprocessor before compilation, and that they can do unholy stuff like implement an entire hash map.

// Override system malloc
// (^ this is really written; I'm omitting a ton of OS and compiler checks in the macro #if blocks)
#define MI_FORWARD(fun) __attribute__((alias(#fun), used, visibility("default")));
#if // (don't worry about it)
#define MI_FORWARD1(fun,x) MI_FORWARD(fun)
#define MI_FORWARD2(fun,x,y) MI_FORWARD(fun)
#define MI_FORWARD3(fun,x,y,z) MI_FORWARD(fun)
#define MI_FORWARD0(fun,x) MI_FORWARD(fun)
#define MI_FORWARD02(fun,x,y) MI_FORWARD(fun)
#else
// otherwise use forwarding by calling our `mi_` function
#define MI_FORWARD1(fun,x) { return fun(x); }
#define MI_FORWARD2(fun,x,y) { return fun(x,y); }
#define MI_FORWARD3(fun,x,y,z) { return fun(x,y,z); }
#define MI_FORWARD0(fun,x) { fun(x); }
#define MI_FORWARD02(fun,x,y) { fun(x,y); }
#endif
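To convince myself how the fallback branch works, here is a small sketch of a macro that expands into a whole function body. The names (FORWARD1, FORWARD0, impl_malloc, impl_free, my_malloc, my_free) are hypothetical. The first branch above is different: there #fun turns the function name into a string, and the GCC/Clang alias attribute makes the exported symbol another name for the mi_ function instead of a wrapper around it. I’m only sketching the portable forwarding variant here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical "real" implementations that the public names forward to. */
static size_t alloc_calls = 0;
void *impl_malloc(size_t size) { alloc_calls++; return malloc(size); }
void impl_free(void *p) { free(p); }

/* Each macro expands to an entire function body, braces included. */
#define FORWARD1(fun, x)  { return fun(x); }  /* forward and return the result */
#define FORWARD0(fun, x)  { fun(x); }         /* forward a call that returns void */

/* After preprocessing: void *my_malloc(size_t size) { return impl_malloc(size); } */
void *my_malloc(size_t size) FORWARD1(impl_malloc, size)
void my_free(void *p) FORWARD0(impl_free, p)

size_t forwarded_alloc_count(void) { return alloc_calls; }

/* Usage sketch: calling the public name reaches the implementation. */
void forward_demo(void) {
    size_t before = forwarded_alloc_count();
    void *p = my_malloc(16);
    assert(p != NULL);
    assert(forwarded_alloc_count() == before + 1);
    my_free(p);
}
```

Note there is no semicolon after the macro use: the macro supplies the function’s closing brace itself.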

obmalloc.c

This file is interesting, and makes me wonder whether you can actually bottom out here, or whether the above (alloc-override, the mi_ allocs, page, …) are always used. https://github.com/python/cpython/blob/main/Objects/obmalloc.c

For example, here’s the PyMem_Malloc implementation we’ve been chasing down.

/***********************/
/* the "mem" allocator */
/***********************/

void *
PyMem_Malloc(size_t size)
{
    /* see PyMem_RawMalloc() */
    if (size > (size_t)PY_SSIZE_T_MAX)
        return NULL;
    OBJECT_STAT_INC_COND(allocations512, size < 512);
    OBJECT_STAT_INC_COND(allocations4k, size >= 512 && size < 4094);
    OBJECT_STAT_INC_COND(allocations_big, size >= 4094);
    OBJECT_STAT_INC(allocations);
    return _PyMem.malloc(_PyMem.ctx, size);
}

Of minor note, I’ll remark that this, like so much of the code, is a very simple wrapper with some checks and a call into another layer of abstraction. The OBJECT_STAT_INC_COND macros are little stats recorders:

// this is from pycore_stats.h
#define OBJECT_STAT_INC_COND(name, cond) \
    do { if (_Py_stats && cond) _Py_stats->object_stats.name++; } while (0)
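The do { ... } while (0) wrapper looked odd to me, so here is a small sketch of why it is there. The names (alloc_stats_t, g_stats, STAT_INC_COND, record_alloc) are hypothetical, and unlike the original I parenthesize cond to be safe. The wrapper makes the macro behave like a single statement, so it can sit in an unbraced if/else with a trailing semicolon.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stats record; a NULL pointer means stats collection is off. */
typedef struct {
    long small;
    long medium;
    long big;
} alloc_stats_t;

alloc_stats_t *g_stats = NULL;

/* Same shape as OBJECT_STAT_INC_COND: bump a named counter if stats are on
   and the condition holds. */
#define STAT_INC_COND(name, cond) \
    do { if (g_stats && (cond)) g_stats->name++; } while (0)

void record_alloc(size_t size) {
    /* Without do/while(0), the macro's own "if" plus the trailing ";"
       would break this unbraced if/else. */
    if (size < 512)
        STAT_INC_COND(small, 1);
    else
        STAT_INC_COND(medium, size < 4094);
    STAT_INC_COND(big, size >= 4094);
}

/* Usage sketch: counting is a no-op until g_stats points at a record. */
void stats_demo(void) {
    alloc_stats_t stats = {0, 0, 0};
    record_alloc(100);            /* stats off: nothing happens */
    g_stats = &stats;
    record_alloc(100);
    assert(stats.small == 1);
    g_stats = NULL;
}
```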

Two questions arise for the above PyMem_Malloc.

  1. what is _PyMem?
  2. what is this ‘the “mem” allocator’ business?

Let’s answer them. For 1, elsewhere in that file is

#define _PyMem (_PyRuntime.allocators.standard.mem)

…whatever that is. For 2, the file contains several different kinds of allocators, each under its own section comment.

Note: from later reading of the code and docs, it seems like these are used as follows:

  • low-level allocator implementations — actual (maybe) memory from system. literally calls malloc
  • the "arena" allocator — for real big memory chunks (MB)
  • the "raw" allocator — supposedly raw memory from system (but does still call an abstraction?); no GIL
  • the "mem" allocator — memory buffer within Python’s private heap; with GIL
  • the "object" allocator — memory for objects from Python’s private heap; with GIL
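The call _PyMem.malloc(_PyMem.ctx, size) suggests each domain is a small table of function pointers plus a context pointer. Here is a minimal sketch of that idea, assuming a shape along the lines of the public PyMemAllocatorEx struct. All names below (allocator_t, raw_malloc, counting_malloc, g_mem, mem_malloc, and so on) are hypothetical, and the counting layer is my own illustration of how one allocator can wrap another.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator table: a context pointer plus function pointers. */
typedef struct {
    void *ctx;
    void *(*malloc)(void *ctx, size_t size);
    void (*free)(void *ctx, void *ptr);
} allocator_t;

/* Lowest layer: ignores ctx and calls the C library directly. */
void *raw_malloc(void *ctx, size_t size) {
    (void)ctx;
    return malloc(size == 0 ? 1 : size);
}
void raw_free(void *ctx, void *ptr) {
    (void)ctx;
    free(ptr);
}

/* A wrapping layer: counts live blocks, then delegates to the inner allocator. */
typedef struct {
    allocator_t inner;
    size_t live;
} counting_ctx_t;

void *counting_malloc(void *ctx, size_t size) {
    counting_ctx_t *c = ctx;
    void *p = c->inner.malloc(c->inner.ctx, size);
    if (p != NULL) c->live++;
    return p;
}
void counting_free(void *ctx, void *ptr) {
    counting_ctx_t *c = ctx;
    if (ptr != NULL) c->live--;
    c->inner.free(c->inner.ctx, ptr);
}

/* Stand-in for a per-domain global such as _PyMem; swappable at run time. */
allocator_t g_mem = { NULL, raw_malloc, raw_free };

void *mem_malloc(size_t size) { return g_mem.malloc(g_mem.ctx, size); }
void mem_free(void *ptr) { g_mem.free(g_mem.ctx, ptr); }

/* Usage sketch: install the counting layer, use it, then restore the old one. */
void allocator_demo(void) {
    counting_ctx_t counter = { g_mem, 0 };
    g_mem = (allocator_t){ &counter, counting_malloc, counting_free };
    void *p = mem_malloc(32);
    assert(p != NULL && counter.live == 1);
    mem_free(p);
    assert(counter.live == 0);
    g_mem = counter.inner;  /* restore before counter goes out of scope */
}
```

Callers only ever see mem_malloc and mem_free, so swapping what sits behind a domain does not touch any call site.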

Looking at the “low-level” allocators is why I questioned whether these allocator overrides and mi* wrappers we saw are always used. Check it out:

/* the default raw allocator (wraps malloc) */

void *
_PyMem_RawMalloc(void *Py_UNUSED(ctx), size_t size)
{
    /* PyMem_RawMalloc(0) means malloc(1). Some systems would return NULL
       for malloc(0), which would be treated as an error. Some platforms would
       return a pointer with no memory behind it, which would break pymalloc.
       To solve these problems, allocate an extra byte. */
    if (size == 0)
        size = 1;
    return malloc(size);
}

I think mimalloc may still be used if the override is compiled in, because I think a macro literally swaps out malloc() for mimalloc’s own version.

It’s fun to see the variants for Windows, for systems with mmap, and for systems that lack it. E.g., for the arena allocator:

Freeing: VirtualFree(...) / munmap(...) / free(ptr)

Also, that they’re providing a nicer interface over the platform diffs. The function you call that multiplexes into the above 3 just takes a context object and a size as arguments.
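A minimal sketch of that kind of multiplexing, assuming the three backends named above. The function names (arena_alloc, arena_free) and the exact flags are my own choices for illustration, not a copy of what obmalloc.c does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#if defined(_WIN32)
#  include <windows.h>
#elif defined(__unix__) || defined(__APPLE__)
#  include <sys/mman.h>
#endif

/* Hypothetical arena allocator: one signature, three possible backends. */
void *arena_alloc(void *ctx, size_t size) {
    (void)ctx;
    assert(size > 0);
#if defined(_WIN32)
    return VirtualAlloc(NULL, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
#elif defined(MAP_ANONYMOUS)
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
#else
    return malloc(size);  /* fallback when no page-level API is visible */
#endif
}

/* The matching free must use the same backend as the allocation. */
void arena_free(void *ctx, void *ptr, size_t size) {
    (void)ctx;
#if defined(_WIN32)
    (void)size;
    VirtualFree(ptr, 0, MEM_RELEASE);
#elif defined(MAP_ANONYMOUS)
    munmap(ptr, size);
#else
    (void)size;
    free(ptr);
#endif
}
```

The size is passed to the free function because munmap needs it, even though the other two backends ignore it.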

pymem.h

This header actually declares the functions that are used, like PyMem_Malloc. https://github.com/python/cpython/blob/main/Include/pymem.h

There are some fun notes at the top of the file on not mixing their APIs (PyMem_Malloc()) with calls to the raw system memory functions (malloc()).

Here are the declarations:

PyAPI_FUNC(void *) PyMem_Malloc(size_t size);
PyAPI_FUNC(void *) PyMem_Calloc(size_t nelem, size_t elsize);
PyAPI_FUNC(void *) PyMem_Realloc(void *ptr, size_t new_size);
PyAPI_FUNC(void) PyMem_Free(void *ptr);

What does the PyAPI_FUNC macro do? Interestingly, I can’t easily find it defined in the code with a simple GitHub search. (Historically bad, but I thought improved.) StackOverflow suggests it lives in pyport.h but it’s not there.

Reading the docs

This was a fun exploration, but at this point things were complex enough it made sense to read the docs a bit. https://docs.python.org/3/c-api/memory.html

This confirms that Python manages a private heap and allocator internally for its own usage. It seems like this means it grabs a chunk of memory from the system, and then internally allocates it upon request (for all Python usage: objects, buffers, etc.). It also says there are different specialized (“object-specific”) allocators used within this private heap to optimize for those access patterns. Cool.

It describes the raw/mem/obj allocation domains.

The “pymalloc allocator” uses memory-mapped “arena” regions, within which small short-lived objects (<= 512 bytes) are placed.

The “mimalloc” allocator is mentioned extremely briefly. Added in 3.13 from what looks like a separate MSR project, it says two curious things:

  1. “Python supports the mimalloc allocator when the underlying platform support is available”

    • does this mean it’s on by default if possible? what does this mean? what layer of abstraction does it replace?
  2. “mimalloc ‘is a general purpose allocator with excellent performance characteristics. Initially developed by Daan Leijen for the runtime systems of the Koka and Lean languages.’”

    • Sounds cool, I guess, but… that’s it?

I read through https://github.com/python/cpython/issues/90815 which was fun. From the issue:

For 3.11 I plan to integrate mimalloc as an optional drop-in replacement for obmalloc.

If this is obmalloc.c, that file seemed to have a whole bunch of different allocators. So I can’t tell whether they mean replacing all of those higher-level allocator interfaces, or only the lowest level (malloc) itself.

Mimalloc says (https://github.com/microsoft/mimalloc):

“mimalloc is a drop-in replacement for malloc”

… which suggests it could be the lowest-level allocator. This would also jibe with what I saw in mimalloc/alloc-override.c, where it seemed to replace calls to malloc with mi_malloc.

More context from this macro I stumbled across in obmalloc.c:

#if defined(Py_GIL_DISABLED) && !defined(WITH_MIMALLOC)
# error "Py_GIL_DISABLED requires WITH_MIMALLOC"
#endif

So it could be that mimalloc was introduced for the GIL removal (er, optional-ifying) effort.

I chatted with a few different LLMs about what layer mimalloc replaces. They get confused and answer different things, but it seems at least possible that mimalloc can operate at multiple levels of the stack in Python.

I would need to either spend more time in the code or ask someone who knows to understand this.

pycore_obmalloc.h

Adding one more reference because it actually contains a wonderful diagram. https://github.com/python/cpython/blob/main/Include/internal/pycore_obmalloc.h

    Object-specific allocators
    _____   ______   ______       ________
   [ int ] [ dict ] [ list ] ... [ string ]       Python core         |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
    _______________________________       |                           |
   [   Python's object allocator   ]      |                           |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
    ______________________________________________________________    |
   [          Python's raw memory allocator (PyMem_ API)          ]   |
+1 | <----- Python memory (under PyMem manager's control) ------> |   |
    __________________________________________________________________
   [    Underlying general-purpose allocator (ex: C library malloc)   ]
 0 | <------ Virtual memory allocated for the python process -------> |

   =========================================================================
    _______________________________________________________________________
   [                OS-specific Virtual Memory Manager (VMM)               ]
-1 | <--- Kernel dynamic storage allocation & management (page-based) ---> |
    __________________________________   __________________________________
   [                                  ] [                                  ]
-2 | <-- Physical memory: ROM/RAM --> | | <-- Secondary storage (swap) --> |

The places where the lines intentionally don’t align are interesting.

There are other fun comments in that file.

post info


Published Aug 7, 2025
Disclaimer This is an entry in the garage. It may change or disappear at any point.