Oct 23, 2025

Fast substring search, tree serialization order, playing cartpole by hand

Week 5, day 4 at Recurse F2’25

substring search is naively O(mn), but can be done in O(m + n) with O(1) space

Say the pattern (needle) has length m and the text (haystack) has length n.

Naively, O(mn): for each of the O(n) starting positions in the text, compare up to m characters of the pattern.
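
A toy version of the naive approach (my own illustration):

def naive_find(haystack: str, needle: str) -> int:
    """Return the first index where needle occurs in haystack, or -1."""
    n, m = len(haystack), len(needle)
    for start in range(n - m + 1):               # O(n) starting positions
        if haystack[start:start + m] == needle:  # up to O(m) work at each
            return start
    return -1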

Fancier techniques precompute information into tables, jump around cleverly, and/or use hashing.

Exhaustive resource:
http://www-igm.univ-mlv.fr/~lecroq/string/index.html

Wikipedia has a whole page, here’s the table: https://en.wikipedia.org/wiki/String-searching_algorithm#Single-pattern_algorithms

python’s implementation has a short note about their approach: https://github.com/python/cpython/blob/main/Objects/stringlib/fastsearch.h#L5-L8

… and a much longer writeup with lots of examples: https://github.com/python/cpython/blob/main/Objects/stringlib/stringlib_find_two_way_notes.txt

This is a cool instance where Python’s built-in approach goes well beyond the naive algorithm, and is unusually well documented.

If I ever wanted to implement one, it seems like Rabin-Karp might be the simplest. It uses a rolling hash: hash the length-m pattern once, maintain a hash of the current length-m window of the text, and update that window hash in O(1) as it slides forward by one character; only compare actual characters when the hashes match.

So worst case O(nm) time with a bad hash and lots of collisions, though expected time is O(n + m).

O(1) space, as each hash is just a single number.
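
A toy Rabin-Karp (my own sketch; the base and modulus are arbitrary choices, not what a production implementation would use):

def rabin_karp(haystack: str, needle: str, base: int = 256, mod: int = 1_000_000_007) -> int:
    """Return the first index where needle occurs in haystack, or -1."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0
    if m > n:
        return -1

    # coefficient of the highest-order character, used to remove the outgoing one
    high = pow(base, m - 1, mod)

    def full_hash(s: str) -> int:
        h = 0
        for ch in s:
            h = (h * base + ord(ch)) % mod
        return h

    target = full_hash(needle)
    window = full_hash(haystack[:m])

    for start in range(n - m + 1):
        # only compare characters when the hashes collide
        if window == target and haystack[start:start + m] == needle:
            return start
        if start + m < n:
            # roll: drop haystack[start], append haystack[start + m]
            window = (window - ord(haystack[start]) * high) % mod
            window = (window * base + ord(haystack[start + m])) % mod
    return -1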

tree serialization order matters

Assume you are serializing a binary tree, and you render the nulls. You can do pre-order, in-order, or post-order traversals. Which one(s) will give you an unambiguous representation of the tree?

order   unambiguous serialization with nulls?
pre     yes
in      no
post    yes

This was surprising to me. I have a rough intuition for parts of it, but not for the whole thing.

It makes sense that pre- and post-order traversals make unambiguous serializations, because they can be used to unambiguously represent pre/post-fix math expressions.

With in-order traversal, a node’s position depends on its subtrees, which makes serialization ambiguous. Both of the following two trees have an in-order traversal of [null, A, null, B, null]:

Tree 1:    B          Tree 2:    A
          /                       \
         A                         B

So while I can state why pre and post should work, I still only have a vague intuition for why exactly they do, and for why in-order does not.

(The difference between saying true words which correctly explain something, and having the actual intuition click, remains interesting for me.)
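
To convince myself, a small sketch (my own) of pre-order serialization with nulls: the deserializer consumes tokens left to right and never has to guess, which is the unambiguity in action.

from typing import Optional

class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def serialize(node: Optional[Node]) -> list:
    """Pre-order traversal, rendering nulls explicitly."""
    if node is None:
        return [None]
    return [node.val] + serialize(node.left) + serialize(node.right)

def deserialize(tokens: list) -> Optional[Node]:
    """Rebuild the tree by consuming tokens left to right."""
    it = iter(tokens)
    def build():
        val = next(it)
        if val is None:
            return None
        return Node(val, build(), build())
    return build()

# Tree 2 from above: A with a right child B
tree = Node("A", None, Node("B"))
print(serialize(tree))  # ['A', None, 'B', None, None]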

What if you don’t have nulls? Pre- and post-order traversals no longer unambiguously serialize a tree. But if you have pre + in, or post + in, you can do it. There are even leetcode problems for this:

w/o nulls: pre + in → tree
https://leetcode.com/problems/construct-binary-tree-from-preorder-and-inorder-traversal/description/

w/o nulls: in + post → tree
https://leetcode.com/problems/construct-binary-tree-from-inorder-and-postorder-traversal/description/

(I may try these tomorrow if I still have the enthusiasm.)

What about pre + post (still no nulls)? It’s not enough. When a node has one child, you can’t tell where it is.

Tree 1:    1          Tree 2:    1
          /                       \
         2                         2

For both: pre: [1, 2], post: [2, 1].

cart pole

Me attempting to play by hand, with modified environment.

summary of tweaks / learnings:

human performance

who     score
Chris   277
Dan     93
Alex    83
me      188
Ben     pretty good, wasn’t tracking yet

(These scores were actually with render() accidentally being called twice per step: once explicitly, because I didn’t realize step() already calls it automatically. It is WAY harder without the extra call.)
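
For reference, a bare-bones loop (random actions, not my keyboard setup) showing where the accidental extra render() went; with render_mode="human", step() already renders each frame:

import gymnasium as gym

env = gym.make("CartPole-v1", render_mode="human")
obs, info = env.reset()
for _ in range(200):
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    # env.render()  # <- the accidental extra call; step() already rendered this frame
    if terminated or truncated:
        obs, info = env.reset()
env.close()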

registering custom env

subclassing an env (cart pole) to tweak it. (I was just editing the package contents directly.)

gymnasium envs are registered in:

# .venv/lib/python3.12/site-packages/gymnasium/
envs/__init__.py

like:

register(
    id="CartPole-v0",
    entry_point="gymnasium.envs.classic_control.cartpole:CartPoleEnv",
    vector_entry_point="gymnasium.envs.classic_control.cartpole:CartPoleVectorEnv",
    max_episode_steps=200,
    reward_threshold=195.0,
)

register(
    id="CartPole-v1",
    entry_point="gymnasium.envs.classic_control.cartpole:CartPoleEnv",
    vector_entry_point="gymnasium.envs.classic_control.cartpole:CartPoleVectorEnv",
    max_episode_steps=500,
    reward_threshold=475.0,
)
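
To register a tweaked version properly instead of editing the installed package, something like this ought to work (a sketch; MyCartPole / MyCartPoleEnv are made-up names):

import gymnasium as gym
from gymnasium.envs.registration import register
from gymnasium.envs.classic_control.cartpole import CartPoleEnv

class MyCartPoleEnv(CartPoleEnv):
    # hypothetical subclass: override whatever you want to tweak
    pass

register(
    id="MyCartPole-v0",
    entry_point=MyCartPoleEnv,  # a "module.path:ClassName" string also works
    max_episode_steps=500,
)

env = gym.make("MyCartPole-v0")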

reward_threshold is strange.

reward_threshold

reward_threshold doesn’t appear to actually do anything! It’s just metadata that’s saved into the spec (the EnvSpec), and is available for querying.

Claude was very emphatic after its little grepping:

  • Does NOT stop training
  • Does NOT cap rewards
  • Does NOT send signals anywhere
  • Does NOT trigger any behavior

I guess it makes sense as metadata an environment creator would provide to agent builders, to communicate what counts as success. For cart pole it seems almost trivial: the agent gets +1 per step, so the reward threshold (475) is nearly the same thing as the max episode steps (500). For other environments the relationship isn’t necessarily so obvious.
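
Querying the metadata looks something like this (values from the registration above):

import gymnasium as gym

spec = gym.spec("CartPole-v1")
print(spec.reward_threshold)   # 475.0
print(spec.max_episode_steps)  # 500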

Actually, there’s a bug. This reward_threshold is not correct if you use the sutton_barto_reward option (an argument passed when constructing the env), which gives 0 per step and -1 on failure. Under that formulation an episode’s total reward is -1 whenever the pole eventually falls, and 0 only if the agent survives all the way to truncation, so it can never get anywhere near 475. (It’s interesting how little the total reward under Sutton & Barto’s formulation says about whether the agent has learned!) In other words, having an option that changes the reward scheme isn’t compatible with a fixed, upfront, static reward_threshold. (Good thing it… doesn’t do anything.)
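
For reference, the option is just a keyword argument passed through gym.make (assuming a gymnasium version that has it):

import gymnasium as gym

env_default = gym.make("CartPole-v1")                       # +1 per step
env_sb = gym.make("CartPole-v1", sutton_barto_reward=True)  # 0 per step, -1 on failure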

Other arg notes:

randomness

how is randomness controlled? the base Env’s reset() takes a seed, and then sets two attrs

# in Env (base environment class)
self._np_random, self._np_random_seed = seeding.np_random(seed)

this is stateful; their docs say:

Therefore, :meth:reset should (in the typical use case) be called with a seed right after initialization and then never again.

For Custom environments, the first line of :meth:reset should be super().reset(seed=seed) which implements the seeding correctly.

the original cartpole accesses this with

# in CartPoleEnv (original)
self.state = self.np_random.uniform(low=low, high=high, size=(4,))

… which is actually implemented as a property

# in Env (base class)
@property
def np_random(self) -> np.random.Generator:
    """Returns the environment's internal :attr:`_np_random` that
    if not set will initialise with a random seed.

    Returns:
        Instances of `np.random.Generator`
    """
    if self._np_random is None:
        self._np_random, self._np_random_seed = seeding.np_random()
    return self._np_random

What I’m trying to figure out is: by just always calling reset like this:

for episode in range(episodes):
    env.reset()
    # ...

… am I doing it wrong? Both (a) calling it multiple times, (b) with no explicit seed ever?

It seems like this is fine:

if you call reset() with the same seed every episode, you get the exact same starting observation each time:

# episodes 1, 2, 3
Starting observation: [ 0.01823519 -0.0446179 -0.02796401 -0.03156282]
Starting observation: [ 0.01823519 -0.0446179 -0.02796401 -0.03156282]
Starting observation: [ 0.01823519 -0.0446179 -0.02796401 -0.03156282]
...

… if you instead pass the seed only on the first reset, then no seed on subsequent episodes, you get the same sequence of starting observations across runs:

# episodes 1, 2, 3
Starting observation: [ 0.01823519 -0.0446179 -0.02796401 -0.03156282]
Starting observation: [-0.03240941 0.03120945 0.0423345 -0.02234256]
Starting observation: [ 0.03197546 0.03898927 0.00129705 -0.02550354]
...

# next run, episodes 1, 2, 3
Starting observation: [ 0.01823519 -0.0446179 -0.02796401 -0.03156282]
Starting observation: [-0.03240941 0.03120945 0.0423345 -0.02234256]
Starting observation: [ 0.03197546 0.03898927 0.00129705 -0.02550354]
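
The experiment above was roughly this (the seed value here is arbitrary, not necessarily the one that produced the numbers above):

import gymnasium as gym

env = gym.make("CartPole-v1")

# same seed on every reset: identical starting observation each episode
for episode in range(3):
    obs, info = env.reset(seed=42)
    print("Starting observation:", obs)

# seed only the first reset, then no seed afterwards: a reproducible
# *sequence* of starting observations across runs
obs, info = env.reset(seed=42)
print("Starting observation:", obs)
for episode in range(2):
    obs, info = env.reset()
    print("Starting observation:", obs)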

I got into all this because there’s a nondeterministic flag when registering the environment, which made me wonder whether CartPole actually implemented deterministic seeded “randomness.” And it does.

rendering

Rendering is basically:

def render(self):
    if self.render_mode is None:
        return
    # (omitted: draw everything with pygame)
    if self.render_mode == "human":
        # update the pygame window
        return None
    else:
        np_pixels = ...  # grab the pygame surface as a numpy array
        return np_pixels

all methods?

custom envs have a pretty minimal API!

Roughly (schematic, not literal signatures):

class CustomEnv(Env):
    def __init__(self): ...
    def reset(self, seed) -> (state, info): ...
    def step(self, action) -> (state, reward, terminated, truncated, info): ...
    def render(self) -> Optional[pixels]: ...
    def close(self): ...
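
A minimal runnable sketch of a custom env, mostly to see the seeding and the five methods in one place (the toy coin-flip environment itself is made up for illustration):

from typing import Optional
import gymnasium as gym
from gymnasium import spaces

class CoinFlipEnv(gym.Env):
    """Made-up toy env: guess the next coin flip; +1 reward when right; stops after 10 steps."""
    metadata = {"render_modes": ["ansi"]}

    def __init__(self, render_mode: Optional[str] = None):
        self.observation_space = spaces.Discrete(2)  # the last flip
        self.action_space = spaces.Discrete(2)       # your guess for the next flip
        self.render_mode = render_mode

    def reset(self, seed: Optional[int] = None, options: Optional[dict] = None):
        super().reset(seed=seed)  # seeds self.np_random, as the gymnasium docs recommend
        self._steps = 0
        self._last_flip = int(self.np_random.integers(2))
        return self._last_flip, {}  # (observation, info)

    def step(self, action):
        flip = int(self.np_random.integers(2))
        reward = 1.0 if action == flip else 0.0
        self._last_flip = flip
        self._steps += 1
        terminated = False             # no failure state in this toy env
        truncated = self._steps >= 10  # just stop after 10 guesses
        return flip, reward, terminated, truncated, {}

    def render(self):
        if self.render_mode == "ansi":
            return f"last flip: {self._last_flip}"

    def close(self):
        pass

env = CoinFlipEnv()
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())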

vector env

TBD, but seems important for speed. there’s CartPoleVectorEnv(VectorEnv)
