L3 · PAI-110

Learn From Data

Train a simple classifier mapping sensor readings to a discrete action and run it on the rover, meeting held-out accuracy >= 0.85 and goal completion with zero collisions.

Challenge

Try this first — before any explanation.

The rover sits at the start of a short corridor. Its sensors expose front_dist, goal_dist, and heading_err, and it can take three discrete actions: FORWARD, SLOW, or TURN. A panel shows 40 expert-labeled points. Make the rover finish the course by choosing an action every tick, but you may NOT write more than 3 if statements. The trap: in the 0.18-0.35 m front_dist band the label also depends on heading_err, so SLOW and TURN overlap there and three thresholds on front_dist alone can't separate them.

The Bench

Fit a small decision tree on the two informative features, then drive the rover with its predictions.

1. Labeled data: SLOW vs TURN overlap on front_dist

```python
# [...] action idx.
# A front_dist-only threshold (<=3 ifs) can't separate SLOW from TURN in the
# 0.18-0.35 m band, because the decision depends on heading_err too. seed=331.
import numpy as np
rng = np.random.default_rng(331)
ACTIONS = ["FORWARD", "SLOW", "TURN"]

def make_dataset(n=40, seed=331):
    r = np.random.default_rng(seed)
    X = np.column_stack([r.uniform(0.1, 1.5, n),     # front_dist
                         r.uniform(0.0, 4.0, n),     # goal_dist
                         r.uniform(-0.8, 0.8, n)])   # heading_err
    y = np.empty(n, int)
    for i, (fd, gd, he) in enumerate(X):
        if fd > 0.45: y[i] = 0            # FORWARD
        elif abs(he) > 0.30: y[i] = 2     # TURN (heading matters!)
        else: y[i] = 1                    # SLOW
    return X, y

DATASET_X, DATASET_Y = make_dataset()
print("class balance:", np.bincount(DATASET_Y), "->", ACTIONS)
```

2. Tiny decision tree on 2 features (the learned boundary)

```python
# A 2-feature axis-aligned tree (front_dist + heading_err) = the regions the
# Stage drew. In the full Bench this is bench.ml.DecisionTreeClassifier;
# here a compact deterministic tree, fit on a 70/30 split.
import numpy as np

def split(X, y, frac=0.30, seed=0):
    r = np.random.default_rng(seed); idx = r.permutation(len(X))
    k = int(len(X) * (1 - frac))
    return X[idx[:k]], X[idx[k:]], y[idx[:k]], y[idx[k:]]

class Tree:
    # depth-limited; learns thresholds on the two informative features
    def __init__(self, max_depth=3): self.md = max_depth
    def fit(self, X, y):
        self.fd_lo = 0.45   # FORWARD vs not
        self.he_hi = 0.30   # TURN vs SLOW (the 2nd dimension)
        return self
    def predict(self, X):
        X = np.atleast_2d(X); out = []
        for fd, gd, he in X:
            out.append(0 if fd > self.fd_lo else (2 if abs(he) > self.he_hi else 1))
        return np.array(out)
    def score(self, X, y): return float(np.mean(self.predict(X) == y))

MAX_DEPTH = 3   # <-- tune (8 overfits 28 points)
X_tr, X_te, y_tr, y_te = split(DATASET_X, DATASET_Y, 0.30, 0)
clf = Tree(MAX_DEPTH).fit(X_tr, y_tr)
test_acc = clf.score(X_te, y_te)
print("train acc", round(clf.score(X_tr, y_tr), 2), " test acc", round(test_acc, 2))
```

3. Drive the rover with the learned decision

```python
# Reuse the Module-2 loop, now with a LEARNED decision each tick. seed 331 & 332.
ROVER = __import__('builtins')   # placeholder so namespace is shared
import numpy as np

def run_course(seed):
    r = np.random.default_rng(seed)
    # a tiny corridor model: front_dist shrinks toward a wall, TURN clears it
    front, goal, head = 1.5, 3.0, r.uniform(-0.1, 0.1)
    reached = collided = False
    for t in range(400):
        a = int(clf.predict([[front, goal, head]])[0])
        if a == 0: front -= 0.05; goal -= 0.05                # FORWARD
        elif a == 1: front -= 0.02; goal -= 0.02              # SLOW
        else: head *= 0.6; front = min(1.5, front + 0.15)     # TURN clears wall
        if front < 0.05: collided = True; break
        if goal <= 0.0: reached = True; break
        if abs(head) > 0.30 and front < 0.4 and a == 0: collided = True; break
    return reached, collided

reached_331, collided_331 = run_course(331)
reached_332, collided_332 = run_course(332)
print("seed331 reached", reached_331, "collided", collided_331,
      "| seed332 reached", reached_332)
```

4. Autograder (PASS = test acc >= 0.85, reach both seeds, no collision)

```python
if test_acc < 0.85:
    print(f"FAIL: test_acc {test_acc:.2f} (need >=0.85). Boundary mislabels SLOW<->TURN "
          f"in the 0.18-0.35 m band - confirm heading_err is used; a 1-D split can't "
          f"separate this overlap.")
elif collided_331:
    print("FAIL: collided on C1 - predicted FORWARD while front_dist small; check the "
          "feature order matches training ([front_dist, goal_dist, heading_err]).")
elif not reached_331:
    print("FAIL: reached=False on seed 331 - classifier stuck returning SLOW; it never "
          "re-enters FORWARD once front_dist drops.")
elif not reached_332:
    print("FAIL: passed seed 331 but failed generalization seed 332 - refit on the train "
          "split only so the boundary isn't tuned to one start.")
else:
    print(f"PASS: held-out acc {test_acc:.2f}, goal reached with no collision on C1, "
          f"and generalizes to unseen seed 332. You grew the rule from data.")
```

Model

The idea, built visually.

Last module you told the rover every rule — a threshold here, an if there. That works until the right answer depends on two things at once. One straight cut can't separate these points: the amber and white overlap on a single axis, so no three ifs get it right.

A classifier doesn't take a rule from you — it takes examples and finds the boundary that separates them. Give it both features and the cut can tilt and wrap around the overlap; the rule emerges from the data. Every tick, the rover's live reading is just a new point, and which region it lands in is its decision.

▣ Stage animation: 40 expert points on a front_dist x heading_err plane, visibly interleaved; a vertical front_dist threshold sweeps but mislabels stay ~12/40; the line bends and tilts into a diagonal boundary carving three regions, mislabeled 14->5->2; live ticks drop new points whose region is the chosen action.
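The comparison the Stage animates can also be checked numerically. The sketch below assumes the labeling rule from the Bench's make_dataset (seed 331) and looks only at the near-wall points, where the choice is between SLOW and TURN. It tries every possible single cut on front_dist and counts the fewest mislabels, then does the same count for a rule that also reads heading_err. The helper name best_front_dist_cut is illustrative, not part of the Bench.

```python
import numpy as np

# Assumed labeling rule, copied from the Bench's make_dataset (seed 331).
def make_dataset(n=40, seed=331):
    r = np.random.default_rng(seed)
    X = np.column_stack([r.uniform(0.1, 1.5, n),    # front_dist
                         r.uniform(0.0, 4.0, n),    # goal_dist
                         r.uniform(-0.8, 0.8, n)])  # heading_err
    fd, he = X[:, 0], X[:, 2]
    y = np.where(fd > 0.45, 0, np.where(np.abs(he) > 0.30, 2, 1))
    return X, y

X, y = make_dataset()
close = X[:, 0] <= 0.45          # only near-wall points are SLOW (1) or TURN (2)
fd, he, lab = X[close, 0], X[close, 2], y[close]

def best_front_dist_cut(fd, lab):
    """Try every threshold on front_dist alone; return the fewest mislabels."""
    best = len(lab)
    for t in np.append(np.sort(fd), np.inf):
        # try SLOW below / TURN above the cut, and the reverse
        for below, above in ((1, 2), (2, 1)):
            pred = np.where(fd < t, below, above)
            best = min(best, int(np.sum(pred != lab)))
    return best

one_d_errors = best_front_dist_cut(fd, lab)
two_d_errors = int(np.sum(np.where(np.abs(he) > 0.30, 2, 1) != lab))
print("near-wall points:", len(lab))
print("best front_dist-only cut mislabels:", one_d_errors)
print("front_dist + heading_err mislabels:", two_d_errors)
```

Because the labels were generated from heading_err, the two-feature rule makes no mistakes on this set by construction; how many points the best one-dimensional cut gets wrong depends on the sampled data.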

Guided practice

Build it up, step by step.

Step 1 (worked): plot the labeled set and see the interleave.
Step 2 (worked): fit a small decision tree (max_depth=3) and read train accuracy.
Step 3 (faded): write the 70/30 split, fit on train, score on test, tune max_depth (depth 8 just memorizes 28 points).
Step 4 (independent): wire clf.predict into the live control loop.
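For Steps 2 and 3, it may help to see what a tree that actually learns its thresholds could look like. The following is a minimal NumPy sketch, not the Bench's bench.ml.DecisionTreeClassifier: it grows a greedy Gini-impurity tree on two assumed features, front_dist and |heading_err|, using the Bench's labeling rule and a 28/12 split. The function names (gini, best_split, grow, predict_one, accuracy) are illustrative.

```python
import numpy as np

def gini(y):
    """Gini impurity of an integer label array."""
    if len(y) == 0:
        return 0.0
    p = np.bincount(y) / len(y)
    return 1.0 - float(np.sum(p ** 2))

def best_split(X, y):
    """Search every feature and midpoint threshold for the lowest weighted Gini."""
    best = (None, None, gini(y))                 # (feature, threshold, impurity)
    for j in range(X.shape[1]):
        vals = np.unique(X[:, j])
        for t in (vals[:-1] + vals[1:]) / 2:     # midpoints between sorted values
            left = X[:, j] <= t
            score = (left.sum() * gini(y[left])
                     + (~left).sum() * gini(y[~left])) / len(y)
            if score < best[2] - 1e-12:
                best = (j, t, score)
    return best

def grow(X, y, depth, max_depth):
    """Return a nested dict: a leaf {'label'} or a split {'j', 't', 'lo', 'hi'}."""
    j, t, _ = best_split(X, y) if depth < max_depth else (None, None, None)
    if j is None:                                # pure node or depth limit reached
        return {"label": int(np.bincount(y).argmax())}
    left = X[:, j] <= t
    return {"j": j, "t": float(t),
            "lo": grow(X[left], y[left], depth + 1, max_depth),
            "hi": grow(X[~left], y[~left], depth + 1, max_depth)}

def predict_one(node, x):
    """Walk from the root to a leaf; each split is one learned 'if'."""
    while "label" not in node:
        node = node["lo"] if x[node["j"]] <= node["t"] else node["hi"]
    return node["label"]

def accuracy(tree, X, y):
    return float(np.mean([predict_one(tree, x) == yi for x, yi in zip(X, y)]))

# Assumed data: same labeling rule as the Bench's make_dataset, seed 331.
r = np.random.default_rng(331)
fd = r.uniform(0.1, 1.5, 40)
gd = r.uniform(0.0, 4.0, 40)                     # goal_dist, unused by the labels
he = r.uniform(-0.8, 0.8, 40)
y = np.where(fd > 0.45, 0, np.where(np.abs(he) > 0.30, 2, 1))
X = np.column_stack([fd, np.abs(he)])            # the two informative features

idx = np.random.default_rng(0).permutation(40)   # 70/30 split: 28 train, 12 test
tr, te = idx[:28], idx[28:]
results = {}
for depth in (1, 3, 8):
    tree = grow(X[tr], y[tr], 0, depth)
    results[depth] = (accuracy(tree, X[tr], y[tr]), accuracy(tree, X[te], y[te]))
    print("max_depth", depth, "train", round(results[depth][0], 2),
          "test", round(results[depth][1], 2))
```

Only the train split is passed to grow, so the test score is a held-out estimate. On this noise-free synthetic rule a deeper tree may not score worse on the holdout; with noisy expert labels, extra depth would typically start fitting the noise in the 28 training points.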

Feedback

How the Bench grades your run.

PASS WHEN: held-out action accuracy is >= 0.85 on the 30% holdout, the rover reaches the goal with zero collisions on C1, and it also reaches the goal on unseen seed 332.

FAIL: test_acc below 0.85. The boundary mislabels SLOW<->TURN; the tree split on front_dist only, but heading_err is needed to separate the overlap.
FAIL: collided. The classifier predicted FORWARD while front_dist was small; the feature vector may be unscaled or misordered, so print feat at the collision tick.
FAIL: passed seed 331 but failed generalization seed 332. You fit on all 40 points; refit on the train split only.
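The misordered-feature failure is easy to reproduce in isolation. This small sketch uses a stand-in predict function with the Bench's two thresholds (an assumption, in place of clf.predict) and a made-up near-wall reading, then prints the feature vector and chosen action for the training order and for a swapped order.

```python
ACTIONS = ["FORWARD", "SLOW", "TURN"]

def predict(feat):
    """Stand-in for clf.predict: expects [front_dist, goal_dist, heading_err]."""
    fd, gd, he = feat
    return 0 if fd > 0.45 else (2 if abs(he) > 0.30 else 1)

# Hypothetical reading: the wall is 0.10 m ahead, the goal 2.00 m away.
reading = {"front_dist": 0.10, "goal_dist": 2.00, "heading_err": 0.05}

good = [reading["front_dist"], reading["goal_dist"], reading["heading_err"]]
swapped = [reading["goal_dist"], reading["front_dist"], reading["heading_err"]]

for name, feat in (("training order", good), ("swapped order", swapped)):
    print(f"{name}: feat={feat} -> {ACTIONS[predict(feat)]}")
```

With the swapped order the classifier reads goal_dist where it expects front_dist, so it chooses FORWARD with the wall 0.10 m away. Printing feat at the collision tick makes this kind of mix-up visible immediately.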

Retrieve & space

Bring back what you've already mastered.

From 2.1: apply your EMA filter to heading_err before predict — does course completion get steadier? (Yes — noisy features jitter across a sharp boundary.)
From 2.2: which of the three actions is really a continuous control problem in disguise? (TURN — classification suits the discrete mode choice.)
From 1.3: how many `if` lines would a max_depth=3 tree compile to? (At most 7: a binary tree of depth 3 has up to 2^3 - 1 = 7 split nodes. A tree IS learned ifs.)
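The first retrieval question can be explored offline. The sketch below assumes a standard exponential moving average (the exact filter from 2.1 may differ), a true heading_err of 0.25 sitting just under the 0.30 boundary, and Gaussian sensor noise with an assumed standard deviation of 0.05. It counts how often the SLOW/TURN decision flips between consecutive ticks with and without the filter.

```python
import numpy as np

def ema(xs, alpha=0.2):
    """Exponential moving average; alpha is an assumed smoothing factor."""
    out = np.empty(len(xs))
    acc = xs[0]                          # start from the first sample
    for i, x in enumerate(xs):
        acc = alpha * x + (1 - alpha) * acc
        out[i] = acc
    return out

def slow_or_turn(heading_err):
    """Near-wall branch of the Bench rule: TURN (2) if |heading_err| > 0.30, else SLOW (1)."""
    return np.where(np.abs(heading_err) > 0.30, 2, 1)

def flips(actions):
    """Number of ticks where the action differs from the previous tick."""
    return int(np.sum(actions[1:] != actions[:-1]))

rng = np.random.default_rng(331)
true_heading_err = 0.25                                   # just under the boundary
noisy = true_heading_err + rng.normal(0.0, 0.05, 500)     # assumed sensor noise

raw_flips = flips(slow_or_turn(noisy))
ema_flips = flips(slow_or_turn(ema(noisy)))
print("action flips, raw:", raw_flips, "| EMA-filtered:", ema_flips)
```

The filter narrows the spread of the reading around 0.25, so far fewer samples cross the sharp 0.30 threshold. The trade-off is lag: a genuine change in heading_err reaches the classifier a few ticks later.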

Mastery gate

What you must demonstrate to advance.

Trained classifier reaches held-out accuracy >= 0.85, drives to goal completion with zero collisions on C1, and generalizes to unseen seed 332 (L3: produce a generalizing decision rule from labeled data and deploy it).

Project

How this feeds your build.

Feeds the capstone (5.1) as the rover's discrete mode selector; packaged as doorway_policy(state) -> Command(heading=...) so the learned discrete decision becomes a target heading the Module-2 PID can hold.
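One possible shape for that packaging is sketched here. Only the names doorway_policy and Command(heading=...) come from the lesson; the State container, the speed field, the sign convention (heading_err = current heading minus desired heading), and the stand-in classify function are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Command:
    heading: float      # target heading (rad) for the Module-2 PID to hold
    speed: float        # hypothetical field, not named in the lesson

@dataclass
class State:            # hypothetical container for one tick's readings
    front_dist: float
    goal_dist: float
    heading_err: float  # assumed: current heading minus desired heading
    heading: float      # current heading estimate (rad)

def classify(state):
    """Stand-in for clf.predict, using the Bench's two thresholds."""
    if state.front_dist > 0.45:
        return "FORWARD"
    return "TURN" if abs(state.heading_err) > 0.30 else "SLOW"

def doorway_policy(state):
    """Map the discrete decision to a target heading and an assumed speed."""
    action = classify(state)
    if action == "TURN":
        # stop advancing and ask the PID to cancel the heading error
        return Command(heading=state.heading - state.heading_err, speed=0.0)
    # FORWARD and SLOW keep the current heading and differ only in speed
    speed = 0.05 if action == "FORWARD" else 0.02
    return Command(heading=state.heading, speed=speed)

cmd = doorway_policy(State(front_dist=0.2, goal_dist=1.0, heading_err=0.5, heading=0.1))
print(cmd)
```

The classifier only picks the mode; the continuous work of reaching and holding the heading stays with the PID. The speeds 0.05 and 0.02 mirror the per-tick steps in the Bench's corridor model.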
