A chess engine has two parts: a search algorithm that explores future move sequences, and a position evaluator that scores how good a board position is. Traditional engines use handcrafted evaluation functions (material count, piece-square tables, king safety) paired with deep search (looking many moves ahead via alpha-beta pruning).
In this challenge, your neural network is the position evaluator. It uses depth-1 search with quiescence—it looks at every legal move, then follows any capture sequences until the position is quiet before evaluating with your network. This prevents tactical blind spots where a piece is about to be captured. Despite this shallow search, a good evaluation function can beat engines that search much deeper but evaluate positions poorly.
You'll face 4 progressively harder baselines, from depth 1 up to a full-strength depth-4 engine. Submissions are ranked by highest level cleared first, then fewest parameters. Can you compress chess knowledge into the fewest possible weights?
Your model faces 4 progressively harder baselines. Each level plays 50 games (25 openings × 2 colors) and requires 70% to pass. If you fail a level, you stop there.
| Level | Name | Baseline | Description |
|---|---|---|---|
| 1 | Beginner | Depth 1, classic | Depth 1 + quiescence — both sides follow captures |
| 2 | Novice | Depth 2, classic | Alpha-beta search + quiescence |
| 3 | Advanced | Depth 3, enhanced | TT/PVS/null-move pruning |
| 4 | Expert | Depth 4, enhanced | Full strength baseline (~1500–1600 Elo) |
The board is encoded as a 1540-dimensional dual perspective vector: two 770-element halves, each containing 12 piece planes × 64 squares + 2 castling rights. The first half encodes the board from the side-to-move's (STM) perspective; the second half encodes it from the non-side-to-move's (NSTM) perspective. Ranks are flipped when a perspective's color is Black, so the network always sees that side's pawns advancing up the board.
Dual perspective: 2 × (12 piece planes × 64 squares + 2 castling rights) = 1540 inputs
```text
Input shape: [1, 1540] (float32, binary 0/1)

Dual perspective encoding: two 770-element halves

First 770 — side-to-move (STM) perspective:
  [0-383]     STM's pieces (pawn, knight, bishop, rook, queen, king) × 64 squares
  [384-767]   Opponent's pieces (same order) × 64 squares
  [768]       STM can castle kingside  (1.0 / 0.0)
  [769]       STM can castle queenside (1.0 / 0.0)

Last 770 — non-side-to-move (NSTM) perspective:
  [770-1153]  NSTM's pieces × 64 squares
  [1154-1537] STM's pieces × 64 squares
  [1538]      NSTM can castle kingside  (1.0 / 0.0)
  [1539]      NSTM can castle queenside (1.0 / 0.0)

Square ordering: a1=0, b1=1, ..., h8=63 (rank-major)
Ranks are flipped when the perspective's color is Black
```
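The layout above can be built directly. A minimal sketch, assuming the position arrives as a plain `{square: (piece_index, color)}` dict — this input format, the `encode` name, and the castling dict are illustrative, not the challenge API:

```python
import numpy as np

# Piece order matches the spec: pawn=0, knight=1, bishop=2, rook=3, queen=4, king=5.
# Squares are 0-63, a1=0 ... h8=63 (rank-major); colors are "w" / "b".

def encode(pieces, stm, castling):
    """Build the 1540-dim dual-perspective vector.

    pieces:   {square: (piece_index, color)}
    stm:      color to move ("w" or "b")
    castling: {color: (kingside, queenside)} booleans
    """
    x = np.zeros(1540, dtype=np.float32)
    nstm = "b" if stm == "w" else "w"
    for half, viewer in ((0, stm), (770, nstm)):
        flip = viewer == "b"                    # Black's perspective: mirror the ranks
        for sq, (pt, color) in pieces.items():
            s = sq ^ 56 if flip else sq         # XOR 56 flips the rank, keeps the file
            plane = pt if color == viewer else pt + 6  # viewer's pieces first
            x[half + plane * 64 + s] = 1.0
        ks, qs = castling[viewer]
        x[half + 768] = float(ks)
        x[half + 769] = float(qs)
    return x
```

With only the two kings on e1/e4-style home squares, the vector contains one bit per piece per perspective plus the castling bits, which makes the index arithmetic easy to spot-check by hand.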
Castling rights are included because they carry information piece positions alone cannot express—a king on e1 with castling rights is fundamentally different from one without. This dual encoding enables NNUE-style architectures where a shared feature transformer processes both perspectives through the same weights.
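The shared-transformer idea can be sketched in a few lines of numpy. Layer sizes here (770→64 shared, then 128→16→1, roughly 51k parameters) and the random untrained weights are illustrative choices, not a recommended architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# One feature transformer, applied to BOTH 770-element halves (the NNUE trick).
FT_W = (rng.standard_normal((770, 64)) * 0.01).astype(np.float32)
FT_b = np.zeros(64, dtype=np.float32)
H_W  = (rng.standard_normal((128, 16)) * 0.1).astype(np.float32)
H_b  = np.zeros(16, dtype=np.float32)
O_W  = (rng.standard_normal((16, 1)) * 0.1).astype(np.float32)
O_b  = np.zeros(1, dtype=np.float32)

def evaluate(x):
    """x: (1, 1540) binary input. Returns a (1, 1) score for the side to move."""
    stm, nstm = x[:, :770], x[:, 770:]
    # Same weights for both perspectives; accumulators concatenated STM-first.
    acc = np.concatenate([np.maximum(stm @ FT_W + FT_b, 0),
                          np.maximum(nstm @ FT_W + FT_b, 0)], axis=1)
    h = np.maximum(acc @ H_W + H_b, 0)
    return h @ O_W + O_b
```

Because the transformer is shared, most of the parameter budget sits in one 770×64 matrix that sees every position twice, once per perspective.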
Your model outputs a single scalar: positive means good for the current player, negative means bad. The engine generates all legal moves, makes each move on a copy of the board, then runs quiescence search—following all capture sequences until the position is quiet before evaluating with your network. The eval is negated (opponent's perspective) and the move with the highest score is chosen.
Depth-1 + quiescence: follow captures to quiet positions, then evaluate with NN
```text
Output shape: [1, 1] (float32)

For each legal move:
  1. Make the move on a board copy
  2. If checkmate or draw → score immediately
  3. Otherwise → run quiescence search:
       follow all captures until the position is quiet,
       then evaluate with the NN
  4. Score = -eval (negate because the opponent's good is our bad)

Best move = argmax(scores)
```
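The loop above can be sketched as a generic negamax routine. The position interface (`evaluate`, `capture_moves`, `apply_move`, `legal_moves`) is an illustrative stand-in, not the real engine API; the checkmate/draw short-circuit (step 2) is omitted for brevity, and "follow all captures" is taken literally as a plain max rather than a bounded alpha-beta quiescence:

```python
def quiescence(pos, evaluate, capture_moves, apply_move):
    """Follow capture sequences until quiet, then score.
    Negamax convention: the value is from the side-to-move's point of view."""
    best = evaluate(pos)                      # "stand pat": option to stop capturing
    for mv in capture_moves(pos):
        child = apply_move(pos, mv)
        best = max(best, -quiescence(child, evaluate, capture_moves, apply_move))
    return best

def pick_move(pos, legal_moves, evaluate, capture_moves, apply_move):
    """Depth-1 search: score every legal move via quiescence, negate, take the max."""
    scores = {mv: -quiescence(apply_move(pos, mv), evaluate, capture_moves, apply_move)
              for mv in legal_moves(pos)}
    return max(scores, key=scores.get)
```

Note how the stand-pat line matters: a move can look bad at face value yet be chosen because every available recapture makes things worse for the opponent, and vice versa.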
Your NN and the baseline use identical quiescence structure:
captures/promotions only, no check extensions, no delta pruning.
The only difference is the eval function.

Each baseline pairs a handcrafted evaluation function (material balance, piece-square tables, king safety, pawn structure, mobility) with alpha-beta search at increasing depths. Search is what lets an engine “think ahead”—at depth 4, the baseline considers sequences of 4 moves before scoring the resulting position with its eval function.
Levels 1–2 use classic alpha-beta with quiescence search. Levels 3–4 add transposition tables, principal variation search, and null-move pruning—standard optimizations that make deeper search feasible. The strongest baseline (Level 4, depth 4) plays at roughly 1500–1600 Elo.
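For intuition, the classic alpha-beta used by levels 1–2 can be sketched as a fixed-depth negamax with pruning (the quiescence step at the leaves is omitted here, and the position interface is a toy stand-in, not the engine's internals):

```python
def alphabeta(pos, depth, alpha, beta, evaluate, moves, apply_move):
    """Fixed-depth negamax with alpha-beta pruning.
    evaluate() returns a score from the side-to-move's point of view."""
    ms = moves(pos)
    if depth == 0 or not ms:
        return evaluate(pos)
    best = -float("inf")
    for mv in ms:
        best = max(best, -alphabeta(apply_move(pos, mv), depth - 1,
                                    -beta, -alpha, evaluate, moves, apply_move))
        alpha = max(alpha, best)
        if alpha >= beta:        # opponent already has a better option; prune
            break
    return best
```

Transposition tables, principal variation search, and null-move pruning (levels 3–4) all bolt onto this skeleton to cut the tree down further; they change the speed, not the value returned.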
Your neural network searches to depth 1 with identical quiescence structure—captures and promotions only, no check extensions. The bet is that a sufficiently good evaluation function can outperform a weaker eval with deeper search. Where the handcrafted eval counts material and follows tables, your network can learn subtle positional patterns directly from data.
At each level, your model plays 50 games: 25 balanced openings, each played as both White and Black. You must score 70% or higher to pass a level and advance to the next.
Score 70% or higher to qualify — then compete for the smallest model
```text
Per-level scoring:
  Win  = 1.0 point
  Draw = 0.5 points
  Loss = 0.0 points

  Score   = sum of points (max 50.0)
  Score % = (score / 50) × 100

Pass threshold: Score % ≥ 70% (at least 35.0 / 50.0)

Ranking:
  1st: Highest level cleared (Lv 4 > Lv 3 > Lv 2 > Lv 1 > Lv 0)
  2nd: Fewest parameters (among same level)
```
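A minimal sketch of the scoring and ranking rules; the submission tuples below are hypothetical examples, not real entries:

```python
def level_score(wins, draws, losses):
    """Points for one 50-game level and whether it clears the 70% threshold."""
    pts = wins * 1.0 + draws * 0.5 + losses * 0.0
    return pts, pts / 50 >= 0.70

# Hypothetical submissions: (name, highest level cleared, parameter count).
# Rank by level descending, then parameter count ascending.
subs = [("a", 3, 40_000), ("b", 4, 900_000), ("c", 3, 12_000), ("d", 4, 250_000)]
ranked = sorted(subs, key=lambda s: (-s[1], s[2]))
```

So a 34-win, 3-draw level just clears the bar at 35.5 points, while a smaller model that clears the same top level outranks a larger one regardless of its margin of victory.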