The Atlantic meridional overturning circulation (AMOC) carries warm water north, keeping Europe 5-10°C warmer than it would otherwise be. It may be approaching a tipping point. SimAMOC simulates this system from first principles in the browser — and uses AI agents to discover and fix its own physics errors.
Climate model development takes years of PhD-level hand-tuning. In a single session, we demonstrated that AI agents can diagnose structural physics bugs, propose code fixes, and reduce RMSE by 37%, with zero human physics input.
3-agent dialectic: Physicist diagnoses errors from screenshots, Tuner proposes parameter changes, Validator catches compensating errors. 13 tunable parameters, 4-tier evaluation.
RMSE 7.6 -> 4.7 in 10 iterations
Agents propose modifications to WGSL shaders and JS physics code. Each proposal runs in an isolated git worktree, evaluated against observations. Winners get merged.
Polar OLR, brine rejection discovered
Port to JAX/WebGPU autodiff. Compute gradients of RMSE with respect to all parameters simultaneously. Combine gradient-based optimization with agent-based structural search.
Target: continuous improvement loop
Most ML-for-climate work builds emulators (faster but no new physics) or learns parameterizations from high-res simulations. We're doing neither. We're using AI to discover physics: to find missing equation terms, implement them in code, and validate against satellite observations. The git history becomes a record of scientific discoveries, each with a hypothesis, implementation, and observational validation.
8 JavaScript modules sharing global scope, running physics on the GPU via WebGPU compute shaders with CPU fallback. Zero build step — just serve HTML.
Physics engine. All state arrays, ~50 parameters, 5 WGSL compute shader strings, CPU fallback solver, observational data loading. Zero DOM dependencies.
1,985 lines
WebGPU compute pipeline. Buffer allocation, shader compilation, dispatch batching, CPU readback. FFT Poisson solver for exact streamfunction inversion.
881 lines
14 view modes, colormaps, GPU render pipeline, land elevation rendering with ETOPO1 bathymetry, particle overlays, diagnostic charts.
1,267 lines
Main loop orchestrator. GPU tick, atmosphere sub-stepping between readbacks, cloud field updates, and the window.lab API for automation.
Slider bindings for 9 physics parameters, 7 paint brush modes (SimEarth-style), and 6 paleoclimate scenarios (Drake Passage, Panama, Ice Age).
127 lines
Mobile-first drawer UI. Reparents controls into slide-out drawers, swipe gestures, speed presets (1x / 3x / 10x / MAX).
70 lines
window.lab exposes the simulation to automation scripts via Playwright. It is the bridge between the browser and all external tooling.
lab.step(n) // advance n timesteps
lab.diagnostics() // extract SST, salinity, AMOC, zonal profiles
lab.getParams() // read all parameters
lab.setParams({}) // inject parameter changes
lab.sweep() // parameter sweep with scoring
lab.scenario(name) // trigger paleoclimate scenario
lab.fields() // extract raw field arrays
lab.reset() // reinitialize from observations
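A minimal sketch of how the calls above might be chained into one experiment. The function name `runExperiment` is hypothetical, and the sketch assumes the `lab` methods are synchronous and that `getParams()` and `diagnostics()` return plain objects; none of that is confirmed here.

```javascript
// Sketch: run one parameter experiment through a window.lab-like object.
// `lab` is passed in so the same function works with the real window.lab
// or with a stub in tests. runExperiment is a hypothetical helper.
function runExperiment(lab, paramChanges, steps) {
  lab.reset();                      // reinitialize from observations
  const before = lab.getParams();   // snapshot so the change can be reported
  lab.setParams(paramChanges);      // inject parameter changes
  lab.step(steps);                  // advance the simulation
  return {
    changed: Object.keys(paramChanges),
    before,
    after: lab.getParams(),
    diagnostics: lab.diagnostics(), // SST, salinity, AMOC, zonal profiles
  };
}
```

From a Playwright script, something similar would presumably run inside `page.evaluate(...)`, with only serializable values crossing back to Node.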
| Component | GPU Path | CPU Fallback |
|---|---|---|
| Vorticity | WGSL compute | JS loops |
| Poisson solver | FFT (exact) | SOR (iterative) |
| Temperature | WGSL compute | JS loops |
| Atmosphere | CPU (between readbacks) | CPU (every step) |
| Rendering | WebGPU fragment + 2D overlay | 2D canvas only |
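To illustrate the SOR entry in the table, here is a sketch of successive over-relaxation for the streamfunction inversion. It is not the project's solver: it assumes unit grid spacing, no cos(lat) metric terms, periodic wrap in longitude, and psi = 0 on the first and last latitude rows. The name `solvePoissonSOR` and its options are hypothetical.

```javascript
// Successive over-relaxation for laplacian(psi) = zeta on an nx-by-ny grid.
// Assumptions: unit spacing, periodic in x, psi = 0 on rows 0 and ny - 1.
function solvePoissonSOR(zeta, nx, ny, { omega = 1.8, maxIters = 5000, tol = 1e-8 } = {}) {
  const psi = new Float64Array(nx * ny);
  for (let iter = 0; iter < maxIters; iter++) {
    let maxChange = 0;
    for (let j = 1; j < ny - 1; j++) {
      for (let i = 0; i < nx; i++) {
        const k = j * nx + i;
        const west = j * nx + ((i + nx - 1) % nx); // wraps around in longitude
        const east = j * nx + ((i + 1) % nx);
        // Gauss-Seidel value from the 5-point stencil
        const gs = 0.25 * (psi[west] + psi[east] + psi[k - nx] + psi[k + nx] - zeta[k]);
        // Over-relax: move past the Gauss-Seidel value by a factor omega
        const change = omega * (gs - psi[k]);
        psi[k] += change;
        maxChange = Math.max(maxChange, Math.abs(change));
      }
    }
    if (maxChange < tol) return { psi, iterations: iter + 1 };
  }
  return { psi, iterations: maxIters };
}
```

Unlike the FFT path, this converges only to within `tol`, which is why the table labels it iterative rather than exact.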
512 x 160 cells at ~0.7° resolution. Latitude -79.5° to +79.5° (excludes polar ice caps). Periodic in longitude.
All derivatives include cos(lat) metric correction. Coriolis parameter clamped near equator (|lat| < 5°) to avoid singularities.
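The grid geometry described above can be sketched as follows. The constant and function names are hypothetical, rows are assumed to be evenly spaced from -79.5° to +79.5° inclusive, and the clamp is assumed to hold f at its 5° magnitude inside the equatorial band; the model's actual clamp may differ.

```javascript
const NX = 512, NY = 160;
const LAT_MIN = -79.5, LAT_MAX = 79.5;
const OMEGA = 7.2921e-5;       // Earth's rotation rate, rad/s
const EARTH_RADIUS = 6.371e6;  // metres
const F_CLAMP_LAT = 5;         // degrees

const toRad = (deg) => (deg * Math.PI) / 180;

// Latitude of row j, assuming evenly spaced rows from LAT_MIN to LAT_MAX inclusive.
function rowLatitude(j) {
  return LAT_MIN + (j * (LAT_MAX - LAT_MIN)) / (NY - 1);
}

// Coriolis parameter f = 2 * Omega * sin(lat), held at its 5-degree
// magnitude for |lat| < 5 so that divisions by f stay finite.
function coriolis(latDeg) {
  const sign = latDeg < 0 ? -1 : 1; // the equator itself is treated as northern (assumption)
  const effLat = Math.abs(latDeg) < F_CLAMP_LAT ? sign * F_CLAMP_LAT : latDeg;
  return 2 * OMEGA * Math.sin(toRad(effLat));
}

// Physical cell size in metres; the east-west spacing shrinks with cos(lat),
// which is the metric correction applied to x-derivatives.
function cellSize(latDeg) {
  const dLon = toRad(360 / NX);
  const dLat = toRad((LAT_MAX - LAT_MIN) / (NY - 1));
  return {
    dx: EARTH_RADIUS * Math.cos(toRad(latDeg)) * dLon,
    dy: EARTH_RADIUS * dLat,
  };
}
```

A centered x-derivative at row `j` would then divide by `2 * cellSize(rowLatitude(j)).dx`.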
Why this size? All climate data products (WOA23, NOAA SST, NCEP wind, MODIS clouds) are distributed at 1° on this grid. 81,920 cells total — fast enough for real-time on mobile GPUs.
One equation produces all major ocean currents. Wind stress curl drives the gyres. The beta effect creates western intensification. Temperature and salinity gradients drive the overturning circulation.
The model's behavior emerges from coupled feedbacks, not individual terms.
Every field the model computes can be viewed in real time.
The model loads 10 observational datasets at runtime — SST, deep temperature, bathymetry, salinity, wind stress, albedo, precipitation, and cloud data. It initializes from the real Earth, not a blank canvas.
| Dataset | Source | Used for | Period |
|---|---|---|---|
| Sea surface temperature | NOAA OI SST v2 | Init + RMSE scoring | 1991-2020 |
| Deep temperature (1000m) | WOA23 | Deep layer initialization | Climatology |
| Bathymetry + land elevation | ETOPO1 | Depth field, terrain rendering | Static |
| Surface salinity | WOA23 | Salinity restoring target | 1991-2020 |
| Wind stress (tau_x, tau_y) | NCEP Reanalysis | Wind curl + Ekman transport | 1991-2020 |
| Surface albedo | MODIS MCD43A3 | Land surface albedo | 2020-2023 |
| Precipitation | GPM IMERG | Cloud parameterization | 2015-2023 |
| Cloud fraction | MODIS MOD08_M3 | Cloud model validation | 2020-2023 |
| Cloud types (low/high) | MODIS MOD08_M3 | Radiative effect calibration | 2020-2023 |
| Land/ocean mask | Natural Earth 110m | Domain boundaries | Static |
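The SST row above feeds RMSE scoring. A zonal-mean RMSE against an observed field could be computed roughly like this. The function names are hypothetical, land cells are assumed to be marked with NaN, and rows are weighted equally (an area weighting by cos(lat) is a plausible refinement that this sketch omits).

```javascript
// Zonal mean of a row-major field (ny rows of nx values).
// NaN marks land (assumption) and is skipped; all-land rows stay NaN.
function zonalMean(field, nx, ny) {
  const means = new Float64Array(ny).fill(NaN);
  for (let j = 0; j < ny; j++) {
    let sum = 0, count = 0;
    for (let i = 0; i < nx; i++) {
      const v = field[j * nx + i];
      if (!Number.isNaN(v)) { sum += v; count++; }
    }
    if (count > 0) means[j] = sum / count;
  }
  return means;
}

// RMSE between two zonal-mean profiles, using only rows defined in both.
function zonalMeanRmse(model, obs, nx, ny) {
  const m = zonalMean(model, nx, ny);
  const o = zonalMean(obs, nx, ny);
  let sumSq = 0, rows = 0;
  for (let j = 0; j < ny; j++) {
    if (Number.isNaN(m[j]) || Number.isNaN(o[j])) continue;
    sumSq += (m[j] - o[j]) ** 2;
    rows++;
  }
  return rows > 0 ? Math.sqrt(sumSq / rows) : NaN;
}
```

Because errors are averaged along each latitude circle first, east-west errors of opposite sign cancel, so this metric is more forgiving than a cell-by-cell RMSE.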
20+ fields at 1024x512 resolution, fetched via Google Earth Engine, including monthly climatologies for wind stress and albedo. Not yet in use: waiting for the GPU FFT solver to support larger grids.
RAPID AMOC (26.5°N), CO2 from Mauna Loa, GISTEMP, HadCRUT5, HadSST4, ocean heat content, Arctic/Antarctic sea ice extent. Available for driving scenarios and validation.
Named after the Ralph Wiggum pattern — AI agents iteratively diagnose what's wrong with the physics, propose fixes, and validate them. At ~$0.03 per run, we can afford thousands of iterations.
Sees 4 screenshots + scorecard + zonal error profiles. Produces 2-3 ranked hypotheses about what's physically wrong.
Translates the winning hypothesis into parameter changes. Max 3 parameters, max 30% step. Respects published physical bounds.
Checks physical consistency, catches compensating errors. Can reject proposals that improve one metric while degrading others.
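The Tuner limits and the Validator's compensating-error check can be sketched as two small functions. All names here are hypothetical. The sketch assumes that when more than 3 parameters are proposed the largest relative changes are kept, that the 30% step is relative to the current value, and that every metric is "lower is better"; none of these details are stated above.

```javascript
const MAX_PARAMS = 3;   // at most 3 parameters per proposal
const MAX_STEP = 0.30;  // at most a 30% move per parameter

// Relative size of a proposed change (falls back to absolute size if current is 0).
function relChange(current, proposed, name) {
  const base = Math.abs(current[name]) || 1;
  return Math.abs(proposed[name] - current[name]) / base;
}

// Limit a Tuner proposal to MAX_PARAMS parameters, each moved by at most
// MAX_STEP of its current value, then clipped to its {min, max} bounds.
// Note: a parameter currently at 0 cannot move under a purely relative step.
function constrainProposal(current, proposed, bounds = {}) {
  const names = Object.keys(proposed)
    .filter((n) => n in current && proposed[n] !== current[n])
    .sort((a, b) => relChange(current, proposed, b) - relChange(current, proposed, a))
    .slice(0, MAX_PARAMS); // keep the largest relative changes (assumption)
  const result = {};
  for (const n of names) {
    const maxDelta = Math.abs(current[n]) * MAX_STEP;
    const delta = Math.max(-maxDelta, Math.min(maxDelta, proposed[n] - current[n]));
    let value = current[n] + delta;
    if (bounds[n]) value = Math.max(bounds[n].min, Math.min(bounds[n].max, value));
    result[n] = value;
  }
  return result;
}

// Validator-style check: accept only if the target metric improves and no
// other metric worsens by more than `tolerance` (lower is better throughout).
function acceptsProposal(before, after, target, tolerance = 0) {
  if (!(after[target] < before[target])) return false;
  return Object.keys(before).every(
    (m) => m === target || after[m] <= before[m] + tolerance
  );
}
```

Rejecting on any regression is deliberately strict; a non-zero `tolerance` allows small trade-offs between metrics.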
Not all metrics are equal. A model that fits SST perfectly but has wrong AMOC dynamics is worse than one with 1°C higher RMSE but correct tipping behavior.
Binary gate. Energy balance, temperature range, stratification, AMOC positive. Failing T1 caps the score.
35% of score. Western intensification, gyre existence, ACC flow, deep water formation, poleward heat transport.
20% of score. Freshwater weakens AMOC, cooling cools ocean. Tests correct response to perturbations.
35% of score. Zonal-mean SST RMSE vs NOAA observations. The hard number — currently 3.3°C.
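One way the four tiers might combine into a single number is sketched below. The weights are the ones listed above, which sum to 90%; the sketch leaves the remaining 10% unassigned rather than guessing at it. The cap value `gateCap`, the `rmseToScore` mapping, and all names are hypothetical.

```javascript
const TIER_WEIGHTS = { t2: 0.35, t3: 0.20, t4: 0.35 }; // as listed above

// Combine tier results into one score.
// t1Pass: boolean gate; t2, t3, t4: tier scores in [0, 1].
// gateCap is a hypothetical ceiling applied when the T1 gate fails.
function tieredScore({ t1Pass, t2, t3, t4 }, gateCap = 0.2) {
  const clamp01 = (x) => Math.max(0, Math.min(1, x));
  const weighted =
    TIER_WEIGHTS.t2 * clamp01(t2) +
    TIER_WEIGHTS.t3 * clamp01(t3) +
    TIER_WEIGHTS.t4 * clamp01(t4);
  return t1Pass ? weighted : Math.min(weighted, gateCap);
}

// Hypothetical mapping from RMSE (degrees C) to a [0, 1] score:
// 1 at zero error, falling linearly to 0 at worstRmse and beyond.
function rmseToScore(rmse, worstRmse = 10) {
  return Math.max(0, 1 - rmse / worstRmse);
}
```

With this shape, a model that fails the binary gate cannot outrank one that passes it and scores above the cap, however good its SST fit.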
Triggered by conditions, not every iteration:
A collaboration between Luke Barrington (original physics engine + GPU compute) and Derek Lomas (salinity, AI loop, data pipeline, clouds, atmosphere, documentation).