LAFR · A self-adaptive framework where quadrotor policies evolve in real-world flight, growing faster and more agile with every iteration.
Yunfan Ren¹ · Zhiyuan Zhu¹ · Jiaxu Xing¹ · Davide Scaramuzza¹
¹ Robotics and Perception Group · University of Zurich
LAFR is a self-adaptive framework that learns agile quadrotor flight directly in the real world, without precise system identification, without offline Sim2Real transfer, and without conservative safety margins. The system operates as a continuous closed-loop cycle bridging physical execution and differentiable simulation: a learned hybrid dynamics model closes the reality gap; RASH-BPTT (Real-world Anchored Short-horizon Backpropagation Through Time) optimizes the control policy via massively parallel rollouts anchored at the latest real-world state; and Adaptive Temporal Scaling jointly retunes the reference trajectory's time-scale \(\alpha\) using closed-loop sensitivity, maximizing agility while enforcing safety via a barrier function. The base policy evolves from a peak speed of 2.0 m/s to 7.3 m/s within roughly 100 seconds of physical flight time, converging to a 2.34 s figure-8 lap at \(\alpha = 0.28\).
If you find this work useful, please cite:
```bibtex
@inproceedings{ren2026agile,
  title     = {Learning Agile Quadrotor Flight in the Real World},
  author    = {Ren, Yunfan and Zhu, Zhiyuan and Xing, Jiaxu and Scaramuzza, Davide},
  booktitle = {Robotics: Science and Systems (RSS)},
  year      = {2026}
}
```
A continuous closed-loop cycle bridging physical execution and differentiable simulation. Hover any module to spotlight it; click to pin the detail panel.
The figure below embeds the official Rerun web viewer (WASM), streaming the training recording from this site. Drag to orbit, scroll to zoom, and use the `sim_time` timeline at the bottom to scrub through the eleven ATS iterations as \(\alpha\) contracts and the lap time falls from 8.34 s to 2.08 s.
The recording above corresponds row-for-row to the eleven ATS iterations below. \(\alpha\) contracts from 1.0 to its 0.25 floor, the lap time falls from 8.34 s to 2.08 s, and the policy's peak speed grows from 3.4 m/s to 10.0 m/s; from the first contraction onward, tracking RMSE stays at or below 0.16 m, well clear of the 0.35 m safety guard.
| Iter | \(\alpha\) | Lap time | Tracking RMSE | Notes |
|---|---|---|---|---|
| 0 | 1.000 | 8.34 s | 0.60 m | Base policy, pre-residual |
| 1 | 0.753 | 6.28 s | 0.19 m | ATS first contraction |
| 2 | 0.753 | 6.28 s | 0.11 m | Residual closes sim-to-real gap |
| 3 | 0.579 | 4.82 s | 0.06 m | |
| 4 | 0.471 | 3.92 s | 0.04 m | Best RMSE (0.042 m) |
| 5 | 0.399 | 3.32 s | 0.06 m | |
| 6 | 0.350 | 2.92 s | 0.07 m | |
| 7 | 0.313 | 2.60 s | 0.08 m | |
| 8 | 0.281 | 2.34 s | 0.09 m | |
| 9 | 0.255 | 2.12 s | 0.16 m | Approaching \(\alpha\) floor |
| 10 | 0.250 | 2.08 s | 0.09 m | Converged at \(\alpha\) floor |
After the residual closes the sim-to-real gap (iter 2 onward), tracking RMSE drops to 0.042 m at iter 4 and stays at or below 0.16 m for the rest of the run, well clear of the 0.35 m safety guard. The pipeline converges to a 2.08 s lap at \(\alpha = 0.25\) (the \(\alpha\) floor) on iteration 10.
ROS-free, JAX-only reference implementation. A single workstation with a modern NVIDIA GPU reproduces the figure-8 lap-time curve end-to-end.
```bash
conda install -n base -c conda-forge mamba
mamba create -n flightning python=3.11 -y
mamba activate flightning
pip install --upgrade "jax[cuda12]"
pip install -e ".[dev]"
```
```bash
python -m flightning.scripts.train \
    --log_dir outputs/tracking

python -m flightning.online_learning.run_pipeline \
    --cfg flightning/cfg/online.yaml
```
We thank the Robotics and Perception Group at the University of Zurich for hardware, lab space, and countless flight sessions. We are grateful to the open-source communities behind JAX, Rerun, and the broader differentiable-simulation ecosystem, on whose tools this work stands.
This research was supported in part by the National Centre of Competence in Research (NCCR) Robotics through the Swiss National Science Foundation (SNSF) and by the European Research Council (ERC) under the European Union's Horizon programme.
Real-world Anchored Short-horizon Backpropagation Through Time (RASH-BPTT).
The control policy \(\pi_\phi\) is an MLP (2×256 hidden units) producing a 4-D CTBR command (collective thrust and body rates). We unroll the learned hybrid dynamics for \(H\) steps and differentiate through the entire trajectory via JAX autodiff + JIT + vmap.
Tightly coupled with Module E: \(\phi\) and \(\alpha\) are updated jointly each cycle.
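To make this concrete, here is a minimal JAX sketch of one batched RASH-BPTT gradient. `policy`, `hybrid_step`, and `step_cost` are illustrative stand-ins (the hybrid model is reduced to a linear placeholder), not the repository's API; shapes and hyperparameters are assumptions.

```python
import jax
import jax.numpy as jnp

def policy(phi, obs):
    # 2x256 MLP with tanh activations producing a 4-D CTBR command.
    h = jnp.tanh(obs @ phi["w1"] + phi["b1"])
    h = jnp.tanh(h @ phi["w2"] + phi["b2"])
    return h @ phi["w3"] + phi["b3"]

def hybrid_step(x, u, theta):
    # Stand-in for the learned hybrid model f_theta
    # (nominal rigid-body dynamics + neural residual).
    return x + 0.01 * (theta["A"] @ x + theta["B"] @ u)

def step_cost(x, ref_k, u):
    # Stand-in tracking cost: state error plus a small action penalty.
    return jnp.sum((x - ref_k) ** 2) + 1e-3 * jnp.sum(u ** 2)

def rollout_loss(phi, theta, x0, ref):
    # Unroll H steps of the hybrid dynamics from the real-world anchor
    # state x0 and accumulate cost; differentiable end-to-end w.r.t. phi.
    def step(x, ref_k):
        u = policy(phi, jnp.concatenate([x, ref_k]))
        x_next = hybrid_step(x, u, theta)
        return x_next, step_cost(x_next, ref_k, u)
    _, costs = jax.lax.scan(step, x0, ref)   # ref has shape (H, state_dim)
    return costs.sum()

# Massively parallel short-horizon rollouts, all anchored at the latest
# real-world state estimate (vmap over a batch of reference segments).
batched_loss = jax.vmap(rollout_loss, in_axes=(None, None, None, 0))
policy_grad = jax.jit(jax.grad(
    lambda phi, theta, x0, refs: batched_loss(phi, theta, x0, refs).mean()))
```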
The quadrotor executes the current policy \(\pi_\phi\) on physical hardware, streaming state-action transitions \((\mathbf{x}_k,\mathbf{u}_k,\mathbf{x}_{k+1})\) into a sliding-window replay buffer \(\mathcal{B}\) for downstream calibration.
The policy observation also includes a short history \(\mathbf{h}_k\) of past actions to encourage control smoothness.
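A minimal sketch of such a sliding-window buffer, assuming flat NumPy storage; the class and field names are hypothetical, and the action history \(\mathbf{h}_k\) is omitted for brevity.

```python
import numpy as np

class SlidingWindowBuffer:
    """Fixed-capacity FIFO buffer of (x_k, u_k, x_{k+1}) transitions streamed
    from the physical rollout; old samples fall out as the window slides."""
    def __init__(self, capacity, x_dim, u_dim):
        self.x  = np.zeros((capacity, x_dim))
        self.u  = np.zeros((capacity, u_dim))
        self.xn = np.zeros((capacity, x_dim))
        self.ptr, self.size, self.capacity = 0, 0, capacity

    def push(self, x, u, x_next):
        i = self.ptr
        self.x[i], self.u[i], self.xn[i] = x, u, x_next
        self.ptr = (i + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def window(self):
        # Valid slice used for online calibration of the residual model.
        return self.x[:self.size], self.u[:self.size], self.xn[:self.size]
```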
A neural residual augments the nominal rigid-body dynamics to close the reality gap (unmodeled aerodynamics, motor delays, payload variations). It is trained online by minimizing the one-step prediction error through a differentiable RK4 integrator on \(SO(3)\).
The prediction loss \(\mathcal{D}\) combines the Euclidean translation error with the squared geodesic rotation error \(\|\mathrm{Log}(\hat{\mathbf{R}}^\top\mathbf{R})^\vee\|_2^2\) on \(SO(3)\).
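A JAX sketch of this composite distance, assuming the state carries a position vector and a rotation matrix; the relative weight `w_rot` is an assumed hyperparameter. During online calibration, \(\mathcal{D}\) would be evaluated between the RK4-integrated one-step prediction and the measured next state.

```python
import jax.numpy as jnp

def so3_log(R):
    # Log map on SO(3): rotation matrix -> axis-angle vector Log(R)^vee.
    cos_theta = jnp.clip((jnp.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = jnp.arccos(cos_theta)
    w = jnp.array([R[2, 1] - R[1, 2],
                   R[0, 2] - R[2, 0],
                   R[1, 0] - R[0, 1]])
    # theta / (2 sin theta) -> 1/2 as theta -> 0; guard the singularity.
    safe_sin = jnp.where(theta < 1e-6, 1.0, jnp.sin(theta))
    scale = jnp.where(theta < 1e-6, 0.5, theta / (2.0 * safe_sin))
    return scale * w

def state_distance(p_hat, R_hat, p, R, w_rot=1.0):
    # D = ||p_hat - p||^2 + w_rot * ||Log(R_hat^T R)^vee||^2:
    # Euclidean translation error plus squared geodesic rotation error.
    return jnp.sum((p_hat - p) ** 2) + w_rot * jnp.sum(so3_log(R_hat.T @ R) ** 2)
```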
Instead of random simulator resets, every short-horizon rollout starts from the instantaneous physical state estimate. This anchors policy gradients in reality and avoids the model exploitation failure mode of long-horizon BPTT.
Compounding prediction errors of the learned residual grow with horizon length. A short \(H\) keeps the gradient \(\nabla_\phi\mathcal{J}\) faithful to the current dynamic regime, so no rollout drifts far from a state the model has actually seen.
ATS optimizes the time-scale \(\alpha\) by counterfactual reasoning anchored at the actual real-world rollout \(\{(\bar{\mathbf{x}}_k,\bar{\mathbf{u}}_k)\}\). The differentiable hybrid model serves as a proxy to construct a counterfactual state sequence \(\hat{\mathbf{x}}_k(\alpha)\) (what the trajectory would have been at a different time-scale), and gradients flow through this proxy with no physical perturbation. The result is temporal elasticity: ATS compresses time (smaller \(\alpha\)) when tracking is precise and relaxes it when disturbances or model mismatch grow.
Here \(\Psi(z)=\tfrac{1}{\kappa}\ln(1+e^{\kappa z})\) is a softplus barrier and \(\mathcal{E}_k\) is the tracking error of the counterfactual rollout \(\hat{\mathbf{x}}_k(\alpha)\).
\(\nabla_\alpha\mathcal{J}_{\text{ATS}}\) is propagated along the real rollout via a one-step linearization of the hybrid model \(f_\theta\). The state sensitivity \(\mathbf{S}_k \triangleq \mathrm{d}\hat{\mathbf{x}}_k/\mathrm{d}\alpha\), with \(\mathbf{S}_0 = \mathbf{0}\), evolves recursively:
\[
\mathbf{S}_{k+1} = \mathbf{A}_k\,\mathbf{S}_k + \mathbf{B}_k\,\frac{\mathrm{d}\mathbf{u}_k}{\mathrm{d}\alpha},
\]
with the Jacobians frozen at the measured real-world rollout \((\bar{\mathbf{x}}_k,\bar{\mathbf{u}}_k)\):
\[
\mathbf{A}_k = \left.\frac{\partial f_\theta}{\partial \mathbf{x}}\right|_{(\bar{\mathbf{x}}_k,\,\bar{\mathbf{u}}_k)}, \qquad
\mathbf{B}_k = \left.\frac{\partial f_\theta}{\partial \mathbf{u}}\right|_{(\bar{\mathbf{x}}_k,\,\bar{\mathbf{u}}_k)}.
\]
The action sensitivity is obtained by differentiating the policy through its observation \(\mathbf{o}_k(\hat{\mathbf{x}}_k, \mathbf{x}_{\mathrm{ref},k}(\alpha))\):
\[
\frac{\mathrm{d}\mathbf{u}_k}{\mathrm{d}\alpha} = \frac{\partial \pi_\phi}{\partial \mathbf{o}_k}\!\left(\frac{\partial \mathbf{o}_k}{\partial \hat{\mathbf{x}}_k}\,\mathbf{S}_k + \frac{\partial \mathbf{o}_k}{\partial \mathbf{x}_{\mathrm{ref},k}}\,\frac{\mathrm{d}\mathbf{x}_{\mathrm{ref},k}(\alpha)}{\mathrm{d}\alpha}\right).
\]
Chaining \(\mathbf{S}_k\) into the per-step error gradient \(\partial \mathcal{E}_k/\partial \hat{\mathbf{x}}_k\) closes the loop: \(\alpha\) is then updated by a single projected gradient step using the analytical \(\nabla_\alpha\mathcal{J}_{\text{ATS}}\), with no finite-difference probing of the physical system.
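Concretely, the recursion and the projected update fit in a few lines of JAX. The sketch below reuses the stand-in `policy` and `hybrid_step` from the RASH-BPTT example; the objective form \(\mathcal{J}_{\text{ATS}} = \alpha + \lambda\sum_k\Psi(\mathcal{E}_k - \varepsilon)\), the position-error definition of \(\mathcal{E}_k\), and all hyperparameters are assumptions for illustration, with \(\varepsilon\) set to the 0.35 m guard.

```python
import jax
import jax.numpy as jnp

def ats_alpha_step(alpha, xs_bar, us_bar, theta, phi, ref_fn,
                   eps=0.35, kappa=10.0, lam=1.0, lr=1e-2, alpha_floor=0.25):
    # One projected gradient step on the time-scale alpha. Sensitivities
    # S_k = d x_hat_k / d alpha follow S_{k+1} = A_k S_k + B_k du_k/dalpha,
    # with all Jacobians frozen at the measured rollout (xs_bar, us_bar).
    n = xs_bar.shape[1]
    S = jnp.zeros(n)                 # S_0 = 0: the rollout is anchored in reality
    g = 0.0                          # accumulates the barrier term's alpha-gradient
    barrier = lambda z: jax.nn.softplus(kappa * z) / kappa   # Psi(z)
    for k in range(us_bar.shape[0]):
        x_bar, u_bar, ref_k = xs_bar[k], us_bar[k], ref_fn(k, alpha)
        # Frozen Jacobians A_k, B_k of the hybrid model at the real transition.
        A = jax.jacfwd(lambda x: hybrid_step(x, u_bar, theta))(x_bar)
        B = jax.jacfwd(lambda u: hybrid_step(x_bar, u, theta))(u_bar)
        # du_k/dalpha: the policy differentiated through obs = (x_hat, x_ref(alpha)).
        J = jax.jacfwd(lambda o: policy(phi, o))(jnp.concatenate([x_bar, ref_k]))
        dref = jax.jacfwd(lambda a: ref_fn(k, a))(alpha)
        du = J[:, :n] @ S + J[:, n:] @ dref
        # Chain S_k into the per-step barrier on the tracking error E_k
        # (position assumed to be the first three state components).
        dE = jax.grad(lambda x, r: barrier(jnp.linalg.norm(x[:3] - r[:3]) - eps),
                      argnums=(0, 1))(x_bar, ref_k)
        g = g + dE[0] @ S + dE[1] @ dref
        S = A @ S + B @ du           # sensitivity recursion
    # Assumed objective J_ATS = alpha + lam * sum_k Psi(E_k - eps):
    # smaller alpha means a faster lap; the barrier guards tracking error.
    return jnp.clip(alpha - lr * (1.0 + lam * g), alpha_floor, 1.0)
```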
The five modules form a continuous closed-loop cycle bridging the physical robot and the differentiable simulator. Each pass refines both the dynamics model \(\theta\) and the policy \(\phi\), while ATS retunes the time-scale \(\alpha\).
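Read as code, one outer iteration of this cycle might look like the structural sketch below. `fly_and_record`, `fit_residual`, and `rash_bptt_update` are hypothetical placeholders standing in for the execution, calibration, and policy-update modules; `ats_alpha_step` is the ATS sketch above.

```python
def pipeline_iteration(phi, theta, alpha, buffer, ref_fn):
    # (A) Execute pi_phi on hardware; stream (x_k, u_k, x_{k+1}) into B.
    rollout = fly_and_record(phi, alpha, buffer)          # hypothetical helper
    # (B) Online calibration: refit the neural residual on the sliding window.
    theta = fit_residual(theta, buffer.window())          # hypothetical helper
    # (C, D) RASH-BPTT: short-horizon rollouts anchored at the latest
    # real-world state estimate, policy updated from the batched gradient.
    phi = rash_bptt_update(phi, theta, rollout.xs[-1], ref_fn, alpha)  # hypothetical
    # (E) ATS: one projected gradient step on the time-scale alpha,
    # computed from the same measured rollout (no extra flights).
    alpha = ats_alpha_step(alpha, rollout.xs[:-1], rollout.us, theta, phi, ref_fn)
    return phi, theta, alpha
```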
Base policy evolves from 2.0 m/s to 7.3 m/s peak speed within roughly 100 s of physical flight. No offline system ID, no conservative safety margins.