"RAVEN is a 3D memory-based, behavior tree framework for aerial semantic navigation in unstructured outdoor environments." (RAVEN README, castacks/RAVEN; raw.githubusercontent.com, verified 2026-05-27)
You know that feeling when you need a drone to search an outdoor area for a specific object — a vehicle, a structural defect, a person — but every open-source semantic navigation system you can find was built for indoor rooms and ground robots? Outdoor environments span hundreds of meters, targets appear sparsely (maybe one fire hydrant per city block), and you can't precompute a scene graph every time the mission changes. Reactive policies that only look at current camera frames can't plan ahead; pre-mapped approaches break the moment you move to a new site. RAVEN was built specifically for this gap: large unstructured outdoor environments, sparse targets, zero prior map, aerial platform.
As the drone flies, RAVEN builds a growing 3D memory with two kinds of entries. Objects close enough for the depth sensor to measure get precise 3D coordinates (voxels); objects visible in the camera but beyond the depth sensor's range are stored as directional arrows pointing toward likely target locations (ray frontiers). A behavior tree continuously reads this memory and picks one of four strategies: fly to a confirmed nearby object, follow a directional hint, ask a language model for auxiliary cues when memory is sparse (for example, "what else tends to be near a fire hydrant?"), or explore new areas when memory is empty. The perception backbone is RayFronts (IROS 2025), running at 75.06 frames per second. On real hardware (Jetson AGX Orin), the language model branch runs offboard because the Jetson can't run the model locally fast enough.
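To make the voxel/ray split and the four-way strategy choice concrete, here is a minimal Python sketch. It is not RAVEN's code or API: every name (`Memory`, `observe`, `select_strategy`, `Strategy`), the 10 m depth cutoff, and the priority order (taken from the order in which the strategies are listed above) are assumptions made for illustration. "Sparse" is interpreted here as "memory holds observations, but none of the target", which is also an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


class Strategy(Enum):
    GO_TO_VOXEL = auto()   # fly to a confirmed, depth-measured object
    FOLLOW_RAY = auto()    # follow a directional hint (ray frontier)
    QUERY_LVLM = auto()    # ask a language model for auxiliary cues
    EXPLORE = auto()       # nothing in memory yet: explore new areas


@dataclass
class Memory:
    """Hypothetical two-part memory: 3D points plus direction-only rays."""
    voxels: Dict[str, List[Vec3]] = field(default_factory=dict)
    rays: Dict[str, List[Tuple[Vec3, Vec3]]] = field(default_factory=dict)

    def observe(self, label: str, origin: Vec3, direction: Vec3,
                depth: Optional[float], max_depth: float = 10.0) -> None:
        """Store a detection as a voxel if depth is usable, else as a ray.

        `direction` is assumed to be a unit vector; `max_depth` is an
        illustrative sensor range, not a value from the paper.
        """
        if depth is not None and depth <= max_depth:
            # Depth is measurable: project along the view direction to a 3D point.
            point = tuple(o + depth * d for o, d in zip(origin, direction))
            self.voxels.setdefault(label, []).append(point)
        else:
            # Beyond depth range: keep only where we looked from and toward.
            self.rays.setdefault(label, []).append((origin, direction))


def select_strategy(memory: Memory, target: str) -> Strategy:
    """Fallback-style selection: the first condition that holds wins."""
    if memory.voxels.get(target):
        return Strategy.GO_TO_VOXEL
    if memory.rays.get(target):
        return Strategy.FOLLOW_RAY
    if memory.voxels or memory.rays:
        # Memory has other objects but not the target: ask for related cues.
        return Strategy.QUERY_LVLM
    return Strategy.EXPLORE
```

A short usage example: with an empty memory the sketch explores; a far-away tree (no depth) moves it to the language-model branch; a far-away hydrant produces a ray to follow; and a hydrant within depth range becomes a voxel to fly to.

```python
m = Memory()
print(select_strategy(m, "fire hydrant"))                       # Strategy.EXPLORE
m.observe("tree", (0.0, 0.0, 5.0), (1.0, 0.0, 0.0), depth=None)
print(select_strategy(m, "fire hydrant"))                       # Strategy.QUERY_LVLM
m.observe("fire hydrant", (0.0, 0.0, 5.0), (0.0, 1.0, 0.0), depth=None)
print(select_strategy(m, "fire hydrant"))                       # Strategy.FOLLOW_RAY
m.observe("fire hydrant", (0.0, 0.0, 5.0), (0.0, 1.0, 0.0), depth=4.0)
print(select_strategy(m, "fire hydrant"))                       # Strategy.GO_TO_VOXEL
```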
If you do robotics research on UAV navigation, semantic SLAM, or outdoor embodied AI, RAVEN gives you a public baseline with ICRA-published numbers to beat or build on. It is also directly useful if you're working with CMU's AirStack (ROS 2 autonomy stack) or RayFronts (IROS 2025 perception backbone) and want a reference integration. It is not useful yet if you need production deployment, fully onboard LVLM inference, multi-UAV coordination, or a system that runs without GPU hardware and a multi-container Docker setup.
Worth reading the paper and watching the demo video if you work on outdoor UAV autonomy — the ablation study is honest and the voxel-ray architectural split is a clean idea worth understanding. Don't attempt to run it unless you already have Isaac Sim configured, a beefy GPU (the paper used an RTX 6000 Ada), and patience for configuring three Docker containers. At 48 stars and 1 contributor, this is a research artifact accompanying an ICRA paper, not a maintained open-source platform.