Learning to Navigate without a Map

Yuhuang Hu, Shu Liu, Antoni Rosiñol Vidal, Yang Yu

ETH Zürich

This project is supervised by Nikolay Savinov at the Computer Vision and Geometry Group, ETH Zürich.


In this project, we investigated how well deep learning algorithms can navigate a partially observable (PO) grid world. We implemented popular reinforcement learning architectures, including Policy Gradient (PG), Deep Q-Network (DQN), and Value Iteration Networks (VIN). We show that VIN performs strongly on this task, whereas the other RL algorithms fail to generalize. The performance of VIN is compared with the ground truth computed by A* in a fully observable environment.

Get the code

You can clone the code from here:

git clone https://github.com/ToniRV/Learning-to-navigate-without-a-map

Value Iteration Networks in a Partially Observable Environment

VIN defines a fully differentiable neural network that maps the state of a PO environment to an action. Its value iteration (VI) module estimates an optimal value map through repeated network iterations.
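To make the idea of a value map concrete, here is a minimal sketch of classical value iteration on a grid, written with NumPy. It is not the VIN implementation from this project: in VIN the reward map and transition kernels are learned and each iteration is a convolution followed by a max over action channels, whereas this sketch hard-codes four deterministic moves and a step cost of -1. The function names `value_iteration` and `greedy_action`, the move set, and the example grid are all assumptions made for illustration.

```python
import numpy as np

# Four deterministic moves: up, down, left, right (an assumption of this sketch).
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def value_iteration(obstacles, goal, n_iters):
    """Return a map where value[r, c] = -(number of steps from (r, c) to goal).

    obstacles is a boolean array (True marks a wall); goal is a (row, col) tuple.
    Cells that cannot reach the goal within n_iters steps keep the value -inf.
    """
    h, w = obstacles.shape
    value = np.full((h, w), -np.inf)
    value[goal] = 0.0
    for _ in range(n_iters):
        # Pad with -inf so that stepping off the grid is never attractive.
        padded = np.pad(value, 1, constant_values=-np.inf)
        # q[a] is the return of move a: step cost -1 plus the neighbour's value.
        q = np.stack([
            -1.0 + padded[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
            for dr, dc in MOVES
        ])
        value = q.max(axis=0)        # max over the action axis
        value[obstacles] = -np.inf   # walls can never be occupied
        value[goal] = 0.0            # the goal is terminal
    return value


def greedy_action(value, pos):
    """Index into MOVES of the neighbour with the highest value."""
    padded = np.pad(value, 1, constant_values=-np.inf)
    r, c = pos
    return int(np.argmax([padded[r + 1 + dr, c + 1 + dc] for dr, dc in MOVES]))


# Hypothetical 3x3 grid with a wall in the middle column of the top two rows.
obstacles = np.array([[0, 1, 0],
                      [0, 1, 0],
                      [0, 0, 0]], dtype=bool)
value = value_iteration(obstacles, goal=(0, 2), n_iters=10)
print(value[0, 0])                    # -6.0: six steps around the wall
print(greedy_action(value, (0, 0)))   # 1: the only useful move is "down"
```

Acting greedily on the value map recovers a shortest path in this fully specified toy case. The point of VIN is that the network learns an analogous computation end to end from observations instead of being given the dynamics.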

Average path difference between learned policy and A* ground truth.

Environment   A*   D*       VIN-PO   Policy Gradient   DQN
8x8           0    0.0124   0.0526   n/a               n/a
16x16         0    0.3426   0.7144   n/a               n/a
28x28         0    0.1606   0.5074   n/a               n/a

The table above shows that a well-trained VIN architecture finds paths that are, on average, less than one step longer than the A* ground-truth path (between 0.05 and 0.71 steps, depending on grid size).
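As an illustration of how a number like those in the table could be obtained, the sketch below computes the A* shortest-path length on a 4-connected grid and averages the difference to the length of a policy's path. This is one plausible reading of the metric, not the project's evaluation code: the function names, the episode format, the Manhattan heuristic, and the assumption that every goal is reachable and every policy rollout reaches it are all hypothetical.

```python
import heapq


def astar_length(grid, start, goal):
    """Number of steps on the shortest 4-connected path, or None if unreachable.

    grid[r][c] == 1 marks an obstacle; the heuristic is the Manhattan distance.
    """
    h, w = len(grid), len(grid[0])

    def heuristic(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]   # (f = g + h, g, cell)
    best = {start: 0}                           # cheapest known cost per cell
    while frontier:
        _, cost, pos = heapq.heappop(frontier)
        if pos == goal:
            return cost
        if cost > best[pos]:
            continue  # stale queue entry, a cheaper route was found later
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nxt = (pos[0] + dr, pos[1] + dc)
            if not (0 <= nxt[0] < h and 0 <= nxt[1] < w):
                continue  # off the grid
            if grid[nxt[0]][nxt[1]]:
                continue  # obstacle
            new_cost = cost + 1
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None


def average_path_difference(episodes):
    """Mean of (policy path length - A* path length), both counted in steps.

    Each episode is (grid, start, goal, policy_path), where policy_path lists
    the cells visited by the policy, start and goal included. Assumes the goal
    is reachable in every episode.
    """
    diffs = []
    for grid, start, goal, policy_path in episodes:
        optimal = astar_length(grid, start, goal)
        diffs.append((len(policy_path) - 1) - optimal)
    return sum(diffs) / len(diffs)
```

Under this reading, a value of 0 means the policy always matched the shortest path, and a value of 0.5 means it took one extra step in every second episode on average, or a comparable mix of longer detours.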


  • In this project, we verified that VIN performs well on grid-world navigation not only in the fully observable case but also in the partially observable one. As shown above, the trained VIN model nearly matches the ground-truth path in the 8x8 grid world and is, on average, less than one step longer in the 16x16 and 28x28 grid worlds.
  • The other reinforcement learning algorithms, Policy Gradient and Deep Q-Network, do not fit this task: our experiments with them produced no presentable results.
  • Finally, we implemented a demonstration of VIN applied to drone navigation in the Gazebo robotics simulator.



Yuhuang Hu, Shu Liu, Antoni Rosiñol Vidal, Yang Yu
Email: {hyh, liush, antonir, yuya}@student.ethz.ch