Learning to Navigate without a Map
Yuhuang Hu, Shu Liu, Antoni Rosiñol Vidal, Yang Yu
ETH Zürich
This project is supervised by Nikolay Savinov at the Computer Vision and Geometry Group, ETH Zürich.
Introduction
In this project, we investigated how well deep learning algorithms can navigate a partially observable (PO) grid world. We implemented popular reinforcement learning architectures, including Policy Gradient (PG), Deep Q-Network (DQN), and Value Iteration Networks (VIN). We show that VIN performs strongly on this task, whereas the other RL algorithms fail to generalize. The performance of VIN is compared with the ground truth computed by A* in a fully observable environment.
Get the code
You can clone the code from here:
git clone https://github.com/ToniRV/Learning-to-navigate-without-a-map
Value Iteration Networks in Partially Observable Environment

VIN defines a fully differentiable neural network that maps the state of a PO environment to an action. Its value iteration (VI) module estimates an optimal value map through network iterations.
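To make the idea of a value map concrete, here is a minimal sketch of classical (non-learned) value iteration on a small grid. It is not the VIN implementation from this project: VIN replaces the fixed update with learned convolutions and works from partial observations. The function names `value_iteration` and `greedy_path`, the 4-connected move set, and the unit step cost are all assumptions made for illustration.

```python
import numpy as np

# Assumed action set: up, down, left, right (4-connected grid).
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def value_iteration(obstacles, goal, n_iters):
    """Run n_iters sweeps of value iteration with a cost of 1 per step.

    obstacles: boolean array, True where a cell is blocked.
    goal: (row, col) tuple of the goal cell.
    Returns a value map where -value is the number of steps to the goal
    (or -inf for blocked or not-yet-reached cells).
    """
    h, w = obstacles.shape
    value = np.full((h, w), -np.inf)
    value[goal] = 0.0
    for _ in range(n_iters):
        # Pad with -inf so that moves off the grid are never preferred.
        padded = np.pad(value, 1, constant_values=-np.inf)
        # q[a] holds, for every cell, the value of the neighbour reached by move a.
        q = np.stack([padded[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
                      for dr, dc in MOVES])
        # Bellman backup: pay one step, then take the best neighbour.
        value = -1.0 + q.max(axis=0)
        value[obstacles] = -np.inf
        value[goal] = 0.0
    return value


def greedy_path(value, start, goal, max_steps=100):
    """Follow the highest-valued neighbour from start until the goal is reached."""
    h, w = value.shape
    path, pos = [start], start
    while pos != goal and len(path) <= max_steps:
        candidates = [(pos[0] + dr, pos[1] + dc) for dr, dc in MOVES
                      if 0 <= pos[0] + dr < h and 0 <= pos[1] + dc < w]
        pos = max(candidates, key=lambda p: value[p])
        path.append(pos)
    return path


obstacles = np.array([[0, 0, 0],
                      [1, 1, 0],
                      [0, 0, 0]], dtype=bool)
value = value_iteration(obstacles, goal=(2, 0), n_iters=10)
path = greedy_path(value, start=(0, 0), goal=(2, 0))
```

In this toy grid the wall forces a detour, so the greedy path from the top-left corner goes around the right-hand side. The number of sweeps plays the same role as the number of recurrent iterations in a VI module: it bounds how far value information can propagate from the goal.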
Project Video
Results
Average path difference between the learned policy and the A* ground truth.
Environment | A* | D* | VIN-PO | Policy Gradient | DQN |
---|---|---|---|---|---|
8x8 | 0 | 0.0124 | 0.0526 | n/a | n/a |
16x16 | 0 | 0.3426 | 0.7144 | n/a | n/a |
28x28 | 0 | 0.1606 | 0.5074 | n/a | n/a |
The table above shows that a well-trained VIN architecture finds paths that are, on average, less than one step longer than the ground truth path.
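As an illustration of how such a path difference could be computed, the sketch below measures the optimal path length with breadth-first search and averages the excess length of the predicted paths. Several things here are assumptions rather than the project's actual evaluation code: the helper names `shortest_path_length` and `average_path_difference`, the 4-connected grid with unit step costs (where BFS returns the same optimal length A* would), and the definition of the metric as the mean of predicted length minus optimal length.

```python
from collections import deque


def shortest_path_length(obstacles, start, goal):
    """Return the optimal number of steps from start to goal, or None if unreachable.

    obstacles: list of rows, truthy where a cell is blocked.
    Uses BFS, which is optimal on a grid with unit step costs.
    """
    h, w = len(obstacles), len(obstacles[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w
                    and not obstacles[nr][nc] and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None


def average_path_difference(predicted_lengths, optimal_lengths):
    """Mean number of extra steps taken by the policy (assumed metric definition)."""
    diffs = [p - o for p, o in zip(predicted_lengths, optimal_lengths)]
    return sum(diffs) / len(diffs)


grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
optimal = shortest_path_length(grid, (0, 0), (2, 0))
# Two hypothetical trajectories: one optimal, one with a two-step detour.
mean_excess = average_path_difference([optimal, optimal + 2], [optimal, optimal])
```

Under this definition, a value of 0 means the policy always matches the optimal path length, which is why the A* column is 0 by construction.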
Conclusion
- In this project, we have verified that VIN performs strongly not only in fully observable cases but also on the navigation problem in a partially observable grid world. As shown above, the trained VIN model gives promising results: its paths almost match the ground truth path in the 8x8 grid world and are less than one step longer on average in the 16x16 and 28x28 grid worlds.
- We also found that reinforcement learning algorithms such as Policy Gradient and Deep Q-Network do not fit this task: the experiments produced no presentable results.
- Finally, we implemented a demonstration of VIN applied to drones, based on the Gazebo robotic simulator.
Report
- Hu, Y., Liu, S., Vidal, A.R., Yu, Y. (2017). Learning to navigate without a map. 3D Vision Project Report. Zürich, Switzerland: ETH Zürich.
Contacts
Yuhuang Hu, Shu Liu, Antoni Rosiñol Vidal, Yang Yu
Email: {hyh, liush, antonir, yuya}@student.ethz.ch