Dynamic Occupancy Recurrent Network (DORN)

A dynamic occupancy network differs from a conventional occupancy grid map in that it can 'predict' future occupancy from past inputs (occupancy information).


Traditionally, multi-target tracking with data association algorithms has been used for this purpose. These algorithms first 'identify (or detect)' each object using object detection algorithms. For example, boosting algorithms such as Viola-Jones have been used to detect faces or eyes, and HOG-based filters have been used to detect pedestrians. Based on these detections, short trajectories called tracklets are constructed and then linked into longer trajectories using 'data association' algorithms.


While detection can be done in real time, data association is #P-complete, meaning that it can hardly be done in real time as the number of objects increases. Furthermore, in unstructured environments, identification itself can be a demanding task, since the set of objects that could move is effectively unbounded.


Another branch of dynamic environment modeling uses an occupancy grid map. Dynamic occupancy grid filtering aims to handle this issue by filtering not only the occupancy information but also the velocity of each grid cell. While this framework has been extended in many other studies, including Bayesian methods with given prior information, we believe that it has two major drawbacks: the limitations of the filtering algorithm and the computational complexity.


In our research, we used a max-pooling recurrent network to construct the dynamic occupancy network (DONET).


The following is the architecture of DONET.



The bottommost and uppermost layers, colored in dark gray, are the 'input layer' and the 'prediction layer', respectively. One can think of the 'input layer' as an ordinary occupancy grid map (OCM) and of the 'prediction layer' as the prediction of the next occupancy. The intermediate layer, called the 'context layer', consists of '$|N|$' cells, where '$N$' is the set of neighbors. Each cell holds the prediction from the previous time step and is connected to another cell; the connection is one-to-one and directed in a recurrent manner. The prediction is performed by max-pooling the context column and passing the value through the sigmoid function to get a valid probability. Thus, the whole network can be seen as a special type of recurrent network with max-pooling.
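
The prediction rule can be written compactly. The notation here is an assumption introduced for illustration and is not taken from the original figure: $c_t^{(n)}(x)$ denotes the value of the context cell for neighbor $n \in N$ at grid cell $x$ and time $t$, and $\hat{o}_{t+1}(x)$ denotes the predicted occupancy probability of $x$ at the next time step.

$$\hat{o}_{t+1}(x) = \sigma\Big(\max_{n \in N} c_t^{(n)}(x)\Big), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}$$

Since the sigmoid maps any real value into $(0, 1)$, the context cells can hold unbounded confidence values while the prediction layer still outputs a valid probability.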

Algorithm

1. For 'occ' cells, propagate context information (increase confidence)

2. Uncertainty in acceleration

3. For 'nocc' cells, decrease confidence

4. Further prediction (for overconfident cells)

5. Predict future occupancy using max-pooling with sigmoid function 

6. Smooth (optional)

7. Penalize doubly idle state 
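
The steps above are only outlined, so the following is a minimal sketch of one possible reading and not the actual implementation. It covers only steps 1, 3 and 5. Steps 2, 4, 6 and 7 are left out because the post does not specify them. The neighbor set `NEIGHBORS`, the constants `GAIN` and `DECAY`, and the function names `empty_context` and `donet_step` are all hypothetical. The sketch assumes that each context channel corresponds to one neighbor offset and forwards its confidence to the cell at that offset.

```python
import math

# Hypothetical neighbor set N: each (d_row, d_col) offset is one context cell.
NEIGHBORS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
GAIN = 2.0    # assumed confidence increase at 'occ' cells (step 1)
DECAY = 1.0   # assumed confidence decrease at 'nocc' cells (step 3)


def empty_context(rows, cols):
    """Context indexed as context[k][r][c], one channel k per neighbor."""
    return [[[0.0] * cols for _ in range(rows)] for _ in NEIGHBORS]


def donet_step(context, occupancy):
    """Run one simplified update; return (new_context, prediction)."""
    rows, cols = len(occupancy), len(occupancy[0])
    new_context = empty_context(rows, cols)
    for k, (dr, dc) in enumerate(NEIGHBORS):
        for r in range(rows):
            for c in range(cols):
                # Steps 1 and 3: raise confidence where the cell is
                # occupied, lower it where the cell is free.
                delta = GAIN if occupancy[r][c] else -DECAY
                value = context[k][r][c] + delta
                # Recurrent one-to-one link: channel k forwards its value
                # to the cell at offset k (dropped at the grid border).
                tr, tc = r + dr, c + dc
                if 0 <= tr < rows and 0 <= tc < cols:
                    new_context[k][tr][tc] = value
    # Step 5: max-pool each context column, then apply the sigmoid.
    prediction = [
        [1.0 / (1.0 + math.exp(-max(ch[r][c] for ch in new_context)))
         for c in range(cols)]
        for r in range(rows)
    ]
    return new_context, prediction
```

Under these assumptions, feeding the sketch an object that moves one cell to the right per step makes the channel for the rightward offset accumulate confidence, so the cell ahead of the object receives a higher predicted occupancy than the cell behind it.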







