# Reviews I got from IROS 2017

Posted 2017.06.27 10:16

I got two papers accepted (one as first author and the other as third author), and one paper rejected.

Anyway, I am going to Vancouver!!!!!

**1. Leverage Deep Neural Network**

**Overall comments:**

The theoretical contribution is unclear, since the merits of this work are only assessed in simulation scenarios with unrealistic data. ??

__Reviewer 1 (This guy knows nothing about GP)__

In robotics, I think learning of demonstration is one of hot topics. In real environment, it is difficult to the quality of observation data in certain. As you mentioned, it is very important problem to deal with data which contain several quality. So, your work is beneficial proposal.

Your paper is easy to understand overall. I have three questions about experiment setting and results.

1. You trained original GP and leveraged GP using 4000 demonstrations. On the other hand, you trained leveraged DNN and leveraged DNN-small using 20,000 or 4,000 demonstrations. Why do not you train original GP and leveraged GP using 20,000 demonstrations?

If it compares the performance, it is common to unify the number of the data.

**: It is because Gaussian process regression has cubic complexity in the number of training points. (I believe I mentioned this issue in the paper -__-)**

2. You used average absolute lane deviation distance in order to evaluate experiment results. Is average absolute lane deviation distance appropriate? I think when there is a car in front of the right, the one that ran through the left of the lane is safer.

3. In your experiment, ratio of 3 driving style is constant (14:3:3). If ratio of 3 driving style is different each data set, your proposed method can learn correct driving policy? If possible, you should show those result.

**: Of course it will work.**

I think if those question are clear, your paper will become better.
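To put rough numbers on the cubic-complexity reply to question 1: exact GP regression factorizes an n x n kernel matrix, which costs on the order of n^3 operations. The sketch below is only a back-of-the-envelope estimate. The function name `exact_gp_training_cost` is hypothetical, constant factors are ignored, and it assumes a dense float64 kernel matrix with no sparse or inducing-point approximation.

```python
def exact_gp_training_cost(n: int) -> int:
    """Leading-order operation count (~n^3) for factorizing an n x n kernel matrix."""
    return n ** 3


n_gp = 4_000    # demonstrations used for the GP baselines (from the review)
n_dnn = 20_000  # demonstrations used for the leveraged DNN (from the review)

# Relative training cost if the GP baselines were trained on the larger set.
ratio = exact_gp_training_cost(n_dnn) / exact_gp_training_cost(n_gp)

# Memory for one dense float64 kernel matrix at the larger size (assumption: 8 bytes per entry).
kernel_gib = n_dnn ** 2 * 8 / 2 ** 30

print(f"relative training cost: {ratio:.0f}x")  # 125x
print(f"kernel matrix memory at n={n_dnn}: {kernel_gib:.2f} GiB")
```

Under these assumptions, moving the GP baselines from 4,000 to 20,000 demonstrations would cost roughly 125 times more training compute, which is the reason behind the smaller GP training set.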

__Reviewer 2 (This guy is helpful)__

The paper presents a new method for training DNN. While **the theoretical framework is explained very well** there are **some obscurities, which should be corrected.** In the introduction (page 1, left, bottom) you state that „common sense tells us …“ - unfortunately this is not that obvious - can you please explain this?

**: Rhetorical expression!!**

In IV.B., on page 4 bottom, right - it should be „… gamma_i, with gamma_i^2 = 1 ….“ - please verify. In V, page 5, right, top you explain the driving demonstrations. It seems that you used extremely exaggerated demonstrations. Thus the effect of the leverage parameters is evident. Nevertheless it would be interesting to see what the effect is on a rather „normal“, realistic demonstration basis, where aggressive drivers do not tend to hit other cars etc….

**: This is a rather good point! I should check this out.**

Why is the driving velocity fixed to 72 km/h? What would be different when using variable speeds?

Can you please verify the references in V.B (page 6, left, bottom).

It is unclear how many demonstrations you have. In total it should be 40000. For training the DNN you use 20000. Can you explain why and especially how you selected the data (for all methods).

Since you are developing a new method for training DNN it would be very interesting to see how it compares to a simple DNN, or what the effect of wrong leverage parameters is.

**: I believe I have already done this.**

__Reviewer 3 (Pedantic)__

Summary: The authors propose a learning from demonstration scheme that takes teacher proficiency into account and demonstrate it on a lane control task for a simulated car. It is based on and extends the authors' prior work on "leverage optimization" to use deep neural networks instead of Gaussian processes. This allows larger data sizes to be used via mini-batch learning, with resulting better results on a simulated lane change task.

Pros:

Autonomous driving is a relevant topic, as is deep learning.

Learning from demonstrations, where the teachers are of different skill levels, could be useful in general.

Cons:

While the theoretical contribution appears to be the main focus, it is mostly based on the authors' prior work, and appears pretty convoluted (details below).

The usefulness of the actual robotics problem it purports to solve is not very convincing:

- The simulated driving task seems pretty simple, is learning even needed here?

**: Are you serious?**

- The example demonstrations do not seem realistic, some are even aiming for other cars.

**Details:**

Learning from demonstrations (LfD), where the teachers have different skill levels, could be useful. However, it is unclear if this, or LfD in general, is a good approach to learn behavior for autonomous driving. The video, albeit well made, did not convince me. The simulated lane change tasks seem simple, and likely amenable to a hand coded solution.

**: Hell No**

Some realistic baseline would have been relevant, currently there are only comparisons to the authors' own LfD variants. Some drivers in the examples also appear to be suicidal, no doubt making skill discerning very important, but it is not very realistic. The weak connection to an actual robotics application is a definite minus.

This leaves the theoretical contributions of the paper, which according to the title and abstract also appears to be the main focus. Their "leveraged optimization" formulation for LfD with teachers of different skill levels is prior work [5]. One thing that bothered me is that the model choices here were not properly explained. The proposed Gaussian process has some odd quirks.

The proposed approach can intentionally include negative examples (secIII). First, this does not have an intuitive explanation for regression problems, being real numbers. Presumably it should put an observation density that is non-zero everywhere except at the observed point.

This would be non-Gaussian and would not work with a GP though. GPs are spatially correlated Gaussian point processes, and the authors chose to manipulate the covariance such that a negative example will be negatively correlated.

**: Nope, you are wrong.**

But correlation is reflected around a mean, so effectively a "negative" example of +30deg turn would give a -30deg turn, while a negative example of a +0.1deg turn would result in a -0.1deg turn? It's not clear if negative examples make mathematical sense. Second, negative examples seem like a bad idea for a driving application. E.g. the "aggressive" driver in the paper is actually intentionally aiming for cars, perhaps this is why?
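The mirroring concern in the paragraph above can be made concrete with a one-point example. This is a minimal sketch under assumptions that are not taken from the paper: a zero-mean GP, a single demonstration, and a "negative" demonstration modeled simply by flipping the sign of the cross-covariance. The function `gp_posterior_mean` and every value other than the two turn angles are hypothetical.

```python
def gp_posterior_mean(y_train: float, k_cross: float, k_train: float, noise_var: float) -> float:
    """Posterior mean at one test input of a zero-mean GP given a single training point.

    k_cross is the covariance between the test and training outputs,
    k_train is the prior variance at the training input.
    """
    return k_cross / (k_train + noise_var) * y_train


k_train = 1.0     # prior variance at the training input (assumed)
noise_var = 1e-6  # small observation noise (assumed)

for turn_deg in (30.0, 0.1):  # the two turn angles from the reviewer's example
    # Test input equal to the training input: cross-covariance is +k_train
    # for a positive demonstration and, by assumption, -k_train for a negative one.
    positive = gp_posterior_mean(turn_deg, +k_train, k_train, noise_var)
    negative = gp_posterior_mean(turn_deg, -k_train, k_train, noise_var)
    print(f"demo {turn_deg:+.1f} deg -> positive: {positive:+.3f}, negative: {negative:+.3f}")
```

With a zero prior mean, flipping the sign of the covariance flips the sign of the prediction, which is the behavior the reviewer describes. Whether the paper's kernel behaves this way in general is a separate question that this toy example does not settle.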

The transformation from GP to a DNN seems a bit heuristic, relying on several assumptions. In the we get a loss function formulation that rescales the loss based on driver proficiency. This seems reasonable, but the same quirky mirroring around zero is also included for the regression targets.

**: This is a good point indeed.**

This would have been a much better paper if it ditched the heuristic GP formulation and directly modeled a DNN with unknown proficiency parameters (could be std of observations?). Additionally, more realistic examples and baseline would be interesting. As it is, it seems needlessly convoluted and of unclear merit for the intended application.
