# Comments from ICRA 2016 (rejected GRP and accepted LevOpt)

Posted 2016. 1. 15. 14:03

I am going to Sweden!!

The paper proposes a Gaussian process regression-based path set sampling method that generates a set of paths through some pre-defined anchor points. To generate paths, the authors sample from the posterior distribution and then search through the samples to select the shortest collision-free path.

While the paper shows promise, there are a number of problems with the paper in its current form. The reviewers have identified several technical issues in the paper that should be corrected before publication. Additionally, the authors provide **no theoretical guarantees** and **do not compare** to other sampling-based approaches. Doing either of these things would greatly strengthen the authors' argument and contribute to the significance of the paper.

Reviewer 2 of ICRA 2016 submission 387

Comments to the author

======================

This paper proposes a Gaussian process regression-based path set sampling method which generates a set of paths passing through pre-defined anchoring points (waypoints). The authors also propose an \epsilon Run-Up method, which fixes the initial heading angle of sampled trajectories by adding an extra anchoring point, and a hyperparameter-learning method, which helps the sampled trajectories better suit specific dynamic models. Additionally, the proposed path set generation method is compared with a parametric curve fitting (spline) method and is applied to local path planning, target tracking, and the initial guess of a humanoid robot motion plan to show the validity of the proposed method. Overall, the paper is clearly written and effectively conveys its idea.

The following are comments on this paper.

1. **Theoretical guarantee of optimality**

The idea of the proposed method is to sample a diverse set of trackable paths, select the shortest collision-free path, and then control the robot to track the path. I agree with the idea that, with enough computational power, the proposed method can find a pretty good trajectory. However, a concern about this idea is that **it cannot guarantee** (or the authors didn't provide) any optimality of the resulting path theoretically.

To my knowledge, there are **fast sampling-based motion planning algorithms with the asymptotic optimality property in the literature** [R1]-[R5], which have been successful in many applications.

: **These algorithms are extremely slow! Let's compare against them.**

Therefore, I would suggest either providing some theoretical results about the usefulness of the proposed method or comparing the proposed method with the algorithms in [R1]-[R5] to show that it provides a good enough path at a reasonable computational cost.

2. **Dynamic feasibility of sampled paths**

In Section III-B, the authors present the hyperparameter learning method and state that, by selecting the hyperparameters that maximize the average log likelihood, the sampling method can effectively capture the robot dynamics. The simulation results in Fig. 4 show that this is the case for unicycle dynamics, but the results do not seem sufficient to show the validity of the proposed method for general dynamics. I believe more explanation about general kernels and dynamics will make the paper more understandable.

: **This seems to be an unavoidable limitation. Let's use a manipulator.**

3. **Choosing path duration t_g**

In Algorithm 1, the planner chooses the final time of the trajectory as t_g = ||x_g-x_0||/v. I'm not sure this choice is valid for a unicycle model with large maximum angular velocities or for other dynamics. This point should be addressed to avoid confusion.
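To make the reviewer's concern concrete, here is a minimal sketch of the quoted duration rule. It assumes constant speed `v` and planar points; the function names and the example coordinates are hypothetical and not taken from the paper. Because any path between two points is at least as long as the straight line, t_g is a lower bound on the travel time at speed `v`, so a path that has to detour needs more time than t_g.

```python
import math

def straight_line_duration(x0, xg, v):
    """t_g = ||x_g - x_0|| / v, the choice quoted from Algorithm 1."""
    return math.dist(x0, xg) / v

def polyline_duration(waypoints, v):
    """Travel time along a piecewise-linear path at constant speed v."""
    length = sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))
    return length / v

# Hypothetical start, goal, and detour waypoint (illustration only)
x0, xg, v = (0.0, 0.0), (3.0, 4.0), 1.0
t_g = straight_line_duration(x0, xg, v)
t_detour = polyline_duration([x0, (3.0, 0.0), xg], v)
print(t_g, t_detour)
```

In this made-up example the detour takes longer than t_g, which is the situation the reviewer asks the authors to address.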

[R1] Karaman, Sertac, and Emilio Frazzoli. "Sampling-based optimal motion planning for non-holonomic dynamical systems." Robotics and Automation (ICRA), 2013 IEEE International Conference on. IEEE, 2013.

[R2] Webb, David J., and Jan van den Berg. "Kinodynamic RRT*: Asymptotically optimal motion planning for robots with linear dynamics." Robotics and Automation (ICRA), 2013 IEEE International Conference on. IEEE, 2013.

[R3] Goretkin, Gustavo, et al. "Optimal sampling-based planning for linear-quadratic kinodynamic systems." Robotics and Automation (ICRA), 2013 IEEE International Conference on. IEEE, 2013.

[R4] Ha, Jung-Su, Ju-Jang Lee, and Han-Lim Choi. "A successive approximation-based approach for optimal kinodynamic motion planning with nonlinear differential constraints." Decision and Control (CDC), 2013 IEEE 52nd Annual Conference on. IEEE, 2013.

[R5] Schmerling, Edward, Lucas Janson, and Marco Pavone. "Optimal sampling-based motion planning under differential constraints: the driftless case." Robotics and Automation (ICRA), 2015 IEEE International Conference on. IEEE, 2015.

Reviewer 3 of ICRA 2016 submission 387

Comments to the author

======================

The authors present an approach to smooth motion generation between a sequence of anchor points whereby a Gaussian process is constructed which encodes these constraints. Multiple paths are then efficiently sampled from this process and the best one chosen. This approach is applied to nonholonomic motion planning with a unicycle model and to initial trajectory selection for trajectory optimization on the physical Nao robot to good effect. It exhibits impressive motion generation performance and the authors demonstrate that it can be readily adapted to a variety of problems. **Unfortunately, there appear to be a number of (entirely correctable) technical errors in the paper which detract from its quality**.

In Section III, the authors assert that "a Gaussian process itself cannot model a path which passes through certain anchoring points." Given that the authors literally define such a Gaussian process in the very next paragraph, I **don't think this assertion is accurate**. A Gaussian process is, by definition, a random process in which every finite subset of samples has a Gaussian distribution. Since the distribution of a Gaussian conditional on a fixed subspace is itself Gaussian, anchoring any subset of points in a Gaussian process yields another Gaussian process.

: **???**
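Reviewer 3's point about conditioning can be checked numerically. The following is a minimal sketch, not the paper's implementation: it assumes a zero-mean GP with a squared-exponential kernel, noise-free anchors, and made-up anchor times and values (`t_anchor`, `x_anchor`, and the helper names are all hypothetical). Conditioning on the anchors gives a Gaussian posterior, and paths sampled from it pass through the anchors.

```python
import numpy as np

def se_kernel(a, b, length=1.0, gain=1.0):
    """Squared-exponential kernel matrix between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return gain * np.exp(-0.5 * (d / length) ** 2)

def condition_on_anchors(t_test, t_anchor, x_anchor, length=1.0, jitter=1e-9):
    """Posterior mean and covariance of a zero-mean GP given noise-free anchors."""
    K_aa = se_kernel(t_anchor, t_anchor, length) + jitter * np.eye(len(t_anchor))
    K_ta = se_kernel(t_test, t_anchor, length)
    K_tt = se_kernel(t_test, t_test, length)
    mean = K_ta @ np.linalg.solve(K_aa, x_anchor)
    cov = K_tt - K_ta @ np.linalg.solve(K_aa, K_ta.T)
    return mean, cov

# Made-up anchors for illustration (not from the paper)
t_anchor = np.array([0.0, 1.0, 2.0])
x_anchor = np.array([0.0, 1.0, 0.5])
t_test = np.linspace(0.0, 2.0, 21)  # grid that contains the anchor times
anchor_idx = [0, 10, 20]

mean, cov = condition_on_anchors(t_test, t_anchor, x_anchor)

# Sample paths from the posterior; the eigen-decomposition tolerates the
# near-singular covariance (tiny negative eigenvalues are clipped to zero).
rng = np.random.default_rng(0)
w, V = np.linalg.eigh(cov)
L = V * np.sqrt(np.clip(w, 0.0, None))
paths = mean + (L @ rng.standard_normal((len(t_test), 5))).T

print(np.abs(paths[:, anchor_idx] - x_anchor).max())
```

At the anchor times the posterior variance collapses to roughly the jitter level, so every sampled path agrees with the anchor values up to numerical error.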

In Section III.A, the reviewer found the epsilon run-up method intuitive and interesting and suspects that the limit as epsilon approaches 0 exists and is itself a Gaussian process with a closed form.

In Section III.C, the authors state that interpolation with n cubic splines "is defined by 6n parameters and, thus, requires 6n data points." This is incorrect for a number of reasons. First, in 2D (which their statement assumes without statement), each data point provides 2 constraints. Second, when interpolating, the splines are not independent because the tangents at the endpoints of adjacent splines are constrained to be equal. Finally, when using cubic splines for interpolation, the tangents are generally not free parameters, but computed directly from the points (e.g. Catmull-Rom splines). All methods for cubic spline interpolation of which the reviewer is aware use n-1 splines to interpolate n points. It thus seems reasonable to use 13 splines to interpolate between the 14 points rather than the 5 or 20 used by the authors. Unless the reviewer has missed something, the authors seem to be presenting something of a strawman and likely unnecessarily given the excellent fit produced by their method.
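The segment count in this comment can be illustrated with a small sketch of Catmull-Rom style interpolation, written as cubic Hermite segments whose tangents come directly from the points. The 14 waypoints below are made up, the one-sided end tangents are one common convention chosen here as an assumption, and none of this is the authors' spline baseline.

```python
import numpy as np

def catmull_rom_tangents(points):
    """Tangents computed directly from the points (one-sided at the two ends)."""
    tangents = np.empty_like(points)
    tangents[1:-1] = 0.5 * (points[2:] - points[:-2])
    tangents[0] = points[1] - points[0]
    tangents[-1] = points[-1] - points[-2]
    return tangents

def eval_segment(points, tangents, i, s):
    """Evaluate cubic Hermite segment i (points i to i+1) at s in [0, 1]."""
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return (h00 * points[i] + h10 * tangents[i]
            + h01 * points[i + 1] + h11 * tangents[i + 1])

# 14 hypothetical 2-D waypoints (illustrative only, not the paper's data)
angles = np.linspace(0.0, np.pi, 14)
points = np.stack([angles, np.sin(angles)], axis=1)
tangents = catmull_rom_tangents(points)

num_segments = len(points) - 1  # n points are joined by n - 1 segments
midpoint = eval_segment(points, tangents, 0, 0.5)
print(num_segments, midpoint)
```

Each segment starts and ends exactly on consecutive waypoints, and no tangent is a free parameter, which is the reviewer's argument for n-1 segments over n points.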

In Section IV.A, the authors explain the reduced collision rate of their method over a look-ahead planner as "mainly due to the fact that whereas the GRPP considers a full path connecting the current robot position to the target position, the LAP only considers its look-ahead period." This disagrees with the authors' statement that, "Excluding the path generation step, the LAP works identical [sic] to the GRPP." Furthermore, the graphs indicate that the collision rate of LAP often increased as a longer look-ahead period was used, which seems to contradict that statement. The reviewer finds this result distinctly counter-intuitive and feels that more information about the scenario should be provided which might explain it, like the size and shape of the environment and obstacles, ideally with an accompanying figure. LAP should also be described, as the references [1-4] which the authors cite do not seem to use the same terminology.

The GPU-accelerated performance of the proposed planner is quite impressive. Given that the planner presented in this paper is effectively performing Monte Carlo optimization over paths, the authors should consider citing the work of Marin Kobilarov on "Cross-Entropy Randomized Motion Planning" in IJRR 2012 and briefly discussing the relative merits of the approaches.

: **So they bring up CE (cross-entropy)...**

The proposed method is interesting and appears to be effective, and the reviewer hopes that the authors will correct the technical aspects of the paper and clarify the experimental scenarios. However, he feels that the paper likely should not be accepted in its current state.

: **Hmm...**

The reviewer would also like to make a few other suggestions to improve the technical aspects of the paper in the hopes that the authors might find them useful.

- In Section I, the definition introduced here (and used throughout the paper) for "local planning" appears to be somewhat atypical, and alternative terminology might be clearer.

- In Section II, the authors say, "functions, i.e., infinite dimensional vectors." While the space of functions here is indeed an infinite dimensional vector space, the use of "i.e." would seem to imply that all infinite dimensional vector spaces are function spaces, which is incorrect.

- In Section III, is proposition 1 used anywhere? It lacks either explanation or proof and seems out of place.

- In Section III, the proof of Proposition 2 would be more self-contained if the relevant results of [8] were stated here rather than only referenced.

- In Section IV.A, it is probably worth mentioning how collision detection was performed given that sampled paths do not have closed forms.

- The results presented in Section V.B were quite impressive, but the reviewer would like the authors to include some discussion of how the GPR used here was chosen. In particular, were the anchor points manually selected?
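On the collision-detection suggestion in the list above: one simple option for a path that exists only as sampled points is a pointwise check against obstacles. The sketch below is an assumption about how such a check could look, not a description of what the paper does; circular obstacles, the `robot_radius` parameter, and all coordinates are hypothetical. It only tests the sampled points, so a coarse sampling can miss a collision between two consecutive points.

```python
import math

def path_collides(path, obstacles, robot_radius=0.0):
    """True if any sampled path point lies within any circular obstacle."""
    for p in path:
        for center, radius in obstacles:
            if math.dist(p, center) <= radius + robot_radius:
                return True
    return False

# Hypothetical sampled path and obstacles, given as (center, radius)
path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
blocked = path_collides(path, [((1.0, 0.5), 0.6)])
free = path_collides(path, [((1.0, 2.0), 0.5)])
print(blocked, free)
```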

Typos and Formatting

- In Section I, paragraph 1, should be "path planning was applied".

- In Section I, paragraph 1, there should not be a comma after "scenarios".

- In Section I, paragraph 3, should be "trajectories that pass".

- In Section II, why the change in notation from k(X, X) to \mathbf{K}(X, X)?

- In Section III, paragraph 7, should be "a one-dimensional".

- In Section IV.A, there seems to be a typo around "in Algorithm 1. to control". There also seems to be some unnecessary redundancy in this paragraph.

- In Section IV.A, should be "works identically to".

- In Figures 5, 6, and 7, the legend text is less than a millimeter in height, which makes it rather difficult to read.

Comments on the Video Attachment

================================

The video provides a concise overview of the method and numerous informative example applications.

Reviewer 1 of ICRA 2016 submission 412

Comments to the author

======================

This paper presents a method for learning from demonstration using leveraged Gaussian processes and sparse-constrained optimization. The proposed method was evaluated in two different experiments and a performance improvement was observed. This manuscript can be accepted for publication with minor revision. However, the following issues need to be addressed:

* In the Abstract and Section I, the authors state that demonstrations in existing approaches are provided by experts. In my understanding, however, **learning from poor-quality data sets has also been investigated**. In fact, Reference [1] gives a review on "Limitations of the demonstration dataset". The authors should present other approaches from this point of view, such as:

- Aleotti, S. Caselli, Robust trajectory learning and approximation for robot programming by demonstration, in: The Social Mechanisms of Robot Programming by Demonstration, Robotics and Autonomous Systems 54 (5) (2006) 409-13 (special issue).

- M. Kaiser, H. Friedrich, R. Dillmann, Obtaining good performance from a bad teacher, in: Programming by Demonstration vs. Learning from Examples Workshop at ML'95, 1995.

- N. Delson, H. West, Robot programming by human demonstration: Adaptation and inconsistency in constrained motion, in: Proceedings of the IEEE International Conference on Robotics and Automation, ICRA'96, 1996.

- M. Yeasin, S. Chaudhuri, Toward automatic robot programming: Learning human skill from visual data, IEEE Transactions on Systems, Man and Cybernetics - Part B: Cybernetics 30 (2000).

Moreover, the authors must show the advantages of their method over these approaches.

* In Figure 8, the starting and ending points should be provided. The authors must explain what the color strength represents. The caption says that an enlarged view of the trajectories is shown in the top-right corner, but it is actually shown in the top-left corner.

* **Some figures are blurry and hard to read.** Graph legends and scale markings must be clearer and larger. If possible, the authors should use vector graphics.

* **Future research directions should be provided.** **Further discussion would be welcome.**

Comments on the Video Attachment

================================

This video is entitled "Nonparametric Path Set Generation Using Gaussian Process Regression". The reviewer is afraid that the authors uploaded the wrong file.

Reviewer 2 of ICRA 2016 submission 412

Comments to the author

======================

This paper proposes a novel method for robust learning from demonstration using leveraged Gaussian process regression. The proposed method can learn from demonstrations of mixed quality by applying a leverage optimization algorithm to leveraged Gaussian process regression.

I think that this is a **very useful** method because it can learn from teaching data **without data cleansing**.

However, I could not understand some points in the experimental results, so I hope that the authors will add a discussion section and respond to the following questions.

I understand that the proposed method can estimate the value of the leverage parameter (gamma) by using the sparse-constrained leverage optimization algorithm. **If so, are the calculated leverage parameters either -1 or 1,** because the teaching data are constructed from inliers (true) and outliers (false) in the experiments?

: **No. It is between 0 and 1 as the leverage GP will basically ignore zero leverage data**.

In practice, teaching data extracted from experts also include noise. So, if we apply the proposed method to real problems, do the leverage parameters take real values from -1 to 1?

: **Yes, the leverage parameters have values between 0 and 1**.

In Gaussian process regression, the degree of fit to the teaching data is **varied by the hyper-parameters** of the kernel functions. If we vary the hyper-parameters of the kernel functions, do the leverage parameters vary as well? I hope that the authors explain the relationship between the hyper-parameters of the kernel functions and the leverage parameters in this paper.

: **The basic intuition behind the leverage optimization is that we treat the leverage parameters as hyperparameters of the GP and solve for them via MLE.**

I could not understand what the correct leverage detection rate in Fig. 6(c) and 7(c) is. Do the outlier detection ratio and the correct leverage detection rate mean the same thing?

: **My bad, I need to explain things more clearly**.

In Fig. 6(c), the correct leverage detection rates of leverage optimization without regularization and with l_2 regularization are almost the same. Why is the difference in reconstruction error between these algorithms so large?

: **...?**

In Fig. 6(a), the authors show that the results of the leverage optimization using the 1-norm as the constraint were better than the results with other constraints. What causes the difference in results between the 1-norm and the 2-norm?

: **My bad, I need to explain things more clearly.**

Reviewer 3 of ICRA 2016 submission 412

Comments to the author

======================

The authors propose a novel method for robust learning from demonstration using leveraged Gaussian process regression. The authors adopt a sparse-constrained leverage optimization method for leveraged Gaussian process regression. This paper presents a sparse-constrained leverage optimization algorithm using proximal linearized minimization. In all experiments, the proposed sparse-constrained method outperformed existing LfD methods. The authors conclude that the proposed sparse-constrained leverage optimization algorithm is successfully applied to sensory field reconstruction and direct policy learning for planar navigation problems.

In Section IV, LEVERAGE OPTIMIZATION (C. Derivatives), the authors mention that "Due to the l0-norm constraint, the exact optimization of (16) is demanding, in fact, it is a NP-hard problem. In the following sections, we introduce a sparse-constrained optimization method using proximal linearized minimization (PLM) [9]."

There are many types of NP-hard problems and optimization techniques. **Why did the authors choose** a sparse-constrained optimization method? What makes it suitable for (16)? Please state the reason in C. Derivatives.

: **To address the ill-posedness of the MLE for the leverage parameters, whose number equals the number of training data points.**
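As background for this exchange: a common way to sidestep an NP-hard l0 constraint is to relax it to an l1 penalty, whose proximal operator is the closed-form soft-thresholding map. The sketch below shows only that standard operator. Whether the paper's PLM step takes exactly this form is an assumption, and the input vector and threshold are made up.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||v||_1: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

v = np.array([3.0, -0.5, 1.2, -2.0])  # made-up values for illustration
shrunk = soft_threshold(v, tau=1.0)
num_zeros = int(np.count_nonzero(shrunk == 0.0))
print(shrunk, num_zeros)
```

Entries whose magnitude is below the threshold are set exactly to zero, which is how an l1 penalty produces sparse solutions without a combinatorial search.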

In Section V, EXPERIMENTS (A. Sensory Field Reconstruction), the authors mention that "The top left figure in Figure 5 indicates the reference sensory field, the original field we aim to reconstruct. The correctly measured locations (inliers) and outliers are depicted with black circles and red crosses, respectively. The top middle and top right figures show the reconstructed field using Gaussian process regression with only inliers (middle) and both inliers and outliers (right). The bottom left, middle, and right figures illustrate reconstructed fields from the leveraged Gaussian processes, where the leverages are computed with l2-regularized leverage optimization (right), l1-regularized leverage optimization (middle), and leverage optimization without regularization (left).

We can clearly see that the reconstruction performance of the proposed sparse-constrained leverage optimization outperforms the compared methods." Readers can clearly see the color map but cannot understand its meaning. For example, what is the difference between the red, blue, yellow, and green colors? What does the color saturation in Figure 5 represent? Readers are also interested in the size of each area. How large is the red, blue, yellow, or green area in the PLM field? Readers want to know the quantitative difference.

: **My bad, I should explain the experimental setting in more detail.**

In Section V, EXPERIMENTS (A. Sensory Field Reconstruction), the authors mention that "The average reconstruction errors of three leverage optimization methods at different outlier rates are depicted in Figure 6(a), where the averages are computed from 150 independent runs."

Readers would want to see at least the standard deviation. Is the difference statistically significant (e.g., by a **t-test**)?
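For reference, the kind of check the reviewer asks for can be sketched with Welch's two-sample t statistic, which does not assume equal variances. The numbers below are made up purely for illustration and are not results from the paper; the variable names are hypothetical.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    vb = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df

# Made-up reconstruction errors, NOT results from the paper
errors_l1 = [1.0, 2.0, 3.0, 4.0, 5.0]
errors_l2 = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = welch_t(errors_l1, errors_l2)
print(t_stat, dof, statistics.stdev(errors_l1))
```

Turning the statistic into a p-value needs the Student's t distribution; if SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and reports the p-value directly.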

In Section V, EXPERIMENTS (A. Sensory Field Reconstruction), the authors mention that "Even though we do not adjust the soft threshold value for each outlier rate, the leverage optimization with PLM was able to automatically increase the sparsity level as the outlier rate increases which leads to superior performance in detecting the outliers as shown in Figure 6(c)."

Why did the authors obtain these results? What is the key factor? **Please add a discussion section** and discuss these considerations.

: **Good comments. I will add discussions.**
