ぇかん オグンモゥTo discover and understand.
/
Tue, 21 May 2019 17:53:19 -0500Tue, 21 May 2019 17:53:19 -0500Jekyll v3.8.5Research, Peer Review, Writing Papers.<h4 id="daniel-liberzon-research-quotes"><a href="http://liberzon.csl.illinois.edu/quote-research.html">Daniel Liberzon Research Quotes</a></h4>
<h4 id="daniel-liberzon---how-to-write-a-good-paper"><a href="http://liberzon.csl.illinois.edu/writing-guidelines.html">Daniel Liberzon - How to write a good paper</a></h4>
<h4 id="daniel-liberzon---how-to-peer-review"><a href="http://liberzon.csl.illinois.edu/peer-review.pdf">Daniel Liberzon - How to peer review</a></h4>
Sun, 10 Mar 2019 19:06:00 -0500
/liberzon
/liberzonresearchSci-Tech: Large teams develop; Small teams disrupt.<h2 id="introduction">Introduction</h2>
<p>I came across a very dope article in <em>Nature</em> by Wu, Wang, and Evans earlier this evening. I was so enthralled by the findings that I threw the rest of my half-eaten dinner away just to digest this very interesting article. It compares the impact of the scientific output of large teams versus small teams over a sixty-year period. I do encourage you to read the <a href="https://www.nature.com/articles/s41586-019-0941-9">paper</a> yourself, but if you are too busy, here are my key takeaways from the paper.</p>
<h2 id="takeaways">Takeaways:</h2>
<p><strong>TLDR</strong></p>
<p>Small teams disrupt science and technology by exploring and amplifying promising ideas from older and less-popular work. Large teams develop recent successes, by solving acknowledged problems and refining common designs. Both small and large teams are essential to a flourishing ecology of science and technology.</p>
<p><strong>Longer version</strong></p>
<p>(1) Work by small teams is substantially more disruptive than work by large teams. Large teams may be better designed or incentivized to develop current science and technology, whereas small teams disrupt science and technology with new problems and opportunities.</p>
<p>(2) Solo authors are just as likely to produce high-impact papers (in the top 5% of citations) as teams with five members, but solo-authored papers are 72% more likely to be highly disruptive (in the top 5% of disruptive papers). By contrast, ten-person teams are 50% more likely to score a high-impact paper, yet these contributions are much more likely to develop existing ideas already prominent in the system, which is reflected in the very low likelihood they are among the most disruptive.</p>
<p>(3) High-impact papers produced by small teams are the most disruptive, and high-impact papers produced by large teams are the most developmental. Case in point: within the pool of high-impact articles and patents, small teams are more disruptive, introducing more new ideas.</p>
<p>(4) Solo authors and small teams much more often build on older, less popular ideas. Larger
teams more often target recent, high-impact work as their primary source of inspiration, and this tendency increases monotonically with team size.</p>
<p>(5) Large teams receive more of their citations rapidly, as their work is immediately relevant to more contemporaries whose ideas they develop and audiences primed to appreciate them. Conversely, smaller teams experience a much longer citation delay.</p>
<p>(6) Even though small teams receive less recognition overall owing to the rapid decay of collective attention, their successful research produces a ripple effect, which becomes an influential source of later large-team success.</p>
<p>(7) Novelty shows consistently diminishing marginal increases with team size: each new team member contributes less to novel combinations than the one before.</p>
<p>(8) Whereas larger teams facilitate broader search, small teams search deeper.</p>
Fri, 15 Feb 2019 16:10:00 -0600
/sci-tech
/sci-techcontrolstabilityoptimalityOptimality vs. Stability of Feedback Control Systems
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: { equationNumbers: { autoNumber: "AMS" } }
});
</script>
<!--Mathjax Parser -->
<script type="text/javascript" async="" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
</script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<ul>
<li><a href="#table-o-conts">Table of Contents</a></li>
<li><a href="#intro">Introduction</a></li>
<li><a href="#case">Stability vs optimality</a></li>
<li><a href="#issues">Optimality</a></li>
</ul>
<p><a name="intro"></a></p>
<h3 id="intro">Intro</h3>
<p>The question of the connection between optimality and stability is a curious thing. On the one hand, we are led to believe that if we can find an optimal control law, that can execute a plan, with the least amount of energy possible, then we are satisfied to know we have fulfilled the specifications posed in our objective function. But consider: an optimal controller is not necessarily a stable controller for a system. Why so? I lay out my case in the next section.</p>
<p><a name="case"></a></p>
<h3 id="stability-vs-optimality">Stability vs optimality</h3>
<p>Systems under the influence of optimal control laws enjoy a nice set of properties, provided that the associated cost functional enforces a constraint that is desirable on the state and control. LQ optimal control systems have nice gain and phase margins coupled with reduced sensitivity and I understand that there are similar properties that have been shown for nonlinear systems. Optimal control has the attractive property that the control effort is not wasted in mitigating the effects of nonlinearities as it chooses among a set of policies (or stabilizing control sequences) that yield a desirable effect on the system. The intractability of the HJB equation however makes optimal control as a synthesis tool for nonlinear problems a painful one.</p>
<p>Enter Lyapunov stability. Lyapunov defines classical stability in terms of the system’s behavior near an equilibrium point: for every real \(\epsilon > 0\) there exists a real number \(\delta(\epsilon, t_0)>0\) such that a motion starting within \(\delta\) of the equilibrium stays within \(\epsilon\) of it. This is essentially a local concept, a scalar bound that expresses how far away a system could ever get from the equilibrium (based on how far away it started). As engineers, we do not want to limit ourselves to this local stability context. We want every motion starting sufficiently close to the equilibrium state to converge to the equilibrium as time goes to infinity. Asymptotic stability captures this need. But again, asymptotic stability is a local concept as well, since we do not know <em>a priori</em> how close to the equilibrium a motion must start. Enter equiasymptotic stability in the large. For an \(r>0\) that is fixed and arbitrarily large, we find that as \(t \rightarrow \infty\), all motions converge to the equilibrium uniformly in the initial state from which they start for \(||x_0|| \le r \).</p>
<p>Note that all these definitions merely impose a constraint on the behavior of the states as they evolve along the trajectories of the system. That raises the question: can a control law be stable, yet not optimal (or vice versa)? I think so. Why?</p>
<p><a name="issues"></a></p>
<h3 id="optimality">Optimality</h3>
<blockquote>
<p>This section has an update based on what I found in Freeman and Kokotovic’s 1996 paper in the SIAM Journal on Control and Optimization. Please skip to the <a href="#updatedOptStab">updated part</a>.</p>
</blockquote>
<p>Optimality, as Bellman would have us think, deals with reaching the goal state with as little energy as possible. I would think that the principle of optimality and Lyapunov stability have a fundamental disconnect. It seems to me that we may find an optimal control law that is not stable (i.e. its candidate Lyapunov function \(V(x)\) does not strictly decrease along the trajectories of the solution to the dynamic system’s differential equation).</p>
<p><del>To buttress this fact, consider that the concepts of stability and optimality appeared in the consciousness of control theorists at two distinct and disconnected eras (or so to say) in history. On the one hand, Lyapunov’s thesis got published in Russia in the 1890s but his work was not available in English until 1947. Even so, western researchers did not adequately grasp its usefulness until Kalman’s 1960 seminal paper on the second method of Lyapunov. Meanwhile, Bellman’s last formal work on DP and applied DP was not published until 1962. What is more intriguing is that nowhere in Bellman’s stability tests (as far as I can tell from what I have read from his books) did he use the rigor of Lyapunov analysis to establish the stability of his principle of optimality methods. Kalman remarked in his paper in 1960 that few researchers were aware of Lyapunov methods. We can make a fairly accurate “guesstimation” that had Bellman been aware of Lyapunov’s analyses earlier, they might have crept into his optimality analyses.</del></p>
<p><del>I had an exchange with someone about this a while ago, and I am quoting the caveats they expressed in their agreement with my observation below.</del></p>
<blockquote>
<p><del>1) If optimality is concerned only with the cost from initial condition to final condition, a control law that makes the system unstable might be desirable as unstable systems tend to be very fast.</del></p>
</blockquote>
<blockquote>
<p><del>2) The problem is what happens when you reach the final condition? An unstable system will not stop there, but will overshoot the goal and go off to infinity. So you must have the ability to switch to a stabilizing controller when you reach the goal.</del></p>
</blockquote>
<blockquote>
<p><del>An example is in fighter aircraft. I understand that they become unstable during certain maneuvers such as tight turns so they can move very fast, but then “catch” themselves and stabilize before going too far from the equilibrium.</del></p>
</blockquote>
<p><a name="updatedOptStab"></a>
<strong>UPDATE [Aug 28, 2018]</strong></p>
<p>Most of the discussion below is drawn from Freeman and Kokotovic’s <sup id="fnref:Freeman_Kokotovic"><a href="#fn:Freeman_Kokotovic" class="footnote">1</a></sup> 1996 work on <em>pointwise min-norm control laws for robust control Lyapunov functions</em>.</p>
<p>They provide an optimality-based method for choosing a <em>stabilizing</em> control law once an rclf is known without resorting to cancellation or domination of nonlinear terms, which do not necessarily possess the desirable properties of optimality and may lead to poor robustness and wasted control effort.</p>
<p>The value function for a meaningful optimal stabilization problem is a Lyapunov function for the closed-loop system.</p>
<ul>
<li>
<p>Every meaningful value function is a Lyapunov function (Freeman and Kokotovic, 1996).</p>
</li>
<li>
<p>Every Lyapunov function is a meaningful value function: every Lyapunov function for every stable closed-loop system is also a value function for a meaningful optimal stabilization problem.</p>
</li>
</ul>
<p>Both bullets above are important: the first helps with the analysis of the stability of an optimal feedback control system, while the second has implications for its synthesis.</p>
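<p>To make the first bullet concrete, here is a minimal sketch on a scalar linear-quadratic problem, which is my own toy example rather than anything taken from the paper. It assumes the plant \(dx/dt = ax + bu\) and the cost \(\int (q x^2 + r u^2)\,dt\) with \(q, r > 0\); the function names and the numbers are hypothetical. The value function is \(V(x) = p x^2\), and the sketch checks that it decreases along closed-loop trajectories even though the open-loop plant is unstable.</p>

```python
import math


def scalar_lqr(a, b, q, r):
    """Positive root p of the scalar Riccati equation 2*a*p - (b*p)**2/r + q = 0."""
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    gain = b * p / r          # optimal feedback is u = -gain * x
    a_closed = a - b * gain   # closed-loop dynamics: dx/dt = a_closed * x
    return p, gain, a_closed


def value_rate(p, a_closed, x):
    """Time derivative of V(x) = p*x**2 along dx/dt = a_closed*x."""
    return 2.0 * p * a_closed * x * x


# Hypothetical open-loop unstable plant (a > 0).
p, gain, a_closed = scalar_lqr(a=1.0, b=1.0, q=3.0, r=1.0)
print(p, gain, a_closed)              # value-function weight, gain, closed-loop pole
print(value_rate(p, a_closed, 0.7))   # negative: V decreases along the trajectory
```

<p>In this toy case the rate of change of \(V\) equals minus the running cost, \(-(q + r\,\text{gain}^2)x^2\), which is why a cost that penalizes the state makes the value function a Lyapunov function.</p>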
<ul>
<li>
<p>Every robust control lyapunov function (rclf) is a meaningful upper value function</p>
<ul>
<li>Every rclf solves the Hamilton-Jacobi-Isaacs (HJI) equation associated with a meaningful game. For a known rclf, a feedback law that is optimal w.r.t. a meaningful cost functional can be constructed. In fact, this can be accomplished without solving the HJI equation for the upper value function and without constructing a cost functional, since the optimal feedback can be calculated directly from the rclf. Such control laws are called <em>pointwise min-norm</em> control laws and each one inherits the desirable properties of optimality because <em>every pointwise min-norm control law is optimal for a meaningful game</em>.</li>
</ul>
</li>
</ul>
<p>Essentially, this task is an <em>inverse optimal stabilization problem</em>. For LTI systems, the solution involves choosing a candidate value function and then constructing a meaningful cost functional in order to make the HJB equation valid. For open-loop stable nonlinear systems, one can find a solution by choosing the candidate value function as a Lyapunov function for the open-loop system. For open-loop <em>unstable</em> systems, one can choose a candidate value function as a clf for the system. In their paper <sup id="fnref:Freeman_Kokotovic:1"><a href="#fn:Freeman_Kokotovic" class="footnote">1</a></sup>, Freeman and Kokotovic actually solve the inverse optimal <em>robust</em> stabilization problem for systems with disturbances and show that every rclf is an upper value function for a meaningful differential game.</p>
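<p>The pointwise min-norm idea can be sketched on a disturbance-free scalar example, so this uses a plain clf rather than an rclf and is a simplification of the paper's setting. The plant \(dx/dt = -x^3 + u\), the clf \(V = x^2/2\), and the required decrease rate \(\sigma(x) = x^2\) are all my own hypothetical choices. At each state the control is the smallest \(|u|\) satisfying \(L_fV + L_gV\,u \le -\sigma(x)\).</p>

```python
def min_norm_control(x, sigma=lambda s: s * s):
    """Smallest |u| with LfV + LgV*u <= -sigma(x) for dx/dt = -x**3 + u, V = x**2/2."""
    lf_v = -x ** 4            # dV/dx * f(x) = x * (-x**3)
    lg_v = x                  # dV/dx * g(x) = x * 1
    psi = lf_v + sigma(x)     # how much the decrease condition is violated at u = 0
    if psi <= 0.0 or lg_v == 0.0:
        return 0.0            # the drift alone already gives enough decrease
    return -psi * lg_v / (lg_v * lg_v)


def simulate(x0, dt=1e-3, steps=10000):
    """Forward-Euler simulation of the closed loop; returns the state history."""
    x = x0
    history = [x]
    for _ in range(steps):
        x += dt * (-x ** 3 + min_norm_control(x))
        history.append(x)
    return history


print(min_norm_control(2.0))   # no effort where -x**3 is already stabilizing enough
print(min_norm_control(0.5))   # nonzero effort near the origin
print(simulate(2.0)[-1])       # state after 10 time units
```

<p>Note that the law never cancels the helpful \(-x^3\) term: for \(|x| \ge 1\) it applies no control at all, which is the "no wasted control effort" property in miniature.</p>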
<div class="footnotes">
<ol>
<li id="fn:Freeman_Kokotovic">
<p>Freeman, R. A., & Kokotovic, P. V. (1996). Inverse Optimality in Robust Stabilization. SIAM Journal on Control and Optimization, 34(4), 1365–1391. https://doi.org/10.1137/S0363012993258732 <a href="#fnref:Freeman_Kokotovic" class="reversefootnote">↩</a> <a href="#fnref:Freeman_Kokotovic:1" class="reversefootnote">↩<sup>2</sup></a></p>
</li>
</ol>
</div>
Wed, 22 Aug 2018 08:10:00 -0500
/opti-stable
/opti-stablecontrolstabilityoptimalityControl Commons
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: { equationNumbers: { autoNumber: "AMS" } }
});
</script>
<!--Mathjax Parser -->
<script type="text/javascript" async="" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
</script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<ul>
<li><a href="#table-o-conts">Table of Contents</a></li>
<li><a href="#intro">Introduction</a></li>
<li><a href="#defs">Definitions, Theorems, Lemmas etc</a></li>
<li><a href="#nlnr">Nonlinear Control Theory</a></li>
<li><a href="#stab">Stability</a></li>
</ul>
<p><a name="intro"></a></p>
<h3 id="intro">Intro</h3>
<p>Here are a few control theorems, concepts and diagrams that I think every control student should know. I keep updating this post, so please check back from time to time.</p>
<p><a name="defs"></a></p>
<h3 id="definitions-theorems-lemmas-and-such">Definitions, Theorems, Lemmas and such.</h3>
<p><a name="nlnr"></a></p>
<h4 id="nonlinear-control-theory">Nonlinear Control Theory</h4>
<ul>
<li>A differential equation of the form</li>
</ul>
<p>\begin{align}
dx/dt = f(x, u(t), t), \quad -\infty < t < +\infty
\label{eq:diff_eq}
\end{align}</p>
<p>is said to be free (or unforced) if \(u(t) \equiv 0\) for all \(t\). That is \eqref{eq:diff_eq} becomes</p>
<p>\begin{align}
dx/dt = f(x, t), \quad -\infty < t < +\infty
\label{eq:unforced}
\end{align}</p>
<ul>
<li>If the differential equation in \eqref{eq:diff_eq} does not have an explicit dependence on time, but has an implicit dependence on time, through \(u(t)\), then the system is said to be stationary. In other words, a dynamic system is <strong>stationary</strong> if</li>
</ul>
<p>\begin{align}
f(x, u(t), t) \equiv f(x, u(t))
\label{eq:stationary}
\end{align}</p>
<ul>
<li>A stationary system \eqref{eq:stationary} that is free is said to be <em>invariant under time translation</em>, i.e.</li>
</ul>
<p>\begin{align}
\Phi(t; x_0, t_0) = \Phi(t + \tau; x_0, t_0 + \tau)
\label{eq:free_stat}
\end{align}</p>
<ul>
<li>\(\Phi(t; x_0, t_0)\) is the analytical solution to \eqref{eq:diff_eq}; it is generally interpreted as the solution of \eqref{eq:diff_eq}, with fixed \(u(t)\), going through state \(x_0\) at time \(t_0\) and observed at a later time \(t\). This is a clearer way of representing the d.e.’s solution than \(x(t)\), which is popularly used in most texts nowadays.</li>
</ul>
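<p>Invariance under time translation is easy to check numerically on closed-form solutions. The sketch below uses two scalar examples of my own choosing: the stationary free system \(dx/dt = -x\), and the non-stationary system \(dx/dt = -tx\), for which the property fails. The function names are hypothetical.</p>

```python
import math


def phi_stationary(t, x0, t0):
    """Solution of dx/dt = -x going through x0 at time t0."""
    return x0 * math.exp(-(t - t0))


def phi_time_varying(t, x0, t0):
    """Solution of dx/dt = -t*x going through x0 at time t0."""
    return x0 * math.exp(-(t * t - t0 * t0) / 2.0)


def is_translation_invariant(phi, t, x0, t0, tau, tol=1e-12):
    """Check Phi(t; x0, t0) == Phi(t + tau; x0, t0 + tau) at one sample point."""
    return abs(phi(t, x0, t0) - phi(t + tau, x0, t0 + tau)) < tol


print(is_translation_invariant(phi_stationary, t=1.0, x0=1.0, t0=0.0, tau=1.0))
print(is_translation_invariant(phi_time_varying, t=1.0, x0=1.0, t0=0.0, tau=1.0))
```

<p>A single sample point can only refute the property, not prove it; for the stationary example the equality holds for every \(\tau\) because the solution depends on \(t - t_0\) alone.</p>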
<ul>
<li>
<p>\(\Phi(\cdot)\) is generally referred to as the transition function, since it describes the transformation from \(x(t_0)\) to \(x(t)\).</p>
</li>
<li>
<p>For a physical system, \(\Phi\) has to be <em>continuous in all of its arguments</em>.</p>
</li>
<li>
<p>If the rate of change \(dE(x)/dt\) of the energy \(E(x)\) of an isolated physical system is negative for every possible state \(x\), except for a single equilibrium state \(x_e\), then the energy will continually decrease until it finally assumes its minimum value \(E(x_e)\).</p>
</li>
<li>The <strong>first method of Lyapunov</strong> deals with questions of stability using an explicit representation of the solutions of a differential equation
<ul>
<li>Note that the <strong>second method</strong> is more of a historical misnomer, perhaps more accurately described as a philosophical point of view rather than a systematic method. Successful application requires the user’s ingenuity.</li>
</ul>
</li>
<li>
<p>Contrary to popular belief, the energy of a system and a Lyapunov function are not the same thing. Why? Because <strong>the Lyapunov function, \(V(x)\), is not unique</strong>. To quote Kalman, “a system whose energy \(E\) decreases <em>on the average</em>, but not necessarily at each instant, is stable but \(E\) is not necessarily a Lyapunov function.”</p>
</li>
<li>
<p><strong>Lyapunov analysis and optimization</strong>: Suppose a performance index is defined to be the error criterion between a measured and an estimated signal; suppose further that this criterion is integrated w.r.t time, then the performance index is actually a Lyapunov function – provided that the error is not identically zero along any trajectory of the system.</p>
</li>
<li>
<p><strong>Existence, uniqueness, and continuity theorem</strong>:</p>
<p>Let \(f(x, t)\) be continuous in \(x,t\), and satisfy a Lipschitz condition in some region \(R(x_0, t_0)\) about any state \(x_0\) and time \(t_0\):</p>
<p>\begin{align}
R(x_0, t_0) = \{ (x, t) : \; ||x - x_0|| \le b(x_0), \;\; |t - t_0| \le c(t_0) \}, \quad b, c > 0.
\end{align}</p>
</li>
</ul>
<p>The Lipschitz condition requires that, for all \((x,t), (y,t)\) \(\in\) \(R(x_0, t_0)\),
\begin{align}
||f(x,t) - f(y,t)|| \le k \, ||x-y|| \nonumber
\end{align}</p>
<p>where \(k>0\) depends on \(b, c\). Then:</p>
<ul>
<li>
<p>there exists a unique solution \(\Phi(t; x_0, t_0)\) of \(dx/dt = f(x, t)\) that starts at \(x_0\) at time \(t_0\), for all \(|t - t_0| \le a(t_0)\),</p>
</li>
<li>
<p>\(a(t_0) \ge \min \{ c(t_0), \, b(x_0)/M(x_0, t_0) \}\), where \(M(x_0, t_0)\) is the maximum assumed by the norm of the continuous function \(f(x,t)\) in the closed, bounded set \(R(x_0, t_0)\),</p>
</li>
<li>
<p>in some small neighborhood of \(x_0, t_0\), the solution is continuous in its arguments</p>
</li>
</ul>
<p>Observe that the Lipschitz condition only implies continuity of \(f\) in \(x\), not necessarily in \(t\); it is itself implied by bounded derivatives of \(f\) with respect to \(x\). Note that the local Lipschitz condition required by the theorem only guarantees the desired properties of a solution near \(x_0, t_0\).</p>
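<p>As a rough illustration of how local the theorem's guarantee is, the sketch below evaluates the lower bound \(\min(c, b/M)\) on the existence interval for the scalar example \(dx/dt = x^2\), \(x_0 = 1\), \(t_0 = 0\). The example and the function names are my own assumptions; the exact solution \(x_0/(1 - x_0 t)\) is used only for comparison.</p>

```python
def guaranteed_interval(x0, b, c):
    """Lower bound min(c, b/M) on the existence interval for dx/dt = x**2."""
    m = (abs(x0) + b) ** 2    # maximum of |x**2| over the region |x - x0| <= b
    return min(c, b / m)


def exact_solution(t, x0):
    """Closed-form solution x0 / (1 - x0*t) with t0 = 0, valid while x0*t < 1."""
    return x0 / (1.0 - x0 * t)


# Scan the region radius b (c is taken large so it is not the binding term).
candidates = [0.01 * k for k in range(1, 500)]
best_b = max(candidates, key=lambda b: guaranteed_interval(1.0, b, c=10.0))
best_a = guaranteed_interval(1.0, best_b, c=10.0)
print(best_b, best_a)                  # best radius and the interval it guarantees
print(exact_solution(best_a, 1.0))     # the solution is still inside the region
```

<p>The best guaranteed interval here is a quarter of the true one: the exact solution exists until \(t = 1\), which shows the bound is sufficient but conservative.</p>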
<p>The possibility of a <em>finite escape time</em> does not allow us to draw conclusions about arbitrarily large values of \(t\). The phrase “<strong>finite escape time</strong>” describes a trajectory that escapes to infinity at a finite time, that is, the solution leaves any compact set within a finite time. <strong>In order that a differential equation accurately represent a physical system, the possibility of finite escape time has to be ruled out by an explicit assumption to the contrary.</strong> If the Lipschitz condition holds for \(f\) everywhere, then there can be no finite escape time. The proof follows by integrating both sides of \eqref{eq:unforced} and using</p>
<p>\begin{align}
||\Phi(t; x_0, t_0)|| \le ||x_0|| + \left|\left| \int_{t_0}^{t}f(\Phi(\tau; x_0, t_0), \tau)\,d\tau \right|\right| \le ||x_0|| + k \int_{t_0}^{t} ||\Phi(\tau; x_0, t_0)|| \,d\tau
\end{align}</p>
<p>where the last step takes the origin to be an equilibrium, \(f(0, t) \equiv 0\), and uses the Lipschitz condition with \(y = 0\),</p>
<p>\begin{align}
||f(x,t) - f(y,t)|| \le k \, ||x-y||. \nonumber
\end{align}</p>
<p>By the Gronwall-Bellman lemma,</p>
<p>\begin{align}
||\Phi(t; x_0, t_0) || \le [\exp \, k (t - t_0)] ||x_0 || \nonumber
\end{align}</p>
<p>which is less than \(\infty \) for any finite \((t - t_0)\).</p>
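<p>The contrast between the two cases can be seen numerically. The sketch below is my own illustration with hypothetical function names: it integrates \(dx/dt = \sin x\) (globally Lipschitz with \(k = 1\) and \(f(0) = 0\)) and compares the result with the Gronwall-Bellman bound, then integrates \(dx/dt = x^2\), which is not globally Lipschitz and whose solution \(x_0/(1 - x_0 t)\) escapes at \(t = 1/x_0\).</p>

```python
import math


def rk4(f, x0, t_end, steps):
    """Integrate the scalar equation dx/dt = f(x) from t = 0 with classical RK4."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


def gronwall_bound(x0, k, elapsed):
    """Right-hand side exp(k*(t - t0)) * |x0| of the Gronwall-Bellman estimate."""
    return math.exp(k * elapsed) * abs(x0)


# Globally Lipschitz case: the state stays under the exponential bound.
x_lipschitz = rk4(math.sin, x0=1.0, t_end=5.0, steps=5000)
print(x_lipschitz, gronwall_bound(1.0, 1.0, 5.0))

# Not globally Lipschitz: the state blows up as t approaches 1/x0 = 1.
x_near_escape = rk4(lambda x: x * x, x0=1.0, t_end=0.9, steps=9000)
print(x_near_escape)
```

<p>The bound is loose for \(\sin x\), whose trajectory simply settles near \(\pi\), but it is enough to exclude escape in finite time; no such bound exists for \(x^2\).</p>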
<p><a name="stab"></a></p>
<h3 id="stability">Stability</h3>
<p>My definitions follow from R.E Kalman’s 1960 seminal paper since they are clearer to understand compared to the myriad of definitions that exist in many texts today. <strong>Stability concerns the deviation about some fixed motion</strong>. So, we will be considering the deviations from the equilibrium state \(x_e\) of a free dynamic system.</p>
<p>Simply put, here is how Kalman defines stability: if the free system \eqref{eq:unforced} is slightly perturbed from its equilibrium state at the origin, all subsequent motions remain in a correspondingly small neighborhood of the origin. Harmonic oscillators are a good example of this kind of stability. <strong>Lyapunov</strong> himself defines stability like so:</p>
<ul>
<li>
<p>An equilibrium state \(x_e\) of a free dynamic system is <em>stable</em> if for every real number \(\epsilon>0\), there exists a real number \(\delta(\epsilon, t_0)>0\) such that \(||x_0 - x_e|| \le \delta\) implies</p>
</li>
</ul>
<p>\begin{align}
||\Phi(t; x_0, t_0) - x_e|| \le \epsilon \quad \forall \quad t \ge t_0 \nonumber
\end{align}</p>
<p>This is best imagined from the figure below:</p>
<div class="fig figcenter fighighlight">
<img src="/assets/control/stability.png" width="100%" height="450" align="middle" />
<div class="figcaption" align="middle">Fig. 1. The basic concept of stability. Courtesy of R.E. Kalman
</div>
</div>
<p>Put differently, the system trajectory can be kept arbitrarily close to the origin/equilibrium if we start the trajectory sufficiently close to it. If there is stability at some initial time, \(t_0\), there is stability for any other initial time \(t_1\), provided that all motions are continuous in the initial state.</p>
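<p>The harmonic oscillator mentioned above makes the \(\epsilon\)-\(\delta\) definition tangible. In the sketch below, which uses my own hypothetical function names, the system is \(dx_1/dt = x_2\), \(dx_2/dt = -x_1\) with \(x_e = 0\). Its motion is a pure rotation, so choosing \(\delta = \epsilon\) works, yet the motion never converges to the origin.</p>

```python
import math


def oscillator_state(t, x0, t0=0.0):
    """Exact solution of dx1/dt = x2, dx2/dt = -x1 going through x0 at time t0."""
    c, s = math.cos(t - t0), math.sin(t - t0)
    return (c * x0[0] + s * x0[1], -s * x0[0] + c * x0[1])


def norm(x):
    """Euclidean norm of a 2-vector."""
    return math.hypot(x[0], x[1])


def delta_for(epsilon):
    """The flow preserves the norm, so delta = epsilon satisfies the definition."""
    return epsilon


epsilon = 0.1
x0 = (0.06, -0.08)                       # norm 0.1, i.e. exactly delta_for(epsilon)
distances = [norm(oscillator_state(0.1 * k, x0)) for k in range(1000)]
print(max(distances) <= epsilon + 1e-12)  # stays inside the epsilon-ball
print(min(distances))                     # but never gets closer to the origin
```

<p>This is stability without asymptotic stability: the bound holds for all time, but the distance to the equilibrium does not shrink.</p>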
<ul>
<li>Asymptotic stability: The requirement that we start sufficiently close to the origin and merely stay in the neighborhood of the origin is a rather limiting one in most practical engineering applications. We would want our motion to return to equilibrium after any small perturbation. Thus, the classical definition of asymptotic stability is
<ul>
<li>an equilibrium state \(x_e\) of a free dynamic system is <em>asymptotically stable</em> if
<ul>
<li>it is stable and</li>
<li>every motion starting sufficiently near \(x_e\) converges to \(x_e\) as \(t \rightarrow \infty\).</li>
</ul>
</li>
<li>put differently, there is some real constant \(r(t_0)>0\) and to every real number
\(\mu > 0\) there corresponds a real number \(T(\mu, x_0, t_0)\) such that \(||x_0 - x_e|| \le r(t_0)\) implies</li>
</ul>
<p>\begin{align}
||\Phi(t; x_0, t_0) - x_e|| \le \mu \quad \forall \quad t \ge t_0 + T \nonumber
\end{align}</p>
<div class="fig figcenter fighighlight">
<img src="/assets/control/asymptotic_stability.png" width="80%" height="350" align="middle" />
<div class="figcaption" align="left">Fig. 2. Definition of asymptotic stability. Courtesy of R.E. Kalman
</div>
</div>
</li>
</ul>
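<p>For a feel of the time \(T(\mu, x_0, t_0)\) in the definition, here is a minimal sketch on the scalar system \(dx/dt = -x\) with \(x_e = 0\). The example is my own and the function names are hypothetical. Because the solution is \(x_0 e^{-(t - t_0)}\), the smallest admissible \(T\) has the closed form \(\ln(|x_0|/\mu)\).</p>

```python
import math


def settling_time(mu, x0):
    """Smallest T with |x0| * exp(-T) <= mu for dx/dt = -x (equilibrium at 0)."""
    if abs(x0) <= mu:
        return 0.0
    return math.log(abs(x0) / mu)


def phi(t, x0, t0=0.0):
    """Solution of dx/dt = -x going through x0 at time t0."""
    return x0 * math.exp(-(t - t0))


T = settling_time(mu=0.01, x0=2.0)
print(T)                    # time needed to enter the mu-ball
print(abs(phi(T, 2.0)))     # on the boundary of the mu-ball at t = t0 + T
print(abs(phi(T + 3.0, 2.0)))  # and inside it afterwards
```

<p>For this system \(T\) does not depend on \(t_0\), and it grows only logarithmically with the size of the initial state.</p>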
<p>Asymptotic stability is also a local concept since we do not know in advance how small \(r(t_0)\) should be. For motions starting at the same distance from \(x_e\), none will remain at a larger distance than \(\mu\) from \(x_e\) at arbitrarily large values of time. Or to use Massera’s definition:</p>
<ul>
<li>An equilibrium state \(x_e\) of a free dynamic system is <em>equiasymptotically stable</em> if
<ul>
<li>it is stable</li>
<li>
<p>every motion starting sufficiently near \(x_e\) converges to \(x_e\) as \(t \rightarrow \infty\), uniformly in \(x_0\)</p>
</li>
<li>
<p>Interrelations between stability concepts: This I gleaned from Kalman’s 1960 paper on the second method of Lyapunov.</p>
<div class="fig figcenter fighighlight">
<img src="/assets/control/control_concepts.png" width="100%" height="450" align="middle" />
<div class="figcaption" align="middle">Fig. 3. Interrelations between stability concepts. Courtesy of R.E. Kalman
</div>
</div>
</li>
</ul>
</li>
<li>For <em>linear systems</em>, stability is independent of the distance of the initial state from \(x_e\). Nicely defined as such:
<ul>
<li>an equilibrium state \(x_e\) of a free dynamic system is <em>asymptotically (equiasymptotically) stable in the large</em> if
(i) it is stable</li>
</ul>
<p>(ii) every motion converges to \(x_e\) as \(t \rightarrow \infty \); in the equiasymptotic case, every motion converges to \(x_e\) uniformly in \(x_0\) for \(||x_0|| \le r\), where \(r\) is fixed but arbitrarily large.</p>
</li>
</ul>
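<p>The claim that, for linear systems, stability does not depend on how far the initial state is from \(x_e\) follows from the fact that the flow scales with the initial state. The sketch below checks this numerically for a hypothetical stable system \(dx/dt = Ax\) with \(A = \begin{pmatrix} 0 & 1 \\ -2 & -3 \end{pmatrix}\) (eigenvalues \(-1\) and \(-2\)); the matrix, step size, and function names are my own assumptions.</p>

```python
def step(x, h):
    """One RK4 step of dx/dt = A x with A = [[0, 1], [-2, -3]]."""
    def f(v):
        return (v[1], -2.0 * v[0] - 3.0 * v[1])

    def shifted(v, w, s):
        return (v[0] + s * w[0], v[1] + s * w[1])

    k1 = f(x)
    k2 = f(shifted(x, k1, 0.5 * h))
    k3 = f(shifted(x, k2, 0.5 * h))
    k4 = f(shifted(x, k3, h))
    return (x[0] + h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
            x[1] + h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0)


def flow(x0, t_end=10.0, steps=10000):
    """State reached at t_end when starting from x0 at time 0."""
    x = x0
    for _ in range(steps):
        x = step(x, t_end / steps)
    return x


small = flow((1e-3, 0.0))   # start very close to the equilibrium
large = flow((1e3, 0.0))    # start a million times farther away
print(small, large)
print(large[0] / small[0])  # the two motions differ only by the scale factor
```

<p>Both motions decay at the same rate, so a motion started far away converges exactly as a nearby one does, only scaled.</p>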
<p><strong>To be Continued</strong></p>
Sat, 04 Aug 2018 08:10:00 -0500
/control-commons
/control-commonscontrolstabilitynonlinear-controlWhat Good Research is Not<p>This post includes curated links to advice on doing good research from people I respect.</p>
<p>I hope you find time to enjoy reading them.</p>
<ul>
<li>
<p><a href="https://www.cc.gatech.edu/~parikh/citizenofcvpr/static/slides/freeman_how_to_write_papers.pdf">How to write a good research paper ~ Bill Freeman</a></p>
<blockquote>
<p>Many readers will skim over formulas on their first reading of your exposition. Therefore, your sentences should flow smoothly when all but the simplest formulas are replaced by “blah” or some other grunting noise.</p>
</blockquote>
</li>
<li>
<p><a href="http://people.csail.mit.edu/billf/publications/How_To_Do_Research.pdf">How to do good research ~ Bill Freeman</a></p>
<blockquote>
<p>Sometimes it’s useful to think that everyone else is an idiot. This lets you do things that no one else is doing. It’s best not to be too vocal about that. You can say something like “Oh, I just thought I’d try out this direction”</p>
</blockquote>
</li>
<li>
<p><a href="http://people.csail.mit.edu/billf/talks/10minFreeman2013.pdf">Elements of a Successful Graduate Career</a></p>
<blockquote>
<p>I think the most important thing in research is a story – not a theorem or an algorithm – but the story that makes the theorem or algorithm interesting and exciting. It’s important to have an “ear” for a good story… when do the stories make sense, when are they bogus? ~ Tomas Lozano-Perez.</p>
</blockquote>
<blockquote>
<p>The best students are possessed by a problem. They’re independent. They teach their advisors. They don’t do what they’re told…they do something more interesting. ~ Leslie Kaelbling</p>
</blockquote>
<blockquote>
<p>Don’t tell your advisor you’re doing what they advised against until you’ve solved the problem. ~ Manolis Kellis</p>
</blockquote>
<blockquote>
<p>Which brings us to the moral of the story: More important than your thesis topic is who your advisor is. ~ Charles Leiserson</p>
</blockquote>
<blockquote>
<p>Eat, sleep, and breathe a problem until you crack it. Become the world’s foremost expert on your thesis topic. Surpass your advisor. ~ Daniel Jackson</p>
</blockquote>
</li>
<li>
<p><a href="http://www.ai.mit.edu/courses/6.899/papers/ted.htm">On how to write papers, Ted Adelson</a></p>
<blockquote>
<p>Start by stating which problem you are addressing, keeping the audience in mind. They must care about it, which means that sometimes you must tell them why they should care about the problem. Then state briefly what the other solutions are to the problem, and why they aren’t satisfactory. If they were satisfactory, you wouldn’t need to do the work. Then explain your own solution, compare it with other solutions, and say why it’s better. At the end, talk about related work where similar techniques and experiments have been used, but applied to a different problem. Since I developed this formula, it seems that all the papers I’ve written have been accepted.</p>
</blockquote>
</li>
</ul>
<h3 id="vladen-koltuns-advice"><a href="https://www.cc.gatech.edu/~parikh/citizenofcvpr/static/slides/koltun_doing_good_research.pdf">Vladlen Koltun’s Advice</a></h3>
<ul>
<li>Picking a problem
<ul>
<li>Formulate a larger goal</li>
<li>Personally meaningful</li>
<li>Fits into the scheme of collective progress</li>
</ul>
</li>
<li>Analyze bottlenecks</li>
<li>
<p>Understand the state of the art</p>
</li>
<li>Making a contribution:
<ul>
<li>Read the papers</li>
<li>Look for unwarranted assumptions</li>
<li>What are the limitations? When will this break? How could this be done better?</li>
</ul>
</li>
<li>Reimplement a state-of-the-art technique
<ul>
<li>Reproduce the results</li>
<li>Then bombard it with controlled experiments</li>
<li>Look for surprises, cracks that lead to deeper realizations</li>
</ul>
</li>
<li>Be on the lookout for interesting contributions</li>
<li>Many important findings are not what the researchers set out to find
<ul>
<li>“Scheele happened upon chlorine while trying to isolate manganese; Claude Bernard planned experiments to characterize the destructive agent in sugar but instead discovered the glycogenic function of the liver; and so on”
~ Ramón y Cajal, Letters to a Young Investigator</li>
</ul>
</li>
<li>Publications
<ul>
<li>Quality, not quantity</li>
<li>Do not compromise on methodology or ethics</li>
<li>Be willing to bury drafts and move on</li>
</ul>
</li>
<li>Publication portfolio
<ul>
<li>Most are prosaic</li>
<li>Some are significant</li>
<li>None are sloppy</li>
</ul>
</li>
<li>High standards
<ul>
<li>Bury the weak, boring, and sloppy results</li>
<li>Weak and sloppy work is a drain on the community. Can mislead. Goes against the goal of contributing something useful to the community</li>
<li>Quantity is easy. The community doesn’t need more quantity.</li>
</ul>
</li>
<li>Research over time
<ul>
<li>Research begets research</li>
<li>Keep track of favorite problems, revisit occasionally</li>
<li>Go back to the larger goals</li>
<li>Read. A lot.</li>
<li>Write down ideas. Talk to people.</li>
<li>Quiet time for reading, writing, thinking.</li>
</ul>
</li>
<li>For Pete’s sake, get a good work ethic
<ul>
<li>
<p>I do not believe a person can ever leave their business. They ought to think of it by day and dream of it by night. […] if they intend to go forward and do anything, the whistle is only a signal to start thinking over the day’s work in order to discover how it might be done better. […] The person who has the largest capacity for work and thought is the person who is bound to succeed.
~ Henry Ford, My Life and Work</p>
</li>
<li>
<p>In science as in the lottery, luck favors those who wager the most – that is, by another analogy, those who are tilling constantly the ground in their garden.
~ Ramón y Cajal, Letters to a Young Investigator</p>
</li>
<li>
<p>Successful people exhibit more activity, more energy, than most people do. They look more places, they work harder, they think longer than less successful people. Knowledge and ability are much like compound interest – the more you do the more you can do, and the more opportunities are open for you.
~ Hamming, Striving for Greatness in All You Do</p>
</li>
</ul>
</li>
</ul>
<h3 id="david-mermin"><a href="http://www.ai.mit.edu/courses/6.899/papers/mermin.pdf">David Mermin</a></h3>
<p>Always punctuate your equations. Math is prose. Number all equations in your text. It helps your readers.</p>
Thu, 02 Aug 2018 09:21:00 -0500
/good-research
/good-researchresearchgood-researchNeural Networks and Adaptive Control
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: { equationNumbers: { autoNumber: "AMS" } }
});
</script>
<!--Mathjax Parser -->
<script type="text/javascript" async="" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
</script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<ul>
<li>
<p><a href="#table-contents">Table of Contents</a></p>
</li>
<li>
<p><a href="#intro">Intro</a></p>
</li>
<li>
<p><a href="#history">History Teaser</a></p>
</li>
<li>
<p><a href="#adaptive-systems">Adaptive Systems</a></p>
<ul>
<li>
<p><a href="#nonlinear">Nonlinear neuro-control</a></p>
</li>
<li>
<p><a href="#brouhaha">Adaptive Neuro-Control: The Reconstruction Error Brouhaha</a></p>
</li>
<li>
<p><a href="#rec-example">Reconstruction Error Python Example</a></p>
</li>
</ul>
</li>
</ul>
<p><a name="table-contents"></a></p>
<h3 id="intro">Intro</h3>
<p>It’s been more than a year since my last post. I am sorry. I blame the publish-or-perish academic mantra. Now that I am toward the end of my degree, I shall try to keep up to date by elucidating topics that capture my imagination right here. Before we dive into the proper topic, allow me to delve a little into the history of how adaptive control theory developed into a field of study.</p>
<p><a name="history"></a></p>
<h4 id="history-teaser">History Teaser</h4>
<p>Adaptive control research was motivated in the ’50s by the problem of designing autopilots whose parameters changed over a wide operating range of speeds and altitudes. Classical fixed-gain controllers could not cope with the frequent parameter variations in such systems. Therefore people developed gain scheduling techniques that used auxiliary measurements of airspeed in controlling aircraft. With gain scheduling came basic methods for adjusting the adaptation mechanism in model reference systems – the idea was to develop a self-tuning controller that adapted for parameter variations in a closed-loop reference model scheme. Adjustment mechanisms developed included sensitivity rules such as the M.I.T. rule, which performed reasonably well under some conditions. Rudolf Kalman in 1958 rigorously analyzed the self-tuning controller and established the explicit identification of the controller parameters of a linear SISO (Single-Input, Single-Output) plant so that these could be used to tune an optimal linear quadratic (LQ) controller. In the ’60s, Parks [1966] demonstrated the use of Lyapunov analysis in establishing the stability and convergence of adaptive systems. Advances in system identification enhanced the way update laws were determined for model reference schemes. Stochastic control and dynamic programming, coupled with Lyapunov stability laws, placed convergence proofs for adaptive control systems on a firm footing. The ’70s witnessed a resurgence in the complete proofs of stability for model reference adaptive schemes, e.g. Lyapunov state space proofs from Narendra, Lin and Valavani, and Morse. In the discrete-time deterministic and stochastic domains, stability proofs also appeared about this time. Then came Rohrs’ example in the ’80s, where the stability guarantees were found to be very sensitive to the presence of unmodeled dynamics (e.g. high-frequency parasitic modes ignored in order not to complicate controller design).
Researchers started working on the robustness of adaptive schemes and their sensitivity to transient behaviors. The extension of adaptive control to linear time-varying parameters was a major obstacle until the ’80s when basic robustness questions were answered. Tactics such as dead-zone modification and a dynamic normalizing signal, together with leakage or parameter projection, were used to deal with a wide class of parameter variations. This class included slowly-varying parameters as well as infrequent jumps in parameter values. In several cases, the error from time-varying signals was reduced through proper parameterizations of the time-varying plant model used in the control design.</p>
<p><a name="adaptive-systems"></a></p>
<h4 id="on-adaptive-systems">On adaptive systems</h4>
<p>If we relax the restrictive assumptions that govern the implementation of adaptive control on physical systems, adaptive control can deal with any size of parametric uncertainty, as well as the dynamic uncertainties that arise from neglected dynamics if correct robust algorithms are used. Most of the stability results on adaptive systems that appeared in the ’80s dealt with cases where no modeling errors were present – a very restrictive assumption <sup id="fnref:Ioannou1"><a href="#fn:Ioannou1" class="footnote">1</a></sup> <sup>&</sup> <sup id="fnref:Ioannou2"><a href="#fn:Ioannou2" class="footnote">2</a></sup>. While LTI methods can be used in understanding the dynamics of robust modification laws for adaptive systems (e.g. a dynamic normalizing signal that limits the rate of adaptation to be finite and small relative to the level of dynamic uncertainty), adaptive controllers designed for LTI plants give rise to closed-loop systems that are nonlinear <sup id="fnref:Sastry"><a href="#fn:Sastry" class="footnote">3</a></sup>. Thus, traditional tools for analyzing stability such as poles, zeros, gain and phase margins make little sense for analyzing such nonlinear systems. Preventing the estimated controller parameters from assuming large values eliminates the possibility of high-gain control <sup id="fnref:Sastry:1"><a href="#fn:Sastry" class="footnote">3</a></sup>, since high-gain or high-speed control can cause instability owing to the high bandwidth that the controller gets subjected to. Therefore, people focused on the development of robust adaptive control systems, where closed-loop stability properties were guaranteed not just in the presence of large parametric uncertainty, but also in the presence of modeling errors that involved <strong>additive disturbances</strong> and <strong>unmodeled dynamics</strong>. Even then, these methods made assumptions about the nature of the uncertainties in such systems by assuming the bound on the uncertainty was known ahead of time.
However, the bounds on the allowable dynamic uncertainties cannot be calculated as easily as in the nonadaptive case because of the nonlinear nature of the adaptive system coupled with the fact that the plant parameters are deemed unknown.</p>
<p>Techniques such as backstepping and parameter-tuning functions appeared in literature in the ’90s for Lyapunov stability and estimation schemes (mostly from Prof. Kokotovic’s group, <sup id="fnref:Kokotovic1"><a href="#fn:Kokotovic1" class="footnote">4</a></sup> <sup>&</sup> <sup id="fnref:Kokotovic2"><a href="#fn:Kokotovic2" class="footnote">5</a></sup>) and they proved to be quite good control design strategies. However, these studies assumed nonlinearities that were known ahead of time – assumptions that make adaptive control very difficult to implement in the real world. Nonlinear techniques based on Lyapunov analysis and passivity arguments plus linear systems theory were used in establishing the stability/robustness margins that are not so easy to compute as in the LTI case.</p>
<blockquote>
<p>Techniques such as backstepping and parameter-tuning functions appeared in literature in the ’90s (mostly from Prof. Kokotovic’s group, <sup id="fnref:Kokotovic1:1"><a href="#fn:Kokotovic1" class="footnote">4</a></sup> <sup>&</sup> <sup id="fnref:Kokotovic2:1"><a href="#fn:Kokotovic2" class="footnote">5</a></sup>) for Lyapunov stability and estimation schemes and they proved to be quite good control design strategies.</p>
</blockquote>
<p>In the linear time-varying case, stability margins, bandwidth, frequency-domain characteristics, poles, and zeros do not make much sense unless approximations are made using the assumption of slowly varying parameters, etc. (see <sup id="fnref:Bellman"><a href="#fn:Bellman" class="footnote">6</a></sup>’s Applied Dynamic Programming book, esp. the chapter on numerical approximations and why calculus of variations is not sufficient for real-world problems).</p>
<p><a name="nonlinear"></a></p>
<h3 id="nonlinear-neuro-control">Nonlinear neuro-control</h3>
<p>In nonlinear systems, it is not only the parameters that are nonlinear (e.g. simple Riemann integral functionals), but also the functions that enter through the arguments of the right hand side of an ODE (<em>the so-called problem of Bolza<sup id="fnref:1"><a href="#fn:1" class="footnote">7</a></sup> or the problem of Mayer <sup id="fnref:2"><a href="#fn:2" class="footnote">8</a></sup>, which are both special cases of the Riemann-Stieltjes integral <sup id="fnref:3"><a href="#fn:3" class="footnote">9</a></sup>, readily come to mind</em>). Adaptive control was designed to stabilize systems by adapting for nonlinear <em>parameters</em> and <strong>NOT</strong> nonlinear functions. The extension of adaptive controllers from LTI and LTV systems to nonlinear systems is therefore a complicated one. There are two general cases of applying adaptive control to nonlinear systems:</p>
<ul>
<li>nonlinear systems whose nonlinear functions are known but unknown parameters appear linearly.
<ul>
<li>easy: check! Techniques from feedback linearization, backstepping and such are good for such approaches</li>
</ul>
</li>
<li>nonlinear systems whose nonlinear functions are unknown and are approximated by multiplying nonlinear basis functions with unknown parameters to be determined.
<ul>
<li>welcome to control theory!</li>
</ul>
</li>
</ul>
<p>This second option covers cases where the unknown nonlinearity is represented by a <strong>function approximator</strong> whose parameters (or weights, as they are called these days) are assumed to enter the nonlinear system linearly. This linear-in-the-parameters property is fundamental for developing analytical stability results with large regions of attraction.</p>
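<p>As a small illustration of the distinction (the two scalar systems below are made up for this example and do not come from any of the cited papers), compare</p>
<p>\begin{align}
\dot{x} &= \theta_1 x^2 + \theta_2 \sin(x) + u = \theta^T \zeta(x) + u, \qquad \zeta(x) = [x^2, \, \sin(x)]^T \nonumber \newline
\dot{x} &= \sin(\theta x) + u. \nonumber
\end{align}</p>
<p>In the first system, the unknown \(\theta\) multiplies known basis functions, so the cross term it contributes to the derivative of a quadratic Lyapunov function is linear in the parameter error and can be cancelled by a suitable update law. In the second, \(\theta\) sits inside the nonlinearity, and no such cancellation is available without further approximation.</p>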
<p>However, most nonlinear systems do not have such a linear-in-the-parameters structure. Therefore approximation techniques such as these simplified ones call for a greater application of the mind. Over the last several years, neural networks have developed as an approximation technique for unknown nonlinearities. From a mathematical control standpoint, though, neural networks are just one of many classes of function approximators that have been used in controlling nonlinear processes. Other approximators include polynomial functions, radial basis functions, spline functions, and fuzzy logic systems (as a side note, the Sendai railway system in Japan is controlled by fuzzy set membership rules and its <a href="http://skisko.blogspot.com/2005/06/fuzzy-logic-and-its-practical-use-in.html">efficiency</a> has been said to be comparable to that of the blue railway line in the Los Angeles metro system).</p>
<p>It is 2018 and there is certainly no doubt that neural networks have found much use in controlling very uncertain, nonlinear, and complex systems. If you are in a foreign country and you find yourself using Google Translate, there is a decent chance that a giant composite neural network in the backend is doing the heavy lifting for you. So also in image recognition and music composition, among others, neural networks have solved problems that were once thought impossible due to the great computational resources required. The question is: how can we harness neural networks in the control of large processes and still guarantee stability, as opposed to, say, dumb reinforcement learning (which basically optimizes an index of performance without regard to stability)?</p>
<!-- ### The case for numerically-stable neuro-adaptve control -->
<p>To paraphrase the legendary Karl Astrom, “adaptive systems have witnessed the formalization of methods” for designing control and automation algorithms in linear and mildly nonlinear systems. There are, however, pertinent nonlinear problems that adaptive systems have not solved. Moreover, there are quite a few restrictive assumptions on the network reconstruction error that may undermine the efficacy of a neuro-controller, such as the inability of the online approximator to exactly match an uncertain nonlinear function despite the selection of optimal weights (i.e. the so-called ideal matching conditions are not satisfied).</p>
<p><a name="brouhaha"></a></p>
<h4 id="adaptive-neuro-control-the-reconstruction-error-brouhaha">Adaptive Neuro-Control: The Reconstruction Error Brouhaha</h4>
<p>To illustrate the way the reconstruction error can make the life of a control designer really miserable, I shall be borrowing the example from <sup id="fnref:Polycarpou"><a href="#fn:Polycarpou" class="footnote">10</a></sup><sup>’s</sup> IEEE TAC 1996 paper on “Stable Adaptive Neural Control Scheme for Nonlinear Systems”.</p>
<p>Suppose that we have a second-order system,</p>
<p>\begin{align}
\dot{x_1} &= x_2 + f^\star(x_1) \quad \nonumber \newline
\dot{x_2} & = u
\label{eq:second_order_ode}
\end{align}</p>
<p>where \(f^\star\) is an unknown smooth function. We seek to drive the system output \(y = x_1 \) to a small neighborhood of the origin. Without loss of generality, we shall split the unknown function \(f^\star\) into a known nominal estimate \(f\) and a remainder \(\phi\) as</p>
<p>\begin{align}
f^\star(x_1) = f(x_1) + \phi(x_1)
\end{align}</p>
<p>where \(\phi\) is an unknown function denoting the system uncertainty (could also be modeling errors). We will require the neural network to approximate only the unknown uncertainty \(\phi(x_1)\) rather than the overall function \(f^\star\). If the adaptation in the neural network is turned off, we thus end up with a nominal controller built around \(f(x_1)\) which, for example, could be a linear approximation of \(f^\star(x_1)\) for linear control methods.</p>
<p>Let us consider the online approximation of \(\phi\) by linearly parameterized radial basis functions with fixed centers and widths. It follows that we can rewrite \eqref{eq:second_order_ode} as</p>
<p>\begin{align}
\dot{x_1} = x_2 + f(x_1) + \theta^{\star^T} \zeta(x_1) + \delta(x_1)
\end{align}</p>
<p>where \(\zeta: \mathbb{R}\rightarrow \mathbb{R}^n\) is a known vector of smooth basis functions, and \(\theta^\star \in \mathbb{R}^n\) is an unknown weight vector, chosen as the value of \(\theta\) that minimizes the worst-case magnitude of \(\delta(x_1)\) over all \(x_1 \in \Omega\), where \(\Omega \subset \mathbb{R}\) is a compact region, i.e.,</p>
<p>\begin{align}
\theta^\star := \arg \min_{\theta \in \mathbb{R}^n} \left\{ \sup_{x_1 \in \Omega} \left| \phi(x_1) - \theta^T \zeta(x_1) \right| \right\};
\end{align}</p>
<p>\(\delta\) denotes the network reconstruction error, which we will interpret as</p>
<p>\begin{align}
\delta(x_1) = \phi(x_1) - \theta^{\star^T} \zeta(x_1).
\end{align}</p>
<p>The network reconstruction error is crucial because it represents the <em>minimum possible deviation</em> between the unknown function \(\phi\) and the input/output map of the function approximator. Generally, by the <em>universal approximation theorem for neural networks<sup id="fnref:Funahashi"><a href="#fn:Funahashi" class="footnote">11</a></sup></em>, one can make \(\delta\) arbitrarily small on a compact set by making the number of parameters (or weights), i.e. \(n\), really large.</p>
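<p>To get a feel for how the reconstruction error shrinks as \(n\) grows, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the "unknown" function <code>phi</code>, the interval \([-2, 2]\) standing in for \(\Omega\), Gaussian basis functions with evenly spaced centers and a width of one center spacing, and a least-squares fit on a grid used in place of the true minimax weights \(\theta^\star\).</p>

```python
import numpy as np


def rbf_features(x, centers, width):
    """Gaussian basis vector zeta(x) with fixed centers and a shared width."""
    diff = x[:, None] - centers[None, :]
    return np.exp(-0.5 * (diff / width) ** 2)


def reconstruction_error(phi, n, lo=-2.0, hi=2.0, n_grid=400):
    """Largest |delta| over a grid on [lo, hi] for a fit with n basis functions."""
    x = np.linspace(lo, hi, n_grid)
    centers = np.linspace(lo, hi, n)
    width = (hi - lo) / (n - 1)  # one center spacing (hypothetical choice)
    zeta = rbf_features(x, centers, width)
    # Least squares stands in for the minimax weights theta* (an assumption).
    theta, *_ = np.linalg.lstsq(zeta, phi(x), rcond=None)
    delta = phi(x) - zeta @ theta
    return np.max(np.abs(delta))


def phi(x):
    """Hypothetical 'unknown' uncertainty, used only to generate data."""
    return np.sin(3.0 * x) + 0.5 * x ** 2


errors = {n: reconstruction_error(phi, n) for n in (5, 10, 25)}
for n, err in errors.items():
    print(f"n = {n:2d}  max |delta| on grid = {err:.2e}")
```

<p>Because the fit is least squares rather than minimax, the printed numbers are only upper bounds on the best achievable \(\sup |\delta|\) for each \(n\) on this grid, but the downward trend with \(n\) is the point.</p>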
<ul>
<li>Assumption I: On the compact region \(\Omega \subset \mathbb{R}\), \begin{align}
|\delta(x_1)| \le \psi^\star \quad \forall \, x_1 \in \Omega,
\label{eq:error_bound}
\end{align}</li>
</ul>
<p>where \(\psi^\star \ge 0\) is an unknown bound.</p>
<p>What becomes clear from \eqref{eq:error_bound} is that \(\psi^\star\) is not unique, since any \(\bar{\psi}^\star > \psi^\star \) also satisfies the bound. So let us define \(\psi^\star\) to be the smallest (nonnegative) constant such that \eqref{eq:error_bound} is satisfied.</p>
<p>We will be showing <em>semi-global</em> stability for the system in \eqref{eq:second_order_ode} in the next subsection, in the sense that stability holds for \(x_1(t) \in \Omega \), where the set \(\Omega\) and bounding parameter \(\psi^\star\) can be arbitrarily large. When \eqref{eq:error_bound} holds for all \(x_1 \in \mathbb{R}\), we have <em>global stability</em>.</p>
<h4 id="proof-of-semi-global-stability">Proof of semi-global stability</h4>
<p>This section is not too important if you do not care for proofs but it will help in forming the conclusions we will be making in the next subsection.</p>
<p>We could change coordinates as follows:</p>
<p>\begin{align}
z_1 &= x_1 \nonumber \newline
z_2 &= x_2 - \alpha (x_1, \theta, \psi),
\label{eq:diff_eq}
\end{align}</p>
<p>where \(\alpha (x_1, \theta, \psi) = -x_1 - f(x_1) - \theta^T \zeta(x_1) - \beta_1(x_1, \psi)\), and \(\beta_1(\cdot)\) is a function to be defined shortly. Suppose further that we set the weight estimation error and the <em>adaptive bounding parameter</em> error as \(\tilde{\theta} = \theta - \theta^\star\) and \(\tilde{\psi} = \psi - \psi_m^\star\) respectively, where \(\psi_m^\star := \max \{\psi^\star, \psi^0\}\) with \(\psi^0 \ge 0 \). We can then define a Lyapunov function as follows:</p>
<p>\begin{align}
V = \frac{1}{2}(z_1^2 + z_2^2 + \tilde{\theta}^T \Gamma^{-1} \tilde{\theta} + \gamma^{-1}\tilde{\psi}^2),
\end{align}</p>
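<p>where \(\Gamma\) is a (symmetric) positive definite matrix – the adaptation gain for the weight vector \(\theta\), and \(\gamma > 0\) is the adaptation gain for the bounding parameter estimate \(\psi\). Since \(\theta^\star\) and \(\psi_m^\star\) are constants, \(\dot{\tilde{\theta}} = \dot{\theta}\) and \(\dot{\tilde{\psi}} = \dot{\psi}\), so the time derivative of the Lyapunov function satisfies,</p>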
<p>\begin{align}
\dot{V} = z_1 \dot{z}_1 + z_2 \dot{z}_2 + \tilde{\theta}^T \Gamma^{-1} \dot{\theta} + \gamma^{-1}\tilde{\psi}\dot{\psi}
\end{align}</p>
<p>so that</p>
<p>\begin{align}
\dot{V} &= z_1 \dot{x}_1 + z_2 \left(\dot{x}_2 - \frac{\partial \alpha}{\partial x_1} \dot{x_1} - \frac{\partial \alpha}{\partial \theta}\dot{\theta} - \frac{\partial \alpha}{\partial \psi}\dot{\psi} \right) + \tilde{\theta}^T \Gamma^{-1} \dot{\theta} + \gamma^{-1}\tilde{\psi}\dot{\psi} \newline
%
&= z_1 \left(x_2 + f(x_1) + {\theta^\star}^T \zeta(x_1) + \delta(x_1) \right) +
z_2 \left(u - \frac{\partial \alpha}{\partial x_1} \dot{x_1} - \frac{\partial \alpha}{\partial \theta}\dot{\theta} - \frac{\partial \alpha}{\partial \psi}\dot{\psi} \right) \nonumber \newline
& + \qquad \qquad \tilde{\theta}^T \Gamma^{-1} \dot{\theta} + \gamma^{-1}\tilde{\psi}\dot{\psi}
\end{align}</p>
<p>Abusing notation and dropping the function arguments, we find that,</p>
<p>\begin{align}
\dot{V} &= z_1 \left(z_2 + \alpha + f + {\theta^\star}^T\zeta + \delta \right) + \nonumber \newline
& \qquad z_2 \left[u - \frac{\partial \alpha}{\partial x_1} (x_2 + f + {\theta^\star}^T\zeta + \delta ) - \frac{\partial \alpha}{\partial \theta}\dot{\theta} - \frac{\partial \alpha}{\partial \psi}\dot{\psi} \right] + \nonumber \newline
& \qquad \tilde{\theta}^T \Gamma^{-1} \dot{\theta} + \gamma^{-1}\tilde{\psi}\dot{\psi}
\end{align}</p>
<p>which translates to</p>
<p>\begin{align}
\dot{V} &= z_1 z_2 + z_1\left(-z_1 - f - \theta^T\zeta - \beta_1 + f + {\theta^\star}^T\zeta + \delta\right) +
\nonumber \newline
& \qquad z_2 \left[u - \frac{\partial \alpha}{\partial x_1} (z_2 + \alpha + f + {\theta^\star}^T\zeta + \delta ) - \frac{\partial \alpha}{\partial \theta}\dot{\theta} - \frac{\partial \alpha}{\partial \psi}\dot{\psi} \right] + \nonumber \newline
& \qquad \tilde{\theta}^T \Gamma^{-1} \dot{\theta} + \gamma^{-1}\tilde{\psi}\dot{\psi} \label{eq:lyap_inter} \newline
%
&= -{z_1}^2 + z_1 z_2 - z_1 \tilde{\theta}^T \zeta - z_1 \left(\beta_1 - \delta \right) + \nonumber \newline
& \qquad z_2\left[u - \frac{\partial \alpha}{\partial x_1} (z_2 - z_1 - \tilde{\theta}^T \zeta - \beta_1 + \delta ) - \frac{\partial \alpha}{\partial \theta}\dot{\theta} - \frac{\partial \alpha}{\partial \psi}\dot{\psi} \right] + \nonumber \newline
& \qquad \tilde{\theta}^T \Gamma^{-1} \dot{\theta} + \gamma^{-1}\tilde{\psi}\dot{\psi} \nonumber
\end{align}</p>
<p>which is a result of substituting the expression for \(\alpha\) in \eqref{eq:lyap_inter}. Therefore,</p>
<p>\begin{align}
\dot{V} &= -{z_1}^2 + z_1 z_2 + z_2\left[u - \frac{\partial \alpha}{\partial x_1} (z_2 - z_1 - \beta_1 + \delta ) - \frac{\partial \alpha}{\partial \theta}\dot{\theta} - \frac{\partial \alpha}{\partial \psi}\dot{\psi} \right] + \nonumber \newline
& \qquad \tilde{\theta}^T \Gamma^{-1} \left[\dot{\theta} - \Gamma \zeta z_1 + \Gamma \zeta z_2 \frac{\partial \alpha}{\partial x_1} \right] - z_1 \left(\beta_1- \delta \right) + \gamma^{-1}\tilde{\psi}\dot{\psi} \newline
%
&= -{z_1}^2 + z_1 z_2 + z_2\left[u - \frac{\partial \alpha}{\partial x_1} (x_2 + f + \theta^T \zeta + \delta ) - \frac{\partial \alpha}{\partial \theta}\dot{\theta} - \frac{\partial \alpha}{\partial \psi}\dot{\psi} \right] + \nonumber \newline
& \qquad \tilde{\theta}^T \Gamma^{-1} \left[\dot{\theta} - \Gamma \zeta \left(z_1 - z_2 \frac{\partial \alpha}{\partial x_1}\right) \right] - z_1 \left(\beta_1- \delta \right) + \gamma^{-1}\tilde{\psi}\dot{\psi}
\end{align}</p>
<p>Finally,</p>
<p>\begin{align}
\dot{V} &= -{z_1}^2 + z_1 z_2 + z_2\left[u - \frac{\partial \alpha}{\partial x_1} (x_2 + f + \theta^T \zeta ) - \frac{\partial \alpha}{\partial \theta}\dot{\theta} - \frac{\partial \alpha}{\partial \psi}\dot{\psi} \right] + \nonumber \newline
& \qquad \tilde{\theta}^T \Gamma^{-1} \left[\dot{\theta} - \Gamma \zeta \left(z_1 - z_2 \frac{\partial \alpha}{\partial x_1}\right) \right] - z_2 \frac{\partial \alpha}{\partial x_1} \delta - z_1 \left(\beta_1- \delta \right) + \gamma^{-1}\tilde{\psi}\dot{\psi}
\label{eq:lyap_final}
\end{align}</p>
<p>Equation \eqref{eq:lyap_final} is pivotal since it will help us prove the stability of the neuro-adaptive system under consideration. We would want all terms in that equation to be negative along the trajectories of the solution to \eqref{eq:diff_eq}.</p>
<p>If we select the following control law,</p>
<p>\begin{align}
u &= -z_1 - z_2 + \frac{\partial \alpha}{\partial x_1} (x_2 + f + \theta^T \zeta ) + \frac{\partial \alpha}{\partial \theta}\dot{\theta} + \frac{\partial \alpha}{\partial \psi}\dot{\psi} - \beta_2(x_1, x_2, \theta, \psi),
\end{align}</p>
<p>where \(\beta_2(x_1, x_2, \theta, \psi)\) is some function that is to be defined later, then it follows that
the time derivative of the Lyapunov function becomes,</p>
<p>\begin{align}
\dot{V} &= -{z_1}^2 - {z_2}^2 + \tilde{\theta}^T \Gamma^{-1} \left[\dot{\theta} - \Gamma \zeta (z_1 - z_2 \frac{\partial \alpha}{\partial x_1}) \right] + \Lambda
\label{eq:lyap_lambda}
\end{align}</p>
<p>with \(\Lambda\) denoting,</p>
<p>\begin{align}
\Lambda(x_1, x_2, \theta, \psi) = - z_1(\beta_1 - \delta) - z_2(\beta_2 + \frac{\partial \alpha}{\partial x_1} \delta) + \gamma^{-1}\tilde{\psi}\dot{\psi}
\end{align}</p>
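<p>The algebra leading to \eqref{eq:lyap_lambda} is easy to get wrong by a sign, so a numerical spot-check is a cheap sanity test. The sketch below assumes the scalar case (\(n = 1\)) and treats every quantity, including the partial derivatives of \(\alpha\), as an independent random number; the variable names are mine. It computes \(\dot{V}\) directly from the dynamics and the control law, and compares it with the closed form.</p>

```python
import math
import random

random.seed(0)

for _ in range(100):
    # Random scalar values standing in for every quantity (n = 1 basis function).
    (z1, z2, f, theta, theta_star, zeta, delta, beta1, beta2,
     a_x, a_theta, a_psi, theta_dot, psi_dot, psi_tilde) = (
        random.uniform(-2.0, 2.0) for _ in range(15))
    Gamma = random.uniform(0.5, 2.0)  # adaptation gains must be positive
    gamma = random.uniform(0.5, 2.0)

    alpha = -z1 - f - theta * zeta - beta1
    x2 = z2 + alpha
    x1_dot = x2 + f + theta_star * zeta + delta

    # Control law from the text; a_x, a_theta, a_psi are the partials of alpha.
    u = (-z1 - z2 + a_x * (x2 + f + theta * zeta)
         + a_theta * theta_dot + a_psi * psi_dot - beta2)
    z2_dot = u - a_x * x1_dot - a_theta * theta_dot - a_psi * psi_dot

    theta_tilde = theta - theta_star
    v_dot = (z1 * x1_dot + z2 * z2_dot
             + theta_tilde * theta_dot / Gamma + psi_tilde * psi_dot / gamma)

    lam = (-z1 * (beta1 - delta) - z2 * (beta2 + a_x * delta)
           + psi_tilde * psi_dot / gamma)
    claimed = (-z1 ** 2 - z2 ** 2
               + theta_tilde / Gamma * (theta_dot - Gamma * zeta * (z1 - z2 * a_x))
               + lam)

    assert math.isclose(v_dot, claimed, rel_tol=1e-9, abs_tol=1e-9)

print("V-dot identity holds at all sampled points")
```

<p>This only checks the identity at sampled points; it is not a proof, but a sign slip in any of the terms would make the assertion fail almost immediately.</p>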
<p>Lyapunov stability requires the time derivative of \(V(\cdot)\) to be negative definite outside of the origin. By a close examination of \eqref{eq:lyap_lambda}, we see that save the last two terms, \(\dot{V}\) is already \(< 0 \) when \(z_{1,2} \neq 0\). To knock out the third term, we’ll set the adaptation update law for \(\theta\) as,</p>
<p>\begin{align}
\dot{\theta} = \Gamma \zeta (z_1 - z_2 \frac{\partial \alpha}{\partial x_1}).
\end{align}</p>
<p>Furthermore, to prevent parameter drift of the network parameters, we’ll add the standard \(\sigma\)-modification leakage term to the equation in the foregoing as follows:</p>
<p>\begin{align}
\dot{\theta} = \Gamma \left[\zeta \left(z_1 - z_2 \frac{\partial \alpha}{\partial x_1}\right) - \sigma (\theta - \theta^0)\right],
\end{align}</p>
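<p>As a rough Python sketch of this update law: the function name, argument names and numbers below are hypothetical, and I am assuming the leakage term acts on the weight vector itself, i.e. \(\dot{\theta} = \Gamma[\zeta(z_1 - z_2 \, \partial \alpha / \partial x_1) - \sigma(\theta - \theta^0)]\), so that all the dimensions line up.</p>

```python
import numpy as np


def theta_dot(theta, zeta, z1, z2, dalpha_dx1, Gamma, sigma, theta0):
    """Weight update with sigma-modification (leakage toward theta0)."""
    gradient_term = zeta * (z1 - z2 * dalpha_dx1)  # cancels the theta-tilde term in V-dot
    leakage_term = sigma * (theta - theta0)        # pulls theta back toward theta0
    return Gamma @ (gradient_term - leakage_term)


# Hypothetical numbers for a three-weight network.
Gamma = 2.0 * np.eye(3)
theta0 = np.zeros(3)
theta = np.array([5.0, -4.0, 3.0])
zeta = np.array([0.2, 0.9, 0.1])

# With zero tracking errors only the leakage acts, so theta decays toward theta0.
rate = theta_dot(theta, zeta, z1=0.0, z2=0.0, dalpha_dx1=-1.0,
                 Gamma=Gamma, sigma=0.1, theta0=theta0)
print(rate)
```

<p>The case shown is the one the leakage is meant for: when the errors \(z_1, z_2\) carry no information, the weights relax toward \(\theta^0\) instead of drifting.</p>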
<p>where \(\sigma > 0\) and \(\theta^0\) are constants to be chosen by the user. We can then define the bounding terms \(\beta_1, \beta_2\) in terms of the adaptive bounding parameter \(\psi\) like so,</p>
<p>\begin{align}
\beta_1 &= \psi \omega_1 \nonumber \newline
\beta_2 &= \psi \omega_2
\end{align}</p>
<p>where for a small positive constant \(\epsilon\), we have that</p>
<p>\begin{align}
\omega_1 (x_1) &= \tanh\left(\frac{z_1}{\epsilon}\right) \nonumber \newline
\omega_2 (x_1, x_2, \theta, \psi) &= p \, \tanh\left(\frac{z_2 \, p}{\epsilon}\right) \nonumber \newline
p(x_1, \theta, \psi) &= \left|\dfrac{\partial \alpha}{\partial x_1}\right|.
\end{align}</p>
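<p>With \(\beta_1 = \psi \tanh(z_1/\epsilon)\) and \(z_1 = x_1\) now fixed, the partial derivatives of \(\alpha\) that appear in the control law can be written out by direct differentiation of its definition. In the sketch below, \(f'\) and \(\zeta'\) are shorthand (mine, not the paper's) for derivatives with respect to \(x_1\), and the nominal \(f\) is assumed to be differentiable:</p>
<p>\begin{align}
\frac{\partial \alpha}{\partial x_1} &= -1 - f'(x_1) - \theta^T \zeta'(x_1) - \frac{\psi}{\epsilon} \, \text{sech}^2\left(\frac{x_1}{\epsilon}\right) \nonumber \newline
\frac{\partial \alpha}{\partial \theta} &= -\zeta(x_1)^T \nonumber \newline
\frac{\partial \alpha}{\partial \psi} &= -\tanh\left(\frac{x_1}{\epsilon}\right). \nonumber
\end{align}</p>
<p>Note that a small \(\epsilon\) makes \(\tanh\) a closer approximation of the sign function, but it also makes the \(\psi / \epsilon\) term in \(\partial \alpha / \partial x_1\), and hence the control effort, large near \(x_1 = 0\).</p>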
<p>Since \(\psi = \tilde{\psi} + \psi_m^\star\) and, from \eqref{eq:error_bound}, \(|\delta| \le \psi^\star \le \psi_m^\star\), it follows that</p>
<p>\begin{align}
\Lambda &= -z_1 \psi \omega_1 + z_1 \delta - z_2 \psi \omega_2 - z_2 \frac{\partial \alpha}{\partial x_1}\delta + \gamma^{-1} \tilde{\psi}\dot{\psi} \nonumber \newline
& \le -z_1(\tilde{\psi} + \psi_m^\star) \omega_1 + |z_1| \psi_m^\star - z_2(\tilde{\psi} + \psi_m^\star) \omega_2 + |z_2| p \psi_m^\star + \gamma^{-1} \tilde{\psi} \dot{\psi}.
\label{eq:lambda}
\end{align}</p>
<p>Collecting terms in \eqref{eq:lambda}, we find that</p>
<p>\begin{align}
\Lambda \le \psi_m^\star(|z_1| - z_1\omega_1) + \psi_m^\star(|pz_2| - z_2\omega_2) + \gamma^{-1}\tilde{\psi}[\dot{\psi} - \gamma(z_1\omega_1 + z_2 \omega_2)].
\end{align}</p>
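<p>The first two terms on the right both have the form \(\psi_m^\star (|\eta| - \eta \tanh(\eta/\epsilon))\), with \(\eta = z_1\) and \(\eta = p z_2\) respectively. A standard bound for this kind of \(\tanh\) smoothing, stated here as background and not derived in this post, is \(0 \le |\eta| - \eta \tanh(\eta/\epsilon) \le \kappa \epsilon\) with \(\kappa \approx 0.2785\), so those terms can be made as small as desired through the choice of \(\epsilon\). A quick numerical check in Python (the helper name and the scanned range are my own choices):</p>

```python
import math

KAPPA = 0.2785  # constant in the bound |eta| - eta*tanh(eta/eps) <= KAPPA*eps


def smoothing_gap(eta, eps):
    """Gap between |eta| and its smooth surrogate eta * tanh(eta / eps)."""
    return abs(eta) - eta * math.tanh(eta / eps)


worst_ratio = 0.0
for eps in (0.01, 0.1, 1.0):
    # Scan eta over [-20*eps, 20*eps]; the gap is negligible beyond that range.
    for k in range(-20000, 20001):
        eta = k * eps / 1000.0
        gap = smoothing_gap(eta, eps)
        assert gap >= 0.0
        worst_ratio = max(worst_ratio, gap / eps)

print(f"largest observed gap / eps = {worst_ratio:.4f}")
```

<p>The ratio does not depend on \(\epsilon\), which is why a single constant \(\kappa\) suffices in the bound.</p>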
<p><strong>This section is currently under development. Please check back in a few days</strong></p>
<h4 id="the-case-for-stable-adaptive-large-scale-neuro-control">The case for stable adaptive large-scale neuro-control</h4>
<!-- For severe nonlinearities (e.g. systems that possess sub-harmonics and cascades to chaos [^Billings]), the uncertainties and unknowns are unknown nonlinear functions coupled with parameters that appear nonlinear-in-the-parameters of a system [^Ioannou-tut]. Existing adaptive system methods cannot be applied to these problems due to the lack of universal eigenfunctions or Lyapunov exponents [^Ogunfunmi].
Current approaches for solving unknown system nonlinearities resort to linear approximations about nominal trajectories so as to obtain locally stable solutions [^Lavretsky]. Mildly nonlinear system harmonics are treated with Volterra kernels and wavelet analyses. However, linear approximations about nominal local trajectories provide no global stabilizing control law guarantees. And Volterra kernels fail in systems with nonlinear sub-harmonics. A crucial component of these methods are that they assume a structure about the underlying nonlinearity. Therefore, we still rely on linearized nominal approximations when we attempt to control complex dynamical systems such as aircrafts, chemical plants and oil and gas manufacturing processes etc [3].
In order to advance the state-of-the-art, and extend our solutions to challenging problems such as natural language processing, dynamic computer vision segmentation, consistent and safe autonomous driving, efficient distributed automation and manufacturing processes, and climate prediction, we must solve the nonlinear problem and device intelligent adaptive controllers.
The most successful severely nonlinear model estimators of today are deep learning models, often trained with back-propagation. Given the intractability of the solvability of nonlinear control problems (e.g. with the Hamilton-Jacobi-Bellman equation for \\(n>3\\), for an \\(n\\)-dimensional state), deep networks have found use as efficient function approximators that compactly parameterize high-dimensional control laws(e.g. in deep reinforcement learning). However, these deep neuro-estimators and neuro-controllers lack formal robustness and stability guarantees. These methods have the following drawbacks, namely:
+ they place an emphasis on depth (hence their capacity for large memory consumption) rather than sparsity, making them over-parameterized systems, that are very sensitive to noise and disturbance;
+ being black boxes, they lack stabilizing guarantees in trained policies so that trained deep neuro-controllers are often unstable and exhibit brittleness in the real-world [^Ogunmolu]
+ they place an overwhelming emphasis on memorization of static dataset as opposed to adaptability and analyzability as new data samples become available;
+ so far, neuro-adaptive control methods that guarantee Lyapunov stability have not been applied to the class of problems where the curse-of-dimensionality is a challenge[7].
### My neuro-adaptive journey
Therefore, to achieve adaptive systems' original goals, we must revisit these persisting problems and devise sparse, adaptive neuro-estimators, and neuro-control laws.
I focus on the approximation properties of deep networks for representing complex systems. I also research the Lyapunov stability and synthesis of complex nonlinear control problems. My goal is to find solutions that assure optimality, guarantee robustness and stability for complex autonomous behaviors. Leveraging on information theory and learning-based approaches, I research difficult to automate large-scale autonomous control problems. I approach this problem in strides. Currently, in my PhD research, I study these nonlinearities in automating patient motion correction during clinical radiotherapy.
### Efforts in crossing the Rubicon
For completeness' sake, I am attaching the full slides of my recent talk on this subject here. These slides deal with understanding the the role of neural networks in adaptive control theory. It is based on a talk Nick gave at Google Robotics last year as well as my talk at PFN this year. It does more justice to the subject than I could ever rewrite in markdown :). Enjoy and feel free to ask questions in the comments section!
[Soft-Neuro-Adapt](/assets/presentations/google.pdf)
-->
<h3 id="references">References</h3>
<div class="footnotes">
<ol>
<li id="fn:Ioannou1">
<p>Ioannou, P. A. and Sun, J. Robust Adaptive Control. Englewood Cliffs, NJ: Prentice-Hall, 1995. <a href="#fnref:Ioannou1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:Ioannou2">
<p>Ioannou, P. A. and Datta, A. “Robust adaptive control: A unified approach,” Proc. IEEE, vol. 79, no. 12, pp. 1736-1768, 1991. <a href="#fnref:Ioannou2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:Sastry">
<p>Sastry, Shankar, and Marc Bodson. Adaptive control: stability, convergence and robustness. Courier Corporation, 2011. <a href="#fnref:Sastry" class="reversefootnote">↩</a> <a href="#fnref:Sastry:1" class="reversefootnote">↩<sup>2</sup></a></p>
</li>
<li id="fn:Kokotovic1">
<p>I. Kanellakopoulos, P. V. Kokotovic, and A. S. Morse, “Systematic design of adaptive controllers for feedback linearizable systems,” IEEE Trans. Automat. Contr., vol. 36, no. 11, pp. 1241-1253, 1991. <a href="#fnref:Kokotovic1" class="reversefootnote">↩</a> <a href="#fnref:Kokotovic1:1" class="reversefootnote">↩<sup>2</sup></a></p>
</li>
<li id="fn:Kokotovic2">
<p>M. Krstic and P. V. Kokotovic, “Adaptive nonlinear design with controller-identifier separation and swapping,” IEEE Trans. Automat. Contr., vol. 40, no. 3, pp. 426-440, 1995. <a href="#fnref:Kokotovic2" class="reversefootnote">↩</a> <a href="#fnref:Kokotovic2:1" class="reversefootnote">↩<sup>2</sup></a></p>
</li>
<li id="fn:Bellman">
<p>Bellman, R.E., Dreyfus, S.E. Applied Dynamic Programming, United States Air Force Project RAND. May 1962 <a href="#fnref:Bellman" class="reversefootnote">↩</a></p>
</li>
<li id="fn:1">
<p>The problem of Bolza involves finding the extremum of a function of the end-point b, as in \(J(y) = \int_{a}^{b} g(z(x), y(x), x) dx + h(z(b), y(b), b)\) with \(x\) and \(y\) subject to \( \dfrac{dz}{dx} = H(z, y, x), \qquad z(0) = c_1 \) <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>The problem of Mayer in the calculus of variations attempts to find the extremum of a function of the end point \(b\), \(J(y) = h(z(b), y(b), b) \) <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>The Riemann-Stieltjes integral is describable by \(J(y) = \int_{a}^{b} g(z(x), y(x), x) dG(x)\) <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:Polycarpou">
<p><a href="https://doi.org/10.1109/9.486648">Polycarpou, M. M. (1996). Stable adaptive neural control scheme for nonlinear systems. IEEE Transactions on Automatic Control, 41(3), 447–451.</a> <a href="#fnref:Polycarpou" class="reversefootnote">↩</a></p>
</li>
<li id="fn:Funahashi">
<p>Funahashi, Ken-Ichi (1989). On the approximate realization of continuous mappings by neural networks. Neural Networks, Elsevier, 1989. <a href="#fnref:Funahashi" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
Mon, 30 Jul 2018 21:19:00 -0500
/neuro-adaptive-control
/neuro-adaptive-controlcontroladaptive-controlWhat's behind IEEE RAS Best Conference Papers?<p>Through the looking-glass, tired of working my brain out while familiarizing myself with a giant codebase I was reverse-engineering to prove a theory for an upcoming conference, I started asking myself why I was doing what I was doing. Just to get a paper out? Or to make a great contribution to science and my field? I googled something along the lines of “<em>How to write a best IEEE RAS conference paper</em>”. What I found was very interesting: I came across this <a href="/assets/ieee-best-paper/best_paper.pdf">IEEE Student Activities Committee Paper</a>. It offers
great insight into what it takes to write a paper that merits an IEEE Robotics and Automation Society best conference paper award.</p>
<p>I hope you enjoy reading it as much as I did.</p>
Sat, 24 Jun 2017 04:15:00 -0500
/ieee-best-papers
/ieee-best-papersbest-paper,IEEE,RASUnderstanding Generative Adversarial Networks
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: { equationNumbers: { autoNumber: "AMS" } }
});
</script>
<!--Mathjax Parser -->
<script type="text/javascript" async="" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
</script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<h4 id="introduction">INTRODUCTION</h4>
<p>Deep learning strives to discover a probability distribution on the underlying nonlinear system. This probability distribution is usually discovered using composite mappings of neurons stacked into layers (with many such layers composing a deep network model), providing an abstraction of underlying low-level features for ease of inference at a much higher level.</p>
<p>The most successful deep learning models have been discriminative, supervised ones that map a high-dimensional, sensory-rich input to an output using the backpropagation and/or dropout algorithms.</p>
<p>In order to learn in an unsupervised fashion, one has to find a model that is capable of approximating the underlying probability distribution. This is by no means easy, owing to the intractability of the probabilistic computations that arise in
maximum likelihood estimation, importance sampling and Gibbs distributions. In such deep generative contexts, it is also difficult to leverage the scalability of the piecewise linear units used in discriminative algorithms.</p>
<h4 id="deciphering-generative-adversarial-networks">Deciphering Generative Adversarial Networks</h4>
<p>Since such deep generative models are difficult to train, the question to ask then is, “is it possible to train a generative model from a classical discriminative classification setting?” If we have labeled data available, one can train a differentiable function that maps from an input space to the output space of a Borel measurable set in a supervised setting; this differentiable function can be thought of as fitting a probability distribution on a generative model.</p>
<p>What happens if we pit the generative model against an adversary (a discriminator) whose goal is to score the <em>likelihood</em> (not exactly, but we’ll see the definition shortly) that a sample is real, i.e. drawn from the data, rather than fake, i.e. drawn from the probability distribution of the generative model?
You see, things get interesting here. The generator plays the counterfeiter: as it learns to map samples from its latent space to outputs that the adversary cannot tell apart from data, it improves its chances of passing them off as real. This is the central idea behind <strong>generative adversarial networks</strong>.
Interesting, huh?</p>
<h5 id="prior-work">Prior Work</h5>
<p>Deep Deterministic Networks are directed graphical models. To generate stochastic distributions that represent an underlying model, people have used undirected graphical models with hidden variables such as restricted Boltzmann machines (<a href="http://dl.acm.org/citation.cfm?id=104290">RBMs</a>, <a href="https://www.ncbi.nlm.nih.gov/pubmed/16764513">RBMs</a>) and deep Boltzmann machines (<a href="#dbms">DBMs</a>). Unless the underlying <em>partition function</em> of such a model is trivial, the normalization of the product of potential functions and the resulting gradients are intractable, which limits such methods.</p>
<p><!-- With Monte Carlo Markov Chain methods, however, one can approximate such functions. --></p>
<p>With deep belief networks, the gradient of a log likelihood expectation is computed with two different techniques: the expectations of <em>data-dependent</em> modes are <em>estimated</em> with variational approximations while the expectations of <em>data-independent</em> modes are approximated by <em>persistent Markov chains</em>.</p>
<p>Noise contrastive estimation (<a href="http://proceedings.mlr.press/v9/gutmann10a/gutmann10a.pdf">NCE</a>) and generative stochastic networks (<a href="https://arxiv.org/pdf/1306.1091.pdf">GSNs</a>) are the closest ancestors of generative adversarial networks. NCE does not approximate or bound the log likelihood, but it requires the learned probability distribution to be analytically specified up to a normalization constant. With deep models containing many latent variables, specifying the probability distribution, even up to a normalization constant, is an almost impossible task.</p>
<p>GSNs belong to the class of algorithms that, instead of bounding the log likelihood, train a generative model with backpropagation to draw samples from a desired distribution through a parameterized Markov chain. Adversarial nets, in contrast, do not require Markov chains for sampling from a data distribution and use piecewise linear units (ReLUs, maxout, etc.) to improve the performance of backpropagation.</p>
<h4 id="inside-gans">Inside GANs</h4>
<p>The original <a href="https://arxiv.org/abs/1406.2661">adversarial net</a> was designed using two multilayer perceptrons as the modeling framework. It’s defined as follows:</p>
<p><strong>Generator:</strong></p>
<ul>
<li>suppose we have a data set, \(\textbf{x} \sim p_g(\centerdot)\)
<ul>
<li>where \(p_g\) represents the parameterization of the probability distribution of data \(\textbf{x}\);</li>
</ul>
</li>
<li>
<p>suppose also that we have a noisy dataset \(\textbf{z}\) with a prior defined as \(p_z(\textbf{z})\)</p>
</li>
<li>let \(G(\textbf{z};\theta_g)\) be a differentiable mapping from \(\textbf{z}\)’s state space to \(\textbf{x}\)’s state space parameterized by \(\theta_g\)
<ul>
<li>i.e. \(G_{\theta_g}: \mathbb{R}^{n_z} \rightarrow \mathbb{R}^{n_x}\)</li>
</ul>
</li>
<li><strong>define \(G(\textbf{z};\theta_g)\) as the generator</strong></li>
</ul>
<p><strong>Discriminator:</strong></p>
<ul>
<li>
<p>let us define \(D(\textbf{x})\) as the probability that the data \(\textbf{x}\) is drawn from the data state space and not its parameterized probability distribution \(p_g(\centerdot)\)</p>
</li>
<li>
<p>let \(D(\textbf{x}; \theta_d): \mathbb{R}^{n_x} \rightarrow \mathbb{R}^1\) represent the differentiable function that parameterizes the probability \(D(\textbf{x})\)</p>
</li>
<li>
<p><strong>define \(D(\textbf{x}; \theta_d)\) as the discriminator</strong></p>
</li>
</ul>
<p><strong>GANs Algorithm</strong></p>
<p>GANs <strong><em>simultaneously</em></strong> train two <u>different</u> multilayer perceptrons in a minimax scenario:</p>
<ul>
<li>one maximizes the probability of assigning the correct label to both training examples and samples drawn from \(G(z; \theta_g)\),
<ul>
<li>i.e. train \(D\) to \(\max_D \text{ log } D(\textbf{x}) + \text{log }[1 - D(G(\textbf{z}))]\);</li>
<li>note that this is done with supervised learning</li>
</ul>
</li>
<li>the second perceptron trains \(G(\textbf{z};\theta_g)\) to minimize the \(\text{log }[1 - D(G(\textbf{z}))]\)</li>
</ul>
<p>Essentially, we are playing a two-player minimax game with value function \(V(G(\centerdot), D(\centerdot))\):</p>
<p>\begin{equation}
\min_G \max_D V(D,G) = \mathbb{E}_{x \sim p(x)} \left[\text{ log } D(\textbf{x}) \right] + \mathbb{E}_{\textbf{z} \sim p(\textbf{z})} \left[\text{ log } \left(1 - D(G(\textbf{z}))\right)\right]
\end{equation}</p>
<p>One may wonder why we are taking the logarithms of the differentiable functions: the value function is simply the log-likelihood of a binary classifier (the negative of the binary cross-entropy loss), so \(D\) is trained by maximum likelihood to label data as real and generated samples as fake. Logarithms also turn products of per-sample probabilities into sums, which are easier to differentiate and better behaved numerically.</p>
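<p>To make the value function concrete, here is a minimal sketch of how one might estimate \(V(D, G)\) from a batch of discriminator outputs. The function name <code class="highlighter-rouge">gan_value</code> and the sample numbers are illustrative assumptions, not something taken from the original paper.</p>

```python
import math


def gan_value(d_real, d_fake):
    """Monte Carlo estimate of V(D, G).

    d_real: discriminator outputs D(x) on real samples, each in (0, 1)
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1)
    """
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return real_term + fake_term


# A discriminator that cannot tell the two apart outputs 1/2 everywhere,
# which gives V = log(1/2) + log(1/2) = -log(4).
confused = gan_value([0.5, 0.5], [0.5, 0.5])

# A discriminator that separates the two well scores a higher value.
sharp = gan_value([0.9, 0.8], [0.1, 0.2])
print(confused, sharp)
```

<p>The discriminator pushes this number up, while the generator pushes the second term down.</p>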
<div class="fig figcenter fighighlight">
<img src="assets/Gans/fig1.jpg" width="20%" height="60%" border="0" />
<img src="assets/Gans/fig2.jpg" width="20%" height="60%" border="0" />
<img src="assets/Gans/fig3.jpg" width="20%" height="60%" border="0" />
<img src="assets/Gans/fig4.jpg" width="20%" height="60%" border="0" />
<div class="figcaption" align="left">
Fig.1: Intuition behind Generative Adversarial Networks (reproduced from the original <a href="https://arxiv.org/abs/1406.2661">paper</a>).
</div>
</div>
<p>Essentially, GANs involve simultaneously:</p>
<ul>
<li>
<p>generating samples from a data with probability distribution, \(p_{\text{data}}(x)\) (black, dotted system in Fig 1)</p>
</li>
<li>generating samples from a generative distribution, \(p_g(G)\) (green, solid system in Fig 1),
<ul>
<li>\(z\) is sampled from a noise prior, in this case uniformly (lower horizontal line in Fig 1).</li>
</ul>
</li>
<li>updating a discriminant, \(D\) (blue, dashed system in Fig 1),
<ul>
<li>this distinguishes samples from \(p_{\text{data}}(x)\) from samples from \(p_g(G)\).</li>
</ul>
</li>
<li>the upper horizontal line is part of the domain of \(\textbf{x}\) such that
<ul>
<li>the mapping \(\textbf{x} =G(\textbf{z})\), shown by the upward arrows, imposes the non-uniform distribution \(p_g\) on transformed samples.</li>
</ul>
</li>
<li>
<p>\(G(\centerdot)\) contracts in regions of high density and expands in regions of low density of $p_g$.</p>
</li>
<li>
<p>near optimum, \(p_g\) is similar to $p_\text{data}$ and $D$ is a partially accurate classifier (fig1a).</p>
</li>
<li>
<p>in the inner loop of the algorithm, \(D\) is trained to discriminate samples from data, converging to \(D^\star(\textbf{x}) = \frac{p_\text{data}(\textbf{x})}{ p_\text{data}(\textbf{x}) + p_g(\textbf{x})}\) (fig1b).</p>
</li>
<li>
<p>when updating \(G\), the gradient of \(D\) guides \(G(\textbf{z})\) to flow to regions that are more likely to be classified as data (fig1c).</p>
</li>
<li>after multiple steps of backprop, provided that $G$ and $D$ have enough capacity, they will reach a <em>saddle point</em> where neither can improve, since $p_g = p_\text{data}$ (fig1d) .</li>
</ul>
<p>At equilibrium, the discriminator is unable to differentiate between the two
distributions, i.e. \(D(\textbf{x}) = \frac{1}{2}\).</p>
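<p>The optimal discriminator \(D^\star\) and the equilibrium value of one half can be checked numerically. The sketch below assumes, purely for illustration, that both \(p_\text{data}\) and \(p_g\) are one-dimensional Gaussians; the helper names are hypothetical.</p>

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian."""
    u = (x - mean) / std
    return math.exp(-0.5 * u * u) / (std * math.sqrt(2.0 * math.pi))


def optimal_discriminator(x, p_data, p_g):
    """D*(x) = p_data(x) / (p_data(x) + p_g(x))."""
    pd, pg = p_data(x), p_g(x)
    return pd / (pd + pg)


def p_data(x):
    return gaussian_pdf(x, 4.0, 1.0)   # assumed data distribution


def p_g_early(x):
    return gaussian_pdf(x, 0.0, 1.0)   # generator far from the data


def p_g_converged(x):
    return gaussian_pdf(x, 4.0, 1.0)   # generator matches the data


# Early in training D* is confident near the data mean ...
print(optimal_discriminator(4.0, p_data, p_g_early))
# ... and once p_g = p_data it is stuck at one half everywhere.
print(optimal_discriminator(4.0, p_data, p_g_converged))
```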
<p>GANs allow us to train a discriminator as an unsupervised “density estimator” which outputs a high value for original data and a low value for fictitious data. By pitting the generator against the discriminator, the generator parameterizes a nonlinear manifold and maps noise samples to points on the data manifold, while the discriminator develops internal dynamics capable of solving a difficult unsupervised learning problem.</p>
<h4 id="training-gans">Training GANs</h4>
<p>The algorithm is as detailed below:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">N</span> <span class="n">do</span>
<span class="k">for</span> <span class="n">steps</span> <span class="ow">in</span> <span class="n">k</span> <span class="n">do</span>
<span class="n">obtain</span> <span class="n">m</span> <span class="n">noise</span> <span class="n">samples</span> <span class="p">{</span><span class="n">z</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="o">...</span><span class="p">,</span> <span class="n">z</span><span class="p">(</span><span class="n">m</span><span class="p">)}</span> <span class="k">from</span> <span class="n">generator</span> <span class="n">noise</span> <span class="n">prior</span> <span class="n">p</span><span class="p">(</span><span class="n">z</span><span class="p">)</span>
<span class="n">obtain</span> <span class="n">m</span> <span class="n">minibatch</span> <span class="n">examples</span> <span class="p">{</span><span class="n">x</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="o">...</span><span class="p">,</span> <span class="n">x</span><span class="p">(</span><span class="n">m</span><span class="p">)}</span> <span class="k">from</span> <span class="n">data</span> <span class="n">generating</span> <span class="n">distribution</span> <span class="n">p</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
<span class="n">update</span> <span class="n">discriminator</span> <span class="n">by</span> <span class="n">ascending</span> <span class="n">its</span> <span class="n">stochastic</span> <span class="n">gradient</span><span class="p">:</span>
</code></pre></div></div>
<p>\begin{equation}
\nabla_{\theta_d}\frac{1}{m}\sum_{i=1}^{m} \left[\text{log } D(x^{(i)}) + \text{ log } (1 - D(G(z^{(i)})))\right]. \nonumber
\end{equation}</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">end</span> <span class="k">for</span>
<span class="n">obtain</span> <span class="n">m</span> <span class="n">noise</span> <span class="n">samples</span> <span class="p">{</span><span class="n">z</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="o">...</span><span class="p">,</span> <span class="n">z</span><span class="p">(</span><span class="n">m</span><span class="p">)}</span> <span class="k">from</span> <span class="n">generator</span> <span class="n">noise</span> <span class="n">prior</span> <span class="n">p</span><span class="p">(</span><span class="n">z</span><span class="p">)</span>
<span class="n">update</span> <span class="n">generator</span> <span class="n">by</span> <span class="n">descending</span> <span class="n">its</span> <span class="n">stochastic</span> <span class="n">gradient</span><span class="p">:</span>
</code></pre></div></div>
<p>\begin{equation}
\nabla_{\theta_g}\frac{1}{m}\sum_{i=1}^{m} \text{ log } (1 - D(G(z^{(i)}))). \nonumber
\end{equation}</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">end</span> <span class="k">for</span>
</code></pre></div></div>
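<p>The pseudocode above can be turned into a small runnable sketch. Everything specific here is an assumption made for illustration: the data are drawn from a one-dimensional Gaussian with mean 4, the generator is an affine map \(G(z) = az + b\), the discriminator is a logistic regression \(D(x) = \sigma(wx + c)\), and the gradients are written out by hand instead of using backprop. It is a toy, not the setup used in the original paper.</p>

```python
import math
import random


def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def D(x, theta_d):
    w, c = theta_d
    return sigmoid(w * x + c)


def G(z, theta_g):
    a, b = theta_g
    return a * z + b


def d_objective(xs, zs, theta_d, theta_g):
    # (1/m) * sum_i [log D(x_i) + log(1 - D(G(z_i)))]
    return sum(math.log(D(x, theta_d)) + math.log(1.0 - D(G(z, theta_g), theta_d))
               for x, z in zip(xs, zs)) / len(xs)


def g_objective(zs, theta_d, theta_g):
    # (1/m) * sum_i log(1 - D(G(z_i)))
    return sum(math.log(1.0 - D(G(z, theta_g), theta_d)) for z in zs) / len(zs)


def discriminator_step(xs, zs, theta_d, theta_g, lr):
    """One step of stochastic gradient ascent on the discriminator."""
    w, c = theta_d
    m = len(xs)
    gw = gc = 0.0
    for x, z in zip(xs, zs):
        fake = G(z, theta_g)
        d_real, d_fake = D(x, theta_d), D(fake, theta_d)
        gw += ((1.0 - d_real) * x - d_fake * fake) / m
        gc += ((1.0 - d_real) - d_fake) / m
    return (w + lr * gw, c + lr * gc)


def generator_step(zs, theta_d, theta_g, lr):
    """One step of stochastic gradient descent on the generator."""
    w, _ = theta_d
    a, b = theta_g
    m = len(zs)
    ga = gb = 0.0
    for z in zs:
        d_fake = D(G(z, theta_g), theta_d)
        ga += -d_fake * w * z / m
        gb += -d_fake * w / m
    return (a - lr * ga, b - lr * gb)


def train_gan(n_iters=200, k=1, m=32, lr=0.05, seed=0):
    rng = random.Random(seed)
    theta_d, theta_g = (0.1, 0.0), (1.0, 0.0)
    for _ in range(n_iters):
        for _ in range(k):  # k discriminator updates per generator update
            zs = [rng.gauss(0.0, 1.0) for _ in range(m)]
            xs = [rng.gauss(4.0, 1.0) for _ in range(m)]
            theta_d = discriminator_step(xs, zs, theta_d, theta_g, lr)
        zs = [rng.gauss(0.0, 1.0) for _ in range(m)]
        theta_g = generator_step(zs, theta_d, theta_g, lr)
    return theta_d, theta_g
```

<p>Calling <code class="highlighter-rouge">train_gan()</code> returns the final discriminator and generator parameters. With such a simple model the two players may well oscillate rather than settle, so treat it as a way to see the update rules in action and not as a convergence demonstration.</p>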
Sun, 11 Jun 2017 06:21:00 -0500
/gans
/gansgansmachine-learningOn the necessary and sufficient conditions for optimal controllers
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: { equationNumbers: { autoNumber: "AMS" } }
});
</script>
<!--Mathjax Parser -->
<script type="text/javascript" async="" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
</script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<!-- ### <center>Optimal Controllers: </center> -->
<p>This post deals with understanding the necessary and sufficient conditions, fundamental Lipschitz continuity assumptions and the terminal boundary conditions imposed on the Hamilton-Jacobi equation to assure that the problem of minimizing an integral performance index is well-posed.</p>
<h4 id="problem-statement">Problem Statement</h4>
<p>Suppose we have the following nonlinear dynamical system</p>
<p>\begin{equation} \label{eq:system}
\dot{x} =f(x, u, t), \qquad \qquad x(t_0) = x_0
\end{equation}</p>
<p>which starts at state, \(x_0\) and time, \(t_0\).</p>
<h5 id="assumption-i"><strong>Assumption I</strong></h5>
<p>If the function \(f(\centerdot)\) is
continuously differentiable in all its arguments, then the initial value problem (IVP) of \eqref{eq:system} has a <u>unique solution</u> on a finite time interval; this is a sufficient assumption (Khalil, 1976).</p>
<h5 id="assumption-ii"><strong>Assumption II</strong></h5>
<p>\(T\) is small enough to reside within the time interval where the system’s solutions are defined.</p>
<p>Qualitatively, our goal is to <strong>optimally</strong> control the system when it starts in a state \(x_0\), at time \(t_0\), to a neighborhood of the terminal manifold at time \(T\), whilst expending as little control energy as possible. Quantitatively, we can define this goal in terms of a performance index defined thus:</p>
<p>\begin{equation} \label{eq:cost}
J = J(x(t_0), u(\centerdot), t_0) = \int\limits_{t=t_0}^{T} L\left(x(\tau), u(\tau), \tau\right)d\tau + V(x(T))
\end{equation}</p>
<p>where \(J\) is evaluated along the trajectories of the system \(x(t)\), based on an applied control \(u(\centerdot)|_{t_0 \le t \le T} \).
With \(L\left(x(\tau), u(\tau), \tau\right)\) as the instantaneous cost and \(V(x(T))\) as the terminal cost (both nonnegative functions of their arguments), we can think of \(J\) as the total control effort and state energy expended in driving the state from \(x_0\) to a neighborhood of the terminal manifold \(V(x(T)) = 0\).</p>
<p>The question to ask then is: given the performance index \(J\), how do we find a control law \(u^\star\) that is optimal along a unique state trajectory, \(x^\star\), in the interval \([t_0, T]\)? The optimal cost is the minimum over all the costs that an admissible control could incur, and it is attained when we implement the optimal control law \(u^\star\). Mathematically, we can express this cost as:</p>
<p>\begin{gather}
J^\star(x(t_0), t_0) = \int\limits_{t=t_0}^{T} L \left(x^{\star}(\tau), u^\star(\tau), \tau \right) d\tau + V(x^\star(T)) \\
= \min_{ u_{[t_0, T]}} J(x_0, u, t_0)
\end{gather}</p>
<p>Therefore, the optimal cost is a function of the starting state and time so that we can write:</p>
<p>\begin{equation}
J^\star(x(t_0), t_0) = \min_{ u_{[t_0, T]}} J(x(t_0), u(\centerdot), t_0) = \min_{ u_{[t_0, T]}} \int\limits_{t_0}^{T} L\left(x(\tau), u(\tau), \tau\right)d\tau + V(x(T))
\end{equation}</p>
<p>Now assume that we start at an arbitrary initial condition \(x\) at time \(t\). It follows that the optimal cost-to-go from \(x(t)\) to \(x(T)\) is (abusing notation and dropping the explicit arguments of \(J\)):</p>
<p>\begin{equation} \label{eq:cost-to-go}
J^\star(x, t) = \min_{ u_{[t, T]}} \left[\int\limits_{t}^{T} L\left(x(\tau), u(\tau), \tau\right)d\tau\right] + V(x(T))
\end{equation}</p>
<p>Things get a little bit interesting when we split the integral in \eqref{eq:cost-to-go} over two consecutive time intervals, \([t, t_1]\) and \([t_1, t_2]\) with \(t_2 = T\), namely:</p>
<p>\begin{equation} \label{eq:spliced}
J^\star(x, t) = \min_{ u_{[t, T]}} \left[\int\limits_{t}^{t_1} L\left(x(\tau), u(\tau), \tau\right)d\tau + \int\limits_{t_1}^{t_2} L\left(x(\tau), u(\tau), \tau\right)d\tau\right] + V(x(T))
\end{equation}</p>
<p>We can split the minimization over two time intervals, e.g.,</p>
<p>\begin{equation} \label{eq:two_mins}
J^\star(x, t) = \min_{ u_{[t, t_1]}} \min_{ u_{[t_1, t_2]}} \left[\int\limits_{t}^{t_1} L\left(x(\tau), u(\tau), \tau\right)d\tau + \int\limits_{t_1}^{t_2} L\left(x(\tau), u(\tau), \tau\right)d\tau\right] + V(x(T))
\end{equation}</p>
<p>Equation \eqref{eq:two_mins} gives the beautiful intuition that one can divide the integration into two or more time slices, solve the optimal control problem for each time slice and thereby minimize the cost function \(J\) of the overall system. This in essence is a statement of <a href="https://en.wikipedia.org/wiki/Richard_E._Bellman">Richard E. Bellman</a>’s principle of optimality:</p>
<blockquote>
<p>Bellman’s Principle of Optimality: An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.</p>
</blockquote>
<blockquote>
<p>– Bellman, Richard. Dynamic Programming, 1957, Chap. III.3.</p>
</blockquote>
<p>With the principle of optimality, the problem takes a more intuitive meaning, namely that the cost to go from \(x\) at time \(t\) to a terminal state \(x(T)\) can be computed by minimizing the sum of the cost to go from \(x = x(t)\) to \(x_1 = x(t_1)\) and then, the optimal cost-to-go from \(x_1\) onwards.</p>
<p>Therefore, \eqref{eq:two_mins} can be restated as:</p>
<p>\begin{equation} \label{eq:two_mins_sep}
J^\star(x, t) = \min_{ u_{[t, t_1]}} \left[\int\limits_{t}^{t_1} L\left(x(\tau), u(\tau), \tau\right)d\tau + \underbrace{\min_{ u_{[t_1, t_2]}} \int\limits_{t_1}^{t_2} L\left(x(\tau), u(\tau), \tau\right)d\tau + V(x(T))}_{J^\star(x_1, \, t_1)} \right]
\end{equation}</p>
<p>\(J^\star(x_1, \, t_1)\) in \eqref{eq:two_mins_sep} can be seen as the optimal cost-to-go from \(x_1\) to \(x(T)\), with the overall cost given by</p>
<p>\begin{equation} \label{eq:optimal_pre}
J^\star(x, t) = \min_{ u_{[t, t_1]}} \left[\int\limits_{t}^{t_1} L\left(x(\tau), u(\tau), \tau\right)d\tau + J^\star(x_1, \, t_1) \right]
\end{equation}</p>
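<p>The recursion above has a direct discrete-time analogue that is easy to run. The sketch below assumes a toy problem that is not part of the derivation: integer states with dynamics \(x_{k+1} = x_k + u_k\), controls \(u \in \{-1, 0, 1\}\), stage cost \(x^2 + u^2\) and terminal cost \(x^2\). It computes the cost-to-go by backward recursion and compares it with a brute-force search over every control sequence.</p>

```python
from functools import lru_cache
from itertools import product

CONTROLS = (-1, 0, 1)   # assumed admissible controls
HORIZON = 4             # assumed number of stages


def step(x, u):
    return x + u


def stage_cost(x, u):
    return x * x + u * u


def terminal_cost(x):
    return x * x


@lru_cache(maxsize=None)
def cost_to_go(x, k):
    """J*(x, k) = min_u [ stage cost + J*(next state, k + 1) ]."""
    if k == HORIZON:
        return terminal_cost(x)
    return min(stage_cost(x, u) + cost_to_go(step(x, u), k + 1)
               for u in CONTROLS)


def brute_force(x0):
    """Minimum total cost over all control sequences, with no recursion."""
    best = None
    for controls in product(CONTROLS, repeat=HORIZON):
        x, total = x0, 0
        for u in controls:
            total += stage_cost(x, u)
            x = step(x, u)
        total += terminal_cost(x)
        best = total if best is None else min(best, total)
    return best


print(cost_to_go(3, 0), brute_force(3))
```

<p>Both numbers agree, which is the principle of optimality at work: minimizing stage by stage loses nothing compared with minimizing over whole control sequences at once.</p>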
<p>Replacing \(t_1\) by \(t + \delta t\) and with the assumption that \(J^\star(x, t)\) is differentiable, we can expand \eqref{eq:optimal_pre} into a first-order Taylor series around \((x, t)\), with \(\delta x = x(t + \delta t) - x(t)\), as follows:</p>
<p>\begin{equation} \label{eq:taylor}
J^\star(x, t) = \min_{ u_{[t, t + \delta]}} \left[ L\left(x, u, \tau\right)\delta t + J^\star(x, \, t) + \left(\dfrac{\partial J^\star(x, t)}{\partial t}\right) \delta t + \left(\dfrac{\partial J^\star(x, t)}{\partial x} \right) \delta x + o(\delta) \right]
\end{equation}</p>
<p>where \(o(\delta)\) denotes higher order terms satisfying \(\lim_{\delta \rightarrow 0}\dfrac{o(\delta)}{\delta} = 0\).</p>
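<p>Cancelling \(J^\star(x, t)\) from both sides of \eqref{eq:taylor}, dividing through by \(\delta t\) and letting \(\delta t \rightarrow 0\), so that \(\delta x / \delta t \rightarrow \dot{x}\), we find that</p>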
<p>\begin{equation} \label{eq:hamiltonian_pre}
\dfrac{\partial J^\star(x, t)}{\partial t} = -\min_{ u_{[t, t + \delta]}} \left[ L\left(x, u, \tau\right) + \left(\dfrac{\partial J^\star(x, t)}{\partial x} \right) \underbrace{\dot{x}(\centerdot)}_{f(x,u,t)} \right]
\end{equation}</p>
<p>We shall define the terms in the square brackets of the above equation as the <strong>Hamiltonian</strong>, \(H(\centerdot)\) such that \eqref{eq:hamiltonian_pre} can be thus rewritten:</p>
<p>\begin{equation} \label{eq:hamiltonian}
\dfrac{\partial J^\star(x, t)}{\partial t} = -\min_{ u_{[t, t + \delta]}} H\left(x, \nabla_x J^\star (x, t), u, t \right)
\end{equation}</p>
<p>Given the smoothness assumptions on all function arguments in \eqref{eq:system},
and provided that \(u\) is unconstrained, the first-order sensitivity of the Hamiltonian to changes in \(u\),
\(\nabla H_u\), <strong>must</strong> vanish at the optimal point i.e.,</p>
<p>\begin{equation} \label{eq:hamiltonian_deri}
\nabla H_u(x, \nabla_x J^\star (x, t), u, t) = 0
\end{equation}</p>
<p>which is the first-order necessary condition for <strong>local optimality</strong> of the controller. In addition, if the Hessian of the Hamiltonian with respect to \(u\) is positive definite along the trajectories of the solution, i.e.,</p>
<p>\begin{equation}
\dfrac{\partial^2 H}{\partial u^2} > 0
\end{equation}</p>
<p>then \(u\) is a strict local minimizer of the Hamiltonian, which is the second-order sufficient condition. This is the strengthened <a href="https://en.wikipedia.org/wiki/Legendre%E2%80%93Clebsch_condition">Legendre-Clebsch</a> condition; on a singular arc, where the Hessian is only positive semidefinite, the generalized Legendre-Clebsch condition is needed to guarantee that the Hamiltonian is minimized.</p>
<p>You begin to see the beauty of optimal control in that \eqref{eq:hamiltonian_deri} allows us to translate the complicated functional minimization integral of \eqref{eq:cost} into a minimization problem that can be solved by ordinary calculus.</p>
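<p>As a small worked instance of this reduction, assume (purely for illustration; these choices are not part of the general derivation) a scalar system \(f = a x + b u\) with running cost \(L = q x^2 + r u^2\) and \(r > 0\). Writing \(J^\star_x\) for \(\partial J^\star / \partial x\), the Hamiltonian and its derivatives with respect to \(u\) are</p>
<p>\begin{equation}
H = q x^2 + r u^2 + J^\star_x \, (a x + b u), \qquad \dfrac{\partial H}{\partial u} = 2 r u + b J^\star_x = 0 \;\Rightarrow\; u^\star = -\dfrac{b}{2r} J^\star_x, \qquad \dfrac{\partial^2 H}{\partial u^2} = 2 r > 0. \nonumber
\end{equation}</p>
<p>The minimizing control therefore follows from ordinary differentiation, and the positive second derivative confirms that it is a minimum of the Hamiltonian.</p>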
<p>If we let</p>
<p>\begin{equation}
H^\star(x, \nabla_x J^\star (x, t), t) = \min_u \left[H(x, \nabla_x J^\star (x, t), u, t)\right]
\end{equation}</p>
<p>then it follows that solving \eqref{eq:hamiltonian_deri} for the optimal \(u = u^\star\) and substituting the result into \eqref{eq:hamiltonian}, one obtains the <em><strong>Hamilton-Jacobi-Bellman</strong></em> PDE whose solution is the optimal cost \(J^\star(x(t), t)\) such that</p>
<p>\begin{equation} \label{eq:optimal_cost}
\dfrac{\partial J^\star(x, t)}{\partial t} = -H^\star \left(x, \nabla_x J^\star (x, t), t \right)
\end{equation}</p>
<p>We can introduce a boundary condition that assures that the cost function of \eqref{eq:cost} is well-posed viz,</p>
<p>\begin{equation} \label{eq:boundary_cost}
J^\star(x(T), T) = V(x(T))
\end{equation}</p>
<p>Taken together, \eqref{eq:optimal_cost} is a partial differential equation that describes how the optimal cost of \eqref{eq:cost} evolves in time, and \eqref{eq:boundary_cost} supplies the terminal boundary condition needed to solve it. If we can solve these for \(J^\star(x,t)\), then the \(u^\star\) that minimizes the Hamiltonian in \eqref{eq:hamiltonian} constitutes the optimal control policy for the nonlinear dynamical system in \eqref{eq:system} given the cost index \eqref{eq:cost}.</p>
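<p>For a problem where the HJB equation can be solved in closed form, here is a minimal numerical check. It assumes a scalar linear-quadratic problem chosen only for illustration: \(\dot{x} = a x + b u\), \(L = q x^2 + r u^2\) and \(V(x) = p_T x^2\). Trying \(J^\star(x,t) = p(t) x^2\) in the HJB equation gives \(u^\star = -(b p / r) x\) and the scalar Riccati equation \(\dot{p} = -(q + 2ap - b^2 p^2 / r)\) with \(p(T) = p_T\). All function names below are hypothetical.</p>

```python
import math


def riccati_rhs(p, a, b, q, r):
    """dp/dt implied by the HJB equation for J*(x, t) = p(t) x^2."""
    return -(q + 2.0 * a * p - (b * b / r) * p * p)


def solve_riccati_backward(a, b, q, r, p_T, T, n_steps=1000):
    """Integrate p(t) from t = T back to t = 0 with classical RK4."""
    h = T / n_steps
    p = p_T  # boundary condition J*(x, T) = V(x) = p_T x^2
    for _ in range(n_steps):
        k1 = riccati_rhs(p, a, b, q, r)
        k2 = riccati_rhs(p - 0.5 * h * k1, a, b, q, r)
        k3 = riccati_rhs(p - 0.5 * h * k2, a, b, q, r)
        k4 = riccati_rhs(p - h * k3, a, b, q, r)
        p -= (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return p


def optimal_control(x, p, b, r):
    """u* = -(b p / r) x, obtained from dH/du = 0."""
    return -(b * p / r) * x


# With a = 0, b = q = r = 1 and p_T = 0 the exact solution is p(t) = tanh(T - t).
p0 = solve_riccati_backward(a=0.0, b=1.0, q=1.0, r=1.0, p_T=0.0, T=1.0)
print(p0, math.tanh(1.0))
```

<p>The numerical value of \(p(0)\) matches \(\tanh(1)\), and the resulting optimal cost from \(x_0\) is \(p(0)\, x_0^2\).</p>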
<h3 id="conclusions">Conclusions</h3>
<p>Notice that the optimal policy \(u^\star(t)\) is basically an open-loop control strategy. Why so? \(u^\star\) was derived as a function of time \(t\). As a result, the strategy may not be robust to uncertainties and may be very sensitive. For practical applications, we generally want to have a feedback control policy that is state dependent in order to guarantee robustness to parametric variations and achieve robust stability and performance. Such a \(u = u^\star(x)\) would be helpful in analyzing the stability of states and convergence of system dynamics to equilibrium for all future times. I will post such methods in the future.</p>
<h3 id="summary">Summary</h3>
<table>
<thead>
<tr>
<th style="text-align: left">Properties</th>
<th style="text-align: right">Equations</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Dynamics:</td>
<td style="text-align: right">\(\dot{x} =f(x, u, t), \quad x(t_0) = x_0 \)</td>
</tr>
<tr>
<td style="text-align: left">Cost:</td>
<td style="text-align: right">\(J(x,u,\tau) = \int\limits_{t=t_0}^{T} L\left(x(\tau), u(\tau), \tau\right)d\tau + V(x(T))\)</td>
</tr>
<tr>
<td style="text-align: left">Optimal cost :</td>
<td style="text-align: right">\(J^\star(x,t) = \min_{u[t,T]}J\)</td>
</tr>
<tr>
<td style="text-align: left">Hamiltonian:</td>
<td style="text-align: right">\(H(x,u,t) = L\left(x, u, t\right) + \left(\dfrac{\partial J^\star(x, t)}{\partial x} \right) f(x,u,t)\)</td>
</tr>
<tr>
<td style="text-align: left">Optimal Control:</td>
<td style="text-align: right">\(u^\star = \arg\min_u H(x,u,t)\), with \(\nabla H_u(x,u,t) = 0\)</td>
</tr>
<tr>
<td style="text-align: left">HJB Equation:</td>
<td style="text-align: right">\(-\dfrac{\partial J^\star(x,t)}{\partial t} = H^\star(x, \nabla_x J^\star(x,t),t)\) and \(J^\star(x(T), T) = V(x(T))\)</td>
</tr>
</tbody>
</table>
<h3 id="further-readings">Further Readings</h3>
<p><a href="https://web.archive.org/web/20050110161049/http://www.wu-wien.ac.at/usr/h99c/h9951826/bellman_dynprog.pdf">Richard Bellman: On The Birth Of Dynamic Programming</a></p>
<p><a href="https://www.amazon.com/Optimal-Control-Quadratic-Methods-Engineering/dp/0486457664">Optimal Control: Linear Quadratic Methods</a></p>
Sun, 04 Jun 2017 08:28:00 -0500
/optimal-control
/optimal-controlcontroloptimal-controlShould I use ROS or MuJoCo?<p>This was my answer to a question posted in an email thread to our research group’s email lists. The question goes like this:</p>
<p><br />
<br />
<strong>QUESTION</strong>
<br />
<strong>__</strong><strong>__</strong><strong>__</strong><strong>__</strong><strong>__</strong><strong>__</strong><em>__</em><br />
From: XXX@uni-x.edu <br />
Sent: Sunday, May 14, 2017 9:29 AM <br />
To: XXX@lists.uni-x.edu <br />
Subject: RE: [robotec] MuJoCo <br /></p>
<p>From the documentation, looks like MuJoCo is faster and therefore better to simulate computational intensive controllers like MPC. Gazebo provides other engines as well and seems like is more popular in the ROS community. How to choose between the available options? Which one would you recommend?</p>
<p>Thanks,<br />
XXX</p>
<p><strong>Answer</strong></p>
<p><strong>TL; DR:</strong>
If you do not care for accuracy of simulated controller numerical results, if you are not simulating parallel linkages, if you do not need code parallelization (i.e. your computation is not crazy intensive) or if you are not simulating contact and friction, I would choose ROS. Easy to use and straightforward to build fairly complex models.</p>
<h1 id="proper-long-answer"><strong>Proper (Long) Answer</strong></h1>
<p>ROS and Gazebo (OSRF tools) are indeed popular in the robotics community like you mentioned and they have their pros. It took me a while to see their limit when using them for research purposes.</p>
<p><strong>Definition</strong></p>
<p><code class="highlighter-rouge">ROS = Plumbing + Tools + Capabilities + Ecosystem</code></p>
<p><code class="highlighter-rouge">Plumbing</code>: ROS provides publish-subscribe messaging infrastructure designed to support the quick and easy construction of distributed computing systems.</p>
<p><code class="highlighter-rouge">Tools</code>: ROS provides an extensive set of tools for configuring, starting, introspecting, debugging, visualizing, logging, testing, and stopping distributed computing systems.</p>
<p><code class="highlighter-rouge">Capabilities</code>: ROS provides a broad collection of libraries that implement useful robot functionality, with a focus on mobility, manipulation, and perception.</p>
<p><code class="highlighter-rouge">Ecosystem</code>: ROS is supported and improved by a large community, with a strong focus on integration and documentation. ros.org is a one-stop-shop for finding and learning about the thousands of ROS packages that are available from developers around the world. answers.ros.org is a rich online community of ros packages users from around the world asking questions and getting help on how to use ROS.</p>
<p>So getting a simple dynamics simulation going should not be a lot of hassle as the documentation is rich and the online community is very active in supporting newbies.</p>
<p>In the early days, the plumbing, tools, and capabilities were tightly coupled, which has both advantages and disadvantages. On the one hand, by making strong assumptions about how a particular component will be used, developers are able to quickly and easily build and test complex integrated systems. On the other hand, users are given an “all or nothing” choice: to use an interesting ROS component, you pretty much had to jump in to using all of ROS.</p>
<p>More than 8 years after Andrew Ng and co. conceived the platform, the core system has matured considerably, and developers are hard at work refactoring code to separate plumbing from tools from capabilities, so that each may be used in isolation. In particular, people are aiming for important libraries that were developed within ROS to become available to non-ROS users in a minimal-dependency fashion (e.g. OMPL and PCL libraries).</p>
<p>Disclaimer: Borrowed from Brian Gerkey’s/my answer to a similar quora question about a year ago.</p>
<p>For serially linked robot arms and other non-parallel linkages, ROS is a great simulation tool and “middleware”. However, there are bottlenecks with ROS.</p>
<p>What ROS calls URDF (Unified Robot Description Format), which is the abstraction tool for rigid body dynamics, is not unified in any sense of the word. URDF models written in ROS are out-of-the-box incompatible with Gazebo, its sister physics engine (see this <a href="http://answers.gazebosim.org/question/14891/conversion-from-urdf-to-sdf-using-gzsdf-issues/">question/wiki</a>). Moreover, the kinematic structure in OSRF tools such as ROS is represented in a tree-like manner. I learnt this late last year when simulating parallel linkages. The internal ROS XML parser interprets constructed linkages as a tree and not as a graph, so a link cannot have two parent joints. This makes simulating parallel linkages almost (actually) impossible. Repeat, actually impossible. They have a fix for this in Gazebo SDF but it is not straightforward. So developers spend a huge chunk of time migrating code from one OSRF framework to another.</p>
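<p>To see why a tree cannot hold a parallel linkage, here is a toy check written in plain Python. It is not a URDF parser and uses no ROS API; the joint lists and the function name are made up for illustration. Each joint is a (parent link, child link) pair, and a description is a tree only if every link except the root has exactly one parent joint.</p>

```python
def is_kinematic_tree(joints):
    """Return True if the (parent, child) joints form a single rooted tree."""
    parents = {}
    links = set()
    for parent, child in joints:
        links.update((parent, child))
        if child in parents:
            return False  # a link with two parent joints closes a loop
        parents[child] = parent
    roots = links - set(parents)
    if len(roots) != 1:
        return False
    root = next(iter(roots))
    # Every link must reach the root without revisiting a link.
    for link in links:
        seen = set()
        while link != root:
            if link in seen:
                return False
            seen.add(link)
            link = parents[link]
    return True


# A serial arm: each link hangs off exactly one parent.
serial_arm = [("base", "shoulder"), ("shoulder", "elbow"), ("elbow", "wrist")]

# A four-bar (parallel) linkage: the rocker is attached to both the coupler
# and the ground, so it would need two parent joints.
four_bar = [("ground", "crank"), ("crank", "coupler"),
            ("coupler", "rocker"), ("ground", "rocker")]

print(is_kinematic_tree(serial_arm), is_kinematic_tree(four_bar))
```

<p>The serial arm passes and the four-bar linkage does not, which is the closed-loop situation that a tree-only description has no way to express.</p>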
<p>Good controller algorithm formulations are based on numerical optimization (think MPC, differential dynamic programming, sampling-based motion planning or reinforcement learning). Gazebo was designed around the ODE (Open Dynamics Engine) and Bullet physics engines, which provide the states in over-complete Cartesian coordinates and enforce joint constraints via numerical optimization. This is good enough for disconnected bodies with few joint constraints but becomes a pain for complex dynamics such as humanoids or human-robot interaction. Evaluating a large number of candidate controllers for a humanoid in simulation can run into months using ROS (e.g. Todorov’s de novo synthesis). MUJOCO, by contrast, is optimized for parallel processing: candidate controllers are evaluated in a distributed fashion, and one of them is then chosen.</p>
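<p>A toy illustration of why over-complete Cartesian coordinates need active constraint enforcement: the sketch below integrates a single pendulum twice with explicit Euler, once with the bob tracked in (x, y) plus a rod-length constraint and once with the joint angle alone. This is not how ODE or Bullet are implemented; the function names, step size and integrator are assumptions made purely for the example.</p>

```python
import math


def maximal_coordinate_drift(steps=1000, dt=0.001, g=9.81, length=1.0):
    """Pendulum bob tracked in Cartesian (x, y); returns the rod-length error.

    The constraint force is solved at the acceleration level with a Lagrange
    multiplier and is NOT stabilized, so integration error leaks into the
    constraint x^2 + y^2 = length^2.
    """
    x, y = length, 0.0      # start horizontal, at rest
    vx, vy = 0.0, 0.0
    for _ in range(steps):
        r2 = x * x + y * y
        # From p.a + v.v = 0 with a = (0, -g) + lam * p
        lam = (g * y - (vx * vx + vy * vy)) / r2
        ax, ay = lam * x, -g + lam * y
        x, y = x + vx * dt, y + vy * dt
        vx, vy = vx + ax * dt, vy + ay * dt
    return abs(math.hypot(x, y) - length)


def joint_coordinate_drift(steps=1000, dt=0.001, g=9.81, length=1.0):
    """Same pendulum tracked by its joint angle; returns the rod-length error."""
    theta, omega = math.pi / 2, 0.0   # same horizontal start
    for _ in range(steps):
        alpha = -(g / length) * math.sin(theta)
        theta, omega = theta + omega * dt, omega + alpha * dt
    x, y = length * math.sin(theta), -length * math.cos(theta)
    return abs(math.hypot(x, y) - length)


if __name__ == "__main__":
    print("Cartesian drift:", maximal_coordinate_drift())
    print("Joint-angle drift:", joint_coordinate_drift())
```

<p>In the joint-angle version the rod length holds by construction, up to floating-point rounding. In the Cartesian version it drifts unless something pulls the state back onto the constraint at every step, and that extra solve is the cost that grows with the number of joints.</p>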
<p>With ODE-based simulators, the controller ends up optimized to the engine. This lets the controller cheat during simulation, so the generated control laws may be physically unrealizable. Speed and accuracy? Controller optimization with MoveIt! (a motion planning framework from OSRF) is mostly done in single-threaded code, without explicit parallelization to make e.g. IK solutions faster. Implementing concurrency and multithreading is left to the user (this is a big no-no for someone not interested in software engineering).</p>
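<p>For a sense of what “left to the user” means in practice, here is a minimal sketch of evaluating candidate controllers concurrently and keeping the cheapest one. It uses a toy unit point mass with a PD law rather than MoveIt!, ROS or MUJOCO; <code class="highlighter-rouge">rollout_cost</code>, <code class="highlighter-rouge">best_controller</code> and the gain values are hypothetical. A thread pool is used for simplicity; for CPU-bound rollouts in CPython one would likely swap in a process pool.</p>

```python
from concurrent.futures import ThreadPoolExecutor


def rollout_cost(gains, steps=500, dt=0.01):
    """Simulate a unit point mass driven towards x = 0 by a PD law.

    gains: (kp, kd) tuple. Returns the accumulated squared position error.
    """
    kp, kd = gains
    x, v = 1.0, 0.0
    cost = 0.0
    for _ in range(steps):
        u = -kp * x - kd * v   # PD control force
        v += u * dt            # semi-implicit Euler: velocity first
        x += v * dt            # then position with the updated velocity
        cost += x * x * dt
    return cost


def best_controller(candidates, workers=4):
    """Evaluate all candidates concurrently and return (gains, cost) of the best."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        costs = list(pool.map(rollout_cost, candidates))
    best_index = min(range(len(candidates)), key=costs.__getitem__)
    return candidates[best_index], costs[best_index]


# Hypothetical candidate gains, for illustration only.
CANDIDATES = [(1.0, 0.5), (4.0, 2.0), (10.0, 6.0), (25.0, 10.0)]

if __name__ == "__main__":
    gains, cost = best_controller(CANDIDATES)
    print("best gains:", gains, "cost:", cost)
```

<p>Each rollout is independent of the others, which is what makes this kind of search easy to parallelize once the simulator itself is cheap and safe to run many times over.</p>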
<p>ROS is written strictly on the assumption that the user is running a Linux kernel, so users not familiar with Linux are taken aback when they first get exposed to it. With MUJOCO, you do not need Linux or OSX, as it works on Windows just fine. MUJOCO also uses an XML parser to interpret links and joints, so it is able to read ROS URDFs and xacro files okay. But it doesn’t work the other way (see <a href="http://www.mujoco.org/forum/index.php?threads/ros-gazebo-integration.3371/">this answer</a> from Todorov).</p>
<p>MPC implementations are elegant only when the model is accurate. Unexpected poor performance of an MPC controller will often be due to poor modeling assumptions (Rossiter). If the simulation engine emphasizes simulation stability over control law precision, we have a problem. And this is my problem with Gazebo and ROS generally. I read somewhere in one of Todorov’s papers (can’t remember where I found it) that the floating point ops of MUJOCO were unit tested to <code class="highlighter-rouge">>>355</code> decimal places. The OSRF community may be good at community-based software engineering for robotics, but you have to give Todorov the credit. He had the patience and tenacity to develop such robust software for control simulation. People stopped paying attention to floating point operational precision back in the late ’80s/’90s.</p>
<p>What’s more? MUJOCO allows you to write your models in C. Engineers are head over heels for Matlab, but I am all for a program or modeling software that stays as close to ones and zeros as possible. It means being less dumbfounded when things do not work as you envisioned, and greater flexibility in being the master and architect of your creation.</p>
Sun, 14 May 2017 08:28:00 -0500
/mujoco-ros
/mujoco-rosQ&Arosmujoco