Lekan Molu
Rerum Cognoscere Causas
http://scriptedonachip.com/
Wed, 10 Apr 2024 21:16:01 +0000
Jekyll v3.9.5
Optimality vs. Stability of Feedback Control Systems.
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: { equationNumbers: { autoNumber: "AMS" } }
});
</script>
<!--Mathjax Parser -->
<script type="text/javascript" async="" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
</script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<ul>
<li><a href="#table-o-conts">Table of Contents</a></li>
<li><a href="#intro">Introduction</a></li>
<li><a href="#case">Stability vs optimality</a></li>
<li><a href="#issues">Optimality</a></li>
</ul>
<p><a name="intro"></a></p>
<h3 id="intro">Intro</h3>
<p>The connection between optimality and stability is a curious thing. On the one hand, we are led to believe that if we can find an optimal control law that executes a plan with the least amount of energy possible, then we have fulfilled the specifications posed in our objective function and can be satisfied. But consider: an optimal controller is not necessarily a stabilizing controller for a system. Why so? I lay out my case in the next section.</p>
<p><a name="case"></a></p>
<h3 id="stability-vs-optimality">Stability vs optimality</h3>
<p>Systems under the influence of optimal control laws enjoy a nice set of properties, provided that the associated cost functional enforces a constraint that is desirable on the state and control. LQ optimal control systems have nice gain and phase margins coupled with reduced sensitivity and I understand that there are similar properties that have been shown for nonlinear systems. Optimal control has the attractive property that the control effort is not wasted in mitigating the effects of nonlinearities as it chooses among a set of policies (or stabilizing control sequences) that yield a desirable effect on the system. The intractability of the HJB equation however makes optimal control as a synthesis tool for nonlinear problems a painful one.</p>
<p>Enter Lyapunov stability. Lyapunov defines classical stability in terms of the system’s behavior near an equilibrium point: for every real \(\epsilon > 0\) there exists a real number \(\delta(\epsilon, t_0)>0\) such that motions starting within \(\delta\) of the equilibrium remain within \(\epsilon\) of it. This is essentially a local stability concept, a scalar bound that expresses how far away a system could ever get from the equilibrium (based on how far away it started). As engineers, we do not want to limit ourselves to this local stability context. We want every motion starting sufficiently close to the equilibrium state to converge to the equilibrium as time approaches infinity. Asymptotic stability captures this need. But asymptotic stability is also a local concept, since we do not know <em>a priori</em> how close to the equilibrium a motion must start. Enter equiasymptotic stability in the large. For an \(r>0\) that is fixed and arbitrarily large, we find that as \(t \rightarrow \infty\), all motions converge to the equilibrium uniformly in the initial state from which they start for \(|x_0| \le r \).</p>
<p>Note that all these definitions merely impose a constraint on the behavior of the states as they evolve over the trajectories of the system. That raises the question: can a control law be stabilizing, yet not optimal (or vice versa)? I think so. Why?</p>
<p><a name="issues"></a></p>
<h3 id="optimality">Optimality</h3>
<blockquote>
<p>This section has an update based on what I found in Freeman and Kokotovic’s 1996 paper in the SIAM Journal on Control and Optimization. Please skip to the <a href="#updatedOptStab">updated part</a>.</p>
</blockquote>
<p>Optimality, as Bellman would have us think, deals with reaching the goal state with as little energy as possible. I would think that the principle of optimality and Lyapunov stability have a fundamental disconnect. It seems to me that we may find an optimal control law that is not stabilizing (i.e., its associated function \(V(x)\) does not strictly decrease along the trajectories of the solution to the dynamic system’s differential equation).</p>
<p><del>To buttress this fact, consider that the concepts of stability and optimality appeared in the consciousness of control theorists at two distinct and disconnected eras (or so to say) in history. On the one hand, Lyapunov’s thesis got published in Russia in the 1890s but his work was not available in English until 1947. Even so, western researchers did not adequately grasp its usefulness until Kalman’s 1960 seminal paper on the second method of Lyapunov. Meanwhile, Bellman’s last formal work on DP and applied DP was not published until 1962. What is more intriguing is that nowhere in Bellman’s stability tests (as far as I can tell from what I have read from his books) did he use the rigor of Lyapunov analysis to establish the stability of his principle of optimality methods. Kalman remarked in his 1960 paper that few researchers were aware of Lyapunov methods. We can make a fairly accurate “guesstimation” that had Bellman been aware of Lyapunov’s analyses earlier, they might have crept into his optimality analyses.</del></p>
<p><del>I had an exchange with someone about this a while ago, and I am quoting the caveats they expressed in their agreement with my observation below.</del></p>
<blockquote>
<p><del>1) If optimality is concerned only with the cost from initial condition to final condition, a control law that makes the system unstable might be desirable as unstable systems tend to be very fast.</del></p>
</blockquote>
<blockquote>
<p><del>2) The problem is what happens when you reach the final condition? An unstable system will not stop there, but will overshoot the goal and go off to infinity. So you must have the ability to switch to a stabilizing controller when you reach the goal.</del></p>
</blockquote>
<blockquote>
<p><del>An example is in fighter aircraft. I understand that they become unstable during certain maneuvers such as tight turns so they can move very fast, but then “catch” themselves and stabilize before going too far from the equilibrium.</del></p>
</blockquote>
<p><a name="updatedOptStab"></a>
<strong>UPDATE [Aug 28, 2018]</strong></p>
<p>Most of the discussion below is drawn from Freeman and Kokotovic’s <sup id="fnref:Freeman_Kokotovic" role="doc-noteref"><a href="#fn:Freeman_Kokotovic" class="footnote" rel="footnote">1</a></sup> 1996 work on <em>pointwise min-norm control laws for robust control Lyapunov functions</em>.</p>
<p>They provide an optimality-based method for choosing a <em>stabilizing</em> control law once a robust control Lyapunov function (rclf) is known, without resorting to cancellation or domination of nonlinear terms. Control laws built on cancellation or domination do not necessarily possess the desirable properties of optimality and may lead to poor robustness and wasted control effort.</p>
<p>The value function for a meaningful optimal stabilization problem is a Lyapunov function for the closed-loop system.</p>
<ul>
<li>
<p>Every meaningful value function is a Lyapunov function (Freeman and Kokotovic, 1996).</p>
</li>
<li>
<p>Every Lyapunov function is a meaningful value function: every Lyapunov function for every stable closed-loop system is also a value function for a meaningful optimal stabilization problem.</p>
</li>
</ul>
<p>Both bullets above are important: the first helps with the analysis of the stability of an optimal feedback control system, while the second has implications for its synthesis.</p>
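<p>To make the first bullet concrete, here is a minimal sketch for a scalar LQ problem. The plant \(\dot{x} = ax + bu\), the cost integrand \(qx^2 + ru^2\), and the function names are my own illustrative assumptions, not an example from the paper. The value function \(V(x) = px^2\) comes from the scalar Riccati equation, and the sketch checks that it decreases along the closed loop, i.e. that it behaves as a Lyapunov function.</p>

```python
import math


def scalar_lqr(a: float, b: float, q: float, r: float):
    """Solve the scalar algebraic Riccati equation 2*a*p - (b*p)**2 / r + q = 0.

    Returns the positive root p, the feedback gain k (u = -k*x), and the
    closed-loop pole a - b*k.
    """
    root = math.sqrt(a * a + b * b * q / r)
    p = r * (a + root) / (b * b)   # positive solution of the Riccati equation
    k = b * p / r                  # optimal feedback gain
    pole = a - b * k               # algebraically equal to -root, hence negative
    return p, k, pole


def value_rate(x: float, a: float, b: float, p: float, k: float) -> float:
    """Time derivative of V(x) = p*x**2 along the closed loop x' = (a - b*k)*x."""
    return 2.0 * p * x * (a - b * k) * x


# Open-loop unstable plant x' = x + u with cost integrand x**2 + u**2.
p, k, pole = scalar_lqr(a=1.0, b=1.0, q=1.0, r=1.0)
```

<p>For this particular plant the value function is positive definite and its derivative along the closed loop is negative for every nonzero state, which is exactly what a Lyapunov function requires.</p>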
<ul>
<li>
<p>Every robust control Lyapunov function (rclf) is a meaningful upper value function</p>
<ul>
<li>Every rclf solves the Hamilton-Jacobi-Isaacs (HJI) equation associated with a meaningful game. For a known rclf, a feedback law that is optimal w.r.t. a meaningful cost functional can be constructed. In fact, this can be accomplished without solving the HJI equation for the upper value function and without constructing a cost functional, since the optimal feedback can be calculated directly from the rclf. Such control laws are called <em>pointwise min-norm</em> control laws and each one inherits the desirable properties of optimality because <em>every pointwise min-norm control law is optimal for a meaningful game</em>.</li>
</ul>
</li>
</ul>
<p>Essentially, this task is an <em>inverse optimal stabilization problem</em>. For LTI systems, the solution involves choosing a candidate value function and then constructing a meaningful cost functional that makes the HJB equation valid. For open-loop stable nonlinear systems, one can find a solution by choosing the candidate value function as a Lyapunov function for the open-loop system. For open-loop <em>unstable</em> systems, one can choose the candidate value function as a control Lyapunov function (clf) for the system. In fact, Freeman and Kokotovic <sup id="fnref:Freeman_Kokotovic:1" role="doc-noteref"><a href="#fn:Freeman_Kokotovic" class="footnote" rel="footnote">1</a></sup> solve the inverse optimal <em>robust</em> stabilization problem for systems with disturbances and show that every rclf is an upper value function for a meaningful differential game.</p>
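<p>The pointwise min-norm idea can be sketched on a toy problem. Everything specific here is an assumption of mine for illustration: a disturbance-free scalar plant \(\dot{x} = x - x^3 + u\), the candidate clf \(V(x) = x^2/2\), and the required decay rate \(\sigma(x) = x^2/2\). At each state the law picks the smallest \(|u|\) that achieves \(\dot{V} \le -\sigma(x)\), so it applies no control where the \(-x^3\) term already does the job.</p>

```python
def min_norm_control(x: float) -> float:
    """Pointwise min-norm control for x' = f(x) + u with f(x) = x - x**3.

    Uses the clf V(x) = x**2 / 2 and demands V' <= -sigma(x), sigma(x) = x**2 / 2.
    """
    lf_v = x * (x - x**3)        # L_f V = V'(x) * f(x)
    lg_v = x                     # L_g V = V'(x) * g(x), with g(x) = 1
    sigma = 0.5 * x * x          # required decay rate (an illustrative choice)
    psi = lf_v + sigma           # amount by which the drift alone misses the requirement
    if psi <= 0.0:
        return 0.0               # the drift alone already gives V' <= -sigma
    return -psi / lg_v           # smallest |u| that achieves V' = -sigma


def simulate(x0: float, dt: float = 0.01, steps: int = 2000) -> float:
    """Forward-Euler rollout of the closed loop; returns the final state."""
    x = x0
    for _ in range(steps):
        x += dt * (x - x**3 + min_norm_control(x))
    return x
```

<p>Far from the origin the control is identically zero, which is the "no wasted effort" property in action; a cancellation-based design would instead spend effort removing the helpful \(-x^3\) term.</p>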
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:Freeman_Kokotovic" role="doc-endnote">
<p>Freeman, R. A., & Kokotovic, P. V. (1996). Inverse Optimality in Robust Stabilization. SIAM Journal on Control and Optimization, 34(4), 1365–1391. https://doi.org/10.1137/S0363012993258732 <a href="#fnref:Freeman_Kokotovic" class="reversefootnote" role="doc-backlink">↩</a> <a href="#fnref:Freeman_Kokotovic:1" class="reversefootnote" role="doc-backlink">↩<sup>2</sup></a></p>
</li>
</ol>
</div>
Wed, 22 Aug 2018 13:10:00 +0000
http://scriptedonachip.com/opti-stable
control, stability, optimality
Control Commons
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: { equationNumbers: { autoNumber: "AMS" } }
});
</script>
<!--Mathjax Parser -->
<script type="text/javascript" async="" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
</script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<ul>
<li><a href="#table-o-conts">Table of Contents</a></li>
<li><a href="#intro">Introduction</a></li>
<li><a href="#defs">Definitions, Theorems, Lemmas etc</a></li>
<li><a href="#nlnr">Nonlinear Control Theory</a></li>
<li><a href="#stab">Stability</a></li>
</ul>
<p><a name="intro"></a></p>
<h3 id="intro">Intro</h3>
<p>Here are a few control theorems, concepts and diagrams that I think every control student should know. I keep updating this post, so please check back from time to time.</p>
<p><a name="defs"></a></p>
<h3 id="definitions-theorems-lemmas-and-such">Definitions, Theorems, Lemmas and such.</h3>
<p><a name="nlnr"></a></p>
<h4 id="nonlinear-control-theory">Nonlinear Control Theory</h4>
<ul>
<li>A differential equation of the form</li>
</ul>
<p>\begin{align}
dx/dt = f(x, u(t), t), \quad -\infty < t < +\infty
\label{eq:diff_eq}
\end{align}</p>
<p>is said to be free (or unforced) if \(u(t) \equiv 0\) for all \(t\). That is, \eqref{eq:diff_eq} becomes</p>
<p>\begin{align}
dx/dt = f(x, t), \quad -\infty < t < +\infty
\label{eq:unforced}
\end{align}</p>
<ul>
<li>If the differential equation in \eqref{eq:diff_eq} does not have an explicit dependence on time, but has an implicit dependence on time, through \(u(t)\), then the system is said to be stationary. In other words, a dynamic system is <strong>stationary</strong> if</li>
</ul>
<p>\begin{align}
f(x, u(t), t) \equiv f(x, u(t))
\label{eq:stationary}
\end{align}</p>
<ul>
<li>A stationary system \eqref{eq:stationary} that is free is said to be <em>invariant under time translation</em>, i.e.</li>
</ul>
<p>\begin{align}
\Phi(t; x_0, t_0) = \Phi(t + \tau; x_0, t_0 + \tau)
\label{eq:free_stat}
\end{align}</p>
<ul>
<li>\(\Phi(t; x_0, t_0)\) is the analytical solution to \eqref{eq:diff_eq}; it is generally interpreted as the solution of \eqref{eq:diff_eq}, with fixed \(u(t)\), going through state \(x_0\) at time \(t_0\) and observed at a later time \(t\). This is a clearer way of representing the solution of the differential equation than \(x(t)\), which most texts use nowadays.</li>
</ul>
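<p>Invariance under time translation is easy to check numerically. The sketch below is only an illustration under my own assumptions: a hand-written RK4 integrator stands in for the transition function \(\Phi\), and the two scalar vector fields are made-up examples, one stationary and one with explicit time dependence.</p>

```python
import math


def flow(f, t, x0, t0, steps=1000):
    """Approximate the transition function Phi(t; x0, t0) of dx/dt = f(x, t) with RK4."""
    h = (t - t0) / steps
    x, s = x0, t0
    for _ in range(steps):
        k1 = f(x, s)
        k2 = f(x + 0.5 * h * k1, s + 0.5 * h)
        k3 = f(x + 0.5 * h * k2, s + 0.5 * h)
        k4 = f(x + h * k3, s + h)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        s += h
    return x


def stationary(x, t):
    return -x + math.sin(x)      # free system with no explicit dependence on t


def time_varying(x, t):
    return -x + math.sin(t)      # explicit dependence on t


x0, t0, t, tau = 1.0, 0.0, 2.0, 5.0
# |Phi(t; x0, t0) - Phi(t + tau; x0, t0 + tau)| for each vector field
shift_stationary = abs(flow(stationary, t, x0, t0) - flow(stationary, t + tau, x0, t0 + tau))
shift_time_varying = abs(flow(time_varying, t, x0, t0) - flow(time_varying, t + tau, x0, t0 + tau))
```

<p>The stationary field gives the same state whether the clock starts at \(t_0\) or \(t_0 + \tau\), while the time-varying one does not.</p>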
<ul>
<li>
<p>\(\Phi(\cdot)\) is generally referred to as the transition function, since it describes the transition from \(x(t_0)\) to \(x(t)\).</p>
</li>
<li>
<p>For a physical system, \(\Phi\) has to be <em>continuous in all of its arguments</em>.</p>
</li>
<li>
<p>If the rate of change \(dE(x)/dt\) of the energy \(E(x)\) of an isolated physical system is negative for every possible state \(x\), except for a single equilibrium state \(x_e\), then the energy will continually decrease until it finally assumes its minimum value \(E(x_e)\).</p>
</li>
<li>The <strong>first method of Lyapunov</strong> deals with questions of stability using an explicit representation of the solutions of a differential equation
<ul>
<li>Note that the <strong>second method</strong> is more of a historical misnomer, perhaps more accurately described as a philosophical point of view rather than a systematic method. Successful application requires the user’s ingenuity.</li>
</ul>
</li>
<li>
<p>Contrary to popular belief, the energy of a system and a Lyapunov function are not the same thing. Why? Because <strong>the Lyapunov function, \(V(x)\), is not unique</strong>. To quote Kalman, “a system whose energy \(E\) decreases <em>on the average</em>, but not necessarily at each instant, is stable but \(E\) is not necessarily a Lyapunov function.”</p>
</li>
<li>
<p><strong>Lyapunov analysis and optimization</strong>: Suppose a performance index is defined to be the error criterion between a measured and an estimated signal; suppose further that this criterion is integrated w.r.t time, then the performance index is actually a Lyapunov function – provided that the error is not identically zero along any trajectory of the system.</p>
</li>
<li>
<p><strong>Existence, uniqueness, and continuity theorem</strong>:</p>
<p>Let \(f(x, t)\) be continuous in \(x,t\), and satisfy a Lipschitz condition in some region \(R(x_0, t_0)\) about any state \(x_0\) and time \(t_0\):</p>
<p>\begin{align}
R(x_0, t_0) = \left\{ (x, t) : ||x - x_0|| \le b(x_0), \; |t - t_0| \le c(t_0) \right\}, \quad b, c > 0
\end{align}</p>
</li>
</ul>
<p>The Lipschitz condition requires that, for all \((x,t), (y,t)\) \(\in\) \(R(x_0, t_0)\),
\begin{align}
||f(x,t) - f(y,t)|| \le k \, ||x-y|| \nonumber
\end{align}</p>
<p>where \(k>0\) depends on \(b, c\). Then,</p>
<ul>
<li>
<p>there exists a unique solution \(\Phi(t; x_0, t_0)\) of \eqref{eq:unforced} that starts at \(x_0\) at time \(t_0\), for all \(|t - t_0| \le a(t_0)\),</p>
</li>
<li>
<p>\(a(t_0) \ge \min\{c(t_0),\, b(x_0)/M(x_0, t_0)\}\), where \(M(x_0, t_0)\) is the maximum assumed by the continuous function \(||f(x,t)||\) in the closed, bounded set \(R(x_0, t_0)\),</p>
</li>
<li>
<p>in some small neighborhood of \(x_0, t_0\), the solution is continuous in its arguments</p>
</li>
</ul>
<p>Observe that the Lipschitz condition implies continuity of \(f\) in \(x\) but not necessarily in \(t\); the condition is itself implied by bounded derivatives of \(f\) in \(x\). Note that the local Lipschitz condition required by the theorem only guarantees the desired properties of a solution near \(x_0, t_0\).</p>
<p>The <em>finite escape time</em> quandary (that is, the solution leaves any compact set within a finite time) does not allow us to draw conclusions about arbitrarily large values of \(t\). The phrase “<strong>finite escape time</strong>” describes a trajectory that escapes to infinity in finite time. <strong>In order that a differential equation accurately represent a physical system, the possibility of finite escape time has to be excluded by an explicit assumption to the contrary.</strong> If the Lipschitz condition holds for \(f\) everywhere, then there can be no finite escape time. The proof is easy: integrate both sides of \eqref{eq:unforced} and, for \(t \ge t_0\), use</p>
<p>\begin{align}
||\Phi(t; x_0, t_0)|| \le ||x_0|| + \left|\left| \int_{t_0}^{t}f(\Phi(\tau; x_0, t_0), \tau)d\tau \right|\right|
\end{align}</p>
<p>\begin{align}
\quad \le ||x_0|| + k \int_{t_0}^{t} ||\Phi(\tau; x_0, t_0)|| d\tau
\end{align}</p>
<p>where the second inequality assumes that the origin is an equilibrium, \(f(0, t) = 0\), so that setting \(y = 0\) in the Lipschitz condition below gives \(||f(x,t)|| \le k \, ||x||\):</p>
<p>\begin{align}
||f(x,t) - f(y,t)|| \le k \, ||x-y||. \nonumber
\end{align}</p>
<p>By the Gronwall-Bellman lemma,</p>
<p>\begin{align}
||\Phi(t; x_0, t_0) || \le [\exp \, k (t - t_0)] ||x_0 || \nonumber
\end{align}</p>
<p>which is less than \(\infty \) for any finite \((t - t_0)\).</p>
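<p>A small numerical sketch may help separate the two cases. The vector fields here are my own illustrative choices: \(\dot{x} = x^2\) is not globally Lipschitz and its closed-form solution \(x_0/(1 - x_0 t)\) blows up at \(t = 1/x_0\), whereas \(\dot{x} = k\sin(x)\) is globally Lipschitz with constant \(k\), vanishes at the origin, and should respect the exponential bound above (taking \(t_0 = 0\)).</p>

```python
import math


def escape_solution(x0: float, t: float) -> float:
    """Closed-form solution of dx/dt = x**2, x(0) = x0 > 0, valid for t < 1/x0."""
    return x0 / (1.0 - x0 * t)


def euler_flow(f, x0: float, t: float, steps: int = 10000) -> float:
    """Forward-Euler approximation of Phi(t; x0, 0) for dx/dt = f(x)."""
    h = t / steps
    x = x0
    for _ in range(steps):
        x += h * f(x)
    return x


k = 2.0


def lipschitz_field(x: float) -> float:
    return k * math.sin(x)       # |f(x) - f(y)| <= k|x - y| everywhere, and f(0) = 0


x0, t = 0.5, 3.0
state = euler_flow(lipschitz_field, x0, t)
bound = math.exp(k * t) * abs(x0)   # Gronwall-Bellman bound with t0 = 0
```

<p>The bound is far from tight for this example, but it is finite for every finite \(t\), which is all the argument needs.</p>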
<p><a name="stab"></a></p>
<h3 id="stability">Stability</h3>
<p>My definitions follow from R.E Kalman’s 1960 seminal paper since they are clearer to understand compared to the myriad of definitions that exist in many texts today. <strong>Stability concerns the deviation about some fixed motion</strong>. So, we will be considering the deviations from the equilibrium state \(x_e\) of a free dynamic system.</p>
<p>Simply put, here is how Kalman defines stability: if the system \eqref{eq:diff_eq} is slightly perturbed from its equilibrium state at the origin, all subsequent motions remain in a correspondingly small neighborhood of the origin. Harmonic oscillators are a good example of this kind of stability. <strong>Lyapunov</strong> himself defines stability like so:</p>
<ul>
<li>
<p>An equilibrium state \(x_e\) of a free dynamic system is <em>stable</em> if for every real number \(\epsilon>0\), there exists a real number \(\delta(\epsilon, t_0)>0\) such that \(||x_0 - x_e|| \le \delta\) implies</p>
</li>
</ul>
<p>\begin{align}
||\Phi(t; x_0, t_0) - x_e|| \le \epsilon \quad \forall \quad t \ge t_0 \nonumber
\end{align}</p>
<p>This is best imagined from the figure below:</p>
<div class="fig figcenter fighighlight">
<img src="/assets/control/stability.png" width="100%" height="450" align="middle" />
<div class="figcaption" align="middle">Fig. 1. The basic concept of stability. Courtesy of R.E. Kalman
</div>
</div>
<p>Put differently, the system trajectory can be kept arbitrarily close to the origin/equilibrium if we start the trajectory sufficiently close to it. If there is stability at some initial time, \(t_0\), there is stability for any other initial time \(t_1\), provided that all motions are continuous in the initial state.</p>
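<p>The harmonic oscillator makes the \(\epsilon\)-\(\delta\) definition tangible. This is a minimal sketch under my own assumptions: unit natural frequency, the exact rotation-matrix solution as the transition function, and the choice \(\delta = \epsilon\), which works here because the flow preserves the Euclidean norm.</p>

```python
import math


def oscillator_flow(t: float, x0, t0: float = 0.0):
    """Exact transition function of x1' = x2, x2' = -x1 (equilibrium at the origin)."""
    c, s = math.cos(t - t0), math.sin(t - t0)
    return (c * x0[0] + s * x0[1], -s * x0[0] + c * x0[1])


def norm(x) -> float:
    return math.hypot(x[0], x[1])


epsilon = 0.1
delta = epsilon                       # works because the flow preserves the norm
x0 = (0.6 * delta, -0.8 * delta)      # an initial state with ||x0|| = delta
distances = [norm(oscillator_flow(0.01 * n, x0)) for n in range(10000)]
worst, closest = max(distances), min(distances)
```

<p>The motion never leaves the \(\epsilon\)-ball, so the origin is stable, but it never gets any closer to the origin either, so this kind of stability is not asymptotic.</p>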
<ul>
<li>Asymptotic stability: The requirement that we start sufficiently close to the origin and stay in the neighborhood of the origin is a rather limiting one in most practical engineering applications. We would want to require that our motion should return to equilibrium after any small perturbation. Thus, the classical definition of asymptotic stability is
<ul>
<li>an equilibrium state \(x_e\) of a free dynamic system is <em>asymptotically stable</em> if
<ul>
<li>it is stable and</li>
<li>every motion starting sufficiently near \(x_e\) converges to \(x_e\) as \(t \rightarrow \infty\).</li>
</ul>
</li>
<li>put differently, there is some real constant \(r(t_0)>0\) and to every real number
\(\mu > 0\) there corresponds a real number \(T(\mu, x_0, t_0)\) such that \(||x_0 - x_e|| \le r(t_0)\) implies</li>
</ul>
<p>\begin{align}
||\Phi(t; x_0, t_0) - x_e|| \le \mu \quad \forall \quad t \ge t_0 + T \nonumber
\end{align}</p>
<div class="fig figcenter fighighlight">
<img src="/assets/control/asymptotic_stability.png" width="80%" height="350" align="middle" />
<div class="figcaption" align="left">Fig. 2. Definition of asymptotic stability. Courtesy of R.E. Kalman
</div>
</div>
</li>
</ul>
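<p>For a concrete feel for \(T(\mu, x_0, t_0)\), take the free system \(\dot{x} = -x\), an example I am assuming purely for illustration. Its transition function is \(x_0 e^{-(t - t_0)}\), so a motion starting at distance \(d\) from the origin is within \(\mu\) after a time \(\ln(d/\mu)\). The helper names below are hypothetical.</p>

```python
import math


def flow(t: float, x0: float, t0: float = 0.0) -> float:
    """Exact transition function of the free system dx/dt = -x (equilibrium x_e = 0)."""
    return x0 * math.exp(-(t - t0))


def settling_time(mu: float, dist: float) -> float:
    """A time T after which a motion starting at distance `dist` stays within mu."""
    if dist <= mu:
        return 0.0               # already inside the mu-ball
    return math.log(dist / mu)


r, mu = 10.0, 1e-3
T = settling_time(mu, r)         # computed for the largest starting distance r
```

<p>Because \(T\) was computed from the largest starting distance \(r\), the same \(T\) serves every initial state with \(|x_0| \le r\), and for this linear example \(r\) can be taken as large as we like.</p>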
<p>Asymptotic stability is also a local concept since we do not know beforehand how small \(r(t_0)\) should be. Equiasymptotic stability adds uniformity: for motions starting at the same distance from \(x_e\), none will remain at a larger distance than \(\mu\) from \(x_e\) at arbitrarily large values of time. Or to use Massera’s definition:</p>
<ul>
<li>An equilibrium state \(x_e\) of a free dynamic system is <em>equiasymptotically stable</em> if
<ul>
<li>it is stable</li>
<li>
<p>every motion starting sufficiently near \(x_e\) converges to \(x_e\), as \(t \rightarrow \infty\), uniformly in \(x_0\)</p>
</li>
<li>
<p>Interrelations between stability concepts: This I gleaned from Kalman’s 1960 paper on the second method of Lyapunov.</p>
<div class="fig figcenter fighighlight">
<img src="/assets/control/control_concepts.png" width="100%" height="450" align="middle" />
<div class="figcaption" align="middle">Fig. 3. Interrelations between stability concepts. Courtesy of R.E. Kalman
</div>
</div>
</li>
</ul>
</li>
<li>For <em>linear systems</em>, stability is independent of the distance of the initial state from \(x_e\). This is captured by the following definition:
<ul>
<li>an equilibrium state \(x_e\) of a free dynamic system is <em>asymptotically (equiasymptotically) stable in the large</em> if
(i) it is stable</li>
</ul>
<p>(ii) every motion converges to \(x_e\) as \(t \rightarrow \infty \) (for the equiasymptotic version, every motion converges to \(x_e\) uniformly in \(x_0\) for \(||x_0|| \le r\), where \(r\) is fixed but arbitrarily large).</p>
</li>
</ul>
<p><strong>To be Continued</strong></p>
Sat, 04 Aug 2018 13:10:00 +0000
http://scriptedonachip.com/control-commons
control, stability, nonlinear-control
What Good Research is Not
<p>This post includes curated links to advice on doing good research from people I respect.</p>
<p>I hope you find time to enjoy reading them.</p>
<ul>
<li>
<p><a href="https://www.cc.gatech.edu/~parikh/citizenofcvpr/static/slides/freeman_how_to_write_papers.pdf">How to write a good research paper ~ Bill Freeman</a>.</p>
<blockquote>
<p>Many readers will skim over formulas on their first reading of your exposition. Therefore, your sentences should flow smoothly when all but the simplest formulas are replaced by “blah” or some other grunting noise.</p>
</blockquote>
</li>
<li>
<p><a href="http://people.csail.mit.edu/billf/publications/How_To_Do_Research.pdf">How to do good research ~ Bill Freeman</a>.</p>
<blockquote>
<p>Sometimes it’s useful to think that everyone else is an idiot. This lets you do things that no one else is doing. It’s best not to be too vocal about that. You can say something like “Oh, I just thought I’d try out this direction”.</p>
</blockquote>
</li>
<li>
<p><a href="http://people.csail.mit.edu/billf/talks/10minFreeman2013.pdf">Elements of a Successful Graduate Career</a>.</p>
<blockquote>
<p>I think the most important thing in research is a story – not a theorem or an algorithm – but the story that makes the theorem or algorithm interesting and exciting. It’s important to have an “ear” for a good story… when do the stories make sense, when are they bogus? ~ Tomas Lozano-Perez.</p>
</blockquote>
<blockquote>
<p>The best students are possessed by a problem. They’re independent. They teach their advisors. They don’t do what they’re told…they do something more interesting. ~ Leslie Kaelbling.</p>
</blockquote>
<blockquote>
<p>Don’t tell your advisor you’re doing what they advised against until you’ve solved the problem. ~ Manolis Kellis.</p>
</blockquote>
<blockquote>
<p>Which brings us to the moral of the story: More important than your thesis topic is who your advisor is. ~ Charles Leiserson.</p>
</blockquote>
<blockquote>
<p>Eat, sleep, and breathe a problem until you crack it. Become the world’s foremost expert on your thesis topic. Surpass your advisor. ~ Daniel Jackson.</p>
</blockquote>
</li>
<li>
<p><a href="http://www.ai.mit.edu/courses/6.899/papers/ted.htm">On how to write papers, Ted Adelson</a>.</p>
<blockquote>
<p>Start by stating which problem you are addressing, keeping the audience in mind. They must care about it, which means that sometimes you must tell them why they should care about the problem. Then state briefly what the other solutions are to the problem, and why they aren’t satisfactory. If they were satisfactory, you wouldn’t need to do the work. Then explain your own solution, compare it with other solutions, and say why it’s better. At the end, talk about related work where similar techniques and experiments have been used, but applied to a different problem. Since I developed this formula, it seems that all the papers I’ve written have been accepted.</p>
</blockquote>
</li>
</ul>
<h3 id="vladen-koltuns-advice"><a href="https://www.cc.gatech.edu/~parikh/citizenofcvpr/static/slides/koltun_doing_good_research.pdf">Vladlen Koltun’s Advice</a></h3>
<ul>
<li>Picking a problem
<ul>
<li>Formulate a larger goal.</li>
<li>Personally meaningful.</li>
<li>Fits into the scheme of collective progress.</li>
</ul>
</li>
<li>Analyze bottlenecks.</li>
<li>
<p>Understand the state of the art.</p>
</li>
<li>Making a contribution:
<ul>
<li>Read the papers.</li>
<li>Look for unwarranted assumptions.</li>
<li>What are the limitations? When will this break? How could this be done better?</li>
</ul>
</li>
<li>Reimplement a state-of-the-art technique
<ul>
<li>Reproduce the results.</li>
<li>Then bombard it with controlled experiments.</li>
<li>Look for surprises, cracks that lead to deeper realizations.</li>
</ul>
</li>
<li>Be on the lookout for interesting contributions.</li>
<li>Many important findings are not what the researchers set out to find
<ul>
<li>“Scheele happened upon chlorine while trying to isolate manganese; Claude Bernard planned experiments to characterize the destructive agent in sugar but instead discovered the glycogenic function of the liver; and so on.”
~ Ramón y Cajal, Advice for a Young Investigator</li>
</ul>
</li>
<li>Publications
<ul>
<li>Quality, not quantity.</li>
<li>Do not compromise on methodology or ethics.</li>
<li>Be willing to bury drafts and move on.</li>
</ul>
</li>
<li>Publication portfolio
<ul>
<li>Most are prosaic.</li>
<li>Some are significant.</li>
<li>None are sloppy.</li>
</ul>
</li>
<li>High standards
<ul>
<li>Bury the weak, boring, and sloppy results.</li>
<li>Weak and sloppy work is a drain on the community. Can mislead. Goes against the goal of contributing something useful to the community.</li>
<li>Quantity is easy. The community doesn’t need more quantity.</li>
</ul>
</li>
<li>Research over time
<ul>
<li>Research begets research.</li>
<li>Keep track of favorite problems, revisit occasionally.</li>
<li>Go back to the larger goals.</li>
<li>Read. A lot.</li>
<li>Write down ideas. Talk to people.</li>
<li>Quiet time for reading, writing, thinking.</li>
</ul>
</li>
<li>For Pete’s sake, get a good work ethic
<ul>
<li>
<p>I do not believe a person can ever leave their business. They ought to think of it by day and dream of it by night. […] if they intend to go forward and do anything, the whistle is only a signal to start thinking over the day’s work in order to discover how it might be done better. […] The person who has the largest capacity for work and thought is the person who is bound to succeed.
~ Henry Ford, My Life and Work.</p>
</li>
<li>
<p>In science as in the lottery, luck favors those who wager the most – that is, by another analogy, those who are tilling constantly the ground in their garden.
~ Ramón y Cajal, Advice for a Young Investigator.</p>
</li>
<li>
<p>Successful people exhibit more activity, more energy, than most people do. They look more places, they work harder, they think longer than less successful people. Knowledge and ability are much like compound interest – the more you do the more you can do, and the more opportunities are open for you.
~ Hamming, Striving for Greatness in All You Do.</p>
</li>
</ul>
</li>
</ul>
<h3 id="david-mermin"><a href="http://www.ai.mit.edu/courses/6.899/papers/mermin.pdf">David Mermin</a>.</h3>
<p>Always punctuate your equations. Math is prose. Number all equations in your text. It helps your readers.</p>
<h3 id="more-advice">More Advice.</h3>
<h4 id="daniel-liberzon-research-quotes"><a href="http://liberzon.csl.illinois.edu/quote-research.html">Daniel Liberzon Research Quotes</a>.</h4>
<h4 id="how-to-write-mathematics"><a href="https://sites.math.washington.edu/~lind/Resources/Halmos.pdf">How to write Mathematics</a>.</h4>
<p><a href="downloads/Halmos.pdf">Alternative downloads website</a></p>
<h4 id="daniel-liberzon---how-to-write-a-good-paper"><a href="http://liberzon.csl.illinois.edu/writing-guidelines.html">Daniel Liberzon - How to write a good paper</a>.</h4>
<h4 id="daniel-liberzon---how-to-peer-review"><a href="http://liberzon.csl.illinois.edu/peer-review.pdf">Daniel Liberzon - How to peer review</a>.</h4>
<h4 id="george-m-whitesides--writing-a-paper"><a href="https://onlinelibrary.wiley.com/doi/pdf/10.1002/adma.200400767">George M. Whitesides – Writing a Paper</a>.</h4>
<h4 id="dmitri-bertsekas--ten-simple-rules-for-mathematical-writing"><a href="http://newslab.ece.ohio-state.edu/for%20students/resources/tenrules.pdf">Dmitri Bertsekas – Ten Simple Rules for Mathematical Writing</a>.</h4>
<h4 id="don-knuth-mathematical-writing"><a href="http://jmlr.csail.mit.edu/reviewing-papers/knuth_mathematical_writing.pdf">Don Knuth – Mathematical Writing</a>.</h4>
<h4 id="don-knuth--the-elements-of-mathematical-writing"><a href="https://www.mendeley.com/viewer/?fileId=36fa79c8-f907-0861-18b9-db563e2ef45f&documentId=7933d91e-f248-3103-9132-4657a82411f2">Don Knuth – The Elements of Mathematical Writing</a>.</h4>
<h4 id="n-david-mermin--whats-wrong-with-these-equations"><a href="https://www.mendeley.com/viewer/?fileId=6a8199f8-2d25-e7ef-28ac-579137975e93&documentId=779c442f-635a-3d6a-b64f-7ee8315d8aa2">N. David Mermin – What’s Wrong With These Equations</a>.</h4>
Thu, 02 Aug 2018 14:21:00 +0000
http://scriptedonachip.com/good-research
research, good-research
What's behind IEEE RAS Best Conference Papers?
<p>Through the looking-glass, tired of working my brain out while familiarizing myself with a giant codebase I was reverse-engineering to prove a theory for an upcoming conference, I started asking myself why I was doing what I was doing. Just to get a paper out? Or to make a great contribution to science and my field? I googled something along the lines of “<em>How to write a best IEEE RAS conference paper</em>”. What I found was very interesting: this <a href="/assets/ieee-best-paper/best_paper.pdf">IEEE Student Activities Committee Paper</a>. It offers great insight into what goes into writing a paper that merits an IEEE Robotics and Automation Society best conference paper award.</p>
<p>I hope you enjoy reading it as much as I did.</p>
Sat, 24 Jun 2017 09:15:00 +0000
http://scriptedonachip.com/ieee-best-papers
best-paper, IEEE, RAS
On the necessary and sufficient conditions for optimal controllers
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: { equationNumbers: { autoNumber: "AMS" } }
});
</script>
<!--Mathjax Parser -->
<script type="text/javascript" async="" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
</script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<!-- ### <center>Optimal Controllers: </center> -->
<p>This post deals with understanding the necessary and sufficient conditions, fundamental Lipschitz continuity assumptions and the terminal boundary conditions imposed on the Hamilton-Jacobi equation to assure that the problem of minimizing an integral performance index is well-posed.</p>
<h4 id="problem-statement">Problem Statement</h4>
<p>Suppose we have the following nonlinear dynamical system</p>
<p>\begin{equation} \label{eq:system}
\dot{x} =f(x, u, t), \qquad \qquad x(t_0) = x_0
\end{equation}</p>
<p>which starts at state, \(x_0\) and time, \(t_0\).</p>
<h5 id="assumption-i"><strong>Assumption I</strong></h5>
<p>If the function \(f(\centerdot)\) is
continuously differentiable in all its arguments, then the initial value problem (IVP) of \eqref{eq:system} has a <u>unique solution</u> on a finite time interval; this is a sufficient assumption (Khalil, 1976).</p>
<h5 id="assumption-ii"><strong>Assumption II</strong></h5>
<p>\(T\) is sufficiently small to reside within the time interval where the system’s solutions are defined.</p>
<p>Qualitatively, our goal is to <strong>optimally</strong> control the system, starting from state \(x_0\) at time \(t_0\), to a neighborhood of the terminal manifold at time \(T\), whilst expending as little control energy as possible. Quantitatively, we can define this goal in terms of a performance index:</p>
<p>\begin{equation} \label{eq:cost}
J = J(x(t_0), u(\centerdot), t_0) = \int\limits_{\tau=t_0}^{T} L\left(x(\tau), u(\tau), \tau\right)d\tau + V(x(T))
\end{equation}</p>
<p>where \(J\) is evaluated along the trajectories of the system \(x(t)\), based on an applied control \(u(\centerdot)|_{t_0 \le t \le T} \).
With \(L\left(x(\tau), u(\tau), \tau\right)\) as the instantaneous cost and \(V(x(T))\) as the terminal cost (both nonnegative functions of their arguments), we can think of \(J\) as the total control effort and state energy expended in driving the state from \(x_0\) to a neighborhood of the terminal manifold \(V(x(T)) = 0\).</p>
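<p>As a concrete instance (an illustrative assumption, not part of the problem statement above), one could take a scalar linear system with a quadratic cost, where \(a, b\) are hypothetical constants and \(q \ge 0\), \(r > 0\), \(s \ge 0\) are hypothetical weights:</p>
<p>\begin{equation}
\dot{x} = a x + b u, \qquad L(x, u, \tau) = \tfrac{1}{2}\left(q\, x^2 + r\, u^2\right), \qquad V(x(T)) = \tfrac{1}{2}\, s\, x(T)^2
\end{equation}</p>
<p>Here \(f\) is continuously differentiable, both cost terms are nonnegative, and for \(s > 0\) the terminal manifold \(V(x(T)) = 0\) is simply \(x(T) = 0\).</p>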
<p>The question to ask then is: given the performance index \(J\), how do we find a control law \(u^\star\) that is optimal along a unique state trajectory, \(x^\star\), in the interval \([t_0, T]\)? The optimal cost is the minimum of all the costs we could possibly incur, and it is the cost we incur when we implement the optimal control law \(u^\star\). Mathematically, we can express this cost as:</p>
<p>\begin{gather}
J^\star(x(t_0), t_0) = \int\limits_{\tau=t_0}^{T} L \left(x^{\star}(\tau), u^\star(\tau), \tau \right) d\tau + V(x^\star(T)) \\
= \min_{ u_{[t_0, T]}} J(x_0, u, t_0)
\end{gather}</p>
<p>Therefore, the optimal cost is a function of the starting state and time so that we can write:</p>
<p>\begin{equation}
J^\star(x(t_0), t_0) = \min_{ u_{[t_0, T]}} J(x(t_0), u(\centerdot), t_0) = \min_{ u_{[t_0, T]}} \left[ \int\limits_{t_0}^{T} L\left(x(\tau), u(\tau), \tau\right)d\tau + V(x(T)) \right]
\end{equation}</p>
<p>Now assume that we start at an arbitrary initial condition \(x\) at time \(t\). It follows that the optimal cost-to-go from \(x(t)\) to \(x(T)\) is (abusing notation and dropping the templated arguments in \(J\)):</p>
<p>\begin{equation} \label{eq:cost-to-go}
J^\star(x, t) = \min_{ u_{[t, T]}} \left[\int\limits_{t}^{T} L\left(x(\tau), u(\tau), \tau\right)d\tau + V(x(T))\right]
\end{equation}</p>
<p>Things get a little bit interesting when we split the integral in \eqref{eq:cost-to-go} over two time intervals, \([t, t_1]\) and \([t_1, T]\) with \(t \le t_1 \le T\), namely:</p>
<p>\begin{equation} \label{eq:spliced}
J^\star(x, t) = \min_{ u_{[t, T]}} \left[\int\limits_{t}^{t_1} L\left(x(\tau), u(\tau), \tau\right)d\tau + \int\limits_{t_1}^{T} L\left(x(\tau), u(\tau), \tau\right)d\tau + V(x(T))\right]
\end{equation}</p>
<p>We can likewise split the minimization over the two time intervals, e.g.,</p>
<p>\begin{equation} \label{eq:two_mins}
J^\star(x, t) = \min_{ u_{[t, t_1]}} \min_{ u_{[t_1, T]}} \left[\int\limits_{t}^{t_1} L\left(x(\tau), u(\tau), \tau\right)d\tau + \int\limits_{t_1}^{T} L\left(x(\tau), u(\tau), \tau\right)d\tau + V(x(T))\right]
\end{equation}</p>
<p>Equation \eqref{eq:two_mins} gives the beautiful intuition that one can divide the integration into two or more time slices, solve the optimal control problem for each time slice and, in so doing, minimize the overall cost function \(J\) of the system. This in essence is a statement of <a href="https://en.wikipedia.org/wiki/Richard_E._Bellman">Richard E. Bellman</a>’s principle of optimality:</p>
<blockquote>
<p>Bellman’s Principle of Optimality: An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.</p>
</blockquote>
<blockquote>
<p>– Bellman, Richard. Dynamic Programming, 1957, Chap. III.3.</p>
</blockquote>
<p>With the principle of optimality, the problem takes a more intuitive meaning, namely that the cost to go from \(x\) at time \(t\) to a terminal state \(x(T)\) can be computed by minimizing the sum of the cost to go from \(x = x(t)\) to \(x_1 = x(t_1)\) and then, the optimal cost-to-go from \(x_1\) onwards.</p>
<p>Therefore, \eqref{eq:two_mins} can be restated as:</p>
<p>\begin{equation} \label{eq:two_mins_sep}
J^\star(x, t) = \min_{ u_{[t, t_1]}} \left[\int\limits_{t}^{t_1} L\left(x(\tau), u(\tau), \tau\right)d\tau + \underbrace{\min_{ u_{[t_1, T]}} \left[ \int\limits_{t_1}^{T} L\left(x(\tau), u(\tau), \tau\right)d\tau + V(x(T)) \right]}_{J^\star(x_1, \, t_1)} \right]
\end{equation}</p>
<p>\(J^\star(x_1, \, t_1)\) in \eqref{eq:two_mins_sep} can be seen as the optimal cost-to-go from \(x_1\) to \(x(T)\), with the overall cost given by</p>
<p>\begin{equation} \label{eq:optimal_pre}
J^\star(x, t) = \min_{ u_{[t, t_1]}} \left[\int\limits_{t}^{t_1} L\left(x(\tau), u(\tau), \tau\right)d\tau + J^\star(x_1, \, t_1) \right]
\end{equation}</p>
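<p>To see the recursion in \eqref{eq:optimal_pre} at work numerically, here is a minimal sketch on a discrete-time scalar problem. Everything in it is an assumption made for illustration and not part of the derivation above: the dynamics \(x_{k+1} = a x_k + b u_k\), the stage cost \(q x_k^2 + r u_k^2\), the terminal cost \(s x_N^2\), and the function names. Because the cost-to-go stays quadratic, \(J^\star(x, k) = p_k x^2\), each backward step only has to minimize the one-step cost plus the cost-to-go from the next state.</p>

```python
# Hypothetical scalar example (not from the post):
#   x[k+1] = a*x[k] + b*u[k]
#   cost   = sum_k (q*x[k]**2 + r*u[k]**2) + s*x[N]**2


def backward_pass(a, b, q, r, s, n_steps):
    """Bellman recursion: return cost-to-go weights p[k] and gains g[k]."""
    p = [0.0] * (n_steps + 1)
    g = [0.0] * n_steps
    p[n_steps] = s  # boundary condition: J*(x, N) = s * x**2
    for k in range(n_steps - 1, -1, -1):
        # Minimise q*x^2 + r*u^2 + p[k+1]*(a*x + b*u)^2 over u;
        # setting the derivative to zero gives u = -g[k] * x.
        g[k] = a * b * p[k + 1] / (r + b * b * p[k + 1])
        # Cost-to-go under that minimising control.
        p[k] = q + r * g[k] ** 2 + p[k + 1] * (a - b * g[k]) ** 2
    return p, g


def optimal_controls(a, b, x0, gains):
    """Roll the system forward, applying u = -g[k] * x at every step."""
    x, controls = x0, []
    for gain in gains:
        u = -gain * x
        controls.append(u)
        x = a * x + b * u
    return controls


def rollout_cost(a, b, q, r, s, x0, controls):
    """Accumulate the cost of an arbitrary control sequence."""
    x, cost = x0, 0.0
    for u in controls:
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost + s * x * x


# Arbitrary illustrative numbers.
a, b, q, r, s, n_steps, x0 = 1.1, 0.5, 1.0, 0.2, 2.0, 20, 1.5
p, g = backward_pass(a, b, q, r, s, n_steps)
u_opt = optimal_controls(a, b, x0, g)
# The rolled-out cost and p[0] * x0**2 should agree up to rounding.
print(rollout_cost(a, b, q, r, s, x0, u_opt), p[0] * x0 ** 2)
```

<p>Perturbing any single entry of <code class="language-plaintext highlighter-rouge">u_opt</code> and calling <code class="language-plaintext highlighter-rouge">rollout_cost</code> again should only increase the cost, which is the discrete-time counterpart of the minimization over \(u_{[t, t_1]}\) above.</p>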
<p>Replacing \(t_1\) by \(t + \delta t\) and with the assumption that \(J^\star(x, t)\) is differentiable, we can expand \eqref{eq:optimal_pre} into a first-order Taylor series around \((x, t)\) as follows:</p>
<p>\begin{equation} \label{eq:taylor}
J^\star(x, t) = \min_{ u_{[t, t + \delta t]}} \left[ L\left(x, u, t\right)\delta t + J^\star(x, \, t) + \left(\dfrac{\partial J^\star(x, t)}{\partial t}\right) \delta t + \left(\dfrac{\partial J^\star(x, t)}{\partial x} \right) \delta x + o(\delta t) \right]
\end{equation}</p>
<p>where \(\delta x = x(t + \delta t) - x(t)\) and \(o(\delta t)\) denotes higher order terms satisfying \(\lim_{\delta t \rightarrow 0}\dfrac{o(\delta t)}{\delta t} = 0\).</p>
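<p>Since \(J^\star(x, t)\) and \(\partial J^\star / \partial t\) do not depend on \(u\), we can cancel \(J^\star(x, t)\) from both sides of \eqref{eq:taylor}, divide through by \(\delta t\) and let \(\delta t \rightarrow 0\), so that \(\delta x / \delta t \rightarrow \dot{x}\). We find that</p>
<p>\begin{equation} \label{eq:hamiltonian_pre}
\dfrac{\partial J^\star(x, t)}{\partial t} = -\min_{ u(t)} \left[ L\left(x, u, t\right) + \left(\dfrac{\partial J^\star(x, t)}{\partial x} \right) \underbrace{\dot{x}(\centerdot)}_{f(x,u,t)} \right]
\end{equation}</p>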
<p>We shall define the expression in the square brackets of the above equation as the <strong>Hamiltonian</strong>, \(H(\centerdot)\), such that \eqref{eq:hamiltonian_pre} can be rewritten thus:</p>
<p>\begin{equation} \label{eq:hamiltonian}
\dfrac{\partial J^\star(x, t)}{\partial t} = -\min_{ u(t)} H\left(x, \nabla_x J^\star (x, t), u, t \right)
\end{equation}</p>
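<p>Based on the smoothness assumption on \(f\) in \eqref{eq:system} (and assuming that \(L\) is likewise differentiable and \(u\) is unconstrained),
a necessary condition for \(u\) to minimize the Hamiltonian is that the linear sensitivity of the Hamiltonian to changes in \(u\) is zero, so
\(\nabla H_u\) <strong>must</strong> vanish at the optimal point i.e.,</p>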
<p>\begin{equation} \label{eq:hamiltonian_deri}
\nabla H_u(x, \nabla_x J^\star (x, t), u, t) = 0
\end{equation}</p>
<p>which is the first-order (stationarity) condition for <strong>local optimality</strong> of the controller. In addition, if the Hessian of the Hamiltonian with respect to \(u\) is positive definite along the trajectories of the solution, i.e.,</p>
<p>\begin{equation}
\dfrac{\partial^2 H}{\partial u^2} > 0
\end{equation}</p>
<p>then the stationary point is a strict local minimum of the Hamiltonian in \(u\); it is the global minimizer when \(H\) is convex in \(u\). This second-order requirement is the (strengthened) <a href="https://en.wikipedia.org/wiki/Legendre%E2%80%93Clebsch_condition">Legendre-Clebsch</a> condition. On a singular arc, where the Hessian is only positive semidefinite, it is not enough on its own and the generalized Legendre-Clebsch condition is used instead.</p>
<p>You begin to see the beauty of optimal control in that \eqref{eq:hamiltonian_deri} allows us to translate the complicated functional minimization integral of \eqref{eq:cost} into a minimization problem that can be solved by ordinary calculus.</p>
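<p>As a worked instance of these two conditions (an illustrative assumption, not taken from the derivation above), consider scalar dynamics \(f = a x + b u\) with instantaneous cost \(L = \tfrac{1}{2}(q x^2 + r u^2)\), where \(a, b, q\) and \(r > 0\) are hypothetical constants, and write \(J^\star_x\) for \(\partial J^\star / \partial x\). Then</p>
<p>\begin{gather}
H = \tfrac{1}{2}\left(q x^2 + r u^2\right) + J^\star_x \left(a x + b u\right) \\
\dfrac{\partial H}{\partial u} = r u + b J^\star_x = 0 \quad \Longrightarrow \quad u^\star = -\dfrac{b}{r} J^\star_x \\
\dfrac{\partial^2 H}{\partial u^2} = r > 0
\end{gather}</p>
<p>so the stationary point is a minimum of the Hamiltonian in \(u\), and it was found with nothing more than ordinary calculus.</p>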
<p>If we let</p>
<p>\begin{equation}
H^\star(x, \nabla_x J^\star (x, t), t) = \min_u \left[H(x, \nabla_x J^\star (x, t), u, t)\right]
\end{equation}</p>
<p>then it follows that solving \eqref{eq:hamiltonian_deri} for the optimal \(u = u^\star\) and substituting the result into \eqref{eq:hamiltonian} yields the <em><strong>Hamilton-Jacobi-Bellman</strong></em> PDE, whose solution is the optimal cost \(J^\star(x(t), t)\), such that</p>
<p>\begin{equation} \label{eq:optimal_cost}
\dfrac{\partial J^\star(x, t)}{\partial t} = -H^\star \left(x, \nabla_x J^\star (x, t), t \right)
\end{equation}</p>
<p>At \(t = T\) there is no running cost left to accumulate, so the terminal cost in \eqref{eq:cost} supplies the boundary condition that makes the problem well-posed, viz.,</p>
<p>\begin{equation} \label{eq:boundary_cost}
J^\star(x(T), T) = V(x(T))
\end{equation}</p>
<p>Taken together, \eqref{eq:optimal_cost} describes how the optimal cost-to-go evolves backwards in time (loosely, the instantaneous <code class="language-plaintext highlighter-rouge">kinetic energy</code> of the cost function in \eqref{eq:cost}), while \eqref{eq:boundary_cost} supplies the terminal boundary condition it must satisfy. If a differentiable \(J^\star(x,t)\) satisfying both can be found, then the \(u^\star\) that minimizes the Hamiltonian for this \(J^\star\) constitutes the optimal control policy for the nonlinear dynamical system in \eqref{eq:system} given the cost index \eqref{eq:cost}.</p>
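<p>A short worked example may make the PDE and its boundary condition less abstract. It rests on assumptions that are not in the derivation above: scalar dynamics \(\dot{x} = a x + b u\), instantaneous cost \(L = \tfrac{1}{2}(q x^2 + r u^2)\) with \(r > 0\), terminal cost \(V = \tfrac{1}{2} s\, x(T)^2\), and the guess \(J^\star(x, t) = \tfrac{1}{2} p(t)\, x^2\). Minimizing the Hamiltonian over \(u\) gives \(u^\star = -(b/r)\, p(t)\, x\), and substituting back we get</p>
<p>\begin{gather}
H^\star = \left( \tfrac{1}{2} q + a\, p(t) - \dfrac{b^2}{2 r}\, p(t)^2 \right) x^2, \qquad \dfrac{\partial J^\star}{\partial t} = \tfrac{1}{2}\, \dot{p}(t)\, x^2 \\
-\dot{p}(t) = q + 2 a\, p(t) - \dfrac{b^2}{r}\, p(t)^2, \qquad p(T) = s
\end{gather}</p>
<p>so, under these assumptions, the PDE collapses to a scalar Riccati ordinary differential equation that is integrated backwards from the terminal condition \(p(T) = s\).</p>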
<h3 id="conclusions">Conclusions</h3>
<p>Notice that the optimal policy \(u^\star(t)\) is basically an open-loop control strategy. Why so? \(u^\star\) was derived as a function of time \(t\). As a result, the strategy may not be robust to uncertainties and may be very sensitive to disturbances. For practical applications, we generally want a feedback control policy that is state dependent in order to guarantee robustness to parametric variations and achieve robust stability and performance. Such a \(u = u^\star(x)\) would be helpful in analyzing the stability of states and the convergence of the system dynamics to equilibrium for all future times. I will post such methods in the future.</p>
<h3 id="summary">Summary</h3>
<table>
<thead>
<tr>
<th style="text-align: left">Properties</th>
<th style="text-align: right">Equations</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">Dynamics:</td>
<td style="text-align: right">\(\dot{x} =f(x, u, t), \quad x(t_0) = x_0 \)</td>
</tr>
<tr>
<td style="text-align: left">Cost:</td>
<td style="text-align: right">\(J(x_0,u,t_0) = \int\limits_{\tau=t_0}^{T} L\left(x(\tau), u(\tau), \tau\right)d\tau + V(x(T))\)</td>
</tr>
<tr>
<td style="text-align: left">Optimal cost :</td>
<td style="text-align: right">\(J^\star(x,t) = \min_{u_{[t,T]}}J\)</td>
</tr>
<tr>
<td style="text-align: left">Hamiltonian:</td>
<td style="text-align: right">\(H(x, \nabla_x J^\star, u,t) = L\left(x, u, t\right) + \left(\dfrac{\partial J^\star(x, t)}{\partial x} \right) f(x,u,t)\)</td>
</tr>
<tr>
<td style="text-align: left">Optimal Control:</td>
<td style="text-align: right">\(u^\star = \arg\min_u H(x, \nabla_x J^\star(x,t), u, t)\), with \(\nabla H_u = 0\) at \(u^\star\)</td>
</tr>
<tr>
<td style="text-align: left">HJB Equation:</td>
<td style="text-align: right">\(-\dfrac{\partial J^\star(x,t)}{\partial t} = H^\star(x, \nabla_x J^\star(x,t),t)\) and \(J^\star(x(T), T) = V(x(T))\)</td>
</tr>
</tbody>
</table>
<h3 id="further-readings">Further Readings</h3>
<p><a href="https://web.archive.org/web/20050110161049/http://www.wu-wien.ac.at/usr/h99c/h9951826/bellman_dynprog.pdf">Richard Bellman: On The Birth Of Dynamic Programming</a></p>
<p><a href="https://www.amazon.com/Optimal-Control-Quadratic-Methods-Engineering/dp/0486457664">Optimal Control: Linear Quadratic Methods</a></p>
Sun, 04 Jun 2017 13:28:00 +0000
http://scriptedonachip.com/optimal-control
http://scriptedonachip.com/optimal-controlcontroloptimal-controlShould I use ROS or MuJoCo?<p>This was my answer to a question posted in an email thread to our research group’s email lists. The question goes like this:</p>
<p><strong>QUESTION</strong><br />
From: XXX@uni-x.edu <br />
Sent: Sunday, May 14, 2017 9:29 AM <br />
To: XXX@lists.uni-x.edu <br />
Subject: RE: [robotec] MuJoCo <br /></p>
<p>From the documentation, looks like MuJoCo is faster and therefore better to simulate computational intensive controllers like MPC. Gazebo provides other engines as well and seems like is more popular in the ROS community. How to choose between the available options? Which one would you recommend?</p>
<p>Thanks,<br />
XXX</p>
<p><strong>Answer</strong></p>
<p><strong>TL; DR:</strong>
If you do not care for accuracy of simulated controller numerical results, if you are not simulating parallel linkages, if you do not need code parallelization (i.e. your computation is not crazy intensive) or if you are not simulating contact and friction, I would choose ROS. Easy to use and straightforward to build fairly complex models.</p>
<h1 id="proper-long-answer"><strong>Proper (Long) Answer</strong></h1>
<p>ROS and Gazebo (OSRF tools) are indeed popular in the robotics community like you mentioned and they have their pros. It took me a while to see their limit when using them for research purposes.</p>
<p><strong>Definition</strong></p>
<p><code class="language-plaintext highlighter-rouge">ROS = Plumbing + Tools + Capabilities + Ecosystem</code></p>
<p><code class="language-plaintext highlighter-rouge">Plumbing</code>: ROS provides publish-subscribe messaging infrastructure designed to support the quick and easy construction of distributed computing systems.</p>
<p><code class="language-plaintext highlighter-rouge">Tools</code>: ROS provides an extensive set of tools for configuring, starting, introspecting, debugging, visualizing, logging, testing, and stopping distributed computing systems.</p>
<p><code class="language-plaintext highlighter-rouge">Capabilities</code>: ROS provides a broad collection of libraries that implement useful robot functionality, with a focus on mobility, manipulation, and perception.</p>
<p><code class="language-plaintext highlighter-rouge">Ecosystem</code>: ROS is supported and improved by a large community, with a strong focus on integration and documentation. ros.org is a one-stop-shop for finding and learning about the thousands of ROS packages that are available from developers around the world. answers.ros.org is a rich online community of ros packages users from around the world asking questions and getting help on how to use ROS.</p>
<p>So getting a simple dynamics simulation up and running should not be a lot of hassle, as the documentation is rich and the online community is very active in supporting newbies.</p>
<p>In the early days, the plumbing, tools, and capabilities were tightly coupled, which has both advantages and disadvantages. On the one hand, by making strong assumptions about how a particular component will be used, developers are able to quickly and easily build and test complex integrated systems. On the other hand, users are given an “all or nothing” choice: to use an interesting ROS component, you pretty much had to jump in to using all of ROS.</p>
<p>8+ years in after Andrew Ng and co. conceived the platform, the core system has matured considerably, and developers are hard at work refactoring code to separate plumbing from tools from capabilities, so that each may be used in isolation. In particular, people are aiming for important libraries that were developed within ROS to become available to non-ROS users in a minimal-dependency fashion (e.g. OMPL and PCL libraries).</p>
<p>Disclaimer: Borrowed from Brian Gerkey’s/my answer to a similar quora question about a year ago.</p>
<p>For serially linked robot arms and other non-parallel linkages, ROS is a great simulation tool and “middleware”. However, there are bottlenecks with ROS.</p>
<p>What ROS calls URDF (Unified Robot Description Format), which is the abstraction tool for rigid body dynamics, is hardly unified in practice. URDF models written in ROS are out-of-the-box incompatible with Gazebo, its sister physics engine (see this <a href="http://answers.gazebosim.org/question/14891/conversion-from-urdf-to-sdf-using-gzsdf-issues/">question/wiki</a>). Moreover, state in OSRF tools such as ROS is represented in a tree-like manner. I learnt this late last year when simulating parallel linkages. The internal ROS XML parser interprets constructed linkages as a tree (every link has exactly one parent) and not as a general graph. This makes simulating parallel linkages, which close kinematic loops, almost (actually) impossible. Repeat, actually impossible. They have a fix for this in Gazebo SDF but it is not straightforward. So developers spend a huge chunk of time migrating code from one OSRF framework to another.</p>
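<p>The tree restriction is easy to check by hand. The sketch below is not the URDF parser; it is a hypothetical helper (the function name and the link names are made up for illustration) that only tests the property a tree-based description relies on: every link except one root has exactly one parent joint. A serial arm passes, while a four-bar linkage fails because its last link would need two parents.</p>

```python
def is_kinematic_tree(links, joints):
    """Return True if the (parent, child) joints form a single tree over links."""
    parent_of = {}
    for parent, child in joints:
        if child in parent_of:
            return False  # a link with two parents closes a kinematic loop
        parent_of[child] = parent
    roots = [link for link in links if link not in parent_of]
    if len(roots) != 1:
        return False  # a tree has exactly one root link
    for link in links:
        seen = set()
        # Walk up the parent chain; it must end at the root without repeating.
        while link in parent_of:
            if link in seen:
                return False
            seen.add(link)
            link = parent_of[link]
        if link != roots[0]:
            return False
    return True


# Hypothetical serial arm: base -- l1 -- l2 -- l3
serial_links = ["base", "l1", "l2", "l3"]
serial_joints = [("base", "l1"), ("l1", "l2"), ("l2", "l3")]

# Hypothetical four-bar linkage: the rocker is attached to both coupler and ground
four_bar_links = ["ground", "crank", "coupler", "rocker"]
four_bar_joints = [
    ("ground", "crank"),
    ("crank", "coupler"),
    ("coupler", "rocker"),
    ("ground", "rocker"),
]

print(is_kinematic_tree(serial_links, serial_joints))
print(is_kinematic_tree(four_bar_links, four_bar_joints))
```

<p>A graph-based description has no such restriction, which is why closed chains need a different format or a workaround.</p>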
<p>Good controller algorithm formulations are based on numerical optimization (think MPC, differential dynamic programming, sampling-based motion-planning or reinforcement learning). Gazebo was designed around the ODE (Open Dynamics Engine) and Bullet physics engines, which provide the states in over-complete Cartesian coordinates and enforce joint constraints via numerical optimization. This is good enough for disconnected bodies with few joint constraints but becomes a pain for complex dynamics such as humanoids or simulating human-robot interactions. Running complex simulations for huge candidate evaluations of humanoids can run into months using ROS (e.g. Todorov’s de novo synthesis), whereas MUJOCO is optimized for parallel processing, i.e. distributed evaluation of possible controllers from which a candidate controller is chosen.</p>
<p>ODE simulators optimize the controller to the engine. This makes the controller cheat during simulations in ways that mean generated control laws may be physically unrealizable. Speed and accuracy? Controller optimization with MoveIt! (a motion planning framework from OSRF) is mostly done in a single threaded code without the advantage of explicit parallelization of code to make e.g. IK solutions faster. Implementation of concurrency and multithreading is left to the user (this is a big no-no for someone not interested in software engineering).</p>
<p>ROS is written strictly on the assumption that the user is running a Linux kernel, so users not familiar with Linux are taken aback when they first get exposed to it. With MUJOCO, you do not need Linux or OSX as it works on Windows OS just fine. MUJOCO also uses an XML parser to interpret links and joints, and so it is able to read ROS URDFs and xacro files okay. But it doesn’t work the other way (see <a href="http://www.mujoco.org/forum/index.php?threads/ros-gazebo-integration.3371/">this answer</a> from Todorov).</p>
<p>MPC implementations are elegant only when the model is accurate. Unexpectedly poor performance of an MPC controller will often be due to poor modeling assumptions (Rossiter). If the simulation engine emphasizes simulation stability over control law precision, we have a problem. And this is my problem with Gazebo and ROS generally. I read somewhere in one of Todorov’s papers (can’t remember where I found it) that the floating point ops of MUJOCO were unit tested to <code class="language-plaintext highlighter-rouge">&gt;&gt;355</code> decimal points. The OSRF community may be good at community based software engineering for robotics but you have to give Todorov the credit. He had the patience and tenacity to develop such robust software for control simulation. People stopped paying attention to floating point operational precision back in the late 80’s/90’s.</p>
<p>What’s more? MUJOCO allows you to write your models in C. Engineers are head over heels for Matlab but I am all for a program or modeling software that stays as close to ones and zeros as possible. It means being less dumbfounded when things do not work as you envisioned and greater flexibility in being the master and architect of your creation.</p>
Sun, 14 May 2017 13:28:00 +0000
http://scriptedonachip.com/mujoco-ros
http://scriptedonachip.com/mujoco-rosQ&ArosmujocoTwanging git pull, push and clone<ul>
<li><a href="#table-o-conts">Table of Contents</a></li>
<li><a href="#intro">Introduction</a></li>
<li><a href="#pullpush">Pulling/Pushing git remotes from LAN/WAN repos</a></li>
<li><a href="#clone">Cloning git remotes from LAN/WAN repos</a></li>
</ul>
<p><a name="intro"></a></p>
<h3 id="introduction">Introduction</h3>
<p>Git is a useful tool for remote/online work collaboration, as well as social coding. It is useful to be able to share one’s work among different computers using native git commands such as <code class="language-plaintext highlighter-rouge">merge</code>, <code class="language-plaintext highlighter-rouge">fetch</code>, <code class="language-plaintext highlighter-rouge">push</code>, <code class="language-plaintext highlighter-rouge">clone</code>, or <code class="language-plaintext highlighter-rouge">pull</code> without resorting to plain <code class="language-plaintext highlighter-rouge">ssh</code> or <code class="language-plaintext highlighter-rouge">scp</code>, which lack the <code class="language-plaintext highlighter-rouge">diff</code> and <code class="language-plaintext highlighter-rouge">merge</code> strategies of <code class="language-plaintext highlighter-rouge">git</code>. Moreover, not everyone enjoys exposing their incomplete work/code to a remote repo for the sake of fetching to local <code class="language-plaintext highlighter-rouge">origins</code> on different computers. This post shows how to go about these git operations without going through a hosted remote, e.g. an http[s] server.</p>
<p><a name="pullpush"></a></p>
<h4 id="pullingpushing-git-remotes-from-a-lanwan-repo">Pulling/Pushing git remotes from a LAN/WAN repo</h4>
<p>As an example, suppose we have a repo named <code class="language-plaintext highlighter-rouge">sensors</code> in the <code class="language-plaintext highlighter-rouge">Documents</code> directory of a computer with user and host name <code class="language-plaintext highlighter-rouge">drumpf@dissembler</code>, and it is a few commits ahead of a tracking repo on a computer named <code class="language-plaintext highlighter-rouge">robots@killem</code>. We can fetch and merge the recent commits on <code class="language-plaintext highlighter-rouge">drumpf@dissembler</code> into <code class="language-plaintext highlighter-rouge">robots@killem</code> as follows.</p>
<p>We could use the <code class="language-plaintext highlighter-rouge">ssh</code>, <code class="language-plaintext highlighter-rouge">http[s]</code>, <code class="language-plaintext highlighter-rouge">ftp[s]</code> or <code class="language-plaintext highlighter-rouge">rsync</code> transport protocols. To pull updates from <code class="language-plaintext highlighter-rouge">drumpf@dissembler:~/Documents/sensors.git</code> to the <code class="language-plaintext highlighter-rouge">robots@killem:~/Documents/sensors.git</code> repo, we would do one of the following:</p>
<ul>
<li>
<p>via ssh:</p>
<pre class="terminal"><code>robots@killem:~/Documents/sensors$ git pull ssh://drumpf@dissembler/~/Documents/sensors.git</code></pre>
</li>
<li>
<p>via https:</p>
<pre class="terminal"><code>robots@killem:~/Documents/sensors$ git pull http[s]://drumpf@dissembler:/robots/killem/Documents/sensors.git</code></pre>
</li>
<li>
<p>via ftp</p>
<pre class="terminal"><code>robots@killem:/home/drumpf/Documents/sensors$ git pull ftp[s]://drumpf@dissembler:/robots/killem/Documents/sensors.git</code></pre>
</li>
<li>
<p>via rsync</p>
<pre class="terminal"><code>robots@killem:/home/drumpf/Documents/sensors$ git pull rsync://drumpf@dissembler:/~/Documents/sensors.git</code></pre>
</li>
</ul>
<p>Note that user expansion (<code class="language-plaintext highlighter-rouge">~</code>) is supported by the <code class="language-plaintext highlighter-rouge">ssh</code> and <code class="language-plaintext highlighter-rouge">git</code> protocols. <code class="language-plaintext highlighter-rouge">ftp[s]</code> and <code class="language-plaintext highlighter-rouge">rsync</code> do not allow user expansion when pulling, pushing or cloning, so the full path to the repo has to be specified.
Plain <code class="language-plaintext highlighter-rouge">http</code> is unencrypted, and neither <code class="language-plaintext highlighter-rouge">http</code> nor <code class="language-plaintext highlighter-rouge">https</code> authenticates users unless the server is configured to, so both can be dangerous on unsecured networks. If the host names of the computers cannot be resolved (e.g. they are not listed in <code class="language-plaintext highlighter-rouge">/etc/hosts</code>), you can use the IP address of the computer in place of the host name. Note that <code class="language-plaintext highlighter-rouge">ftp[s]</code> can only be used for fetching, while <code class="language-plaintext highlighter-rouge">rsync</code> can be used for both fetching and pushing. Neither is very efficient, however, and both are deprecated, so you should refrain from using them as much as you can.</p>
<p>Apart from the fetch-only <code class="language-plaintext highlighter-rouge">ftp[s]</code> form, the commands above would also work for <code class="language-plaintext highlighter-rouge">git push</code>.</p>
<p><strong>SCP-like syntaxes are valid as well:</strong></p>
<pre class="terminal"><code>git pull [user@]host.ng:path/to/repo.git/</code></pre>
<p>but note that this form is only recognized if there are no slashes before the first colon, which helps distinguish a local path that contains a colon from an ssh url.</p>
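<p>How a remote string gets interpreted can be sketched in a few lines. This is a simplified illustration of the rules in the "GIT URLS" section of the git documentation (a string with <code class="language-plaintext highlighter-rouge">::</code> after the transport name requests a remote helper, a string with <code class="language-plaintext highlighter-rouge">://</code> is a url, and the scp-like form needs a colon with no slash before it). It is not git's actual parser, and the function name is hypothetical.</p>

```python
def classify_remote(remote):
    """Roughly classify a remote string following the documented git URL rules."""
    head, colon, rest = remote.partition(":")
    if colon and rest.startswith(":"):
        return "remote helper"   # transport::address
    if "://" in remote:
        return "url"             # ssh://, git://, http[s]://, ftp[s]://
    if colon and "/" not in head:
        return "scp-like"        # [user@]host:path/to/repo.git
    return "local path"          # no colon, or a slash before the first colon


for remote in (
    "ssh://drumpf@dissembler/~/Documents/sensors.git",
    "drumpf@dissembler:Documents/sensors.git",
    "./odd:name/sensors.git",
    "/home/drumpf/Documents/sensors.git",
):
    print(remote, "is treated as", classify_remote(remote))
```

<p>The third string shows why the slash rule exists: the colon sits after a slash, so the string is read as a local path and not as a host named <code class="language-plaintext highlighter-rouge">./odd</code>.</p>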
<p>All of the above commands also support cloning <code class="language-plaintext highlighter-rouge">git</code> repos from one directory to another on the same host or between workstations on the same <code class="language-plaintext highlighter-rouge">LAN/WAN</code>. All that would need to change would be to replace the <code class="language-plaintext highlighter-rouge">LAN/WAN</code> hostname with the path we are cloning from. See examples below:</p>
<p><a name="clone"></a></p>
<h4 id="cloning-git-remotes-from-a-lanwan-repo">Cloning git remotes from a LAN/WAN repo</h4>
<p>The procedure is the same as above, save that we replace pull/push with clone, e.g.</p>
<ul>
<li>
<pre class="terminal"><code>git clone ssh://[you@]remote.ng[:port]/path/to/repo.git/</code></pre>
</li>
<li>
<pre class="terminal"><code>git clone git://remote.ng[:port]/path/to/repo.git/</code></pre>
</li>
<li>
<pre class="terminal"><code>git clone http[s]://remote.ng[:port]/path/to/repo.git/</code></pre>
</li>
<li>
<pre class="terminal"><code>git clone ftp[s]://remote.ng[:port]/path/to/repo.git/</code></pre>
</li>
<li>
<pre class="terminal"><code>git clone rsync://remote.ng/path/to/repo.git/</code></pre>
</li>
</ul>
<p>If, when doing any of the operations specified so far, Git does not know how to handle the given transport protocol, no problem! Git attempts to use a remote helper named <code class="language-plaintext highlighter-rouge">remote-&lt;transport&gt;</code>, if one exists. A remote helper can also be requested explicitly, so we could for example do</p>
<pre class="terminal"><code>robots@killem:~/Documents/sensors$ git push transport::address</code></pre>
<p>where <code class="language-plaintext highlighter-rouge">address</code> is the path to the repo on the LAN/WAN and <code class="language-plaintext highlighter-rouge">transport</code> is replaced by the name of the remote helper to use.</p>
<p>An alternative scp-like syntax is also valid when using the ssh protocol:</p>
<ul>
<li>
<pre class="terminal"><code>git clone [you@]remote.ng:path/to/repo.git/</code></pre>
</li>
</ul>
<p>Just as is the case for <code class="language-plaintext highlighter-rouge">pull/push</code>, plain <code class="language-plaintext highlighter-rouge">http</code> is unencrypted and should be used with caution.</p>
Fri, 12 May 2017 11:37:00 +0000
http://scriptedonachip.com/git-twangs
http://scriptedonachip.com/git-twangsgitUbuntu-16.04 and Cuda-8.0 Install Guide<h4 id="introduction">Introduction</h4>
<p>NVIDIA libraries are notorious for breaking Xserver, particularly on the Ubuntu Linux distro. Here’s my installation guide on how to do a clean install without breaking display drivers. Hope it helps.</p>
<h4 id="installation">Installation</h4>
<p>Pull CUDA 8.0 from <a href="https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run">here</a>.</p>
<ul>
<li>Add a <code class="language-plaintext highlighter-rouge">blacklist-nouveau.conf</code> file to your <code class="language-plaintext highlighter-rouge">/etc/modprobe.d</code> directory like so:</li>
</ul>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nb">sudo touch</span> /etc/modprobe.d/blacklist-nouveau.conf
</code></pre></div></div>
<ul>
<li>Add the following contents to the file you just created using your fave editor:</li>
</ul>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>blacklist nouveau
options nouveau <span class="nv">modeset</span><span class="o">=</span>0
</code></pre></div></div>
<ul>
<li>Turn off X server</li>
</ul>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nb">sudo </span>service lightdm stop
</code></pre></div></div>
<ul>
<li>
<p>Install Cuda 8.0</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">cd</code> to the directory where the cuda install file was stored and run it with admin rights e.g.</li>
</ul>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo</span> ./cuda_8.0.61_375.26_linux.run
</code></pre></div> </div>
<ul>
<li>
<p>Accept the EULA</p>
</li>
<li>
<p>Accept yes for NVIDIA drivers install</p>
</li>
<li>
<p>Accept yes for cuda-8.0 and cuda symlink</p>
</li>
<li>
<p>Decline the installation of OpenGL Libraries (this breaks Xserver)</p>
</li>
<li>
<p>Install Samples</p>
</li>
<li>
<p>Decline the installation of nvidia-xconfig (you won’t need it)</p>
</li>
<li>
<p>Reboot your system after installation</p>
</li>
</ul>
</li>
</ul>
<p>Voila! We’re set to start developing with cuda.</p>
Fri, 28 Apr 2017 10:32:00 +0000
http://scriptedonachip.com/cuda-ubuntu-fix
http://scriptedonachip.com/cuda-ubuntu-fixubuntulinuxcudaPyTorch and rospy interoperability<h3 id="table-of-contents"><a href="#table-of-contents">Table of Contents</a>:</h3>
<ul>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#nonlinear">Background</a></li>
<li><a href="#problem-formulation">Importing Torch into ROSPY</a></li>
<li><a href="#solution">Solution</a></li>
</ul>
<p><a name="introduction"></a></p>
<h4 id="introduction">Introduction</h4>
<p>Yesterday was mighty nightmarish in the life of this developer. I had trained a conv-net to classify an object I was trying to recognize and later manipulate using vision-based control. Since <code class="language-plaintext highlighter-rouge">PYTORCH</code> offers tensor computation with strong GPU acceleration and automatic differentiation for backprop based on the <code class="language-plaintext highlighter-rouge">torch auto-grad</code> system, I took advantage of its python compatibility: it meant I could easily write my control code in <code class="language-plaintext highlighter-rouge">rospy</code> or <code class="language-plaintext highlighter-rouge">roscpp</code> and publish vision/control topics, which reduces interoperability issues when working with different Linux processes. Only I didn’t anticipate <code class="language-plaintext highlighter-rouge">Python 2</code> and <code class="language-plaintext highlighter-rouge">Python 3</code> module import problems ahead of time. I give more background below.</p>
<p><a name="nonlinear"></a></p>
<h4 id="background">Background</h4>
<p>For the record, I run <code class="language-plaintext highlighter-rouge">ROS 1.x (indigo bare bones)</code> on a <code class="language-plaintext highlighter-rouge">ubuntu 14.04 machine</code> with 32GB of RAM. The <code class="language-plaintext highlighter-rouge">pytorch</code> developers encourage users to install <code class="language-plaintext highlighter-rouge">Torch</code> with <code class="language-plaintext highlighter-rouge">conda</code> and typically use <code class="language-plaintext highlighter-rouge">python3</code> since <code class="language-plaintext highlighter-rouge">python 2</code> will be phased out in the near future. So, I had been using <code class="language-plaintext highlighter-rouge">pytorch</code> in a <code class="language-plaintext highlighter-rouge">conda</code> setup that had both a <code class="language-plaintext highlighter-rouge">python 2</code> and a <code class="language-plaintext highlighter-rouge">python 3</code> environment. I could easily switch environments by turning on or off whichever python version I wanted. For details on how to do this, see this <a href="https://conda.io/docs/py2or3.html">doc</a> from the folks at conda.</p>
<p>So far, everything was working great. For <code class="language-plaintext highlighter-rouge">ros</code> applications that do not involve image processing classes such as <code class="language-plaintext highlighter-rouge">CvBridge</code>, I was able to get <code class="language-plaintext highlighter-rouge">ros</code> and <code class="language-plaintext highlighter-rouge">pytorch</code> to talk in <code class="language-plaintext highlighter-rouge">python3</code> despite <code class="language-plaintext highlighter-rouge">python3</code> not being officially supported for <code class="language-plaintext highlighter-rouge">ROS 1.x</code> (see this <a href="https://github.com/ros2/ros2/wiki">github wiki</a>). Getting this to work involves pip installing the necessary <code class="language-plaintext highlighter-rouge">ros</code> dependencies in <code class="language-plaintext highlighter-rouge">python3</code> using this <a href="https://github.com/lakehanne/RAL2017/blob/master/requirements.txt">requirements.txt file</a>. This <a href="https://github.com/lakehanne/RAL2017/blob/master/pyrnn/src">github repo page</a> shows how I do this.</p>
<p>Anyways, so I trained a conv net model in <code class="language-plaintext highlighter-rouge">pytorch</code>, no big deal. I had a <code class="language-plaintext highlighter-rouge">roscpp</code> node running on a different workstation, but within the same <code class="language-plaintext highlighter-rouge">ros</code> network, broadcasting <code class="language-plaintext highlighter-rouge">sensor_msgs/Image</code> <code class="language-plaintext highlighter-rouge">RGB</code> images on a designated topic. Given what I knew, it should have been easy to subscribe to the image topic and forward the video stream through the pre-trained neural network model to obtain classification results. But boy was I wrong.</p>
<p><a name="problem-formulation"></a></p>
<h4 id="importing-torch-into-rospy">Importing <code class="language-plaintext highlighter-rouge">torch</code> into <code class="language-plaintext highlighter-rouge">rospy</code></h4>
<p>When you install <code class="language-plaintext highlighter-rouge">pytorch</code> with <code class="language-plaintext highlighter-rouge">conda</code>, it typically places the installation relative to your <code class="language-plaintext highlighter-rouge">anaconda</code> install path. For me this was in <code class="language-plaintext highlighter-rouge">/home/$USER/anaconda3</code>. So to be able to import <code class="language-plaintext highlighter-rouge">Torch</code> and use <code class="language-plaintext highlighter-rouge">rospy</code>’s <code class="language-plaintext highlighter-rouge">CvBridge</code> class simultaneously, I installed the following modules: <code class="language-plaintext highlighter-rouge">netifaces</code>, <code class="language-plaintext highlighter-rouge">catkin_pkgs</code> and <code class="language-plaintext highlighter-rouge">rospkg</code> via <code class="language-plaintext highlighter-rouge">pip</code> while in the <code class="language-plaintext highlighter-rouge">python3</code> <code class="language-plaintext highlighter-rouge">conda</code> environment. Then I tried to import the convnet model from a different module’s class into a <code class="language-plaintext highlighter-rouge">rospy</code> module I had written.</p>
<blockquote>
<p>to be able to import <code class="language-plaintext highlighter-rouge">Torch</code> and use <code class="language-plaintext highlighter-rouge">rospy's</code> <code class="language-plaintext highlighter-rouge">CvBridge</code> simultaneously, I installed the following modules: <code class="language-plaintext highlighter-rouge">netifaces</code>, <code class="language-plaintext highlighter-rouge">catkin_pkgs</code> and <code class="language-plaintext highlighter-rouge">rospkg</code> via <code class="language-plaintext highlighter-rouge">pip</code></p>
</blockquote>
<p>Say <code class="language-plaintext highlighter-rouge">convnet.py</code> model had entries like so:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">torch</span>
<span class="kn">import</span> <span class="nn">torch.nn</span> <span class="k">as</span> <span class="n">nn</span>

<span class="k">class</span> <span class="nc">ResNet</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
        <span class="c1"># layer setup is elided in this sketch</span>
        <span class="k">pass</span>

    <span class="k">def</span> <span class="nf">convModel</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">arg1</span><span class="p">,</span> <span class="n">arg2</span><span class="p">):</span>
        <span class="s">'''
        define some conv models
        '''</span>

    <span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span>
        <span class="s">'''
        do stuff with conv layers
        '''</span>
        <span class="c1"># prev_layer and self.fc stand in for the elided layers</span>
        <span class="k">return</span> <span class="bp">self</span><span class="p">.</span><span class="n">fc</span><span class="p">(</span><span class="n">prev_layer</span><span class="p">(</span><span class="n">x</span><span class="p">))</span>
</code></pre></div></div>
<p>and <code class="language-plaintext highlighter-rouge">process_images.py</code> file had an import statement like so</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kn">from</span> <span class="nn">convnet</span> <span class="kn">import</span> <span class="n">ResNet</span>
<span class="s">'''
do stuff with imported model
'''</span>
</code></pre></div></div>
<p>I got weird errors like</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="o">>></span> <span class="n">Python</span> <span class="mf">3.6</span><span class="p">.</span><span class="mi">0</span> <span class="p">(</span><span class="n">default</span><span class="p">,</span> <span class="n">Oct</span> <span class="mi">26</span> <span class="mi">2016</span><span class="p">,</span> <span class="mi">20</span><span class="p">:</span><span class="mi">30</span><span class="p">:</span><span class="mi">19</span><span class="p">)</span>
<span class="p">[</span><span class="n">GCC</span> <span class="mf">4.8</span><span class="p">.</span><span class="mi">4</span><span class="p">]</span> <span class="n">on</span> <span class="n">linux2</span>
<span class="n">Type</span> <span class="s">"help"</span><span class="p">,</span> <span class="s">"copyright"</span><span class="p">,</span> <span class="s">"credits"</span> <span class="ow">or</span> <span class="s">"license"</span> <span class="k">for</span> <span class="n">more</span> <span class="n">information</span><span class="p">.</span>
<span class="o">>></span> <span class="n">No</span> <span class="n">module</span> <span class="n">named</span> <span class="n">Torch</span>
</code></pre></div></div>
<p>Huh? Country boy comes to town. But the <code class="language-plaintext highlighter-rouge">convnet.py</code> model imports Torch okay. I figured the problem must be because I installed <code class="language-plaintext highlighter-rouge">pytorch</code> with the <code class="language-plaintext highlighter-rouge">python3</code> version. And so I pulled the <code class="language-plaintext highlighter-rouge">python2</code> version of <code class="language-plaintext highlighter-rouge">pytorch</code> from Soumith’s channel.</p>
<p>Now when I import, it says stuff like <code class="language-plaintext highlighter-rouge">convnet module xx compiled with a different Torch version</code>. What the heck <code class="language-plaintext highlighter-rouge">pytorch</code>?</p>
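<p>When chasing this kind of error, it helps to confirm which interpreter a script is actually running under and where each module would be loaded from. The snippet below is a minimal diagnostic sketch, not part of my original setup: the helper name <code class="language-plaintext highlighter-rouge">probe_modules</code> is hypothetical, and it assumes <code class="language-plaintext highlighter-rouge">python3</code> (it relies on <code class="language-plaintext highlighter-rouge">importlib.util.find_spec</code>, which <code class="language-plaintext highlighter-rouge">python2</code> does not have).</p>

```python
import importlib.util
import sys


def probe_modules(names):
    """Map each module name to the file it would be imported from.

    The value is None when the module cannot be found on sys.path
    (namespace packages may also report None as their origin).
    """
    found = {}
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        found[name] = spec.origin if spec is not None else None
    return found


if __name__ == "__main__":
    # Which python is running this file, e.g. the system one or the conda one?
    print("interpreter:", sys.executable)
    print("version    :", sys.version.split()[0])
    # Module names are the ones discussed in this post.
    for name, origin in probe_modules(["torch", "rospy", "cv_bridge"]).items():
        print("{:10s} -> {}".format(name, origin or "NOT FOUND"))
```

<p>If <code class="language-plaintext highlighter-rouge">torch</code> resolves under one installation and <code class="language-plaintext highlighter-rouge">rospy</code> under another, the two were most likely built for different interpreters.</p>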
<p><a name="solution"></a></p>
<h4 id="solution">Solution</h4>
<p>At this point, I stepped out for a walk, and had a brainwave. What if I do away with the <code class="language-plaintext highlighter-rouge">conda</code> build of <code class="language-plaintext highlighter-rouge">pytorch</code> and instead install <code class="language-plaintext highlighter-rouge">pytorch</code> from source or <code class="language-plaintext highlighter-rouge">PyPI</code>?</p>
<p>It turns out that this is the least error-prone way to import <code class="language-plaintext highlighter-rouge">pytorch</code> models into a <code class="language-plaintext highlighter-rouge">rospy</code> file or indeed a <code class="language-plaintext highlighter-rouge">python2</code> file. To do this, I temporarily moved my <code class="language-plaintext highlighter-rouge">anaconda3</code> folder out of <code class="language-plaintext highlighter-rouge">bash</code>’s native path, pulled the latest <code class="language-plaintext highlighter-rouge">pytorch</code> commit from github and then installed with <code class="language-plaintext highlighter-rouge">python setup.py install</code>.</p>
<p>Now when I try out the above commands, everything works well.</p>
<blockquote>
<p>It turns out that this is the least error-prone way to import Pytorch models into a rospy file or indeed a python2 file. To do this, I temporarily moved my <code class="language-plaintext highlighter-rouge">anaconda3</code> folder out of bash’s native path, pulled the latest pytorch commit from github and then installed with <code class="language-plaintext highlighter-rouge">python setup.py install</code>.</p>
</blockquote>
<p>So my two cents to the robotics community running neural net models in <code class="language-plaintext highlighter-rouge">pytorch</code> or <code class="language-plaintext highlighter-rouge">tensorflow</code> and using such models in <code class="language-plaintext highlighter-rouge">rospy</code> or equivalent environments is to go for the source installation whenever possible. You would save yourself a lot of headaches and wasted time.</p>
Thu, 27 Apr 2017 09:15:00 +0000
http://scriptedonachip.com/pytorch-ros
http://scriptedonachip.com/pytorch-ros
pytorch, ros, torch
Backpropagation and convex programming in MRAS systems
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
TeX: { equationNumbers: { autoNumber: "AMS" } }
});
</script>
<!--Mathjax Parser -->
<script type="text/javascript" async="" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
</script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<h3 id="table-of-contents"><a href="#table-of-contents">Table of Contents</a>:</h3>
<ul>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#nonlinear">Nonlinear (Multivariable) Model Reference Adaptive Systems</a></li>
<li><a href="#problem-formulation">Solving Quadratic Programming in a Backprop setting</a></li>
<li><a href="#slack-variables">Slack Variables</a></li>
<li><a href="#limitation-backprop">QP Layer as the last layer in backpropagation</a></li>
<li><a href="#initialization">QP Algorithm</a></li>
<li><a href="#example-codes">Example Code</a></li>
<li><a href="#acknowledgements">Acknowledgment</a></li>
</ul>
<p><a name="introduction"></a></p>
<h2 id="introduction">Introduction</h2>
<p>The backpropagation algorithm is very useful for general optimization tasks, particularly in neural network function approximators and deep learning applications. Great progress in nonlinear function approximation has been made due to the effectiveness of the backprop algorithm. Whereas in traditional control applications we typically use feedback regulation to stabilize the states of the system, in model reference adaptive control systems we want to specify an index of performance to determine the “goodness” of our adaptation. An auxiliary dynamic system called the <strong>reference model</strong> is used in generating this index of performance (IP). The reference model specifies a given IP in terms of its input and states, and a comparison check determines appropriate control laws by comparing the given IP with the IP measured from the outputs of the adjustable system. The difference between the two evolves in what is called the <strong>error state space</strong>.</p>
<p><a name="nonlinear"></a></p>
<h3 id="nonlinear-model-reference-adaptive-systems">Nonlinear Model Reference Adaptive Systems</h3>
<p>With nonlinear systems, the unknown nonlinearity, say \(f(.)\), is usually approximated with a function approximator such as a single hidden layer neural network. To date, the state of the art for adjusting the weights of a neural network is the <a href="https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/">backpropagation algorithm</a>. The optimization in classical backprop is unrolled end-to-end so that the complexity of the network increases when we want to add an <i>argmin differentiation layer</i> before the final neural network layer. The final layer determines the controller parameters or generates the control laws used in adjusting the plant behavior. The backprop algorithm has no explicit formulation for fitting the control laws within actuator constraints, as model predictive control schemes allow; ideally, we would want to fit a quadratic convex layer to compute controller parameters exactly. We cannot easily fit a convex optimization layer into the backprop algorithm using classical gradient descent because the explicit Jacobians of the gradients of the system’s energy function with respect to system parameters are not exactly formulated (they are instead ordered derivatives, which fluctuate about the global/local minimum as the weights of the network converge).</p>
<blockquote>
<p>The final layer determines the controller parameters or generates the control laws used in adjusting the plant behavior. The backprop algorithm has no explicit formulation for fitting the control laws within actuator constraints, as model predictive control schemes allow; ideally, we would want to fit a quadratic convex layer to compute controller parameters exactly.</p>
</blockquote>
<p>To generate control laws such as torques to control a motor arm in a multi-dof robot-arm for example, we would want to define a quadratic programming layer as the last layer of our neural network optimization algorithm so that effective control laws that exactly fit into actuator saturation limits are generated. Doing this requires a bit of tweaking of the backprop algorithm on our part.</p>
<p><a name="problem-formulation"></a></p>
<h3 id="solving-quadratic-programming-in-a-backprop-setting">Solving Quadratic Programming in a Backprop setting</h3>
<p>When trying to construct a controller for a regulator, or an MRAS system, we may imagine that the control law determination is a search process for a control scheme that takes an arbitrary nonzero initial state to a zero state, ideally in a short amount of time. If the system is controllable, then we may require the controller to take the system from state \(x(t_0)\) to the zero state at time \(T\). The closer \(T\) is to \(t_0\), the more control effort is required to drive the states to zero in time. In most engineering systems, an upper bound is set on the magnitudes of the variables for pragmatic purposes. It therefore becomes impossible to take \(T\) arbitrarily close to \(t_0\) without exceeding the control bounds. Unless we are ready to tolerate high gain terms in the controller parameters, the control is not feasible for arbitrarily small \(T\). So what do we do? To meet the practical bounds manufacturers place on physical actuators, it suffices to manually formulate these bounds as constraints in the control design objectives.</p>
<p>Model predictive controllers have explicit ways of incorporating these constraints into the control design. There are no rules for tuning the parameters of an MRAC system so that the control laws generated in our adjustment mechanism are scaled into the bounds of the underlying actuator.</p>
<p>Since most controller hardware constraints are specified in terms of lower and upper bounded saturation, the QP problem formulated below is limited to inequality constraints. For equality-constrained QP problems, <a href="https://stanford.edu/~boyd/papers/pdf/code_gen_impl.pdf">Mattingley and Boyd</a>, <a href="http://www.seas.ucla.edu/~vandenbe/publications/coneprog.pdf">Vandenberghe’s CVX Optimization</a>, or Brandon Amos’ <a href="https://arxiv.org/pdf/1703.00443.pdf">ICML submission</a> offer good treatments.</p>
<blockquote>
<p>There are no rules for tuning the parameters of an MRAC system so that the control laws generated in our adjustment mechanism are scaled into the bounds of the underlying actuator.</p>
</blockquote>
<p>We define the standard QP canonical form problem with inequality constraints thus:</p>
<p>\begin{align}
\text{minimize} \quad \frac{1}{2}x^TQx + q^Tx
\label{eq:orig}
\end{align}</p>
<p>subject to</p>
<p>\begin{align}
G x \le h \nonumber
\end{align}</p>
<p>where \(Q \in \mathbb{S}^n_+ \) (i.e. a symmetric, positive semi-definite \(n \times n\) matrix), \(q \in \mathbb{R}^n, G \in \mathbb{R}^{p \times n}, \text{ and } h \in \mathbb{R}^p \). With our convex quadratic optimization problem in canonical form, we can use primal-dual interior point methods (PDIPM) to find an optimal solution to such a problem. PDIPMs are the state-of-the-art in solving such problems. Primal-dual methods with the Mehrotra predictor-corrector reliably solve QP embedded optimization problems within 5-25 iterations, without warm-start (<a href="https://stanford.edu/~boyd/papers/pdf/code_gen_impl.pdf">Boyd and Mattingley, 2012</a>).</p>
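<p>To make the canonical form concrete, here is a minimal sketch for the simplest case: a scalar control signal with actuator saturation limits. The function names and numbers are illustrative assumptions, not taken from any library. It writes \(u_{min} \le x \le u_{max}\) as \(Gx \le h\) and, because the problem is one-dimensional with \(Q > 0\), solves the QP in closed form by clipping the unconstrained minimizer; a real multivariable problem would need an actual QP solver.</p>

```python
def saturation_constraints(u_min, u_max):
    """Write u_min <= x <= u_max as G x <= h for a scalar x."""
    G = [1.0, -1.0]        # rows:  x <= u_max  and  -x <= -u_min
    h = [u_max, -u_min]
    return G, h


def solve_scalar_qp(Q, q, G, h):
    """Minimize 0.5*Q*x**2 + q*x subject to G[i]*x <= h[i], assuming Q > 0."""
    lo, hi = float("-inf"), float("inf")
    for g_i, h_i in zip(G, h):
        if g_i > 0:
            hi = min(hi, h_i / g_i)      # g_i * x <= h_i  ->  x <= h_i / g_i
        elif g_i < 0:
            lo = max(lo, h_i / g_i)      # dividing by g_i < 0 flips the sign
        elif h_i < 0:
            raise ValueError("infeasible row: 0 <= h_i with h_i < 0")
    if lo > hi:
        raise ValueError("constraints are infeasible")
    x_unconstrained = -q / Q             # stationary point of the objective
    return min(max(x_unconstrained, lo), hi)


if __name__ == "__main__":
    G, h = saturation_constraints(-1.0, 1.0)
    # The unconstrained minimizer is 5.0, so the actuator saturates.
    print(solve_scalar_qp(2.0, -10.0, G, h))
    # Here the unconstrained minimizer -0.5 is already within bounds.
    print(solve_scalar_qp(2.0, 1.0, G, h))
```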
<p><a name="slack-variables"></a></p>
<h3 id="slack-variables">Slack Variables</h3>
<p>Given \eqref{eq:orig}, one can introduce slack variables, \(s \in \mathbb{R}^p\) as follows,</p>
<p>\begin{align}
\text{minimize} \quad \frac{1}{2}x^TQx + q^Tx
\label{eq:orig1}
\end{align}</p>
<p>subject to</p>
<p>\begin{align}
\quad G x + s = h, \qquad s \ge 0 \nonumber
\end{align}</p>
<p>where \(x \in \mathbb{R}^n, s \in \mathbb{R}^p\). If we let a dual variable \(z \in \mathbb{R}^p \) be associated with the inequality constraint, then we can define the KKT conditions for \eqref{eq:orig1} as</p>
\[Gx + s = h, \quad s \ge 0 \\
z \ge 0 \\
Qx + q + G^T z = 0 \\
z_i s_i = 0, \quad i = 1, \ldots, p.\]
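<p>As a sanity check on these conditions, the sketch below evaluates the KKT residuals for a scalar \(x\) with \(p\) inequality rows, using the stationarity residual \(Qx + q + G^Tz\). The helper name <code class="language-plaintext highlighter-rouge">kkt_residuals</code> and the numbers are assumptions chosen for illustration: minimize \(\frac{1}{2}x^2 - 3x\) subject to \(-1 \le x \le 1\), whose solution sits on the upper bound.</p>

```python
def kkt_residuals(Q, q, G, h, x, s, z):
    """Residuals of the KKT conditions for a scalar-x QP with slack s and dual z.

    Returns (primal, stationarity, complementarity, nonnegative); every
    residual is zero and nonnegative is True at an optimal point.
    """
    primal = [g_i * x + s_i - h_i for g_i, s_i, h_i in zip(G, s, h)]
    stationarity = Q * x + q + sum(g_i * z_i for g_i, z_i in zip(G, z))
    complementarity = [z_i * s_i for z_i, s_i in zip(z, s)]
    nonnegative = all(v >= 0 for v in list(s) + list(z))
    return primal, stationarity, complementarity, nonnegative


if __name__ == "__main__":
    Q, q = 1.0, -3.0
    G, h = [1.0, -1.0], [1.0, 1.0]          # x <= 1 and -x <= 1
    x, s, z = 1.0, [0.0, 2.0], [2.0, 0.0]   # upper bound active, lower bound slack
    print(kkt_residuals(Q, q, G, h, x, s, z))
```

<p>Only the active constraint carries a nonzero multiplier, which is what complementary slackness demands.</p>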
<p>More formally, if we write the Lagrangian of system \eqref{eq:orig} as</p>
<p>\begin{align}
L(x, \lambda) = \frac{1}{2}x^TQx + q^Tx +\lambda^T(Gx -h)
\label{eq:Lagrangian}
\end{align}</p>
<p>it follows that the KKT conditions for <a href="https://www.cs.cmu.edu/~ggordon/10725-F12/slides/16-kkt.pdf">stationarity, primal feasibility and complementary slackness</a> are,</p>
<p>\begin{align}
Q x^\ast + q + G^T \lambda^\ast = 0 ,
\label{eq:KKTLagrangian}
\end{align}</p>
\[K \left(\lambda^\ast\right) \left(G x^\ast - h\right) = 0\]
<p>where \(K(k) = \textbf{diag}(k) \) is an operator that creates a diagonal matrix from the entries of the vector \(k\). Taking the differentials of \eqref{eq:KKTLagrangian} and of the complementary slackness condition, we find that</p>
<p>\begin{align}
dQ x^* + Q dx + dq + dG^T \lambda^* + G^T d\lambda = 0
\label{eq:KKTDiff}
\end{align}</p>
\[K(Gx^* - h)\, d\lambda + K(\lambda^*)\left(dG\, x^* + G\, dx - dh\right) = 0\]
<p><a name="limitation-backprop"></a></p>
<h3 id="qp-layer-as-the-last-layer-in-backpropagation">QP Layer as the last layer in backpropagation</h3>
<p>Writing \eqref{eq:KKTDiff} and the differentiated complementary slackness condition in matrix form, we find</p>
\[\begin{bmatrix}
Q & G^T \\
K(\lambda^\ast) G & K(Gx^\ast - h) \\
\end{bmatrix}
\begin{bmatrix}
dx \\
d\lambda \\
\end{bmatrix}
=
\begin{bmatrix}
-dQ x^\ast - dq - dG^T \lambda^\ast \\
-K(\lambda^\ast) dG x^\ast + K(\lambda^\ast) dh \\
\end{bmatrix}\]
<p>so that the Jacobians of the variables to be optimized can be formed with respect to the states of the system. Finding \(\dfrac{\partial x^\ast}{\partial h}\), for example, would involve setting \(dh\) to the identity and setting the other differential terms on the rhs of the equation above to zero. After solving the equation, the desired Jacobian would be \(dx\). With backpropagation, however, the explicit Jacobians are of little use since the gradients of the network parameters are computed using the chain rule for <i>ordered derivatives</i> i.e.</p>
\[\dfrac{\partial ^+ J}{ \partial h_i} = \dfrac{\partial J}{ \partial h_i} + \sum_{j > i} \dfrac{\partial ^+ J}{\partial h_j} \dfrac{ {\partial} h_j}{ \partial h_i}\]
<p>where the derivatives with superscripts denote <i>ordered derivatives</i> and those without superscripts denote ordinary partial derivatives. The ordinary partial derivatives capture the direct effect of \(h_i\) on \(h_j\) through the <i>linear set of equations</i> that determine \(h_j\). To illustrate further, suppose that we have a system of equations given by</p>
\[x_2 = 3 x_1 \\
x_3 = 5 x_1 + 8 x_2\]
<p>The ordinary partial derivatives of \(x_3\) with respect to \(x_1\) would be \(5\). However, the ordered derivative of \(x_3\) with respect to \(x_1\) would be \(29\) (because of the indirect effect by way of \(x_2\)).</p>
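<p>The recursion for ordered derivatives can be sketched in a few lines of plain Python. The function and the dictionary layout below are assumptions made for illustration: <code class="language-plaintext highlighter-rouge">direct[j][i]</code> stores the ordinary partial derivative of \(x_j\) with respect to \(x_i\), and the target plays the role of \(J\). A backward sweep accumulates the indirect effects exactly as in the formula above.</p>

```python
def ordered_derivatives(direct, target):
    """Ordered derivatives of x_target with respect to every earlier x_i.

    direct[j][i] is the ordinary partial of x_j w.r.t. x_i (i < j);
    missing entries are treated as zero.
    """
    ordered = {target: 1.0}
    # Sweep backwards so every ordered[j] with j > i is ready when needed.
    for i in range(target - 1, 0, -1):
        total = direct.get(target, {}).get(i, 0.0)          # direct effect
        for j in range(i + 1, target):                      # indirect effects
            total += ordered[j] * direct.get(j, {}).get(i, 0.0)
        ordered[i] = total
    return ordered


if __name__ == "__main__":
    # x_2 = 3 x_1  and  x_3 = 5 x_1 + 8 x_2
    direct = {2: {1: 3.0}, 3: {1: 5.0, 2: 8.0}}
    result = ordered_derivatives(direct, target=3)
    print("ordinary partial of x3 wrt x1:", direct[3][1])
    print("ordered derivative of x3 wrt x1:", result[1])
```

<p>The ordered derivative with respect to \(x_1\) is the direct term \(5\) plus the indirect term \(8 \times 3\).</p>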
<p>So with the backprop algorithm, we would form the left matrix-vector product with a previous backward pass vector, \(\frac{\partial J}{\partial x^\ast} \in \mathbb{R}^n \); this is mathematically equivalent to \(\frac{\partial J}{ \partial x^\ast} \cdot \frac{\partial x^\ast}{ \partial h} \). Solving for the backward-pass variables \(d_x, d_\lambda\) then amounts to inverting the transpose of the matrix in the differentiated KKT system above,</p>
\[\begin{bmatrix}
d_x \\ d_\lambda
\end{bmatrix}
=
-\begin{bmatrix}
Q & G^T K(\lambda^\ast) \\
G & K(Gx^\ast - h)
\end{bmatrix}^{-1}
\begin{bmatrix}
{\dfrac{\partial J}{\partial x^\ast}}^T \\ 0
\end{bmatrix}.\]
<p>The relevant gradients with respect to every QP parameter are given by</p>
\[\dfrac{\partial J}{\partial q} = d_x, \qquad \dfrac{\partial J}{ \partial h} = -K(\lambda^\ast) d_\lambda \\
\dfrac{\partial J}{\partial Q} = \frac{1}{2}(d_x x^{\ast T} + x^\ast d_x^T), \qquad \dfrac{\partial J}{\partial G} = K(\lambda^\ast) d_\lambda x^{\ast T} + \lambda^\ast d_x^T\]
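<p>To see these gradient expressions at work, the sketch below runs a forward and a backward pass through a one-dimensional QP with a single constraint \(gx \le h\). Everything here is an illustrative assumption rather than the implementation used in my code: the function names are made up, the loss is taken as \(J = \frac{1}{2}{x^\ast}^2\), and the backward pass uses the sign convention \([d_x, d_\lambda]^T = -M^{-1} [\partial J / \partial x^\ast, 0]^T\), with \(M\) the transposed KKT matrix, so that \(\partial J/\partial q = d_x\) and \(\partial J / \partial h = -\lambda^\ast d_\lambda\). The result is compared against a finite difference.</p>

```python
def solve_2x2(a, b, c, d, r1, r2):
    """Solve [[a, b], [c, d]] [u, v]^T = [r1, r2]^T by Cramer's rule."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular system (degenerate complementarity)")
    return (r1 * d - b * r2) / det, (a * r2 - c * r1) / det


def forward(Q, q, g, h):
    """Solve min 0.5*Q*x^2 + q*x s.t. g*x <= h (Q > 0, g > 0); return x*, lambda*."""
    x_unc = -q / Q
    if g * x_unc <= h:                    # constraint inactive
        return x_unc, 0.0
    x = h / g                             # constraint active
    return x, -(Q * x + q) / g            # multiplier from stationarity


def backward(Q, q, g, h, x, lam, dJ_dx):
    """Return (dJ/dq, dJ/dh) from the transposed KKT system."""
    d_x, d_lam = solve_2x2(Q, g * lam, g, g * x - h, -dJ_dx, 0.0)
    return d_x, -lam * d_lam


if __name__ == "__main__":
    Q, q, g, h = 2.0, -10.0, 1.0, 1.0
    x, lam = forward(Q, q, g, h)
    dJ_dq, dJ_dh = backward(Q, q, g, h, x, lam, dJ_dx=x)   # J = 0.5 * x^2
    eps = 1e-6
    loss = lambda h_: 0.5 * forward(Q, q, g, h_)[0] ** 2
    fd = (loss(h + eps) - loss(h - eps)) / (2 * eps)
    print("analytic dJ/dh:", dJ_dh, " finite difference:", fd)
```

<p>With the constraint active, the loss responds to \(h\) but not to \(q\); with the constraint inactive the roles swap, which is the behaviour the multiplier \(\lambda^\ast\) encodes.</p>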
<p><a name="initialization"></a></p>
<h3 id="qp-initialization">QP Initialization</h3>
<p>For the primal problem,</p>
\[\text{minimize} \quad \frac{1}{2}x^T Q x + q^T x + \frac{1}{2}\|s\|^2_2 \\
\text{ subject to } \quad Gx + s = h\]
<p>with \(x\) and \(s\) as variables to be optimized, the corresponding dual problem is,</p>
\[\text{maximize} \quad -\frac{1}{2}w^T Q w - h^T z - \frac{1}{2}\|z\|^2_2 \\
\text{ subject to } \quad Qw + G^T z + q = 0\]
<p>with variables \(w\) and \(z\) to be optimized.</p>
<p><a name="optimization-steps"></a></p>
<h3 id="optimization-steps">Optimization Steps</h3>
<ul>
<li>When the primal and dual starting points \(\hat{x}, \hat{s}, \hat{y}, \hat{z} \) are unknown, they can be initialized as proposed by Vandenberghe in <a href="http://www.seas.ucla.edu/~vandenbe/publications/coneprog.pdf">cvxopt</a>; namely, we solve the following linear equations</li>
</ul>
\[\begin{bmatrix}
G & -I \\
Q & G^T
\end{bmatrix}
\begin{bmatrix}
x \\
z \\
\end{bmatrix}
=
\begin{bmatrix}
h \\
-q \\
\end{bmatrix}\]
<p>with the assumption that \(\hat{x} = x,\hat{y} = y\).</p>
<p>The initial value of \(\hat{s}\) is computed from the residual \(h - Gx = -z\), as</p>
\[\hat{s} = \begin{cases}
-z & \text{if } \alpha_p < 0 \\
-z + (1+\alpha_p)\textbf{e} & \text{otherwise}
\end{cases}\]
<p>for \(\alpha_p = \inf \{ \alpha \mid -z + \alpha \textbf{e} \succeq 0 \} \).</p>
<p>Similarly, \(z\) at the first iteration is computed as follows</p>
\[\hat{z} = \begin{cases}
z & \text{if } \alpha_d < 0 \\
z + (1+\alpha_d)\textbf{e} & \text{otherwise}
\end{cases}\]
<p>for \(\alpha_d = \inf \{ \alpha \mid z + \alpha \textbf{e} \succeq 0 \} \).</p>
<p>Note that \(\textbf{e}\) is the vector of ones (the identity element of the nonnegative orthant).</p>
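<p>The two shifting rules above share the same structure, so they can be sketched with one helper. This is a minimal illustration under the assumption that \(\textbf{e}\) is the vector of ones, in which case \(\inf \{ \alpha \mid v + \alpha \textbf{e} \succeq 0 \} = -\min_i v_i\); the function name <code class="language-plaintext highlighter-rouge">shift_to_interior</code> is hypothetical.</p>

```python
def shift_to_interior(v):
    """Return v unchanged if it is strictly positive, else v + (1 + alpha) * e.

    alpha = -min(v) is the smallest shift that makes v + alpha * e nonnegative,
    with e taken as the vector of ones.
    """
    alpha = -min(v)
    if alpha < 0:                 # every entry of v is already positive
        return list(v)
    return [v_i + (1.0 + alpha) for v_i in v]


if __name__ == "__main__":
    z = [2.0, -0.5]                               # z from the linear solve
    s_hat = shift_to_interior([-z_i for z_i in z])
    z_hat = shift_to_interior(z)
    print("s_hat:", s_hat)
    print("z_hat:", z_hat)
```

<p>After the shift, the smallest entry of each starting point equals one, so both \(\hat{s}\) and \(\hat{z}\) start strictly inside the feasible cone.</p>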
<ul>
<li>
<p>Following Boyd and Mattingley’s convention, we can compute the affine scaling directions by solving the system,</p>
\[\begin{bmatrix}
G &I &0\\
0 &K(z) & K(s) \\
Q &0 &G^T
\end{bmatrix}
\begin{bmatrix}
\Delta x^{aff} \\
\Delta s^{aff} \\
\Delta z^{aff}
\end{bmatrix}
=
\begin{bmatrix}
-(Gx + s - h) \\
-K(s)z \\
-(G^Tz + Qx + q)
\end{bmatrix}\]
<p>with \( K(s) \text{ as } \textbf{diag}(s) \text{ and } K(z) \text{ as } \textbf{diag}(z) \)</p>
</li>
<li>
<p>The centering-plus-corrector directions can be used to efficiently compute the primal and dual variables by solving</p>
\[\begin{bmatrix}
G &I &0\\
0 &K(z) & K(s) \\
Q &0 &G^T
\end{bmatrix}
\begin{bmatrix}
\Delta x^{cc} \\
\Delta s^{cc} \\
\Delta z^{cc}
\end{bmatrix}
=
\begin{bmatrix}
0 \\
\sigma \mu \textbf{e} - K(\Delta s^{aff}) \Delta z^{aff} \\
0
\end{bmatrix}\]
<p>where</p>
<p>\begin{align}
\sigma = \left(\dfrac{(s+ \alpha \Delta s^{aff})^T(z + \alpha \Delta z^{aff})}{s^Tz}\right)^3, \qquad \mu = \dfrac{s^Tz}{p} \nonumber
\end{align}</p>
<p>and the step size \(\alpha = \sup \{\alpha \in [0, 1] \mid s + \alpha \Delta s^{aff} \ge 0, \, z + \alpha \Delta z^{aff} \ge 0\}. \)</p>
</li>
<li>
<p>Finding the primal and dual variables is then a question of composing the two updates in the foregoing, \(\Delta x = \Delta x^{aff} + \Delta x^{cc}\) (and likewise for \(s\) and \(z\)), to yield</p>
\[x \leftarrow x + \alpha \Delta x, \\
s \leftarrow s + \alpha \Delta s, \\
z \leftarrow z + \alpha \Delta z.\]
</li>
</ul>
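<p>The step-size rule and the centering parameter in the steps above can be sketched without a linear algebra library. The function names below are hypothetical, the affine directions are made-up numbers rather than the output of a real solve, and the duality measure \(\mu = s^Tz/p\) follows Mattingley and Boyd’s convention, which is an assumption on my part about the intended definition.</p>

```python
def max_step(v, dv, cap=1.0):
    """Largest alpha in [0, cap] keeping v + alpha * dv >= 0, for v >= 0."""
    alpha = cap
    for v_i, dv_i in zip(v, dv):
        if dv_i < 0:                      # only decreasing entries can hit zero
            alpha = min(alpha, -v_i / dv_i)
    return alpha


def centering_parameter(s, z, ds_aff, dz_aff):
    """Return (alpha, sigma, mu) for the centering-plus-corrector step."""
    alpha = min(max_step(s, ds_aff), max_step(z, dz_aff))
    gap_now = sum(s_i * z_i for s_i, z_i in zip(s, z))
    gap_aff = sum((s_i + alpha * ds_i) * (z_i + alpha * dz_i)
                  for s_i, ds_i, z_i, dz_i in zip(s, ds_aff, z, dz_aff))
    sigma = (gap_aff / gap_now) ** 3      # small when the affine step works well
    mu = gap_now / len(s)                 # average complementarity gap
    return alpha, sigma, mu


if __name__ == "__main__":
    s, z = [1.0, 1.0], [1.0, 1.0]
    ds_aff, dz_aff = [-2.0, 0.0], [0.0, -0.5]
    print(centering_parameter(s, z, ds_aff, dz_aff))
```

<p>A small \(\sigma\) means the affine direction already reduces the complementarity gap substantially, so little centering is added to the combined step.</p>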
<p><a name="example-codes"></a></p>
<h3 id="example-code">Example code</h3>
<p>An example implementation of this algorithm in the PyTorch Library is available on my <a href="https://github.com/lakehanne/RAL2017/blob/devel/pyrnn/src/model.py">github page</a>.</p>
<p><a name="acknowledgements"></a></p>
<h3 id="acknowledgment">Acknowledgment</h3>
<p>I would like to thank <a href="https://bamos.github.io/">Brandon Amos</a> of the CMU Locus Lab for his generosity in answering my questions while using his <a href="https://locuslab.github.io/qpth/">qpth OptNET framework</a>.</p>
Wed, 05 Apr 2017 11:15:00 +0000
http://scriptedonachip.com/QP-Layer-MRAS
http://scriptedonachip.com/QP-Layer-MRAS
convex, qpth, backpropagation