Solving Optimization Problems with Chemotaxis

12 Jan 2020

When I started my career as a control engineer, a very wise man once told me a little trick, how to solve almost any real-life optimization problem with an algorithm that is so simple that it just takes 6 lines of code and you can remember it for the rest of your life once you understand how it works.

I’ve since passed on this wisdom to almost all students I supervised and I find it so valuable that I will share it here.

The best thing about it is that by understanding and experimenting with Chemotaxis you’ll also learn a lot about optimization in general and get a very good foundation of more advanced techniques.

Having this simple algorithm in my personal toolbox saved me a huge amount of time. I used it to solve a lot (!) of optimization problems. If you’re an engineer or work in science this is something you have to know! So let’s see how it works.

Motivation

In mathematical optimization, our job is usually to find values for a certain number of variables $\mathbf{x} = x_1, x_2, …$ such that an objective function (or cost function) $f$ is either maximized or minimized.

$$\underset{\mathbf{x}}{\arg\min} \ {f(\mathbf{x})}$$

For example, let’s say we want to figure out the production rate of different dependent machines in a factory $x_1, x_2, …$ so that the output of that factory $f$ is maximized.

In most cases trying every combination of $x_1, x_2, …$ is not a viable option, because usually it will take time to compute $f$ and we want to get a result within finite time. So what is Chemotaxis and how can it inspire a better solution than just brute force?

In biology, Chemotaxis is “the movement of an organism in response to a chemical stimulus”. “Taxis” means movement. It refers to the concept that an organism will adjust its random movement based on some substance it can sense, for example, food.

Chemotaxis is the movement of an organism in response to a chemical stimulus.

So I imagine it probably looks something like this (mmmhh delicious green goo):

This E. coli is very happy because it just found some food thanks to a process called Chemotaxis. Thank you Chemotaxis. — This E. coli is very happy because it just found some food thanks to a process called **Chemotaxis**. Thank you Chemotaxis.

So apparently in nature, there is a mechanism that allows bacteria to find food through random movement. However, how does it know, which direction is better?

Chemotaxis optimization in a nutshell

So here is the basic idea: You’re a bacterium and you try to find the place that has the most food, so you just walk into a random direction with random distance, whereas the probability of traveling a longer distance is declining with the distance traveled. Next, you probe your immediate surroundings for food. Did you find less food? Not good, go back to the previous position. Repeat the process until no improvement is found. Did you find more food? Good, stay there.

That’s it! If you can remember that, you now have an extremely handy algorithm in your personal toolkit too.

To put it into action, let’s translate it into an algorithm. Let’s assume for now that we want to minimize $f$ (as opposed to maximizing $f$), then we have to do the following steps:

Start with the best estimate you can make for the input parameters $\mathbf{x} := \mathbf{x}_{good}$
Set cost to a high initial value $c_{good} := 10^{99}$
Repeat:
- Add random movement $\mathbf{x} := \mathbf{x} + \mathbf{v} \cdot \mathbf{r}()$, whereas $\mathbf{r}$ is a random number from normal distribution and $\mathbf{v}$ is the variance of that distribution
- $c = f(\mathbf{x})$ - Calculate $f$ based on $\mathbf{x}$ and save as $c$
- If $c$ < $c_{good}$:
  - Set $\mathbf{x}_{good}$ to $\mathbf{x}$
  - Set $c_{good}$ to $c$

Or as code for the 1-dimensional case (one variable subject to optimization):

function optimize_1D( costfunc, x_good, x_var = 0.1, n_max = 50 ) {
  let c_good = 1e99;
  for ( let n = 0; n < n_max; n++ ) {
    let x = x_good + x_var * randn();
    let c = costfunc(x);
    if ( c < c_good ) {
      x_good = x;
      c_good = c;
    }
  }
  return x_good;
}

This is JavaScript, but it could be easily adapted to MATLAB or Python. If we test the code now with a simple example:

optimize_1D(x => (x - 3) ** 2, 1.0)
// Output: 2.998974026430649

Sure enough, we get an approximation of the correct result, which is precisely 3.

You might ask two questions at this point: How many iterations should you do and how would you choose the variance x_var?

The ideal number of iterations really depends. I recommend you start with 10 to see if everything is working and then increase to 50 - 100. You can always save the final values of $x$ and $c$ and use them as input for more iterations. You want to stop when you don’t see any improvement (no better values of $c$ are found for many iterations).

For the variance, my recommendation is to set it one order of magnitude lower than your $x_{good}$. So let’s say you expect your $x$ to be between 100 and 1000, then set the variance to 10. If you see $x$ “jumping around a value”, then decrease by a factor of 10. If $x$ only goes in one direction, then increase by a factor of 10. Visualizing each step helps a lot to see what’s going on.

In any case, I highly encourage you to experiment. Chemotaxis can be a great teacher and you’ll gain a quick understanding of how the values affect the result.

Example: Optimal Skyscrapers

Last year I moved to New York City and one thing that fascinated me is a phenomenon called needle towers: A new breed of buildings that have recently emerged and that shape the skyline of Manhattan. Increasingly high, super-skinny buildings for the ultra-rich. So let’s see what the economy behind these buildings is and figure out, what the optimal height would be from the point of view of a real-estate developer (Ok, I admit the example is a little bit artificial, but hey, I tried).

Ok, first let’s come up with a cost function for building a skyscraper in Manhattan. To make it a little bit more interesting we not only have to figure out the optimal height $h$, but also the ratio of apartments to commercial area (stores, boutiques, restaurants, etc.). We express this with $r$, which is the commercial area divided by the total area of usable space. So for example, 90% apartments and 10% commercial area corresponds to $r = 0.1$.

Of course the more commercial space we develop, the fewer apartments we can build, which means less revenue from rent but more revenue from shops (as long as there are enough people to buy things). A cost function could then look something like this:

function objective( h, r ) {
  // Revenue from apartments
  let Em = 0.5 * ( 1 - r ) * h * Math.sqrt( h );

  // Revenue from shops
  let Es = 100 * Math.sqrt( Em ) * r;

  // Land
  let Kl = 500;

  // Construction
  let Kc = 250 + (700/(200*200)) * h * h;

  // Air rights
  let Ka = (200/220) * h;

  let K = Kc + Ka + Kl;
  let E = Em + Es;

  return E - K
}

where:

$E_m$: Earning from renting out apartments. More apartments ($1 - r$) means more rent and obviously the higher the more rent we can charge ($\frac{1}{2} h \sqrt{ h }$).
$E_s$: Earning from shops. The more shops the more earning, but only when there are people to buy stuff, so $E_s$ depends on $E_m$.
$K_l$: Cost of land (fixed)
$K_c$: Cost of construction. The higher the more expensive.
$K_a$: Cost of air rights (yes, that’s a thing in NYC)
$E - K$: Total profit in Million $

Now let’s implement the chemotaxis optimizer with our new knowledge:

let h_good = 100; // First guess: 100m
let r_good = 0.0; // First guess: no shops
let h_var = 2;    // Variance for changes in height
let r_var = 0.02; // Variance for changes in r
let profit_good = -1e99;
let log = [];

for ( let n = 0; n < 10; n++ ) {

  // update h and r
  let h = h_good + h_var * randn();
  let r = r_good + r_var * randn();

  // limit r between 0 and 1
  r = limit(r, 0, 1);

  // calculate objective function
  let profit = objective(h, r);

  // if profit is higher than any previous profit,
  // update h and r to new value
  if ( profit > profit_good ) {
    h_good = h;
    r_good = r;
    profit_good = profit; // Don't forget to also update profit_good
  }

  // save results for visualization
  log.push([steps, h, r, profit, profit > profit_good]);
  steps += 1;
}

We can imagine that $h$ and $r$ correspond to the x- and y-coordinate of the bacterium looking for food (profit). Also, for implementation tips for randn and limit see at the end of this article

Alright, now let’s run the optimizer. I implemented it in JavaScript so you can play around with it a little bit. First press Run 10 steps and try to understand what happens. Change the Variance by pressing + Var a few times, as you can see, the steps are now much bigger. You can see that this speeds up the process, but also fewer improvements are found. I encourage you to play a little bit. What would be a good variance? When does the optimization fail? Also, try to decrease the variance when you get the feeling that you’re close to the optimal solution.

It's alive! Beautiful! The top plot shows h (x-axis) and r (y-axis). The bottom plot shows the cost c (y-axis) over iterations (x-axis). Orange dots are improvements, violets dots are worse than the current best solution.

You should get a profit of approximately $ 540 Mio., an optimal height of 315 m, and an optimal shop ratio of 0.4. That sounds about right. One of the first needle towers 432 Park Avenue is 392.1m high and cost $ 1.25 billion to build. Compared to our $ 2.77 billion ($K_l = 500$, $K_c = 1986$, $K_a = 286$).

If you want to dig deeper into the example, you can find the source code here. It’s written in Vue.JS, an awesome frontend framework I highly recommend.

Use Chemotaxis like a boss

You feel like you’re ready to take it to the next level? Okay… time to use Chemotaxis like a boss.

There are a few things that can be improved, but I leave that up to you as a little challenge. So, here are a few ideas for improvements:

Implement a generalized $n$-dimensional version based on vectors
Implement a reusable class (in MATLAB, Python or JavaScript)
Automatically stop when there is no improvement found, for example, based on tolerance or when no improvement is found for $n$ consecutive steps.
Automatically decrease all variances by a factor of 10 after a certain number of iterations.
Come up with a meta optimizer for the variance. For example, automatically decrease the variance of a single variable when it keeps alternating around a certain value and increase it, when it always changes in one direction for each improvement.
Think about global optimization. Currently, the algorithm performs a local optimization. A naive idea for global optimization would be to sprinkle many bacteria over the optimization area and let them evolve individually. Maybe removing bacteria in overcrowded areas.
Think about how knowledge of the slope of the optimization landscape could be used.
Try other random distributions. Why does it make sense to use a gaussian distribution?
Try to find a cost function for the The Big Bend 😳

If you have implemented all these improvements, it might as well be a good idea to implement a more advanced optimization algorithm ;-)

From my experience however, keeping it simple is often the best.

Conclusion

Chemotaxis is an extremely simple algorithm, which you can implement in pretty much any programming language without additional libraries. You’ll be able to understand 100% what’s going on, you can easily experiment with different strategies and it works in most cases… at least most day-to-day engineering problems, since computing $f$ is not super critical.

I feel honored that this wise man once shared this little piece of wisdom with me. Now I’ve shared it with you, so please feel free to share this blog post too, if you think you’ve learned something.

Appendix

The functions randn and limit can be implemented in JavaScript like this:

function randn() {
  var u = 0, v = 0;
  while(u === 0) u = Math.random(); //Converting [0,1) to (0,1)
  while(v === 0) v = Math.random();
  return Math.sqrt( -2.0 * Math.log( u ) ) * Math.cos( 2.0 * Math.PI * v );
}

For Python you can use numpy.random.randn

function limit(v, l, u) {
  return Math.min( Math.max(l, v), u );
}