Monday, February 12, 2018

Adaptive algorithms in NDB and in cars

The world is brimming with the ideas of self-driving cars and all sorts of
other concepts where computers are supposed to take care of
decision making.

This makes a bit worried, for one because I simply like to drive and
would not want a computer to interfere with my driving. I am already
quite irritated by many automatic things in cars that don't really work
when winter is upon in Sweden :)

Anyways this post is not about that, this post is more about the general
problem of designing adaptive algorithms.

I've been designing NDB software for more than 20 years. During the
course of these years I have learned a bit about what is optimal
when executing NDB. Most of the software I write today is about
putting this knowledge into the NDB software itself.

This is a trend in databases today to automate configuration handling
in a DBMS. In NDB we started this trend in MySQL Cluster 7.4
when we implemented a "top" facility inside the NDB data nodes.
At the same time we also keep track of lags in writing to disk.

We used this knowledge to design an adaptive algorithm that changes
the speed of writing local checkpoints based on the current CPU usage
and IO lag.

We moved on in 7.5 and implemented an adaptive algorithm to control
from where sending will happen. This algorithm is also based on
keeping track of CPU usage in the threads in the data node.

The new partial LCP algorithm is also highly adaptive where it decides
how incremental the LCP should be based on the writing in the
database.

There is also work ongoing on some adaptiveness in the NDB API
where some threads will start up to assist the receive thread in the NDB
API when it gets overloaded.

There is even more work ongoing to ensure that the checkpoint speed
adapts also to conditions where we are starting to run out of REDO log.

Now the idea of adaptive algorithms is clearly a good idea, but, and there
is a big but, there are two problems with ANY adaptive algorithm.

The first problem is oscillation. Adaptive algorithms works by changing
the environment based on input from the environment. When you look
at an adaptive algorithm that works it is actually quite impressive. By
merely finding the proper conditions based on the input you can get a
system that quickly adapts to any working condition and finds a new
optimal spot to work in.

My original study at the university was mathematical statistics.
One important fact in most mathematical statistics is that you
have stable states and you have transient states.

An adaptive algorithm will work fine as long as the frequency of
changes in the environment is not faster than the time it takes to
find a new stable state.

As an example in the algorithms in NDB, most of them takes
decisions to change the environment about once per second.
One important thing to make those adaptive algorithms better
at adapting is to not change the controls to much. If one base
the decision on what to do the next second only on the last
second the adaptive algorithm is quite likely to
self-oscillate.

Thus it is important to build in some inertia in the adaptive
algorithm. This protects the algorithm from going wild.
But it doesn't make it adapt to conditions that change
quicker than the change frequency. Adaptive algorithms
cannot handle that.

So this is the first problem, to ensure that the adaptive
algorithm is quick enough to change to handle the
changing environment, but not so quick that it starts to
self-oscillate.

The second problem is when two adaptive algorithms
crash into each other. As an example in NDB we have a
problem when CPU load is extremely high due to
application activity while at the same time we are
coming close to the limit of the REDO log. In this case
we have two adaptive algorithms that conflict, one wants
to decrease the checkpoint speed to keep the application
activity while the other algorithm tries to slow down the
checkpoint activity to avoid running out of REDO log.

Now in a car the bets are higher, its human lifes involved.
Almost the same problem a self-driving car will have to
solve when the driver has decided on the speed he wants
to travel while at the same time the control of the car sees
dangers coming up ahead. These dangers could be other
cars, cliffs or any other thing.

Sometimes cars even have to make decision on whether
its own passengers should survive or whether the by-stander
should survive.

So the software of a self-driving car and any other
self-controlling software suffers from two big problems
to solve.

1) How often should I take input from the environment and
decide to change the controller parameters.
2) How should I handle conflicting requirements

Failure in handling 1) will lead to self-oscillating
behaviour and failure to handle 2) will lead to
crashes.

So hopefully any developer of self-driving cars has read up
a lot on adaptive algorithms and know exactly when the
algorithm is safe and when it isn't.

Personally I always feel a bit uneasy about any adaptive
algorithm since I know that it is almost impossible to
predict exactly how it is going to behave in all situations.

The mathematics involved in understanding adaptive
algorithms requires a lot of understanding of differential
equations.

No comments: