Welcome to Arkanis Development


sdt_dead_reckoning.h - Single header file library to create signed distance fields


Another small C function that came out of the project I'm currently working on. This one is to create a signed distance field out of a black and white image (see pictures below). Signed distance fields are quite useful for rendering 2D or 3D shapes on the GPU. I came across them while investigating text rendering but they're very versatile constructs.

While searching for a function to create one I stumbled across the paper "The dead reckoning signed distance transform" by George J. Grevera. It contained a quite nice pseudo-code implementation of the algorithm so I ended up implementing it in C. This is how code using it looks like:

// Load the mask image with stb_image.h
int width = 0, height = 0;
uint8_t* mask = stbi_load("mask.png", &width, &height, NULL, 1);

// Allocate and create the distance field for the mask. Pixels > 16 in the
// mask are considered inside. Negative distances in the distance field are
// inside, positive ones outside.
float* distance_field = malloc(width * height * sizeof(float));
sdt_dead_reckoning(width, height, 16, mask, distance_field);

// Create an 8 bit version of the distance field by mapping the distance
// -128..127 to the brightness 0..255
uint8_t* distance_field_8bit = malloc(width * height * sizeof(uint8_t));
for(int n = 0; n < width * height; n++) {
    float mapped_distance = distance_field[n] + 128;
    float clamped_distance = fmaxf(0, fminf(255, mapped_distance))
    distance_field_8bit[n] = clamped_distance;

The distance field itself is written into an float buffer. I've added the "mapping" part above because in most use cases you want to store the distance field not as floats but as something else (e.g. as 8 bit image). The function leaves that to you because it would make the API quite a lot more bloated.

Below is an example of an 500x500 pixel black and white mask image and the distance field computed for it. On my old Intel Core i5-3570K @3.40GHz it took about 5ms to compute it (single threaded, compiled with -O2 -ffast-math). Not useful for fancy realtime stuff but fast enough for my needs. And above all: Simple to use. :)

A mask image for a 2D shape. White pixels are considered inside the shape, black pixels outside (pretty intuitive for humans).
The resulting distance field. Each pixel contains the distance (in pixels) to the outline. Gray (128) is a distance of 0, white (255) is 127 pixels outside of the outline and black (0) is 128 pixels inside the outline.

While it doesn't look like much a distance field is really easy to render with a GLSL shader. And you get high-quality anti-aliasing almost for free. Another very handy thing is that you can shrink or grow the shape by simply adding or subtracting a value from the distance in the shader. And a lot of other cool things.

Without going into to much detail… if you have to render 2D shapes look into signed distance fields. :)

iir_gauss_blur.h - A Gaussian blur single header file library


A few days ago I needed a C function to blur an image. I had an old implementation of a Gaussian blur filter lying around so I reused that. It worked rather well and it's just one C function so I decided to clean it up and package it into a single header file library: iir_gauss_blur.h. This is what it looks like:

int width = 0, height = 0, components = 1;
uint8_t* image = stbi_load("foo.png", &width, &height, &components, 0);

float sigma = 10;
iir_gauss_blur(width, height, components, image, sigma);

Here stb_image.h is used to load the image before it is blurred. iir_gauss_blur.h compiles as C and C++ so you can just plug it in and use it (tested it on GCC only though).

The function is an implementation of the paper "Recursive implementation of the Gaussian filter" by Ian T. Young and Lucas J. van Vliet. It has nothing to do with recursive function calls, instead it's a special way to construct a filter. Other (convolution based) gauss filters apply a kernel for each pixel and the kernel grows as the blur strength (sigma) gets larger. Meaning their performance degrades the more blurry you want your image to be.

Instead the algorithm in the paper sweeps across the image four times: From left to right, right to left, top to bottom and bottom to top. This is enough to propagate the blur across the entire image, no matter how large or small the blur strength (sigma) is. Thanks to that the performance always stays the same. You can have some ridiculously large sigmas without any performance hit.

The meaning of life… eh, sigma (σ)

Sigma (σ) is a number (at least 0.5) that defines the blur strength:

The same image blurred with different sigmas. The original (top left), sigma 1.0 (top right), sigma 5.0 (bottom left) and sigma 10.0 (bottom right).

It seems the usual approach is to start out with some sigma value and then adjust it until you get the blurriness you want.

There's also the concept of a "blur radius" but I have no idea what exactly it's supposed to mean. I suspect it's the kernel size for convolution based algorithms. There seem to be several rules of thumb out there, e.g. radius = 2 * sigma. So if you want to have a blur radius of 10px you can use sigma = (1.0 / 2.0) * radius to get the sigma for it (5.0). I also found other interesting equations here and here (again for kernel sizes I guess).

All this got me thinking about what a useful "blur radius" would actually mean for me. Mentally I think that the filter blurs each pixel and I want to configure the radius by which a pixel is blurred. In my current project I have two difference scenarios where that's important for me:

I tried to solve the equation of the normal distribution to calculate the proper sigma for a given radius and threshold (e.g. 16/256 or 1/256 as above). But after a bit of tinkering (and a lot of math confusion) I gave up and went the empirical way: I made a test image and blurred it with different sigmas. Then used GIMP to measured how far the white seeped into the black region (again for the thresholds of 16 and 1). This is what I got for the threshold of 16:

A white bar (top and bottom) blurred with different sigmas. The red bars show the distance from the original white bar until the brightness drops below 16.

I did the measurements one image (and sigma) at a time. Combining many into the image above makes the lower gray values a bit more visible than they're otherwise. Anyway, I did the same for the threshold of 1 and plugged it into Wolfram Alpha to approximate the values with a linear equation. Again, after some fiddling around this is what I got:

sigma = (1.0 / 1.42) * radius16
sigma = (1.0 / 3.66) * radius1

It seems to work more or less well for a sigma up to 100. At least that's good enough for my use cases.

But given how crude these approximations are take them with a grain of salt. Or even better a whole tee spoon of salt. They work for me but might not work for anything you have in mind. That's why I decided to leave these calculations out of the iir_gauss_blur() function.

Just in case they work for you and you also want to use them via the command line I created a corresponding CLI tool.

By the way: The algorithm looks like it could be implemented quite nicely on a GPU with OpenCL or compute shaders. I assume for small sigmas a convolution based GLSL implementation is probably faster. It can use the bilinear interpolation of the hardware and is probably more local and texture cache friendly. But for larger sigmas the "recursive" algorithm will outperform it, even if it needs 4 passes over the data. Also its performance doesn't degrade with the blur strength so that might be the killer argument in some cases. Anyway, I'm very happy with the (single-threadded) CPU performance so I won't do that any time soon.

I hope this stuff is useful to someone or was at least interesting. :)

js2plot - plot 2D functions written in JavaScript


A while ago I wanted to plot a simple mathematical function for a project of mine. No problem, there are a lot of good function plotters out there. Then I stumbled over the polynomial smooth min function by Inigo Quilez. It was written for GLSL shaders so it uses functions like clamp() and mix():

float smin( float a, float b, float k )
    float h = clamp( 0.5+0.5*(b-a)/k, 0.0, 1.0 );
    return mix( b, a, h ) - k*h*(1.0-h);

Ok, getting that into one of those plotters seemed like a hassle. Some of those tools can do amazing things, but I’m a programmer so I wanted to write the functions in some programming language. I found code snippets and small libraries to draw function plots into a canvas but nothing simple with panning and zooming. I know there’s insert your favorite tool here but my Google bubble wasn’t trained for that and I didn’t find it. Of course there are the heavy-weights like Octave or CindyJS but those didn’t appeal to me at all.

I think you can guess the rest. :D How hard can it be to write a simple plotter with panning and zooming? So I wrote a little tool for my own personal needs: js2plot (GitHub project).

It executes a piece of JavaScript and in there you can call plot() to show the graph of the function. You can then explore the graph with panning and zooming. Or add more functions, or use variables, JavaScripts Math functions, and so on. All in all nothing special but for me it’s more convenient to write mathematical code that way (see smooth min example).

In case that you have the same problem I hope the tool is useful to you. The plotting part is written as a small library without dependencies. With it you can turn a canvas element into a graph with panning and zooming. But you have to provide a way for users to write the plotting code.

ps.: I only use the tool on desktops and notebooks so I didn’t implement touch panning and zooming. Panning isn’t a problem but handling to a pinch-zoom still seems like a pain. Reimplementing that gesture isn’t my idea of fun so I skipped it for now.

Chromium and WebKit bug when switching stylesheets


A few days ago I implemented a style switcher for my latest project. Unfortunately I ran into a rather obscure Chromium (and WebKit) bug. This post should give a small heads up to anyone running into the same weird effects. I opened an issue for the bug, so please post your comments there.

To switch between styles you just have to disable the current stylesheet and enable another one. I had no trouble with that in Firefox and Internet Explorer. But in Chromium and WebKit enabling the other stylesheet doesn't work the first time. It just stays disabled. This leaves the page unstyled since the first stylesheet gets disabled (disabling works). The bug only happens the first time. When you disable it and enable it again it works.

And that’s also the workaround: Instead of just enabling the other stylesheet you can enable it, disable it and then enable it again. All in one go.

Time to illustrate this with some code. This is the code of a small page usually styled by blue.css. As soon as the page is loaded the style is switched to green.css:

<!DOCTYPE html>
<title>Switch when loaded</title>
<link href="blue.css" rel="stylesheet" title="blue">
<link href="green.css" rel="alternate stylesheet" title="green">
    window.addEventListener("load", function(){
        var blue_style = document.querySelector("link[title=blue]");
        var green_style = document.querySelector("link[title=green]");

        blue_style.disabled = true;
        green_style.disabled = false;

Note that green.css is an alternate stylesheet, meaning it’s not used by default (the browser ignores it). When the page is loaded the styles’ disabled attributes are set accordingly. That attribute is part of the HTMLLinkElement interface. Or you can use the newer StyleSheet interface in which case it would be green_style.sheet.disabled = false; (same bug applies).

The workaround looks rather silly but is effective never the less:

<!DOCTYPE html>
<title>Switch when loaded</title>
<link href="blue.css" rel="stylesheet" title="blue">
<link href="green.css" rel="alternate stylesheet" title="green">
    window.addEventListener("load", function(){
        var blue_style = document.querySelector("link[title=blue]");
        var green_style = document.querySelector("link[title=green]");

        blue_style.disabled = true;
        green_style.disabled = false;
        // Workaround for Chromium and WebKit bug (Chromium issue 843887)
        green_style.disabled = true;
        green_style.disabled = false;

You can try it with this minimalistic standalone style swticher. The styles just change the color of the first paragraph. When it’s black the page is unstyled. This only happens when you load the page and click on "green". Everything else after that or every other combination of clicks works.

Finding the cause of the bug was quite maddening… and I hope this post spares you the same experience.

smtp_send - A simple PHP function to send mails


A few years ago I wrote a small PHP function to send mails via SMTP. Not so long ago a friend asked for it. So I cleaned it up and wrote a bit of documentation: smtp_send. It won’t work for everyone but it’s only about 90 lines of code and you can easily change it to suite your needs.

If you don’t care about dependencies or code footprint you can use one of the many libraries out there. But if you’re tired of PHPs mail() function, overly complicated APIs or huge amounts of dependencies just to send mails this might be worth a look. The function is meant for people who want to construct the mails themselves. Either for simplicity or because of more complex cases. Then you can use the function to simply send the result.

Usage example:

$message = <<<EOD
From: "Mr. Sender" <sender@dep1.example.com>
To: "Mr. Receiver" <receiver@dep2.example.com>
Subject: SMTP Test
Date: Thu, 21 Dec 2017 16:01:07 +0200
Content-Type: text/plain; charset=utf-8

Hello there. Just a small test. 😊
End of message.
smtp_send('sender@dep1.example.com', 'receiver@dep2.example.com', $message,
    'mail.dep1.example.com', 587,
    array('user' => 'sender', 'pass' => 'secret')

I put it online because my friend searched for a simple PHP mailer but couldn’t find one matching the projects requirements. I had the same experience a few years ago. In the end decided to look at the SMTP RFC and write my own little function. With a bit of luck this makes your life easier if you’re caught in same corner cases.

The documentation also contains examples of mails with attachments, alternative content (HTML mail with plain text fallback) or both (alternative content nested in mixed content). For some reason this information seems to be surprisingly hard to find.

Measurements of system call performance and overhead


I was kind of bored for once and figured I could clean up an old microbenchmark and kick the results out of the door. This is gonna be another very technical post for programmers. Be prepared. ;)

A while ago I came across a paper about "exokernels", very light-weight operating system kernels that leave most of the work up to libraries (Exokernel: An Operating System Architecture for Application-Level Resource Management). The paper is from 1995 and among other things they implemented high performance system calls.

I always heard system calls are expensive but I never saw any real numbers or measurements. Only the stuff the C standard library was doing with fread() and fwrite(): Batching I/O to minimize system calls. But system calls evolved a lot since the x86 32 bit days and today we have a dedicated syscall instruction for them.

This got me wondering: How fast are system calls on Linux today compared to normal function calls? In the paper they compared a function call with no arguments to the getpid() system call. That system call just does the minimum of work: Switch into kernel mode, read one variable and return its value. Sounds simple, so I wrote a set of small microbenchmarks and did the same on my Linux notebook.

When I asked my brother for some old CPUs he gave me this…
…an entire stack of CPU chips.

Back when reading the paper I just ran the microbenchmarks on my own notebook (an old Intel Core 2 Duo T6600 from 2009). But right now I'm with my brother and he owns quite a few computers and I asked him to run the microbenchmarks on some of them. We also run the benchmarks on some older CPUs (see the photo :)). Anyway, it gave me a nice perspective of how the performance of system calls evolved over time.

Mind you, I just took whatever PC was close by (no idea how fast or slow) as well as a few older CPUs. I was curious how system call performance changed over the years on x86_64 and don't really care about what CPU is faster or slower. Don't take these numbers and say that CPU X is faster than CPU Y. It just doesn't make sense. System calls are one of many factors that make up the performance of a CPU and software but by no means a defining one. So don't take the numbers to serious. This is just for fun after all. ;)

Before we get to the numbers I have to explain a bit about how Linux optimizes some system calls: On x86 64 bit system calls are usually made through the syscall instruction. You put the arguments into the proper registers and the syscall instruction then transitions into kernel mode and calls the proper kernel function. This incurs some overhead since the CPU has to switch into a different address space and this might need updates to the TLB, etc.

For some system calls the Linux kernel provides optimized versions via the vDSO. In some cases these optimized versions don't need to transition into kernel space and thus avoid the overhead. By default the C runtime on Linux (glibc) uses these optimized functions automatically whenever there is one available. I'm interested in how both of these mechanisms perform.

On to the numbers. Just for a bit of context: Reading a value from the main memory takes about 100ns, see Latency Numbers Every Programmer Should Know.

Syscalls incur quite a heavy overhead compared to normal function calls. That got better with time. The vDSO implementation of the getpid() system call is pretty good at mitigating the system call overhead and is almost as fast as a normal function call. On the Intel Celeron D 341 from 2004 the a system call via the syscall instruction was about 25 times slower than a system call via the vDSO. On the Intel Core i7-4790K from 2014 it's only about 12 times slower. For me I'll use 10 times slower as rule of thumb for modern CPUs and 25 times for older CPUs.

In case you're wondering about the details:

Performance of fread() and fwrite() I/O batching

I was also wondering about the I/O batching the standard C library does with fread() and fwrite(). Is it really worth the effort? Or are that many function calls wasted since system calls can be pretty fast, too?

A few microbenchmarks answered that question, too. This time I measured 1 million 4 byte reads and writes. Again directly via the syscall instruction and via the vDSO. But this time also via fread() and fwrite() instead of using system calls. Here's what I got:

So, yeah, in this scenario the I/O batching definitely is worth it. It's about 10 to 5 times faster.

But keep in mind, these were 4 byte reads and writes. I took a look with strace and they got batched into 4 KiByte reads and writes. So when you're reading or writing a few KiBytes the speedup is likely not that large.

It's also interesting that the vDSO doesn't help much for the read() and write() system calls. But usually it doesn't hurt either. So maybe these system calls can't be optimized in that way.

If you're still with me: Thanks for reading. I hope my strange little experiment was interesting. :)

Math 3D - A simple vector and matrix library for C


If you've done a bit of graphics programming with OpenGL you're probably familiar with vector and matrix math. Every point or direction in 3D space can be represented as a vector and most manipulations of those can be represented as matrices (moving them around, rotating them, projecting them on the screen, etc.).

I wasn't happy with the C math libraries I found so I wrote a new one (surprise!). Read on if you want to know why. But here is what using it looks like:

mat4_t projection = m4_perspective(60, 800.0 / 600.0, 1, 10);
vec3_t from = vec3(0, 0.5, 2), to = vec3(0, 0, 0), up = vec3(0, 1, 0);
mat4_t transform = m4_look_at(from, to, up);

vec3_t world_space = vec3(1, 1, -1);
mat4_t world_to_screen_space = m4_mul(projection, transform);
vec3_t screen_space = m4_mul_pos(world_to_screen_space, world_space);

OpenGL provides a lot of vector and matrix math for shaders that run on the GPU but on the CPU you have to do it yourself. I like it this way because it allows you to choose what kind of 3D math library you want: One that tries to optimize every last bit of performance but is complicated to use or one that is easy to use but lax on performance. Depending on the project you can pick what you need.

I do pretty much all of my low level graphics programming in C (I know, I'm weird, but OpenGL is a C API after all). And I haven't found a 3D math library I'm happy with for C. I don't know all of them but the ones I've looked at were complicated to use and vague on their semantics. Where they written with the OpenGL or Direct3D conventions in mind? How are matrices laid out in memory? Can I pass them into OpenGL directly or do I have to transpose them before hand?

There are good C++ math libraries that perfectly fit my purpose but I'm an awful C++ programmer. I spend way more time fiddling around with the language than I spend on solving the problem. So I stayed with C. Maybe someone else will find non-templated math code similarly appealing.

All this drove me to write my own small 3D math library for C. Just the basics, nothing fancy. A friend of mine joined in and together we started out from scratch. It was quite a nice learning experience. We spend a lot of time on the whiteboard, calculated a lot of the math by hand and wrote a lot of tests until we understood what the math should actually mean. It was just there that we realized that the perspective divide step in OpenGL is just an fixed function hack to fit a perspective projection into a 4x4 matrix. Kind of pointless with shaders...

Anyway, if you ever do OpenGL stuff in C and need a simple to use math library you can pick it up here: math_3d.h. It's a single header-file library in the style of stb_image.h so it's easy to integrate into projects.

It covers basic 3D vector math, transformation and camera matrices (translation, rotation, scaling and look at), projection matrices (orthographic and perspective) as well as basic matrix math (matrix-matrix, matrix-point and matrix-direction multiplication, inversion of affine transformations). The documentation is in the file itself so take a look if you're interested.

Happy programming. :)

Build your own DynDNS


During the last few weeks I wrote MiniDynDNS to build my own dynamic DNS service. Something like DynDNS but all by myself. This post explains the basic steps needed to wire MiniDynDNS into the worldwide DNS system.

I'm using it to create DNS names that point to devices at home I want to access via the internet. This is pretty nice with IPv6 since every device gets its own public IPv6 address. But please make sure only the services you want to have available are actually listening on the public IPv6 address. Or configure your firewalls accordingly.

To build your own DynDNS you'll need a few bits and pieces:

The bigger picture

The whole idea of this operation is to create a subdomain that is managed by a program running on your server. Here we'll use dyn.example.com but it can be anything as long as it's a subdomain of your registered domain. Whenever someone on the world resolves a name like weather.dyn.example.com they're going to ask that program on your server to get the current IP of that name.

For that we first need a program running on your server that can answer DNS requests and allows us to update these IPs when they change. Obviously we're going to use MiniDynDNS for that. That's why I wrote it.

Second we need to tell the global DNS system that the program running on your server is responsible ("authoritative") for dyn.example.com subdomain. This is called "delegating" a subdomain. When you bought your own domain you also bought the right to delegate subdomains to whoever you deem worthy. With that in place whenever someone resolves a name in dyn.example.com they'll ask MiniDynDNS on your server.

Note that you can only delegate a subdomain to a host, e.g. ns.example.com. This host then has to resolve to You can delegate to whatever host you want but in the end this host has to resolve to your public IP. Here we'll use ns.example.com as a placeholder for that.

The final part is a script running on whatever device or computer you want to have a dynamic domain name for. That script will periodically report its current IP to your MiniDynDNS.

If everything works correctly you can add any devices you want to your dyn.example.com subdomain and access them from everywhere on the world. pi.dyn.example.com, weather-station.dyn.example.com, tv.dyn.example.com, touchtable.dyn.example.com, spidercam.dyn.example.com or whatever. Get creative.

So lets get to it.

1. Run MiniDynDNS on your server

Download or clone MiniDynDNS from GitHub and do the "installation". Basically that's renaming config.example.yml to config.yml and setting the proper values for your setup. The domain, soanameserver and soamail are the important ones.

For MiniDynDNS to answer incoming DNS requests it has to listen on port 53. That's where other servers or clients expect to get DNS requests answered. Changing that will probably break things.

Per default it will use port 80 for a simple HTTP API with which we can update DNS records. In case this port is already used by a webserver like Apache you can change it to something else like 8080. We only need it for the scripts that periodically report the IPs of your devices to the server.

You can tell your server system to start MiniDynDNS on server startup. For me it's just a funny hobby so I leave it running in a screen terminal. You might also need to tell your servers firewall to open port 53 and 80 for incoming traffic (or whatever port you use for the HTTP interface). Otherwise the firewall will block everything and you'll just get timeouts.

Now your basic DynDNS server should already be up and running. To test it you can fire up a terminal and try this command:

nslookup foo.dyn.example.com

This tells nslookup to resolve foo.dyn.example.com by asking the DNS server If everything works well it should tell you that foo.dyn.example.com has the IP address

This assumes that you use the default database (just renamed db.example.yml to db.yml). If you already changed your DB you have to change the domain name in the command accordingly.

2. Delegate dyn.example.com to the MiniDynDNS server

Now on to tell the rest of the world that your server manages the dyn.example.com by itself. For this you have to add two records to your normal nameserver. In my case the company where I registered my domain provides a webinterface for this task.

You have to add these two DNS records to your example.com nameserver:

dyn  NS  ns.example.com
ns   A

The first record tells the DNS system that the dyn.example.com subdomain is delegated to ns.example.com. The second record says that ns.example.com has the IPv4 address Please remember to replace the domain name and the IP with your own values. If your server also has an IPv6 address you should add an AAAA record for that, too.

3. Update your IPs periodically

This differs greatly between IPv4 and IPv6.

With IPv4 only your router has a public IP address. For every service you have to create a port forwarding to the appropriate internal computer or device. So in any way there's only one public IP to report to the DNS server. Many routers already have build-in support for this. Usually it's hidden somewhere in the webinterface called "dynamic DNS", "DynDNS" or something like that.

In the case of my FritzBox router I have to enter the domain (foo.dyn.example.com), a user name (just foo), the password (bar) and an update URL (http://ns.example.com/?myip=<ipaddr>). The router will replace "<ipaddr>" with its current public IPv4 address. It seems like it then just fires this HTTP request every 30 minutes. Again these values are based on the default db.yml file.

The required steps are probably quite different for other routers. You might even have to look into the manual or search a bit to figure out how to do this.

With IPv6 the situation is a bit simpler. Each device gets its own public IPv6 address. A script that runs every few minutes can simply report that IP to the DNS server. In case of a RaspberryPi running Raspbian this script will do the job:

IP=$( \
    ip addr show dev eth0 scope global | \
    grep --perl-regexp --only-matching '(?<=inet6 )2003:[0-9a-f:]+' | \
    head --lines 1 \
curl -s http://foo:bar@ns.example.com/?myip=$IP

Again, "foo", "bar" and "ns.example.com" are values from the default db.yml. Replace them with the values for your setup. In case you changed the port of the webinterface you also have to add a port to the HTTP request (something like http://foo:bar@ns.example.com:8080/?myip=$IP for port 8080).

Save it to /home/pi/update_ip.sh and make it executable with chmod u+x update_ip.sh. When you run it (./update-ip.sh) you should see something like "Your IP has been updated".

To execute the script every 5 minutes you just need to add a line to your crontab. The command crontab -e should show you an text editor and by adding this line at the end you should be set:

*/5  *  *  *  *  /home/pi/update_ip.sh 2>&1 >> /home/pi/update_ip.log

The "*/5" at the beginning means "every 5 minutes". If you want to run the script every 30 minutes its "*/30".


Phew, this was more text than I expected. When you run the command nslookup foo.dyn.example.com you should now see as a result (again, default database values). Note that this command asks the nameserver provided by your environment (e.g. by your ISP). Thanks to the domain delegation this DNS request should end up at your nameserver which can then answer it. When you have a webserver running on one of your devices you can even use these domain names to access websites.

Anyway, with that setup you should be able to manage your own subdomains. The MiniDynDNS readme has some more information and useful commands so better take a look there, too.

Have fun with MiniDynDNS. :)



During the last few weeks I've been working on a small, simple and self-contained dynamic DNS server: MiniDynDNS. A friend of mine wanted to bring one of his computers online but didn't want to use an 3rd party service like DynDNS.

Well, you probably know the typical hacking answer: "Why not build something like that ourselves?". How hard can it be? After all we only need a hash table or dictionary for a simple DNS server. Throw in some code to parse and generate DNS packets and your're done.

After a bit of hacking and testing it actually worked. An HTTP interface would be nice so I can tell my router to report changed IPs by itself. So in it went.

You can probably imagine that the result was a funny but scary bit of code. I did some cleanup and ended up with about 600 lines of Ruby. Some more refactoring into several classes resulted into a bit to much glue code for my taste so I reverted back to simple procedural style. The code isn't pretty or fancy but it's easier to understand this way I hope.

So if you want to role your own little DynDNS for your private stuff, take a look. Maybe it's what you want. But remember: It's only build for small stuff, not high-traffic sites. :)

When does the smallest gap between two floats reaches 1.0?


Programmers might know this question: Can I use a float variable for that? Or should I use double or some integer? During my time as a programmer I came across this question on a few occasions. Usually I wanted to use floats for coordinates, like the x, y and z position of an object in space.

But floats have a problem: The larger the number becomes the larger the gap between two representable floats. For example the next representable float after 512.0 is 512.000061. So the gap between the two numbers is 0.000061. After 65536 the next representable number is 65536.007812 so the gap is 0.007812 large. A float variable simply can't represent anything between any of those two values. It doesn't have enough bits to do so. This is also called the precision of the float: The higher the value the lower the precision.

Some time ago I used floats for the x and y coordinates of a large map with mostly images on it. But I was wondering how large that map (that is the x or y coordinate) can become before before the gap between two float values is larger than one pixel (1.0). If a coordinate exceeds this value I can no longer properly position images on the map and can no longer use floats to calculate the position of individual pixels.

Usually I would guess based on what I read and picked up over the years. But this time I wrote a small C program that prints a table of float values and the size of the gap to the next larger value.

#include <stdio.h>
#include <math.h>

int main() {
    for (float x = 1.0f, precision; precision < 1.0f; x *= 2.0f) {
        precision = nextafterf(x, INFINITY) - x;
        printf("precision of %f at value %.0f\n", precision, x);

    for (double x = 1.0, precision; precision < 1.0; x *= 2.0) {
        precision = nextafter(x, INFINITY) - x;
        printf("precision of %lf at value %.0lf\n", precision, x);

    return 0;

It uses the nextafter() function of C99 and for completenesses sake I also added a table for double. You can compile it with gcc -std=c99 float_precision_table.c -lm.

This is a bit of the programs output:

precision of 0.003906 at value 32768
precision of 0.007812 at value 65536
precision of 0.015625 at value 131072
precision of 0.031250 at value 262144
precision of 0.062500 at value 524288
precision of 0.125000 at value 1048576
precision of 0.250000 at value 2097152
precision of 0.500000 at value 4194304
precision of 1.000000 at value 8388608

At the float value of about 8388608 the gaps are about 1.0 in size. So once the x or y coordinate exceeds this value strange errors will happen in my program. Fortunately this value is quite a bit larger than I expected it to be.

Even if I would use float for coordinates in a 3D world and wanted 1cm precision the world can reach from about -65536m to about +65536m in each direction. That's a cube with a side length of about 130km. Which is again way larger than I originally thought. :)

As a little contrast here are some values for doubles:

precision of 0.007812 at value 35184372088832
precision of 0.015625 at value 70368744177664
precision of 0.031250 at value 140737488355328
precision of 0.062500 at value 281474976710656
precision of 0.125000 at value 562949953421312
precision of 0.250000 at value 1125899906842624
precision of 0.500000 at value 2251799813685248
precision of 1.000000 at value 4503599627370496

You can make some quite large worlds with double coordinates. In the same situation as above it would be a cube with a side length of 470 AU and still something like 1cm precision at the outer regions. :)

I hope this small program can provide some point of reference should you ever find yourself pondering the same questions.

Thanks for scrolling down all the way, it can get quite lonely here…
Anyway, looking for older entries? Want to know more? Take a look at the archive.