Polymorphic Code: What it is, how it works, and how it is used
What Polymorphic Code is, how it works, and real-world use cases like BusyBox.

Introduction
Last night, just before going to sleep, I was scrolling through YouTube when I came across a recommended video from Carles Cabergs titled Polymorphic Executables (recommended). It caught my attention because I hadn’t heard of the concept before, apart from something related to Alpine-based Docker images (we will get to that later).
So, what exactly is “Polymorphic Code”?
What is Polymorphic Code
If we want a quick definition of Polymorphic Code, a simple search returns something like:
Code that uses a polymorphic engine to mutate while keeping the original algorithm intact.
But what does that actually mean? To understand it better, let’s first recap how arguments are used in programs. For example, consider a file named main with the following code:
#! /bin/bash
echo "first argument: $0"
echo "second argument: $1"
echo "third argument: $2"
Now, if we execute the program, we get different outputs depending on the arguments we pass. For example:
root@covicale:~$ ./main covicale.com
first argument: ./main
second argument: covicale.com
third argument:
root@covicale:~$ ./main hello world
first argument: ./main
second argument: hello
third argument: world
One thing we notice is that $0 contains the value of the first argument, which is always the name the script was called with (./main here), not something we typed after the command.
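The value of $0 also depends on the path we type to launch the script, which is why it helps to strip the directory part with basename before comparing names. Here is a minimal sketch of that behaviour; the script name show_name and the temporary directory are made up for the example.

```shell
#!/bin/bash
# Sketch: $0 reflects the path used to invoke the script, so the same file
# can report different values. The file name "show_name" is hypothetical.
set -eu
workdir="$(mktemp -d)"

# Write a tiny script that prints $0 as received and after basename.
cat > "$workdir/show_name" <<'EOF'
#!/bin/bash
echo "raw: $0 | basename: $(basename "$0")"
EOF
chmod +x "$workdir/show_name"

# Invoked through the temporary directory's path: $0 carries that whole path.
abs_output="$("$workdir/show_name")"
# Invoked as ./show_name from inside the directory: $0 is "./show_name".
rel_output="$(cd "$workdir" && ./show_name)"

echo "$abs_output"
echo "$rel_output"
rm -rf "$workdir"
```

In both runs the basename part is the same (show_name), even though the raw value of $0 differs.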
Let’s take a look at the next code:
#!/bin/bash
case "$(basename "$0")" in
"main")
echo "main file called"
;;
"polymorphic")
echo "wow this is polymorphic"
;;
*)
echo "unknown executable name: $(basename "$0")"
;;
esac
If we execute it normally through the main file, we already know what is going to happen:
./main
main file called
Now, if we wanted to reach the polymorphic case in our case statement, we might think the only way to do it is to rename the file. If that were the case, this whole article would not exist and you would not be reading this.
So, how can we reach the other cases without changing the name of the file? The secret lies in symbolic links.
As I said before, $0 contains the value of the first argument, which is always the script that has been called. This means $0 contains the name the script was called by, not the name of the file itself.
Right now, we only have a main file, which is the entry point for executing the code, but what happens if we create a symbolic link to that file with a different name? Let’s try it.
To create the symbolic link, we will use the ln command, which creates a link to a file, together with the -s option, which specifies that we want a symbolic link rather than a hard link. As an example, we will create two symbolic links: polymorphic and whatisgoingon.
ln -s main polymorphic && ln -s main whatisgoingon
Once this is done, we can run ls -l to check the symbolic links we just created:
ls -l
total 4
-rwxr-xr-x 1 root root 230 Mar 11 18:26 main
lrwxrwxrwx 1 root root 4 Mar 11 18:40 polymorphic -> main
lrwxrwxrwx 1 root root 4 Mar 11 18:40 whatisgoingon -> main
Now, if we execute them, they all run the same main file, but since the first argument is the name of the symbolic link, execution reaches other branches of the code:
root@covicale:~$ ./main
main file called
root@covicale:~$ ./polymorphic
wow this is polymorphic
root@covicale:~$ ./whatisgoingon
unknown executable name: whatisgoingon
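Since the dispatch only looks at the name used to call the script, the same trick should also work with a hard link, which is simply a second name for the same file. The following is a minimal sketch that recreates the script above in a temporary directory; the link name hardlinked is a made-up example.

```shell
#!/bin/bash
# Sketch: the name-based dispatch works for symbolic links and hard links alike.
# "hardlinked" is a hypothetical name chosen for this example.
set -eu
workdir="$(mktemp -d)"
cd "$workdir"

# Same script as in the article.
cat > main <<'EOF'
#!/bin/bash
case "$(basename "$0")" in
  "main") echo "main file called" ;;
  "polymorphic") echo "wow this is polymorphic" ;;
  *) echo "unknown executable name: $(basename "$0")" ;;
esac
EOF
chmod +x main

ln -s main polymorphic   # symbolic link, as before
ln main hardlinked       # hard link: another directory entry for the same file

out_main="$(./main)"
out_symlink="$(./polymorphic)"
out_hardlink="$(./hardlinked)"
printf '%s\n' "$out_main" "$out_symlink" "$out_hardlink"

cd /
rm -rf "$workdir"
```

The hard link falls into the default branch only because hardlinked is not one of the names the case statement knows about; what matters is that $0 carries the new name.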
How Polymorphic Code is used and BusyBox
The main reason for using symbolic links in this way is optimization, specifically saving disk space: instead of keeping a separate copy of the binary for each command, a single executable is reused under different names, with symbolic links pointing to it. Another important benefit is easier maintenance and updates: a patch or improvement applied to the code reaches every linked command at once, instead of having to be applied to each one separately.
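To picture how one file can serve several commands, here is a rough sketch of a dispatcher written as shell functions. It is only an illustration of the idea, not how any real tool is implemented: the names multicall, cmd_hello and cmd_bye are hypothetical, and the fallback that reads the command from the first argument is my own assumption for the example.

```shell
#!/bin/bash
# Sketch of a multi-call script: one file, several commands kept as functions.
# All names here (multicall, cmd_hello, cmd_bye) are hypothetical.

cmd_hello() { echo "hello ${1:-world}"; }
cmd_bye()   { echo "bye ${1:-world}"; }

dispatch() {
  # $1 is the name the script was invoked as; the rest are its arguments.
  local name="$1"
  shift
  # Assumption: when called by its own name, the first argument picks the command.
  if [ "$name" = "multicall" ] && [ "$#" -gt 0 ]; then
    name="$1"
    shift
  fi
  case "$name" in
    hello) cmd_hello "$@" ;;
    bye)   cmd_bye "$@" ;;
    *)     echo "unknown command: $name" >&2; return 1 ;;
  esac
}

# In a real script the entry point would be: dispatch "$(basename "$0")" "$@"
# Here the invocation names are passed by hand to show both paths.
via_link="$(dispatch hello Ana)"       # as if called through a link named "hello"
via_arg="$(dispatch multicall bye)"    # as if called as "multicall bye"
echo "$via_link"
echo "$via_arg"
```

Fixing a bug in cmd_hello or cmd_bye fixes it for every name that points at the file, which is the maintenance benefit described above.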
BusyBox is probably the most famous, one of the oldest (more than 25 years old) and one of the most widely used pieces of software that apply the concept of Polymorphic Code to reduce the size of its binaries, since it was originally created for embedded operating systems with very limited resources.
Maybe you think you have never used BusyBox, and I thought the same. However, if you have built anything with Docker images, you have almost certainly used Alpine-based Docker images. Almost all of the biggest and most used Docker images have their own Alpine version (Golang example: 1.24.1-alpine3.21). These images are based, of course, on Alpine Linux, a super lightweight Linux distribution that, guess what, uses BusyBox.
If you want to check this for yourself, you can do it very quickly by running a container based on the alpine image, moving to the /bin directory, and executing ls -l. You will then see how the commands in there are just symlinks to the busybox program.
/bin # ls -l
total 792
lrwxrwxrwx 1 root root 12 Feb 13 23:04 arch -> /bin/busybox
lrwxrwxrwx 1 root root 12 Feb 13 23:04 ash -> /bin/busybox
lrwxrwxrwx 1 root root 12 Feb 13 23:04 base64 -> /bin/busybox
lrwxrwxrwx 1 root root 12 Feb 13 23:04 bbconfig -> /bin/busybox
-rwxr-xr-x 1 root root 808712 Jan 17 18:12 busybox
lrwxrwxrwx 1 root root 12 Feb 13 23:04 cat -> /bin/busybox
lrwxrwxrwx 1 root root 12 Feb 13 23:04 chattr -> /bin/busybox
lrwxrwxrwx 1 root root 12 Feb 13 23:04 chgrp -> /bin/busybox
lrwxrwxrwx 1 root root 12 Feb 13 23:04 chmod -> /bin/busybox
lrwxrwxrwx 1 root root 12 Feb 13 23:04 chown -> /bin/busybox
lrwxrwxrwx 1 root root 12 Feb 13 23:04 cp -> /bin/busybox
...
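If you would rather get a number than read through the listing, a small helper can count the entries in a directory that are symlinks to a given target. This is a minimal sketch: count_links_to is a hypothetical helper, and the demo below runs against a throwaway directory with a fake busybox file instead of a real Alpine container.

```shell
#!/bin/sh
# Sketch: count the symlinks in a directory whose link target is exactly the given path.
# count_links_to is a hypothetical helper written for this example.
count_links_to() {
  dir="$1"
  target="$2"
  count=0
  for entry in "$dir"/*; do
    # -L is true only for symbolic links; readlink prints the stored target.
    if [ -L "$entry" ] && [ "$(readlink "$entry")" = "$target" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

# Demo on a temporary directory: one regular file and two links pointing at it.
demo="$(mktemp -d)"
touch "$demo/busybox"
ln -s "$demo/busybox" "$demo/cat"
ln -s "$demo/busybox" "$demo/cp"
demo_count="$(count_links_to "$demo" "$demo/busybox")"
echo "$demo_count"   # prints 2
rm -rf "$demo"
```

Inside an Alpine container the equivalent call would be count_links_to /bin /bin/busybox; the result depends on the image version, so I am not quoting a number here.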
I found it quite interesting how a feature like symbolic links, which I had only used before for quick access to something from the desktop, can serve other purposes like this, and how almost all of us have used this feature before without even knowing it :)