Research

Decoding Cobalt Strike: Understanding payloads

Identifying and Parsing Cobalt Payloads
Threat Research Team
Threat Research Team
Published
July 8, 2021
Read time
16 Minutes
Decoding Cobalt Strike: Understanding payloads
Written by
Threat Research Team
Threat Research Team
Published
July 8, 2021
Read time
16 Minutes
Decoding Cobalt Strike: Understanding payloads
    Share this article

    Intro

    Cobalt Strike threat emulation software is the de facto standard closed-source/paid tool used by infosec teams in many governments, organizations and companies. It is also very popular in many cybercrime groups which usually abuse cracked or leaked versions of Cobalt Strike. 

    Cobalt Strike has multiple unique features, secure communication and it is fully modular and customizable so proper detection and attribution can be problematic. It is the main reason why we have seen use of Cobalt Strike in almost every major cyber security incident or big breach for the past several years.

    There are many great articles about reverse engineering Cobalt Strike software, especially beacon modules as the most important part of the whole chain. Other modules and payloads are very often overlooked, but these parts also contain valuable information for malware researchers and forensic analysts or investigators.

    The first part of this series is dedicated to proper identification of all raw payload types and how to decode and parse them. We also share our useful parsers, scripts and yara rules based on these findings back to the community

    Raw payloads

    Cobalt Strike’s payloads are based on Meterpreter shellcodes and include many similarities like API hashing (x86 and x64 versions) or url query checksum8 algo used in http/https payloads, which makes identification harder. This particular checksum8 algorithm is also used in other frameworks like Empire.

    Let’s describe interesting parts of each payload separately.

    Payload header x86 variant

    Default 32bit raw payload’s entry points start with typical instruction CLD (0xFC) followed by CALL instruction and PUSHA (0x60) as the first instruction from API hash algorithm.

    x86 payload
    x86 payload
    Payload header x64 variant

    Standard 64bit variants start also with CLD instruction followed by AND RSP,-10h and CALL instruction.

    x64 payload
    x64 payload

    We can use these patterns for locating payloads’ entry points and count other fixed offsets from this position.


     

    Default API hashes

    Raw payloads have a predefined structure and binary format with particular placeholders for each customizable value such as DNS queries, HTTP headers or C2 IP address. Placeholder offsets are on fixed positions the same as hard coded API hash values. The hash algorithm is ROR13 and the final hash is calculated from the API function name and DLL name. The whole algorithm is nicely commented inside assembly code on the Metasploit repository.

    Python implementation of API hashing algorithm
    Python implementation of API hashing algorithm

    We can use the following regex patterns for searching hardcoded API hashes:


     

    Default API hashes
    Default API hashes

    We can use a known API hashes list for proper payload type identification and known fixed positions of API hashes for more accurate detection via Yara rules.


     

    Payload identification via known API hashes
    Payload identification via known API hashes

    Complete Cobalt Strike API hash list:


     

    API hash list 1/2
    API hash list 1/2
    API hash list 2/2
    API hash list 2/2

    Complete API hash list for Windows 10 system DLLs is available here.

    Customer ID / Watermark

    Based on information provided on official web pages, Customer ID is a 4-byte number associated with the Cobalt Strike licence key and since v3.9 is embedded into the payloads and beacon configs. This number is located at the end of the payload if it is present. Customer ID could be used for specific threat authors identification or attribution, but a lot of Customer IDs are from cracked or leaked versions, so please consider this while looking at these for possible attribution.

    DNS stager x86

    Typical payload size is 515 bytes or 519 bytes with included Customer ID value. The DNS query name string starts on offset 0x0140 (calculated from payload entry point) and the null byte and max string size is 63 bytes. If the DNS query name string is shorter, then is terminated with a null byte and the rest of the string space is filled with junk bytes.

    DnsQuery_A API function is called with two default parameters:

    Anything other than the default values are suspicious and could indicate custom payload.

    Python parsing:


     

    Default DNS payload API hashes:


     

    Yara rule for DNS stagers:


     

    SMB stager x86

    The default payload size is 346 bytes plus the length of the pipe name string terminated by a null byte and the length of the Customer ID if present. The pipe name string is located right after the payload code on offset 0x015A in plaintext format.

    CreateNamedPipeA API function is called with 3 default parameters:

    Python parsing:


     

    Default SMB payload API hashes:


     

    Yara rule for SMB stagers:


     

    TCP Bind stager x86

    The payload size is 332 bytes plus the length of the Customer ID if present. Parameters for the bind API function are stored inside the SOCKADDR_IN structure hardcoded as two dword pushes. The first PUSH with the sin_addr value is located on offset 0x00C4. The second PUSH contains sin_port and sin_family values and is located on offset 0x00C9 The default sin_family value is AF_INET (0x02).

    Python parsing:


     

    Default TCP Bind x86 payload API hashes:


     

    Yara rule for TCP Bind x86 stagers:


     

    TCP Bind stager x64

    The payload size is 510 bytes plus the length of the Customer ID if present. The SOCKADDR_IN structure is hard coded inside the MOV instruction as a qword and contains the whole structure. The offset for the MOV instruction is 0x00EC.

    Python parsing:


     

    Default TCP Bind x64 payload API hashes:


     

    Yara rule for TCP Bind x64 stagers:


     

    TCP Reverse stager x86

    The payload size is 290 bytes plus the length of the Customer ID if present. This payload is very similar to TCP Bind x86 and SOCKADDR_IN structure is hardcoded on the same offset with the same double push instructions so we can reuse python parsing code from TCP Bind x86 payload.

    Default TCP Reverse x86 payload API hashes:

    Yara rule for TCP Reverse x86 stagers:


     

    TCP Reverse stager x64

    Default payload size is 465 bytes plus length of Customer ID if present. Payload has the same position as the SOCKADDR_IN structure such as TCP Bind x64 payload so we can reuse parsing code again.

    Default TCP Reverse x64 payload API hashes:

    Yara rule for TCP Reverse x64 stagers:


     

    HTTP stagers x86 and x64

    Default x86 payload size fits 780 bytes and the x64 version is 874 bytes long plus size of request address string and size of Customer ID if present. The payloads include full request information stored inside multiple placeholders.

    Request address

    The request address is a plaintext string terminated by null byte located right after the last payload instruction without any padding. The offset for the x86 version is 0x030C and 0x036A for the x64 payload version. Typical format is IPv4.

    Request port

    For the x86 version the request port value is hardcoded inside a PUSH instruction as a  dword. The offset for the PUSH instruction is 0x00BE. The port value for the x64 version is stored inside MOV r8d, dword instruction on offset 0x010D.

    Request query

    The placeholder for the request query has a max size of 80 bytes and the value is a plaintext string terminated by a null byte. If the request query string is shorter, then the rest of the string space is filled with junk bytes. The placeholder offset for the x86 version is 0x0143 and 0x0186 for the x64 version.

    Cobalt Strike and other tools such as Metasploit use a trivial checksum8 algorithm for the request query to distinguish between x86 and x64 payload or beacon. 

    According to leaked Java web server source code,  Cobalt Strike uses only two checksum values, 0x5C (92) for x86 payloads and 0x5D for x64 versions. There are also implementations of Strict stager variants where the request query string must be 5 characters long (including slash). The request query checksum feature isn’t mandatory.

    Python implementation of checksum8 algorithm:


     

    Metasploit server uses similar values:


     

    You can find a complete list of Cobalt Strike’s x86 and x64 strict request queries here.

    Request header

    The size of the request header placeholder is 304 bytes and the value is also represented as a plaintext string terminated by a null byte. The request header placeholder is located immediately after the Request query placeholder. The offset for the x86 version is 0x0193 and 0x01D6 for the x64 version.

    The typical request header value for HTTP/HTTPS stagers is User-Agent. The Cobalt Strike web server has banned user-agents which start with lynx, curl or wget and return a response code 404 if any of these strings are found.

    API function HttpOpenRequestA is called with following dwFlags (0x84600200):

    Python parsing:

    Default HTTP x86 payload API hashes:


     

    Default HTTP x64 payload API hashes:


     

    Yara rules for HTTP x86 and x64 stagers:


     

    HTTPS stagers x86 and x64

    The payload structure and placeholders are almost the same as the HTTP stagers. The differences are only in payload sizes, placeholder offsets, usage of InternetSetOptionA API function (API hash 0x869e4675) and different dwFlags for calling the HttpOpenRequestA API function.

    The default x86 payload size fits 817 bytes and the default for the x64 version is 909 bytes long plus size of request address string and size of the Customer ID if present.

    Request address

    The placeholder offset for the x86 version is 0x0331 and 0x038D for the x64 payload version. The typical format is IPv4.

    Request port

    The hardcoded request port format is the same as HTTP.  The PUSH offset for the x86 version is 0x00C3. The MOV instruction for x64 version is on offset 0x0110.

    Request query

    The placeholder for the request query has the same format and length as the HTTP version. The placeholder offset for the x86 version is 0x0168 and 0x01A9 for the x64 version.

    Request header

    The size and length of the request header placeholder is the same as the HTTP version. Offset for the x86 version is 0x01B8 and 0x01F9 for the x64 version.

    API function HttpOpenRequestA is called with following dwFlags (0x84A03200):

    InternetSetOptionA API function is called with following parameters:


     

    Python parsing:


     

    Default HTTPS x86 payload API hashes:


     

    Default HTTPS x64 payload API hashes:


     

    Yara rule for HTTPS x86 and x64 stagers:


     

    The next stage or beacon could be easily downloaded via curl or wget tool:


     

    You can find our parser for Raw Payloads and all according yara rules in our IoC repository.

    Raw Payloads encoding

    Cobalt Strike also includes a payload generator for exporting raw stagers and payload in multiple encoded formats. Encoded formats support UTF-8 and UTF-16le. 

    Table of the most common encoding with usage and examples:

    Decoding most of the formats are pretty straightforward, but there are few things to consider. 

    • Values inside Decimal and Char Array are splitted via “new lines” represented by “\s_\n” (\x20\x5F\x0A).
    • Common compression algorithms used inside PowerShell scripts are GzipStream and raw DeflateStream.

    Python decompress implementation:

    XOR encoding

    The XOR algorithm is used in three different cases. The first case is one byte XOR inside PS1 scripts, default value is 35 (0x23).

    The second usage is XOR with dword key for encoding raw payloads or beacons inside PE stagers binaries. Specific header for xored data is 16 bytes long and includes start offset, xored data size, XOR key and four 0x61 junk/padding bytes.

    Python header parsing:


     

    We can create Yara rule based on XOR key from header and first dword of encoded data to verify supposed values there:


     

    The third case is XOR encoding with a rolling dword key, used only for decoding downloaded beacons. The encoded data blob is located right after the XOR algorithm code without any padding. The encoded data starts with an initial XOR key (dword) and the data size (dword xored with init key).

    There are x86 and x64 implementations of the XOR algorithm. Cobalt Strike resource includes xor.bin and xor64.bin files with precompiled XOR algorithm code. 

    Default lengths of compiled x86 code are 52 and 56 bytes (depending on used registers) plus the length of the junk bytes. The x86 implementation allows using different register sets, so the xor.bin file includes more than 800 different precompiled code variants.

    Yara rule for covering all x86 variants with XOR verification:


     

    The precompiled x64 code is 63 bytes long with no junk bytes. There is also only one precompiled code variant.


     

    Yara rule for x64 variant with XOR verification:


     

    You can find our Raw Payload decoder and extractor for the most common encodings here. It uses a parser from the previous chapter and it could save your time and manual work. We also provide an IDAPython script for easy raw payload analysis.

    Conclusion

    As we see more and more abuse of Cobalt Strike by threat actors, understanding how to decode its use is important for malware analysis.

    In this blog, we’ve focused on understanding how threat actors use Cobalt Strike payloads and how you can analyze them.

    The next part of this series will be dedicated to Cobalt Strike beacons and parsing its configuration structure.

    Threat Research Team
    Threat Research Team
    A group of elite researchers who like to stay under the radar.
    Follow us for more