MassLogger is an information stealer, first sold in hacking forums around April 2020. The malware author claims it to be the “most powerful logger and recovery tool” which costs $99 USD worth of Bitcoin for a lifetime license. MassLogger is highly configurable and gives its malicious users many options for delivery, anti-detection and anti-analysis, and capabilities such as keylogging and password stealing from a wide variety of browsers and applications.
Avast researchers have found that it is most commonly found in Turkey, Spain, Ukraine, Chile, the United States, Brazil, the United Kingdom, Germany and Poland. Avast AV is detecting this malware under “MSIL:MassLogger-*”. In addition, the latest variant of MassLogger will not run if it finds Avast or AVG AV present in the system.
The malware author has an active Github directory where he shares the source code of multiple malware features and packers for educational purposes. We are able to find many similarities between what is being used in the malware and some of his Github projects.
In August 2020, FireEye wrote an article explaining how to get past the anti-analysis tricks used by MassLogger version 1.3. Recently Talos wrote on MassLogger version 3. In their post, they focus on the campaign and delivery of the malware. In this blog, we will provide the last missing piece, a detailed analysis of the final payload’s obfuscation which includes operation codes and an interpreter as well as indirect calls to unassigned fields.
Our analysis is demonstrated with sample
Populate fields with methods at run-time
Scanning through the decompiled code, the majority of function calls are performed in an indirect manner where it tries to call uninitialized field values. As a result, the control flow of the malware are totally hidden from static analysis
Mo() is a wrapper function whose purpose is to call the 4th argument, in this case, vZ.kj. Interestingly, vZ.kj is of field type instead of method type, and there is no trace of it being assigned. Revisiting the vZ declaration, we find out that it is just one of the many internal sealed classes whose structure consists of field values with no assigned references, a caller function similar to Mo(), and an external “Invoke” function. In addition, they all call a function with token 0x060004D8 in the module initialization phase.
0x060004D8 is in charge of decrypting the 2KB embedded resource to build a dictionary between field tokens and method tokens. The field vZ.kj mentioned above has token 0x040001E2 which is mapped with method token 0x060001E0
Once the dictionary is constructed, the function looks through all the fields with Static, NonPublic, or GetField flag in the module to find the corresponding method tokens. If the token belongs to a static method, it will be assigned directly to the field using fieldInfo.SetValue()
If the specified method is not declared as static, a wrapper for the intended method is constructed then assigned to the field. This dynamically created method has an additional parameter of type System.Drawing.Imaging.ImageCodecInfo. The call to the intended function will be made through OpCodes.Callvirt or OpCodes.Call based on whether the first byte of the token is modified or not. For example, if the token is 0x46000361 in the dictionary, it will be converted back to the standard token 0x06000361, and OpCodes.Callvirt will be used instead of OpCodes.Call.
These dynamic wrapper methods may cause additional overheads when debugging due to the transfer to DynamicResolver.GetCodeInfo() method before the intended function is reached.
All strings used in the malware are encrypted and stored in the 23KB embedded resource. The method token 0x060004DB acts as the string provider where it decrypts and stores the string table upon its first run. This method receives the offset of the required string, the first DWORD is read to determine the string length, then the string following after is returned.
Retrieving Operation Codes from Resource
After the string decryption, the malware leads us to function 0x060001DF where the malware reveals its secretive flow control in the form of operation codes. First, an array of objects are deserialized from the 3rd 3KB embedded resource. These objects contain a list of operation codes and additional data that will be fed into an interpreter to perform tasks such as invoking a function, creating a new object, or modifying a List.
In order for the object array to be deserialized, the malware first starts with initializing the following structures:
- An array of 255 bytes. The first byte in the resource indicates the next number of words used to assign values to this array. The first byte in each word, ie 0x23 or 0x1B in the capsules, represent the index; while the second byte, ie 0x1 in the capsules, represents what read operations to use:
- 0x1 = Custom Binary Reader
- 0x2 = ReadInt64
- 0x3 = ReadSingle
- 0x4 = ReadDouble
- 0x5 = Read an array of data
- List of strings. The following byte, 0x0 in the square, determines the size of the list. In this case, no string will be needed
- An array of objects to be deserialized. The next byte, 0x9 in the square, tells us the size of the array. Each member of the array will be initialized to null and reconstructed on the later step.
- An array of offsets. This array has the same size as the object array and represents where the data associated with each object locates in the resource. Each entry of this array will be filled in using CustomBinaryReader starting at the position after the 0x9 byte.
When the above structures are in place, a lengthy routine that resides in 0x060001DF will start reconstructing a specified object from the array by reading the proper resource data. The main purpose of this object is to form a list of operation codes and the needed parameters to perform them.
The first part is the operation code which ranges from 1 to 173, while the second part represents the operand. The interpreter for these operations locates inside method 0x06000499 and consists of 157 unique handlers. This massive implementation is indicative of a commercial code protection tool, but we aren’t unable to find any further information at the moment.
The meaning of important operation codes from the above example:
- 154 : <method token>: invoke the specified method after reconstructing the needed arguments and the receiver of the returned value.
- 127 : <constructor method token>: create a new object after reconstructing the needed arguments for the constructor and where the new object will be assigned to
- 21 : <field Token>: get the value of the specified field, usually an encrypted config which is used by operation “166 : 0x60001C3” for decryption
A block of Base64 encoded strings can be found in the middle of the decrypted string table:
These are MassLogger encrypted config. First, each encrypted config is assigned to the corresponding field in Module 0x02000044. Note that the module token is consistent across all MassLogger v3 samples that we looked at.
Next, the config.Key is base64 decoded and used as the PBKDF2 password to derive a 32bytes _key (decryption) and a 64 bytes _authKey (encryption). When the malware is ready to read the config for usage, function 0x060001C4 from module 0x02000045 decrypts each config field with the following steps:
- Base64 decoded
- The first 32 bytes are SHA256 checksum which is used to verify the integrity of the string
- The next 16 bytes are used as IV
- The config is decrypted using AES with _key and IV from the previous step
The full config of this sample: https://github.com/avast/ioc/blob/master/MassLogger/config.txt
Despite the variation of obfuscation technique in each version, MassLogger makes little change in its functionality. Compared to the analysis in June 2020, a check for Avast and AVG AV ( looking for AvastGUI and AVGUI processes) is added at the beginning of execution. In addition, the malware has minimized the amount of fingerprints left on the system. The log data is no longer write to disk, and the “MassLogger” keyword has been removed:
Comparison between version 1.3 and version 3.0 log data
MassLogger is a versatile .NET information stealer with a complete list of features. The malware employs heavy obfuscation techniques which we intend to describe in this article. At the moment, we can’t confirm whether the malware is packed with a commercial crypter, but its complexity may indicate so. We also illustrate how the configuration can be extracted which can help with identifying IOCs for a particular sample.
Indicators of Compromise
The full list of IoCs is available at: