Sarinay GmbH Blog - MELANI meets Peter

It was recently revealed that the network of the Swiss company RUAG had been compromised. The GovCERT unit of the federal Reporting and Analysis Centre for Information Assurance MELANI was involved in the investigation and released a technical report on the espionage case.

On a few dozen pages, the document explains how the attack was probably carried out. The report also provides information on the malware used by the attackers. There is a rather remarkable five-page section on how a particular piece of malware implemented cryptography. Let us have a closer look at what GovCERT had to say and what was missing.

The search for a library

The investigators pay close attention to the cryptographic code in two pieces of malware referred to as Carbon-DLL and Tavdig. Carbon-DLL uses RSA and an unspecified symmetric cipher provided by the Microsoft Cryptography API. Tavdig, on the other hand, does not rely on the OS libraries at all. Instead, it carries a custom implementation of AES and ElGamal encryption. The GovCERT report concludes that the two malware families were developed by separate teams.

Whether or not the authors are independent of each other, CryptoAPI is of little help when implementing AES and ElGamal. One needs the Next Generation Cryptography API (CNG) introduced in Windows Vista to get AES. A programmer who needs to cater for Windows XP would not use CNG. ElGamal encryption is not provided by either CryptoAPI generation. In fact, the only public-key encryption algorithm offered by Microsoft Windows API is RSA.

Tavdig authors simply had to look elsewhere. There are several cryptography libraries to consider. A criminal malware author has the extra freedom to ignore any licensing terms the libraries may come with. GovCERT were interested whether any specific library was used in the malware found at RUAG. We shall follow their reasoning, offer new arguments and then make the same educated guess.

The trouble with modular arithmetic

ElGamal encryption, as used by Tavdig, is computed in the multiplicative group of integers modulo a 1024-bit prime M. Multiple precision arithmetic in Tavdig assumes any relevant integer is fixed in length and represented as an array of 65 words of 16 bits. There are functions to add, subtract and compare such numbers. The functions are used as building blocks to compute multiplication modulo M.

The GovCERT report includes pseudocode of the multiplication function as output by IDA Pro & Hex-Rays Decompiler in Figure 12. Edited for clarity, this is what the function Multiply(x, y) does:

Initialize result = 0.
For i = 0 to 1023 do the following:
- If the i-th bit of x is 1, let result = result + y.
- If result is odd, let result = result + M.
- Shift result one bit to the right.
If result > M, let result = result - M.
Return result.

The GovCERT analysts proceed to explain that the unusual function computes the quantity xy·2^-1024 mod M. The result is off by a factor of 2^-1024. In order to fix the error, the value should be multiplied by 2¹⁰²⁴ modulo M. Because the modular multiplication function itself introduces the factor 2^-1024 every time it is called, one has to Multiply() the result by 2²⁰⁴⁸ instead. GovCERT call the value corrector, Tavdig code includes a function to compute it.

Calls to Multiply() within Tavdig always appear in pairs. First, the function is called with the "real" arguments. Then the corrector is multiplied in to fix the result. In the GovCERT report, all this is well explained. What the report lacks, is an appropriate name for the multiplication function. Dear MELANI, let me introduce Peter.

Montgomery arithmetic

The multiplication function in Tavdig employs a well-known trick by Peter Montgomery. The method avoids integer division by the prime modulus, that would otherwise be needed to perform modular reduction. The division is traded for multiplications, additions and bit shifts.

In addition to the original 1985 paper by Peter Montgomery, concise explanations of the trick can be found in these notes by Arjen Lenstra or in Chapter 14 of the Handbook of Applied Cryptography by Menezes, Oorschot and Vanstone.

The function Multiply used by Tavdig is an instance of radix 2 Montgomery multiplication. More specifically, it is the Algorithm 14.36 from the Handbook with b=2, n=1024 and R=2¹⁰²⁴. The value corrector corresponds to R² mod M.

Would a library do this?

Let us use the new observation to see where the code may have originated from. In order to compute ElGamal decryption, the malware needs to implement exponentiation modulo M. The GovCERT report lists the corresponding pseudocode and describes it as "standard binary exponentiation". I rewrote the function as ModExp(base, exponent, corrector):

Initialize result = 1.
For i = 0 to 1023 do the following:
- If the i-th bit of exponent is 1, compute
  - result = Multiply(result, base)
  - result = Multiply(result, corrector).
- Let base = Multiply(base, base).
- Let base = Multiply(base, corrector).
Return result.

This function does resemble the standard square-and-multiply exponentiation algorithm. The result is computed in a rather unusual way. The function Multiply() is always followed by a second call to fix the result using R² mod M (or corrector). In other words, the Montgomery multiplication function is first turned into ordinary modular multiplication. That method is then plugged into the exponentiation algorithm.

How to make your job harder

The above approach completely ignores a key feature of Montgomery arithmetic. To compute almost anything more complex than a single product, it is customary to use Montgomery representation of integers modulo M.

Instead of calling the function Multiply() on arguments x and y, it can be called with input x' = xR and y' = yR. The values x',y' are Montgomery representation of x and y. The function Multiply(x', y') outputs xyR²R^-1 = xyR. The result is in the form z' = zR for z = xy, i.e. it is the Montgomery representation of the product. The output can directly used in subsequent calls to Multiply() as desired, without the need to fiddle with the factor R along the way. The factor only needs to be removed at the very end of the computation. This is very useful for exponentiation that is typically computed as a sequence of several multiplications.

In order to convert an integer x to Montgomery representation, compute Montgomery product of x and R², i.e. the corrector from the report. To remove the factor at the end, simply Multiply() the argument x' by 1.

In the case of Tavdig, this saves thousands of calls to Multiply(). A reasonable modular exponentiation function would operate as follows:

Initialize result = R.
Let base = Multiply(base, corrector).
For i = 0 to 1023 do the following:
- If the i-th bit of exponent is 1, compute
  - result = Multiply(result, base)
- Let base = Multiply(base, base).
Let result = Multiply(result, 1)
Return result.

The actual multiplication and exponentiation routines in Tavdig interact in a way that makes little sense.

Fixing the parameters

GovCERT point out that there is a fixed modulus M hardcoded in the binary of Tavdig. This unusual approach suggests that no generic library was probably used.

The programmers apparently did not pay attention to the fact that once the modulus is fixed, so can be the value R² (the so-called corrector). Instead, Tavdig carries a dedicated function to compute the same constant over and over again. The value is derived every time a single ElGamal ciphertext is about to be decrypted. Tavdig stores the value in a local variable, passes it onto the arithmetic functions and then deletes it until the next ciphertext comes along.

This example also indicates that the authors did not quite understand how the various functions actually interact. Malware authors, by definition, are not reasonable programmers. This time, they managed to stitch a few bits and pieces together well enough to get usable results. The code almost certainly does not originate from a single well-designed software library.

Choosing the base

Addition and multiplication within Tavdig operate on 16-bit words. As the report suggests, this is rather unusual. After all, any machine the code ever runs on can add 32-bit numbers comfortably.

The choice of radix 2 for the Montgomery multiplication is more puzzling. Working in this base is rather inefficient, the function Multiply() computes the product using additions, subtractions and bit shifts only. There is likely not a single MUL instruction involved. In software, the radix is usually chosen to correspond to the width of the machine word. Montgomery multiplication in radix 2 is an approach used decades ago in hardware that could not multiply at all.

Combined with the use of 16-bit digits for addition, this may suggest that the authors were inspired by the methods used in hardware or embedded devices.

Modular delusion?

This text was written after I read how GovCERT reported on the implementation of cryptography in Tavdig. I consider the terminology used in the document ("finite fields") somewhat inappropriate, at times even misleading (Figure 15). The authors do nevertheless explain reasonably well what is going on in the code, all this without ever mentioning Montgomery multiplication. Maybe GovCERT were not familiar with the method, maybe it was never really there.

It is possible that the authors of Tavdig were not familiar with Montgomery arithmetic either. Maybe they discovered the trick all by themselves. In that case, it would be a rather impressive piece of code. It is not, it is still malware.