Rtfobj


oletools is a package of python tools to analyze Microsoft OLE2 files (also called Structured Storage, Compound File Binary Format or Compound Document File Format), such as Microsoft Office documents or Outlook messages, mainly for malware analysis, forensics and debugging. It is based on the olefile parser. See http://www.decalage.info/python/oletools for more info.

News:

  • 2016-11-01 v0.50: all oletools now support python 2 and 3.
    • olevba: several bugfixes and improvements.
    • mraptor: improved detection, added mraptor_milter for Sendmail/Postfix integration.
    • rtfobj: brand new RTF parser, obfuscation-aware, improved display, detect executable files in OLE Package objects.
    • setup: now creates handy command-line scripts to run oletools from any directory.
  • 2016-06-10 v0.47: olevba added PPT97 macros support, improved handling of malformed/incomplete documents, improved error handling and JSON output, now returns an exit code based on analysis results, new --relaxed option. rtfobj: improved parsing to handle obfuscated RTF documents, added -d option to set output dir. Moved repository and documentation to GitHub.
  • 2016-04-19 v0.46: olevba does not deobfuscate VBA expressions by default (much faster), new option --deobf to enable it. Fixed color display bug on Windows for several tools.
  • 2016-04-12 v0.45: improved rtfobj to handle several anti-analysis tricks, improved olevba to export results in JSON format.

Tools:

  • olebrowse: a simple GUI to browse OLE files (e.g. MS Word, Excel, Powerpoint documents), to view and extract individual data streams.
  • oleid: to analyze OLE files to detect specific characteristics usually found in malicious files.
  • olemeta: to extract all standard properties (metadata) from OLE files.
  • oletimes: to extract creation and modification timestamps of all streams and storages.
  • oledir: to display all the directory entries of an OLE file, including free and orphaned entries.
  • olemap: to display a map of all the sectors in an OLE file.
  • olevba: to extract and analyze VBA Macro source code from MS Office documents (OLE and OpenXML).
  • MacroRaptor: to detect malicious VBA Macros
  • pyxswf: to detect, extract and analyze Flash objects (SWF) that may be embedded in files such as MS Office documents (e.g. Word, Excel) and RTF, which is especially useful for malware analysis.
  • oleobj: to extract embedded objects from OLE files.
  • rtfobj: to extract embedded objects from RTF files.
  • and a few others (coming soon)
oletools are used by a number of projects and online malware analysis services, including Viper, REMnux, FAME, Hybrid-analysis.com, Joe Sandbox, Deepviz, Laika BOSS, Cuckoo Sandbox, Anlyz.io, ViperMonkey, pcodedmp, dridex.malwareconfig.com, and probably VirusTotal. (Please contact me if you have or know a project using oletools)

Download and Install:
The recommended way to download and install/update the latest stable release of oletools is to use pip:

  • On Linux/Mac: sudo -H pip install -U oletools
  • On Windows: pip install -U oletools

This should automatically create command-line scripts to run each tool from any directory: olevba, mraptor, rtfobj, etc.
To get the latest development version instead:

  • On Linux/Mac: sudo -H pip install -U https://github.com/decalage2/oletools/archive/master.zip
  • On Windows: pip install -U https://github.com/decalage2/oletools/archive/master.zip

See the documentation for other installation options.

Documentation:
The latest version of the documentation can be found online; otherwise, a copy is provided in the doc subfolder of the package.

Source: kitploit

Malicious documents exploiting CVE-2017-11882 continue to be used by malicious actors, but it has been a few years since I took a deep dive into their mechanics. A quick spelunk through our dataset produces quite a few, but I wanted an RTF example with minimal RTF obfuscation and came across this email:

So It Begins

Let's start by analyzing the RTF document and comparing it with past documents. We know from experience that this vulnerability can be exploited from multiple document types (RTF, DOCX, XLSX) and that there are two options for injecting the malicious stream (Equation stream and OleNativeStream). But this one immediately looks different. Most public tools (rtfdump, rtfobj, or RTFScan) were unable to correctly parse the embedded stream. Although rtfdump doesn't parse the equation stream, it does provide a good layout of all embedded objects and lets us dump the stream suspected of being the equation stream.

We see the traditional ClassName (slightly obfuscated), EqUatioN.3, the required FormatID of 0x00000002, and random data for the OLEVersion. Instead of an Embedded Equation object header starting with 0x001c, or any bytes reflecting an MTEF header (such as an MTEF version of 0x03 and a product version of 0x03), we only see the FONT record at the correct offset: 0x0108 at 0x29.
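
To make the fields named above concrete, here is a minimal sketch of reading an OLE 1.0 object header (OLEVersion, FormatID, ClassName) from the bytes decoded out of an RTF \objdata block. It assumes the well-formed layout with 4-byte little-endian integers and length-prefixed ANSI strings; the function names are hypothetical, and a deliberately malformed sample like the one discussed here may not parse this cleanly.

```python
import struct


def read_length_prefixed_ansi(buf, offset):
    """Read a 4-byte little-endian length, then that many bytes (NUL included)."""
    (length,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    raw = buf[start:start + length]
    if len(raw) != length:
        raise ValueError("truncated length-prefixed string")
    return raw.rstrip(b"\x00").decode("latin-1"), start + length


def parse_ole1_header(buf):
    """Parse the header of an embedded OLE 1.0 object (hypothetical helper)."""
    ole_version, format_id = struct.unpack_from("<II", buf, 0)
    class_name, pos = read_length_prefixed_ansi(buf, 8)
    topic_name, pos = read_length_prefixed_ansi(buf, pos)
    item_name, pos = read_length_prefixed_ansi(buf, pos)
    (native_size,) = struct.unpack_from("<I", buf, pos)
    native_data = buf[pos + 4:pos + 4 + native_size]
    return {
        "ole_version": ole_version,   # often random in malicious samples
        "format_id": format_id,       # 2 means an embedded object
        "class_name": class_name,     # compare case-insensitively
        "topic_name": topic_name,
        "item_name": item_name,
        "native_data": native_data,   # the stream suspected of holding the equation data
    }
```

Because the ClassName is often obfuscated with mixed case, a check such as `header["class_name"].lower() == "equation.3"` is more robust than an exact comparison.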

Out of Doubt, Out of Dark

Let's load this sample into a debugger and see what other tricks have been developed. Because the equation object relies on COM, we can set a breakpoint when these objects are created and iterate until EQNEDT32.EXE is launched. We then attach a separate debugger to the Equation Editor process and set a breakpoint on the vulnerable function, 0x0041160F. Just as in my last analysis, the return address is overwritten with the address of a RET instruction. Because the font record location follows the return address on the stack, this also results in execution flow continuing into the first stage shellcode.

The first stage shellcode is slightly different for this sample, but not unique and already discussed here. Basically, the shellcode locates the OLE stream on the heap and uses kernel32.GlobalLock to lock the stream at this memory location. It then jumps to a statically defined offset within the OLE stream.

Similar to my previous analysis, the second stage shellcode starts with a decoder stub. The decoder contains quite a few JMPs to complicate analysis, but it can be boiled down to the following:

  • a CALL instruction to load the start of the encoded shellcode on the stack
  • POP ESI to create a pointer to the encoded shellcode
  • Initialize the key for the XOR decoder
  • the key mutates every iteration with IMUL EDI, EDI, 67D6B6F7
  • each dword is decoded with XOR DWORD PTR DS:[ESI], EDI
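
The decoder loop listed above can be sketched in Python as follows. Only the multiplier (67D6B6F7) and the IMUL/XOR instructions come from the analysis; the initial key is not given in the text, so it is a parameter here, and the order (mutate the key, then XOR the dword) is an assumption that should be checked against the disassembly.

```python
import struct

MULTIPLIER = 0x67D6B6F7  # constant from IMUL EDI, EDI, 67D6B6F7
MASK32 = 0xFFFFFFFF      # registers are 32 bits wide


def xor_decode(data, initial_key, multiplier=MULTIPLIER):
    """Decode dword by dword with a key that is multiplied on every iteration.

    initial_key is hypothetical; read the real value from the decoder stub.
    """
    if len(data) % 4:
        raise ValueError("data length must be a multiple of 4")
    key = initial_key & MASK32
    out = bytearray()
    for (dword,) in struct.iter_unpack("<I", data):
        key = (key * multiplier) & MASK32      # IMUL EDI, EDI, imm32 (low 32 bits)
        out += struct.pack("<I", dword ^ key)  # XOR DWORD PTR DS:[ESI], EDI
    return bytes(out)
```

Because the key stream does not depend on the data, the same function encodes and decodes, which makes it easy to sanity-check against a known plaintext.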

If This Is to Be Our End

Now that we know the shellcode for these malicious RTF documents hasn't changed much, can we use the unicorn engine to dump the final payload without relying on the heavyweight and manual process of running it within a debugger?

The first step is to extract the shellcode from the RTF, starting at the last instruction of the first stage shellcode, JMP EAX, and then replace this instruction with a relative jump. The two instructions preceding this one result in 0xD5, and the JMP instruction is at offset 0x33 from the start of the OLE stream. Changing JMP EAX into a relative near jump adds 3 bytes to the instruction and results in JMP 0x9F. Stripping the shellcode from the original RTF and modifying the JMP instruction produces the following hex string:
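
As a rough illustration of this patching step, the sketch below swaps a 2-byte JMP EAX (FF E0) for a 5-byte JMP rel32 (E9 plus a 32-bit displacement), which is where the 3 extra bytes come from. The helper names and the offsets in the usage note are made up for illustration; the sketch does not try to reproduce the 0x9F operand, since that depends on where the extracted buffer starts and how the target offset is measured.

```python
import struct

JMP_EAX = b"\xFF\xE0"   # 2-byte indirect jump through EAX
JMP_REL32 = b"\xE9"     # opcode of the 5-byte relative near jump


def encode_jmp_rel32(jmp_offset, target_offset):
    """Encode JMP rel32 placed at jmp_offset so that it lands on target_offset.

    The displacement is relative to the end of the 5-byte instruction.
    """
    rel = target_offset - (jmp_offset + 5)
    return JMP_REL32 + struct.pack("<i", rel)


def replace_jmp_eax(code, jmp_offset, target_offset):
    """Replace JMP EAX at jmp_offset with a relative jump to target_offset.

    target_offset refers to the ORIGINAL buffer; bytes after the patch move
    by 3, so a target located after the jump is shifted accordingly.
    """
    if code[jmp_offset:jmp_offset + 2] != JMP_EAX:
        raise ValueError("no JMP EAX at the given offset")
    growth = 5 - len(JMP_EAX)
    new_target = target_offset + growth if target_offset > jmp_offset else target_offset
    patch = encode_jmp_rel32(jmp_offset, new_target)
    return code[:jmp_offset] + patch + code[jmp_offset + 2:]
```

For example, `replace_jmp_eax(code, 3, 15)` on a buffer with JMP EAX at offset 3 returns a buffer 3 bytes longer whose jump lands on the byte that was originally at offset 15.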

I leave it to the reader to review the unicorn engine's tutorial and sample scripts for their programming platform.

One interesting feature of the unicorn engine is how we can add hooks to instructions, code blocks, and even results of an instruction. We can use these hooks to add a callback function every time an instruction writes to memory or when an instruction reads from an unmapped segment of memory. To use the unicorn engine to decode our shellcode we will need to do the following:

  • Define and map our address space
  • Define ESP to handle any POP instructions
  • Define a callback function on memory writes to determine what segment of our shellcode is being modified
  • Define a callback function on a memory read from an unmapped segment; this should indicate our final shellcode attempting to load a function from a module
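
The four steps above could look roughly like this with the unicorn Python bindings. This is a sketch under assumptions, not the script used in the analysis: the base address, region size, stack placement, and the `emulate` function name are all arbitrary choices, and the unicorn package must be installed separately.

```python
def emulate(shellcode, base=0x1000000, size=2 * 1024 * 1024):
    """Run shellcode until it ends or reads unmapped memory (hypothetical helper)."""
    # Imported lazily so this module still loads where unicorn is not installed.
    from unicorn import (Uc, UcError, UC_ARCH_X86, UC_MODE_32,
                         UC_HOOK_MEM_WRITE, UC_HOOK_MEM_READ_UNMAPPED)
    from unicorn.x86_const import UC_X86_REG_ESP

    writes = []          # (address, length, value) for every memory write
    unmapped_reads = []  # addresses read outside the mapped region

    def on_write(uc, access, address, length, value, user_data):
        # Mask the value to the access width so it is reported as unsigned.
        writes.append((address, length, value & ((1 << (8 * length)) - 1)))

    def on_unmapped_read(uc, access, address, length, value, user_data):
        unmapped_reads.append(address)
        return False  # do not recover: stop the emulation here

    mu = Uc(UC_ARCH_X86, UC_MODE_32)
    mu.mem_map(base, size)                          # define and map the address space
    mu.mem_write(base, shellcode)
    mu.reg_write(UC_X86_REG_ESP, base + size // 2)  # stack in the middle of the region
    mu.hook_add(UC_HOOK_MEM_WRITE, on_write)
    mu.hook_add(UC_HOOK_MEM_READ_UNMAPPED, on_unmapped_read)
    try:
        mu.emu_start(base, base + len(shellcode))
    except UcError:
        pass  # expected once the payload touches unmapped memory
    decoded = bytes(mu.mem_read(base, len(shellcode)))
    return writes, unmapped_reads, decoded
```

The write log shows which part of the buffer the decoder stub modified, and `decoded` holds the buffer as it looked when emulation stopped, which is what gets dumped for further inspection.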

Excellent! Our script was able to decode the final shellcode, and we can even see the API calls that are loaded via LoadLibraryW. Because the strings in the shellcode are encoded as UTF-16BE, we can print the important IoCs by setting the encoding for the strings command. Our pipeline had already pulled this sample and labeled it as MassLogger.
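
If the strings command is not at hand (with GNU strings, the big-endian 16-bit encoding is selected with `-e b`), the same extraction can be approximated in Python. This is a minimal sketch that only finds printable ASCII characters stored as UTF-16BE; the function name and the minimum length of 4 are arbitrary choices.

```python
import re


def utf16be_strings(data, min_chars=4):
    """Return runs of printable ASCII characters encoded as UTF-16BE."""
    # Each character is a 0x00 byte followed by a printable ASCII byte.
    pattern = re.compile(rb"(?:\x00[\x20-\x7e]){%d,}" % min_chars)
    return [match.group().decode("utf-16-be") for match in pattern.finditer(data)]
```

Running it over the decoded shellcode buffer should list API names and other indicators in the order they appear.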

IoCs




