Technical Analysis of CVE-2014-1761 RTF Vulnerability

 

Recently, Microsoft announced that an RTF sample exploiting CVE-2014-1761 is in the wild. The sample has just become publicly known. I spent some time analyzing the vulnerability, and this post describes what I found. The sample I analyzed has a SHA1 hash of 200f7930de8d44fc2b00516f79033408ca39d610. The main module used in my analysis is wwlib.dll, file version 14.0.7113.5001, from Microsoft Office 2010.

 

The main parser is located at address 0x31D0BAFF in wwlib.dll.

fig01.png 

Figure 1 Main RTF parser

 

This function is really big and has a few huge switch statements. Figure 2 shows a bird’s eye view of the function, giving you a sense of how it looks. Most of the code related to the vulnerability and the exploit exists in this function.

 fig02.png

Figure 2 Main RTF parser bird's eye view

 

Out-of-Bounds Array Overwrite

The vulnerability is an out-of-bounds memory overwrite. A memory array is created when the listoverridecount control word is parsed. The argument of this control word, which is 25 in this sample, is used to allocate the array, which later stores memory pointers (Figure 3). Each array element is 8 bytes in size.

 

 fig03.png

Figure 3 listoverridecount

 

Figure 4 shows the code that allocates this array. The instruction at 0x31D0C67F pushes the array size (25) onto the stack. The instruction at 0x31D0C690 saves the returned memory pointer into a structure instance in memory, at offset 0x80FC. Register ecx points to this structure, which holds the parsed RTF information. The offset alone (0x80FC) tells you that the structure is huge. It is used throughout the code. From now on, I refer to the structure whose base address ecx points to as the RTF parse object.

 

 fig04.png

Figure 4 Array allocation code
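As a rough check of the numbers above, the following sketch computes the size of the array. It assumes the allocation is exactly the element count times the 8-byte element size, with no header, which the disassembly in Figure 4 may or may not match. The names ENTRY_SIZE, LISTOVERRIDECOUNT and array_size_bytes are made up for illustration.

```python
# Values taken from the analysis above; the names are hypothetical.
ENTRY_SIZE = 8            # each array element is 8 bytes
LISTOVERRIDECOUNT = 25    # argument of the listoverridecount control word
ARRAY_PTR_OFFSET = 0x80FC # RTF parse object field holding the array pointer


def array_size_bytes(count, entry_size=ENTRY_SIZE):
    """Size of the array, assuming a plain count * entry_size allocation."""
    return count * entry_size


size = array_size_bytes(LISTOVERRIDECOUNT)
last_valid_index = LISTOVERRIDECOUNT - 1
print(f"{LISTOVERRIDECOUNT} elements -> {size} bytes (0x{size:X}), "
      f"valid indexes 0..{last_valid_index}")
```

Under that assumption, any write at index 25 or above lands outside the allocation.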

 

Figure 5 shows a memory copy routine where the actual array assignments happen. Register edi at 0x31D131FF is the index into the destination array. You can confirm this from the instruction at 0x31D131F9, which loads the array base from eax+0x80FC, the field that contains the address of the memory array. Register eax points to the RTF parse object in this case.

 

 fig05.png

Figure 5 Array copy routine

 

The problem is that when the memory copy happens, the index value (edi at 0x31D131FF) is not checked against the maximum array size, so an out-of-bounds array overwrite happens. The next question is what condition causes this out-of-bounds assignment. I found that each lfolevel control word triggers one array assignment and increments the global array index.

Figure 6 shows the control words that actually trigger this memory operation. There are 34 lfolevel control words, which is more than the original array can hold.

 

 fig06.png

Figure 6 Control word (lfolevel) that triggers array assignment
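The mismatch between the two counts can be sketched as follows. This is a simplified model, not the parser's code: it assumes the index starts at 0 and that every lfolevel control word performs exactly one write. The function name out_of_bounds_indexes is hypothetical.

```python
CAPACITY = 25        # elements allocated from listoverridecount
LFOLEVEL_COUNT = 34  # lfolevel control words in the sample


def out_of_bounds_indexes(capacity, writes):
    """Indexes written past the end when the index is never bounds-checked.

    Assumes one write per control word, starting at index 0.
    """
    return [index for index in range(writes) if index >= capacity]


oob = out_of_bounds_indexes(CAPACITY, LFOLEVEL_COUNT)
print(f"{len(oob)} out-of-bounds writes, indexes "
      f"{hex(oob[0])}..{hex(oob[-1])}")
```

In this model, index 0x1d, which is discussed in the next section, is one of the out-of-bounds slots.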

 

Array Overwrite Content

The first 4 bytes of data copied to each array element are actually the memory address of a heap location. Sometimes, just a null value is copied. Figure 7 shows one of the memory areas (0x067d8870) whose address is copied to array index 0x1d. The array location at index 0x1d is outside the original array and falls inside one of the objects used for the OART operation in GFX.dll.

 

 fig07.png

Figure 7 Array content copied

 

Figure 8 shows how the object memory corruption happens. You can see that the bytes belonging to the GFX object are overwritten. The first 4 bytes of the 8-byte array element written there overlap the vtable pointer of that object, so the pointer is overwritten. This enables code execution later, when methods of this GFX object are called.

 

 fig08.png

Figure 8 GFX object vtable corruption
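The effect of the overwrite can be modeled with a flat byte buffer. This is only a sketch under loud assumptions: the victim object is placed exactly where array index 0x1d lands, the original vtable value 0x11223344 is a made-up placeholder, and the second DWORD of the element is set to 0 because its real value is not described here. Only the address 0x067d8870 and the 8-byte element size come from the analysis.

```python
import struct

ENTRY_SIZE = 8
CAPACITY = 25
FAKE_VTABLE = 0x067D8870      # heap address written to index 0x1d
ORIGINAL_VTABLE = 0x11223344  # hypothetical placeholder value


def write_entry(heap, index, pointer, second=0):
    """Write one 8-byte element with no bounds check (models the bug)."""
    struct.pack_into("<II", heap, index * ENTRY_SIZE, pointer, second)


heap = bytearray(0x200)            # array (200 bytes) plus memory after it
object_offset = 0x1D * ENTRY_SIZE  # where the out-of-bounds write lands

# Hypothetical victim object: its first DWORD is the vtable pointer.
struct.pack_into("<I", heap, object_offset, ORIGINAL_VTABLE)

write_entry(heap, 0x1D, FAKE_VTABLE)
vtable_after = struct.unpack_from("<I", heap, object_offset)[0]
print(f"vtable pointer after overwrite: 0x{vtable_after:08x}")
```

After the write, the modeled object's vtable pointer refers to the attacker-filled heap block instead of the real vtable.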

 

The memory contents at 0x067d8870 are shown in Figure 9. The contents serve as a vtable after memory corruption happens.

 

 fig09.png

Figure 9 Memory contents at 0x067d8870

 

Figure 10 shows the code that copies the memory contents to 0x067d8870. It copies 0x30 bytes of data from the RTF parse object+0x8104 location. When memcpy is executed (at 0x31D12F78), register ecx points to the global RTF parse object+0x8104.

 

 fig10.png

Figure 10 Memcpy routine that copies contents to 0x067d8870

 

After all the array assignments happen, the final memory layout related to exploitation looks like the following. ArrayData represents the data copied from RTF parse object+0x8104 each time.

 

 fig11.png

Figure 11 ArrayData related to the exploitation

 

How the RTF parse object+0x8104 area is constructed

The sample triggers memory corruption using the contents of a specific memory area (RTF parse object+0x8104). The next two questions are how this area is constructed and whether it is attacker-controlled. The answer to the second question is yes. As I said, the RTF parse object contains information about the parsed RTF stream. The memory area around RTF parse object+0x8104 is written by various control words. The following sections explain which control word is responsible for which part.

First, note that the contents at RTF parse object+0x8104 change over time as new control words are parsed. For this explanation, I use the memory contents at the time the assignment to array index 0x1d happens.

 

RTF parse object+0x8104 (DWORD)

The instructions from Figure 12 are responsible for putting the value in that location.

 

 fig12.png

Figure 12 RTF parse object+0x8104 modify instructions

 

The actual related control word is levelstartat. Figure 13 shows the part that is responsible for the bytes written from the exploit. You will recognize that the number 31611 is actually 0x7b7b in hex.

 

 fig13.png

Figure 13 Affecting control word
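The conversion is easy to verify. The sketch below also shows how the value would appear in memory as a DWORD, assuming the usual little-endian byte order of x86. The helper name dword_le is made up for illustration.

```python
import struct


def dword_le(value):
    """Pack an unsigned 32-bit value as little-endian bytes (x86 layout)."""
    return struct.pack("<I", value)


levelstartat = 31611  # argument from the sample
print(hex(levelstartat), dword_le(levelstartat).hex(" "))
```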

 

Figure 14 shows the 4 bytes from the RTF parse object+0x8104.

 

fig14.png 

Figure 14 RTF parse object+0x8104 (DWORD)

 

RTF parse object+0x8108 (BYTE)

Figure 15 shows the instructions that fill the RTF parse object+0x8108 location.

 

 fig15.png

Figure 15 RTF parse object+0x8108 modify instructions

 

The control word that triggers this code is levelnfcn. In Figure 16, the number 232 is 0xe8, which is the value written to the RTF parse object+0x8108 location.

 

 fig16.png

Figure 16 Affecting control word

 

 fig17.png

Figure 17 RTF parse object+0x8108 (BYTE)

 

RTF parse object+0x8109 (BYTE)

The control words levelnorestart, levelold, jclisttab, leveljcn affect this memory location. Figure 18 shows the code that modifies this memory byte.

 fig18.png

Figure 18 RTF parse object+0x8109 modify instructions

 

The memory byte at this offset is shown in Figure 19.

 

fig19.png 

Figure 19 RTF parse object+0x8109

 

Figure 20 shows the actual control words, levelnorestart1 (which sets 0x08) and levelold1 (which sets 0x40), which produce the value 0x48 when XORed together at the affected location.

 

 fig20.png

Figure 20 Affecting control word
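The byte can be reproduced with simple bit arithmetic. This sketch reads the two contributions as the hexadecimal flags 0x08 and 0x40, which is an interpretation: they have to be hexadecimal for the result to be 0x48. The constant names are hypothetical, and the exact instructions used by wwlib.dll are the ones in Figure 18, not this code.

```python
# Hypothetical names for the bits contributed by each control word.
FLAG_LEVELNORESTART = 0x08  # set by levelnorestart1
FLAG_LEVELOLD = 0x40        # set by levelold1

# The two flags share no bits, so XOR and OR give the same result.
combined_xor = FLAG_LEVELNORESTART ^ FLAG_LEVELOLD
combined_or = FLAG_LEVELNORESTART | FLAG_LEVELOLD
print(hex(combined_xor), hex(combined_or))
```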

 

RTF parse object+0x810A

The RTF parse object+0x810A location is filled from the argument of the levelnumbers control word. In Figure 21, the first 5 bytes (\'5A') are escaped and interpreted as 0x5A.

 fig21.png

Figure 21 Affecting control word
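RTF encodes a raw byte as a backslash, an apostrophe and two hex digits. A minimal decoder for that escape could look like the sketch below. It is not how wwlib.dll parses the stream, the function name decode_hex_escapes is made up, and it ignores everything except the hex escapes.

```python
import re

# Matches the RTF hex escape: backslash, apostrophe, two hex digits.
HEX_ESCAPE = re.compile(r"\\'([0-9A-Fa-f]{2})")


def decode_hex_escapes(text):
    """Return the bytes encoded by all \\'hh escapes found in text."""
    return bytes(int(digits, 16) for digits in HEX_ESCAPE.findall(text))


decoded = decode_hex_escapes(r"\'5A")
print(decoded.hex())
```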

 

The hex representation of the bytes is shown in Figure 22.

 

 fig22.png

Figure 22 Affecting control word (hex bytes)

 

Figure 23 shows the code that copies the bytes to the RTF parse object+0x810A memory location.

 

 fig23.png

Figure 23 RTF parse object+0x810A modify instructions

 

The interpreted bytes are shown in Figure 24.

 

 fig24.png

Figure 24 RTF parse object+0x810A

 

RTF parse object+0x8113 (BYTE)

The next single byte is controlled by the instruction from Figure 25.

 fig25.png

Figure 25 RTF parse object+0x8113 modify instructions

 

The control word that triggers this code is levelfollow. The actual RTF string is shown in Figure 26.

 

fig26.png 

Figure 26 Affecting control word

 

The argument 39 is 0x27 and it is reflected in Figure 27.

 

 fig27.png

Figure 27 RTF parse object+0x8113 (BYTE)

 

RTF parse object+0x8114 (DWORD)

The RTF parse object+0x8114 location is filled from the instruction shown in Figure 28.

 

 fig28.png

Figure 28 RTF parse object+0x8114 modify instructions

 

The control word levelspace triggers the code, and its argument is shown in Figure 29.

 fig29.png

Figure 29 Affecting control word

 

The number 22873 is 0x5959 and is well reflected in Figure 30.

 

 fig30.png

Figure 30 RTF parse object+0x8114 (DWORD)

 

RTF parse object+0x8118 (DWORD)

Figure 31 shows the instructions that modify 4 bytes from the RTF parse object+0x8118 location.

 

 fig31.png

Figure 31 RTF parse object+0x8118 modify instructions

 

The control word levelindent is responsible for triggering the instructions. You can see the argument shown in Figure 32.

 fig32.png

Figure 32 Affecting control word

 

The number 23130 is 0x5a5a which is well represented in Figure 33.

 

 fig33.png

Figure 33 RTF parse object+0x8118 (DWORD)
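Putting the pieces together, the following sketch rebuilds the region from RTF parse object+0x8104 up to +0x811C from the control word arguments described above. It makes several assumptions: little-endian DWORDs, a 9-byte gap for the levelnumbers data (inferred only from the distance between +0x810A and +0x8113), and zero placeholders for the levelnumbers bytes after the first 0x5A, whose real values are only visible in the figures. The function name build_region is hypothetical.

```python
import struct

BASE = 0x8104  # start of the region inside the RTF parse object


def build_region(levelstartat, levelnfcn, flags, levelnumbers,
                 levelfollow, levelspace, levelindent):
    """Lay out the fields at the offsets described in this post."""
    if len(levelnumbers) > 0x8113 - 0x810A:
        raise ValueError("levelnumbers data does not fit the assumed gap")
    region = bytearray(0x811C - BASE)
    struct.pack_into("<I", region, 0x8104 - BASE, levelstartat)
    region[0x8108 - BASE] = levelnfcn
    region[0x8109 - BASE] = flags
    start = 0x810A - BASE
    region[start:start + len(levelnumbers)] = levelnumbers
    region[0x8113 - BASE] = levelfollow
    struct.pack_into("<I", region, 0x8114 - BASE, levelspace)
    struct.pack_into("<I", region, 0x8118 - BASE, levelindent)
    return bytes(region)


region = build_region(
    levelstartat=31611,     # 0x7b7b
    levelnfcn=232,          # 0xe8
    flags=0x48,             # levelnorestart1 and levelold1 combined
    levelnumbers=b"\x5a",   # only the first byte is known from the text
    levelfollow=39,         # 0x27
    levelspace=22873,       # 0x5959
    levelindent=23130,      # 0x5a5a
)
print(region.hex(" "))
```

Every byte the sketch fills in comes from a control word argument in the RTF stream, which is the point of this section: the attacker chooses the contents that later serve as the fake vtable.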

 

Summary

This vulnerability and its exploit are really interesting. The attacker has good control over most of the bytes used in the exploitation. Because the contents of the fake vtable that replaces the original one can be controlled reliably, the first phase of the exploitation process is very stable. The fact that the attacker could control the necessary bytes using just a few control words out of the whole set tells us that the attacker has good knowledge of the internals of RTF processing in wwlib.dll.

 

 

 

 

Comments
stonedeyy | ‎04-18-2014 03:58 AM

Can you tell me your Testing environment, thank you

Matt_Oh | ‎04-23-2014 10:28 AM

@stonedeyy Windows XP SP3 + Office 2010 SP2 with full update at the time of testing

noobnoob | ‎04-24-2014 10:42 AM

How can i download this file?

Thanks

Matt_Oh | ‎04-24-2014 03:31 PM

@noobnoob You need to install Office 2010 SP2 and find the file in the installation folder, or go to the MS14-001 page and extract the files from the patch.

stonedeyy | ‎05-10-2014 08:17 AM

Hi,  I has test POC, the POC is ok, but I can not debug ROP and shellcode, Can you tell me, how can I debug ROP and shellcode, thank you!

Arash_A | ‎06-07-2014 06:45 AM

Hi, What about Word 2007 SP3 + Windows XP SP3 ? What is the address of the parser  function in this case? What are the addresses of other parts you showed in the figures in this case(Word 2007 SP3)?

How could the parser function be found in this case?

khg | ‎06-10-2014 01:26 PM

are you find rtf header vulnerable?

Matt_Oh | ‎07-01-2014 11:01 AM

This blog software is not easy to follow up with questions and answers. Please contact me through @ohjeongwook and let's continue our threads over there. I'll add the output from those discussions here later.

 

Thanks.

About the Author
Twitter: @ohjeongwook .