Categorized | Security

Removing semantic NOP’s from Malware

A common obfuscation technique used by malware is by randomly inserting a sequence of instructions that have no other effect on the functionality of the program.  This technique is additionally used in polymorphic and metamorphic malware.

There are a couple of interesting papers that try to address this by identifying some classes of “semantic nops” in the malware and removing them.  The basic model is to identify a particular sequence or set of instructions to examine, typically generated by bruteforce of all possible combinations in each basicblock, and then ask an oracle if its a semantic nop.

There are two papers which are probably of interest.  Malware Normalization http://www.cs.wisc.edu/wisa/papers/tr1539/tr1539.pdf, is an in interesting paper and has several techniques to canonicalize a binary so that it can be more effective when used scanning for in antivirus.  Another paper of interest is http://www.eecs.berkeley.edu/~sseshia/pubdir/oakland05.pdf which is a little more formal (and thus for me, harder to understand), on Semantics-Aware Malware Detection.  This paper performs static analysis on the malware and tries to match it to a template signature which describes particular behaviours of the malware such as its memory access.

The semantic nop oracle that can tell you if a sequence of code is a nop or not, is an SMT decision procedure (http://en.wikipedia.org/wiki/Satisfiability_Modulo_Theories).  SMT is an extension of SAT.  You give it a list of constraints, such as equalities and arithmetic, and then you can query the decision procedure to see if certain conditions can possibly be true or false.  Counter examples can be given in return to prove the results of the decision procedure.

To determine if a sequence of instructions is a semantic NOP, you ask the decision procedure if the set of registers that are modified by the instructions, result in the same values as what they were before hand.  If they are the same, then it’s clearly a NOP.  You also need to check that all memory written to is the same before and after as well.  Care should be taken with the status flags, and the papers mentioned don’t talk in depth about this. Perhaps checking for a lack of liveness in the flags or that the flags are the preserved at the end of a potential nop sequence. All these types of checks are only applied within basicblocks.  Detecting semantic nops as code that branches seems a much harder problem.

This appears to fine in most cases, but what those papers I mentioned earlier fail to describe is the case of exceptions being generated from memory access.  Consider the following code, which for all purposes appears to be a semantic NOP.

mov temp_mem, %eax; # potential exception
inc %eax
dec %eax
mov %eax,temp_mem

Note: status flag are not live at the end of the above code sequence making it eligable as a nop, which means there is a subsequent operation that overwrites the flags before using them.

That code indeed is a semantic NOP, _except_ when an exception occurs from accessing temp_mem.  If an exception occurs, then eax is clearly different from its starting condition when control is transferred to the exception handler.  This opens up a can of worms I think.

If we only look for semantic nops between memory accesses, we will undoubtedly miss a large number of cases that really are nops.  But if we ignore the potential for exceptions, then removing those misidentified nops will result in an unsound transformation.

Another problem was suggested to me, in terms of exceptions resulting from hardware break points. Removing a sequence of code, or applying a transformation to it, may not be sound.

PS.  I implemented a basic and incomplete version of dynamic taint analysis, and am using it to track the return value of GetProcAddress, so I can reconstruct the import table after auto unpacking.  It works fairly nicely.  And I implemented most of the code to transform my IR into STP constraints (STP is an opensource SMT decision procedure).

View full post on Silviocesare’s Weblog

Related Posts
  • Removing Persistent Malware
    This blog post is not for the technical guru!While it's not for mums and dads either, its main purpose is to explain to an average user how to manually remove persistent malware that cannot be easily ...
  • Removing Vista Antivirus Pro Fake Anti-virus program Malware
    This video describes the steps to fully remove all traces of the Vista Antivrus pro fake anti-virus malware. The repairs are carried out by qualified engineers from reb...
  • Removing Persistent Malware
    This blog post is not for the technical guru!While it's not for mums and dads either, its main purpose is to explain to an average user how to manually remove persistent malware that cannot be easily ...
  • Osama bin Laden dead – so watch for the spams and scams
    Google's top-trending Anglophone search term right now is, understandably, "osama bin laden dead". Google officially describes its hotness (you couldn't make this stuff up) as volcanic.The short versi...
  • Remove Antivirus Center (Uninstall Guide)
    Antivirus Center is a rogue anti-spyware program from the same family as Internet Protection. This malware is installed onto your computer through the use of fake scanner pages and Trojans that preten...
  • Compromised ads leading to TDSS rootkit infections
    As we all know, compromised sites play an important role in web distributed malware, acting as the conduit, guiding user traffic to further malicious content. Sometimes, the attackers get lucky, and s...
  • Data thefts far more common than just Sony and Epsilon
    In the wake of the press reports concerning the recent data breaches at Sony and Epsilon, some organizations are getting the wrong idea about modern online attacks. The media largely chooses to cover ...
  • Be Careful If Searching For Images of Kate Middleton’s Dress
    Real-world events occasionally generate a massive number of online searches. Japan's recent earthquake and the subsequent tsunami that followed is a good example of a sudden event that turned the worl...
  • IME Injection Evolution
    Recently,we found many malwares using a smarter way to inject the specified dll into system related to IME management. Comparing to the old IME injection tricks, it is much more difficult to be discov...
  • FBI takes on Coreflood botnet – but is this a step too far?
    Two weeks ago, the Federal Bureau of Investigation (FBI) obtained a court order in Connecticut, USA. This court order allowed the FBI to undertake an anti-cybercrime operation of a sort which had neve...

8 Responses to “Removing semantic NOP’s from Malware”

  1. silviocesare says:

    Hi. I am unable to comment on the graph matching algorithm I’m using, as it is central to a paper I am currently trying to publish. Sorry :(

  2. anon says:

    out of interest, what is this graph matching algorithm you speak of?

  3. silviocesare says:

    I will definately put the paper online after the conference – presuming my submission is accepted. The conference is in January 2010.

  4. guy says:

    Neat.

    Send us a link once the paper becomes public.

  5. silviocesare says:

    I have not stopped posting! :-)

    For the past couple months I have been working on a prototype for a flowgraph based malware classification tool using a novel graph matching algorithm. I have finished the prototype and am currently working on writing a paper for an Australian academic conference. This has taken up most of my time. I am unable to publish the details of the prototype or algorithm until the paper is public, hopefully later this year.

    I hope that explains why I’ve not posted during this time.

  6. guy says:

    Why did you stop posting ?:/

  7. bw says:

    hey it’s nice to read about it, so i can make my own code more “aware” of removing nop sequences :)

  8. jf says:

    Just ran across this from phrack, hadn’t seen you since you popped up in nologin some months ago. You are truly gifted, which always causes problems (no gift is free). It’s amazing however, a few months later and its like you never left (quality of work wise).

    On a side note, if you have any idea how to get ahold of mandragore, he disappeared a year or two ago also.

Trackbacks/Pingbacks


Security Status

Beware Facebook "Timeline" scams http://t.co/W5EW0cVv
5 months ago
Nigerian government (unknowingly) hosts phishing website http://t.co/uQd42ENw
5 months ago
PCMag Awards McAfee All Access its Editors’ Choice: SANTA CLARA, Calif.--(BUSINESS WIRE)--McAfee today announced... http://t.co/FakV7Vd8
5 months ago
RT @mikko: I hadn't noticed Google Maps has added 3D models of buildings. Here's a (very accurate) view of F-Secure HQ in Helsinki http://t.co/IKfAZlak
5 months ago
North Koreans aren't known for their online presence. But others may be lured into clicking Kim Jong-Il 'videos' too http://t.co/yQOon6YT
5 months ago
How to Protect Your Professional Reputation on Facebook Timeline http://t.co/I4bcR2VN
5 months ago
This is pretty impressive from @Softpedia: Facebook scans 2 trillion link clicks and blocks 220 million posts each day http://t.co/vKsn9gNl
5 months ago
Need for integrated approach to security in industrial control systems - http://t.co/tPBCNOow with @PikeResearch
5 months ago
Some free-based music we play at work http://t.co/xu5agZfc
5 months ago
Japan’s cyber defense weapon: a virus. It includes quotes by @Luis_Corrons via @InfosecurityMag
5 months ago