Saturday, March 29, 2014

Yet Another Security Tool @Troopers14


i read 'boutique conference' somewhere recently, and that term actually nails it. occupying the print media complex in heidelberg for an entire week, TROOPERS manages to provide two days of workshops, two more of conference and a day of roundtable discussions as well as various side events like an IPv6 security summit or an SAP security track or a telco sec day or.. did i miss anything? guess so.

there is a soldering station to fiddle with your.. badge, because it comes with an arduino attached. troopers provides food & coffee & mate nearly around the clock. AND, well not that i'm anything like picky about clothing at all, BUT they have conference shirts available for girls.

no one goes to cons just for collecting shirts.. but this industry is moaning about the lack of females, no? so when they finally show up it is really nice when this kind of event actually acknowledges beforehand that this could happen. in other words, giving out shirts only for men sort of implies that there will be no women.

summing it up, great event :]
and here comes what we did there.



as i refused for ~5 months to come up with documentation, i guess now finally is the right time. DiffRay in short is a tool to diff Windows 7 and Windows 8 executables to spot missing security functions in an automated way.

if one can fiddle around with input values for an application, without that application checking for their validity, chances are high one can actually perform creative abuse on memory structures. the inclined reader understands potential impact of memory corruption. Microsoft does too, and thus came up with dedicated libraries that provide input checking functions to make it easy for developers to apply the right security check for a dedicated input value. namely these libraries are intsafe and strsafe. they provide APIs like ULongAdd or StringCchCopy, which do nothing more than checking if a given value stays within expected boundaries (more information on MSDN). 
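
to make the idea of such a check concrete, here is a rough python model of what a checked 32-bit addition like ULongAdd boils down to. this is only a sketch: the real API is C and reports failure through an HRESULT, while the helper name ulong_add and its (ok, result) return shape are made up for illustration.

```python
ULONG_MAX = 0xFFFFFFFF  # largest value of a 32-bit unsigned integer


def ulong_add(augend, addend):
    """Hypothetical model of a checked 32-bit unsigned addition.

    Returns (True, sum) if the sum fits into 32 bits, (False, None) otherwise.
    """
    if augend > ULONG_MAX - addend:
        return False, None  # the sum would wrap around, refuse to compute it
    return True, augend + addend


def unchecked_add(augend, addend):
    """What a plain 32-bit addition does instead: it silently wraps."""
    return (augend + addend) & ULONG_MAX


ok, total = ulong_add(0xFFFFFFF0, 0x20)   # overflow is detected
wrapped = unchecked_add(0xFFFFFFF0, 0x20)  # wraps to a tiny value, 0x10
```

the wrapped value is what makes the unchecked variant dangerous: a length that should be huge suddenly looks small, and whatever buffer gets sized from it ends up too short.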

in the following, these functions are called 'safe functions'. we assume that when such a safe function is applied in one version of a library, while in the same piece of code of another version of that library no safe function is called - something is fishy.

we perform the diffing on Windows libraries and drivers (.dll, .sys) in a very simple way. our approach is to decompile each binary, scan for safe functions and put every hit per API into a database. this way we simply count the hits for a specific safe function in a library function and diff it with the complementary library function of another Windows version. 

finally, if the hit counts differ we have a good chance, that some value in that library function at hand goes unchecked. we consider this a potential vulnerability.
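
the counting and diffing idea can be sketched in a few lines of python. this is not DiffRay's actual code, just a minimal illustration: the function names count_hits and diff_hits, the short signature list and the two decompiled snippets are all made up for the example.

```python
import re
from collections import Counter

# a tiny, illustrative subset of safe function names
SAFE_FUNCTIONS = ["ULongAdd", "StringCchCopy", "StringCbCopy", "StringCbLength"]


def count_hits(body, signatures=SAFE_FUNCTIONS):
    """Count calls to each safe function inside one decompiled function body."""
    hits = Counter()
    for sig in signatures:
        # match the name as a whole identifier, followed by an opening parenthesis
        found = len(re.findall(r"\b%s\s*\(" % re.escape(sig), body))
        if found:
            hits[sig] = found
    return hits


def diff_hits(win7, win8):
    """Return {signature: (win8_count, win7_count)} for every count that differs."""
    result = {}
    for sig in sorted(set(win7) | set(win8)):
        if win7[sig] != win8[sig]:  # a Counter yields 0 for missing keys
            result[sig] = (win8[sig], win7[sig])
    return result


# hypothetical pseudo code of the same library function in both versions
win7_body = "v3 = a1 + a2; memcpy(dst, src, v3);"
win8_body = "if ( ULongAdd(a1, a2, &v3) < 0 ) return -1; memcpy(dst, src, v3);"

report = diff_hits(count_hits(win7_body), count_hits(win8_body))
```

here report flags ULongAdd with one hit on the Win8 side and none on the Win7 side, which is exactly the kind of mismatch worth a manual look.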


the bare necessities
Python 2.7 32bit
pymssql 1.0.2 32bit for Python 2.7
PyQt 4 32bit 

DiffRay comes with two executables for decompilation of libs and drivers and a python application for parsing the spotted safe function hits into a database and for producing the final diffings. basically what we do is decompile the binaries to .c-files, so yeah we produce some sort of Windows OS source code :) 
you will need IDA Pro and the hexrays decompiler for this step. the .c-files are then parsed to a database. you can choose between sqlite or mssql; i highly recommend mssql. or you implement a DB handler of your choice, this is python!!

next step is the parsing. DiffRay parses either files or whole directories for symbols of safe functions. right now there are 130 symbols, they can be extended by just editing the signatures.conf. also, there is signature mapping if some safe function turns out to be equivalent to another one. we saw this in the past, but didn't yet come up with the right mappings. configuration could be achieved by editing the signature_mapping.conf in the form sig1=sig2, line by line.
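
reading a mapping file in the sig1=sig2 form is simple enough to sketch. the helper names parse_mapping and normalize, the sample entry and the direction of the mapping (left name gets replaced by right name) are assumptions for illustration, not DiffRay's actual behaviour.

```python
def parse_mapping(lines):
    """Turn 'sig1=sig2' lines into a dict, skipping blank or malformed lines."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or "=" not in line:
            continue
        left, right = line.split("=", 1)
        mapping[left.strip()] = right.strip()
    return mapping


def normalize(sig, mapping):
    """Replace a signature by its mapped equivalent, if there is one."""
    return mapping.get(sig, sig)


# hypothetical file content; in practice this would come from open(...).readlines()
sample = ["StringCchCopyW=StringCchCopy\n", "\n", "not a mapping line\n"]
mapping = parse_mapping(sample)
```

with such a mapping applied before counting, two differently named but equivalent safe functions would land in the same bucket and stop showing up as a difference.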

once the parsing is finished (depending on the DB backend that could take a while) DiffRay can start with the diffing. the commandline instructions you need are listed in the slides. basically, diffing can happen via library id or via library name. the lib id way is not very handy when diffing various libraries. thus for automation i recommend using the name option and creating a batch file that feeds DiffRay with the library names and dumps the output to a directory of choice.

attention, for the name option the name should identify the Win7/Win8 versions, without extension! so e.g. kernel32 is fine, when there is a Win7 version and a Win8 version present.

the output then should be a bunch of files, preferably .csv, that contain data like this:

Function_Name Pattern Win8 Win7
EQoSDispatchIoctl StringCbLength 2 0
Ipv4SetEchoRequestCreate ULongAdd 4 0
Ipv6SetEchoRequestCreate ULongAdd 4 0
WfpAleAuditEvent StringCbCopy 2 1
WfpAleCaptureImageFileName StringCbCopy 1 2

from here on the researcher is on his own. now, get the libraries, open them in IDA and jump to the mentioned functions. good luck ;)
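
a first triage pass over such output could be scripted, for example to list the functions where Win8 has more safe function hits than Win7. this is only a sketch: it assumes whitespace-separated columns as in the sample above (a real .csv may use another separator), and the helper names are made up.

```python
SAMPLE = """\
Function_Name Pattern Win8 Win7
EQoSDispatchIoctl StringCbLength 2 0
Ipv4SetEchoRequestCreate ULongAdd 4 0
Ipv6SetEchoRequestCreate ULongAdd 4 0
WfpAleAuditEvent StringCbCopy 2 1
WfpAleCaptureImageFileName StringCbCopy 1 2
"""


def parse_rows(text):
    """Split the output into a list of dicts keyed by the header names."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, data = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in data]


def more_checks_in_win8(rows):
    """Names of functions with more safe function hits in Win8 than in Win7."""
    return [r["Function_Name"] for r in rows if int(r["Win8"]) > int(r["Win7"])]


candidates = more_checks_in_win8(parse_rows(SAMPLE))
```

for the sample this keeps the first four functions and drops WfpAleCaptureImageFileName, where the Win7 side has the higher count.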


for ease of use we decided to wrap this whole process up in a GUI. it's designed in Qt, so very easy to build and modify - you can even change the colors if you don't like my chewing gum style. anyway, all the colors do make sense as all functions are integrated into one window:


configuration dialog. you HAVE to be connected to a database when you start parsing! you can configure credentials for mssql, there is nothing to configure for sqlite (always connects to the same sqlite db). the buttons CREATE DB and FLUSH DB at the moment actually do the same, dropping everything and creating a new db from scratch. via the configuration dialog you can edit signatures, mssql settings, mappings and logging.


decompilation box. click around here to invoke the dll2idb.exe and idb2c.exe that should come with the python project. then watch the IDA Pro window pop up and down as it decompiles :)


parsing box. you need python 2.7 for it to work. for anything to work.. you can either parse a file or an entire directory, make sure it's all .c-files and that you have the right operating system selected. parsing is done in separate processes, you can start on multiple directories at a time.


diffing box. either on library ids or on library names, as mentioned a name has to hit one Win7 and one Win8 library. sadly, we don't yet have a way to do batch via GUI. will come in the future.


search box. check if a libname actually finds libraries or which lib got which id. or get all diffing info of one particular library.  


in total we decompiled and diffed more than 900 libraries of Win7 and Win8. the slides show some of the results, not all of them though. it is still a lot of work to actually check all the potential vulnerabilities and to evaluate if they are triggerable after all. a lot of false positives arise due to new code parts in Win8, manual checks on the Win7 side that have been replaced with a safe function or different naming of safe functions between both versions. 

apart from that, great fun. if you're a hacker, bored, don't know what to do - start a joint project with someone else. you will learn what he knows, what you don't know and the other way round. both have expertise and great ideas, put them together and tadaaa you get twice as much of each.


well there is Windows 8.1 right? besides, there is a lot more we can parse for than symbols. there is actually a lot more we could do with the pseudo source code of windows... and it would actually be a bright idea to switch to IDAPython instead of decompiling stuff.

there has to be some bug fixing, some more logging, some more automation. there could be some machine learning element or integration of symbolic execution, that could add completely different maaaagic.. 

but that would be a different blog post.

Thursday, March 20, 2014

The Mystery of Anti-Debug by HeapAlloc

first of all, a big thank you to my friend moti who actually provided the final hints to solve that mystery and saved me A LOT of time googling heap structures. i wish everyone of you would have a moti when getting stuck in RE questions :)

to the story: meanwhile, in a zeus trojan. last week i peeked into a zeus just to really quickly stumble over an anti-debug trick. SURPRISE! kidding.

that anti-debug is really easy to pass by, but wasn’t that trivial to explain; or at least that is what it seems like because i couldn’t find any suitable documentation. and this while the interwebz is full of malware reverser’s write ups.. kidding again.

so here we go...

in a nutshell: the malware would allocate heap memory and use the header of the heap entry as an indicator for an attached debugger. simple, after all. is that frequently used? i don’t know.

but, let’s start at the end. the debugger would crash with an access violation when executing invalid instructions. those invalid instructions were produced by an unpacking routine, and it took some runs for brute forcing with IDA Stealth to find a possible root cause: NtGlobalFlag. well known you might think now, and indeed i had some people smiling at me with a yes my dear, that’s not tricky at all. but i went on to at least spot the check for this flag, which, as for example Mr. Yason described very nicely, indicates an attached debugger. guess what, didn’t find that check.

in the first illustration you see the call that causes the exception: EnumWindows, that should call into a handler function, which actually is the unpacked code and turned out to be invalid when NtGlobalFlag is set.

Exception happening in EnumWindows handler function
so i held on to that exception and discovered that by turning IDA Stealth on and off the decryption key in the unpacking routine would change. gotcha, challenge accepted. so, investigating that key i eventually ended up in a piece of memory that preceded an allocated heap block. i provided my walkway up the unpacking routine in a screenshot doku, for people who like pictures just as much as me.

in the unpacking loop: bl being used to XOR the future code

up: ebx initialized with key_init

up: modifying key_init with esi

up: tweaking esi

up: grabbing esi initially from allocated_heap-8

root of all evil: HeapAlloc

so i ended up clueless inside of the RtlAllocateHeap function of ntdll.dll. in fact the value that the malware would grab from memory was initialized by RtlAllocateHeap and i admit it took some staring at the memory to accept the fact that it must be part of the header of the allocated heap block. the memory happened to always be allocated at 9A0688h; the value requested in the code therefore was at 9A0680h. i could wonderfully watch the value at this offset turn from 03 to 01 when turning IDA Stealth, specifically NtGlobalFlag protection, on or off. and this very value was the cause of the crash.

Value in Question in the Memory Block Header

here is where moti jumped in and pointed me to the _HEAP_ENTRY structure, which assigns names to the values preceding my heap block:

The Memory Block Header

with data_offset-8 one hits exactly the size variable of that structure, which actually contains the size divided by 8 PLUS.. size of management information - like the header for example, which would explain a +1. or additional debug information, which could explain a +3. overall, the allocated buffer size in that particular case is always 14000h (it is dedicated to contain unpacked code later), and 14000h / 8 = 2800h. the size value in _HEAP_ENTRY therefore is 2803h when a debugger is attached, or 2801h without a debugger or with IDA Stealth activated.

Magic happened - Value changed!
so finalizing, when NtGlobalFlag indicates an attached debugger the heap manager understands this as a higher need for debug information so the allocated heap blocks are slightly bigger than without that flag set. this fact is used by the malware, as it uses the lower byte of the size value for calculation of the unpacking XOR key. 
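
the arithmetic behind the two observed values can be replayed in a few lines of python. this is a sketch under assumptions: it takes the size field to count 8-byte units (which is what turns 14000h into 2800h), it takes the extra units (+1, +3) straight from the values seen in the debugger, and it uses the low byte directly as a single-byte XOR key, which simplifies whatever the malware really computes from it. all names are made up.

```python
HEAP_GRANULARITY = 8   # assumed: bytes per unit of the size field (32-bit heap)
REQUESTED = 0x14000    # buffer size requested by the malware


def size_field(requested, extra_units):
    """Size value as stored in the heap entry header, in granularity units."""
    return requested // HEAP_GRANULARITY + extra_units


def key_byte(size):
    """Low byte of the size value, the part that feeds the XOR key."""
    return size & 0xFF


def xor_bytes(data, key):
    """XOR every byte of data with a single-byte key."""
    return bytearray(b ^ key for b in bytearray(data))


without_debugger = size_field(REQUESTED, 1)  # 0x2801
with_debugger = size_field(REQUESTED, 3)     # 0x2803

code = bytearray(b"\x55\x8b\xec")            # push ebp / mov ebp, esp
packed = xor_bytes(code, key_byte(without_debugger))

good = xor_bytes(packed, key_byte(without_debugger))  # equals code again
bad = xor_bytes(packed, key_byte(with_debugger))      # garbage instructions
```

so the malware never has to ask for NtGlobalFlag at all: with a debugger attached the key byte is simply different, and the unpacked code turns into garbage that crashes on execution.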

for more information on heap structures check out this article which i found very informative
OR ask your own moti!

Tuesday, March 11, 2014

Bright Future Ahead

some weeks ago i was invited to talk at an austrian highschool, to a class of 18 year olds, about.. me. odd right? thing is, there were mostly girls in there and their teacher thought it would be a good idea to present them a -kindof- different perspective on the future. and honestly, before that i didn't think i would ever serve as an example for others. i'm not usually stepping up and taking a stage unless i'm asked to do so. then having all these big eyes on me.. unsettling. but only after the talk did i understand what it was actually about. i was never like 'i am a woman and can do cool stuff'. it was 'i do cool stuff. you should too.' you can figure out later that this is unusual for a girl, if you need to.

it in fact is easy to talk passionately about something you seriously think is cool. so thinking back that was the most fun talk i've given so far. about how i decided to study computer science, or why i started as a malware analyst, opportunities and obstacles, future plans and on how to choose one's dreams carefully.


honestly, it sounded like a cool idea. after studying information security none of the other options looked appealing, thinking back i can't even remember what they were about. so i started on a malware analysis project for my thesis and later got a job at our local anti-virus company.

now, 3 years later, what to say - i'm happy. i'm free and independent, love what i'm doing and got so many possibilities...

that screenshot you can see on slide number three shows IDA Pro, by far the most powerful and also my favorite tool in my analysis lab. i know, at first sight it looks terrible, but wait, could you believe that i had such a good time with IDA Pro inside a number of binaries; way more than i ever had using.. MS Visio? or Adobe InDesign? all a question of perception.

reversing is like building puzzles. a binary is a big black box at first, but i promise as long as you don't give up you will reveal secret after secret and eventually end up with an 'UHH i understand this now'. once you have more practise you will experience more success in a day than most professional artists seem to have all their career long. true story. because every little 'uh i understand this now' feels awesome. reversing looks to me like an art of its own, but a determinable one. and one that, on average, pays better.

a binary can just only work a certain way. even the sophisticated advanced ones are never as complicated as dealing with humans. there is always a solution for any problem.


i think, a lot of technological studies have a questionable reputation - because they are perceived the wrong way. economy, philosophy, politics, multi media design and the like are topics that a first world human being experiences every day. processor design, electrical engineering, structured programming or mathematical equations just don't appear, ever, outside of a classroom or a lab. ordinary humans tend to fear the unknown. so why would a youngster, especially a female one who's not even supposed to like tech, out of the blue decide he, or she, wants to understand machine level code?

a similar thought on talent. we are successful in things that we are good at. we are good at things that we practise. i believe in practise, more than in natural talent. but we all practise a lot what we like, and we tend to like things that we are good at. which, if you think about it, is a circle of like - practise - like more - be genius - practise more.

so concluding, what do you think you're good at - and are you sure there's nothing else? i did my own case study on that theory, unintentionally. when i was 17 i did my driving license exam, and i was terrible at parking cars. i somehow made it through that exam and decided i would just never park anything again unless it was unavoidable. then i went on to university, public transport was sparse, but so were the parking spaces around the campus. and every year it seemed there were more and more cars and parking lots would become smaller and smaller. so my situation was clear, park that small car of mine into ridiculous corners or walk a long distance to the campus. finally, i rather learned to park than to walk...

beautiful end, after 5 years of ridiculous daily parking i figured i could fit my renault into any space that was just an inch bigger than the renault itself.

so again, do you think there is something you are not good at, and are you sure you don't want to change that? loosely quoting einstein, saying something is too complicated just means you don't understand it well enough. plus back to the ladies. driving cars is not a male talent; i'm SURE they just practised harder. i wonder how many females actually did get an electric toy car at age 3. like my older brother did.


now finally, let me get back to the binaries. why become a malware analyst, even if you still don't like binary? the answer is actually easy. more jobs, more money, faster career, more freedom. and if you have money and freedom, you are actually more likely to get what you want after all. go figure.

apart from that, you will have fun, trust me. you will receive ridiculous appreciation, for doing something that others, even men, are afraid of because they just don't understand it well enough. you will very often stand out and be better than others, because reverse engineering like most engineering fields just doesn't have as much competition going on as.. marketing. therefore, you will experience less discrimination than in competitive fields. you will meet a number of very bright and interesting personalities, which is among the most beautiful aspects of this job. you will face an incredible diversity of people and tasks and lots of neverending challenges that remodel your own personality.


so now, if your fingers are already burning, i added some links in the slide set above where you will find homework. if you're still scared, contact me and i will help.

but if you're still not sure what to dream about - as long as you define your success by your own achievements you should be fine. when looking back and finding there is nothing to regret, you did something right.

closing this post i want to quote a card my brother (!) has lying around in his car (!).