Saturday, March 29, 2014

Yet Another Security Tool @Troopers14

TROOPERS14


i read 'boutique conference' somewhere recently, and that term actually nails it. occupying the print media complex in heidelberg for an entire week, TROOPERS manages to provide two days of workshops, two more of conference and a day of roundtable discussions, as well as various side events like an IPv6 security summit or an SAP security track or a telco sec day or.. did i miss anything? guess so.

there is a soldering station to fiddle with your.. badge, because it comes with an arduino attached. troopers provides food & coffee & mate nearly around the clock. AND, well not that i'm anything like picky about clothing at all, BUT they have conference shirts available for girls.

no one goes to cons just for collecting shirts.. but this industry is moaning about the lack of females, no? so when they finally show up it is really nice when this kind of event actually acknowledges beforehand that this could happen. in other words, giving out shirts only for men sort of implies that there will be no women.

summing it up, great event :]
and here comes what we did there.


DIFFRAY VULNERABILITY RESEARCH



WHAT IS IT 

as i refused for ~5 months to come up with documentation, i guess now is finally the right time. DiffRay in short is a tool to diff Windows 7 and Windows 8 executables to spot missing security functions in an automated way.

if one can fiddle around with input values for an application, without that application checking for their validity, chances are high one can actually perform creative abuse on memory structures. the inclined reader understands potential impact of memory corruption. Microsoft does too, and thus came up with dedicated libraries that provide input checking functions to make it easy for developers to apply the right security check for a dedicated input value. namely these libraries are intsafe and strsafe. they provide APIs like ULongAdd or StringCchCopy, which do nothing more than checking if a given value stays within expected boundaries (more information on MSDN). 

in the following, these functions are called 'safe functions'. we assume that when such a safe function is applied in one version of a library, while no safe function is called in the same piece of code in another version of that library - something is fishy.

we perform the diffing on Windows libraries and drivers (.dll, .sys) in a very simple way. our approach is to decompile each binary, scan for safe functions and put every hit per API into a database. this way we simply count the hits for a specific safe function in a library function and diff it with the complementary library function of another Windows version. 

finally, if the hit counts differ we have a good chance, that some value in that library function at hand goes unchecked. we consider this a potential vulnerability.
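the counting-and-diffing idea can be sketched in a few lines of python. this is not DiffRay's actual code, just a minimal illustration: the names `count_hits` and `diff_hits`, the three-entry signature list and the toy pseudo-C snippets are all made up for the example, and the real tool stores its hits in a database instead of dicts.

```python
import re
from collections import Counter

# tiny stand-in for signatures.conf (assumption: the real list is much longer)
SAFE_FUNCTIONS = ["ULongAdd", "StringCbCopy", "StringCbLength"]


def count_hits(pseudo_c):
    """count calls to each safe function in one decompiled function body."""
    hits = Counter()
    for name in SAFE_FUNCTIONS:
        # match 'ULongAdd(' as a whole word followed by an opening bracket
        pattern = r"\b%s\s*\(" % re.escape(name)
        hits[name] = len(re.findall(pattern, pseudo_c))
    return hits


def diff_hits(win7_funcs, win8_funcs):
    """return (function, pattern, win8_count, win7_count) where counts differ.

    both arguments map a function name to its decompiled body; only
    functions present in both versions are compared.
    """
    rows = []
    for func in sorted(set(win7_funcs) & set(win8_funcs)):
        c7 = count_hits(win7_funcs[func])
        c8 = count_hits(win8_funcs[func])
        for name in SAFE_FUNCTIONS:
            if c7[name] != c8[name]:
                rows.append((func, name, c8[name], c7[name]))
    return rows


# toy input: the Win8 body checks the addition, the Win7 body does not
win7 = {"Ipv4SetEchoRequestCreate": "len = a + b; buf = alloc(len);"}
win8 = {"Ipv4SetEchoRequestCreate":
        "if (ULongAdd(a, b, &len) < 0) return; buf = alloc(len);"}

rows = diff_hits(win7, win8)
print(rows)
```

note that an exact-name match like this would miss variants such as a W/A suffix, which is one place where a signature mapping comes in handy.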

HOW DOES IT WORK

the bare necessities
Python 2.7 32bit
pymssql 1.0.2 32bit for Python 2.7
PyQt 4 32bit 

DiffRay comes with two executables for decompilation of libs and drivers and a python application for parsing the spotted safe function hits into a database and for producing the final diffings. basically what we do is decompile the binaries to .c-files, so yeah we produce some sort of Windows OS source code :) 
you will need IDA Pro and the hexrays decompiler for this step. the .c-files are then parsed to a database. you can choose between sqlite or mssql; i highly recommend mssql. or you implement a DB handler of your choice, this is python!!

next step is the parsing. DiffRay parses either files or whole directories for symbols of safe functions. right now there are 130 symbols, they can be extended by just editing the signatures.conf. also, there is signature mapping if some safe function turns out to be equivalent to another one. we saw this in the past, but didn't yet come up with the right mappings. configuration could be achieved by editing the signature_mapping.conf in the form sig1=sig2, line by line.
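reading such a mapping file is simple enough. the following is a sketch under assumptions, not the parser DiffRay ships with: the function names are hypothetical, skipping blank lines and `#` comments is my own choice, and the two example pairs are invented purely to show the sig1=sig2 format (as said, we don't have the right mappings yet).

```python
def parse_signature_mapping(text):
    """parse 'sig1=sig2' lines into a dict {sig1: sig2}."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        # assumption: blank lines and '#' comments are ignored
        if not line or line.startswith("#"):
            continue
        left, sep, right = line.partition("=")
        left, right = left.strip(), right.strip()
        if not sep or not left or not right:
            continue  # silently skip malformed lines
        mapping[left] = right
    return mapping


def canonical(name, mapping):
    """return the signature a name is mapped to, or the name itself."""
    return mapping.get(name, name)


# invented example content for signature_mapping.conf
example = "RtlULongAdd=ULongAdd\n\nStringCbCopyW=StringCbCopy\n"
mapping = parse_signature_mapping(example)
print(canonical("RtlULongAdd", mapping))
```

hits would then be counted under `canonical(name, mapping)` so that equivalent safe functions end up in the same bucket.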

once the parsing is finished (depending on the DB backend that could take a while) DiffRay can start with the diffing. the commandline instructions you need are listed in the slides. basically, diffing can happen via library id or via library name. the lib id way is not very handy when diffing various libraries. thus for automation i recommend using the name option and creating a batch file that feeds DiffRay with the library names and dumps the output to a directory of choice.

attention, for the name option the name should identify the Win7/Win8 versions, without extension! so e.g. kernel32 is fine, when there is a Win7 version and a Win8 version present.

the output then should be a bunch of files, preferably .csv, that contain data like this:

Function_Name               Pattern         Win8  Win7
EQoSDispatchIoctl           StringCbLength  2     0
Ipv4SetEchoRequestCreate    ULongAdd        4     0
Ipv6SetEchoRequestCreate    ULongAdd        4     0
WfpAleAuditEvent            StringCbCopy    2     1
WfpAleCaptureImageFileName  StringCbCopy    1     2
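to decide where to look first, one could rank such an output file by how many more safe function hits Win8 has than Win7. a minimal sketch for a current python 3 (not the 2.7 DiffRay itself needs), assuming a comma-separated file with exactly the header shown above; the `triage` helper is hypothetical and not part of DiffRay:

```python
import csv
import io

# the sample rows from above, assumed to be comma-separated on disk
SAMPLE = """Function_Name,Pattern,Win8,Win7
EQoSDispatchIoctl,StringCbLength,2,0
Ipv4SetEchoRequestCreate,ULongAdd,4,0
Ipv6SetEchoRequestCreate,ULongAdd,4,0
WfpAleAuditEvent,StringCbCopy,2,1
WfpAleCaptureImageFileName,StringCbCopy,1,2
"""


def triage(csv_text):
    """return (gap, function, pattern) for rows where Win8 has more hits
    than Win7, biggest gap first, ties sorted by function name."""
    candidates = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        gap = int(row["Win8"]) - int(row["Win7"])
        if gap > 0:
            candidates.append((gap, row["Function_Name"], row["Pattern"]))
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return candidates


ranked = triage(SAMPLE)
for gap, func, pattern in ranked:
    print("%-28s %-16s +%d" % (func, pattern, gap))
```

rows where Win7 has more hits than Win8 are dropped here; they are not useless, but they point in the other direction.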

from here on the researcher is on his own. now, get the libraries, open them in IDA and jump to the mentioned functions. good luck ;)

THE GUI 

for ease of use we decided to wrap this whole process up in a GUI. it's designed in Qt, so very easy to build and modify - you can even change the colors if you don't like my chewing gum style. anyway, all the colors do make sense as all functions are integrated into one window:


yellow

configuration dialog. you HAVE to be connected to a database when you start parsing! you can configure credentials for mssql, there is nothing to configure for sqlite (always connects to the same sqlite db). the buttons CREATE DB and FLUSH DB at the moment actually do the same, dropping everything and creating a new db from scratch. via the configuration dialog you can edit signatures, mssql settings, mappings and logging.

blue

decompilation box. click around here to invoke the dll2idb.exe and idb2c.exe that should come with the python project. then watch the IDA Pro window pop up and down as it decompiles :)

pink

parsing box. you need python 2.7 for it to work. for anything to work.. you can either parse a file or an entire directory, just make sure it's all .c-files and that the right operating system is selected. parsing is done in separate processes, so you can start on multiple directories at a time.

green

diffing box. either on library ids or on library names, as mentioned a name has to hit one Win7 and one Win8 library. sadly, we don't yet have a way to do batch via GUI. will come in the future.

grey

search box. check if a libname actually finds libraries or which lib got which id. or get all diffing info of one particular library.  

THE PROJECT itself

in total we decompiled and diffed more than 900 libraries of Win7 and Win8. the slides show some of the results, not all of them though. it is still a lot of work to actually check all the potential vulnerabilities and to evaluate if they are triggerable after all. a lot of false positives arise due to new code parts in Win8, manual checks on the Win7 side that have been replaced with a safe function or different naming of safe functions between both versions. 

apart from that, great fun. if you're a hacker, bored, don't know what to do - start a joint project with someone else. you will learn what he knows, what you don't know and the other way round. both have expertise and great ideas, put them together and tadaaa you get twice as much of each. 

WHATS NEXT

well there is Windows 8.1 right? besides, there is a lot more we can parse for than symbols. there is actually a lot more we could do with the pseudo source code of windows... and it would actually be a bright idea to switch to IDAPython instead of decompiling stuff.

there has to be some bug fixing, some more logging, some more automation. there could be some machine learning element or integration of symbolic execution, that could add completely different maaaagic.. 

but that would be a different blog post.

Thursday, March 20, 2014

The Mystery of Anti-Debug by HeapAlloc

first of all, a big thank you to my friend moti who actually provided the final hints to solve that mystery and saved me A LOT of time googling heap structures. i wish everyone of you would have a moti when getting stuck in RE questions :)

to the story: meanwhile, in a zeus trojan. last week i peeked into a zeus just to really quickly stumble over an anti-debug trick. SURPRISE! kidding. 

that anti-debug is really easy to pass by, but wasn’t that trivial to explain; or at least that is what it seems like because i couldn’t find any suitable documentation. and this while the interwebz is full of malware reverser’s write ups.. kidding again.

so here we go...


in a nutshell: the malware would allocate heap memory and use the header of the heap entry as an indicator for an attached debugger. simple, after all. is that frequently used? i don’t know.

but, let’s start at the end. the debugger would crash with an access violation when executing invalid arguments. those invalid arguments were produced by an unpacking routine, and it took some runs of brute forcing with IDA Stealth to find a possible root cause: NtGlobalFlag. well known, you might think now, and indeed i had some people smiling at me with a 'yes my dear, that’s not tricky at all'. but i went on to at least spot the check for this flag, which, as for example Mr. Yason described very nicely (https://www.blackhat.com/presentations/bh-usa-07/Yason/Whitepaper/bh-usa-07-yason-WP.pdf), indicates an attached debugger. guess what, didn’t find that check. 
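for reference, this is what that flag looks like. when a process is started under a debugger, windows sets three heap debugging bits in NtGlobalFlag, which together make 0x70. the little decoder below is just an illustration of those documented bit values; the helper names are mine and it obviously doesn't read a real PEB.

```python
# global flag bits that get set for a process created by a debugger
FLG_HEAP_ENABLE_TAIL_CHECK = 0x10
FLG_HEAP_ENABLE_FREE_CHECK = 0x20
FLG_HEAP_VALIDATE_PARAMETERS = 0x40

DEBUG_HEAP_FLAGS = (FLG_HEAP_ENABLE_TAIL_CHECK |
                    FLG_HEAP_ENABLE_FREE_CHECK |
                    FLG_HEAP_VALIDATE_PARAMETERS)  # 0x70

FLAG_NAMES = [
    (FLG_HEAP_ENABLE_TAIL_CHECK, "FLG_HEAP_ENABLE_TAIL_CHECK"),
    (FLG_HEAP_ENABLE_FREE_CHECK, "FLG_HEAP_ENABLE_FREE_CHECK"),
    (FLG_HEAP_VALIDATE_PARAMETERS, "FLG_HEAP_VALIDATE_PARAMETERS"),
]


def decode_nt_global_flag(value):
    """list the heap debugging flags contained in an NtGlobalFlag value."""
    return [name for bit, name in FLAG_NAMES if value & bit]


def looks_debugged(value):
    """the classic check: are all three heap debugging bits set?"""
    return (value & DEBUG_HEAP_FLAGS) == DEBUG_HEAP_FLAGS


print(decode_nt_global_flag(0x70), looks_debugged(0x70))
```

a direct check like `looks_debugged` is what i was searching for in the sample, and what never showed up.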
 

in the first illustration you see the call that causes the exception: EnumWindows, that should call into a handler function, which actually is the unpacked code and turned out to be invalid when NtGlobalFlag is set.


Exception happening in EnumWindows handler function

so i held on to that exception and discovered that by turning IDA Stealth on and off the decryption key in the unpacking routine would change. gotcha, challenge accepted. so, investigating that key i eventually ended up in a piece of memory that preceded an allocated heap block. i documented my way back up the unpacking routine in screenshots, for people who like pictures just as much as me.


in the unpacking loop: bl being used to XOR the future code


up: ebx initialized with key_init


up: modifying key_init with esi


up: tweaking esi


up: grabbing esi initially from allocated_heap-8


root of all evil: HeapAlloc

so i ended up clueless inside of the RtlAllocateHeap function of ntdll.dll. in fact the value that the malware would grab from memory was initialized by RtlAllocateHeap and i admit it took some staring at the memory to accept the fact that it must be part of the header of the allocated heap block. the memory happened to always be allocated at 9A0688h; the value requested in the code therefore was at 9A0680h. i could wonderfully watch the value at this offset switch between 03 and 01 when turning IDA Stealth, specifically its NtGlobalFlag protection, off or on. and this very value was the cause of the crash.


Value in Question in the Memory Block Header

here is where moti jumped in and pointed me to the _HEAP_ENTRY structure, which assigns names to the values preceding my heap block:


The Memory Block Header

with data_offset-8 one hits exactly the size variable of that structure, which actually contains the size divided by 8 (the heap granularity) PLUS.. the size of management information - like the header for example, which would explain a +1. or additional debug information, which could explain a +3. overall, the allocated buffer size in that particular case is always 14000h (it is dedicated to contain unpacked code later), and 14000h divided by 8 is 2800h. the size value in _HEAP_ENTRY therefore is 2803h when a debugger is attached, or 2801h without a debugger or with IDA Stealth activated.


Magic happened - Value changed!

so, to wrap up: when NtGlobalFlag indicates an attached debugger, the heap manager understands this as a higher need for debug information, so the allocated heap blocks are slightly bigger than without that flag set. this fact is used by the malware, as it uses the lower byte of the size value for the calculation of the unpacking XOR key. 
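the arithmetic is easy to replay. the sketch below assumes the 8 byte heap granularity of 32-bit windows and takes the +1/+3 overhead straight from the values i saw in memory. it also simplifies heavily: the real sample tweaks the value a few times before it becomes the key, here the low byte is used directly, and the three 'code' bytes are just an arbitrary example.

```python
HEAP_GRANULARITY = 8  # bytes per size unit on 32-bit windows


def heap_entry_size(alloc_bytes, overhead_units):
    """size field of the heap entry header, in granularity units."""
    return alloc_bytes // HEAP_GRANULARITY + overhead_units


def key_seed(size_field):
    """the malware only looks at the lower byte of the size field."""
    return size_field & 0xFF


def xor_bytes(data, key):
    """xor every byte with a one-byte key (simplified unpacking loop)."""
    return bytearray(b ^ key for b in bytearray(data))


plain = heap_entry_size(0x14000, 1)  # header only
debug = heap_entry_size(0x14000, 3)  # header plus debug overhead

code = bytearray(b"\x55\x8b\xec")           # arbitrary example bytes
packed = xor_bytes(code, key_seed(plain))    # packed for the no-debugger case

ok = xor_bytes(packed, key_seed(plain))      # unpacks to the original
garbage = xor_bytes(packed, key_seed(debug)) # wrong key under a debugger

print(hex(plain), hex(debug), ok == code, garbage == code)
```

so there is no explicit 'is a debugger present' branch anywhere. the wrong key simply produces garbage code, and the crash follows later in the EnumWindows handler.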

for more information on heap structures check out this article which i found very informative
http://www.informit.com/articles/article.aspx?p=1081496
OR ask your own moti!

Tuesday, March 11, 2014

Bright Future Ahead


some weeks ago i was invited to talk at an austrian highschool, to a class of 18 year olds, about.. me. odd right? thing is, it was mostly girls in there and their teacher thought it would be a good idea to present them a -kindof- different perspective of future. and honestly, before that i didn't think i would ever serve as example for others. i'm not usually stepping up and taking a stage unless i'm asked to do so. then having all these big eyes on me.. unsettling. but not only after the talk i understood what it was actually about. i was never like 'i am a woman and can do cool stuff'. it was 'i do cool stuff. you should too.' you can figure out later that this is unusual for a girl, if you need to.

it in fact is easy to talk passionately about something you seriously think is cool. so thinking back that was the most fun talk i've given so far. about how i decided to study computer science, or why i started as a malware analyst, opportunities and obstacles, future plans and on how to choose one's dreams carefully.




HOW ONE DECIDES TO BECOME MALWARE ANALYST

honestly, it sounded like a cool idea. after studying information security none of the other options looked appealing, thinking back i can't even remember what they were about. so i started on a malware analysis project for my thesis and later got a job at our local anti-virus company.

now, 3 years later, what to say - i'm happy. i'm free and independent, love what i'm doing and got so many possibilities...

that screenshot you can see on slide number three shows IDA Pro, by far the most powerful and also my favorite tool in my analysis lab. i know, at first sight it looks terrible, but wait, could you believe that i had such a good time with IDA Pro inside a number of binaries; way more than i ever had using.. MS Visio? or Adobe InDesign? all a question of perception.

reversing is like building puzzles. a binary is a big black box at first, but i promise as long as you don't give up you will reveal secret after secret and eventually end up with an 'UHH i understand this now'. once you have more practise you will experience more success in a day than most professional artists seem to have all their career long. true story. because every little 'uh i understand this now' feels awesome. reversing looks to me like an art of its own, but a determinable one. and one that, on average, pays better.

a binary can only work a certain way. even the sophisticated advanced ones are never as complicated as dealing with humans. there is always a solution for any problem.

A QUESTION OF PERCEPTION

i think a lot of technological studies have a questionable reputation - because they are perceived the wrong way. economy, philosophy, politics, multi media design and the like are topics that a first world human being experiences every day. processor design, electrical engineering, structured programming or mathematical equations just don't appear, ever, outside of a classroom or a lab. ordinary humans tend to fear the unknown. so why would a youngster, especially a female one who's not even supposed to like tech, out of the blue decide he, or she, wants to understand machine level code?

a similar thought on talent. we are successful in things that we are good at. we are good at things that we practise. i believe in practise, more than in natural talent. but we all practise a lot what we like, and we tend to like things that we are good at. which, if you think about it, is a circle of like - practise - like more - be genius - practise more.

so concluding, what do you think you're good at - and are you sure there's nothing else? i did my own case study on that theory, unintentionally. when i was 17 i did my driving license exam, and i was terrible at parking cars. i somehow made it through that exam and decided i would just never park anything again unless it was unavoidable. then i went on to university, public transport was sparse, but so were the parking spaces around the campus. and every year it seemed there were more and more cars and the parking lots would become smaller and smaller. so my situation was clear, park that small car of mine into ridiculous corners or walk a long distance to the campus. finally, i rather learned to park than to walk...

beautiful end, after 5 years of ridiculous daily parking i figured i could fit my renault into any space that was just an inch bigger than the renault itself.

so again, do you think there is something you are not good at, and are you sure you don't want to change that? loosely quoting einstein, saying something is too complicated just means you don't understand it well enough. plus back to the ladies. driving cars is not a male talent; i'm SURE they just practised harder. i wonder how many females actually did get an electric toy car at age 3. like my older brother did.

CLOSING THE GAP

now finally, let me get back to the binaries. why become a malware analyst, if you still don't like binary, is actually easy. more jobs, more money, faster career, more freedom. and if you have money and freedom, you are actually more likely to get what you want after all. go figure.

apart from that, you will have fun, trust me. you will receive ridiculous appreciation for doing something that others, even men, are afraid of because they just don't understand it well enough. you will very often stand out and be better than others, because reverse engineering, like most engineering fields, just doesn't have as much competition going on as.. marketing. consequently, you will experience less discrimination than in competitive fields. you will meet a number of very bright and interesting personalities, which is among the most beautiful aspects of this job. you will face an incredible diversity of people and tasks and lots of neverending challenges that remodel your own personality. 

EXERCISES

so now, if your fingers are already burning, i added some links in the slide set above where you will find homework. if you're still scared, contact me and i will help.

but if you're still not sure what to dream about - as long as you define your success by your own achievements you should be fine. when looking back and finding there is nothing to regret, you did something right.

closing this post i want to quote a card my brother (!) has lying around in his car (!).
DON'T BELIEVE EVERYTHING YOU THINK.
cheers.

Saturday, February 22, 2014

Dissection Is My Hobby: Upatre Insights

i found the biggest problem to face with actually all my projects is not necessarily that i lack the idea of what i want to do, but that i lack documentation on how to do it and then have to go and figure it out myself. wouldn't it be nice if someone had just mounted a page like.. malware reverser's frequently asked questions, arrrr well never happened.

now i'm not going to start a FAQ page, but in order to help that situation i produced detailed documentation of my latest reversing project.

if you read this because you want to know about malware you will probably be a bit disappointed. mainly because the purpose of that malware itself is not so exciting at all. it just downloads.. stuff. but also because i myself focused not directly on the final executable but on the stony path it takes to get there. it is just awkwardly fascinating to watch malware shift bytes around in memory and trying to escape. i would recommend everyone just slightly interested to try it out himself, the corresponding sample hashes for identification are listed in the write up doc.

for the record, the malware is detected as TrojanDownloader.Win32.Upatre, the full report can be downloaded at https://drive.google.com/file/d/0B9Mrr-en8FX4MS1HdjBjNEhYWk0/edit?usp=sharing, and i will try a summary here. could get dirty though.

FUNCTIONALITY 

the analyzed sample is a malicious downloader with the sole purpose to connect to a remote C&C when invoked and to download and execute additional malware. it communicates via HTTPS with one of two hardcoded domains, which are believed to be legitimate websites on compromised web servers. malware execution can be divided into a protection layer, an unpacking layer with different stages and the final payload. For an initial infection the malware just copies its own image to the system's %TEMP% directory and executes that copy.

PROTECTION LAYER

the malware possesses a neat collection of anti-analysis tricks, none of them highly-sophisticated but very nice for learning purposes. 

anti-simulation
 
the first one is an anti-simulation trick targeting anti-virus simulation engines by the use of a multimedia API as seen in the picture. acmMetrics is an API call present in the msacm32.dll library. usually it is used for retrieving metrics for ACM objects (Audio Compression Manager). during the startup procedure of a malware sample it is highly likely that this was not the initial intention when placing that call. acmMetrics is part of the multimedia library since at least Microsoft Windows 2000 (according to Microsoft documentation) and in this special case called to trick AV simulation engines.

in our case acmMetrics is expected to deliver an error code for an invalid handle, which is not surprising given that the handle parameter is not initialized beforehand. in case the return value is not MMSYSERR_INVALHANDLE, code 5, execution continues to access the memory referenced by edx, which at this point always results in a memory access violation. edx is not initialized thus set to zero. 

the point of this check is that on a normal operating system like Windows 2000 or newer this function returns 5 in any case. Simulator engines usually don’t support media APIs due to overhead, and therefore either crash on the call or later on the access violation.

implicit breakpoint detection

the protection layer performs minor decryption of a part of its own code, which results in implicit breakpoint detection. the decryption consists of subtracting a key from every byte of a given section. the simple decryption routine iterates over the code starting at position 40100Fh, where execution continues later on. If a software breakpoint is placed in the section to be decrypted, the routine produces invalid opcodes and the malware crashes later on.
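why a software breakpoint breaks things becomes obvious when you replay it on a few bytes. a debugger sets a software breakpoint by overwriting a byte with 0xCC (int3), and the decryption loop happily subtracts the key from that byte too. the following is only a sketch: the key 0x11 and the five 'code' bytes are made up, the real values are in the report.

```python
KEY = 0x11   # made-up key for illustration
INT3 = 0xCC  # opcode a debugger writes for a software breakpoint


def encrypt(code, key):
    """inverse of the malware's routine: add the key to every byte."""
    return bytearray((b + key) & 0xFF for b in code)


def decrypt(code, key):
    """what the protection layer does: subtract the key from every byte."""
    return bytearray((b - key) & 0xFF for b in code)


original = bytearray([0x55, 0x8B, 0xEC, 0x90, 0xC3])  # arbitrary bytes
packed = encrypt(original, KEY)

# no debugger interference: decryption restores the code
clean = decrypt(packed, KEY)

# analyst places a breakpoint inside the still encrypted section
patched = bytearray(packed)
patched[3] = INT3
broken = decrypt(patched, KEY)

print(clean == original, hex(broken[3]))
```

the breakpoint byte turns into 0xBB, so it is neither the original opcode nor an int3 anymore: the debugger never gets its break and the cpu runs into garbage. hardware breakpoints don't modify the code and are not affected by this.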

window confusion

At the end of what could be classified as protection layer stage one the malware invokes CreateWindowExA with a provided WndClass structure. This structure defines the handler function of the dummy window, which will execute the second part of the protection layer. The created window has no graphical representation, so it can’t be seen and only serves for executing said handler function. If the analyst does not recognize the switch of execution to the handler function and place breakpoints accordingly, control of the debugger will be lost.  


broken timing defence

interesting in the next section of the protection layer is an rdtsc-triggered timing defense. malware can utilize the system time to verify if a debugger, including a human analyst, is attached to the running process. windows offers various mechanisms to request the time, most commonly used are the rdtsc instruction or the GetTickCount API call. for detecting an attached debugger/human, malware wants to know the time difference between two time stamps, namely whether the delta is bigger than it would be if the CPU executed without interruption.

the malware at hand issues two rdtsc instructions, wrapped around the decryption loop. the delta is calculated immediately afterwards, but never checked against any threshold. instead it is kept in eax until the next system call overwrites it with its return value. no other verification could be found, so this anti-debugging trick is either broken or the first timestamp serves a different purpose that could not be identified. 
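to make the difference between a working and a broken timing defense visible, here is a small model of both. python can't issue rdtsc, so `time.perf_counter_ns` stands in for it, and the threshold, the fake clock and all function names are assumptions for the sake of the example.

```python
import time

THRESHOLD = 1000  # made-up limit; real checks tune this to the code in between


def timed(work, clock=time.perf_counter_ns):
    """take a timestamp, run the work, return the delta of a second one."""
    start = clock()
    work()
    return clock() - start


def debugger_suspected(delta, threshold=THRESHOLD):
    """a working timing defense compares the delta against a threshold."""
    return delta > threshold


def broken_check(work, clock=time.perf_counter_ns):
    """what the sample effectively does: compute the delta, then lose it."""
    delta = timed(work, clock)
    delta = 0  # stand-in for eax being overwritten by the next return value
    return False  # nothing ever depends on the measurement


def fake_clock(ticks):
    """deterministic clock for the demo: returns the given values in order."""
    it = iter(ticks)
    return lambda: next(it)


fast = timed(lambda: None, clock=fake_clock([100, 150]))      # uninterrupted
slow = timed(lambda: None, clock=fake_clock([100, 5000100]))  # single stepped

print(debugger_suspected(fast), debugger_suspected(slow))
```

with the comparison missing, single stepping through the decryption loop has no consequences at all, which matches what i observed.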



more multimedia disturbance

the windows media library is used a second time as a means of protection from analysis. the malware issues a call to mciSendStringA with the command “set waveaudio door open”. it is not perfectly clear what purpose the command “set waveaudio door open” usually fulfills, but without doubt the aim of the malware at hand is not to interfere with multimedia devices. an effect of mciSendStringA is that it starts up two additional threads for interaction with devices – the analyst could lose control of the debugger when inappropriately configured. a solution is to configure the debugger to stop on the start-up of a new thread, step back to the original code and continue execution until it returns to the malware code. 

UNPACKING LAYER

after bypassing all the protection mechanisms the unpacker executes without problems. unpacking can be divided into three steps: 
  • one that compresses and decrypts the packed payload data
  • second one that decompresses the same data using RtlDecompressBuffer
  • third one which performs checks on the unpacked binary, patches function call offsets and reconstructs the import address table (IAT). 
details on the unpacking routine, including the IAT reconstruction, can be found in the report mentioned above. any criticism of that analysis is welcome :) ou.. how did that sample actually catch my attention? it was part of a mean malware spam wave that has been ongoing in austria since at least november. samples are all the same size and have very similar protection/unpacking mechanisms. so some future research would be to look at the other ~100 related malwares that i have and correlate similarities on binary level. maybe..

future shines bright, you know?