Introduction to Malware Analysis: Triage & x86 Architecture

What this lesson covers

Malware analysis exists because every day, hundreds of thousands of new samples reach corporate networks, government agencies, and personal machines. Anti-virus signatures buy time, but they do not answer the question that matters most when an incident occurs: what is this file doing on my system, and what has it already done? Answering that requires reading the binary itself.

This lesson lays the groundwork for every video that follows. We start with what malware is and how analysts categorise it without getting lost in taxonomy. We then introduce the three-question triage that separates urgent threats from routine samples. The second half is purely technical: x86 architecture, the registers an analyst must know, the stack, and why static and dynamic analysis are complementary rather than alternative.

Defining malware honestly

The word malware is short for malicious software. The definition is deliberately broad — a twenty-line batch script that wipes a directory qualifies, and so does a state-sponsored implant that survives firmware reflashes and hard disk replacements. What makes malware analysis worth studying is not the definition itself but the variety of what an analyst encounters in practice.

Every sample tells a story: who built it, what they wanted, how they tried to conceal it, and where they made mistakes. The analyst's job is to read that story from the binary.

A practical taxonomy, not a textbook one

Malware is classified by what it does and how it spreads. In practice, these categories overlap constantly.

WannaCry (2017) was simultaneously ransomware and a worm.
Emotet started life as a banking trojan, evolved into a spam botnet, and eventually operated as a dropper-for-hire platform.
TrickBot functioned as a downloader, a credential stealer, and a lateral movement framework depending on which modules the operator loaded at any given time.

The value of taxonomy is triage speed. When an analyst sees `vssadmin delete shadows /all` in a Procmon log, the word ransomware triggers an immediate containment playbook. When the same analyst sees outbound SMTP carrying Base64-encoded attachments, the word keylogger narrows the investigation to credential theft. Classification is not about putting samples into neat boxes; it is about knowing which box tells you what to do next.
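
The indicator-to-category shortcut described above can be sketched as a small lookup. This is only an illustration: the rule list, the `suggest_categories` name, and the exact substrings are assumptions made for the example, not a vetted detection set.

```python
# Hypothetical indicator rules taken from the two examples in the text.
# Each entry is (lowercase substring to look for, category hint).
INDICATOR_RULES = [
    ("vssadmin delete shadows", "ransomware"),
    ("content-transfer-encoding: base64", "keylogger"),  # assumed SMTP marker
]


def suggest_categories(observations):
    """Return category hints whose indicator appears in any observation."""
    hints = []
    for needle, category in INDICATOR_RULES:
        matched = any(needle in line.lower() for line in observations)
        if matched and category not in hints:
            hints.append(category)
    return hints
```

A hint produced this way only selects a playbook to start from; it does not confirm the family.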

The three-question triage

Every sample, before any deep analysis begins, gets these three questions answered:

What is this file? File type, hash, code-signing status, compile timestamp, suspected family.
Is it suspicious? Public threat intel hits, structural anomalies (high entropy, an unusually short import table, non-standard sections), behavioural red flags during a brief sandbox detonation.
What can we learn before running it? Strings, embedded resources, imported APIs, persistence indicators visible in the binary.
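
Part of the first question can be answered with the standard library alone. The sketch below hashes a file's bytes, checks for the `MZ` and `PE\0\0` markers, and reads the COFF `TimeDateStamp`. The function names are hypothetical, and the timestamp is a field the malware author controls, so treat it as a lead and not as proof.

```python
import hashlib
import struct
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest, the usual key for threat-intel lookups."""
    return hashlib.sha256(data).hexdigest()


def pe_compile_time(data: bytes):
    """Return the COFF TimeDateStamp as a UTC datetime, or None if not a PE."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        return None
    if len(data) < e_lfanew + 12:
        return None
    # COFF header: Machine (2 bytes), NumberOfSections (2), TimeDateStamp (4).
    (timestamp,) = struct.unpack_from("<I", data, e_lfanew + 8)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def identify(data: bytes) -> dict:
    """Collect the identity facts for the first triage question."""
    compiled = pe_compile_time(data)
    return {
        "sha256": sha256_of(data),
        "is_pe": compiled is not None,
        "compiled": compiled,
    }
```

Typical use would be `identify(open(path, "rb").read())` on a sample stored inside the analysis VM.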

A good analyst can complete this triage in under thirty minutes for a typical sample. The goal is not to solve the case — it is to decide whether the sample warrants the next four hours of deep work.
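
High entropy, one of the structural anomalies named in the second question, is cheap to measure. The sketch below computes Shannon entropy in bits per byte. The function name is an assumption, and any cut-off (values close to 8 are often read as a sign of compression or encryption) is a rule of thumb, not a fixed threshold.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, from 0.0 to 8.0 bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum of -p * log2(p) over every byte value that occurs.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Running it per section instead of over the whole file usually gives a clearer signal, because a single packed section can hide inside an otherwise ordinary binary.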

Static vs dynamic analysis

These are the two foundational approaches, and they are not alternatives. They are complementary.

Static analysis examines the binary without running it. We compute its hash, parse the PE structure, list its imports, extract its strings, and disassemble its code. The advantage is safety: nothing executes. The limitation is that packed and obfuscated samples reveal little beyond their unpacking stub, because the real code and strings only appear once the sample unpacks itself at runtime.
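
String extraction, one of the static steps listed above, can be approximated in a few lines. This is a minimal sketch that finds printable ASCII runs only; the function name and the default minimum length of 4 are assumptions, and UTF-16 strings, which are common in Windows binaries, are not handled.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Return printable ASCII runs of at least min_len characters."""
    # 0x20-0x7E covers space through tilde, the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

On a packed sample the result tends to be short and uninformative, which is itself a useful triage signal.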

Dynamic analysis runs the sample inside an isolated VM and observes what happens. We monitor file-system changes with Procmon, capture network traffic with Wireshark or FakeNet, watch process activity with Process Hacker, and dump memory at strategic points. The advantage is direct observation of behaviour. The limitation is that defended malware can detect the lab environment and refuse to execute.
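
The idea behind file-system monitoring can be illustrated with a before-and-after snapshot. This sketch is not a replacement for Procmon, which records events as they happen; it only compares two points in time. The function names are hypothetical, and it assumes it runs inside the isolated VM on a directory small enough to hash in full.

```python
import hashlib
import os


def snapshot(root):
    """Map every readable file under root to the SHA-256 of its content."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    state[path] = hashlib.sha256(handle.read()).hexdigest()
            except OSError:
                continue  # locked or vanished files are skipped
    return state


def diff_snapshots(before, after):
    """Report files created, deleted, or modified between two snapshots."""
    shared = set(before) & set(after)
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(p for p in shared if before[p] != after[p]),
    }
```

Take one snapshot before detonation and one after, then pass both to `diff_snapshots`. Files that are written and removed between the two snapshots will not show up, which is one reason event-based tools remain the primary source.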

A complete analysis uses both. Static analysis tells us where to look during dynamic analysis. Dynamic analysis tells us what the code actually does at runtime, once the packing and obfuscation layers have been peeled away.

x86 — only what an analyst needs

A complete x86 reference would be a textbook of its own. For malware analysis, we need a working knowledge of registers, the stack, and a handful of common instructions. Everything else can be learned in context.

General-purpose registers

Registers are storage locations inside the CPU. Reading a register completes within a single clock cycle, whereas a read that goes all the way to RAM can cost a hundred cycles or more. The register state at any point tells the analyst what the program is doing right now.

Register	Convention	Why it matters in malware
`EAX`	Return values	Once a called function returns, its return value sits here. `IsDebuggerPresent` returns a nonzero value in EAX when a debugger is attached and 0 otherwise.
`EBX`	General storage	Callee-saved. Holds values that persist across function calls.
`ECX`	Loop counter	`LOOP` and `REP` auto-decrement ECX. XOR decryption loops store buffer length here.
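`EDX`	I/O, overflow	Paired with EAX as EDX:EAX. Holds the high half of the 64-bit `MUL` product and of the 64-bit `DIV` dividend; after `DIV` it holds the remainder. DX also supplies the port number for `IN`/`OUT`.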
`ESP`	Stack top pointer	Push decrements, pop increments. Always points to the last item pushed.
`EBP`	Frame anchor	Local variables at `[EBP-N]`, arguments at `[EBP+N]`.
`ESI`	Source index	Source for `REP MOVSB` memory copies.
`EDI`	Destination index	Destination for memory copies.
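
To make the ECX, ESI, and EDI rows concrete, here is a Python model of a single-byte XOR decryption loop of the kind the table mentions. It is a sketch of the register roles, not a disassembly of any real sample; the function name, the flat `bytearray` standing in for memory, and the key value are assumptions.

```python
def xor_decrypt_loop(memory: bytearray, esi: int, edi: int, ecx: int, key: int):
    """Model of: lodsb / xor al, key / stosb / loop.

    Returns the final (esi, edi, ecx) so the register movement is visible.
    Note: a real LOOP decrements ECX before testing it, so entering with
    ECX = 0 would not exit immediately; this model simply skips that case.
    """
    while ecx != 0:
        al = memory[esi]       # lodsb: AL = [ESI], then ESI += 1
        esi += 1
        al ^= key & 0xFF       # xor al, key
        memory[edi] = al       # stosb: [EDI] = AL, then EDI += 1
        edi += 1
        ecx -= 1               # loop: ECX -= 1, repeat while ECX != 0
    return esi, edi, ecx
```

Seeing a buffer length loaded into ECX just before a tight loop that touches ESI and EDI is often a hint that a decoder follows.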

Two special registers deserve their own line:

EIP (Instruction Pointer): the address of the next instruction. It cannot be read or written with MOV. It advances automatically as each instruction executes and is redirected by JMP, CALL, RET, conditional jumps, and interrupts or exceptions. Controlling EIP is the goal of most memory-corruption exploits.
EFLAGS: a 32-bit register whose individual bits are flags set by arithmetic and comparison operations. Conditional jumps read these flags. Malware relies on them for runtime decisions, including anti-debugging checks.
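
The link between a comparison, the flags it sets, and the jump that follows can be modelled directly. The sketch below computes four of the status flags for a 32-bit `CMP a, b` (which subtracts b from a and discards the result) and then evaluates a few conditional jumps. The function names are assumptions, and only a small subset of flags and jumps is covered.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a: int, b: int) -> dict:
    """Flags produced by a 32-bit CMP a, b."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "ZF": result == 0,                            # operands equal
        "SF": bool(result >> 31),                     # result sign bit
        "CF": a < b,                                  # unsigned borrow
        "OF": bool(((a ^ b) & (a ^ result)) >> 31),   # signed overflow
    }


def jump_taken(mnemonic: str, flags: dict) -> bool:
    """Whether a conditional jump fires for the given flags."""
    conditions = {
        "je": flags["ZF"],
        "jne": not flags["ZF"],
        "jb": flags["CF"],                  # unsigned less-than
        "jl": flags["SF"] != flags["OF"],   # signed less-than
    }
    return conditions[mnemonic]
```

For example, if EAX holds 1 after a debugger check, `jump_taken("jne", cmp_flags(1, 0))` is true, so a `cmp eax, 0` followed by `jne` would branch to the sample's evasion path.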

The stack and the function frame

The stack grows downward in memory. A typical function does this on entry:

push  ebp           ; save caller's frame pointer
mov   ebp, esp      ; establish new frame
sub   esp, 0x20     ; reserve 32 bytes of local space

After this prologue, [EBP] holds the caller's saved EBP and [EBP+4] holds the return address. Local variables are accessed via [EBP-N] and arguments via [EBP+N] (8 = first arg, 12 = second arg, etc., on x86 32-bit cdecl). The matching epilogue (mov esp, ebp; pop ebp; ret, or the shorter leave; ret) undoes these steps. Recognising prologues and epilogues is a fundamental skill because they delineate function boundaries in stripped binaries.
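
The frame layout can be checked with a small simulation. The sketch below models a downward-growing 32-bit stack, pushes two cdecl arguments right to left, and then replays the CALL and the three prologue instructions. The `Stack32` class, the addresses, and the argument values are all made up for the illustration.

```python
import struct


class Stack32:
    """Toy 32-bit stack occupying addresses [top - size, top)."""

    def __init__(self, size: int = 0x100, top: int = 0x1000):
        self.base = top - size          # lowest valid address
        self.mem = bytearray(size)
        self.esp = top                  # empty stack: ESP at the top
        self.ebp = 0

    def push(self, value: int) -> None:
        self.esp -= 4                   # push decrements ESP first
        struct.pack_into("<I", self.mem, self.esp - self.base, value & 0xFFFFFFFF)

    def read(self, addr: int) -> int:
        (value,) = struct.unpack_from("<I", self.mem, addr - self.base)
        return value


def call_with_prologue(stack, args, return_address, locals_size=0x20):
    """Replay a cdecl call followed by the standard prologue."""
    for arg in reversed(args):          # cdecl pushes right to left
        stack.push(arg)
    stack.push(return_address)          # CALL pushes the return address
    stack.push(stack.ebp)               # push ebp
    stack.ebp = stack.esp               # mov  ebp, esp
    stack.esp -= locals_size            # sub  esp, 0x20
```

After `call_with_prologue(stack, [0x11, 0x22], 0x401000)`, reading `stack.ebp + 8` yields the first argument and `stack.ebp + 4` yields the return address, matching the offsets given above.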

What you should be comfortable with after this lesson

Defining malware, and recognising why categories blur in real samples
Running the three-question triage on a sample you have never seen before
Naming each general-purpose x86 register and stating one common use
Identifying a function prologue in disassembled code
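
For the last item, a rough way to practise is to search raw code bytes for the two common encodings of `push ebp` followed by `mov ebp, esp`. This is a naive sketch with a hypothetical function name: a plain byte match can land in data or in the middle of another instruction, and optimised code often omits the frame pointer altogether, so disassemblers rely on stronger heuristics.

```python
# push ebp = 55; mov ebp, esp = 8B EC or 89 E5, depending on the assembler.
PROLOGUES = (b"\x55\x8b\xec", b"\x55\x89\xe5")


def find_prologues(code: bytes):
    """Return sorted offsets where a standard frame prologue begins."""
    hits = []
    for pattern in PROLOGUES:
        start = 0
        while (index := code.find(pattern, start)) != -1:
            hits.append(index)
            start = index + 1
    return sorted(hits)
```

Each hit is a candidate function start to confirm in a disassembler, not a guaranteed boundary.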

The next lesson takes the same foundations deeper, with worked examples from real samples.
