EpochZero Learn
Ep. 1 · foundations

Introduction to Malware Analysis: Triage & x86 Architecture

8 May 2026

Why malware analysis is a discipline, the three-question triage every analyst runs, and a clean introduction to x86 registers, the stack, and the difference between static and dynamic analysis.

What this lesson covers

Malware analysis exists because every day, hundreds of thousands of new samples reach corporate networks, government agencies, and personal machines. Anti-virus signatures buy time, but they do not answer the question that matters most when an incident occurs: what is this file doing on my system, and what has it already done? Answering that requires reading the binary itself.

This lesson lays the groundwork for every video that follows. We start with what malware is and how analysts categorise it without getting lost in taxonomy. We then introduce the three-question triage that separates urgent threats from routine samples. The second half is purely technical: x86 architecture, the registers an analyst must know, the stack, and why static and dynamic analysis are complementary rather than alternative.

Defining malware honestly

The word malware is short for malicious software. The definition is deliberately broad — a twenty-line batch script that wipes a directory qualifies, and so does a state-sponsored implant that survives firmware reflashes and hard disk replacements. What makes malware analysis worth studying is not the definition itself but the variety of what an analyst encounters in practice.

Every sample tells a story: who built it, what they wanted, how they tried to conceal it, and where they made mistakes. The analyst's job is to read that story from the binary.

A practical taxonomy, not a textbook one

Malware is classified by what it does and how it spreads. In practice, these categories overlap constantly.

  • WannaCry (2017) was simultaneously ransomware and a worm.
  • Emotet started life as a banking trojan, evolved into a spam botnet, and eventually operated as a dropper-for-hire platform.
  • TrickBot functioned as a downloader, a credential stealer, and a lateral movement framework depending on which modules the operator loaded at any given time.

The value of taxonomy is triage speed. When an analyst sees vssadmin delete shadows /all in a Procmon log, the word ransomware triggers an immediate containment playbook. When the same analyst sees outbound SMTP carrying Base64-encoded attachments, the word keylogger narrows the investigation to credential theft. Classification is not about putting samples into neat boxes; it is about knowing which box tells you what to do next.

The three-question triage

Every sample, before any deep analysis begins, gets these three questions answered:

  1. What is this file? File type, hash, code-signing status, compile timestamp, suspected family.
  2. Is it suspicious? Public threat intel hits, structural anomalies (high entropy, an unusually small import table, non-standard sections), behavioural red flags during a brief sandbox detonation.
  3. What can we learn before running it? Strings, embedded resources, imported APIs, persistence indicators visible in the binary.

A good analyst can complete this triage in under thirty minutes for a typical sample. The goal is not to solve the case — it is to decide whether the sample warrants the next four hours of deep work.
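The first two questions can be partly automated. The sketch below is a minimal illustration, not a replacement for dedicated tooling: it hashes a sample, checks whether the bytes carry the PE signatures, and measures Shannon entropy, where values close to 8 bits per byte often indicate packed or encrypted content. The function names are illustrative assumptions, not part of the lesson.

```python
import hashlib
import math
from collections import Counter


def file_hashes(data: bytes) -> dict:
    """Return the MD5, SHA-1 and SHA-256 hex digests of a byte string."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def looks_like_pe(data: bytes) -> bool:
    """Check the DOS 'MZ' magic and the PE signature it points to."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian offset of the PE header, stored at 0x3C
    pe_offset = int.from_bytes(data[0x3C:0x40], "little")
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, at most 8.0."""
    if not data:
        return 0.0
    total = len(data)
    probabilities = [count / total for count in Counter(data).values()]
    return sum(-p * math.log2(p) for p in probabilities)
```

In use, the sample would be read with open(path, "rb").read() inside the analysis VM and passed to each function; nothing here executes the sample.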

Static vs dynamic analysis

These are the two foundational approaches, and they are not alternatives. They are complementary.

Static analysis examines the binary without running it. We compute its hash, parse the PE structure, list its imports, extract its strings, and disassemble its code. The advantage is safety: nothing executes. The limitation is that packed and obfuscated samples reveal little beyond their unpacking stub until they unpack themselves at runtime.
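String extraction is the simplest of these static steps to sketch. The example below is a minimal version that pulls runs of printable ASCII out of raw bytes, similar in spirit to the strings utility. It is an illustration under assumptions: the function name and the default minimum length of 4 are choices made here, and it ignores UTF-16 strings, which Windows binaries commonly contain.

```python
import re


def extract_ascii_strings(data: bytes, min_len: int = 4) -> list:
    """Return every run of at least min_len printable ASCII characters."""
    # 0x20-0x7E covers space through tilde, i.e. printable ASCII
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

On a packed sample this typically returns very little, which is itself a useful triage signal.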

Dynamic analysis runs the sample inside an isolated VM and observes what happens. We monitor file-system changes with Procmon, capture network traffic with Wireshark or FakeNet, watch process activity with Process Hacker, and dump memory at strategic points. The advantage is direct observation of behaviour. The limitation is that defended malware can detect the lab environment and refuse to execute.

A complete analysis uses both. Static analysis tells us where to look during dynamic analysis. Dynamic analysis tells us what the static code actually does at runtime when packers and obfuscation peel away.

x86 — only what an analyst needs

A complete x86 reference would be a textbook of its own. For malware analysis, we need a working knowledge of registers, the stack, and a handful of common instructions. Everything else can be learned in context.

General-purpose registers

Registers are storage locations inside the CPU. Reading a register costs roughly one clock cycle; a read that misses the caches and goes out to RAM can cost hundreds. The register state at any point tells the analyst what the program is doing right now.

| Register | Convention | Why it matters in malware |
| --- | --- | --- |
| EAX | Return values | After CALL, the return value sits here. IsDebuggerPresent returns 1 or 0 in EAX. |
| EBX | General storage | Callee-saved. Holds values that persist across function calls. |
| ECX | Loop counter | LOOP and REP auto-decrement ECX. XOR decryption loops store buffer length here. |
| EDX | I/O, overflow | Paired with EAX as EDX:EAX. MUL writes its 64-bit product there; DIV reads its 64-bit dividend from it. |
| ESP | Stack top pointer | Push decrements, pop increments. Always points to the last item pushed. |
| EBP | Frame anchor | Local variables at [EBP-N], arguments at [EBP+N]. |
| ESI | Source index | Source for REP MOVSB memory copies. |
| EDI | Destination index | Destination for memory copies. |
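To make the ECX row concrete, here is a minimal Python model of a single-byte XOR decryption loop. The variable names mirror the registers only for readability, and the single-byte key is an assumption chosen for illustration; real samples may use longer or rolling keys.

```python
def xor_decode(buffer: bytes, key: int) -> bytes:
    """Mimic a single-byte XOR loop that counts ECX down to zero."""
    out = bytearray(buffer)
    ecx = len(out)   # loop counter: number of bytes still to process
    esi = 0          # index standing in for the source pointer
    while ecx != 0:
        out[esi] ^= key & 0xFF   # xor byte ptr [esi], key
        esi += 1                 # inc esi
        ecx -= 1                 # the decrement LOOP performs implicitly
    return bytes(out)
```

Because XOR is its own inverse, calling xor_decode twice with the same key returns the original bytes, which is why the same routine serves for both encoding and decoding.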

Two special registers deserve their own line:

  • EIP (Instruction Pointer): the address of the next instruction. It cannot be written directly with MOV. It advances automatically as each instruction executes and is redirected by control-flow instructions such as JMP, CALL, RET, and conditional jumps. Controlling EIP is the goal of most exploitation techniques.
  • EFLAGS: a 32-bit register where individual bits are flags set by arithmetic and comparison operations. Conditional jumps read these flags. Malware uses them for runtime decisions, including anti-debugging checks.
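The relationship between a comparison and the jump that follows it can be modelled in a few lines. The sketch below computes the four flags most often read after a 32-bit CMP (which subtracts without storing the result) and evaluates a small subset of conditional jumps. The helper names and the choice of jumps are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a: int, b: int) -> dict:
    """Flags a 32-bit CMP a, b would set (it computes a - b and discards it)."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "ZF": int(result == 0),                      # operands were equal
        "SF": result >> 31,                          # top bit of the result
        "CF": int(a < b),                            # unsigned borrow
        "OF": ((a ^ b) & (a ^ result)) >> 31,        # signed overflow
    }


def jump_taken(mnemonic: str, flags: dict) -> bool:
    """Evaluate a few common conditional jumps against a flag set."""
    rules = {
        "JZ": flags["ZF"] == 1,
        "JNZ": flags["ZF"] == 0,
        "JB": flags["CF"] == 1,             # unsigned less-than
        "JL": flags["SF"] != flags["OF"],   # signed less-than
    }
    return rules[mnemonic]
```

For example, if IsDebuggerPresent has left 1 in EAX, comparing EAX with 0 clears ZF, so jump_taken("JNZ", cmp_flags(1, 0)) evaluates to True and the sample would branch to its debugger-detected path.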

The stack and the function frame

The stack grows downward in memory. A typical function does this on entry:

push  ebp           ; save caller's frame pointer
mov   ebp, esp      ; establish new frame
sub   esp, 0x20     ; reserve 32 bytes of local space

After this prologue, local variables sit at negative offsets from EBP ([EBP-N]) and arguments at positive offsets ([EBP+N]). Under 32-bit cdecl, [EBP+4] holds the return address, [EBP+8] the first argument, [EBP+12] the second, and so on; the order of locals below EBP is the compiler's choice. Recognising prologues and epilogues is a fundamental skill because they delineate function boundaries in stripped binaries.
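The offsets are easier to trust after watching them fall out of the pushes. The following is a toy model of a downward-growing 32-bit stack, offered as a sketch only: the class, the starting address, the return address, and the argument values are all hypothetical and chosen for readability.

```python
class Stack32:
    """Tiny model of a downward-growing 32-bit stack."""

    def __init__(self, top: int = 0x0012FF00):   # hypothetical stack top
        self.esp = top
        self.ebp = 0
        self.mem = {}

    def push(self, value: int) -> None:
        self.esp -= 4                             # stack grows downward
        self.mem[self.esp] = value & 0xFFFFFFFF

    def call(self, return_address: int) -> None:
        self.push(return_address)                 # CALL pushes the return address

    def prologue(self, local_bytes: int) -> None:
        self.push(self.ebp)                       # push ebp
        self.ebp = self.esp                       # mov  ebp, esp
        self.esp -= local_bytes                   # sub  esp, N


def demo_frame() -> Stack32:
    """Caller pushes two cdecl arguments right to left, then calls."""
    stack = Stack32()
    stack.push(0x22222222)    # second argument
    stack.push(0x11111111)    # first argument
    stack.call(0x00401005)    # hypothetical return address
    stack.prologue(0x20)      # same 32 bytes of locals as the listing
    return stack
```

After demo_frame(), reading stack.mem[stack.ebp + 8] yields the first argument and stack.mem[stack.ebp + 12] the second, while stack.mem[stack.ebp] holds the caller's saved EBP.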

What you should be comfortable with after this lesson

  • Defining malware, and recognising why categories blur in real samples
  • Running the three-question triage on a sample you have never seen before
  • Naming each general-purpose x86 register and stating one common use
  • Identifying a function prologue in disassembled code

The next lesson takes the same foundations deeper, with worked examples from real samples.

Exercises

EX.01 · easy

Hash a known-good binary

Compute the MD5, SHA-1, and SHA-256 hashes of notepad.exe from a clean Windows VM. Search for the SHA-256 on VirusTotal. Note how many engines flag it (zero, in normal circumstances).

EX.02 · easy

Run the three-question triage

Pick any sample from MalwareBazaar (filter by tag — try njRAT). Without running it, answer: what is this file, is it suspicious, and what can you learn before running it?

EX.03 · medium

Identify a function prologue

Open any 32-bit Windows executable in Ghidra. Locate the entry point and identify the function prologue (push ebp ; mov ebp, esp). Then find any function it calls and identify its prologue.
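As a cross-check on what Ghidra shows, the prologue can also be searched for as raw bytes. push ebp encodes as 55, and mov ebp, esp has two valid encodings, 8B EC and 89 E5. The sketch below scans a byte string for either sequence. It is a rough heuristic with an illustrative function name: byte matching also hits data and instruction fragments, and optimised code that omits the frame pointer will not match at all.

```python
# push ebp followed by either encoding of mov ebp, esp
PROLOGUES = (b"\x55\x8b\xec", b"\x55\x89\xe5")


def find_prologues(code: bytes) -> list:
    """Return sorted offsets where a classic 32-bit prologue pattern begins."""
    offsets = []
    for pattern in PROLOGUES:
        start = code.find(pattern)
        while start != -1:
            offsets.append(start)
            start = code.find(pattern, start + 1)
    return sorted(offsets)
```

The offsets returned are positions within the bytes supplied, so they need the section's load address added before they can be compared with addresses in the disassembler.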