The Malware Analysis Pipeline: Static & Dynamic Techniques

The pipeline at a glance

A working malware analyst is not measured by clever insights. They are measured by reproducibility. Given the same sample twice, on different days, the same pipeline should produce the same verdict. This lesson defines that pipeline end to end.

Stage 1 — Identification and hashing

The first action on any suspect file is to compute its cryptographic hash. The hash is the file's fingerprint. Two files with the same SHA-256 hash are, for all practical purposes, identical.

$ sha256sum invoice.exe
a1b2c3d4...  invoice.exe
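
When the sample has to be hashed from a script rather than the shell, the same digest can be computed with Python's standard library. This is a minimal sketch; the function name `sha256_of_file` and the 1 MiB chunk size are illustrative choices, not part of any tool named in this lesson.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that large samples do not have to fit
    in memory. The chunk size is an arbitrary illustrative default.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of_file("invoice.exe")` should return the same hex string that `sha256sum` prints in its first column.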

After computing the hash, query it against VirusTotal, MalwareBazaar, or your organisation's internal threat intelligence platform. If the hash is already known, the report from previous analyses saves time.

Fuzzy hashing with ssdeep. Standard cryptographic hashes change completely if a single byte differs. ssdeep computes a context-triggered piecewise hash that produces similar outputs for similar files, and compares two such hashes to give a match score from 0 to 100. A score above roughly 50 typically indicates a related sample, which is useful when an attacker has tweaked a known sample to avoid signature matching.
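
The first half of that claim is easy to see for yourself. The sketch below does not implement ssdeep; it only illustrates why an exact-match hash is useless for near-duplicates. The 64-byte buffer is a made-up stand-in for a sample, and `differing_bits` is a hypothetical helper.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# A made-up 64-byte buffer standing in for a sample.
original = b"MZ" + b"\x90" * 62

# Flip a single bit in the last byte, as a trivial "tweak".
tweaked = bytearray(original)
tweaked[-1] ^= 0x01

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(bytes(tweaked)).digest()

changed = differing_bits(d1, d2)
print(f"{changed} of 256 digest bits differ after a one-bit change")
```

Roughly half of the digest bits should flip, so the two SHA-256 values share no useful resemblance. A fuzzy hash is designed so that the same one-bit change leaves most of the hash string intact.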

Stage 2 — Static properties of the PE

The Portable Executable (PE) format is used by Windows .exe, .dll, .sys, .scr, .cpl, .ocx, .efi, and .drv files. All share the same overall structure: a DOS header and stub, the NT headers (PE signature, file header, optional header), the section table, and the sections themselves (commonly .text for code, .data for initialised writable data, .rdata for read-only data, .rsrc for embedded resources).
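
To make that layout concrete, the sketch below walks the first few fields by hand using only the standard library. In practice a library such as pefile does this for you; the function name `read_pe_header` and the returned dictionary keys are illustrative. The offsets (the `e_lfanew` pointer at 0x3C, then the `PE\0\0` signature followed by the file header) come from the PE format itself.

```python
import struct
from datetime import datetime, timezone


def read_pe_header(data: bytes) -> dict:
    """Parse the DOS signature, PE signature, and start of the file header."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature: not a PE file")

    # e_lfanew (offset 0x3C in the DOS header) points at the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")

    # The file header follows the signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    machine, n_sections, timestamp = struct.unpack_from(
        "<HHI", data, e_lfanew + 4
    )
    return {
        "machine": machine,
        "sections": n_sections,
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }
```

A typical call would be `read_pe_header(open("invoice.exe", "rb").read())`. The `compiled` value is the timestamp field exactly as stored in the file, so it is only as trustworthy as the file itself.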

Tools to use at this stage: PE-bear, PEStudio, Detect It Easy (DIE), and pefile (Python).

What to look for:

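Compile timestamp. Implausible values (1970, 2099) suggest forgery or tampering, although reproducible builds also write non-date values into this field, so treat it as a hint rather than proof.
Section names. Standard compilers produce .text, .data, .rdata, and .rsrc, plus routine extras such as .reloc, .pdata, .idata, and .tls. Names outside that set (UPX0, UPX1, random characters) are a flag.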
Section permissions. A section that is both writable and executable is suspicious; legitimate code is rarely both.
Entropy. Shannon entropy above about 7.0 bits per byte (on a 0–8 scale) in .text typically indicates packed or encrypted code. Ordinary compiled code usually sits between 5.0 and 6.5.
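
The last two checks are simple enough to sketch directly. The code below computes Shannon entropy over raw bytes and tests a section's characteristics field for the writable-plus-executable combination. The two flag constants are the values defined by the PE format; the function names are illustrative, and the section bytes and characteristics are assumed to come from whichever parser you use.

```python
import math
from collections import Counter

# Section characteristic flags as defined by the PE format.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum of p * log2(1/p) over every byte value that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def is_writable_and_executable(characteristics: int) -> bool:
    """True when a section is marked both writable and executable."""
    wx = IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_WRITE
    return (characteristics & wx) == wx
```

A buffer of identical bytes scores 0.0 and a buffer containing every byte value equally often scores 8.0, which is why compressed or encrypted sections cluster near the top of the scale.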

Import Address Table analysis

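The IAT lists the Windows APIs the binary imports at load time. It is the single most useful signal for predicting behaviour without running anything, with one caveat: functions resolved at runtime through GetProcAddress never appear in it.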

Imports you see	Likely behaviour
`URLDownloadToFile`, `WinHttpOpen`, `InternetOpenUrl`	Network communication, possibly C2 or downloader
`RegSetValueEx`, `RegCreateKeyEx` (with a Run key path among the strings)	Persistence via registry
`CreateRemoteThread`, `WriteProcessMemory`, `VirtualAllocEx`	Code injection
`CryptEncrypt`, `CryptGenRandom`, `CryptHashData`	Cryptography — possibly ransomware
`SetWindowsHookEx`, `GetAsyncKeyState`	Keylogging
Only `LoadLibraryA` and `GetProcAddress`	Almost certainly packed

A truncated IAT — only LoadLibraryA and GetProcAddress — is itself a strong indicator. The unpacking stub uses these two functions to resolve all other imports at runtime, after the payload has been decoded into memory.
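
The table can be turned into a first-pass triage helper. This is a sketch under stated assumptions: the import names are expected to arrive as a plain list from whichever PE parser you use, the mapping contains only the rows from the table above, and `likely_behaviours` is a hypothetical name. Real import names usually carry an `A` or `W` suffix, which the helper strips when the bare name is not already known.

```python
# Mapping taken from the table above; nothing here is exhaustive.
BEHAVIOUR_BY_IMPORT = {
    "URLDownloadToFile": "network communication",
    "WinHttpOpen": "network communication",
    "InternetOpenUrl": "network communication",
    "RegSetValueEx": "persistence via registry",
    "RegCreateKeyEx": "persistence via registry",
    "CreateRemoteThread": "code injection",
    "WriteProcessMemory": "code injection",
    "VirtualAllocEx": "code injection",
    "CryptEncrypt": "cryptography",
    "CryptGenRandom": "cryptography",
    "CryptHashData": "cryptography",
    "SetWindowsHookEx": "keylogging",
    "GetAsyncKeyState": "keylogging",
}

RESOLVER_ONLY = {"LoadLibrary", "GetProcAddress"}


def normalise(name: str) -> str:
    """Drop a trailing ANSI/Unicode suffix (A or W) from an import name."""
    if name not in BEHAVIOUR_BY_IMPORT and name[-1:] in ("A", "W"):
        return name[:-1]
    return name


def likely_behaviours(imports) -> list:
    """Return a sorted list of behaviour labels suggested by the imports."""
    names = {normalise(n) for n in imports}
    # Nothing but the two resolver functions: the packed case from the table.
    if names and names <= RESOLVER_ONLY:
        return ["almost certainly packed"]
    return sorted(
        {BEHAVIOUR_BY_IMPORT[n] for n in names if n in BEHAVIOUR_BY_IMPORT}
    )
```

The output is a list of hypotheses to confirm during detonation, not a verdict: plenty of legitimate software imports the same functions.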

String analysis

Run strings on the binary, or use FLOSS to recover stack-built and obfuscated strings that classic strings misses. Look for URLs, IP addresses, registry paths, file paths, mutex names, and obvious indicator words (encrypt, bitcoin, error messages from C2 frameworks).
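
A rough equivalent of classic strings, plus a first pass at pulling indicators out of the result, fits in a few lines. This sketch makes several assumptions: a minimum run length of four characters, ASCII and UTF-16LE only, and deliberately loose URL and IPv4 patterns that will produce false positives (version numbers look like IP addresses). It does not attempt what FLOSS does.

```python
import re

# Runs of at least four printable ASCII characters.
ASCII_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# The same characters stored as UTF-16LE (each followed by a zero byte).
UTF16_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")

# Deliberately loose patterns; expect to review matches by hand.
URL = re.compile(r"https?://[^\s\"'<>]+")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_strings(data: bytes) -> list:
    """Return printable ASCII and UTF-16LE strings found in `data`."""
    found = [m.group().decode("ascii") for m in ASCII_RUN.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in UTF16_RUN.finditer(data)]
    return found


def find_indicators(strings) -> dict:
    """Collect URL-like and IPv4-like substrings from extracted strings."""
    urls, ips = set(), set()
    for text in strings:
        urls.update(URL.findall(text))
        ips.update(IPV4.findall(text))
    return {"urls": sorted(urls), "ips": sorted(ips)}
```

Typical use is `find_indicators(extract_strings(open("invoice.exe", "rb").read()))`, followed by a manual read of the full string list for registry paths and mutex names.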

Stage 3 — Detonation in an isolated VM

Static analysis answers what could this do. Dynamic analysis answers what does it actually do.

The lab VM must be isolated from the host network and the internet. The standard pattern: a Host-Only network adapter on the analysis VM, plus a separate gateway VM running INetSim or FakeNet-NG to simulate DNS, HTTP, and FTP. The malware sees what looks like the internet but stays contained.

Workflow:

Snapshot the VM. Revert to the clean state before every new sample.
Start monitoring tools. Procmon (filtered by process name), Process Hacker, Wireshark or FakeNet-NG.
Execute the sample. Double-click or run from the command line with any required arguments.
Wait and interact. Some malware requires user interaction or delays. Wait at least 2–5 minutes.
Collect artefacts. Export Procmon logs, save PCAP files, note new processes, files, registry keys, mutexes.
Analyse the artefacts. Correlate file-system drops with network traffic. Identify persistence mechanisms.
Revert the VM. Return to the clean snapshot.
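
The artefact-collection step can be partly automated once the Procmon log has been exported as CSV. The sketch below assumes Procmon's default column names (`Process Name`, `Operation`, `Path`, `Result`) and a small, hand-picked set of operations; both are assumptions to check against your own export. `summarise_procmon` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Hand-picked operations of interest; extend to suit the sample.
INTERESTING = {"WriteFile", "RegSetValue", "Process Create", "TCP Connect"}


def summarise_procmon(csv_text: str, process_name: str) -> dict:
    """Group successful, interesting operations by type for one process.

    `csv_text` is the content of a Procmon CSV export. When reading the
    export from disk, opening it with encoding="utf-8-sig" strips the
    byte-order mark that would otherwise corrupt the first column name.
    """
    summary = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Process Name"].lower() != process_name.lower():
            continue
        if row["Operation"] in INTERESTING and row["Result"] == "SUCCESS":
            summary[row["Operation"]].add(row["Path"])
    return {op: sorted(paths) for op, paths in summary.items()}
```

The result is a starting list of dropped files, registry writes, child processes, and connections to correlate against the PCAP. Child processes spawned by the sample have different names, so run the helper again for each one that appears under `Process Create`.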

Always verify network isolation before detonation. A misconfigured adapter can leak the malware to your production network — or alert the attacker that their sample is being analysed.
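
One way to make that verification repeatable is a small pre-flight check run inside the analysis VM. The sketch rests on two assumptions that you must adapt: that the lab uses the subnet `192.168.56.0/24` (a common host-only default, but check your hypervisor), and that the simulation gateway answers every DNS query with an address inside that subnet. The function names are illustrative.

```python
import ipaddress
import socket

# Assumption: the host-only lab network. Change this to match your lab.
LAB_SUBNET = ipaddress.ip_network("192.168.56.0/24")


def inside_lab(address: str, subnet=LAB_SUBNET) -> bool:
    """True when `address` belongs to the lab subnet."""
    return ipaddress.ip_address(address) in subnet


def isolation_looks_intact(hostnames, resolver=socket.gethostbyname,
                           subnet=LAB_SUBNET) -> bool:
    """Resolve each name and fail if any answer points outside the lab.

    A name that does not resolve at all is treated as contained, since
    no real answer came back from outside.
    """
    for name in hostnames:
        try:
            address = resolver(name)
        except OSError:
            continue
        if not inside_lab(address, subnet):
            return False
    return True
```

If `isolation_looks_intact(["example.com"])` returns False, a real resolver answered and the VM can reach beyond the lab, so do not detonate. A True result is necessary but not sufficient: it says nothing about traffic that bypasses DNS, so keep checking the adapter settings by hand as well.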

Stage 4 — Identifying persistence mechanisms

Most malware wants to survive a reboot. The major persistence mechanisms on Windows:

Mechanism	Where to look
Run keys	`HKCU\Software\Microsoft\Windows\CurrentVersion\Run`, same under HKLM
Scheduled tasks	`schtasks /query /v` or `\Windows\System32\Tasks\`
Services	`sc query type= service state= all` (without `state= all`, only running services are listed)
Startup folder	`%APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup`
WMI subscriptions	`Get-WmiObject -Namespace root\Subscription -Class __EventConsumer`
Image File Execution Options (IFEO)	`HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\<exe>`
Bootkit / MBR	Forensic image of the disk; not visible at OS level

A 30-second autoruns.exe run from Sysinternals catches the first six categories.
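
When working from detonation logs rather than a live system, the same table can be used to tag paths the sample touched. The sketch below matches lower-cased substrings taken from the "Where to look" column. The services registry path is an addition, since the table lists only the `sc` command. Both the marker list and the name `classify_persistence` are illustrative, and substring matching will miss anything the table does not cover.

```python
# (label, lower-cased substring) pairs derived from the table above.
# The services registry location is an addition not listed in the table.
PERSISTENCE_MARKERS = [
    ("Image File Execution Options", r"\image file execution options"),
    ("Run key", r"\software\microsoft\windows\currentversion\run"),
    ("Scheduled task", r"\windows\system32\tasks"),
    ("Service", r"\currentcontrolset\services"),
    ("Startup folder", r"\start menu\programs\startup"),
]


def classify_persistence(path: str):
    """Return the persistence mechanism a path suggests, or None."""
    lowered = path.lower()
    for label, marker in PERSISTENCE_MARKERS:
        if marker in lowered:
            return label
    return None
```

Feeding it the registry and file paths written during detonation gives a quick shortlist to confirm with Autoruns before reverting the VM.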

Stage 5 — Reporting

The output of the pipeline is a report, not a conclusion. The report contains: hashes, file metadata, IAT summary, key strings, behavioural observations, network indicators (domains, IPs, URLs, ports), host indicators (dropped file paths, mutex names), persistence mechanism, and a verdict band (clean / suspicious / malicious).
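
Keeping the report in a fixed structure makes two runs of the pipeline directly comparable. The sketch below is one possible shape, not a standard format. The field names are assumptions chosen to mirror the list above, with file-system and mutex indicators kept in a single `host_indicators` field.

```python
import json
from dataclasses import dataclass, field, asdict

VERDICTS = ("clean", "suspicious", "malicious")


@dataclass
class AnalysisReport:
    """Illustrative report record; field names are not a standard."""
    sha256: str
    verdict: str
    file_metadata: dict = field(default_factory=dict)
    iat_summary: list = field(default_factory=list)
    key_strings: list = field(default_factory=list)
    behaviours: list = field(default_factory=list)
    network_indicators: list = field(default_factory=list)
    host_indicators: list = field(default_factory=list)
    persistence: list = field(default_factory=list)

    def __post_init__(self):
        # Reject anything outside the three verdict bands.
        if self.verdict not in VERDICTS:
            raise ValueError(f"verdict must be one of {VERDICTS}")

    def to_json(self) -> str:
        """Serialise with sorted keys so two reports diff cleanly."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Sorted keys and fixed field names mean that analysing the same sample on two different days should produce byte-identical JSON, which is a cheap test of the reproducibility this lesson opens with.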

A good report lets the next analyst — or the same analyst three months later — verify or contest each finding without re-running the pipeline.

What you should be comfortable with after this lesson

Running the full pipeline end to end on a fresh sample
Deriving a list of likely behaviours from the IAT alone
Configuring an isolated detonation VM with an INetSim or FakeNet gateway
Identifying the most common Windows persistence mechanisms in under 60 seconds
