Why Anti Cheats Walk Your Call Stack
Every function call on x64 leaves a return address on the stack. By the time code invokes NtReadVirtualMemory or GetAsyncKeyState, that stack contains a complete record of how execution arrives there: the originating module, the runtime, the loader, all the way back to RtlUserThreadStart. Anti cheats capture this chain and ask one question: does this call originate from code that should exist? The answer requires no signature database, no memory scan, no hook. The call stack is the single most information-dense artifact available at the point of any Application Programming Interface (API) call, revealing the origin address, the owning module, whether the memory is file-backed, and whether the return addresses form a plausible execution path. Faking all of these properties simultaneously is genuinely hard, which is why stack walking is the primary heuristic detection vector across BattlEye, Easy Anti-Cheat (EAC), and most modern Endpoint Detection and Response (EDR) products.
This piece covers how the x64 unwinding machinery works, what specific properties anti cheats validate against each stack frame, how kernel drivers capture stacks at privilege levels that user mode code cannot interfere with, what BattlEye and EAC each do differently, and where current evasion techniques fall short.
Provenance
Cheat development moves steadily toward manual mapping, direct syscalls, and kernel mode execution. These techniques avoid leaving traditional artifacts in memory, but they do not eliminate the call stack. Anti cheats need a detection primitive that does not rely on finding injected code through module enumeration or signature matching. Stack walking fills that role. secret.club’s 2020 reverse engineering of BattlEye’s shellcode and multiple independent analyses of EAC’s kernel driver form the public evidence base for the detection mechanisms described here.
x64 stack unwinding
On x86, stack unwinding follows frame pointer chains: walk EBP, read the saved EBP, repeat. Windows x64 abandons that model entirely. Every function in a Portable Executable (PE) image has a RUNTIME_FUNCTION entry in the .pdata section that statically describes its stack layout:
typedef struct _RUNTIME_FUNCTION {
DWORD BeginAddress; // RVA of function start
DWORD EndAddress; // RVA past function end
DWORD UnwindData; // RVA to UNWIND_INFO
} RUNTIME_FUNCTION;
The UNWIND_INFO structure tells the unwinder exactly what the function prologue does: which registers it pushes, how much stack space it allocates, whether it sets up a frame pointer. Each operation gets an UNWIND_CODE entry:
- UWOP_PUSH_NONVOL - pushes a non-volatile register
- UWOP_ALLOC_SMALL - allocates up to 128 bytes of stack
- UWOP_ALLOC_LARGE - allocates more than 128 bytes
- UWOP_SET_FPREG - establishes a frame pointer
Replaying these codes in reverse, the unwinder calculates the exact frame size and locates the return address for any function. No RBP chain required.
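As a rough illustration of that replay, the sketch below sums the stack bytes implied by the four opcodes listed above. It is a simplified model, not the real unwinder: the helper name prologue_stack_bytes is hypothetical, the unwind codes are passed as raw 16-bit slots rather than read from a PE, and any opcode outside those four is treated as an error.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values as documented for the x64 UNWIND_CODE format. */
enum {
    UWOP_PUSH_NONVOL = 0,
    UWOP_ALLOC_LARGE = 1,
    UWOP_ALLOC_SMALL = 2,
    UWOP_SET_FPREG   = 3
};

/* Hypothetical helper: total bytes the prologue takes from the stack,
 * or -1 on a truncated array or an opcode this sketch does not model.
 * Slot layout: CodeOffset (bits 0-7), UnwindOp (bits 8-11), OpInfo (bits 12-15). */
long prologue_stack_bytes(const uint16_t *slots, size_t count)
{
    long total = 0;
    size_t i = 0;
    while (i < count) {
        unsigned op   = (slots[i] >> 8) & 0xF;
        unsigned info = (slots[i] >> 12) & 0xF;
        switch (op) {
        case UWOP_PUSH_NONVOL:            /* one 8-byte register push */
            total += 8;
            i += 1;
            break;
        case UWOP_ALLOC_SMALL:            /* 8 to 128 bytes */
            total += (long)info * 8 + 8;
            i += 1;
            break;
        case UWOP_ALLOC_LARGE:
            if (info == 0) {              /* next slot holds size / 8 */
                if (i + 1 >= count) return -1;
                total += (long)slots[i + 1] * 8;
                i += 2;
            } else {                      /* next two slots hold a 32-bit size */
                if (i + 2 >= count) return -1;
                total += (long)(slots[i + 1] | ((uint32_t)slots[i + 2] << 16));
                i += 3;
            }
            break;
        case UWOP_SET_FPREG:              /* changes no stack size */
            i += 1;
            break;
        default:
            return -1;
        }
    }
    return total;
}
```

For a prologue of push rbx followed by sub rsp, 0x20, the two slots 0x3205 and 0x3001 give 40 bytes, so in this model the return address sits 40 bytes above RSP once the prologue has run.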
The unwinding loop
The actual traversal calls two functions in a loop. RtlLookupFunctionEntry finds the RUNTIME_FUNCTION for a given instruction pointer. RtlVirtualUnwind applies the unwind codes to compute the caller’s context. If no RUNTIME_FUNCTION exists (the function is a leaf that never modifies the stack), the return address sits at [RSP].
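CONTEXT context;
DWORD64 imageBase, establisherFrame;
PVOID handlerData;
ULONG64 capturedStack[MAX_FRAMES]; // MAX_FRAMES: caller-chosen cap (assumed)
ULONG depth = 0;

RtlCaptureContext(&context);
while (context.Rip != 0 && depth < MAX_FRAMES) {
    PRUNTIME_FUNCTION fn = RtlLookupFunctionEntry(
        context.Rip, &imageBase, NULL
    );
    if (!fn) {
        // leaf function - return address is at [RSP]
        context.Rip = *(ULONG64*)context.Rsp;
        context.Rsp += 8;
    } else {
        // replay the unwind codes to recover the caller's context
        RtlVirtualUnwind(
            UNW_FLAG_NHANDLER, imageBase,
            context.Rip, fn, &context,
            &handlerData, &establisherFrame, NULL
        );
    }
    capturedStack[depth++] = context.Rip;
}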
Anti cheats that validate individual frames use this full unwind machinery rather than relying only on RtlCaptureStackBackTrace, a convenience wrapper that returns just the return addresses. The full path yields return addresses, frame sizes, and module attribution for every frame on the stack.
All of this metadata is static. It is baked into the PE at compile time. The unwinder does not need to execute code or trust runtime state. It reads .pdata and reconstructs the call history deterministically. That property is what makes it so useful for detection: it works even if the target code does everything it can to hide.
Module backing validation
The most basic check asks whether each return address resolves to a loaded module. On Windows, MEM_IMAGE memory is backed by a file on disk, meaning it comes through the standard image loader. Memory from VirtualAlloc or manual mapping is MEM_PRIVATE.
A return address landing in MEM_PRIVATE memory means the code is not loaded through normal channels. This catches injected code, shellcode, and manually mapped modules immediately. It is also the entire reason module stomping exists as a technique: overwriting the .text section of a legitimately loaded Dynamic Link Library (DLL) so return addresses resolve to a real module.
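A minimal sketch of that check follows, assuming the region data has already been collected (for example with VirtualQuery-style queries). The region_t type and both function names are hypothetical, and the MEM_* values are defined locally to mirror the Windows SDK constants so the snippet compiles without Windows headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Values mirror the Windows SDK constants of the same names. */
#define MEM_PRIVATE 0x00020000u
#define MEM_MAPPED  0x00040000u
#define MEM_IMAGE   0x01000000u

/* Hypothetical snapshot of one memory region. */
typedef struct {
    uint64_t base;
    uint64_t size;
    uint32_t type;   /* MEM_IMAGE, MEM_MAPPED or MEM_PRIVATE */
} region_t;

/* Returns 1 if addr lies inside an image-backed region, 0 otherwise
 * (including addresses that match no known region). */
int is_image_backed(const region_t *regions, size_t n, uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= regions[i].base && addr - regions[i].base < regions[i].size)
            return regions[i].type == MEM_IMAGE;
    }
    return 0;
}

/* Index of the first frame whose return address is not image-backed,
 * or -1 if every frame resolves to MEM_IMAGE memory. */
int first_unbacked_frame(const region_t *regions, size_t n,
                         const uint64_t *frames, size_t depth)
{
    for (size_t i = 0; i < depth; i++) {
        if (!is_image_backed(regions, n, frames[i]))
            return (int)i;
    }
    return -1;
}
```

A stack whose second return address points into a MEM_PRIVATE allocation would return index 1 here, which is the frame a detection would report.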
Truncated stack detection
A legitimate user mode call stack unwinds all the way back to RtlUserThreadStart (or BaseThreadInitThunk). If the unwinder hits something it cannot parse and the stack ends early, that signals corrupted frame data, a stack pivot, or synthetic frames that do not chain correctly.
This check is effective against spoofing implementations that only fix the immediate return address but leave the rest of the stack broken.
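One way to express the truncation test is to look only at the outermost captured frame. The sketch below assumes the caller already knows the address range of the thread start routine (for instance from its RUNTIME_FUNCTION entry); the function name and parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* frames[0] is the innermost return address, frames[depth - 1] the outermost.
 * start_lo/start_hi bound the expected thread-start routine (assumed known).
 * Returns 1 if the walk ended somewhere other than that routine. */
int stack_is_truncated(const uint64_t *frames, size_t depth,
                       uint64_t start_lo, uint64_t start_hi)
{
    if (depth == 0)
        return 1;                       /* nothing captured at all */
    uint64_t outermost = frames[depth - 1];
    return !(outermost >= start_lo && outermost < start_hi);
}
```

A spoofer that patches only frames[0] leaves the outermost frame wherever the broken chain happened to stop, which is what this test picks up.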
Gadget detection
A common return address spoofing technique uses Jump-Oriented Programming (JOP) gadgets, instructions like jmp qword ptr [rbx], as fake return addresses. After the target function returns, control flows through the gadget back to the caller’s fixup code via a register.
BattlEye directly targets this. Their exception handler checks for opcode FF 23 (jmp qword ptr [rbx]) at return addresses:
const auto spoof = *(_WORD *)caller_function == 0x23FF; // jmp qword ptr [rbx]
The original technique still works with a different gadget. Whether that means a different register or an entirely different instruction pattern does not matter. But the signature shows that anti cheat developers watch what the community publishes and write detections for specific techniques.
More sophisticated detection scans the instruction before each return address looking for a call. Legitimate return addresses always follow a call that places them on the stack. If the byte sequence before a return address is not some form of call, the frame is almost certainly synthesized. This catches most basic return address spoofing regardless of which gadget the attacker picks.
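The call-precedence check can be sketched as a backwards byte match. This is a heuristic under stated assumptions: it recognizes only the near call rel32 form (E8) and the indirect FF /2 forms, both function names are hypothetical, and because x64 instructions are variable-length, a coincidental byte pattern can produce a false positive.

```c
#include <stddef.h>
#include <stdint.h>

/* Length of an indirect near call (FF /2) starting at p, or 0 if the
 * bytes are not such a call or too few bytes are available to decide. */
static size_t indirect_call_len(const uint8_t *p, size_t avail)
{
    if (avail < 2 || p[0] != 0xFF)
        return 0;
    uint8_t modrm = p[1];
    if (((modrm >> 3) & 7) != 2)        /* reg field must be /2 */
        return 0;
    uint8_t mod = modrm >> 6, rm = modrm & 7;
    size_t len = 2;
    if (mod != 3 && rm == 4) {          /* SIB byte follows */
        if (avail < 3)
            return 0;
        if (mod == 0 && (p[2] & 7) == 5)
            len += 4;                   /* SIB form with disp32 and no base */
        len += 1;
    }
    if (mod == 0 && rm == 5)
        len += 4;                       /* RIP-relative disp32 */
    else if (mod == 1)
        len += 1;                       /* disp8 */
    else if (mod == 2)
        len += 4;                       /* disp32 */
    return len;
}

/* Returns 1 if the bytes ending at code[ret_off] decode as a call. */
int preceded_by_call(const uint8_t *code, size_t ret_off)
{
    static const size_t lens[] = {2, 3, 4, 6, 7};
    if (ret_off >= 5 && code[ret_off - 5] == 0xE8)
        return 1;                       /* call rel32 */
    for (size_t i = 0; i < sizeof lens / sizeof lens[0]; i++) {
        size_t n = lens[i];
        if (ret_off >= n && indirect_call_len(code + ret_off - n, n) == n)
            return 1;
    }
    return 0;
}
```

A return address that points at an FF 23 gadget sitting after unrelated bytes fails this test, whichever register the gadget uses.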
Impossible call sequences
Even if every return address points into a real module, the sequence of calls needs to make logical sense. KERNELBASE!PathReplaceGreedy calling KERNELBASE!SystemTimeToTzSpecificLocalTimeEx does not happen in any legitimate code path. Synthetic stacks that pick arbitrary return addresses from legitimate modules can accidentally create function-to-function relationships that are structurally valid but logically impossible.
Checking this properly requires knowledge of legitimate call graphs, which is expensive to build and maintain. Most anti cheats do not do deep call graph validation yet. Elastic’s EDR does, and the anti cheat space tends to adopt EDR techniques with a few years’ delay.
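In its simplest form, call graph validation is a lookup of each caller/callee pair against a table of known edges. The sketch below assumes frames have already been resolved to symbol names and that a table of legitimate edges exists; the types, function names, and sample edges are all hypothetical, and a real table would be far larger and indexed rather than scanned linearly.

```c
#include <stddef.h>
#include <string.h>

/* One known-legitimate edge: caller invokes callee. */
typedef struct {
    const char *caller;
    const char *callee;
} edge_t;

static int edge_known(const edge_t *graph, size_t n,
                      const char *caller, const char *callee)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(graph[i].caller, caller) == 0 &&
            strcmp(graph[i].callee, callee) == 0)
            return 1;
    }
    return 0;
}

/* frames[0] is the innermost function; frames[i + 1] called frames[i].
 * Returns the index of the first callee whose edge is unknown, or -1. */
int first_implausible_edge(const edge_t *graph, size_t n,
                           const char *const *frames, size_t depth)
{
    for (size_t i = 0; i + 1 < depth; i++) {
        if (!edge_known(graph, n, frames[i + 1], frames[i]))
            return (int)i;
    }
    return -1;
}
```

With a table that lacks a PathReplaceGreedy to SystemTimeToTzSpecificLocalTimeEx edge, a synthetic stack containing that pair is flagged even though both addresses are image-backed.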
Stack pivot detection
The Thread Environment Block (TEB) contains stack boundaries for each thread: StackBase and StackLimit. If RSP falls outside these bounds, the stack has been pivoted to attacker-controlled memory, typically heap or global data used for Return-Oriented Programming (ROP) chains.
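PTEB teb = NtCurrentTeb();
ULONG_PTR lo = (ULONG_PTR)teb->NtTib.StackLimit; // lowest valid stack address
ULONG_PTR hi = (ULONG_PTR)teb->NtTib.StackBase;  // one past the highest address
if (rsp < lo || rsp >= hi) {
    // stack pivot detected
}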
Windows Defender Exploit Guard hooks VirtualAlloc and VirtualProtect specifically to run this check on every call. It is simple, cheap, and effective. The only real bypass requires the ROP chain to execute on the thread’s actual stack, which constrains what an attacker can do.
Asynchronous Procedure Call stack walking
Everything above happens in user mode. Anti cheats with kernel drivers have much stronger tools, and this is where evasion becomes genuinely difficult.
The straightforward kernel approach queues an Asynchronous Procedure Call (APC) to the target thread. When it executes, it runs in the thread’s context with access to the full stack. Calling RtlCaptureStackBackTrace inside the APC captures the thread’s call stack as it exists when the APC fires.
Both BattlEye and EAC use this. The limitation is that APCs can be blocked. Threads running with kernel APCs disabled do not execute the APC until re-enablement. Cheat developers discover this quickly.
Non-Maskable Interrupt callbacks
Non-Maskable Interrupts (NMIs) solve the APC problem. Nothing blocks them. That is the entire point.
By registering an NMI callback via KeRegisterNmiCallback and periodically sending NMIs to each CPU core, anti cheats capture the processor state (RIP, RSP, full register context) of whatever is running when the interrupt fires. If the interrupted RIP points outside any loaded kernel module, unsigned code is executing.
The NMI approach is probabilistic. Any single NMI might not catch execution in the right place. But NMIs are cheap to send, and over the course of a match, regular NMIs statistically catch a thread executing in unsigned memory. It is only a matter of time. This technique makes hiding kernel mode code execution fundamentally hard. There is no flag to set, no callback to unhook, no state to manipulate that prevents the NMI from firing.
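The "matter of time" argument can be made concrete with a simple model. Assume, purely for illustration, that each NMI independently has probability p of interrupting a core while it executes unsigned code, and that n NMIs are sent over a session. Neither figure is a measured value.

```latex
P(\text{at least one hit}) = 1 - (1 - p)^{n}

% Illustrative numbers only: p = 10^{-3}, n = 10^{4}
P = 1 - (1 - 10^{-3})^{10^{4}} \approx 1 - e^{-10} \approx 0.99995
```

Even a per-interrupt hit rate of one in a thousand becomes near-certain detection after ten thousand interrupts, which is why reducing the time spent in unsigned code only delays the outcome.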
Instrumentation callbacks
EAC takes a different approach for user mode syscall monitoring. The driver sets the InstrumentationCallback field in KPROCESS via NtSetInformationProcess. When a syscall returns to user mode, the kernel checks this field. If set, it redirects execution to the callback instead of the original return address. The original address goes into R10.
The callback validates that R10 points into ntdll.dll or win32u.dll, the only modules that should contain the syscall instruction on a legitimate call path:
// in the instrumentation callback
PVOID returnAddr = (PVOID)teb->InstrumentationCallbackPreviousPc; // R10
if (!isInModule(returnAddr, "ntdll.dll") &&
!isInModule(returnAddr, "win32u.dll")) {
// direct syscall detected
}
This catches direct syscalls without any kernel hooks on the syscall path itself. No inline hooks on ntdll functions, no System Service Descriptor Table (SSDT) modifications. A callback fires on every syscall return. It is hard to bypass from user mode because the redirect happens at kernel privilege.
Event Tracing for Windows Threat Intelligence
The Event Tracing for Windows (ETW) Threat Intelligence (TI) provider ({F4E1897C-BB5D-5668-F1D8-040F4D8DD344}) is only accessible to Early Launch Anti-Malware (ELAM) signed drivers. It provides kernel level visibility into sensitive operations (executable memory allocation, cross-process memory writes, thread context changes, APC insertion) with optional stack capture on every event.
When EVENT_ENABLE_PROPERTY_STACK_TRACE is enabled, each event includes the full call stack collected via RtlWalkFrameChain. Every NtWriteVirtualMemory, every NtProtectVirtualMemory, every NtCreateThreadEx comes with a complete record of who calls it and through what code path.
Elastic’s EDR uses this to tag 21 behavioral indicators on ETW events: direct syscalls (no ntdll in the stack), ROP gadgets in the call chain, unbacked executable memory, proxy calls through unexpected modules, and more. The anti cheat equivalents are less publicly documented, but the infrastructure is identical.
BattlEye’s shellcode-based tripwires
BattlEye’s user mode stack walking is well-documented thanks to secret.club’s research. BattlEye dynamically streams a roughly 8KB shellcode to the game process during competitive matches, what secret.club calls “shellcode8kb”. This shellcode sets up a Vectored Exception Handler (VEH) and places INT3 breakpoints on 12 commonly used functions:
GetAsyncKeyState, GetCursorPos, IsBadReadPtr, NtUserGetAsyncKeyState, GetForegroundWindow, CallWindowProcW, NtUserPeekMessage, NtSetEvent, sqrtf, __stdio_common_vsprintf_s, CDXGIFactory::TakeLock, TppTimerpExecuteCallback
When a breakpoint fires, the exception handler grabs the caller address from the top of the stack and runs four checks:
- NtQueryVirtualMemory fails on the caller - suggests the function is hooked to hide memory regions
- Memory is uncommitted - suggests Virtual Address Descriptor (VAD) manipulation
- Memory is executable but not MEM_IMAGE - manually mapped or injected code
- Return address is FF 23 - namazso’s jmp qword ptr [rbx] gadget
If any check triggers, report ID 0x31 goes to BattlEye’s servers with 32 bytes of the calling function’s code and the memory metadata.
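The four checks can be modeled as a function that maps a caller's memory metadata to a set of flags. This is an illustrative reconstruction from the list above, not BattlEye's actual code: the caller_info_t type, the flag names, and check_caller are hypothetical, and the MEM_* and protection values are defined locally to mirror the Windows SDK.

```c
#include <stdint.h>

/* Values mirror the Windows SDK constants. */
#define MEM_COMMIT     0x00001000u
#define MEM_IMAGE      0x01000000u
#define PAGE_EXEC_MASK 0x000000F0u  /* the four PAGE_EXECUTE* protections */

enum {
    FLAG_QUERY_FAILED   = 1 << 0,
    FLAG_UNCOMMITTED    = 1 << 1,
    FLAG_EXEC_NOT_IMAGE = 1 << 2,
    FLAG_JMP_RBX_GADGET = 1 << 3
};

/* Hypothetical summary of what a memory query on the caller returned. */
typedef struct {
    int      query_ok;        /* 0 if the memory query itself failed */
    uint32_t state;           /* MEM_COMMIT, ... */
    uint32_t type;            /* MEM_IMAGE, ... */
    uint32_t protect;         /* PAGE_* protection bits */
    uint8_t  first_bytes[2];  /* first two bytes at the return address */
} caller_info_t;

/* Returns a bitmask of triggered checks; 0 means the caller looks clean. */
unsigned check_caller(const caller_info_t *c)
{
    unsigned flags = 0;
    if (!c->query_ok)
        return FLAG_QUERY_FAILED;     /* remaining fields are meaningless */
    if (c->state != MEM_COMMIT)
        flags |= FLAG_UNCOMMITTED;
    if ((c->protect & PAGE_EXEC_MASK) && c->type != MEM_IMAGE)
        flags |= FLAG_EXEC_NOT_IMAGE;
    if (c->first_bytes[0] == 0xFF && c->first_bytes[1] == 0x23)
        flags |= FLAG_JMP_RBX_GADGET;
    return flags;
}
```

Any non-zero result corresponds to the condition under which the article describes a report being sent.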
The function list reveals deliberate profiling. It is not just input functions. sqrtf and __stdio_common_vsprintf_s appear because cheat developers use math libraries and string formatting. CDXGIFactory::TakeLock catches DirectX interaction. The selection shows BattlEye profiling which functions cheats actually call and placing tripwires accordingly.
secret.club also notes that BattlEye’s signature for finding CDXGIFactory::TakeLock includes CC padding bytes from a specific compilation, which breaks across different builds. Small mistakes like this limit the effectiveness of what is otherwise a solid detection approach.
EAC’s kernel stack copying
EAC goes deeper than BattlEye on the kernel side. The EAC driver does not just queue APCs. It copies the contents of kernel thread stacks asynchronously and scans them for non-paged code pointers that do not belong to any loaded kernel module.
This bypasses the standard APC limitation entirely. The driver reads the thread’s kernel stack directly from its KTHREAD structure and walks it offline. If pointers to unsigned memory appear on the stack, even as saved return addresses from functions that have already returned, the thread gets flagged. Disabling APCs or keeping a thread in a non-alertable state does not prevent this. The stack is just memory, and EAC reads it directly.
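Scanning a copied stack reduces to testing every 8-byte slot against two range lists. The sketch below is a user mode model of that idea, not EAC's implementation: range_t and scan_stack_copy are hypothetical names, and it assumes the caller supplies both the executable non-paged ranges and the loaded module ranges.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t base;
    uint64_t size;
} range_t;

static int in_ranges(const range_t *r, size_t n, uint64_t v)
{
    for (size_t i = 0; i < n; i++) {
        if (v >= r[i].base && v - r[i].base < r[i].size)
            return 1;
    }
    return 0;
}

/* Scans a copied stack one slot at a time. A slot is a hit when it points
 * into executable memory (exec) that no loaded module (mods) accounts for.
 * Writes up to max_hits values to hits and returns how many were stored. */
size_t scan_stack_copy(const uint64_t *stack, size_t slots,
                       const range_t *exec, size_t n_exec,
                       const range_t *mods, size_t n_mods,
                       uint64_t *hits, size_t max_hits)
{
    size_t found = 0;
    for (size_t i = 0; i < slots && found < max_hits; i++) {
        uint64_t v = stack[i];
        if (in_ranges(exec, n_exec, v) && !in_ranges(mods, n_mods, v))
            hits[found++] = v;
    }
    return found;
}
```

Because every slot is tested rather than only unwound frames, stale return addresses left behind by functions that have already returned are caught as well.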
Combined with instrumentation callbacks for user mode syscall origin validation and RtlLookupFunctionEntry/RtlVirtualUnwind for proper stack trace construction, EAC has coverage across both user and kernel transitions. It is arguably the most complete stack-based detection implementation in the anti cheat space right now.
Return address spoofing
Detection mechanisms push evasion techniques toward significant sophistication. The most basic approach works as follows: before calling a monitored function, swap the return address on the stack for a pointer to a JOP gadget in a legitimate module. Stash the real return address in a non-volatile register. After the call returns, the gadget redirects execution back through the register.
This defeats the module backing check, since the return address resolves to ntdll.dll or kernel32.dll. It fails the gadget detection check (BattlEye scans for FF 23) and the missing call instruction check (the byte before the gadget is not a call). As namazso’s original work shows, switching to a different gadget dodges BattlEye’s specific signature. But the structural problem remains: gadget addresses do not follow call instructions.
SilentMoonwalk
klezVirus takes spoofing significantly further with SilentMoonwalk, which constructs entire synthetic stack frames using .pdata metadata. It operates in two modes.
Synthetic mode creates artificial frames that look like legitimate function calls. Each frame is sized correctly according to the target function’s UNWIND_INFO, so RtlVirtualUnwind produces a coherent stack trace. The frames point to real functions in real modules.
Desync mode identifies problematic frames in the real stack and replaces them with frames from legitimate functions. It specifically looks for UWOP_PUSH_NONVOL entries to find frames that restore RBP on unwind, keeping the chain intact.
Both modes use ROP to restore the original stack state after the call completes. During the monitored function, the call stack shows only legitimate modules and plausible call sequences. SilentMoonwalk is the most complete user mode stack spoofing implementation publicly available.
The weakness is the “plausible” part. Synthetic mode can create call sequences that are structurally valid (frame sizes match, return addresses are in real modules) but represent execution paths that never actually happen. If an anti cheat maintains knowledge of legitimate call graphs, synthetic stacks become distinguishable from real ones.
Module stomping
Instead of fixing the call stack, module stomping fixes where code lives. Load a legitimate DLL, overwrite its .text section with the payload, and execute from there. Every return address naturally resolves to the stomped module because the attacker’s code literally occupies that module’s memory.
Detection compares the in-memory module against the on-disk copy. PE-sieve does this byte-by-byte. If more than roughly 10,000 bytes differ in the code section, the module is flagged (legitimate hooks typically patch far fewer bytes). Advanced variants restore the original bytes after execution, but memory page metadata, specifically the SharedOriginal flag in the working set entry, still changes when the page is written to. Tools like Moneta detect this.
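The byte comparison itself is straightforward, as the sketch below suggests. It assumes both buffers hold the same code section and ignores relocations and import patches, which a real scanner has to account for before comparing; the function names are hypothetical, and the threshold is a parameter rather than a fixed value.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of byte positions where the in-memory and on-disk copies differ. */
size_t count_diff_bytes(const uint8_t *mem, const uint8_t *disk, size_t len)
{
    size_t diff = 0;
    for (size_t i = 0; i < len; i++) {
        if (mem[i] != disk[i])
            diff++;
    }
    return diff;
}

/* Returns 1 when more than threshold bytes differ. A small number of
 * differing bytes is more consistent with inline hooks than stomping. */
int looks_stomped(const uint8_t *mem, const uint8_t *disk,
                  size_t len, size_t threshold)
{
    return count_diff_bytes(mem, disk, len) > threshold;
}
```

Called with a threshold of 10000, a section with a 14-byte inline hook passes while one whose first 12000 bytes were replaced is flagged.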
Indirect syscalls
Instead of executing syscall from attacker-owned code (which puts the attacker’s module address in the return stack), indirect syscalls jump to the syscall instruction inside ntdll.dll. From the kernel’s perspective, the return address points into ntdll, passing the instrumentation callback check.
The limitation: an indirect syscall jumps straight to the syscall instruction, so it skips the function’s prologue and with it any inline hooks that EDR products place at ntdll function entries. If the anti cheat validates that the call entered through the function’s actual prologue, not just that syscall executed from within ntdll, indirect syscalls alone are not sufficient. Passing one check means failing another.
The defender’s asymmetry
Every evasion technique leaves a residual signal: return address spoofing leaves gadget patterns, SilentMoonwalk creates structurally valid but logically implausible call sequences, module stomping changes page metadata, and direct syscalls leave non-ntdll return addresses in KTRAP_FRAME. The fundamental asymmetry is that defenders only need to catch one anomaly across all the frames, all the calls, and all the threads over an entire session, while attackers need every frame, every call, and every thread to look perfect every time. NMI callbacks, instrumentation callbacks, and ETW TI stack traces operate at privilege levels where user mode evasion cannot reach, and Intel Control-flow Enforcement Technology (CET) shadow stacks will soon maintain a hardware-backed copy of return addresses that software cannot modify at all. The call stack is not going away as a detection vector. Hardware enforcement is about to make it significantly stronger, and the era of purely software-based stack spoofing has an expiration date tied to CET adoption reaching critical mass.