Reverse Engineering 01: Extracting Data


  Suppose we have the following scenario:

  1. We have a game built on the Unity (U3D) engine;
  2. We want to extract its data with Il2CppDumper.

I. Preparation

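  First, download a release build from the Il2CppDumper repository. Then inspect the game we intend to reverse and make sure it contains:

  1. GameAssembly.dll, or a DLL with a similar name;
  2. global-metadata.dat, usually located under \il2cpp_data\Metadata.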
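
The check above can be scripted. The following is a minimal sketch, not part of Il2CppDumper: find_il2cpp_files is a hypothetical helper, and it assumes the binary is named exactly GameAssembly.dll, so adjust the name for games that ship a differently named DLL.

```python
from pathlib import Path


def find_il2cpp_files(game_dir):
    """Return (dll_path, metadata_path); either is None if not found.

    Hypothetical helper: assumes the binary is called GameAssembly.dll.
    """
    root = Path(game_dir)
    # Search recursively, since the metadata usually sits in a subfolder
    # such as il2cpp_data/Metadata rather than next to the executable.
    dll = next(root.rglob("GameAssembly.dll"), None)
    metadata = next(root.rglob("global-metadata.dat"), None)
    return dll, metadata


# Usage (the directory is a placeholder):
# dll, metadata = find_il2cpp_files(r"path\to\game")
```

If either value comes back as None, the game may not be an IL2CPP build, or the files may have been renamed or packed.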

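  With these in place, we can try running Il2CppDumper.exe directly: select GameAssembly.dll first, then global-metadata.dat. If the dump succeeds, this stage is done; if it reports an error, the data is probably encrypted and cannot be extracted directly.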

II. Decrypting or Dumping

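  Such an error is usually caused by encryption of global-metadata.dat. There are two ways to obtain a decrypted copy:

  1. Use Il2CppInspector to extract the metadata; for popular games there is usually a matching plugin;
  2. Dump the metadata directly from memory.

  Here we take the second approach and dump from memory. First, open global-metadata.dat in a hex editor and make sure it begins with: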

AF 1B B1 FA XX XX XX XX

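The first four bytes, AF 1B B1 FA, are the magic number we use to locate the metadata in memory; the next four bytes are the version number. We extract it with the following Python program: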

# Dump global-metadata.dat from memory

import sys

import pymem
import pymem.exception
import pymem.pattern

class MetadataDumper:
    def __init__(self, process_name: str, target_size_bytes: int):
        self.process_name = process_name
        # Read the original size plus 1MB of padding so the tail is not cut off
        self.dump_size = target_size_bytes + (1 * 1024 * 1024)
        self.pm = None

    def attach(self):
        try:
            self.pm = pymem.Pymem(self.process_name)
            print(f"[+] Successfully attached to process: {self.process_name} (PID: {self.pm.process_id})")
        except Exception as e:
            print(f"[-] Cannot find or attach to process '{self.process_name}'. Please ensure the game is running.")
            sys.exit(1)

    def scan_and_dump_all(self):
        print("[*] Starting to scan memory for *all* global-metadata signatures...")

        # Signature: magic number
        signature = b'\xAF\x1B\xB1\xFA'

        try:
            # Key modification: return_multiple=True, find all matches
            results = pymem.pattern.pattern_scan_all(self.pm.process_handle, signature, return_multiple=True)

            if not results:
                print("[-] No signature found in memory. Header might be erased or encryption method changed.")
                return

            print(f"[!] Found {len(results)} potential addresses. Starting extraction...")

            for index, address in enumerate(results):
                print(f"\n--- Processing address {index + 1}: {hex(address)} ---")
                self.dump_to_file(address, index)

        except Exception as e:
            print(f"[-] Error during scanning: {e}")

    def dump_to_file(self, address, index):
        try:
            # Try direct read
            data = self.pm.read_bytes(address, self.dump_size)
            self._save(data, index, address)

        except pymem.exception.MemoryReadError:
            print(f"[-] Address {hex(address)} read failed (Error 299), trying safe read...")
            self._safe_dump(address, index)
        except Exception as e:
            print(f"[-] Unknown error: {e}")

    def _safe_dump(self, start_address, index):
        buffer = bytearray()
        chunk_size = 1024 
        current_addr = start_address
        bytes_read = 0

        while bytes_read < self.dump_size:
            try:
                chunk = self.pm.read_bytes(current_addr, chunk_size)
                buffer.extend(chunk)
                current_addr += chunk_size
                bytes_read += chunk_size
            except Exception:
                break

        if len(buffer) > 1024 * 1024: 
            self._save(buffer, index, start_address)
        else:
            print("[-] Data too small, skipping save.")

    def _save(self, data, index, address):
        # Filename includes address for identification
        filename = f"dump_{index}_{hex(address)}.dat"
        with open(filename, "wb") as f:
            f.write(data)
        print(f"[+] Saved: {filename} (size: {len(data)} bytes)")
        self._check_if_decrypted(data)

    def _check_if_decrypted(self, data):
        """
        Simple heuristic check: look for common plaintext strings
        """
        # Decrypted metadata should contain many plaintext names, so look for
        # a few strings that commonly appear in it
        sample = data[:1024 * 1024]  # Only check the first 1MB
        if b'UnityEngine' in sample or b'm_scor' in sample or b'System.String' in sample:
            print("    [*] Hint: This file looks decrypted (found plaintext strings).")
        else:
            print("    [!] Hint: This file still appears to be encrypted/garbled.")

if __name__ == "__main__":
    TARGET_PROCESS = "_Program_.exe" 
    # Your original file size
    ORIGINAL_FILE_SIZE = 32880320 

    dumper = MetadataDumper(TARGET_PROCESS, ORIGINAL_FILE_SIZE)
    dumper.attach()
    dumper.scan_and_dump_all()

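Two points to note here:

  1. Replace TARGET_PROCESS with the name of the target process;
  2. Set ORIGINAL_FILE_SIZE to the size in bytes of the original global-metadata.dat; it controls how many bytes are dumped.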
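
Rather than typing the size by hand, ORIGINAL_FILE_SIZE can be read from the file on disk. A small sketch: original_metadata_size is a hypothetical helper, and the path in the usage comment is a placeholder, not a real location.

```python
import os


def original_metadata_size(metadata_path):
    """Return the on-disk size, in bytes, of the original global-metadata.dat."""
    return os.path.getsize(metadata_path)


# Usage (placeholder path):
# ORIGINAL_FILE_SIZE = original_metadata_size(r"path\to\global-metadata.dat")
```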

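  After dumping, the size of the resulting file usually does not match the original, so we now repair it by hand:

  1. Make sure the file length matches the original metadata file (delete the extra 0x00 bytes at the end of the file);
  2. Make sure the version number is sane (here I set it to 18 00 00 00, i.e. version 24 in little-endian).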
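
The two manual fixes can also be sketched in Python. This is an illustration under assumptions, not a definitive tool: repair_dump is a hypothetical helper, it assumes the dump starts at the magic number, and it cuts the data to the original length instead of stripping zero bytes, which has the same effect when the surplus is padding. The default version 24 (bytes 18 00 00 00) is the value chosen above; use whatever is appropriate for your game.

```python
import struct

MAGIC = b"\xAF\x1B\xB1\xFA"


def repair_dump(data, original_size, version=24):
    """Cut a memory dump to the original length and rewrite its version field.

    Hypothetical helper. Assumes `data` begins with the metadata magic and is
    at least `original_size` bytes long.
    """
    if data[:4] != MAGIC:
        raise ValueError("dump does not start with the metadata magic")
    if len(data) < original_size:
        raise ValueError("dump is shorter than the original file")
    fixed = bytearray(data[:original_size])    # drop everything past the end
    struct.pack_into("<i", fixed, 4, version)  # bytes 4..7: little-endian int32
    return bytes(fixed)
```

The result can then be written to a new file, leaving the raw dump untouched in case another candidate address turns out to be the right one.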

After these two steps, we compare dump.dat with the original global-metadata.dat:

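  1. Both files end at offset 1F5B6BF (a length of 0x1F5B6C0 = 32880320 bytes, matching ORIGINAL_FILE_SIZE in the script);
  2. Both files start with the magic number AF 1B B1 FA;
  3. The version number in the dump has been set to 18 00 00 00.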
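
The same three checks can be automated. A sketch, assuming both files fit in memory; read_header and compare_metadata are hypothetical helper names.

```python
import struct

MAGIC = b"\xAF\x1B\xB1\xFA"


def read_header(data):
    """Return (magic_ok, version) parsed from the first 8 bytes."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes")
    version = struct.unpack_from("<i", data, 4)[0]  # little-endian int32
    return data[:4] == MAGIC, version


def compare_metadata(dump, original):
    """Run the three checks from the list above on two byte strings."""
    dump_magic, dump_version = read_header(dump)
    orig_magic, _ = read_header(original)
    return {
        "same_length": len(dump) == len(original),
        "magic_ok": dump_magic and orig_magic,
        "dump_version": dump_version,
    }
```

Read both files with open(path, "rb").read() and pass the bytes in; with the values used in this post, dump_version should come back as 24.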

III. Extraction

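  With the fixes from Section II applied, we can extract the data with Il2CppDumper. In the output, the files to focus on are:

  1. DummyDll/Assembly-CSharp.dll, which contains the memory offsets of the game's C# functions;
  2. script.json, which is usually where to look if the game uses a scripting layer such as xLua.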
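
script.json can be large, so a short script helps when searching it. The sketch below assumes the file has a top-level "ScriptMethod" list whose entries carry "Name" and "Address" fields; that layout is an assumption and may differ between Il2CppDumper versions, so check it against your own file first. find_methods and load_script are hypothetical helpers.

```python
import json


def find_methods(script, keyword):
    """Return (address, name) pairs whose name contains `keyword`.

    Assumes `script` is the parsed script.json with a "ScriptMethod" list of
    objects that have "Name" and "Address" keys (an assumption; check your file).
    """
    keyword = keyword.lower()
    return [
        (entry.get("Address"), entry.get("Name", ""))
        for entry in script.get("ScriptMethod", [])
        if keyword in entry.get("Name", "").lower()
    ]


def load_script(path):
    """Parse script.json from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Usage: find_methods(load_script("script.json"), "lua")
```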