The site you linked to has no attachment; the source code you need is posted there as text. Below is my translation:
Author: flier, Area: MSDN
Title: Detours x64 [Draft]
From: ShuiMu BBS (Translator's note: this is a well-known BBS in China, hosted at TsingHua University.), posted from within the site.
Damn dependencies! I have to solve this one before I can move on to the other problem. :s
I won't explain what Detours is. The 2.1 Express version released by MSR doesn't support x64. Someone asked MSR about this by email, and MSR said x64 support would cost 10,000 dollars. Far too expensive to even consider. -_-b
So I tried modifying it myself, and it now runs on x64. Because some bugs remain, I'm not releasing the full version yet. Here I'll only describe the framework.
API hooking is a common technique these days. It consists of the following: find the entry point of the function, then replace the first few bytes with a 'jmp' instruction, which jumps to a trampoline. After the work is done, execution jumps back into the original code. Please refer to the Detours source code for details. Here I'll only cover what is different on x64.
First, when you call DetourAttachEx, you must find the real entry points of the original function and the hook function. You may need to skip the import table and some indirect jumps that are inserted for debugging. You can see this in the function detour_skip_jmp in detours.cpp.
Some changes are needed on x64, because instruction lengths can differ from x86. Most of the code is copied from mhook; I added the embed check:
inline PBYTE detour_skip_jmp(PBYTE pbCode, PVOID *ppGlobals)
{
    if (pbCode == NULL) {
        return NULL;
    }
    if (ppGlobals != NULL) {
        *ppGlobals = NULL;
    }
    if (pbCode[0] == 0xff && pbCode[1] == 0x25) {
        // jmp [rip+disp32]: read the pointer stored at (next instruction + disp32).
        PBYTE pbTarget = *(PBYTE*)(pbCode + 6 + *(INT32 *)&pbCode[2]);
        if (detour_is_imported(pbCode, pbTarget)) {
            PBYTE pbNew = *(PBYTE *)pbTarget;
            DETOUR_TRACE(("%p->%p: skipped over import table.\n", pbCode, pbNew));
            return pbNew;
        }
        return detour_skip_jmp(pbTarget, ppGlobals);
    }
    else if (pbCode[0] == 0xe9) {
        // jmp rel32: 5-byte instruction with a signed 32-bit displacement.
        PBYTE pbNew = pbCode + 5 + *(INT32 *)&pbCode[1];
        return detour_skip_jmp(pbNew, ppGlobals);
    }
    else if (pbCode[0] == 0xeb) {
        // jmp rel8: 2-byte instruction with a signed 8-bit displacement.
        PBYTE pbNew = pbCode + 2 + *(CHAR *)&pbCode[1];
        return detour_skip_jmp(pbNew, ppGlobals);
    }
    return pbCode;
}
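To make the displacement arithmetic above concrete, here is a minimal, portable sketch that decodes the same three jump encodings from a plain byte buffer. It works on buffer offsets instead of real addresses, and the names read_le32 and decode_jump_operand are hypothetical helpers for illustration, not part of Detours or mhook.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read a little-endian signed 32-bit value without relying on host byte order.
inline std::int32_t read_le32(const std::uint8_t* p)
{
    const std::uint32_t u = static_cast<std::uint32_t>(p[0])
                          | (static_cast<std::uint32_t>(p[1]) << 8)
                          | (static_cast<std::uint32_t>(p[2]) << 16)
                          | (static_cast<std::uint32_t>(p[3]) << 24);
    return static_cast<std::int32_t>(u);
}

// Hypothetical helper: decode the jump that starts at offset `at` in `code`.
// On success, *operand receives the offset the displacement refers to:
//   E9 / EB : the jump destination
//   FF 25   : the location of the 8-byte pointer slot (x64, RIP-relative)
// Returns false if the bytes are not one of these three encodings.
inline bool decode_jump_operand(const std::uint8_t* code, std::size_t size,
                                std::size_t at, std::int64_t* operand)
{
    assert(code != nullptr && operand != nullptr);
    if (at >= size) {
        return false;
    }
    const std::size_t left = size - at;
    const std::int64_t base = static_cast<std::int64_t>(at);
    if (left >= 6 && code[at] == 0xFF && code[at + 1] == 0x25) {
        *operand = base + 6 + read_le32(code + at + 2);   // relative to next instruction
        return true;
    }
    if (left >= 5 && code[at] == 0xE9) {
        *operand = base + 5 + read_le32(code + at + 1);
        return true;
    }
    if (left >= 2 && code[at] == 0xEB) {
        *operand = base + 2 + static_cast<std::int8_t>(code[at + 1]);
        return true;
    }
    return false;
}
```

In all three cases the displacement is signed and is measured from the end of the jump instruction, which is why the constants 6, 5 and 2 appear.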
Once the entry address is known, Detours calls DetourCopyInstructionEx to copy the code at the entry. It identifies each instruction from its bytes by looking it up in s_rceCopyTable. For x64, the entries for 0x40 through 0x4F must be changed, because those bytes are used as REX prefixes in the x64 instruction set. Please refer to the "AMD64 Architecture Programmer's Manual, Volume 3" for details; more information can be found in the section "1.2.7 REX Prefixes". Here they are treated as single-byte prefixes.
#ifdef DETOURS_X64 // For Rex Prefix
    { 0x40, ENTRY_CopyBytesPrefix }, { 0x41, ENTRY_CopyBytesPrefix }, { 0x42, ENTRY_CopyBytesPrefix }, { 0x43, ENTRY_CopyBytesPrefix },
    { 0x44, ENTRY_CopyBytesPrefix }, { 0x45, ENTRY_CopyBytesPrefix }, { 0x46, ENTRY_CopyBytesPrefix }, { 0x47, ENTRY_CopyBytesPrefix },
    { 0x48, ENTRY_CopyBytesPrefix }, { 0x49, ENTRY_CopyBytesPrefix }, { 0x4A, ENTRY_CopyBytesPrefix }, { 0x4B, ENTRY_CopyBytesPrefix },
    { 0x4C, ENTRY_CopyBytesPrefix }, { 0x4D, ENTRY_CopyBytesPrefix }, { 0x4E, ENTRY_CopyBytesPrefix }, { 0x4F, ENTRY_CopyBytesPrefix },
#else
    // (x86 entries for 0x40 to 0x4F; not shown in the original post)
#endif
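As a small illustration of why these sixteen bytes need special handling, the sketch below decodes a REX prefix into its four flag bits. The layout 0100WRXB comes from the AMD64 manual; the struct RexBits and the two function names are assumptions made for this example and do not appear in Detours.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration type. A REX prefix has the bit layout 0100WRXB.
struct RexBits {
    bool w;  // 1 = 64-bit operand size
    bool r;  // extends the ModRM reg field
    bool x;  // extends the SIB index field
    bool b;  // extends the ModRM r/m field, SIB base, or opcode register field
};

// True for 0x40..0x4F. In 64-bit mode these bytes are REX prefixes;
// in 32-bit mode the same bytes are the one-byte INC/DEC instructions.
inline bool is_rex_prefix(std::uint8_t byte)
{
    return (byte & 0xF0) == 0x40;
}

inline RexBits decode_rex(std::uint8_t byte)
{
    assert(is_rex_prefix(byte));
    return RexBits{ (byte & 0x08) != 0, (byte & 0x04) != 0,
                    (byte & 0x02) != 0, (byte & 0x01) != 0 };
}
```

For example, the byte sequence 48 83 EC 28 (sub rsp, 28h), which often appears at a function entry, starts with the REX prefix 0x48, where only the W bit is set. A copier that still treated 0x48 as a complete one-byte instruction would split that instruction in the wrong place.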
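The handling of the indirect jmp/call instructions encoded with opcode FF also differs between x86 and x64: on x86 the 32-bit displacement is the absolute address of the pointer, while on x64 it is relative to RIP.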
PBYTE CDetourDis::CopyFF(REFCOPYENTRY pEntry, PBYTE pbDst, PBYTE pbSrc)
{
    (void)pEntry;
    if (0x15 == pbSrc[1] || 0x25 == pbSrc[1]) { // call/jmp through [disp32]
#ifdef DETOURS_X64
        // x64: disp32 is signed and relative to the next instruction (RIP).
        INT32 nOffset = *((INT32 *)&pbSrc[2]);
        *m_ppbTarget = (PBYTE)*((PDWORD_PTR)(pbSrc + 6 + nOffset));
#else
        // x86: disp32 is the absolute address of the pointer.
        PBYTE *ppbTarget = *(PBYTE**)&pbSrc[2];
        *m_ppbTarget = *ppbTarget;
#endif
    }
    else if (0x10 == (0x38 & pbSrc[1]) || // reg bits (5..3) of ModR/M == 010
             0x18 == (0x38 & pbSrc[1]) || // reg bits (5..3) of ModR/M == 011
             0x20 == (0x38 & pbSrc[1]) || // reg bits (5..3) of ModR/M == 100
             0x28 == (0x38 & pbSrc[1])    // reg bits (5..3) of ModR/M == 101
            ) {
        *m_ppbTarget = (PBYTE)DETOUR_INSTRUCTION_TARGET_DYNAMIC;
    }
    const COPYENTRY ce = { 0xff, ENTRY_CopyBytes2Mod };
    return (this->*ce.pfCopy)(&ce, pbDst, pbSrc);
}
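The masks in CopyFF are easier to follow once the ModRM byte is split into its fields. The following is a minimal sketch of that split for opcode FF; the enum FFKind and both function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of an FF-opcode instruction by its ModRM reg field.
enum class FFKind { CallNear, CallFar, JmpNear, JmpFar, Other };

// `insn` points at the FF opcode byte; insn[1] is the ModRM byte.
inline FFKind classify_ff(const std::uint8_t* insn)
{
    assert(insn != nullptr && insn[0] == 0xFF);
    switch ((insn[1] >> 3) & 0x7) {      // reg field, bits 5..3 (same bits as mask 0x38)
    case 2:  return FFKind::CallNear;    // FF /2
    case 3:  return FFKind::CallFar;     // FF /3
    case 4:  return FFKind::JmpNear;     // FF /4
    case 5:  return FFKind::JmpFar;      // FF /5
    default: return FFKind::Other;       // e.g. /0 inc, /1 dec, /6 push
    }
}

// True when mod == 00 and r/m == 101, i.e. the operand is [disp32] only.
// On x86 that disp32 is an absolute address; in 64-bit mode it is RIP-relative.
inline bool ff_is_disp32_only(const std::uint8_t* insn)
{
    assert(insn != nullptr && insn[0] == 0xFF);
    return (insn[1] & 0xC7) == 0x05;
}
```

With this split, 0x15 is a near call through [disp32] and 0x25 is a near jmp through [disp32], which are the two cases where the target can be read statically. Every other call or jmp form goes through a register or a computed address, so the target is only known at run time.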
After the original instructions are copied into the trampoline, code that jumps back to the original function must be appended. On x86, Detours uses the function "detour_gen_jmp_immediate" to generate these instructions. On x64 this must change, because pointers are 64 bits wide. If the target is within 2 GB, a jmp with a 32-bit displacement works. Otherwise, the target address must be stored in memory, and that memory location is used as the operand of the jmp. When detour_alloc_trampoline allocates memory for the trampoline, it tries to place it within 2 GB of the target to avoid the long jump instruction. This ensures that any target function of at least 5 bytes can still be hooked as before. But the trampoline and the hook function may be more than 2 GB apart, so that case has to be handled as well. For the same reason as before, I copied the mhook source code, shown below:
inline PBYTE detour_gen_jmp(PBYTE pbCode, PBYTE pbJumpTo)
{
    PBYTE pbJumpFrom = pbCode + 5;
    SIZE_T cbDiff = pbJumpFrom > pbJumpTo ? pbJumpFrom - pbJumpTo : pbJumpTo - pbJumpFrom;
    if (cbDiff <= 0x7fff0000) {
        // Near: jmp rel32 (5 bytes).
        *pbCode++ = 0xe9;
        *((PDWORD)pbCode) = (DWORD)(DWORD_PTR)(pbJumpTo - pbJumpFrom);
        pbCode += sizeof(DWORD);
    } else {
        // Far: jmp [rip+0], followed by the 8-byte absolute target (14 bytes).
        *pbCode++ = 0xff;
        *pbCode++ = 0x25;
        *((PDWORD)pbCode) = (DWORD)0;
        pbCode += sizeof(DWORD);
        *((PDWORD_PTR)pbCode) = (DWORD_PTR)(pbJumpTo);
        pbCode += sizeof(DWORD_PTR);
    }
    return pbCode;
}
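The two encodings can be checked without touching executable memory. Below is a minimal sketch that emits the same bytes into an ordinary buffer, taking the addresses as plain 64-bit integers. The function emit_jmp is a hypothetical stand-in for detour_gen_jmp, written only so the byte layout can be inspected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for detour_gen_jmp. `from` is the address at which the
// jump will execute, `to` is the destination. Writes into `out`, which must
// have room for 14 bytes, and returns the number of bytes written (5 or 14).
inline std::size_t emit_jmp(std::uint8_t* out, std::uint64_t from, std::uint64_t to)
{
    assert(out != nullptr);
    const std::uint64_t next = from + 5;   // address right after a 5-byte jmp rel32
    const std::uint64_t dist = next > to ? next - to : to - next;
    std::size_t n = 0;
    if (dist <= 0x7FFF0000u) {
        // jmp rel32: E9 followed by the little-endian signed displacement.
        const std::uint32_t rel = static_cast<std::uint32_t>(to - next);
        out[n++] = 0xE9;
        for (int i = 0; i < 4; ++i) {
            out[n++] = static_cast<std::uint8_t>(rel >> (8 * i));
        }
    } else {
        // jmp [rip+0]: FF 25 00000000, then the 8-byte absolute destination,
        // which sits directly after the instruction and acts as its pointer slot.
        out[n++] = 0xFF;
        out[n++] = 0x25;
        for (int i = 0; i < 4; ++i) {
            out[n++] = 0x00;
        }
        for (int i = 0; i < 8; ++i) {
            out[n++] = static_cast<std::uint8_t>(to >> (8 * i));
        }
    }
    return n;
}
```

The size difference is the practical point: the near form fits in 5 bytes, while the far form needs 14, so a patch site that only has room for 5 bytes relies on the trampoline being allocated within 2 GB.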
Besides the above, some special cases must be handled. For example, when a thread is suspended and its RIP points into the trampoline or the target function, the same method as on x86 should work in theory. But I have not been able to test this case.
#ifdef DETOURS_X64
#define DETOURS_EIP Rip
#define DETOURS_EIP_TYPE DWORD64
#endif // DETOURS_X64
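The suspended-thread case comes down to one address adjustment, sketched below as a pure function on integers. This is only an illustration under assumptions: the name relocate_ip and its parameters are hypothetical, and the sketch assumes the instructions copied into the trampoline keep the same lengths and order as in the target, so that an offset into one is valid in the other.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: if a suspended thread's instruction pointer lies inside
// the `patched_len` bytes at `target` that are about to be overwritten, return
// the matching address inside the trampoline copy; otherwise return it unchanged.
inline std::uint64_t relocate_ip(std::uint64_t ip, std::uint64_t target,
                                 std::uint64_t trampoline, std::uint64_t patched_len)
{
    assert(patched_len > 0);
    if (ip >= target && ip - target < patched_len) {
        return trampoline + (ip - target);   // same offset, new location
    }
    return ip;
}
```

On x64 the value being adjusted would be the Rip field of the thread context, which is what the DETOURS_EIP macro above selects.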
There is more code to handle other cases, such as RIP-relative addressing in ModRM. That code will be released later, together with the full and more robust version.
There are white clouds floating in the blue sky. That's the scenery I like.