Until now I have used inline asm with clobber lists, which is not the best choice for good performance. I am just starting with assembly. I program on my own machine with GCC, but the resulting code will run on another machine built with ICC; both are 64-bit (Sandy Bridge and Haswell).
A function without arguments can be called with a plain CALL, but I do not understand well how to call a function with arguments, so I am trying to put an inline __asm__ block inside each function instead. Is that a good choice?
My function:
    void add_N(size_t *cnum, size_t *ap, size_t *bp, long &n, unsigned int &c){
        __asm__(
            //Insert my code here
        );
    }

And when I look at the disassembly (with GCC), I have:

    add_N(unsigned long*, unsigned long*, unsigned long*, long&, unsigned int&):
    0x100001ff0 <+0>:  pushq %rbp
    0x100001ff1 <+1>:  movq %rsp, %rbp
    0x100001ff4 <+4>:  movq %rdi, -0x8(%rbp)
    0x100001ff8 <+8>:  movq %rsi, -0x10(%rbp)
    0x100001ffc <+12>: movq %rdx, -0x18(%rbp)
    0x100002000 <+16>: movq %rcx, -0x20(%rbp)
    0x100002004 <+20>: movq %r8, -0x28(%rbp)
    0x100002008 <+24>: popq %rbp
    0x100002009 <+25>: retq

I understand what is happening. Will different compilers and microarchitectures always use the same registers for the arguments if the function signature is the same?
Then I put some code inside my function (not __asm__ code), and the disassembly now pushes a lot of registers. Why does that happen? Why do I not need to push %rax and %rsi (for example), but do need to push r13, r14 and r15? If I need to push the r** registers, can I do that in the inline __asm__?

    0x100001ea0 <+0>:  pushq %rbp
    0x100001ea1 <+1>:  movq %rsp, %rbp
    0x100001ea4 <+4>:  pushq %r15
    0x100001ea6 <+6>:  pushq %r14
    0x100001ea8 <+8>:  pushq %r13
    0x100001eaa <+10>: pushq %r12
    0x100001eac <+12>: pushq %rbx
    0x100001ead <+13>: movq %rdi, -0x30(%rbp)
    0x100001eb1 <+17>: movq %rsi, -0x38(%rbp)
    0x100001eb5 <+21>: movq %rdx, -0x40(%rbp)
    0x100001eb9 <+25>: movq %rcx, -0x48(%rbp)
    0x100001ebd <+29>: movq %r8, -0x50(%rbp)

Best answer
For the last question: yes, the same registers are used for parameters as long as both compilers follow the same ABI. The Linux x86_64 ABI is defined here: http://www.x86-64.org/documentation/abi.pdf and compilers targeting that platform have to conform to it. Specifically, you are interested in page 16, Parameter Passing.
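As a rough illustration of the register assignment, here is a minimal sketch assuming GCC-style extended asm on x86_64. The helper name sum_in_abi_regs is hypothetical and not from the question; the "D", "S" and "a" constraints pin the operands to rdi, rsi and rax, which are the registers the System V ABI uses for the first two integer arguments and the return value. A plain C++ fallback is included so the sketch still builds on other targets.

```cpp
// System V AMD64 ABI: integer and pointer arguments are passed in
//   rdi, rsi, rdx, rcx, r8, r9   (in that order); the result comes back in rax.
// References are passed as pointers, so for the question's signature
//   add_N(size_t *cnum, size_t *ap, size_t *bp, long &n, unsigned int &c)
// this gives: cnum -> rdi, ap -> rsi, bp -> rdx, &n -> rcx, &c -> r8,
// which matches the movq instructions in the disassembly above.

// Hypothetical helper (an assumption for illustration only).
long sum_in_abi_regs(long a, long b) {
#if defined(__x86_64__) && defined(__GNUC__)
    long result;
    // "D" = rdi, "S" = rsi, "a" = rax: the same registers a real call would use.
    __asm__("leaq (%%rdi,%%rsi), %%rax"
            : "=a"(result)
            : "D"(a), "S"(b));
    return result;
#else
    return a + b;  // portable fallback for non-x86_64 targets
#endif
}
```

Calling it is ordinary C++, for example sum_in_abi_regs(2, 40); the compiler sets up the argument registers itself, so no hand-written CALL sequence is needed.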
Windows uses a different ABI on x86_64. That is one reason you cannot take a program or library compiled for Linux and run it on Windows (there are additional reasons as well).
For details about GCC inline assembly, check an existing tutorial, as it is quite a long topic. This is a good start: http://asm.sourceforge.net/articles/rmiyagi-inline-asm.txt
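To give a flavour of what such tutorials cover, here is a minimal sketch of extended asm with operand constraints, assuming GCC-style syntax on x86_64. The Limb2 type and add128 function are hypothetical names, not the asker's add_N. The point is that the compiler chooses the registers, and the clobber list tells it what the asm body changes. Under the System V ABI, rbx, rbp and r12 to r15 are callee-saved while rax and rsi are scratch registers, which is presumably why the prologue in the question pushes the former and not the latter; if an asm body uses a fixed register, naming it in the clobber list lets the compiler emit the save and restore itself.

```cpp
#include <cstdint>

// Hypothetical two-limb (128-bit) value; not from the original post.
struct Limb2 {
    std::uint64_t lo;
    std::uint64_t hi;
};

// Adds two 128-bit values limb by limb, propagating the carry.
Limb2 add128(Limb2 a, Limb2 b) {
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__("addq %[blo], %[lo]\n\t"   // lo += b.lo, sets the carry flag
            "adcq %[bhi], %[hi]"       // hi += b.hi + carry
            // "+&r": read-write register operand; '&' (early clobber) keeps
            // it from sharing a register with an input-only operand.
            : [lo] "+&r"(a.lo), [hi] "+&r"(a.hi)
            : [blo] "r"(b.lo), [bhi] "r"(b.hi)
            : "cc");                   // the flags are modified
#else
    // Portable fallback so the sketch still builds on other targets.
    std::uint64_t old_lo = a.lo;
    a.lo += b.lo;
    a.hi += b.hi + (a.lo < old_lo ? 1u : 0u);
#endif
    return a;
}
```

Because no register is named explicitly, there is nothing to push or pop by hand; the compiler allocates the operands and handles any saving required by the ABI.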