So you think you’re a hacker and you want to embed (inline) assembly into your C program. In many cases it is this low-level material that is used to exploit other programs, and the ability to manipulate registers directly can be very powerful. I am thinking more about the efficiency that can be gained by writing a function in assembler rather than C. This used to be more common with early compilers, but these days most compilers produce more efficient assembly than a human would. There are still some circumstances, usually in embedded systems, where it can be advantageous to embed assembly into your C program.
The GNU C compiler (GCC) for Linux uses the AT&T assembler syntax by default rather than the Intel syntax, although you can add a directive to tell the assembler to use the Intel syntax instead. The Intel syntax is the one most assembly references use. In Intel syntax the first operand is the destination and the second operand is the source, whereas in AT&T syntax the first operand is the source and the second operand is the destination.
Okay, let’s get on with it. The examples of assembly are for the x86 architecture.
Embed Assembly In C
To embed assembly instructions into your C program we use the asm keyword. We can also write __asm__ instead, which avoids a clash if the program already uses the name asm and still works when GCC is run in a strict standard mode such as -std=c99, where plain asm is not recognised.
int main()
{
    __asm__("movl %esp,%eax");
    return 0;
}
This little function copies a long (32 bits) from the stack pointer into register ‘a’ (eax); remember that in AT&T syntax the source comes first. As you can see, it’s quite easy to embed assembly into your C program. Let’s do something ‘useful’.
#include <stdio.h>
#include <unistd.h>
int main()
{
    int data1 = 2, data2 = 3;
    // Fancy assembly statement
    __asm__("addl %%ebx,%%eax"
        :"=a"(data1)
        :"a"(data1), "b"(data2)
    );
    fprintf(stdout, "data1 + data2 = %d\n", data1);
    return 0;
}
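The above example is the addition of two longs (32-bit integers). It is written in AT&T syntax using GCC’s extended asm form, in which register names need a double %% because a single % introduces an operand placeholder. The general form is: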
__asm__("<asm routine>" : output : input : modify);
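In this case I add data1 and data2. The output constraint “=a” puts the result in register ‘a’ (eax) and writes it back to data1; the inputs are data1 in register ‘a’ and data2 in register ‘b’ (ebx). The optional modify list, not needed here, names any other registers the assembly changes. At the end I use the C fprintf statement to output the result of 5.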
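The modify list in the template above is not exercised by the addition example, so here is a minimal sketch of it, assuming GCC or Clang targeting x86. The function name add_with_scratch is made up for illustration. It uses ecx as a scratch register and lists it, together with "cc" for the condition flags, so the compiler knows not to keep anything it needs there.

```c
/* Hypothetical helper: adds two ints using ecx as a scratch register. */
static int add_with_scratch(int a, int b)
{
    int result;
    __asm__("movl %1, %%ecx\n\t"   /* ecx = a */
            "addl %2, %%ecx\n\t"   /* ecx += b */
            "movl %%ecx, %0"       /* result = ecx */
            : "=r"(result)         /* output: any general register */
            : "r"(a), "r"(b)       /* inputs: any general registers */
            : "ecx", "cc");        /* modify: scratch register and flags */
    return result;
}
```

Calling add_with_scratch(2, 3) gives the same 5 as the example above. Because ecx appears in the modify list, the compiler will not pick it for any of the %0, %1 or %2 operands.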
If you want to specifically use the Intel syntax you can inform the compiler of this by using:
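__asm__(".intel_syntax noprefix\n\t"
"pop edx\n\t"
"mov eax,edx\n\t"
".att_syntax prefix\n\t");
The .intel_syntax noprefix directive tells the assembler that the instructions that follow are in Intel form, with no % in front of register names. The closing .att_syntax prefix directive switches back again, because the rest of the code the compiler generates is still in AT&T syntax.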
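Here is a self-contained sketch of the Intel form, assuming GCC on x86 with its default AT&T output (that is, not compiled with -masm=intel). The name add_intel is made up for illustration. The registers are pinned with the ‘a’ and ‘b’ constraints and named directly in the template, which sidesteps mixing operand placeholders with noprefix mode.

```c
/* Hypothetical helper: adds two ints using Intel-syntax inline assembly. */
static int add_intel(int a, int b)
{
    __asm__(".intel_syntax noprefix\n\t"  /* switch the assembler to Intel form */
            "add eax, ebx\n\t"            /* destination first: eax = eax + ebx */
            ".att_syntax prefix\n\t"      /* switch back for the compiler's own output */
            : "+a"(a)                     /* a is read and written in eax */
            : "b"(b)                      /* b is read from ebx */
            : "cc");                      /* the add changes the flags */
    return a;
}
```

Note the operand order: add eax, ebx here does the same job as addl %%ebx,%%eax in the AT&T example, and add_intel(2, 3) likewise returns 5.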
Embed Volatile Assembly
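In C you can qualify an asm statement as volatile. What is a volatile asm statement? Volatile informs the compiler that this piece of code has side effects it cannot see, so the optimizer must not delete it, even if its outputs are never used, or hoist it out of a loop. GCC may still move a volatile statement relative to the surrounding code, so if ordering against memory accesses matters, add "memory" to the modify list as well. For example: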
__asm__ __volatile__("<asm routine>" : output : input : modify);
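As a concrete sketch, assuming an x86 processor and GCC or Clang, the rdtsc instruction reads the CPU’s time-stamp counter into edx:eax. The function name read_tsc is made up for illustration. Without the volatile qualifier the compiler would be free to treat two calls as producing the same value and merge them, which defeats the point of reading a counter.

```c
#include <stdint.h>

/* Hypothetical helper: returns the 64-bit time-stamp counter. */
static uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    /* volatile: every call must really execute rdtsc */
    __asm__ __volatile__("rdtsc"
                         : "=a"(lo), "=d"(hi)); /* low half in eax, high half in edx */
    return ((uint64_t)hi << 32) | lo;
}
```

Reading the counter before and after a piece of code and subtracting the two values gives a rough cycle count for it.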
Now you can become the hacker you always dreamed of becoming!