It is often a good idea to link assembly language programs or routines with high-level programs, which may offer resources that are unavailable through direct assembly programming, such as C’s built-in graphics library functions or string-processing functions. Conversely, it is often necessary to include short assembly routines in a compiled high-level program to take advantage of the speed of machine language.
All high-level languages have specific calling conventions that allow one language to communicate with another, that is, to pass variables, values, and so on. An assembly language routine written to work with a high-level language must follow these conventions if the two are to be integrated successfully. High-level languages usually pass parameters to subroutines on the stack, and C is no exception.
To ensure that the assembly language procedure and the C program link together correctly, follow these steps:
- Declare the procedure label global by using the GLOBAL directive. Also declare global any data that will be used.
- Use the EXTERN directive to declare global data and procedures as external. It is best to place the EXTERN statement outside the segment definitions and to place near data inside the data segment.
- Follow the C naming conventions: precede all names (both procedures and data) with underscores.
On entry to a procedure, it is necessary to set up a stack frame through which the passed parameters can be accessed. Of course, if the procedure does not use the stack, this is not necessary. To set up the stack frame, include the following code at the start of the procedure:
    push ebp
    mov ebp, esp
With the frame in place, EBP serves as a fixed base pointer into the stack, and it should not be altered during the procedure unless care is taken to restore it. Each parameter passed to the procedure can now be accessed at an offset from EBP. This is commonly known as a “standard stack frame.”
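The offsets from EBP can be illustrated with a small C model of the frame. This is only a sketch under stated assumptions: a 32-bit target, every argument occupying one four-byte slot, and the frame represented as a plain array. The names arg_offset and read_arg are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 32-bit stack frame after `push ebp` / `mov ebp, esp`.
   Slot 0 is the location EBP points at; higher indices are higher addresses. */
enum {
    SLOT_SAVED_EBP   = 0,   /* [ebp]     caller's EBP        */
    SLOT_RETURN_ADDR = 1,   /* [ebp+4]   return address      */
    SLOT_FIRST_ARG   = 2    /* [ebp+8]   first argument      */
};

/* Byte offset from EBP of the n-th argument (n = 0 is the first),
   assuming each argument takes exactly one 4-byte slot. */
int arg_offset(int n)
{
    assert(n >= 0);
    return (SLOT_FIRST_ARG + n) * 4;
}

/* Read the n-th argument out of a modeled frame. */
uint32_t read_arg(const uint32_t *ebp, int n)
{
    return ebp[arg_offset(n) / 4];
}
```

Under these assumptions, the first argument sits 8 bytes above EBP and the second 12 bytes above it, because the saved EBP and the return address occupy the two slots in between.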
The procedure must preserve the contents of the registers EBX, ESI, EDI, EBP, and all segment registers. If these registers are corrupted, the program may misbehave or crash after returning to the calling C program.
C passes arguments to procedures on the stack. For example, consider the following statements from a C main program:
    extern int Sum();
    int a1, a2, x;
    x = Sum(a1, a2);
When C executes the function call to Sum, it pushes the input arguments onto the stack in reverse order (a2 first, then a1), then executes a call to Sum. Upon entering Sum, the stack holds, from the top down: the return address at [ESP], a1 at [ESP+4], and a2 at [ESP+8].
Since a1 and a2 are declared as int variables, each takes up one doubleword (four bytes) on the stack. The above method of passing input arguments is called passing by value. The code for Sum, which outputs the sum of the input arguments via register EAX, might look like the following:
    _Sum:
        push ebp            ; create stack frame
        mov ebp, esp
        mov eax, [ebp+8]    ; grab the first argument
        mov ecx, [ebp+12]   ; grab the second argument
        add eax, ecx        ; sum the arguments
        pop ebp             ; restore the base pointer
        ret
Two points are worth noting. First, the assembly code returns the result to the C program implicitly through EAX. Second, a simple RET is all that is needed to return from the procedure, because the C caller removes the passed parameters from the stack.
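Seen from the C side, passing by value means the callee works on copies. The following sketch uses a C body as a stand-in for the assembly routine, so it can be compiled on its own; in a real build only the prototype would appear in the C file. The helper call_sum is a hypothetical name introduced for this illustration.

```c
#include <assert.h>

/* Stand-in for the assembly routine _Sum. In a real build this body
   would live in the assembly file. */
int Sum(int a, int b)
{
    a = a + b;      /* changes only the callee's copy of the first argument */
    return a;       /* on 32-bit x86 the result is handed back in EAX */
}

/* Hypothetical caller mirroring the C fragment in the text. */
int call_sum(int a1, int a2)
{
    int before = a1;
    int x = Sum(a1, a2);
    assert(a1 == before);   /* the caller's a1 is untouched: pass by value */
    return x;
}
```

Because only values are copied onto the stack, nothing Sum does to its parameters is visible to the caller except the return value.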
Unfortunately, passing by value has the drawback that we can only return one output value. What if Sum must output several values, or if Sum must modify one of the input variables? To accomplish this, we must pass arguments by reference. In this method of argument transmission, the addresses of the arguments are passed, not their values. The address may be just an offset, or both an offset and a segment. For example, suppose Sum wishes to modify a2 directly, perhaps storing the result in a2 such that a2 = a1 + a2. The following function call from C could be used:
    Sum(a1, &a2);
The first argument is still passed by value (i.e., only its value is placed on the stack), but the second argument is passed by reference (its address is placed on the stack). The “&” prefix means “address of.” We say that &a2 is a “pointer” to the variable a2. Using the above statement, the stack upon entering Sum holds, from the top down: the return address at [ESP], the value of a1 at [ESP+4], and the address of a2 at [ESP+8].
Note that the address of a2 is pushed on the stack, not its value. With this information, Sum can access the variable a2 directly. (Hint: use an index register to hold the offset, then use a memory access to access the variable).
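The effect of passing by reference can be sketched in C before it is written in assembly. The function below is a C stand-in, not the assembly solution; the name SumRef is hypothetical (chosen to keep it apart from the by-value Sum), and returning the new value as well is an assumption made for convenience.

```c
#include <assert.h>

/* C stand-in for an assembly routine that receives a1 by value
   and a2 by reference. */
int SumRef(int a1, int *a2)
{
    assert(a2 != 0);        /* the sketch requires a valid address */
    *a2 = a1 + *a2;         /* store the sum through the pointer: a2 = a1 + a2 */
    return *a2;             /* assumption: also hand the sum back as the result */
}
```

Calling it as SumRef(a1, &a2) changes the caller's a2, which is exactly what the by-value version cannot do. An assembly version would load the pointer from the stack into a register and then read and write memory through that register.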
Assembly can return values to the C calling program using only the EAX register. If the returned value is only four bytes or less, the result is returned in register EAX. If the item is larger than four bytes, a pointer is returned in EAX which points to the item. Here is a short table of the C variable types and how they are returned by the assembly code:
Temporary storage space for local variables or data can be created by decreasing the contents of ESP just after setting up a stack frame at the beginning of the procedure. It is important to restore the stack space at the end of the procedure. The following code fragment illustrates the basic idea:
    push ebp        ; Save caller's stack frame
    mov ebp, esp    ; Establish new stack frame
    sub esp, 4      ; Allocate local data space of 4 bytes
    push esi        ; Save critical registers
    push edi
    ...
    pop edi         ; Restore critical registers
    pop esi
    mov esp, ebp    ; Restore the stack
    pop ebp         ; Restore the frame
    ret             ; Return to caller
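In C, the same idea appears as an ordinary local variable. The sketch below is a hypothetical example (the name sum_to is not from the text) showing that each call conceptually gets its own local storage, much like the four bytes reserved by SUB ESP, 4. A real compiler may keep the local in a register instead.

```c
#include <assert.h>

/* Adds the integers 1..n recursively. Each activation has its own
   `partial`, so a nested call cannot overwrite the caller's copy. */
int sum_to(int n)
{
    int partial;                /* per-call local storage */

    assert(n >= 0);             /* the sketch only handles non-negative n */
    if (n == 0)
        return 0;

    partial = sum_to(n - 1);    /* the nested call uses a separate frame */
    return partial + n;         /* this call's local is still intact */
}
```

The storage disappears when the function returns, which corresponds to restoring ESP from EBP at the end of the assembly procedure.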
In most cases, calling C library routines or functions from an assembly program is more complex than calling assembly programs from C. An example of how to call the printf library function from within an assembly program is shown next, followed by comments on how it actually works.
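    global _main
    extern _printf

    section .data
    text      db "291 is the best!", 10, 0
    strformat db "%s", 0

    section .text
    _main:
        push dword text         ; second printf argument: the string
        push dword strformat    ; first printf argument: the format
        call _printf
        add esp, 8              ; remove both arguments from the stack
        ret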
Notice that the procedure is declared global, and its name must be _main, which is the starting point of all C code.
Since C expects its arguments on the stack in reverse order, the offset of the string is pushed first, followed by the offset of the format string. The C function can then be called, but care must be taken to restore the stack once it returns; here ADD ESP, 8 discards the two four-byte arguments.
When linking the assembly code, include the standard C library (or the library containing the functions you use) in the link. For a more detailed (and perhaps more accurate) description of the procedures involved in calling C functions, refer to another text on the subject.
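For comparison, the printf call made by the assembly program corresponds roughly to the following C. This is a sketch for orientation only; the wrapper name print_message is hypothetical, and the data names are copied from the assembly listing.

```c
#include <assert.h>
#include <stdio.h>

/* The same two data items the assembly program defines. */
static const char text[]      = "291 is the best!\n";
static const char strformat[] = "%s";

/* Equivalent of: push text / push strformat / call _printf / add esp, 8.
   The C compiler generates the pushes and the stack cleanup itself. */
int print_message(void)
{
    int written = printf(strformat, text);
    assert(written >= 0);       /* printf returns a negative value on error */
    return written;             /* number of characters written */
}
```

The arguments appear left to right in the C source, while the assembly version pushes them right to left so that the format string ends up nearest the top of the stack.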
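For comparison, a fragment in AT&T (GNU assembler) syntax that forms the same sum:
    movl %esp, %ebp
    movl 12(%ebp), %eax
    movl 8(%ebp), %edx
    leal (%edx,%eax), %eax
and the corresponding lines of _Sum in NASM syntax:
    mov ebp, esp
    mov eax, [ebp + 8]
    mov ecx, [ebp + 12]
    add eax, ecx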