This is a quick tutorial on Assembly Language and some related lab work I got from my TA way back when and have edited:
General Info:
To compile a C program, use the following command:
gcc -g -c -o drv.o drv.c
gcc is the compiler command. -g tells the compiler to include debugging
information. -c says to compile, but do not link. -o sets the name of the
output file to the string that follows. So this command compiles the file
drv.c into an object file without linking it, includes debugging information,
and names the output drv.o.
To assemble an assembly program, use the following command:
as -g -o asm.o asm.s
Similarly, as is the assembler command. -g prepares it for debugging, and -o
renames the object file. So this command assembles asm.s, readies it for
debugging, and renames the output to asm.o.
You can link those outputs using:
gcc -g -o myprog asm.o drv.o
This uses the gcc command to link the two object files, keep them ready for
debugging, and name the output executable myprog.
Or, you can do it the easy way:
gcc -g -o myprog drv.c asm.s
This does everything in one step. It compiles and assembles both files, links
them, readies the result for debugging, and puts the output executable in
myprog.
You can change input/output names as needed, but C programs need a .c
extension, and Assembly programs need a .s extension.
If these commands work, you'll get dropped back to the command line with no
output. If there are errors, they'll be listed for you. You can either try to
fix errors by reading through the code, or you can run the debugger.
You run the program itself by typing either the name of the program (myprog
from above) and hitting enter, or typing ./myprog, which tells the terminal to
look in the current directory (./) for the executable name (myprog).
To run the debugger, type "gdb myprog" without quotes, substituting whatever
program name you used. From there, you'll be brought to another prompt for the
debugger. To run the program as normal from here, type "run". You can set
breakpoints by typing "break [line number]". You can step through the code by
typing "step". There are a bunch more options here as well, which you can read
about by typing "help" or getting more information at one of the links at the
beginning of this email. Typing "quit" exits the debugger. Stepping through
the program can be really useful when trying to locate the place where a
control structure acted incorrectly.
Lab 2:
Lab 2 introduces simple assignments, variables, registers, and instructions.
The eventual goal here is to be able to implement C commands such as
"a = ((b + c) - (d + e)) - 10;" in Assembly. For each program you'll be working
on, you will be given a piece of C driver code, and you will have to write an
Assembly function that implements the C function described in the assignment.
In most computer programs, you deal with memory locations that store data
(variables) and instructions that manipulate that data. In Assembly, you also
deal with registers. You can store values in registers, as well as use
registers in computations. Registers are fast, temporary variables located
within the microprocessor itself, where normal variables exist in memory.
There's a limited number of registers that can be used, but if those can be
used, it is desirable due to speed. You can't just use memory variables either.
The 8086 processor limits the use of memory variables to a MAXIMUM of ONE
memory variable per computation. I'm going to repeat that for emphasis: you
cannot use two memory variables in a computation. You must use
AT LEAST one register or a constant.
In Intel 8086 assembly language, you can only do one computation at a time, and
you can only use one memory variable per computation. So the command from
above, "a = ((b + c) - (d + e)) - 10;" cannot be implemented in a
single line. In fact, Assembly is so low level that this takes many lines.
For the processor we're using, there are four general purpose registers: A, B,
C, and D. That previous line is implemented in Assembly like so:
.comm a, 4
.comm b, 4
.comm c, 4
.comm d, 4
.comm e, 4
.text
movl b,%eax       # move b into register %eax
addl c,%eax       # add c to register %eax
movl d,%ebx       # move d into register %ebx
addl e,%ebx       # add e to register %ebx
subl %ebx,%eax    # subtract register %ebx from register %eax
subl $10,%eax     # subtract 10 from register %eax
movl %eax,a       # move register %eax out to a
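The same sequence can be sketched in C, one operation per line. This is only an illustration: compute_a is a hypothetical name, and the locals eax and ebx merely stand in for the registers. It assumes 4 byte signed integers, as declared with .comm above.

```c
#include <stdint.h>

/* Hypothetical helper that mirrors the assembly listing line by line.
   The locals eax and ebx play the role of the registers %eax and %ebx. */
int32_t compute_a(int32_t b, int32_t c, int32_t d, int32_t e)
{
    int32_t eax, ebx;

    eax = b;        /* movl b,%eax    */
    eax += c;       /* addl c,%eax    */
    ebx = d;        /* movl d,%ebx    */
    ebx += e;       /* addl e,%ebx    */
    eax -= ebx;     /* subl %ebx,%eax */
    eax -= 10;      /* subl $10,%eax  */
    return eax;     /* movl %eax,a    */
}
```

Each C statement touches at most one memory variable, which is the same restriction the assembly version has to obey.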
Let's go through this bit by bit. Like in C, you need to declare variables
before you can use them. There are several assembler directives you can use,
depending on what you're trying to do, but the most common one (and the only
one we should need this semester) is the .comm command. It takes two arguments.
The first is the variable name, and the second is the number of bytes the
variable occupies. Note, there is no type information. The command format for
most of what we'll be doing is "instruction firstarg, secondarg".
A note here about writing in assembly: there is no symbol to end a line of code
(in C, you end all lines with a ";"). To comment your code, multi-line comments
use the C-style /*comment goes here*/, and a single-line comment starts with
"#". Do not use ";" for comments: the GNU assembler for x86 treats ";" as a
statement separator, so it tries to assemble whatever follows it as another
instruction.
The above commands do not initialize the data either. If you want to initialize
your variables, you can use a command like "b: .int 10",
which declares an integer b and initializes it to 10. More examples can be
found in the manual. So all those .comm commands up above declare variables
a,b,c,d, and e of size 4 bytes (integers). For most of the programs I'll be
sending you, there should be sections at the bottom labeled "declare
variables here." That is where you should put the declarations.
The next bit is arithmetic operators. As you may have noticed, assembly doesn't
use standard symbols (+,-,*,/,etc). Each operation has its own instruction.
Addition uses the add instruction, subtraction uses sub, multiplication mul,
and division div. The closest thing to an "=" is the move command, or
mov. It allows you to move the value from one register/variable to another,
essentially setting them equal. However, there are suffixes that are attached
to each of these depending on the size of the data being operated on. Each
piece of data being used above is 4 bytes, which is a "long-word". So
each command has an "l" attached to the end of it. For 2 byte data,
you would attach a "w" for "word". For 1 byte data, you
would use "b" for "byte".
The mov instruction is formatted thus (for 4 byte data): "movl source,
destination". This is the equivalent of "destination = source".
Only one of the source or destination can be a variable, so the other must be
a register or a constant. One use of this command is to move data between
variables and registers.
Arithmetic instructions always modify one of their operands. So "addl c, %eax"
is the equivalent of "%eax = %eax + c" or "%eax += c". For these math
instructions, the second argument is always the destination.
The add command is used thus (for 4 byte data): "addl source,
destination". This is equivalent to "destination += source".
Similarly, subtraction is "subl source, destination" which is the
equivalent of "destination -= source". This is the most complicated
you can get with math commands: one addition/subtraction/equals per line.
I mentioned earlier that there are registers that we can use. The general
purpose registers are A, B, C, and D. You can access these registers by saying
%eax, %ebx, %ecx, and %edx, respectively. These are the 4 byte registers.
However, we can operate on several different sizes of data: 4 byte, 2 byte,
and 1 byte. The 2 byte registers are similarly accessed using %ax, %bx, %cx,
and %dx. These are the lower 16 bits of the corresponding 32 bit registers. They
are NOT separate. Each of those 16 bit registers can be thought of as a pair of
8 bit registers. So %ax is split into %ah and %al (a, high and low bytes), %bx
into %bh, %bl, then you have %ch, %cl, %dh, %dl similarly. These are NOT
different registers, they just allow you to access smaller chunks of the larger
registers. So, for instance, "addb $2, %al" is an 8 bit instruction,
and "addw $2, %ax" is a 16 bit instruction.
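To make the overlap concrete, here is a small C model of how %ax, %ah, and %al are views into %eax. The function names (get_ax, get_al, get_ah, addb_al, addw_ax) are made up for this sketch. It models only the register contents, not the processor flags.

```c
#include <stdint.h>

/* %ax is the low 16 bits of %eax. */
uint16_t get_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* %al is the low byte of %ax, %ah is the high byte of %ax. */
uint8_t get_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }
uint8_t get_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }

/* Models "addb $imm, %al": only the low byte changes. A carry out of
   bit 7 is not added into %ah. */
uint32_t addb_al(uint32_t eax, uint8_t imm)
{
    uint8_t al = (uint8_t)(get_al(eax) + imm);   /* wraps modulo 256 */
    return (eax & 0xFFFFFF00u) | al;
}

/* Models "addw $imm, %ax": only the low 16 bits change. */
uint32_t addw_ax(uint32_t eax, uint16_t imm)
{
    uint16_t ax = (uint16_t)(get_ax(eax) + imm); /* wraps modulo 65536 */
    return (eax & 0xFFFF0000u) | ax;
}
```

For example, with 0x123456FF in %eax, the 8 bit add of 2 gives 0x12345601, while the 16 bit add of 2 gives 0x12345701, because the carry reaches %ah only in the 16 bit case.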
Hopefully, by now you can look at the C code and Assembly equivalents above and
figure out how they are equivalent. Note, $10 denotes a constant number 10.
Multiplication and division get a little bit more complicated, however. If you
take two 32 bit numbers and multiply them, what do you get? Up to a 64 bit
number. So you would need two registers to store the result. The multiplication
instruction we'll be using is "mul". This takes ONE argument. That argument is
multiplied against the A register (it's ALWAYS the A register). The result is
placed in the A register (and if the number is big enough, the D register). Two
32 bit numbers being multiplied together will put the lower 32 bits of the
result into the A register and the upper 32 bits of the result into the D
register. To multiply two values, you must first move one of them into the A
register. For an 8 bit multiplication, the result goes in %ah:%al (that is,
%ax). For 16 bits, %dx:%ax. For 32 bits, %edx:%eax.
Formulas:
mulb X (X is 8 bit)  ⇒ %ax = %al * X
mulw X (X is 16 bit) ⇒ %dx:%ax = %ax * X
mull X (X is 32 bit) ⇒ %edx:%eax = %eax * X
Those formulas show the result for each operand size. For instance, if you
use "mull some32bitnumber", the result gets put in the concatenation
of %edx:%eax, and that result is %eax * some32bitnumber.
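As a rough C model of the 32 bit case, the product can be computed in a 64 bit integer and then split into the two halves. The names model_mull and reg_pair are invented for this sketch, and it assumes unsigned operands.

```c
#include <stdint.h>

/* The two result registers of a 32 bit multiply. */
struct reg_pair {
    uint32_t edx;   /* upper 32 bits of the product */
    uint32_t eax;   /* lower 32 bits of the product */
};

/* Models "mull x" with the given value already in %eax. */
struct reg_pair model_mull(uint32_t eax, uint32_t x)
{
    uint64_t product = (uint64_t)eax * (uint64_t)x;
    struct reg_pair r;

    r.edx = (uint32_t)(product >> 32);
    r.eax = (uint32_t)(product & 0xFFFFFFFFu);
    return r;
}
```

Multiplying 100000 by 100000 gives 10000000000, which is 0x2540BE400, so the model returns 2 for %edx and 0x540BE400 for %eax. A small product such as 3 * 4 leaves %edx at 0.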
Divide is similar. It takes one operand, and the source and destination
registers are chosen automatically. Again, the operation is performed against
the A register. However, it also pulls in the D register. So for a 32 bit
division, you're actually dividing the 64bit %edx:%eax by a 32 bit number. The
32 bit result gets placed in %eax, and the remainder gets put in %edx. Note, before
dividing, you need to zero the second register (D), unless you have something
there that you want. If you're storing a totally different value in the D
register, it could throw off your division.
Division formulas:
divb X (X is 8 bit)  ⇒ %al = %ax / X, %ah = remainder
divw X (X is 16 bit) ⇒ %ax = %dx:%ax / X, %dx = remainder
divl X (X is 32 bit) ⇒ %eax = %edx:%eax / X, %edx = remainder
Example use of formula: If you use the command "divl
some32bitnumber", the result that is placed in the register %eax is the
concatenation %edx:%eax divided by that32bitnumber. %edx gets the remainder of
the division. Note, the remainder is how you do modulus.
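The 32 bit division can be modeled in C in the same spirit. The names model_divl and div_result are invented here. The sketch assumes an unsigned, nonzero divisor and a quotient that fits in 32 bits; on real hardware a quotient that does not fit raises a divide error instead of being truncated.

```c
#include <stdint.h>

/* The two result registers of a 32 bit divide. */
struct div_result {
    uint32_t eax;   /* quotient  */
    uint32_t edx;   /* remainder */
};

/* Models "divl x" with the 64 bit dividend held in %edx:%eax.
   Assumes x != 0 and that the quotient fits in 32 bits. */
struct div_result model_divl(uint32_t edx, uint32_t eax, uint32_t x)
{
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    struct div_result r;

    r.eax = (uint32_t)(dividend / x);
    r.edx = (uint32_t)(dividend % x);
    return r;
}
```

With %edx zeroed, 17 divided by 5 gives quotient 3 and remainder 2. If a stale 1 is left in %edx, the dividend becomes 4294967313 and the quotient is 858993462, which is why the D register needs to be cleared first.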
Lab3:
Lab 4:
Lab 4 is about different addressing modes, arrays, and pointers. Addressing
modes are how the computer selects the data being used by an instruction. They
are determined by how you specify the operand of the instruction.
Data are the numerical values used in computations. For instance, if you have
the value 3 in the register %eax, your data value is 3. If you used %eax, the
operand would be %eax, while the data is 3. The addressing mode is the
relationship between the operands and the data.
The most basic versions of addressing are register addressing and immediate
addressing. In register addressing, the data is in a register. In immediate
addressing, the data is supplied in the instruction.
The next type of addressing is direct addressing, where the memory address is
supplied with the instruction. In most cases, the address will be supplied at
compile time when a memory variable is given an address. Thus, the operand is
the memory address corresponding to your variable. The data is the contents of
that memory address.
So we know how to address normal variables, such as ints, chars, or floats.
What about arrays? How does one use an array in assembly? It's almost the exact
same as declaring a single variable in assembly.
.comm a, 4
declares a single integer.
.comm a, 40
declares an array of ten integers. The only difference is how much space is
declared. There are 4 bytes per integer. We want 10 integers, so we declare 40
bytes. The symbol 'a' is the equivalent of the address of the first byte of
data. This also initializes all ten integers to zero.
To initialize the values to something else, you could also use the .fill
directive.
a: .fill 10, 4, 0
should create 10 integers of size 4, with the value 0.
To initialize the first values to 1,2,3, and the rest to zero, we would use
a: .int 1,2,3
.fill 7,4,0
which creates 3 ints with the values 1, 2, and 3. And then 7 variables of size
4, with the value 0.
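For comparison, roughly the same storage can be declared in C. This is only an analogy: the names a_comm, a_init, and count_zeros are invented, and the sizes match the 40 bytes above only if int is 4 bytes.

```c
#include <stddef.h>

/* Like ".comm a, 40": ten ints, all zero. */
int a_comm[10];

/* Like "a: .int 1,2,3" followed by ".fill 7,4,0":
   the first three values are given, the remaining seven are zero. */
int a_init[10] = {1, 2, 3};

/* Counts how many of the first n elements are zero. */
size_t count_zeros(const int *arr, size_t n)
{
    size_t zeros = 0;
    for (size_t k = 0; k < n; k++) {
        if (arr[k] == 0) {
            zeros++;
        }
    }
    return zeros;
}
```

Calling count_zeros on the first array reports ten zeros, and on the second it reports seven.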
So how does one access the elements of the array? Let's take a look at a piece
of C code here:
int a[10]; /* an array of 10 integers */
int i;
int main(void)
{
    a[0] = 10;
    a[4] = 20;
    a[i] = 30;
    return 0;
}
We've already discussed declaring the array, so here's how to access the
different pieces.
a[0] is the first integer in the array. Since a is the address of the first
integer, we can use direct addressing:
movl $10, a
The second example is just a touch trickier. We're trying to access the fifth
element in the array. However, this will still be direct addressing, because
we'll be using an address. The command is
movl $20, a+16
Remember, a is the address of the first integer. Each integer is 4 bytes. So if
you increment the address by 4 bytes, you get the second integer. 8, the third.
12, the fourth. Incrementing the address by 16 gets you the fifth integer.
Again, this is direct addressing. At compile-time, a is translated into an
address and 16 is added to it, which results in another address.
If you want to reference an address directly, you can simply write:
movl 4096, %eax
A number by itself, with no $, is considered an address.
The hardest bit of the short example above is the a[i] = 30. This is different
because the i is a variable, so this address isn't a constant (it changes
during execution). What we're going to be using to access this is called
indexed (or direct indexed) addressing. We're going to use a register along
with a constant to compute the memory address during run-time. The constant
(the displacement) is going to be a number (address) or a label (variable name,
equivalent to an address). To use indexed addressing, you can use two special
index registers: %esi and %edi. %esi is the source index, and %edi is the
destination index. For the most part, you can use these interchangeably.
To implement the C command a[i] = 30; you would write:
movl $30, a(,%edi,4)
You should also be able to use %esi for that if you want. The code above uses a
(the displacement), and then increments the address of a by the value of %edi
multiplied by 4. 4 is the size of the variables we have (ints). So the i from
the C command would be in %edi in the assembly command. If you were using, say,
chars, the 4 above would be a 1 instead. Note, there is an initial comma after
the opening parenthesis. This is there because there is actually a third field
that we are leaving blank for this addressing mode.
The entirety of the C snippet translated into assembly would look like this:
.comm a, 40
.comm i, 4
movl $10, a               # a[0] = 10;
movl $20, a+16            # a[4] = 20;
movl i, %edi
movl $30, a(,%edi,4)      # a[i] = 30;
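The address arithmetic behind these three stores can be spelled out in C using byte pointers. The helper names indexed and run_snippet are hypothetical, and the sketch assumes 4 byte integers to match the scale factor of 4.

```c
#include <stddef.h>
#include <stdint.h>

int32_t a[10];

/* Models the operand disp(,index,scale):
   address = disp + index * scale, counted in bytes. */
int32_t *indexed(int32_t *disp, size_t index, size_t scale)
{
    return (int32_t *)((unsigned char *)disp + index * scale);
}

/* Performs the three stores from the listing; i plays the role of %edi. */
void run_snippet(size_t i)
{
    *indexed(a, 0, 4) = 10;                       /* movl $10, a            */
    *(int32_t *)((unsigned char *)a + 16) = 20;   /* movl $20, a+16         */
    *indexed(a, i, 4) = 30;                       /* movl $30, a(,%edi,4)   */
}
```

The first two addresses are fixed before the program runs, while the third depends on the value of i at run time, which is the whole difference between direct and indexed addressing.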
Another situation in which the address of the data is not known at assembly
time is pointers. Pointers are variables that hold an address of another piece
of data. They do not hold the piece of data themselves, but they point to
somewhere else. To access pointers (dereferencing), use %ebx.
You can use pointers as below:
C:
int *p; /* must point at valid storage before main runs the store */
int main(void)
{
    *p = 40;
    return 0;
}
Assembly:
.comm p, 4
movl p, %ebx
movl $40, (%ebx)
The addresses we're using are four byte addresses. Therefore, the pointer p
needs to be four bytes. To dereference the pointer p, move it to %ebx, and
surround the register with parentheses. The final command above says to move 40
into the data that %ebx points to. This is called register indirect addressing.
You're indirectly addressing the data through a pointer held in a register.
You cannot dereference a pointer held in a variable, it must be in a register
first.
There is one final case to worry about: What if we have a pointer to an array?
We use base indexed addressing, which is similar to indexed addressing. We'll
be using two registers: %ebx, which holds the base address (the manual always
uses %ebx for this), and %esi or %edi, which will hold the index. You can also
use a constant offset to index into a struct, depending on the situation.
Example:
int *ap;
struct {int a,b;} *asp; /* struct has two vars, a and b */
int i;
int main(void)
{
    ap[3] = 50;
    ap[i] = 60;
    asp[i].b = 70;
    return 0;
}
which is coded in assembly language as:
.comm ap, 4
.comm asp, 4
.comm i, 4
. . .
movl ap, %ebx
movl $50, 12(%ebx)            # ap[3] = 50;
movl i, %edi
movl $60, (%ebx, %edi, 4)     # ap[i] = 60;
movl asp, %ebx                # i is still in edi
movl $70, 4(%ebx, %edi, 8)    # asp[i].b = 70;
How does this work? ap is a pointer. The second command accesses the integer at
a 12 byte offset from where ap is pointing (12 byte offset is the fourth
integer, ap[3]). For the next command, the index i is moved into %edi, one of
the index registers. Now, the next command is using base indexed addressing:
movl $60, (%ebx, %edi, 4). This command figures out where %ebx (the pointer ap)
is pointing to, and adds an offset equal to %edi (index i) * 4 (size of each
variable) to it to get the address that is needed to access that slot in the
array.
The next command is actually even more complicated, since it's accessing an
array of structs. Again, you're using %ebx (pointer to array), and adding %edi
* 8 (pretty sure the 4 in the manual is a typo, as the struct is two ints, or
size 8) to the address. However, int b in the struct is desired, so there's
another 4 byte offset (to reach the second int in the struct, b), which goes
outside the parentheses. Thus the final command is:
movl $70, 4(%ebx, %edi, 8)    # asp[i].b = 70;
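Note, for most of these addressing modes (indexed, register indirect, etc),
%esi, %edi, and %ebx can be used interchangeably. When specifying two
registers (base indexed), the manual requires that the base register be %ebx.
That rule comes from 16 bit 8086 addressing. With 32 bit registers the
processor accepts any general purpose register as the base, but sticking to
%ebx keeps your code consistent with the manual.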
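All of the memory operands in this lab follow one pattern, displacement(base, index, scale), and the address is displacement + base + index * scale. A C sketch of that calculation follows. The names effective_address, run_example, and struct pair are invented, and it assumes 4 byte ints with no padding inside the struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the anonymous struct in the C example. */
struct pair {
    int32_t a;
    int32_t b;
};

/* Models disp(base, index, scale), counted in bytes. */
void *effective_address(ptrdiff_t disp, void *base, size_t index, size_t scale)
{
    return (unsigned char *)base + disp + index * scale;
}

/* Performs the three stores from the listing; i plays the role of %edi. */
void run_example(int32_t *ap, struct pair *asp, size_t i)
{
    *(int32_t *)effective_address(12, ap, 0, 1) = 50;   /* movl $50, 12(%ebx)        */
    *(int32_t *)effective_address(0, ap, i, 4) = 60;    /* movl $60, (%ebx,%edi,4)   */
    *(int32_t *)effective_address(4, asp, i, 8) = 70;   /* movl $70, 4(%ebx,%edi,8)  */
}
```

The scale of 8 and the displacement of 4 in the last store correspond to sizeof(struct pair) and offsetof(struct pair, b) under the stated assumptions.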
Recap (straight from the manual):
immediate addressing
The instruction actually contains the operand itself rather than an address or
other information describing where the operand is.
movl $4, %eax
register addressing
The data is in a register.
movl %eax, %ebx
direct addressing
The memory address of the data is supplied with the instruction. The assembler
does this by assigning an address to the symbol variable with respect to an
offset.
movl %eax, mem_variable
or
movl %eax, mem_variable+16
indexed addressing
The contents of a register along with a constant are used to compute the memory
address of the data. The constant is called the displacement.
movl $30, a(,%edi,4)
register indirect addressing
The %ebx register holds the address of the data to be addressed.
movl $40, (%ebx)
base-indexed addressing
The %ebx register holds the base of the address (like that of an array) and
%esi or %edi hold the index. The constant (4 in the following example)
corresponds to the number of bytes of each element in the array.
movl $60, (%ebx, %edi, 4)
Lab 5:
Lab 5 is all about subroutines (functions). Any program label can be a
subroutine. This means any of the labels you've been able to jump to. To make a
label into a subroutine, you simply use the "call" command rather than the jmp
(or variations thereof) command. The only difference between a jmp and a call
is that call stores the return address, which is the address of the instruction
right after the call. That way, if you use the "ret" command, the program knows
where to go back to: the instruction that follows the call. ret, by the way, is
how you return from a subroutine. So "call labelname" to start a subroutine,
"ret" to return.
You can also nest subroutines, where one subroutine calls another which calls
another, and so on. The return addresses for all these nested calls are all
stored on something called the "stack."
The stack is special data storage that has two basic functions:
"push" and "pop." Push adds something onto the stack, pop
removes something from the stack. Order, however, is important here. Like a
stack of papers, or a stack of plates, if you push something onto the stack, it
goes on top. Likewise, if you pop (remove) something from the stack, you remove
the object that is on top. You wouldn't want to yank the bottom plate out of a
stack of fine china, would you? This is a last-in, first-out way of doing
things. If you push 1, 2, 3, onto the stack, and then pop three times, you get
3, 2, 1.
So what does this have to do with subroutines? When you do a "call"
instruction, the return address (the address of the next instruction) is pushed
onto the top of the stack. The call command is, essentially, a combination of a
push and a jmp command. If ret is used, it pops the return address (stored by
call). ret is a combination of pop and jmp. So, let's look at the example from
the manual.
Your program executes a call instruction at address 100 (each instruction
exists in memory and has its own address) that jumps to an instruction at
address 200. The return address, 104 (the next instruction to be executed after
the call, 4 bytes ahead), is automatically pushed onto the stack by the call
command. Now, this subroutine has another call command at address 220 that
jumps to a third subroutine at address 300. This automatically pushes the
return address 224 onto the stack. This subroutine uses ret, which pops 224 off
the stack and then jumps to it. The previous subroutine continues executing
where it left off. The second subroutine executes a ret command, which pops the
previous return address, 104, off the stack and jumps to it. The first routine
continues executing.
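The walkthrough above can be traced with a tiny C model in which call pushes a return address and ret pops one. Everything here is hypothetical scaffolding (struct machine, do_call, do_ret). It follows the example's assumption that the call instruction is 4 bytes long, and it does no bounds checking on the stack.

```c
#include <stdint.h>

#define STACK_SLOTS 16

struct machine {
    uint32_t pc;                   /* address of the current instruction   */
    uint32_t stack[STACK_SLOTS];   /* saved return addresses               */
    int depth;                     /* how many return addresses are stored */
};

/* call = push the address of the next instruction, then jump. */
void do_call(struct machine *m, uint32_t target, uint32_t insn_len)
{
    m->stack[m->depth++] = m->pc + insn_len;
    m->pc = target;
}

/* ret = pop the most recent return address, then jump to it. */
void do_ret(struct machine *m)
{
    m->pc = m->stack[--m->depth];
}
```

Starting with pc at 100, do_call to 200 saves 104. Moving pc to 220 and calling 300 saves 224. The first do_ret lands on 224 and the second on 104, matching the sequence described above.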
To implement subroutines in assembly, you really only need labels, call, and
ret. There are different things we can do, though.
Let's look at the stack. The stack is just a section of memory. Which section
of memory it is is defined by the %ss (stack segment) register. The top of the
stack is pointed to by the %esp (stack pointer) register. This points to the
location where the last value was pushed, which also contains the next value to
be popped. The stack pointer register is an offset from %ss. The ordering of
the stack is kind of weird, as you can see from the diagrams below. %ss marks
the start of the current section, and %esp is an offset from that start. %ss is
at the top of the diagram, and as you push stuff onto the stack, %esp actually
gets closer to it. This may be counter-intuitive, as it would seem to make more
sense that pushing makes the stack larger, meaning higher addresses. This is
not so. %esp initially points to the highest address (offset) in the stack
segment. You may notice that the numbers are counting up as you go down the
page.
When something is pushed onto the stack, the stack pointer is FIRST
decremented, THEN the value is written into the new location being pointed to.
For a pop, the value is read FIRST, and then %esp is incremented. This keeps
values from being lost or overwritten (move to an empty space before writing,
or read before moving away and losing the location). As more and more things
get added to the stack, the closer %esp gets to %ss.
Here is the sequence of steps if we pushl $1000 (a 4 byte push, so %esp moves
by 4):
 0 |      | <- %ss
 4 |      |
 8 |      |
12 |      |
16 | XXXX | <- %esp   before push
________________________________________
 0 |      | <- %ss
 4 |      |
 8 |      |
12 |      | <- %esp   first, decrement SP
16 | XXXX |
________________________________________
 0 |      | <- %ss
 4 |      |
 8 |      |
12 | 1000 | <- %esp   finally, write 1000
16 | XXXX |
You can see that the decrement occurs first, and then the write, keeping XXXX
from being overwritten.
Pops work just the opposite:
 0 |      | <- %ss
 4 |      |
 8 |      |
12 | 1000 | <- %esp   before pop
16 | XXXX |
________________________________________
 0 |      | <- %ss
 4 |      |
 8 |      |
12 |      | <- %esp   first, read 1000
16 | XXXX |
________________________________________
 0 |      | <- %ss
 4 |      |
 8 |      |
12 |      |           finally, increment SP
16 | XXXX | <- %esp
You can see here that the value is read out of the stack (and stored,
presumably), and then the stack pointer is incremented, losing the location of
the data we just read. The 1000 is actually still there in memory, and will
remain there until it is overwritten.
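If it helps to see the ordering as ordinary code, here is a minimal C sketch of the same push and pop steps. It is only a model: stack_mem, esp, push32, pop32, and peek32 are made-up names, and the 20-byte array just mirrors the 0 to 16 offsets in the diagrams above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 20                   /* offsets 0..16: five 4-byte slots */

static uint8_t  stack_mem[STACK_BYTES];  /* hypothetical stack segment */
static uint32_t esp = 16;                /* offset of the current top of stack */

/* pushl: decrement the stack pointer FIRST, then write the value. */
void push32(uint32_t value)
{
    assert(esp >= 4);                    /* no room left: stack overflow */
    esp -= 4;
    memcpy(&stack_mem[esp], &value, sizeof value);
}

/* popl: read the value FIRST, then increment the stack pointer. */
uint32_t pop32(void)
{
    uint32_t value;
    assert(esp + 4 <= STACK_BYTES);      /* nothing left to pop */
    memcpy(&value, &stack_mem[esp], sizeof value);
    esp += 4;
    return value;
}

/* Look at a slot without moving esp, to show that popped data lingers. */
uint32_t peek32(uint32_t offset)
{
    uint32_t value;
    assert(offset + 4 <= STACK_BYTES);
    memcpy(&value, &stack_mem[offset], sizeof value);
    return value;
}
```

Calling push32(1000) moves esp from 16 to 12, pop32() hands back 1000 and moves esp to 16 again, and peek32(12) still finds the 1000 sitting in memory afterwards.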
The stack is used to store return addresses, but can also be used to store data
for your program. You can use push and pop directly, with the following
syntax:
pushl variablename
popl variablename
The first command takes the value of the variable and pushes it onto the top of
the stack. The second removes the value at the top of the stack and puts it in
variablename. You need to be careful here, because call and ret use the same
stack and have no idea whether you've been pushing and popping. If your pushes
and pops aren't balanced, ret can pop the wrong value as its return address and
jump to the wrong instruction.
So what uses could the stack have as memory storage? Sometimes you need to store
data for one subroutine while another is executing. Suppose we call a subroutine
from a program, but have some values in the registers that we need saved. We
don't want to store them in variables because they're intermediate values, and
the registers will be needed in the next subroutine. What do we do? Push them
onto the stack. Then call the subroutine. When the function returns, pop the
values back off in reverse order. Like so:
.
.
pushl %eax
pushl %ebx
call my_routine
popl %ebx
popl %eax
.
.
Just like that, you've saved %eax and %ebx. Now it doesn't matter what
my_routine does with the registers, because you've got copies in case it
overwrites something. You can use the stack the same way if you run out of
registers in the middle of a computation.
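Here is a small C sketch of that save/call/restore pattern, assuming a toy model where eax, ebx, and stack_slots are plain variables standing in for the real registers and stack. None of these names come from the lab code; they're just for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the machine state; not real registers. */
static uint32_t eax, ebx;
static uint32_t stack_slots[16];
static int top = 16;   /* index of the newest slot (16 means empty); grows downward */

void push(uint32_t v)
{
    assert(top > 0);
    stack_slots[--top] = v;          /* decrement, then write */
}

uint32_t pop(void)
{
    assert(top < 16);
    return stack_slots[top++];       /* read, then increment */
}

/* A routine that is free to trash both registers. */
void my_routine(void)
{
    eax = 0xDEADBEEF;
    ebx = 0xCAFEF00D;
}

/* Mirrors: pushl %eax / pushl %ebx / call my_routine / popl %ebx / popl %eax */
void call_with_saved_registers(void)
{
    push(eax);
    push(ebx);
    my_routine();
    ebx = pop();                     /* last pushed, first popped */
    eax = pop();
}
```

If eax holds 7 and ebx holds 9 before call_with_saved_registers(), they hold 7 and 9 again afterwards, even though my_routine overwrote both. Swap the two pops and the values come back in the wrong registers, which is why the reverse order matters.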
So now we get to the complicated part: local variables and parameters. In C,
you can have different functions or different loops all with their own set of
variables, inaccessible to the other code blocks. There can be more than one
"instance" of a variable with the same name, because they each have a
different scope. This often happens when a function calls itself (recursion).
This can be done by putting the variables on the stack.
The processor family we're using, x86, has a special register for accessing the
stack: the %ebp (base pointer) register. (The 32-bit %e registers arrived with
the 80386; the original 8086 only had the 16-bit %bp.) It points into the stack
just like %esp, and can be used as a base register just like %ebx. %ebp is very
often used to set up something called a "stack frame" that contains all of
the data for a single subroutine.
Basically, when a subroutine is entered, any space needed for local variables
can be allocated on the stack. If you think of the stack as a pile of boxes,
allocating space for local variables can be thought of as putting empty milk
crates (that are on their side) on top of the stack of boxes. You can put more
boxes on top of the milk crates, but you can still fill the crates up later.
You allocate space in a subroutine by decrementing the stack pointer. The base
pointer is set to point to the allocated space, and each local variable is
accessed using an offset from the base pointer. Example:
main()
{
int a, b;
a = 10;
doubleit();
b = a;
}
doubleit()
{
int c, d;
c = 20;
halfit();
d = c * 2;
}
halfit()
{
int e, f;
e = 20;
f = e / 2;
}
Translated into assembly code, this looks as follows:
main:
pushl %ebp
movl %esp, %ebp
subl $8, %esp # room for a and b
movl $10, -4(%ebp) # a = 10
call doubleit
movl -4(%ebp), %eax # a
movl %eax, -8(%ebp) # b = a
movl %ebp, %esp
popl %ebp
ret
doubleit:
pushl %ebp
movl %esp, %ebp
subl $8, %esp # room for c and d
movl $20, -4(%ebp) # c = 20
call halfit
movl -4(%ebp), %eax # c
shll $1, %eax # c * 2
movl %eax, -8(%ebp) # d
movl %ebp, %esp
popl %ebp
ret
halfit:
pushl %ebp
movl %esp, %ebp
subl $8, %esp # room for e and f
movl $20, -4(%ebp) # e = 20
movl -4(%ebp), %eax # e
shrl $1, %eax # e / 2
movl %eax, -8(%ebp) # f
movl %ebp, %esp
popl %ebp
ret
All references to the variables local to each subroutine are accessed by using
an offset from %ebp. As in, -4(%ebp), -8(%ebp), etc. The first local var is at
-4, the second at -8, and so on. So what does the stack look like for this
example?
| |
| RA | <- %esp
| |
| | <- %ebp before main begins
As with all subroutines, %esp starts by pointing to the return address, because
the call that got us here pushed it onto the stack. So the first thing to do is
save the old base pointer, so that the caller's local variable section stays
accessible once we return (the old value is correct for the previous
subroutine). We do this by pushing %ebp. As %ebp is a pointer, this is storing
an address. Thus:
| OLD_BP| <- %esp
| RA |
| |
| | <- %ebp before main begins
Now, we set the value of %ebp to the current value of %esp. These are both
pointers, so setting the value of one to the other value CHANGES THE LOCATION
TO WHICH IT IS POINTING, RATHER THAN THE ACTUAL VALUE BEING STORED THERE.
IMPORTANT. We set them equal by moving the value of %esp into %ebp. At this
point, %esp and %ebp are both pointing to the same location on the stack, thus:
| OLD_BP| <- %esp <- %ebp
| RA |
| |
| |
The local variable locations will be right on top of the base pointer
here. But that's also where pushed values will go, because of the stack
pointer (%esp)! What do we do? We allocate room for local variables by
decrementing the stack pointer. main has two ints, so we need eight bytes of
space. To allocate that space, we subtract 8 from %esp (moving where it points).
The stack now looks like this:
low addresses
| |
| |
b | | <- %esp
a | |
|OLD_BP| <- %ebp
| RA |
| |
high addresses
OLD_BP is the address that was previously stored in %ebp, and RA is the return
address. Decrementing %esp by 8 has moved %esp two slots up from %ebp, as each
slot is 4 bytes. Now, if something is pushed, nothing will be overwritten (%esp
is decremented BEFORE the write). The next thing we want to do is set a to 10.
a is the first local variable, so it lives at the address 4 less than %ebp (the
next slot). Thus our next command is "movl $10, -4(%ebp)".
This works because you're using a constant offset from a pointer (which we
learned how to access in the previous lab).
So now what does the stack look like? We just filled one of our empty milk
crates with the value "10".
low addresses
| |
| |
b | | <- %esp
a | 10 |
|OLD_BP| <- %ebp
| RA |
| |
high addresses
Now what? The next bit in the C code calls another subroutine, so that's what
we're going to do using "call doubleit". As mentioned earlier, call
is a combo push/jmp. So the return address is pushed onto the next slot in the
stack.
low addresses
| |
| RA | <- %esp
b | |
a | 10 |
|OLD_BP| <- %ebp
| RA |
| |
high addresses
Now we're actually inside of doubleit. doubleit has its own local variables,
though, so we need to build the stack frame for this subroutine (the whole
section, RA, BP, local variable space is the "stack frame"). So what
do we do? Save %ebp (so we can get it again later), set %ebp to %esp (moving
where it points, not the value contained at that address), and allocate space
by decrementing %esp. Now we have this:
low addresses
d | | <- %esp
c | |
|OLD_BP| <- %ebp
| RA |
b | |
a | 10 |
|OLD_BP|
| RA |
| |
high addresses
Here we've stored the old base pointer (OLD_BP), the return address is there
(RA), and we have space for local vars (c, d). Now doubleit can start doing
calculations. We want to store the number 20 in c, the first local variable.
The first local variable is, again, four bytes below the base pointer:
movl $20, -4(%ebp)
low addresses
d | | <- %esp
c | 20 |
|OLD_BP| <- %ebp
| RA |
b | |
a | 10 |
|OLD_BP|
| RA |
| |
high addresses
We then start the process over, when we call halfit, which requires its own
stack frame. The call pushes the return address:
| |
| RA | <- %esp
d | |
c | 20 |
|OLD_BP| <- %ebp
| RA |
b | |
a | 10 |
|OLD_BP|
| RA |
| |
The base pointer is saved, moved, and then space is allocated, leaving us with:
| |
f | | <- %esp
e | |
|OLD_BP| <- %ebp
| RA |
d | |
c | 20 |
|OLD_BP|
| RA |
b | |
a | 10 |
|OLD_BP|
| RA |
| |
| |
halfit then stores 20 in the first local variable (-4), and stores 10 (20,
right shifted once), in the second local variable (-8), f. Then we have a bunch
of stuff that needs to occur to leave the stack frame exactly as it was before
halfit was called, so everything works in doubleit.
First things first: restore the stack pointer (remove the allocated space). You
can easily do this by moving the value of %ebp to %esp. This moves the %esp
pointer down to point back right at where %ebp is pointing.
| |
f | |
e | |
|OLD_BP| <- %ebp <- %esp
| RA |
d | |
c | 20 |
|OLD_BP|
| RA |
b | |
a | 10 |
|OLD_BP|
| RA |
| |
| |
Next, pop the old base pointer (popl %ebp). As you may recall, the memory is
read and THEN %esp is incremented. This pops the value of OLD_BP into %ebp,
moving %ebp to point to where it used to point before halfit was called. This
pop also moves %esp down one (incremented) to point at the return address.
| |
f | |
e | |
|OLD_BP|
| RA | <- %esp
d | |
c | 20 |
|OLD_BP| <- %ebp
| RA |
b | |
a | 10 |
|OLD_BP|
| RA |
| |
| |
The last thing that occurs in halfit is ret, which is a combo pop/jump. It pops
the return address, which increments %esp, and then the program is back in
doubleit.
| RA |
d | | <- %esp
c | 20 |
|OLD_BP| <- %ebp
| RA |
b | |
a | 10 |
|OLD_BP|
| RA |
| |
| |
The rest is quite similar: finish computations, tear down the stack frame, and
then return.
movl -4(%ebp), %eax # c
shll $1, %eax # c * 2
movl %eax, -8(%ebp) # d
movl %ebp, %esp
popl %ebp
ret
The first command moves the first local variable back into a register. The
second command shifts it left by one bit (multiply by two). The third stores it
in the second local variable (-8). Then the tearing down of the stack frame
begins:
| RA |
d | 40 | <- %esp
c | 20 |
|OLD_BP| <- %ebp
| RA |
b | |
a | 10 |
|OLD_BP|
| RA |
| |
| |
Deallocate local variables by moving %ebp into %esp (move the location to which
it points):
| RA |
d | 40 |
c | 20 |
|OLD_BP| <- %ebp <- %esp
| RA |
b | |
a | 10 |
|OLD_BP|
| RA |
| |
| |
Pop the old base pointer into %ebp, moving %ebp to point to where it used to
and %esp down one.
| RA |
d | 40 |
c | 20 |
|OLD_BP|
| RA | <- %esp
b | |
a | 10 |
|OLD_BP| <- %ebp
| RA |
| |
| |
ret. This pops the old return address, moving %esp down one again, and
returning to main.
b | | <- %esp
a | 10 |
|OLD_BP| <- %ebp
| RA |
| |
| |
The rest of main is as follows:
movl -4(%ebp), %eax # a
movl %eax, -8(%ebp) # b
movl %ebp, %esp
popl %ebp
ret
The first local variable is moved into a register. That register is copied into
the second local variable (one memory address per command max). Now main's
stack frame is getting torn down. Local vars are de-allocated, and the old base
pointer is popped.
The final stack looks like it did before main started:
| |
| RA | <- %esp
| |
| | <- %ebp before main begins
And then ret occurs, returning to whatever process called main.
It is generally assumed that a calling function saves any registers whose
values it still needs after the call. The %ebx register is an exception: by
convention, the called function must preserve it. That is why all of the
programs so far have saved %ebx manually in the prolog and restored it in the
epilog. Without this, the program can crash seemingly at random. Saving and
restoring %ebx is NOT part of setting up the stack frame (which is the real job
of the prolog and epilog), but it prevents those random crashes on our lab
machines.
In summary, all of our routines get started with a prolog that sets up the
stack frame:
subr:
pushl %ebp
pushl %ebx
movl %esp, %ebp
subl $SIZE_OF_LOCAL_VARS, %esp
Similarly, at the end, the stack frame is torn down.
movl %ebp, %esp
popl %ebx
popl %ebp
ret
Also, any other registers the subroutine changes, such as %esi or %edi, should
be pushed after allocating local variable space and popped again before the
epilog runs.
Congratulations. If you followed all of that, you now understand what the
prolog and epilog in all our programs are doing (setting up and tearing down
the stack frame).
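To double-check the walkthrough, here is a rough C model of the main/doubleit/halfit example. It uses the plain three-instruction prolog from the walkthrough (without the extra %ebx push), and everything in it is an assumption made for illustration: mem, esp, ebp, FAKE_RA, the helper functions, and the seen_d/seen_f variables that only exist so the results can be inspected.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BYTES 256
#define FAKE_RA   0xAAAA5555u          /* made-up return address */

static uint32_t mem[MEM_BYTES / 4];    /* hypothetical stack memory */
static uint32_t esp = MEM_BYTES;       /* byte offset; the stack grows toward 0 */
static uint32_t ebp = 0;
static uint32_t seen_d, seen_f;        /* copies of d and f, kept only for checking */

void push(uint32_t v)
{
    assert(esp >= 4);
    esp -= 4;                          /* decrement first... */
    mem[esp / 4] = v;                  /* ...then write */
}

uint32_t pop(void)
{
    assert(esp + 4 <= MEM_BYTES);
    uint32_t v = mem[esp / 4];         /* read first... */
    esp += 4;                          /* ...then increment */
    return v;
}

/* Address of the local at -below(%ebp), e.g. local(4) is -4(%ebp). */
uint32_t *local(uint32_t below)
{
    return &mem[(ebp - below) / 4];
}

/* pushl %ebp / movl %esp, %ebp / subl $local_bytes, %esp */
void prolog(uint32_t local_bytes)
{
    push(ebp);
    ebp = esp;
    assert(esp >= local_bytes);
    esp -= local_bytes;
}

/* movl %ebp, %esp / popl %ebp */
void epilog(void)
{
    esp = ebp;
    ebp = pop();
}

/* call pushes a return address; ret pops it again. */
void call(void (*subroutine)(void))
{
    push(FAKE_RA);
    subroutine();
    uint32_t ra = pop();
    assert(ra == FAKE_RA);             /* an unbalanced stack would trip this */
}

void halfit(void)
{
    prolog(8);
    *local(4) = 20;                    /* e = 20 */
    *local(8) = *local(4) >> 1;        /* f = e / 2 */
    seen_f = *local(8);
    epilog();
}

void doubleit(void)
{
    prolog(8);
    *local(4) = 20;                    /* c = 20 */
    call(halfit);
    *local(8) = *local(4) << 1;        /* d = c * 2 */
    seen_d = *local(8);
    epilog();
}
```

After call(doubleit), esp and ebp are back at the values they had before the call, which is the whole point of the prolog/epilog pairing: each subroutine leaves the stack exactly as it found it.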
Lab 6:
Lab 6 is basically about parameter passing. How can we pass parameters to a
function, and how can we return values from it? You may have noticed that last
lab, we didn't use parameters; we used globals. This time, we learn how to use
parameters.
There are two basic ways to pass parameters: registers, and the stack. Let's
look at the former first. Passing parameters via registers is simple and often
quite fast. However, the caller and the called function (and any languages
involved) must agree precisely on which registers are used. The caller loads
the arguments into registers before the call, and the called function reads
them straight back out of those registers.
Here's an example:
extern int a,b;
foo()
{
bar(a);
}
bar(p1)
int p1;
{
b = b + p1;
}
Now in assembly language, we will pass the argument a in the %eax register as
follows:
foo:
pushl %ebp
movl %esp, %ebp
# begin foo
movl a, %eax
call bar
# end foo
movl %ebp, %esp
popl %ebp
ret
bar:
pushl %ebp
movl %esp, %ebp
# begin bar
addl %eax, b
# end bar
movl %ebp, %esp
popl %ebp
ret
As you can see, a is stored in the %eax register and then when bar runs, the
%eax register is added to b. That's it. Here's an example with returns:
extern int a,b;
foo()
{
b = bar(a);
}
bar(p1)
int p1;
{
return p1 + 2;
}
Here, %eax is used both to pass in the argument, and to pass out the return
value, thus making the implementation of bar very simple.
foo:
pushl %ebp
movl %esp, %ebp
# begin foo
movl a, %eax
call bar
movl %eax, b
# end foo
movl %ebp, %esp
popl %ebp
ret
bar:
pushl %ebp
movl %esp, %ebp
# begin bar
addl $2, %eax
# end bar
movl %ebp, %esp
popl %ebp
ret
This is almost as simple as the first example. a is placed in %eax, and then
bar is called. bar then adds 2 to %eax and returns. foo then reads %eax, which
is where the return value lies, and stores it in b.
So why not always use registers to pass arguments? What if you need more
arguments than you have registers, or if an argument won't fit in a register?
The other option in passing, as mentioned earlier, is the stack. This is how
you can pass many arguments or large arguments.
How do we do this? Before calling the function that you're passing variables
to, push the arguments onto the stack IN REVERSE ORDER. If you do this, they'll
be in order when you read from the called function. Once the prolog executes,
you can find the parameters at a fixed offset from the base pointer, just like
local variables. The only difference is that it will be a positive offset,
rather than a negative offset.
Example:
extern int a,b,c,d,e;
foo()
{
e = bar(a, b, c, d);
}
bar(p1, p2, p3, p4)
int p1, p2, p3, p4;
{
return (p1 + p2) * (p3 + p4);
}
Here, you can see there are four arguments to be passed into bar. We can still
use the %eax register to return the result.
foo:
pushl %ebp
movl %esp, %ebp
# begin foo
pushl d
pushl c
pushl b
pushl a
call bar
addl $16, %esp # remove arguments from stack
movl %eax, e
# end foo
movl %ebp, %esp
popl %ebp
ret
bar:
pushl %ebp
pushl %ebx
movl %esp, %ebp
# begin bar
movl 12(%ebp), %eax # p1
addl 16(%ebp), %eax # p1 + p2
movl 20(%ebp), %ebx # p3
addl 24(%ebp), %ebx # p3 + p4
mull %ebx # product in %edx:%eax; low 32 bits in %eax
# end bar
movl %ebp, %esp
popl %ebx
popl %ebp
ret
You can see that foo pushes all four arguments onto the stack in reverse order.
bar is called. At this point, I'm going to put in a picture of the stack so you
can see what's going on. I've also modified the example from the lab slightly
to match more closely what you'll be doing. Addresses are arbitrary.
32| %ebx | <-%esp <-%ebp
36| old_base_ptr|
40| foo_ret_addr|
44| a |
48| b |
52| c |
56| d |
This is what the stack looks like after bar's prolog. As you can see, %ebp is
pointing at address 32. The parameters that were passed are at addresses 44,
48, 52, and 56. We can reach each of them at a constant offset from %ebp. You
can look at the prolog, figure out what the stack looks like, and determine the
offset of the first parameter. From there, it's a matter of adding 4 to get the
second parameter, and so on.
So as you can see, the difference between where %ebp is pointing (32) and where
a is on the stack (44) is 12. Thus, to move a (p1) into %eax, we just
access 12(%ebp), the same way we used local variables last time. If you want
the second variable, simply use 16(%ebp). etc.
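The offset arithmetic can be sketched in C as well. This is a toy model under the same assumptions as the picture above: mem, esp, ebp, eax, and ebx are ordinary variables, and param_offset and call_bar are hypothetical helpers that don't appear in the lab code. param_offset just counts the words sitting between %ebp and the first parameter: whatever the prolog pushed before "movl %esp, %ebp", plus the return address.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_BYTES 128

static uint32_t mem[MEM_BYTES / 4];  /* hypothetical stack memory */
static uint32_t esp = MEM_BYTES;     /* byte offsets; the stack grows toward 0 */
static uint32_t ebp, eax, ebx;

void push(uint32_t v)
{
    assert(esp >= 4);
    esp -= 4;
    mem[esp / 4] = v;
}

uint32_t pop(void)
{
    assert(esp + 4 <= MEM_BYTES);
    uint32_t v = mem[esp / 4];
    esp += 4;
    return v;
}

/* Offset of the n-th parameter (n = 0 for p1) from %ebp, given how many
   registers the prolog pushes before "movl %esp, %ebp". The extra +1 word
   is the return address pushed by call. */
uint32_t param_offset(uint32_t pushes_before_mov, uint32_t n)
{
    return 4 * (pushes_before_mov + 1 + n);
}

/* bar(p1, p2, p3, p4) using the pushl %ebp / pushl %ebx prolog. */
void bar(void)
{
    push(ebp);
    push(ebx);
    ebp = esp;
    eax  = mem[(ebp + param_offset(2, 0)) / 4];   /* 12(%ebp): p1 */
    eax += mem[(ebp + param_offset(2, 1)) / 4];   /* 16(%ebp): p2 */
    ebx  = mem[(ebp + param_offset(2, 2)) / 4];   /* 20(%ebp): p3 */
    ebx += mem[(ebp + param_offset(2, 3)) / 4];   /* 24(%ebp): p4 */
    eax *= ebx;                                   /* low 32 bits of the product */
    esp = ebp;
    ebx = pop();
    ebp = pop();
}

/* foo's side of the call: push in reverse order, call, clean up. */
uint32_t call_bar(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    push(d);
    push(c);
    push(b);
    push(a);
    push(0);            /* placeholder for the return address pushed by call */
    bar();
    (void)pop();        /* ret pops the return address */
    esp += 16;          /* addl $16, %esp removes the four arguments */
    return eax;
}
```

With the two-push prolog, param_offset(2, 0) comes out to 12, matching the picture. With the plain prolog that pushes only %ebp, param_offset(1, 0) gives 8 instead, so the offsets always depend on which prolog the routine uses.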
This is a general mechanism and can be used in most situations. Returns are
usually put in a register (in our case, %eax). Some languages can return complex
types by putting the return value on the stack and putting a pointer to it in a
register. However, that's not something we're going to go into.
Based on a given prolog, you should be able to draw what the stack looks like
and figure out any offsets you need.
Short version: parameters go on the stack, returns go in %eax.
For the assembly files referenced in the labs: