Introduction To Assembler

1 Introduction To Assembler

  • In this Worksheet you will use GCC to output assembly language from C source code.
  • In order to make sure everyone is using the same setup, we have built a VirtualBox image of CentOS Linux that you can use.
  • Download the image here, then run it to set it up in VirtualBox.
  • Once it's installed, launch it and you will be logged in automatically to a terminal.
  • The Linux image is CentOS. It has everything you need installed.
  • The password is 120ct. If you use sudo and mess something up, re-download the image.

1.1 Commands we will be using

First, any text in square brackets just means 'type something here'; at no point do you type the square brackets themselves.

  • cd
    • Change the current directory.
    • Specific examples
      • cd [dirname]
        • Go to a specified directory.
      • cd ..
        • Go up one directory.
      • cd ~
        • Go back to your home directory.
  • gcc [inputfile.c]
    • gcc is the GCC C compiler. Commands are case-sensitive, so type gcc in lower case. A detailed explanation of it is beyond the scope of this worksheet, but everything you need is specified.
  • ls
    • List directory contents.

1.2 Text Editors

  • This Linux image has Nano, Vim and Emacs installed.
  • To learn how to use them, search online for instructions.
  • Of the choices presented, Emacs and Vim are recommended. Nano is a bit too basic, so you shouldn't use it for any kind of programming.

2 Worksheet Exercise outline

  • This Worksheet will introduce you to assembly language. Since assembly language is hard to write, we will write C code and compile it into assembler.
  • We will demonstrate several simple programming elements (variables, addition, subtraction and looping) by writing the C version and inspecting the generated assembler version.

3 Worksheet Exercise

  • cd into the assembler directory, and open the file main.c in a text editor.
int main() {

return 0;
}
  • Save the file, then compile it into assembler with the command
  • gcc -S main.c
  • The -S part tells gcc we want an assembly language version of our code produced.
  • If all goes well, the command will finish without printing any messages.
  • Typing ls will show you have succeeded in generating a file called main.s that contains the assembler version of your C code.
  • Read the output by opening main.s.
  • To read it you can either use a text editor or use the command cat main.s. cat only displays the file contents, so you won't be able to edit them (not that you have to).
  • Some details, such as the stack offsets and the .ident line, may differ from those shown here, but the structure will not.
        .file   "main.c"
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE0:
        .size   main, .-main
        .ident  "GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-4)"
        .section        .note.GNU-stack,"",@progbits
  • This assembler just sets up the stack frame for main, puts the return value 0 in the %eax register, and returns.
  • Use this output as a baseline to compare against in further exercises.
  • The section where new code will appear (beyond the fundamental setup and process closing) is:
.cfi_def_cfa_register 6
// new code will be here
movl    $0, %eax
  • Next we will add code to the main function that actually does something, in the gap above return 0;
  • Add this code:
int a = 2;
  • Save and compile the code again, then inspect the output as before.
        .file   "main.c"
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movl    $2, -4(%rbp)
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE0:
        .size   main, .-main
        .ident  "GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-4)"
        .section        .note.GNU-stack,"",@progbits
  • So now the line
movl    $2, -4(%rbp)
  • has appeared.
  • This stores the literal value 2 ($ means literal) in the variable we have decided to call a. The compiler keeps a in memory on the stack rather than in a register: -4(%rbp) means the memory location 4 bytes below the address held in the %rbp register.
  • So let's retrieve and increment the variable next. Add the line:
a = a+5;
  • Now we get, after recompiling main.c:
movl    $2, -4(%rbp)
addl    $5, -4(%rbp)
  • So now we see that the value in -4(%rbp) has had the literal value 5 added to it.
  • So, let's add a new variable to the program:
int b = 0;
  • This now gives us
movl    $2, -4(%rbp)
addl    $5, -4(%rbp)
movl    $0, -8(%rbp)
  • So the stack slot at -8(%rbp), 8 bytes below the address in %rbp, has been initialised with 0. This is our variable b.
  • So far we've only worked with literals, so let's add a to b.
  • Add the line
b+=a;
  • Recompiling now gives
movl    $2, -4(%rbp)
addl    $5, -4(%rbp)
movl    $0, -8(%rbp)
movl    -4(%rbp), %eax
addl    %eax, -8(%rbp)
  • Now the value in -4(%rbp) (our a variable) has been moved into the %eax register (an accumulator), and from there it has been added to -8(%rbp) (our b variable).
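  • As a rough sketch (hand-written, not compiler output), the instructions so far can be mirrored in C, with a local variable standing in for the %eax register. The function name add_through_register and the variable name eax are made up for illustration. x86 add instructions take at most one memory operand, which is why a is copied into a register before being added to b.

```c
/* Hypothetical C mirror of the assembler so far.
 * Each statement is annotated with the instruction it imitates. */
int add_through_register(void)
{
    int a, b, eax;

    a = 2;      /* movl $2, -4(%rbp)    */
    a += 5;     /* addl $5, -4(%rbp)    */
    b = 0;      /* movl $0, -8(%rbp)    */
    eax = a;    /* movl -4(%rbp), %eax  */
    b += eax;   /* addl %eax, -8(%rbp)  */

    return b;   /* b now holds 7 */
}
```

  • Because gcc -S only compiles and does not link, a file containing just this function can be passed to gcc -S without a main function.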

3.1 Subtraction

  • Add the code
int c = b - 3;
  • Recompile, and our assembly block becomes
movl    $2, -4(%rbp)
addl    $5, -4(%rbp)
movl    $0, -8(%rbp)
movl    -4(%rbp), %eax
addl    %eax, -8(%rbp)
movl    -8(%rbp), %eax
subl    $3, %eax
movl    %eax, -12(%rbp)
  • Of which the new code is
movl    -8(%rbp), %eax
subl    $3, %eax
movl    %eax, -12(%rbp)
  • The value in -8(%rbp) (our b variable) is moved into %eax.
  • The literal value 3 is subtracted from it, and the result is stored in the stack slot -12(%rbp) as our new variable c.
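  • To make the point that -4(%rbp), -8(%rbp) and -12(%rbp) are memory slots rather than registers, here is a small model of the stack frame in C. It is only a sketch: the names frame, slot and run_model are invented for this example, it assumes int is 4 bytes (as the offsets in the listing suggest), and a plain array stands in for the real stack.

```c
/* Hypothetical model: three 4-byte slots sitting just below %rbp. */
int frame[3];

/* Map an offset such as -4, -8 or -12 to its slot in the model. */
int *slot(int offset)
{
    return &frame[(-offset) / 4 - 1];
}

/* Replay the assembler block from this section against the model. */
int run_model(void)
{
    int eax;

    *slot(-4) = 2;        /* movl $2, -4(%rbp)    a = 2      */
    *slot(-4) += 5;       /* addl $5, -4(%rbp)    a = a + 5  */
    *slot(-8) = 0;        /* movl $0, -8(%rbp)    b = 0      */
    eax = *slot(-4);      /* movl -4(%rbp), %eax             */
    *slot(-8) += eax;     /* addl %eax, -8(%rbp)  b += a     */
    eax = *slot(-8);      /* movl -8(%rbp), %eax             */
    eax -= 3;             /* subl $3, %eax                   */
    *slot(-12) = eax;     /* movl %eax, -12(%rbp) c = b - 3  */

    return *slot(-12);    /* c is 4; a and b are both 7 */
}
```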

3.2 Looping

  • In programming we often need to repeat operations. For this we use several forms of loop.
  • We will add a simple for loop to our program so we can examine it in assembler.
int main() {
  int i;
  int a = 2;
  a = a + 5;
  int b = 0;
  b += a;
  int c = b - 3;
  for (i = 0; i < 5; i++) {
    c += 2;
  }
  return 0;
}
  • Edit your C program so it looks like this, adding a new variable i to use in the for loop, and a loop that adds 2 to c five times.
  • Compile, then view the assembler.
  • You will see that the places where variables are stored have moved: in the block below, i is at -4(%rbp) and c is at -8(%rbp). This is because the compiler decides where to store things, not us.
  • The new block of interest is:
        movl    $0, -4(%rbp)
        jmp     .L2
.L3:
        addl    $2, -8(%rbp)
        addl    $1, -4(%rbp)
.L2:
        cmpl    $4, -4(%rbp)
        jle     .L3
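  • The block above can be read as C written with goto statements, which may make the jumps easier to follow. This is a hand-written sketch, not something gcc produces: the function name loop_with_gotos is made up, and c is passed in as a parameter so the function can be tried on its own.

```c
/* Hypothetical C version of the loop, using the same labels as the
 * assembler. Returns c after the loop has finished. */
int loop_with_gotos(int c)
{
    int i;

    i = 0;              /* movl $0, -4(%rbp)   */
    goto L2;            /* jmp  .L2            */
L3:
    c += 2;             /* addl $2, -8(%rbp)   */
    i += 1;             /* addl $1, -4(%rbp)   */
L2:
    if (i <= 4)         /* cmpl $4, -4(%rbp)   */
        goto L3;        /* jle  .L3            */

    return c;           /* body ran 5 times, so c has grown by 10 */
}
```

  • In our program c is 4 when the loop starts, so loop_with_gotos(4) returns 14.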

3.3 Looping - Line by Line

movl    $0, -4(%rbp)
  • This is a for loop, so the first line of assembler sets the loop control variable i to 0.
movl    $0, -4(%rbp)
jmp     .L2
  • Loops in assembler work by jumping around the code, using labels to set destination points.
  • This loop starts by jumping to label L2.
.L2:
        cmpl    $4, -4(%rbp)
        jle     .L3
  • At L2 there is a comparison (cmpl) to see whether the loop should continue: is i still less than or equal to 4?
jle     .L3
  • jle means 'jump if less than or equal'. The jump target is L3, which contains the logic the loop is performing (minimal in this example).
.L3:
        addl    $2, -8(%rbp)
        addl    $1, -4(%rbp)
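  • Here 2 is being added to -8(%rbp) (var c), and 1 is being added to the iteration variable i at -4(%rbp).
  • Execution then falls through into L2, where the comparison is made again.
  • The loop ends when the jle condition is false (i > 4), so the jump is not taken and execution continues with the code after the loop.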
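  • You may have noticed that the C source tests i<5, while the assembler compares against 4 and uses jle. For whole numbers these are the same test. The sketch below is one way to check that for yourself; the two function names are made up for illustration.

```c
/* Count iterations using the test as written in main.c. */
int count_with_less_than(void)
{
    int i, n = 0;

    for (i = 0; i < 5; i++)
        n++;
    return n;
}

/* Count iterations using the test the assembler performs. */
int count_with_less_or_equal(void)
{
    int i, n = 0;

    for (i = 0; i <= 4; i++)
        n++;
    return n;
}
```

  • Both functions return 5, matching the five times the loop body runs.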

Author: carey pridgeon

Created: 2016-11-24 Thu 08:09

Emacs 25.1.1 (Org mode 8.2.10)
