@shijitht

October 21, 2010

Inline assembly basics

Filed under: C — Tags: , — shijitht @ 1:34 am

Inline assembly lets you embed assembly code in a C program. This is like an inline function, where the body gets substituted at the point of use: our assembly code is placed in the proper spot in the output and reaches the assembler with no change. GCC follows AT&T syntax for assembly code.

AT&T syntax

  1. Registers are prefixed with % and immediates/constants with $ (%eax and $10).
  2. The source operand comes first (opcode source, destination).
  3. Operand size is given as a suffix on the opcode: b -> byte, w -> word,
    l -> long (movl).
  4. Indirect memory references use parentheses, as in (%eax).

In C

syntax:   asm("assembly code");
The asm keyword is used to write assembly code in C.

test.c
-------
#include<stdio.h>
int fun()
{
 asm("mov $24, %eax");
}
int main()
{
 int n = fun();
 printf("%d\n", n);
 return 0;
}

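In test.c, function fun has no return statement, but it returns 24. On x86, when a function returns, its return value is placed in the eax register, and here we set eax explicitly using asm. So a value can be returned without a return statement. The instruction that does this is the mov inside asm(). Note that this works only by accident of the calling convention: the C standard gives no meaning to using the value of a function that falls off its end, and an optimized build may not preserve the trick.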
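
A more robust way to get the same effect is GCC's extended asm form, which tells the compiler where the result lives so that an ordinary return statement can be used. The sketch below is an assumption-laden variant, not the original test.c: the name fun_ext is hypothetical, and the asm is guarded so that non-x86 targets fall back to plain C.

```c
/* Hypothetical variant of fun() from test.c, using GCC extended asm. */
int fun_ext(void)
{
    int result;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "=r" asks GCC to pick any general-purpose register for result;
       %0 in the template is replaced by that register's name. */
    __asm__("movl $24, %0" : "=r"(result));
#else
    result = 24;    /* fallback for non-x86 targets or other compilers */
#endif
    return result;
}
```

Calling fun_ext() from main in place of fun() should print 24 as before, without depending on which register the compiler happens to use for return values.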

Operations that are awkward or impossible to express directly in C can be done easily with inline assembly. Rotating the bits of a value takes a single instruction in asm (ror or rol), whereas C has no rotate operator and needs a combination of shifts and an OR. Inline assembly also lets you choose the exact machine instruction, for example a logical versus an arithmetic right shift, a choice C leaves to the implementation for negative signed values. Architecture-dependent coding and optimization are done using inline assembly, and hand-written assembly can sometimes improve the speed of code further.
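
As a rough illustration of the rotate case, here is a sketch that rotates a 32-bit value left both ways. The function names rotl32_c and rotl32_asm are made up for this example, and the asm path assumes GCC (or a compatible compiler) on x86; elsewhere it falls back to the C version.

```c
#include <stdint.h>

/* Rotate left in plain C: two shifts and an OR. */
uint32_t rotl32_c(uint32_t x, unsigned n)
{
    n &= 31;                                   /* keep the count in 0..31 */
    return (x << n) | (x >> ((32 - n) & 31));  /* & 31 avoids a shift by 32 */
}

/* Rotate left with the x86 rol instruction via extended asm. */
uint32_t rotl32_asm(uint32_t x, unsigned n)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "+r": x is read and written; "c": the count must be in ecx,
       because rol takes a variable count only in %cl. */
    __asm__("roll %%cl, %0" : "+r"(x) : "c"(n) : "cc");
    return x;
#else
    return rotl32_c(x, n);    /* portable fallback */
#endif
}
```

For example, rotating 0x80000001 left by one bit gives 0x00000003 with either function: the top bit wraps around to the bottom.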

October 20, 2010

Valgrind

Filed under: C, Commands, GNU/Linux, Tools — Tags: , , , , — shijitht @ 1:18 pm

Valgrind is a collection of tools for checking the correctness of a program. Its main tool is memcheck, which reports memory leaks, out-of-bounds writes, use of uninitialized variables, etc. Its report pinpoints the location of each error, so it is a good tool for debugging programs that crash or behave unpredictably.

Using

To see the exact line number of each error, compile the code with the -g option, which includes debugging symbols that help valgrind locate the line. Reports can be misleading if optimization above level 1 (-O1) is used.
Use this program prog.c

prog.c
-------
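#include <stdlib.h>

void fun()
{
    char *a = (char *)malloc(10 * sizeof(char));
    a[10] = 'a';    /* error 1: writes one byte past the end of the block */
}                   /* error 2: a is never freed */

int main()
{
    fun();
    return 0;
}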

prog.c has two major errors:
1. a[10] = 'a';
a[10] is outside the allocated region (valid indices are 0 to 9). Writing there could produce mysterious behavior. This is called a heap block overrun.
2. The 10-byte block pointed to by a is never freed. On return to main, the block is no longer reachable and can never be released. This is a memory leak.

Let's use valgrind to detect these errors.
Compile the code with the -g option

$ cc -g prog.c

Generate report

$ valgrind --leak-check=yes ./a.out
Valgrind writes its report to stderr, so use 2>&1 to redirect the report to a file ( $ valgrind --leak-check=yes ./a.out > report 2>&1 )

Analyzing report

The report contains various error messages and summaries. Error messages are generated for out-of-bounds writes, here a[10].
The corresponding report is
==4836== Invalid write of size 4
==4836==    at 0x80483FF: fun(prog.c:6)
==4836==    by 0x8048411: main (prog.c:11)
==4836==  Address 0x419a050 is 0 bytes after a block of size 40 alloc'd
==4836==    at 0x4024F20: malloc (vg_replace_malloc.c:236)
==4836==    by 0x80483F5: fun(prog.c:5)
==4836==    by 0x8048411: main (prog.c:11)
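4836 is the process ID. The first line says the error is an invalid write of size 4. Below it is the complete stack trace; read it from bottom to top: execution started in main, which called fun, where the error occurred at line 6 of prog.c. The second half of the message says the address we tried to write lies just past the end of an allocated 40-byte block, and shows where that block was allocated (malloc called from fun). The sizes and line numbers appear to come from a run of a slightly different prog.c, one that allocates ints rather than chars and has extra lines at the top; with the char listing above, memcheck would report a write of size 1 just past a block of size 10. This information is quite useful for correcting the code.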

The leak summary shows the memory leaks.
Here,
==4836== LEAK SUMMARY:
==4836==    definitely lost: 40 bytes in 1 blocks
==4836==    indirectly lost: 0 bytes in 0 blocks
==4836==      possibly lost: 0 bytes in 0 blocks
==4836==    still reachable: 0 bytes in 0 blocks
==4836==         suppressed: 0 bytes in 0 blocks
The "definitely lost" line shows the 40-byte block leaked in function fun. The report covers the other kinds of leak as well.

Valgrind checks for these errors and leaks at run time, acting like a virtual machine that executes each instruction of the program. This makes it slow for large programs, but the report is very useful for correcting mistakes that are otherwise very difficult to detect.
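
For comparison, here is a sketch of how the two errors in prog.c could be repaired. It is one possible fix rather than the only one: the name fun_fixed and the int return value are additions for this example.

```c
#include <stdlib.h>

/* Hypothetical corrected version of fun() from prog.c.
   Returns 0 on success, -1 if the allocation failed. */
int fun_fixed(void)
{
    char *a = malloc(10 * sizeof(char));
    if (a == NULL)
        return -1;

    a[9] = 'a';     /* last valid index of a 10-element block */

    free(a);        /* release the block before the pointer goes out of scope */
    return 0;
}
```

Run under valgrind with --leak-check=yes, a program that calls only fun_fixed() should produce no invalid-write message and no "definitely lost" bytes.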

October 19, 2010

Compiler optimizations

Filed under: C, GNU/Linux — Tags: , , , , — shijitht @ 5:01 pm

To improve performance, the compiler optimizes code during compilation. Function inlining and common subexpression elimination are discussed here, with the generated assembly given for clarification. The compiler applies these optimizations on the basis of a cost/benefit calculation.
Compiler used: GCC 4.4.3

Code inlining

Code inlining embeds the function's body in the caller. This eliminates the call and return steps and opens up further optimization across the combined code.
Let's see the difference.
Compile opt1.c with and without optimization to generate assembly code.

opt1.c
-------
#include <stdio.h>

int sqr(int x)
{
 return x*x;
}

int main()
{
 printf("%d\n", sqr(10));
}

Without optimization
$ cc  -S  opt1.c  -o  wout_opt1.s
With optimization
$ cc  -S  -O3  opt1.c  -o  with_opt1.s

Compare both files. The function call to sqr in wout_opt1.s is replaced with its value in with_opt1.s: the movl $10 / call pair becomes a single movl $100, 4(%esp).

wout_opt1.s
-------------
main:
 pushl   %ebp
 movl    %esp, %ebp
 andl    $-16, %esp
 subl    $16, %esp
 movl    $10, (%esp)
 call    sqr
 movl    %eax, 4(%esp)
 movl    $.LC0, (%esp)
 call    printf
 leave
 ret

with_opt1.s
-----------
main:
 pushl    %ebp
 movl    %esp, %ebp
 andl    $-16, %esp
 subl    $16, %esp
 movl    $100, 4(%esp)
 movl    $.LC0, (%esp)
 call    printf
 leave
 ret

The code for sqr still remains in both .s files, because sqr has external linkage and could be referenced from another file where inlining can't be applied. The compiler cannot tell that the function is unreferenced; only the linker sees the whole program.
With inlining, the value of the function is computed at compile time instead of at run time. The call instruction is replaced by a move instruction that loads the immediate value into the required location: the immediate $100, equal to sqr(10), appears here and the call is gone.
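
A small sketch of the point about unreferenced functions: if sqr is declared static, it has internal linkage, so the compiler knows no other file can call it and is free to drop the standalone copy once every call has been inlined. The wrapper square_of_ten is a hypothetical name used only for this example.

```c
/* static gives sqr internal linkage: no other file can reference it. */
static int sqr(int x)
{
    return x * x;
}

/* Hypothetical caller. With optimization on, GCC can inline sqr(10)
   and fold the result to the constant 100. */
int square_of_ten(void)
{
    return sqr(10);
}
```

Compiling this with cc -S -O3 and looking for a sqr label in the output is a quick way to check whether the standalone copy was kept.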

Common subexpression elimination

Compiler scans the code and finds identical subexpressions. These are evaluated only once and replaced with a single variable holding its value.
For example, take opt2.c

opt2.c
-------
#include <stdio.h>

int main()
{
 int i, j, k, r;

 scanf("%d%d", &i, &j);

 k = i + j + 10;

 r = i + j + 30;

 printf("%d %d\n", k, r);

}

opt2.c has the subexpression i + j.

Compile opt2.c with and without optimization

Without
$ cc  -S  opt2.c  -o  wout_opt2.s
With
$ cc  -O3  -S  opt2.c  -o  with_opt2.s

wout_opt2.s
------------
main:
 pushl   %ebp
 movl    %esp, %ebp
 andl    $-16, %esp
 subl    $32, %esp
 leal    24(%esp), %eax
 movl    %eax, 8(%esp)
 leal    28(%esp), %eax
 movl    %eax, 4(%esp)
 movl    $.LC0, (%esp)
 call    scanf
 movl    28(%esp), %edx
 movl    24(%esp), %eax
 leal    (%edx,%eax), %eax
 addl    $10, %eax
 movl    %eax, 20(%esp)
 movl    28(%esp), %edx
 movl    24(%esp), %eax
 leal    (%edx,%eax), %eax
 addl    $30, %eax
 movl    %eax, 16(%esp)
 movl    16(%esp), %eax
 movl    %eax, 8(%esp)
 movl    20(%esp), %eax
 movl    %eax, 4(%esp)
 movl    $.LC1, (%esp)
 call    printf
 leave
 ret

with_opt2.s
------------
main:
 same as above
 call    scanf
 movl    24(%esp), %eax
 addl    28(%esp), %eax
 movl    $.LC1, (%esp)
 leal    30(%eax), %edx
 addl    $10, %eax
 movl    %edx, 8(%esp)
 movl    %eax, 4(%esp)
 call    printf
 leave
 ret

In wout_opt2.s, the two variables are read as usual. The value i + j is calculated in two places, to be added to 10 and to 30; leal (%edx,%eax), %eax adds i and j. Evaluating the expression twice wastes CPU time.
In the optimized with_opt2.s, the first value read is loaded into eax and the second is added to it, so eax now holds i + j. leal adds 30 to it and stores the result in edx; addl then adds 10 to eax.
Common subexpression elimination is a powerful technique for improving performance. Programmers can eliminate such subexpressions by hand while coding, but there are also compiler-generated expressions, for array index calculation, macro expansion, etc., that a programmer can't optimize at the source level. These are the cases where the compiler does its trick to improve performance.
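
To make the two cases concrete, here is a sketch with hypothetical function names. sums_manual_cse does by hand what the compiler did for opt2.c; bump_cell shows an expression whose repeated part (the address of m[i][j]) is generated by the compiler and is not visible in the source.

```c
/* Manual CSE: i + j is computed once and reused. */
void sums_manual_cse(int i, int j, int *k, int *r)
{
    int s = i + j;      /* the common subexpression */
    *k = s + 10;
    *r = s + 30;
}

/* Hidden common subexpression: the address of m[i][j]
   (base + (i * 4 + j) * sizeof(int)) is needed for both the read
   and the write, but only the compiler can factor it out. */
void bump_cell(int m[4][4], int i, int j)
{
    m[i][j] = m[i][j] + 1;
}
```

With i = 2 and j = 3, sums_manual_cse stores 15 and 35, the same values the two separate expressions in opt2.c would give.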

August 19, 2010

C boolean

Filed under: C — Tags: , — shijitht @ 3:52 pm

Before C99, C had no built-in boolean type (C99 added _Bool and the bool macro in <stdbool.h>). In C, zero means false and non-zero means true, so we don't need a boolean type for comparison operations. If a comparison, say using "==", succeeds, it yields one; if it fails, zero. Run the program to remove any doubts:

#include <stdio.h>

int main(void)
{
         printf("False = %d True = %d\n", 1 == 0, 1 == 1);
         return 0;
}
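
A short sketch of the same idea from the other direction: any non-zero value counts as true, and applying ! twice turns it into exactly 0 or 1. The helper names as_flag and is_even are invented for this example, and is_even assumes a C99 compiler for <stdbool.h>.

```c
#include <stdbool.h>

/* Normalize any int to 0 or 1: !x is 1 only for x == 0, so !!x is
   0 for zero and 1 for every non-zero value. */
int as_flag(int x)
{
    return !!x;
}

/* With C99's bool, the result of a comparison can be returned directly. */
bool is_even(int n)
{
    return n % 2 == 0;
}
```

So as_flag(-7) is 1 and as_flag(0) is 0, even though -7 itself is already "true" when used in an if condition.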

July 31, 2010

Fork BOMB

Filed under: C — Tags: — shijitht @ 10:40 am

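This bomb can take down a system in seconds, causing an unintended denial-of-service attack in our labs (mainly the systems lab). 🙂
We can avoid this by programming carefully, making sure a child never loops back into fork(), and by using sleep() while experimenting so that a runaway program can still be killed in time.
The fire powder is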
while(1) {
fork();
}
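
One way to read "programming carefully" is sketched below: fork a fixed number of children, make each child exit instead of returning into the loop, and have the parent wait for all of them. The function name spawn_children is hypothetical, and real code would do some work in the child before exiting.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork at most n children and wait for them.
   Returns the number of children that were reaped. */
int spawn_children(int n)
{
    int started = 0, reaped = 0;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) {          /* fork failed: stop creating processes */
            perror("fork");
            break;
        }
        if (pid == 0) {
            /* Child: do its work here, then leave. Without this exit the
               child would continue the loop and fork children of its own. */
            _exit(0);
        }
        started++;              /* parent only */
    }

    /* Parent: collect every child that was started. */
    while (reaped < started && wait(NULL) > 0)
        reaped++;

    return reaped;
}
```

The loop bound and the _exit() in the child are what keep the process count fixed; the infinite loop above has neither.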
