Linker templates

Introduction

One of the hardest parts of producing a correct binary for an embedded system is the linker template. This write-up comes from my experience of getting a C++ unit test framework (CppUTest) to run on the target hardware.

I build my linker templates in multiple parts, as some parts are specific to a single processor and others are generic to virtually any embedded system.

Requirements

There are a number of things that you need to get right for an embedded target.

Memory

In an embedded target the system will have several types of memory, e.g. static RAM and Flash. There may be multiple blocks of each memory type with different characteristics, e.g. access speed.

So the challenge is to get the correct compiler output into the correct memory block. This may involve some work from the startup code, e.g. copying initialised data from Flash to SRAM.
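
As a rough illustration of that copy step, here is a minimal C sketch of what the startup code might do. The function names and parameters are my own, for illustration only. On a real target the pointers would come from labels defined in the linker template rather than being passed in, and the sketch assumes the regions are word aligned.

```c
#include <stdint.h>

/* Copy the initial values of the initialised data from where they are
 * stored (Flash) to where the program expects to find them (SRAM).
 * Hypothetical helper: load, start and end would normally be linker labels. */
void copy_init_data(const uint32_t *load, uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = *load++;
    }
}

/* Clear the uninitialised data region so that C statics start at zero. */
void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0u;
    }
}
```

On the target both functions would be called once, before main, and before any code that relies on global or static variables.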

Hardware Startup

On reset the processor starts running code from a known point. This is not quite accurate for Cortex-M, where the address of the startup code is stored at a known location in the vector table. We need startup code that configures the hardware for us, such as memory controllers for external memory.
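
To make the Cortex-M case concrete: the first word of the vector table holds the initial stack pointer and the second holds the address of the reset handler, and the core loads both on reset. The sketch below shows one way such a table could be written in C. The names (vector_table, Reset_Handler, startup_stack) and the .isr_vector section name are assumptions for illustration, and a real table carries many more entries for the other exceptions and interrupts.

```c
#include <stdint.h>

#define STACK_WORDS 256u   /* hypothetical stack size, for illustration */

/* Stand-in for the real stack. A linker template would normally
 * provide a label for the top of the stack instead. */
uint32_t startup_stack[STACK_WORDS];

/* Each vector is either the initial stack pointer or a handler address. */
typedef union {
    void (*handler)(void);
    void *stack_top;
} vector_entry_t;

/* Entry point named by the second vector. A real handler would set up
 * memory and call main; this stub just parks the processor. */
void Reset_Handler(void)
{
    for (;;) {
    }
}

/* The section attribute gives the table its own named section so the
 * linker template can place it at the start of Flash. It is only
 * applied for GCC-style ELF toolchains. */
#if defined(__GNUC__) && defined(__ELF__)
__attribute__((section(".isr_vector"), used))
#endif
const vector_entry_t vector_table[] = {
    { .stack_top = &startup_stack[STACK_WORDS] },  /* word 0: initial stack pointer */
    { .handler   = Reset_Handler },                /* word 1: reset vector */
};
```

The union avoids casting between data and function pointers, which keeps the table free of compiler-specific conversions.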

C Runtime

The C runtime library has some startup requirements of its own, which handle its internal setup. It may also set up code to be run on exit from the application's main function.

C++ runtime startup

The runtime requirements for C++ are greater, and this is where we need to be more aware of what we are doing. For C++ you need to ensure that constructors are called for static and global objects. This is where you need to read the compiler documentation, as each compiler generates this information in a different way. See the link under References for GCC.
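
For GCC-style toolchains, the constructor information ends up as a table of function pointers that the startup code walks before main. The sketch below shows the idea in C under that assumption. The function run_init_array and the two stand-in constructors are hypothetical names for illustration. On a real target the start and end pointers would be labels placed around the table by the linker template, and the C library may already provide an equivalent routine.

```c
#include <stddef.h>

typedef void (*init_fn_t)(void);

/* Call every entry in [start, end) in table order. */
void run_init_array(const init_fn_t *start, const init_fn_t *end)
{
    for (const init_fn_t *fn = start; fn < end; ++fn) {
        (*fn)();
    }
}

/* Stand-ins for compiler-generated constructor entries (illustration
 * only). Each one records the order in which it was called. */
int ctor_log[4];
size_t ctor_count;

void ctor_first(void)  { ctor_log[ctor_count++] = 1; }
void ctor_second(void) { ctor_log[ctor_count++] = 2; }
```

Building a small table such as { ctor_first, ctor_second } and passing its start and end to run_init_array calls the entries in table order, which is why the order in which the linker emits these entries matters.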

Compiler necessities

There may be other requirements from the compiler, e.g. the use of a compiler-specific library.

Compiler options

There are a number of compiler options that can be used to control the compiler output. These options can also be used to help reduce the image size.

Linker sections

The input to the linker is a number of compiled objects and object libraries containing named sections. It is these named sections that are pulled into the resulting binary. Some of the required sections are easy to identify, as they satisfy the external link requirements of other objects.

There will also be some sections that must be included even without any reference to them.

Debug information

The final information we may choose to keep in the image file is debug information.

The linker template

The references section links to the linker template files.

The main one of interest is stm32f-sections.ld.

1) First of all we need to make sure the interrupt vectors are placed at the beginning of Flash memory.

2) We then include all referenced text sections (executable code).

3) We also include read-only data (const declarations).

4) We then have the init and fini sections. These are not sorted but are included in the order the objects are provided to the linker. They include pieces from the various crt files, which must be kept in order because the compiler-provided parts do not form a valid procedure unless they are included in the correct order, e.g. crti provides the function prologue for the init function.

5) We then have the preinit_array, init_array and fini_array. These are tables of function addresses generated by the compiler, to be called at certain phases of the program.

6) We then create the section of code to run constructors, followed by a section for destructors. As this will be an embedded system we could probably exclude them. Again there is compiler-provided prologue and epilogue code (crtbegin and crtend).

7) We then pull in code to do with exception handling and unwinding. That is, if we compile with exception support.

8) We then pull in all the initialised data. This is given addresses in RAM but stored in Flash. The startup code has to copy this from Flash to RAM before main is called.

9) We then allocate space in RAM for uninitialised data (bss).

10) And as a final check we allocate stack space. This is not where the stack will be unless we really are using all the RAM. Its purpose is to cause a link error if the RAM size is too small.

Notes

The linker template declares some labels that are used by the startup code, e.g. to find the initialised data.

Improvements

Some ARM Cortex-M devices have multiple RAM banks, e.g. the STM32F4 has Core Coupled Memory (CCM) that is only accessible from the ARM core. This CCM could be used for the processor stack to reduce contention when using DMA.

Compiler & Linker switches

-fno-rtti -fno-exceptions

References

stm32f-sections.ld

http://gcc.gnu.org/onlinedocs/gccint/Initialization.html