The TIGCC Linker

This is the documentation of the TIGCC linker, the tool that reads the compiled and assembled files and merges them into a single program. Most parts of this documentation only need to be read by experts; you may safely skip over all parts that you do not understand.


TIGCC Linker Purpose and Operation

The purpose of a linker is to take executable code and data from different files and merge them into a single program. It must resolve dependencies between the files, and there are many architecture-dependent features linkers are required to support. The complicated part about linking is that binary code can be encapsulated in different formats; most linkers, including the TIGCC linker, can import code from several formats and export it to yet another one (see TIGCC Linker File Formats).

The TIGCC linker can handle two kinds of files: object files and archive files. Object files are produced by the compiler or assembler. They contain the code and global variables of the program; all object files passed to the linker are processed and then included in the final output. Archive files, also known as static libraries, are collections of object files. An archive member is included only if this is requested by another file.

Files can reference each other via text strings called symbols. These symbols are usually labels in assembly code, or functions and global variables in C code. The most popular way of referencing symbols is to ask that the final address or offset of a specific symbol be inserted at or added to a specific location in the code. This is called relocation.
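As a rough illustration of a relocation, the following sketch adds the final address of a symbol to a value already stored in the code. It assumes a 4-byte absolute relocation stored in big-endian byte order (as on the 68000); the function name apply_absolute_reloc and the example addresses are hypothetical and are not taken from the linker itself.

```python
import struct

def apply_absolute_reloc(code: bytearray, location: int, symbol_address: int) -> None:
    """Add symbol_address to the 4-byte big-endian value stored at location."""
    # The value already present at the location acts as an offset (addend).
    (addend,) = struct.unpack_from(">I", code, location)
    struct.pack_into(">I", code, location, (addend + symbol_address) & 0xFFFFFFFF)

# A 68000 "jsr <address>" (opcode 0x4EB9) followed by "rts" (0x4E75).
# The address field holds the addend 4 before relocation.
code = bytearray(b"\x4e\xb9\x00\x00\x00\x04\x4e\x75")

# Hypothetical final address of the referenced symbol.
apply_absolute_reloc(code, 2, 0x1000)
print(code.hex())
```

After the call, the address field contains the symbol address plus the original addend; the opcodes around it are untouched.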

When the linker is executed, it first reads all object files passed to it and imports their contents into its internal data structures. Then it tries to resolve the references to symbols defined in another file. If a symbol cannot be resolved, it is looked up in the symbol tables of all archive files passed to the linker; if it is still not found, the linker reports an error. Archive members are imported immediately if required, and they may reference symbols defined in other files as well. Next, some special actions are performed based on the contents of the imported files; code and data blocks (sections) are sorted and merged if required; and offsets between different locations in the code are inserted wherever the object files requested this. Finally, the program is exported to an executable file.
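The resolution step described above can be modelled in a few lines. This is only a simplified sketch, not the linker's actual data structures: files are plain dictionaries with hypothetical keys "name", "defines" and "refs", archives are lists of such dictionaries, and archives are searched in the order in which they are supplied.

```python
def link(objects, archives):
    """Return the names of all files that end up in the program."""
    included = list(objects)          # object files are always included
    defined = set()
    for obj in included:
        defined |= obj["defines"]
    pending = [sym for obj in included for sym in sorted(obj["refs"])]
    while pending:
        sym = pending.pop(0)
        if sym in defined:
            continue
        # Search the archives in command-line order; the first exporter wins.
        member = next(
            (m for archive in archives for m in archive if sym in m["defines"]),
            None,
        )
        if member is None:
            raise LookupError(f"unresolved reference to {sym!r}")
        included.append(member)       # import the archive member immediately
        defined |= member["defines"]
        pending.extend(sorted(member["refs"]))  # members may reference others
    return [f["name"] for f in included]

main_o = {"name": "main.o", "defines": {"_main"}, "refs": {"printf"}}
lib1 = [
    {"name": "printf.o", "defines": {"printf"}, "refs": {"putchar"}},
    {"name": "putchar.o", "defines": {"putchar"}, "refs": set()},
    {"name": "unused.o", "defines": {"unused"}, "refs": set()},
]
lib2 = [{"name": "printf2.o", "defines": {"printf"}, "refs": set()}]

print(link([main_o], [lib1, lib2]))
```

In this model, unused.o is never imported because nothing references it, and swapping the order of the two archives selects printf2.o instead of printf.o.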


Invoking ld-tigcc and ar-tigcc

The TIGCC linker is available in two forms: as a standalone command-line program or as a dynamic link library (DLL). TIGCC uses the DLL version for improved speed and to avoid maintenance problems, but experienced programmers might want to use the command-line version in some cases because it is a little more advanced.

In the IDE, linker options can be set in the project settings. The tigcc command line compiler accepts most of the options of ld-tigcc and passes them to the linker.

ld-tigcc Command-Line Options

In ld-tigcc, options and input files may appear in any order in the command line. Input files may be either object or archive (static library) files. They are handled differently depending on the type of the file: Object files are read completely in the order they are supplied; archive file members are read only if the program references a symbol they export. If multiple archive files export the same symbol, ld-tigcc uses the archive that is supplied first.

Output file names and variable names are usually set according to the name of the first object file in the command line, but they may be changed using the '--output' and '--varname' options described below. The file extensions depend on the exact output format used; they are usually what the transferring software expects them to be.

ld-tigcc recognizes the following options:

-h
--help

Print a short description of all available options.

--version

Print the version number of the tool and a short copyright notice.

-v
--verbose

Print statistics about the linked program before terminating. These include the target calculators, program variable name and size, data variable size, BSS size (size of all uninitialized global variables), the total number of absolute relocs, the number of relocs which appear in the native format of the target OS, and optimization possibilities or results. In the IDE, these statistics are displayed automatically, unless this is turned off in the preferences or the program is run automatically after successful linking. If the linking process fails, no statistics are shown.

--dump

Display all dumps of the program contents during the entire linking process. For details about dumps, see ld-tigcc Program Dumps.

--dumpn

Display the n-th dump of the program contents. For details on the different linking stages and the associated dump numbers, see ld-tigcc Program Dumps.

--native

Use TIGCC native mode by default. Without this option, the linker starts in kernel mode (for compatibility with existing programs), but this may be changed using special symbols (see Symbols to Control the Linker). For more information about modes, see TIGCC Linker Modes.

--fargo

Use Fargo II mode and compile for the TI-92. This option is only available if Fargo support is compiled in. It exists for compatibility with existing Fargo II programs, which do not explicitly specify a mode and a target calculator.

--flash-os

Use Flash OS mode. This mode creates an unsigned Flash operating system upgrade for the TI-89, TI-89 Titanium, TI-92+ and Voyage 200 calculators. This option is only available if Flash OS support is compiled in.

--remove-unused

Remove unused sections. If a section is not referenced by another section, this option causes it to be removed. Startup sections are never removed; neither is the first section in Nostub mode. Note that in some cases, the linker cannot determine whether a section can be removed before merging it with another section; in this case, it is not removed even though it may not be referenced at all.
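One way to picture this option is as a reachability check over the references between sections. The sketch below is an assumption about how such a pass could work, not a description of the linker's implementation; the section names and the function remove_unused are made up for illustration.

```python
def remove_unused(sections, refs, roots):
    """Keep only the sections reachable from the root sections.

    sections: section names in program order
    refs:     dict mapping a section to the sections it references
    roots:    sections that are never removed (e.g. startup sections)
    """
    keep = set()
    stack = list(roots)
    while stack:
        section = stack.pop()
        if section in keep:
            continue
        keep.add(section)
        stack.extend(refs.get(section, ()))
    # Preserve the original order of the surviving sections.
    return [s for s in sections if s in keep]

sections = ["_st1", ".text", ".helper", ".dead"]
refs = {"_st1": [".text"], ".text": [".helper"], ".dead": [".helper"]}
print(remove_unused(sections, refs, roots=["_st1"]))
```

Here .dead references .helper but is not referenced itself, so only .dead is dropped.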

--optimize-relocs

Update the destination symbol of relocation entries to the nearest available symbol, thereby making the offset as small as possible. This improves the readability of dumps and some diagnostic messages, but should not have any other effect than this.
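The idea can be sketched as follows, assuming that all symbols live in the same section and that "nearest" means the smallest absolute distance (both are assumptions made for this example; the name retarget is hypothetical).

```python
def retarget(symbols, target_symbol, offset):
    """Re-express symbol+offset relative to the nearest symbol.

    symbols: dict mapping symbol names to addresses within one section
    Returns a (symbol, offset) pair denoting the same address.
    """
    address = symbols[target_symbol] + offset
    # Ties are broken by name so that the result is deterministic.
    nearest = min(symbols, key=lambda name: (abs(address - symbols[name]), name))
    return nearest, address - symbols[nearest]

symbols = {"start": 0x00, "table": 0x40, "end": 0x100}
print(retarget(symbols, "start", 0x44))
```

The relocation still points to the same address; only the way it is written in dumps and messages changes (here, table+4 instead of start+0x44).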

--optimize-code

Perform all code optimizations, including NOP, return, branch, move, test, and calculation optimization. Note that it is possible for code optimization to create invalid code or accidentally change data instead of code. The probability is not very high, so you should really enable code optimization at least partially, but if your program crashes for no apparent reason, try turning off code optimization. For more information about optimization, see TIGCC Linker Binary Code Fixup.

--optimize-nops

Perform NOP instruction removal.

--optimize-returns

Perform return sequence optimization.

--optimize-branches

Perform branch optimization.

--optimize-moves

Perform move/load/push optimization.

--optimize-tests

Perform compare/test optimization.

--optimize-calcs

Perform calculation optimization.

--cut-ranges

Optimization has two effects: It can reduce the number of relocation entries, and it can make instructions smaller. Usually, the space gained from making the instructions smaller is filled with NOPs. If this option is used, the linker will attempt to cut out these ranges of code instead, making the size of the executable even smaller. This only works for input files that were assembled in all-relocs mode, but this is handled automatically if this option is used via the tigcc front-end or the TIGCC IDE.

--reorder-sections

Reorder sections to make references shorter. The fixups (see above) can then fit those shorter references into smaller and faster addressing modes, so using this option can improve both size and speed. It can also allow the fixups to remove relocations or to turn F-Line jumps into faster branches. Since computing the optimal reordering is NP-complete and very expensive in practice (the number of possible orderings grows with the factorial of the number of sections), section reordering is implemented through heuristics, so the result is not guaranteed to be optimal. In rare cases, section reordering can still take exponential time; this happens when hardcoded short references between sections render some reorderings impossible. The linker then emits a warning for each impossible reordering it encounters, so you can follow the process or stop it if it takes too long. If that happens, fix your program: hardcoded short references between sections are not a good idea, since that is what linker optimization is for.

The following restrictions apply: Startup sections can be reordered only with other startup sections with the same startup number. Non-startup sections can be reordered only with other non-startup sections. Sections which are emitted separately (e.g. a dynamically allocated BSS section or a data section in an external file) cannot be reordered at all.

--merge-constants

Merge identical constants (including strings) to avoid duplication. Constant merging works on all symbols (actually, on the ranges enclosed by two symbols, where symbols at the same position are considered the same symbol) in sections marked mergeable. Constants can be merged if they are identical or if one of them is a prefix of the other. This can be used by the compiler to avoid duplicating string literals (and, if desired by the user, other constants) in multiple object files. Unaligned and aligned sections are distinguished to keep the linker from accidentally breaking the alignment while merging aligned constants with unaligned ones which happen to contain them as a prefix.
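The merging rule (identical constants, or one constant being a prefix of another) can be illustrated with a small sketch. It deliberately ignores alignment and section flags, and the function name merge_constants and the constant names are hypothetical.

```python
def merge_constants(constants):
    """Merge constants that are identical or prefixes of one another.

    constants: dict mapping a name to its bytes
    Returns (pool, offsets), where offsets maps each name to its
    position inside the merged pool.
    """
    pool = bytearray()
    stored = []   # (offset, data) for constants physically present in the pool
    offsets = {}
    # Process longer constants first, so that a prefix always finds the
    # constant containing it already stored.
    for name, data in sorted(constants.items(), key=lambda kv: (-len(kv[1]), kv[0])):
        for offset, existing in stored:
            if existing.startswith(data):
                offsets[name] = offset      # reuse the existing bytes
                break
        else:
            offsets[name] = len(pool)
            stored.append((len(pool), data))
            pool += data
    return bytes(pool), offsets

pool, offsets = merge_constants({
    "tab1": b"\x01\x02\x03\x04",
    "tab2": b"\x01\x02",          # prefix of tab1
    "msg1": b"ok\x00",
    "msg2": b"ok\x00",            # identical to msg1
})
print(len(pool), offsets)
```

The four constants take 9 bytes when stored separately; in this sketch the merged pool needs only 7.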

--omit-bss-init

Skip the initialization of the BSS section, which holds uninitialized global variables. Older versions of TIGCC never initialized the BSS section, so many older programs do not rely on the initialization. If you use this option, you must be sure that there is really no code that relies on the initialization of global variables to zero (which means, among other things, that you have to use the compiler option '-fno-zero-initialized-in-bss'). For a safer alternative, try the __ld_omit_bss_init control symbol.

--outputbin

Instead of creating a wrapped calculator variable that includes a folder and variable name, a checksum, and some extra information, write only the raw contents of the variable to the file. The file extension is changed in a way that allows different files to be generated for each target calculator but prevents confusion between raw and wrapped data.

-o file
--output file

Write the output to the file named file.ext, where ext is the extension that fits the file type. file may include a path, but if it includes its own extension, ext will be appended anyway. This also sets the variable name to something that resembles file as closely as possible. Note that it does not do any error checking on the characters of the file parameter.

-n [folder\]name
--varname [folder\]name

Include the folder name folder (main if unspecified) and variable name name in the wrapper file. If the file is not wrapped (i.e. if '--outputbin' has been specified), this option has no effect.

-d [folder\]name
--data-var [folder\]name

Exclude all non-executable data (global variables) from the program and create an external variable for it. You must make sure that no code is ever executed from the data section; otherwise the program may crash, depending on the calculator model, because newer calculators have a protection device that lets the operating system restrict the areas code can be executed from. name is the variable name to be assigned to the data variable. folder defines the folder of the variable; if it is not specified, the folder from the '--varname' option is used.

--data-var-copy=condition

Defines when to create a copy of the data variable in RAM. If condition is always, the program will always work on a copy in RAM, which means that you may rely on the data being the same on every start of the program. However, if the data variable is not archived, you may easily run out of available memory. archived causes a copy to be created only if the data variable is archived; this is the default. If the variable is not archived, the program will work on the actual contents of the variable, so the values of all global variables will be kept even after the program finishes. never tells the linker to work on the original variable unconditionally, but since you may not write to the archive memory, you have to make sure that you never modify the value of a global variable. If the '--data-var' option is not specified as well, this option has no effect.
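The three conditions boil down to a small decision rule, summarized in the following sketch. The function name works_on_ram_copy is hypothetical; only the condition names always, archived and never come from the option description.

```python
def works_on_ram_copy(condition: str, data_var_is_archived: bool) -> bool:
    """Tell whether the program works on a RAM copy of the data variable."""
    if condition == "always":
        return True                      # fresh copy on every start
    if condition == "archived":
        return data_var_is_archived      # default: copy only if archived
    if condition == "never":
        return False                     # globals must never be modified
    raise ValueError(f"unknown condition: {condition!r}")

for condition in ("always", "archived", "never"):
    for archived in (True, False):
        print(condition, archived, works_on_ram_copy(condition, archived))
```

Only when the function returns False for an unarchived variable do changes to global variables persist after the program finishes.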

ar-tigcc Command-Line Options

The ar-tigcc tool can be used to create archives recognized by ld-tigcc. The output format is the format used by GNU ar, for which this tool is a replacement. This allows for maximum compatibility between archives and programs created with different versions of TIGCC.

In ar-tigcc, options and input files may appear in any order in the command line. Input files can have any file format; they are simply written into the archive in the order specified in the command line. However, object files whose format is recognized are searched for exported symbols, so that ar-tigcc can create a symbol table for the archive.

If no output file name is specified, ar-tigcc uses the name of the first input file and appends a '.a' extension to it. It is highly recommended that you specify a different name with the '--output' option.

ar-tigcc recognizes the following options:

-h
--help

Print a short description of all available options.

--version

Print the version number of the tool and a short copyright notice.

--dump

Display a small dump of the archive file contents. This includes the members as well as the symbols they export.

-o file
--output file
-rc file
-qc file

Write the output to the file named file. Unlike ld-tigcc, ar-tigcc does not append a file extension to file. '-rc' and '-qc' are recognized for compatibility with GNU ar, so that certain command lines work with GNU ar as well as ar-tigcc.

--no-names

Omit the file names of the input files in the archive. The archive will only contain names of the form fln.o, where n is the index of the file starting at 1. Omitting file names may be a good idea, especially if you use long file names, since the traditional archive format imposes a maximum of 15 characters on the length of file names. Otherwise, if a file name exceeds this maximum, it will be cut off at the 16th character. The IDE and the tigcc command line compiler always use this option.
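The naming rules can be sketched as follows. The sketch assumes that only the base name of each input file (without its directory) is stored in the archive; that detail and the function name member_names are assumptions made for this example.

```python
MAX_NAME_LENGTH = 15  # limit of the traditional archive format

def member_names(paths, no_names=False):
    """Return the member names that would be stored for the input files."""
    names = []
    for index, path in enumerate(paths, start=1):
        if no_names:
            names.append(f"fl{index}.o")   # fl1.o, fl2.o, ...
        else:
            base = path.replace("\\", "/").rsplit("/", 1)[-1]
            names.append(base[:MAX_NAME_LENGTH])  # cut off overly long names
    return names

inputs = ["src/a_very_long_module_name.o", "b.o"]
print(member_names(inputs))
print(member_names(inputs, no_names=True))
```

With long file names, two different inputs can end up with the same truncated member name, which is one reason why generated names are preferable.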


Linking Modes of the TIGCC Linker

A linking mode defines how the linker treats the contents of the program after they have been read from the object files. The TIGCC linker has several different modes; some of them are related to specific output file formats, and some of them are present only for historical reasons.

The recommended mode for normal TIGCC programs is TIGCC-native mode. It is the simplest mode; the program is basically an empty sheet of paper, which can be filled with code of all sorts. The default mode is actually kernel mode unless you set the appropriate command-line option, to make existing programs work without modifications.

TIGCC-Native Linking Mode

This mode (which can be enabled using the '--native' command-line option or the control symbol _tigcc_native) is the recommended mode for all new programs. When operating in this mode, the TIGCC linker does not process the program in any special way, except that it requires the definition of at least one startup section. The idea is that stub code should not be handled by the linker itself but rather included manually by the program as needed. However, every program (regardless of the architecture and operating system) needs to have a prolog that either contains the location of the main entry point or is itself the startup code of the program. On the TI platforms with official assembly program support, execution always starts at the beginning of the program, so the prolog is actually the startup code.

Ideally, every program should be able to use this mode. However, at the moment, the output file format cannot be specified other than by switching to the appropriate mode. If you want to use a different output file format than the default TIOS format (i.e. Nostub DLL or Fargo II), you cannot use TIGCC-native mode. At the moment, adding support for explicit selection of the output file format does not have any particular benefit, since it does not permit the removal of any of the other modes. However, as soon as the need for another output format (e.g. raw) arises, there should be a command-line option to select the format.

Nostub Linking Mode

If this mode is activated using the _nostub control symbol, execution will start at the very beginning of the program. The exact entry point depends on the order of the object files as passed to the linker as well as the order of the sections inside an object file. Because of this uncertainty, this mode should never be used in new programs. Programs written in assembly should define a small startup section including a jump to the actual main function and use TIGCC-native mode instead. If the main function follows immediately, the jump can even be optimized away by the linker.

If a startup section is defined in Nostub mode, the linker emits a warning and switches to TIGCC-native mode. This ensures that Nostub mode really means that no stub is added to the program.

Nostub DLL Linking Mode

If the linker is told to use Nostub DLL mode using the _nostub_dll control symbol, it acts as in TIGCC-native mode, except that it uses the Nostub DLL output format instead of the default output format. Since the stub for Nostub DLLs is defined as conventional C code rather than imported as a startup section, this mode does not require the definition of a startup section.

Kernel Linking Mode

In this mode, which is enabled by default, the linker acts the same way as in TIGCC-native mode, but it creates a global import asking for the appropriate kernel format header. See Automatically Created Global Imports for more information.

Fargo II Linking Mode

This mode is used to create a program that can be run on a TI-92 with Fargo II installed. It can be turned on using the '--fargo' command-line option or the _fargo control symbol. It uses the Fargo II output format, which is binary data wrapped in an empty TI-BASIC program. It creates a global import asking for the appropriate Fargo II header (see Automatically Created Global Imports for more information). It also causes the value of __ld_entry_point to be decreased by two, so that it points to the two size bytes in the TIOS file format rather than to the beginning of the program data.

Note: Fargo support must be compiled in for this mode to be available.

Flash OS Linking Mode

This mode creates an unsigned Flash operating system upgrade for the TI-89, TI-89 Titanium, TI-92+ and Voyage 200 calculators. It can be turned on using the '--flash-os' command-line option or the _flash_os control symbol. It currently supports only the raw TIB output format, which is enabled by the '--outputbin' option. Support for the current 89u/9xu/v2u format is planned and will be the default. It creates a global import asking for the appropriate Flash OS header (see Automatically Created Global Imports for more information). Since Flash operating systems are composed of 2 discontiguous parts, a small (24 KB) startup segment and a large (1944 KB for 2 MB FlashROMs, 3992 KB for 4 MB FlashROMs) main segment, startup sections are handled in a special way in this mode: Startup sections are placed into the startup segment, all other sections are merged into the main segment.

Note: Flash OS support must be compiled in for this mode to be available.


TIGCC Linker File Formats

The TIGCC linker recognizes several file formats. Currently, it can import COFF and AmigaOS files and export TIOS ASM files, Nostub DLL files (which are TIOS custom files with a special format), and Fargo II files (which are TIOS PRGM files with special hidden data). The capabilities of each format are summarized below:

COFF
    Sections: Yes
    Relocations: Yes
    Unresolved Relocations: Yes
    Symbols: Yes
    ROM Calls: Yes (through unresolved relocations)
    RAM Calls: Yes (through unresolved relocations)
    Library Calls: Yes (through unresolved relocations)
    Library Exports: Yes (through symbols)
    Debug Information: Yes
    Version Number: Yes (through symbols)
    Additional Information: Yes (through symbols)

AmigaOS
    Sections: Yes
    Relocations: Yes (except 1-byte absolute)
    Unresolved Relocations: Yes
    Symbols: Yes
    ROM Calls: Yes (through unresolved relocations)
    RAM Calls: Yes (through unresolved relocations)
    Library Calls: Yes (through unresolved relocations)
    Library Exports: Yes (through symbols)
    Debug Information: Yes
    Version Number: Yes (through symbols)
    Additional Information: Yes (through symbols)

TIOS ASM
    Sections: No
    Relocations: 4-byte absolute only
    Unresolved Relocations: No
    Symbols: No
    ROM Calls: No (but kernels exist that interpret a special header format)
    RAM Calls: No (but kernels exist that interpret a special header format)
    Library Calls: No (but kernels exist that interpret a special header format)
    Library Exports: No (but kernels exist that interpret a special header format)
    Debug Information: No
    Version Number: No (but kernels exist that interpret a special header format)
    Additional Information: No (but kernels exist that interpret a special header format with a comment, and a header for additional information may be inserted manually)

Nostub DLL
    Sections: No
    Relocations: 4-byte absolute only
    Unresolved Relocations: No
    Symbols: No
    ROM Calls: No
    RAM Calls: No
    Library Calls: No
    Library Exports: Yes (but required header is not inserted directly by the linker)
    Debug Information: No
    Version Number: Yes (but required header is not inserted directly by the linker)
    Additional Information: No

Fargo II
    Sections: No
    Relocations: 4-byte absolute only (but required header is not inserted directly by the linker)
    Unresolved Relocations: No
    Symbols: No
    ROM Calls: Yes (through library calls, but required header is not inserted directly by the linker)
    RAM Calls: Yes (through library calls, but required header is not inserted directly by the linker)
    Library Calls: Yes (but required header is not inserted directly by the linker)
    Library Exports: Yes (but required header is not inserted directly by the linker)
    Debug Information: No
    Version Number: No
    Additional Information: Single comment only (but required header is not inserted directly by the linker)

TI Flash OS (TIB, 89u/9xu/v2u)
    Sections: 2 fixed sections (24 KB startup, 1944/3992 KB main)
    Relocations: No, runs from fixed address
    Unresolved Relocations: No
    Symbols: No
    ROM Calls: No
    RAM Calls: No
    Library Calls: No
    Library Exports: No
    Debug Information: No
    Version Number: Yes (but not yet supported by the linker)
    Additional Information: Product name and date stamp only (but not yet supported by the linker)


Symbols to Control the Linker

In addition to options specified in the command line, the TIGCC linker can be controlled using special symbol names. They should be used directly only in assembly programs; C programs should rely on the appropriate library facilities if they are available.

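The linker treats the following symbols as control symbols, each of which is described in its own entry below: __ref_all_..., _tigcc_native, _nostub, _nostub_dll, _fargo, _flash_os, _library, _ti92, _ti89, _ti92plus, _v200, _flag_..., _version..., ...@version..., ...__version..., __ld_use_fline_jumps, __ld_use_4byte_fline_jumps, __ld_omit_bss_init, and __ld_ignore_global_imports.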

Symbols can be created in a variety of ways; they can be:

Not all assemblers support all types of symbols; for example, the A68k assembler does not support exporting symbols which are not defined somewhere in the same file. This assembler is also somewhat special from the linker's point of view: It only outputs exported and imported symbols by default; local labels can be supplied in a symbol table, but since it is optional, the linker does not use it to receive control information.

If a symbol is detected as a control symbol, it is not imported into the internal data structures as usual. There are two reasons for this: First, if a user accidentally defines a control symbol somewhere (some traditional control symbol names are quite short), the resulting error can help detect this problem. Second, if common symbols are used, they would waste space in the executable otherwise.

__ref_all_...

Defining __ref_all_name creates a global import for all symbols named name. Importing the same symbol twice does not have any particular effect; however, sometimes this is necessary if you want a global import to succeed as early as possible.

If none of the archives supplied to the linker exports a symbol with this name (or a related name using conditional reaction), the linker outputs a warning.

_tigcc_native

Defining this symbol switches to TIGCC native mode. We recommend that you define this in all new programs, and then create or import startup sections.

_nostub

Defining this symbol switches to Nostub mode. It can only be used in assembly files, since otherwise it is impossible to guarantee that specific code ends up at the beginning of the file.

_nostub_dll

Defining this symbol switches to Nostub DLL mode and tells the linker to compile a library instead of a program. If Nostub DLL support is not compiled in, the symbol is not treated in a special way.

_fargo

Defining this symbol switches to Fargo II mode. If Fargo support is not compiled in, the symbol is not treated in a special way.

_flash_os

Defining this symbol switches to Flash OS mode. If Flash OS support is not compiled in, the symbol is not treated in a special way.

_library

Defining this symbol causes a library to be created instead of a program. The linker will warn about program startup sections being included, and depending on the linking mode some different automatic global imports will be created.

_ti92, _ti89, _ti92plus, _v200

You need to define one or more of these symbols to specify the calculator for which the program is to be linked. This only controls which output files are created; the linker does not check whether a file format really exists for a given calculator. Kernel compatibility flags (see _flag_...) are added according to the symbol:

_ti89 Flag 0 (0x01)
_ti92plus Flag 1 (0x02)
_v200 Flag 5 (0x20)

_flag_...

Defining _flag_n sets the n-th bit in the kernel compatibility flags. Some flags are reserved for calculator compatibility information (see _ti92, _ti89, _ti92plus, _v200); the others can be used to pass additional information to the kernel. Note that the meaning of a particular flag may vary between kernels. Kernel flags occupy a single byte; therefore the range of n is 0 through 7.

See also: __ld_kernel_flags
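The way calculator symbols and _flag_n symbols combine into the flag byte can be sketched like this. The function kernel_flags is hypothetical; the bit assignments are the ones listed for _ti89, _ti92plus and _v200.

```python
import re

# Flag bits added by the calculator symbols (_ti92 adds no flag).
CALCULATOR_FLAGS = {"_ti89": 0, "_ti92plus": 1, "_v200": 5}

def kernel_flags(symbols):
    """Compute the kernel compatibility flag byte from control symbols."""
    flags = 0
    for symbol in symbols:
        if symbol in CALCULATOR_FLAGS:
            flags |= 1 << CALCULATOR_FLAGS[symbol]
            continue
        match = re.fullmatch(r"_flag_([0-7])", symbol)
        if match:
            flags |= 1 << int(match.group(1))   # set the n-th bit
    return flags

print(hex(kernel_flags(["_ti89", "_v200", "_flag_3", "_ti92"])))
```

Defining _flag_0 by hand sets the same bit as _ti89, which is why the low-numbered flags should be left to the calculator symbols.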

_version...

If you define the symbol _versionver, the program/library version number is set to ver, interpreted as a hexadecimal value. Note that the kernel format limits the reserved space for the version number to one byte; this means that kernels only accept two digits for ver.

See also: __ld_file_version

...@version..., ...__version...

To specify a required minimum version number for a library used by the program, you can define lib@versionver or lib__versionver. lib is the name of the library (see Library Calls); ver is the minimum version number to be accepted, interpreted as a hexadecimal value (see _version...).

See also: Library Calls
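The two kinds of version symbols can be decoded as sketched below. Rejecting version numbers with more than two hexadecimal digits is an assumption based on the one-byte limit of the kernel format; the function names and the library name graphlib are used for illustration only.

```python
import re

def own_version(symbol):
    """Return the version encoded in a _version... symbol, or None."""
    match = re.fullmatch(r"_version([0-9A-Fa-f]{1,2})", symbol)
    return int(match.group(1), 16) if match else None

def min_library_version(symbol):
    """Return (library, minimum version) for lib@version... or lib__version..."""
    match = re.fullmatch(r"(.+?)(?:@version|__version)([0-9A-Fa-f]{1,2})", symbol)
    if not match:
        return None
    return match.group(1), int(match.group(2), 16)

print(own_version("_version1f"))
print(min_library_version("graphlib@version02"))
print(min_library_version("graphlib__version0A"))
```

Both forms of the library symbol yield the same kind of result; the form with two underscores is useful where '@' is not allowed in a symbol name.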

__ld_use_fline_jumps

Defining this symbol tells the linker to use relative F-Line branches for branches which would otherwise need to be absolute. These are supported by AMS 2.04 or higher and by various F-Line emulators, including TIGCC's own emulator.

See also: __ld_use_4byte_fline_jumps

__ld_use_4byte_fline_jumps

Defining this symbol tells the linker to use program-relative F-Line branches for branches which would otherwise need to be absolute. These branches are 4 bytes long, whereas normal relative F-Line branches have a size of 6 bytes. However, since they are relative to the program's entry point, the program must install its own emulator to handle them.

4-byte F-Line branches are useful only if range-cutting is enabled.

See also: __ld_use_fline_jumps

__ld_omit_bss_init

Defining this symbol in a source file tells the linker that this file does not depend on the initialization of the BSS section to zero. The result is that all uninitialized global variables defined in that file may contain garbage at the beginning of the program. This does not guarantee that the initialization is skipped; in fact, if at least one file needs the initialization, it is easier to initialize even the variables that were declared to not need it.

For pointer-based object file formats (such as COFF, the format used by the GNU tools included in TIGCC), this symbol really affects all variables in the file it is defined in. For sequential formats (such as the AmigaOS format used by the A68k assembler), it affects only the parts that follow the symbol. Since BSS data usually appears at the end of the object file, this restriction should not have any effect.

Note: If you define this symbol, you should use the compiler option '-fno-zero-initialized-in-bss'; otherwise even variables explicitly initialized to zero will contain garbage.

__ld_ignore_global_imports

Defining this symbol in a source file tells the linker that from this file on, defining a __ref_all_... symbol has no effect. This should not be used except in very special circumstances.


Startup Sections

The concept of startup sections is unique to the TIGCC linker. It only makes sense in low-resource environments like calculators. The idea is that a file that is imported should be able to specify that it needs certain code to be executed at the beginning of the program. Usually, there are two approaches to address this situation: constructors and main function wrappers.

If constructors are used to handle this, a lot of memory is wasted: A constructor table needs to be created with appropriate code to handle its contents; every item needs to save all registers except a few, and sharing data between two constructors requires global variables. Moreover, using constructors, it is not possible to specify the order in which startup code is to be called; however, parts of the startup code often need to rely on other parts to be executed first.

Main function wrappers appear in almost every environment. But since these wrappers are fixed, they need to handle all startup code that might possibly be needed, instead of letting each file choose its own startup code. For example, such fixed startup code would need to handle exceptions even if the program never generates them, or fill certain global variables that are never read.

Startup sections are actually a wrapper around the main function, but they can achieve even more flexibility than constructors: They are numbered and executed in the exact order specified by the numbers, and no extra code is executed between two consecutive startup sections, so registers can be used to pass data between two sections.

Startup sections can be used not only to insert code at the beginning of the program, but also to generate the required headers for certain file formats. Sometimes this is easier than writing the linker code to insert the required headers (see TIGCC Linker File Formats).

Since libraries may need to contain a header, some stub code that is called when the user tries to execute the library, and possibly some startup code, they may also have startup sections. However, it does not really make sense to include a startup section designed for a program in a library. Therefore, there are library startup sections, which may appear in both libraries and programs, and program startup sections, which may appear only in programs. Library startup sections are always included before program startup sections.

Startup sections are detected based on their name. To declare a program startup section, name the section _stn, where n is a value from 1 to 99999 (higher values for n may be accepted if the object file format supports section names longer than 8 characters, but it is not recommended to use them). To declare a library startup section, name it _stln, where n is a value from 1 to 9999 (higher values are not permitted). Startup sections are included in ascending order; if two startup sections use the same index, their order is undefined.
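The naming and ordering rules can be illustrated with the following sketch, which recognizes startup sections by name and sorts them, library startup sections first. The function names are hypothetical, and the upper limit for program startup sections is deliberately not enforced, since it depends on the object file format.

```python
import re

def startup_key(name):
    """Return a sort key for a startup section name, or None otherwise."""
    match = re.fullmatch(r"_st(l?)(\d+)", name)
    if not match:
        return None
    is_library = match.group(1) == "l"
    index = int(match.group(2))
    if index < 1 or (is_library and index > 9999):
        return None
    # Library startup sections (0) come before program startup sections (1).
    return (0 if is_library else 1, index)

def order_startup_sections(names):
    """Return the startup sections among names in inclusion order."""
    startup = [n for n in names if startup_key(n) is not None]
    return sorted(startup, key=startup_key)

print(order_startup_sections(["_st100", ".text", "_stl2", "_st1", "_stl10"]))
```

Sections with equal indices keep their input order in this sketch, but the linker makes no such promise.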


Global Imports

Just like startup sections, the concept of global imports is unique to the TIGCC linker. Global imports and startup sections are closely related to each other: It is best to keep startup sections in archive files, so they can be imported as needed, but the existing method of importing archive file members does not work. Usually, an archive file member is imported if a symbol it exports is referenced in a relocation entry. However, code that requires a specific startup section to be included does not necessarily reference any of the symbols in the corresponding archive member; it just needs the startup code to be there. A global import solves this problem by importing an archive member without inserting its address anywhere.

Actually, a global import imports all archive members that export the symbol referenced by the import (and even more, see Conditional Reaction to Global Imports). If no archive member exports this symbol, a warning is emitted. This way, it is very easy to create archive files that react to multiple imports; for example:

A global import for A would cause the files 1 and 3 to be included in the program. If it could only import one file, 6 files would be needed to get the same result:

Global imports can be created using the __ref_all_... control symbol. Some global imports are also created automatically by the linker under certain conditions.

Conditional Reaction to Global Imports

Sometimes, it is convenient for an archive member to react to a global import differently than just to be included whenever the import is created. For example, a startup section may be optimized better if a certain register already holds a certain value, otherwise it must compute the value by itself. The TIGCC linker defines two symbol operators for this purpose:

To be included only if the global imports A and B are defined, a file must export the symbol A_AND_B. To be included only if the global import A is not defined, a file must export the symbol NOT_A. Symbol operators may be combined, i.e. a file exporting NOT_A_AND_B is included only if no global import A exists, and a global import B is defined.
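
The evaluation of such a condition can be sketched in C. This is a hedged illustration rather than the linker's implementation: the function names are hypothetical, and the sketch assumes that terms are separated by the literal text _AND_ and that a leading NOT_ negates a single term.

```c
#include <string.h>

/* Nonzero if the name of length len appears in the list of global imports. */
static int is_defined(const char *name, size_t len,
                      const char *const *imports, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        if (strlen(imports[i]) == len && memcmp(imports[i], name, len) == 0)
            return 1;
    return 0;
}

/* Nonzero if a file exporting the given symbol should be included, given the
   set of global imports that currently exist. */
int reacts_to_imports(const char *symbol,
                      const char *const *imports, size_t count)
{
    const char *term = symbol;
    for (;;) {
        const char *sep = strstr(term, "_AND_");
        size_t len = sep ? (size_t)(sep - term) : strlen(term);
        int negated = 0;

        if (len > 4 && strncmp(term, "NOT_", 4) == 0) {
            negated = 1;
            term += 4;
            len -= 4;
        }
        /* A plain term fails if the import is missing; a negated term
           fails if the import is present. */
        if (is_defined(term, len, imports, count) == negated)
            return 0;
        if (sep == NULL)
            return 1;
        term = sep + 5;  /* continue after "_AND_" */
    }
}
```

For example, with B as the only existing global import, NOT_A_AND_B is satisfied, while A_AND_B is not.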

There is a small quirk related to negated conditions: At some point, the linker needs to assume that no global imports A and B exist, and any file exporting the symbol NOT_A or NOT_B needs to be imported. However, the file which exported NOT_B may actually create a global import A after the file exporting NOT_A has been imported. The linker does not detect this, so you need to be careful not to create such situations. They are especially difficult to detect if a lot of combinations of AND and NOT operators are used, and if a lot of files that react to global imports create imports of their own.

Automatically Created Global Imports

In addition to user-defined global imports, the TIGCC linker also defines some global imports of its own, whenever special code is needed to handle a situation:

__kernel_program_header

This global import is created automatically if the linker is operating in kernel mode, and if the file is not declared as a library (see the _library control symbol).

__kernel_library_header

This global import is created automatically if the linker is operating in kernel mode, and if the file is declared as a library (see the _library control symbol).

__fargo_program_header

This global import is created automatically if the linker is operating in Fargo II mode, and if the file is not declared as a library (see the _library control symbol).

__fargo_library_header

This global import is created automatically if the linker is operating in Fargo II mode, and if the file is declared as a library (see the _library control symbol).

__flash_os_header

This global import is created automatically if the linker is operating in Flash OS mode.

__nostub_comment_header

This global import is created automatically if NoStub data (comment) exports are defined for the program. The file reacting to this import must use __ld_nostub_comment_count and __ld_insert_nostub_comments to insert the actual data exports into the header.

__handle_constructors

This global import is created automatically if constructors are defined for the program. The file which handles this import must query the constructor section using the __ld_constructors_start, __ld_constructors_end, __ld_constructors_size, and __ld_constructor_count symbols.

__handle_destructors

This global import is created automatically if destructors are defined for the program. The file which handles this import must query the destructor section using the __ld_destructors_start, __ld_destructors_end, __ld_destructors_size, and __ld_destructor_count symbols.

__handle_bss

This global import is created automatically if the program contains a BSS section (a section containing uninitialized global variables). No file needs to react to this import; if the BSS section is not created at run time (using __ld_bss_size, __ld_bss_ref_count, and an appropriate insertion for the relocation), it is simply passed on to the output file. If the output format does not support sections, the BSS section is merged with the other sections. However, if the program reacts to this import, it absolutely must handle the relocation entries pointing into the BSS section.

__initialize_bss

This global import is created automatically if the program contains a BSS section, and the program requires the contents of this section to be initialized to zero. The file reacting to this import must use __ld_bss_start, __ld_bss_end, and __ld_bss_size to query the location and size of the BSS section.

__handle_relocs

This global import is created automatically if the program contains absolute relocation entries. If no file reacts to this import, relocation entries have to be handled by the output format. The file reacting to this import must use __ld_reloc_count and an appropriate insertion to get information about the necessary relocation.

__handle_rom_calls

This global import is created automatically if the program contains ROM calls. If no file reacts to this import, ROM calls are handled by the output format. The file reacting to this import must use __ld_rom_call_count and an appropriate insertion to get information about the ROM calls.

__handle_ram_calls

This global import is created automatically if the program contains RAM calls. If no file reacts to this import, RAM calls are handled by the output format. The file reacting to this import must use __ld_ram_call_count and an appropriate insertion to get information about the RAM calls.

__handle_libraries

This global import is created automatically if the program references at least one library. If no file reacts to this import, library calls are handled by the output format. The file reacting to this import must use __ld_lib_count and an appropriate insertion to get information about the libraries.

__handle_data_var

This global import is created automatically if the data section of the program is not included in the program itself but in an external file. The file that handles this import must open this file and relocate the program accordingly. It must refer to __ld_data_var_name_end, __ld_data_size, and an appropriate insertion for the relocation.

__data_var_create_copy

This global import is created automatically if the data section of the program is not included in the program itself but in an external file, and this file needs to be copied into memory (either always or under certain circumstances).

__data_var_copy_if_archived

This global import is created automatically if the data section of the program is not included in the program itself but in an external file, and this file needs to be copied into memory only if it is archived. This import is created only in combination with __data_var_create_copy.


Symbols Built into the TIGCC Linker

The TIGCC linker is capable of resolving references to certain built-in symbols. These symbols act just like normal externally defined symbols; for example, it is possible to specify an offset to be added to the symbol in the reference. The symbols may resolve to numbers or addresses. The kind of symbol should be obvious for each individual symbol; for example, it does not make sense to jump to __ld_bss_size because it resolves to a number. All numbers have to be used as immediate values; if a symbol resolves to a number, treating it as an address and reading the value at this address will return garbage.

The following symbol names are treated as built-in symbol names, and resolved in a special way:

_ROM_CALL_...

The symbol _ROM_CALL_index is resolved to the ROM call with the index index, interpreted as a hexadecimal value. The operating system translates references to such symbols in a way that they point to the specified ROM call. This is usually a function, but it can also be a variable.

See also: tiamsapi_..., _RAM_CALL_...

tiamsapi_...

The symbol tiamsapi_index is resolved to the ROM call with the index index, interpreted as a decimal value. The operating system translates references to such symbols in a way that they point to the specified ROM call. This is usually a function, but it can also be a variable.

See also: _ROM_CALL_...
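
The only difference between the two ROM call notations is the base of the index. A minimal C sketch of the name parsing follows; the function names are hypothetical, overflow is not checked, and accepting lowercase hexadecimal digits is an assumption of this sketch.

```c
#include <string.h>

/* Value of a digit character in the given base, or -1 if it is not a digit. */
static int digit_value(char c, int base)
{
    int v;
    if (c >= '0' && c <= '9')
        v = c - '0';
    else if (c >= 'a' && c <= 'f')
        v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F')
        v = c - 'A' + 10;
    else
        return -1;
    return v < base ? v : -1;
}

/* Recognize _ROM_CALL_<hex> and tiamsapi_<decimal>. Returns nonzero and
   stores the ROM call index in *index if the name matches either form. */
int parse_rom_call(const char *name, unsigned long *index)
{
    static const char hex_prefix[] = "_ROM_CALL_";
    static const char dec_prefix[] = "tiamsapi_";
    const char *p;
    int base;
    unsigned long value = 0;

    if (strncmp(name, hex_prefix, sizeof hex_prefix - 1) == 0) {
        p = name + (sizeof hex_prefix - 1);
        base = 16;
    } else if (strncmp(name, dec_prefix, sizeof dec_prefix - 1) == 0) {
        p = name + (sizeof dec_prefix - 1);
        base = 10;
    } else {
        return 0;
    }
    if (*p == '\0')
        return 0;
    for (; *p != '\0'; p++) {
        int d = digit_value(*p, base);
        if (d < 0)
            return 0;
        value = value * (unsigned long)base + (unsigned long)d;
    }
    *index = value;
    return 1;
}
```

Under these assumptions, _ROM_CALL_1F and tiamsapi_31 both denote ROM call 31.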

_RAM_CALL_...

The symbol _RAM_CALL_index is resolved to the RAM call with the index index, interpreted as a hexadecimal value. If RAM calls are supported, the operating system translates references to such symbols in a way that they either point to a specific location in memory, or are replaced by a specific value.

See also: _ROM_CALL_..., Extra RAM Addresses

_extraramaddr@..., _extraramaddr__...

_extraramaddr@index and _extraramaddr__index are treated as references to an extra RAM address with the index index interpreted as a hexadecimal value. index is an index into the extra RAM table defined by the program (using the _extraram symbol). The value which _extraramaddr... symbols are resolved to is either the TI-89 or the TI-92(+)/V200 value of the table row specified by index.

Internally, extra RAM addresses are stored as RAM calls and treated the same way. __ld_insert_kernel_ram_calls and __ld_insert_preos_compressed_tables output RAM calls and extra RAM addresses similarly to each other.

See also: _extraram, _RAM_CALL_...

...@????, ...__????

The symbol libname@index or libname__index is resolved to a call to the library libname with the index index, interpreted as a hexadecimal value. index must have exactly four hexadecimal digits; otherwise it will not be recognized. If library calls are supported, the operating system loads the specified libraries and translates references to such symbols in a way that they point to the appropriate exported symbol in the library.

Library symbols are exported in the same way they are imported, except that the first part of the symbol (the library name) is not checked. If an exported symbol in an object file has the form libname@index or libname__index, where index is a four-digit hexadecimal number and libname does not start with a dot or an underscore, it is automatically exported from the program or library. The reason for this somewhat ambiguous pattern is purely traditional.

See also: Minimum Library Versions, _ROM_CALL_..., _RAM_CALL_...
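
The pattern described above can be sketched in C as follows. The function names are hypothetical, and the sketch only illustrates the documented pattern (a library name, then @ or __, then exactly four hexadecimal digits); it does not reproduce the linker's own parser.

```c
#include <string.h>

/* Value of a hexadecimal digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Recognize libname@XXXX or libname__XXXX. On success, stores the length of
   the library name in *lib_len and the export index in *index. */
int parse_library_symbol(const char *name, size_t *lib_len, unsigned *index)
{
    size_t len = strlen(name), sep_len, i;
    unsigned value = 0;

    if (len < 6)  /* shortest match: one name character, '@', four digits */
        return 0;
    for (i = len - 4; i < len; i++) {
        int d = hex_value(name[i]);
        if (d < 0)
            return 0;
        value = value * 16 + (unsigned)d;
    }
    if (name[len - 5] == '@')
        sep_len = 1;
    else if (len >= 7 && name[len - 5] == '_' && name[len - 6] == '_')
        sep_len = 2;
    else
        return 0;
    *lib_len = len - 4 - sep_len;
    *index = value;
    return 1;
}

/* Nonzero if an exported symbol with this name would be exported from the
   program or library automatically, following the rule stated above. */
int is_automatic_export(const char *name)
{
    size_t lib_len;
    unsigned index;
    return name[0] != '.' && name[0] != '_'
        && parse_library_symbol(name, &lib_len, &index);
}
```

Note that mylib@FF is rejected by this sketch, since the index must consist of exactly four digits.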

__ld_calc_const_...

__ld_calc_const_constants, where constants is an underscore-separated list of positive integer values in decimal or hexadecimal notation (prefixed with 0x), resolves to one of the values in constants. The actual value depends on the calculator belonging to the file that is generated. This feature adds the possibility to compile a program for multiple calculators at once and still generate different files for each calculator.

The order of the calculator-specific values in constants is as follows:

  1. TI-92

  2. TI-89

  3. TI-92 Plus

  4. V200

Values for calculators which the linker is currently not generating any output file for may be omitted. If a significant value is omitted, the value is assumed to be zero, and a warning is emitted.
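
As a rough illustration, the following C sketch selects one value from the part of the symbol name that follows __ld_calc_const_. The function name and the omitted flag are hypothetical; the position parameter is the zero-based place of the target calculator in the list above, and the sketch assumes that only trailing values can be omitted.

```c
#include <ctype.h>
#include <stdlib.h>

/* Return the value at the given zero-based position of an underscore-separated
   list such as "240_160" or "0x10_20". If the list has no value at that
   position, *omitted is set to 1 and 0 is returned, mirroring the documented
   fallback (the linker additionally emits a warning). */
unsigned long calc_const_value(const char *constants, unsigned position,
                               int *omitted)
{
    const char *p = constants;
    unsigned i;

    *omitted = 0;
    for (i = 0; ; i++) {
        char *end;
        unsigned long v;
        /* Hexadecimal values carry a 0x prefix; everything else is decimal. */
        int base = (p[0] == '0' && (p[1] == 'x' || p[1] == 'X')) ? 16 : 10;

        if (!isdigit((unsigned char)*p))
            break;
        v = strtoul(p, &end, base);
        if (i == position)
            return v;
        if (*end != '_')
            break;
        p = end + 1;
    }
    *omitted = 1;
    return 0;
}
```

For instance, the list 240_160 yields 240 at position 0 and 160 at position 1; asking for position 2 reports an omitted value.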

__ld_entry_point

References to this symbol are resolved to the first address of the program that contains executable code. If startup sections are defined, this is the address of the first startup section. However, the exact meaning of the symbol is defined by the mode the linker is operating in, since startup sections do not necessarily need to contain code, and an "entry point" in the sense of this symbol does not exist in all types of files.

Note: In Fargo II mode, all references to this symbol are manually shifted by two bytes in the negative direction, since Fargo heavily uses the program variable's address as a base address, instead of the address of the Fargo header.

__ld_entry_point_plus_0x8000

This symbol is the same as __ld_entry_point plus 0x8000 (32 KB).

__ld_program_size

This built-in symbol represents the size of the main section after all sections are merged into one. If all sections are merged into a single section, this is the size of the final linked program (without any headers or footers required by the output format). If an external data variable is used, this would be the size of the main executable only (but see the note about automatic insertion below).

Resolution of this symbol is delayed until the last pass of the linker in order to ensure the size isn't changed by later range-cutting. Currently, this is the same pass which also does automatic insertions, so insertions may or may not be counted.

This symbol is currently used by the Flash OS support to write the OS size into the header which is sent to the calculator. Flash operating systems do not use relocation, so the lack of support for automatic insertions is not a problem in this context.

__ld_constructors_start

This built-in symbol represents the beginning of the constructor section of the program. A constructor section contains an array of pointers to functions, all of which do not take any parameters. These functions are to be executed at program startup. If no constructors are used, an error is reported.

See also: __ld_constructors_end, __ld_constructors_size, __ld_constructor_count, __ld_destructors_start

__ld_constructors_end

This built-in symbol represents the end of the constructor section of the program. A constructor section contains an array of pointers to functions, all of which do not take any parameters. These functions are to be executed at program startup. If no constructors are used, an error is reported.

See also: __ld_constructors_start, __ld_constructors_size, __ld_constructor_count

__ld_constructors_size

This built-in symbol represents the size of the constructor section of the program. A constructor section contains an array of pointers to functions, all of which do not take any parameters. If no constructors are used, the symbol resolves to a value of 0. The value equals the value of __ld_constructor_count multiplied by the size of a pointer, which is 4.

See also: __ld_constructors_start, __ld_constructors_end, __ld_constructor_count

__ld_constructor_count

This built-in symbol represents the number of constructors the program uses. If no constructors are used, it resolves to a value of 0. The value equals the value of __ld_constructors_size divided by the size of a pointer, which is 4.

See also: __ld_constructors_start, __ld_constructors_end, __ld_constructors_size, __ld_destructor_count
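
A file handling the constructor section essentially walks an array of function pointers from its start to its end. The following C sketch shows that loop on an ordinary array so that it compiles on any host; on the calculator, the bounds would come from __ld_constructors_start and __ld_constructors_end instead. All names in the sketch are hypothetical, and calling the functions in array order is an assumption.

```c
typedef void (*InitFunc)(void);

/* Call every function in the half-open range [start, end) in array order. */
void run_function_table(const InitFunc *start, const InitFunc *end)
{
    const InitFunc *f;
    for (f = start; f != end; f++)
        (*f)();
}

/* Two hypothetical constructors that record the order in which they ran. */
static int call_log[2];
static int call_count = 0;
static void init_first(void)  { call_log[call_count++] = 1; }
static void init_second(void) { call_log[call_count++] = 2; }

/* Stand-in for the constructor section: an array of parameterless functions. */
static const InitFunc demo_table[] = { init_first, init_second };
enum { DEMO_COUNT = sizeof demo_table / sizeof demo_table[0] };
```

Calling run_function_table(demo_table, demo_table + DEMO_COUNT) runs both functions once. On the target, where a pointer occupies 4 bytes, the element count of such a table is the section size divided by 4, which is the relation between __ld_constructors_size and __ld_constructor_count.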

__ld_destructors_start

This built-in symbol represents the beginning of the destructor section of the program. A destructor section contains an array of pointers to functions, all of which do not take any parameters. These functions are to be executed at program exit. If no destructors are used, an error is reported.

See also: __ld_destructors_end, __ld_destructors_size, __ld_destructor_count, __ld_constructors_start

__ld_destructors_end

This built-in symbol represents the end of the destructor section of the program. A destructor section contains an array of pointers to functions, all of which do not take any parameters. These functions are to be executed at program exit. If no destructors are used, an error is reported.

See also: __ld_destructors_start, __ld_destructors_size, __ld_destructor_count

__ld_destructors_size

This built-in symbol represents the size of the destructor section of the program. A destructor section contains an array of pointers to functions, all of which do not take any parameters. If no destructors are used, the symbol resolves to a value of 0. The value equals the value of __ld_destructor_count multiplied by the size of a pointer, which is 4.

See also: __ld_destructors_start, __ld_destructors_end, __ld_destructor_count

__ld_destructor_count

This built-in symbol represents the number of destructors the program uses. If no destructors are used, it resolves to a value of 0. The value equals the value of __ld_destructors_size divided by the size of a pointer, which is 4.

See also: __ld_destructors_start, __ld_destructors_end, __ld_destructors_size, __ld_constructor_count

__ld_reloc_count

This built-in symbol resolves to the number of absolute relocation entries in this program/library, except relocation entries to sections which are handled separately. If the program contains an external BSS and/or data section, relocs to this section are not counted, since they cannot be handled in the same manner as relocs into the program code.

See also: __ld_insert_kernel_relocs, __ld_insert_mlink_relocs, __ld_insert_compressed_relocs, __ld_insert_fargo021_relocs, __ld_insert_preos_compressed_tables

__ld_data_start

This built-in symbol represents the starting address of the data section, if the program does not mix text and data (for example if the data is written into an external file). If the program does not contain an explicit data section, an error is reported.

See also: __ld_data_end, __ld_data_size

__ld_data_end

This built-in symbol represents the end of the data section, if the program does not mix text and data (for example if the data is written into an external file). It points to the location behind the last item in the section. If the program does not contain an explicit data section, an error is reported.

See also: __ld_data_start, __ld_data_size

__ld_data_size

This built-in symbol represents the size of the data section in bytes, if the program does not mix text and data (for example if the data is written into an external file). If the program does not contain a data section, the symbol resolves to the value 0.

See also: __ld_data_start, __ld_data_end

__ld_data_ref_count

This built-in symbol represents the number of references to the data section, if the program does not mix text and data (for example if the data is written into an external file). If the program does not contain an explicit data section, it resolves to the value 0. If the data section is merged with the other sections, the references counted by this symbol and the references counted by __ld_reloc_count will overlap.

See also: __ld_insert_kernel_data_refs, __ld_insert_mlink_data_refs, __ld_insert_compressed_data_refs

__ld_bss_start

This built-in symbol represents the starting address of the BSS section. If the program does not contain a BSS section, an error is reported.

See also: __ld_bss_end, __ld_bss_size

__ld_bss_end

This built-in symbol represents the end of the BSS section. It points to the location behind the last item in the section. If the program does not contain a BSS section, an error is reported.

See also: __ld_bss_start, __ld_bss_size

__ld_bss_size

This built-in symbol represents the size of the BSS section in bytes. If the program does not contain a BSS section, the symbol resolves to the value 0.

See also: __ld_bss_start, __ld_bss_end

__ld_bss_ref_count

This built-in symbol represents the number of references to the BSS section. If the program does not contain a BSS section, it resolves to a value of 0. If the BSS section is merged with the other sections, the references counted by this symbol and the references counted by __ld_reloc_count will overlap.

See also: __ld_insert_kernel_bss_refs, __ld_insert_mlink_bss_refs, __ld_insert_compressed_bss_refs, __ld_insert_fargo021_bss_refs, __ld_insert_preos_compressed_tables

__ld_rom_call_count

This built-in symbol resolves to the number of ROM calls in this program/library. If the program/library does not reference any ROM code exports, it resolves to a value of 0.

See also: __ld_insert_kernel_rom_calls, __ld_insert_mlink_rom_calls, __ld_insert_compressed_rom_calls, __ld_insert_preos_compressed_tables, ROM Calls

__ld_ram_call_count

This built-in symbol resolves to the number of RAM calls in this program/library. If the program/library does not reference any RAM exports, it resolves to a value of 0. Extra RAM address references count as RAM calls as well. Note that RAM calls are handled purely by kernels, i.e. by middleware residing in the RAM.

See also: __ld_insert_kernel_ram_calls, __ld_insert_preos_compressed_tables, RAM Calls, Extra RAM Addresses

__ld_lib_count

This built-in symbol resolves to the number of libraries needed by this program/library. The libraries do not actually need to be used; it is enough for a file to specify a required minimum version for a specific library (see Minimum Library Versions). The idea is that a program should be able to specify a minimum version for a library even if the library is only referenced indirectly via another one. See __ld_referenced_lib_count for a way to count only the libraries that are actually used.

See also: __ld_referenced_lib_count, __ld_insert_kernel_libs, __ld_insert_fargo020_libs, __ld_insert_fargo021_libs, __ld_insert_preos_compressed_tables, Library Calls, Minimum Library Versions

__ld_referenced_lib_count

This built-in symbol acts like __ld_lib_count, except that it counts only the libraries that are actually used. If the program only specified a minimum version for a library, but did not use any of its exported symbols, this library is not counted.

See also: __ld_lib_count, __ld_insert_kernel_libs, __ld_insert_fargo020_libs, __ld_insert_fargo021_libs, __ld_insert_preos_compressed_tables, Library Calls, Minimum Library Versions

__ld_export_count

This built-in symbol resolves to the number of exported items in the current program/library. This equals the highest export number present in the linked files (see Library Calls) plus 1. For example, defining a single exported entry named mylib@00FF causes the export table to be 0x100 functions long, and therefore this symbol to be resolved to a value of 0x100.

See also: __ld_insert_kernel_exports, Library Calls

__ld_nostub_comment_count

This built-in symbol resolves to the number of NoStub data exports in the current program. Unlike for library exports, skipped indices are not counted.

See also: __ld_insert_nostub_comments

__ld_has_...

The symbol __ld_has_items resolves to a nonzero value if __ld_item_count is greater than zero. Currently this value is -1, but do not rely on this!

__ld_file_version

This symbol resolves to the version number of the current program, defined with _version.... If no version number has been defined, it resolves to a value of 0.

See also: _version...

__ld_kernel_flags

This symbol resolves to the value of the kernel flags. Kernel flags may be specified with _flag_... and with calculator control symbols.

See also: _flag_...

__ld_kernel_bss_table

Usually, this symbol is simply resolved to a user-defined symbol named __kernel_bss_table. However, if the program does not contain a BSS section, it is redirected to the entry point of the program. The effect is that constructs of the form

.word __ld_kernel_bss_table-entry_point

resolve to 0 if no BSS section is used.

Note: If a program/library defines __kernel_bss_table, it absolutely must handle the BSS section. See __ld_insert_kernel_bss_refs for a way to get information about references into the BSS section.

__ld_kernel_export_table

Usually, this symbol is simply resolved to a user-defined symbol named __kernel_export_table. However, if the program does not contain any exported symbols, it is redirected to the entry point of the program. The effect is that constructs of the form

.word __ld_kernel_export_table-entry_point

resolve to 0 if no exports exist.

__ld_data_var_name_end

This symbol is resolved to the address of the user-defined symbol __data_var_name_start, plus the length in bytes of the data variable name, plus 1 for the zero byte at the beginning. If no data variable is specified for the program, an error is reported. The symbol __data_var_name_start must be defined and exported by the program.

__ld_hardware_id

This built-in symbol resolves to the hardware ID of the target calculator type, a number used in the Flash OS stub.

__ld_link_time_year

This built-in symbol represents the current year when starting the linking process. It is written in binary form, not in string form.

__ld_link_time_month

This built-in symbol represents the current month when starting the linking process. It is written in binary form, not in string form.

__ld_link_time_day

This built-in symbol represents the current day when starting the linking process. It is written in binary form, not in string form.

__ld_link_time_timestamp

This built-in symbol represents the number of seconds elapsed between January 1st 1997, 00:00:00 UTC and the beginning of the linking process. It is written in binary form, not in string form.

This built-in is provided so that the Flash OS support library can define a meaningful value for the 0x900 subfield of the 0x320 certificate field contained in the Flash OS header, thus mimicking AMS and its FlashApps.
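
To make the epoch concrete, the following C sketch converts a Unix time value into the number of seconds since January 1st 1997, 00:00:00 UTC. The function and macro names are hypothetical, and the sketch assumes that time_t counts seconds since January 1st 1970, 00:00:00 UTC, as on POSIX systems.

```c
#include <time.h>

/* Unix time of January 1st 1997, 00:00:00 UTC. There are 27 years with
   7 leap days between 1970 and 1997, i.e. 9862 days of 86400 seconds. */
#define EPOCH_1997 852076800L

/* Seconds elapsed between the 1997 epoch and the given Unix time. */
long link_timestamp_from_unix(time_t unix_seconds)
{
    return (long)(unix_seconds - (time_t)EPOCH_1997);
}
```

A link started exactly one day after the epoch would therefore yield 86400.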

_exit

Usually, this symbol is not handled in a special way. However, if it does not exist at all, it is redirected to the entry point of the program. The effect is that constructs of the form

.word _exit-entry_point

resolve to 0 if the symbol is undefined.

The kernel headers of the standard library reference this symbol as the program/library destructor. For kernel programs and libraries, it is called whenever the program exits or the library is unloaded.

_comment

Usually, this symbol is not handled in a special way. However, if it does not exist at all, it is redirected to the entry point of the program. The effect is that constructs of the form

.word _comment-entry_point

resolve to 0 if the symbol is undefined.

The kernel headers reference this symbol as the comment string of the program. If it exists, it must be a zero-terminated ASCII string.

_extraram

Usually, this symbol is not handled in a special way. However, if it does not exist at all, it is redirected to the entry point of the program. The effect is that constructs of the form

.word _extraram-entry_point

resolve to 0 if the symbol is undefined.

The kernel headers of the standard library reference this symbol as an extra RAM table. The table is organized in pairs of 16 bit values. Of each pair, the first value is relevant for the TI-89, and the second value is relevant for the TI-92(+)/V200 calculator family. In C, you would define an extra RAM table like this:

struct {
  short value89, value9x;
} _extraram[] = {{v1_89, v1_9x}, {v2_89, v2_9x}, ...};

However, extra RAM tables are barely usable in C: The compiler does not support using external symbols as immediate values, except if you take their address.

See also: Extra RAM Addresses
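
A complete, compilable variant of such a table, together with the row selection that an extra RAM address reference implies, might look like the C sketch below. The table contents (screen dimensions) and all names are purely illustrative; the table is deliberately not called _extraram so that the sketch compiles on any host.

```c
typedef struct {
    short value89;  /* value used on the TI-89 */
    short value9x;  /* value used on the TI-92(+)/V200 family */
} ExtraRamRow;

/* Hypothetical table: row 0 holds a screen width, row 1 a screen height. */
static const ExtraRamRow demo_extraram[] = {
    {160, 240},
    {100, 128}
};

/* Select the value of one row for the given calculator family. */
short extra_ram_value(const ExtraRamRow *table, unsigned index,
                      int target_is_ti89)
{
    return target_is_ti89 ? table[index].value89 : table[index].value9x;
}
```

In this sketch, a reference with index 0 would resolve to 160 on the TI-89 and to 240 on the other models.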

_library

Usually, this symbol is not handled in a special way. However, if it does not exist at all, it is redirected to the entry point of the program. The effect is that constructs of the form

.word _library-entry_point

resolve to 0 if the symbol is undefined.

Note: _library is also a control symbol, which means that under normal circumstances, references to it are not allowed. However, in Fargo II mode, programs and libraries have special permission to use this symbol.


Automatically Inserted Section Contents

The TIGCC linker can insert certain variable-length data into the contents of sections. If a symbol (i.e., a label) at the end of a section is recognized as an insertion point, then the linker appends the data specified by the symbol name. If the symbol is not at the end of a section, the insertion will fail without notice, since these contents may have been inserted automatically already.

You may refer to an insertion symbol even if you did not put a label at a specific place. In this case, the data is written to an arbitrary place (usually the end of the program, but do not rely on this). However, all object files and archives are searched for exported symbols with this name first, to avoid duplication of the data.

This method is only used if some program-related data cannot be expressed using simple built-in symbols. It should be used with care, as the inserted data may be invalid under certain circumstances. The format of the data is fixed and usually represents some already established data format, but new data formats may be developed on demand. The insertion symbols which are currently recognized are:

__ld_insert_kernel_relocs

Relocation entries indicate that the program needs to know some addresses which are only available at run time. In this case, the addresses referred to are locations inside the program code or data. Usually, all references to absolute addresses inside the program are inserted into the section if one of these two insertions is used; however, sections may be marked as being handled in another way, which prevents relocation entries to them from being output in this way. The data section is automatically marked as handled if it is externalized; the BSS section is marked as handled by referencing __ld_kernel_bss_table, by using __ld_insert_kernel_bss_refs before inserting the relocation entries, or by reacting to the __handle_bss global import.

__ld_insert_kernel_relocs uses the kernel format for storing relocation entries, which is used by kernels on the TI-89, TI-92 Plus, and V200, and by Fargo v0.2.0:

If a program uses this insertion, it must process it as follows:

Before program termination, this process has to be reverted, so that it can be repeated the next time the program starts. Since programs may be moved in memory while they are not executed, they may not simply deactivate the relocation code. This would also prevent programs from being transferred between devices.

Note: Relocation entries may only be inserted at a single place in the program. The reason for this is that the linker may have to add new relocation entries after they have been written into the section. Instead of keeping track of which entries have already been processed, we thought it would be easier to remove them once they have been written into a section. Also, it is dangerous to use this insertion from anything other than a startup section.

See also: __ld_insert_kernel_bss_refs, __ld_insert_kernel_data_refs, __ld_insert_mlink_relocs, __ld_insert_compressed_relocs

__ld_insert_mlink_relocs

__ld_insert_mlink_relocs inserts relocs in a compressed format known from mlink. For more information on inserting and processing relocs, see __ld_insert_kernel_relocs.

In the following format description, offset refers to the difference in words (half of the difference in bytes) between the start of this reloc and the start of the previous reloc. If there is no previous reloc (i.e. for the first reloc), offset is the distance in words between this reloc and the symbol __ld_mlink_relocs_ref. This symbol must be exported to be found. If it is not found, the entry point is used instead (see __ld_entry_point).

Note: The limitations of __ld_insert_kernel_relocs also apply to this insertion.

See also: __ld_insert_mlink_bss_refs, __ld_insert_mlink_data_refs, __ld_insert_kernel_relocs

__ld_insert_compressed_relocs

__ld_insert_compressed_relocs inserts relocs in a compressed format known from Fargo. For more information on inserting and processing relocs, see __ld_insert_kernel_relocs.

In the following format description, offset refers to the difference in words (half of the difference in bytes) between the start of this reloc and the end of the previous reloc. If there is no previous reloc (i.e. for the first reloc), offset is the distance in words between this reloc and the symbol __ld_compressed_relocs_ref. This symbol must be exported to be found. If it is not found, the entry point is used instead (see __ld_entry_point).

Note: The limitations of __ld_insert_kernel_relocs also apply to this insertion.

See also: __ld_insert_compressed_bss_refs, __ld_insert_compressed_data_refs, __ld_insert_kernel_relocs
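
The offset definitions of __ld_insert_mlink_relocs and __ld_insert_compressed_relocs differ only in the point from which the distance is measured. The C sketch below computes the word offsets for both variants from a sorted list of reloc positions given in bytes. The function names are hypothetical, the sketch assumes that every reloc occupies 4 bytes, and it does not encode the offsets into either on-disk format.

```c
#include <stddef.h>

enum { RELOC_SIZE = 4 };  /* assumption: each reloc is a 4-byte address */

/* mlink style: distance in words from the start of the previous reloc
   (or from the reference position for the first reloc). */
void mlink_offsets(const unsigned long *locations, size_t count,
                   unsigned long ref, unsigned long *offsets)
{
    unsigned long base = ref;
    size_t i;
    for (i = 0; i < count; i++) {
        offsets[i] = (locations[i] - base) / 2;
        base = locations[i];
    }
}

/* Compressed style: distance in words from the end of the previous reloc
   (or from the reference position for the first reloc). */
void compressed_offsets(const unsigned long *locations, size_t count,
                        unsigned long ref, unsigned long *offsets)
{
    unsigned long base = ref;
    size_t i;
    for (i = 0; i < count; i++) {
        offsets[i] = (locations[i] - base) / 2;
        base = locations[i] + RELOC_SIZE;
    }
}
```

For relocs at byte positions 10 and 20 and a reference position of 0, the mlink-style offsets are 5 and 5, whereas the compressed-style offsets are 5 and 3.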

__ld_insert_fargo021_relocs

__ld_insert_fargo021_relocs inserts relocs in the compressed format used by Fargo 0.2.1. For more information on inserting and processing relocs, see __ld_insert_kernel_relocs.

This insertion is the same as __ld_insert_compressed_relocs, except that the reference symbol used if there is no previous reloc (i.e. for the first reloc) is __ld_fargo021_relocs_ref. Fargo expects this symbol to be at a fixed position: the position of the format flag in the Fargo header. This is currently handled by the definition of the Fargo header.

Fargo support must be compiled in for this insertion to be defined.

Note: The limitations of __ld_insert_kernel_relocs also apply to this insertion.

See also: __ld_insert_compressed_relocs, __ld_insert_kernel_relocs

__ld_insert_kernel_bss_refs

__ld_insert_kernel_bss_refs outputs references to the BSS section in the format defined in __ld_insert_kernel_relocs. The only difference is that the relocation address is not the entry point of the program but the beginning of the BSS section.

If you insert these references, the linker assumes that the BSS section is handled by you; that is, you have to allocate it dynamically using __ld_bss_size and use a pointer to it as the relocation address.

Note: The limitations of __ld_insert_kernel_relocs also apply to this insertion.

See also: __ld_insert_kernel_data_refs, __ld_insert_kernel_relocs, __ld_insert_mlink_bss_refs, __ld_insert_compressed_bss_refs

__ld_insert_mlink_bss_refs

__ld_insert_mlink_bss_refs inserts relocs in the format defined in __ld_insert_mlink_relocs. The only differences are that the relocation address is not the entry point of the program but the beginning of the BSS section and that the reference symbol used if there is no previous reloc (i.e. for the first reloc) is __ld_mlink_bss_refs_ref.

If you insert these references, the linker assumes that the BSS section is handled by you; that is, you have to allocate it dynamically using __ld_bss_size and use a pointer to it as the relocation address.

Note: The limitations of __ld_insert_kernel_relocs also apply to this insertion.

See also: __ld_insert_mlink_data_refs, __ld_insert_mlink_relocs, __ld_insert_kernel_bss_refs

__ld_insert_compressed_bss_refs

__ld_insert_compressed_bss_refs inserts relocs in the format defined in __ld_insert_compressed_relocs. The only differences are that the relocation address is not the entry point of the program but the beginning of the BSS section and that the reference symbol used if there is no previous reloc (i.e. for the first reloc) is __ld_compressed_bss_refs_ref.

If you insert these references, the linker assumes that the BSS section is handled by you; that is, you have to allocate it dynamically using __ld_bss_size and use a pointer to it as the relocation address.

Note: The limitations of __ld_insert_kernel_relocs also apply to this insertion.

See also: __ld_insert_compressed_data_refs, __ld_insert_compressed_relocs, __ld_insert_kernel_bss_refs

__ld_insert_fargo020_bss_refs

__ld_insert_fargo020_bss_refs acts like __ld_insert_kernel_bss_refs, except that it always outputs the two terminating zero bytes, even if no references into the BSS section exist.

Fargo support must be compiled in for this insertion to be defined.

See also: __ld_insert_kernel_bss_refs

__ld_insert_fargo021_bss_refs

__ld_insert_fargo021_bss_refs inserts relocs in the compressed format used by Fargo 0.2.1. It acts like __ld_insert_compressed_bss_refs, except that the size of the BSS section is automatically output (as a 2-byte entry) in front of the actual relocation table, and that the reference symbol used if there is no previous reloc (i.e. for the first reloc) is __ld_fargo021_bss_refs_ref. Fargo expects this symbol to be at a fixed position: the position of the format flag in the Fargo header. This is currently handled by the definition of the Fargo header.

For more information on inserting and processing relocs, see __ld_insert_kernel_relocs.

Fargo support must be compiled in for this insertion to be defined.

Note: The limitations of __ld_insert_kernel_relocs also apply to this insertion.

See also: __ld_insert_fargo021_relocs, __ld_insert_kernel_relocs

__ld_insert_kernel_data_refs

__ld_insert_kernel_data_refs outputs references to the data section in the format defined in __ld_insert_kernel_relocs. The only difference is that the relocation address is not the entry point of the program but the beginning of the data section.

If you read the data from an external variable (see __handle_data_var), you have to use the address of the variable (or a copy) as the relocation address.

Note: The limitations of __ld_insert_kernel_relocs also apply to this insertion.

See also: __ld_insert_kernel_bss_refs, __ld_insert_kernel_relocs, __ld_insert_mlink_data_refs, __ld_insert_compressed_data_refs

__ld_insert_mlink_data_refs

__ld_insert_mlink_data_refs outputs references to the data section in the format defined in __ld_insert_mlink_relocs. The only differences are that the relocation address is not the entry point of the program but the beginning of the data section and that the reference symbol used if there is no previous reloc (i.e. for the first reloc) is __ld_mlink_data_refs_ref.

If you read the data from an external variable (see __handle_data_var), you have to use the address of the variable (or a copy) as the relocation address.

Note: The limitations of __ld_insert_kernel_relocs also apply to this insertion.

See also: __ld_insert_mlink_bss_refs, __ld_insert_mlink_relocs, __ld_insert_kernel_data_refs

__ld_insert_compressed_data_refs

__ld_insert_compressed_data_refs outputs references to the data section in the format defined in __ld_insert_compressed_relocs. The only differences are that the relocation address is not the entry point of the program but the beginning of the data section and that the reference symbol used if there is no previous reloc (i.e. for the first reloc) is __ld_compressed_data_refs_ref.

If you read the data from an external variable (see __handle_data_var), you have to use the address of the variable (or a copy) as the relocation address.

Note: The limitations of __ld_insert_kernel_relocs also apply to this insertion.

See also: __ld_insert_compressed_bss_refs, __ld_insert_compressed_relocs, __ld_insert_kernel_data_refs

__ld_insert_kernel_rom_calls

__ld_insert_kernel_rom_calls can be used to handle ROM calls. It inserts references to ROM calls in the format used by kernels:

If a program uses this insertion, it must process it as follows:

Before program termination, this process has to be reverted, so that it can be repeated the next time the program starts. Simply deactivating the relocation code would prevent programs from being transferred between devices.

See also: __ld_insert_mlink_rom_calls, __ld_insert_compressed_rom_calls

__ld_insert_mlink_rom_calls

__ld_insert_mlink_rom_calls can be used to handle ROM calls. It inserts references to ROM calls in a compressed format known from mlink but specifically altered for TIGCC:

For more information on processing ROM call relocation, see __ld_insert_kernel_rom_calls.

See also: __ld_insert_kernel_rom_calls

__ld_insert_compressed_rom_calls

__ld_insert_compressed_rom_calls can be used to handle ROM calls. It inserts references to ROM calls in a compressed format known from Fargo but specifically altered for TIGCC:

For more information on processing ROM call relocation, see __ld_insert_kernel_rom_calls.

See also: __ld_insert_kernel_rom_calls

__ld_insert_kernel_ram_calls

__ld_insert_kernel_ram_calls can be used to handle RAM calls. It inserts references to RAM calls in the format used by kernels:

If a program uses this insertion, it must process it as follows:

Before program termination, this process has to be reverted, so that it can be repeated the next time the program starts. Simply deactivating the relocation code would prevent programs from being transferred between devices.

__ld_insert_kernel_libs

__ld_insert_kernel_libs can be used to handle library calls. It inserts references to libraries in the format used by kernels:

If a program uses this insertion, it must process it as follows:

Before program termination, this process has to be reverted, so that it can be repeated the next time the program starts. Since programs and libraries may be moved in memory while they are not executed, they may not simply deactivate the relocation code. This would also prevent programs from being transferred between devices.

See also: __ld_insert_fargo020_libs, __ld_insert_fargo021_libs

__ld_insert_fargo020_libs

__ld_insert_fargo020_libs can be used to handle library calls. It inserts references to libraries in the format used by Fargo v0.2.0:

The libraries have to be processed using the method described in __ld_insert_kernel_libs, except that library versions are not implemented by this format.

Note: This insertion is available only if Fargo support is compiled in.

See also: __ld_insert_fargo021_libs, __ld_insert_kernel_libs

__ld_insert_fargo021_libs

__ld_insert_fargo021_libs can be used to handle library calls. It inserts references to libraries in the format used by Fargo v0.2.1:

The libraries have to be processed using the method described in __ld_insert_kernel_libs, except that library versions are not implemented by this format.

Note: This insertion is available only if Fargo support is compiled in.

See also: __ld_insert_fargo020_libs, __ld_insert_kernel_libs

__ld_insert_kernel_exports

__ld_insert_kernel_exports can be used to export symbols from a library. It treats all symbols that are declared external and look like "libname@index" or "libname__index" as exported entries. index is a hexadecimal number which must have exactly 4 digits.

__ld_insert_kernel_exports inserts library exports in the format used by kernels:

Note: Since exported entries are stored one after another, skipped entries will take up additional space in the export table. For example, if you only define one symbol called "libname@0010", then there will be 16*2=32 bytes of zeroes in the export table.
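
The naming rule and the padding arithmetic from the note can be checked with a short sketch. The helper names export_index and padding_bytes are hypothetical, and the 2 bytes per table entry are taken from the 16*2=32 calculation above; this is not the linker's actual parsing code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the index encoded in "libname@XXXX" or
   "libname__XXXX", or -1 if the symbol does not end in a separator
   followed by exactly 4 hexadecimal digits. */
static long export_index(const char *symbol)
{
    size_t len = strlen(symbol);
    if (len < 5)
        return -1;
    const char *digits = symbol + len - 4;
    for (int i = 0; i < 4; i++)
        if (!isxdigit((unsigned char)digits[i]))
            return -1;
    int at_sep = digits[-1] == '@';
    int us_sep = len >= 6 && digits[-1] == '_' && digits[-2] == '_';
    if (!at_sep && !us_sep)
        return -1;
    return strtol(digits, NULL, 16);
}

/* Zero bytes in front of a lone exported entry, assuming 2 bytes per
   skipped table entry (as in the 16*2=32 example). */
static long padding_bytes(long index)
{
    assert(index >= 0);
    return index * 2;
}
```

With these assumptions, "libname@0010" yields index 16 and 32 bytes of padding, while "libname@10" is rejected because it has only two digits.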

See also: __ld_insert_fargo_exports

__ld_insert_fargo_exports

__ld_insert_fargo_exports can be used to export symbols from a library. It treats all symbols that are declared external and look like "libname@index" or "libname__index" as exported entries. index is a hexadecimal number which must have exactly 4 digits.

__ld_insert_fargo_exports inserts library exports in the format used by the Fargo II kernel:

Note: Since exported entries are stored one after another, skipped entries will take up additional space in the export table. For example, if you only define one symbol called "libname@0010", then there will be 16*2=32 bytes of zeroes in the export table.

This insertion is available only if Fargo support is compiled in.

See also: __ld_insert_kernel_exports

__ld_insert_preos_compressed_tables

__ld_insert_preos_compressed_tables is the most complex of the automatic insertions. It inserts all relocation-related tables in the compressed format expected by PreOs 0.68 or higher. PreOs expects those tables to be pointed to by the same pointer, so they need to be inserted all at once. The reference address expected by PreOs is the same for all relocation tables: 36 (0x24). It is defined as the end address of the smallest possible header/stub combination. (However, the smallest possible stub is not usable in practice because it does not emit any error messages. Therefore, the address does not correspond to any actual address in TIGCCLIB, so it is hard-coded in the linker.) The tables it inserts are, in order:

PreOs uses a special format for the indices. It is not the same as for the relocation table. Instead, a PreOs index is encoded using one of the following formats:

where index is the actual index, and offset is index + 1 for the first index and the difference between index and the previous index for the following ones.
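
The offset rule stated above can be written out as a minimal sketch. The function name preos_index_offsets is hypothetical, and the sketch assumes the indices are distinct and sorted in ascending order; it only derives the offset values and does not perform the byte-level encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: converts ascending indices into PreOs-style offsets.
   The first offset is index + 1; each later offset is the difference
   between the index and the previous index. */
static void preos_index_offsets(const unsigned *indices, size_t count,
                                unsigned *out)
{
    for (size_t i = 0; i < count; i++) {
        if (i == 0) {
            out[i] = indices[i] + 1;
        } else {
            assert(indices[i] > indices[i - 1]);   /* assumed ordering */
            out[i] = indices[i] - indices[i - 1];
        }
    }
}
```

For the indices 0, 3, 4 and 10, this produces the offsets 1, 3, 1 and 6.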

Note: Since parts of this insertion are dealing with relocs, the limitations of __ld_insert_kernel_relocs also apply to this insertion.

See also: __ld_insert_compressed_relocs, __ld_insert_compressed_rom_calls, __ld_insert_kernel_ram_calls, __ld_insert_compressed_bss_refs

__ld_insert_nostub_comments

__ld_insert_nostub_comments is used to export data symbols in the NoStub comment header. It treats all symbols that are declared external and look like "_nostub_data__index" as exported entries. index is a hexadecimal number which must have exactly 4 digits.

__ld_insert_nostub_comments inserts data exports in the format used by the NoStub comment specification:

__ld_insert_data_var_name

__ld_insert_data_var_name inserts an ANSI string containing the name of the data variable as specified during the invocation of the linker. A terminating zero byte is appended; however, the name does not automatically start with a zero byte.


TIGCC Linker Binary Code Fixup

The TIGCC linker can do many operations on binary code. If you want it to behave correctly in all cases, you need to make sure that no executable code is included in a data section, and no data is included in a code section. However, TIGCC usually merges all data into the code section to optimize references to it, so in rare cases it is possible for the linker to generate incorrect code.

Binary code fixup is divided into several categories:

NOP Instruction Removal

The TIGCC linker features the removal of unnecessary NOP instructions. For file formats which may insert a NOP (No OPeration) instruction at the end of a section in order to align its size to a specific boundary, it can remove this instruction to save a little space. Currently only the AmigaOS format is known to do this (see TIGCC Linker File Formats). If the section ends with more than one NOP instruction, all instructions are kept.
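
The rule can be sketched as follows. The function name strip_trailing_nop is hypothetical, and the sketch simply inspects the raw section bytes for the MC68000 NOP opcode (0x4E71); it does not reproduce the linker's own bookkeeping, which may take relocation entries and symbols into account.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define M68K_NOP 0x4E71u   /* MC68000 NOP opcode */

/* Reads a big-endian 16-bit word. */
static unsigned read_be16(const uint8_t *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

/* Hypothetical helper: returns the section size after removing a single
   trailing NOP. If the section ends with two or more NOPs, nothing is
   removed, as described above. */
static size_t strip_trailing_nop(const uint8_t *data, size_t size)
{
    assert(data != NULL || size == 0);
    if (size < 2 || read_be16(data + size - 2) != M68K_NOP)
        return size;                       /* no trailing NOP */
    if (size >= 4 && read_be16(data + size - 4) == M68K_NOP)
        return size;                       /* more than one NOP: keep all */
    return size - 2;
}
```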

Return Sequence Optimization

The TIGCC linker can optimize function return sequences. If a section ends with a subroutine branch followed by a simple return instruction, the subroutine branch is converted into a simple unconditional branch (jump), and the return instruction is removed. Note that this may fail easily if there is a branch to the return instruction somewhere; if the return instruction is removed, the branch will point to arbitrary code or data. You can make this less likely by telling the assembler to emit all local labels, so the linker knows it cannot optimize a return sequence because there is a label in front of the return instruction. With the GNU Assembler, this is done by using the '--keep-locals' option, which is included automatically if range-cutting is enabled. With the A68k Assembler, the '-d' switch does the job.
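
As a rough illustration, the following sketch handles only one case: a section ending in a BSR.W (opcode word 0x6100 followed by a 16-bit displacement) and an RTS (0x4E75). The function name optimize_return_sequence is hypothetical. The sketch assumes the displacement is already present in the section data and that no label points at the RTS; the linker itself has to consider relocation entries and symbols as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: if the section ends with "bsr.w target; rts",
   rewrites it in place to "bra.w target" and returns the reduced size.
   The displacement stays valid because the branch keeps its address. */
static size_t optimize_return_sequence(uint8_t *data, size_t size)
{
    assert(data != NULL || size == 0);
    if (size < 6)
        return size;
    uint8_t *p = data + size - 6;
    if (p[0] == 0x61 && p[1] == 0x00 &&    /* bsr.w */
        p[4] == 0x4E && p[5] == 0x75) {    /* rts */
        p[0] = 0x60;                       /* bsr.w -> bra.w */
        return size - 2;                   /* drop the rts */
    }
    return size;
}
```

The two bytes saved per converted sequence are the RTS instruction itself.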

Branch Fixup and Optimization

On some architectures, certain branches are not permitted. For example, on the MC68000 processor, it is not possible to branch to the next instruction using a short branch. While the assembler usually detects such invalid situations, they may still occur if the branch target is in a different section or file. The TIGCC linker detects such invalid situations and tries to resolve them as well as possible: If it is invalid for a branch at the end of a section to point to the beginning of the next section, it is removed unless it is a subroutine branch. For subroutine branches, a NOP instruction is inserted instead.

In addition to fixing invalid branches, the TIGCC linker can optimize branch instructions to reduce the number of absolute relocations needed. If an absolute branch (jump or subroutine branch) can be converted to a relative branch, the operating system does not need to insert the destination address at run time; therefore this will save space. Moreover, if range-cutting is enabled, optimizing branches can reduce the size of the code.
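
Whether such a conversion is possible depends on the distance between the branch and its target. The sketch below checks this for a conversion to a word-sized relative branch on the MC68000, where the displacement is measured from the address of the instruction plus 2. The function name relative_displacement is hypothetical, and both addresses are assumed to be final offsets within the same merged section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 and stores the 16-bit displacement if an
   absolute branch at insn_addr to target can become a word-sized relative
   branch, or returns 0 if the target is out of range. */
static int relative_displacement(uint32_t insn_addr, uint32_t target,
                                 int16_t *disp)
{
    assert(disp != NULL);
    /* MC68000 branches are relative to the instruction address plus 2. */
    int64_t d = (int64_t)target - ((int64_t)insn_addr + 2);
    if (d < INT16_MIN || d > INT16_MAX)
        return 0;
    *disp = (int16_t)d;
    return 1;
}
```

A JSR with an absolute long address takes 6 bytes, whereas BSR.W takes 4, which is where the size reduction under range-cutting comes from.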

F-Line Branch Optimization

The linker can convert absolute branches (which would normally need a relocation entry) into special relative F-Line sequences. These sequences are handled by an interrupt handler. The fact that an interrupt is needed makes these branches significantly slower, but using them can save quite a bit of space in the program.

There are two types of F-Line branches: The default version can be activated using the __ld_use_fline_jumps control symbol. Each branch has a size of six bytes. They are relative to their own address, which means that they can be supported by the AMS, and in fact, the AMS implements an interrupt handler for these branches starting from version 2.04. The other version can be activated using the __ld_use_4byte_fline_jumps control symbol. As the name says, each branch has a size of four bytes. They are relative to the program's entry point, so only an emulator that is installed from the program can handle them. Since they use codes that are otherwise used for ROM calls, this might break applications that are called from the program, if any. However, this is very unlikely, as the two ROM calls used are not defined yet.

Move/Load/Push Instruction Optimization

If this type of optimization is turned on, then the TIGCC linker optimizes all instructions that move data between two places. This includes instructions to move data between memory and registers or between two places in memory, instructions to load the address of a memory location into a register, and instructions to push the contents of a memory location on the stack. Note that due to the great variety of such instructions, this optimization is more likely to cause errors than others.

This optimization can reduce the number of absolute references to locations inside the program, and it can also decrease the size of the code if range-cutting is enabled.

Compare/Test Instruction Optimization

If this type of optimization is turned on, then the TIGCC linker optimizes all instructions that compare the contents of a memory location with something. This includes operations that compare data and operations that test whether something is zero.

This optimization can reduce the number of absolute references to locations inside the program, and it can also decrease the size of the code if range-cutting is enabled.

Calculation Instruction Optimization

If this type of optimization is turned on, then the TIGCC linker optimizes all instructions that perform calculations based on data stored in memory. This includes addition, subtraction, multiplication, division, and bitwise manipulation instructions.

This optimization can reduce the number of absolute references to locations inside the program, and it can also decrease the size of the code if range-cutting is enabled.


ld-tigcc Program Dumps

If you turn on dumps in ld-tigcc (using the '--dump[n]' option), ld-tigcc prints the contents of the internal data structures to standard output between various events during the linking stage. These places are numbered, so that you can turn on a specific dump by adding an index to the option. The following table shows the current location of the different dumps:

Dump 0

This dump is produced just after all object files have been imported. Relocation entries that refer to different files have not been resolved yet; neither have ROM/RAM/library call symbols been translated into actual ROM/RAM/library calls. Archive members have only been imported if this was specified by a global import.

Dump 1

Relocation entries have been resolved to the maximum extent possible. If they could not be resolved to existing symbols or were treated as RAM/ROM/library calls, archive members have been imported for them.

Dump 2

All uninitialized and zero-data sections have been merged, as well as all data sections if this was necessary. Automatic global imports have been added.

Dump 3

All global imports have been processed, even the ones which contained negations.

Dump 4

Relocation entries from the archive members which were just imported by global imports have been resolved, possibly importing new archive members. If relocation optimization is enabled, another dump is appended with the same contents after relocation optimization. If removal of unused sections is enabled, a third dump is inserted after this removal.

Dump 5

All sections which are not externalized have been merged. Note that some parts of the code fixup need to be done just before the sections are merged, so there is also a lot of code fixup and optimization between dumps 4 and 5.

Dump 6

The remaining code fixup and optimization has been performed.

Dump 7

Certain built-in symbols whose value depends on the program contents have been resolved. Insertions that were requested to be added at an arbitrary place have been added to the end of the program.

Dump 8

Relative relocation entries have been replaced by the actual distances they had represented.

The program dumps are usually self-explanatory; however, a few items require special attention:

<nB: target [- relation] [+/- offset]>

Indicates a relocation entry or ROM/RAM/library call. target and relation have the form symbol[+/-offset], which stands for the address of symbol, corrected by offset. n specifies the number of bytes reserved for the address/offset of the target.

<nB: ... (rel)>

Indicates that the relocation entry is relative to its own address.

<nB: symbol (?) ...>

Indicates a relocation entry which has not been resolved yet.

<nB: symbol (->) ...>

Indicates a relocation entry pointing into another section.

<nB: ...> (!)

Indicates that the section data at the place of the relocation entry is nonzero. This usually indicates a problem, but in rare cases, it is valid. For example, if you use ROM/RAM/library calls with an offset, this offset is usually emitted into the section contents, but the actual call still exists.

address: (!)

Indicates an internal inconsistency (for example an item outside of a section, overlapping items, or an internal ordering error).

Section offsets and data are always output in hexadecimal notation. Question marks indicate uninitialized data, which may have random content.


Recompiling ld-tigcc and ar-tigcc

Recompiling ld-tigcc and ar-tigcc (or the corresponding link DLL) from source may be useful if you want to make the linker as efficient as possible by disabling certain features. Recompilation requires GCC and GNU make. If you are using Linux, you may simply run make in the source code directory; the same is probably true for other Unix variants. If you are using Windows, you need to download MSYS or make some minor modifications to the makefile.

If you only want to disable some features, you can take a look at the definitions in the makefile (the file called Makefile). The DEFINES variable contains the general features to be included; EXE_DEFINES contains the features that should only be included in the executable files (not in the DLL). All available definitions are documented at the top of generic.h.

For example, if you want to disable support for the AmigaOS files generated by the A68k Assembler, you may simply remove the '-DAMIGAOS_SUPPORT' definition from the DEFINES variable. Note that some combinations are invalid; for example, if you disable support for all object file formats, you will get a "file format not recognized" error whenever you try to link some files.

