The FCML Library Reference Manual

Authors:
Sławomir Wojtasiak
Copyright © 2014-2016 Sławomir Wojtasiak. All Rights Reserved.

What is FCML

FCML is an acronym for Free Code Manipulation Library. It is a general-purpose machine code manipulation library for the IA-32 and Intel 64 architectures. The library supports UNIX-like systems as well as Windows and is highly portable. FCML is free for commercial and non-commercial use as long as the terms of the LGPL license are met.

Features

All major features supported by the FCML library:

  • A one-line disassembler
  • A one-line assembler
  • An experimental multi-pass load-and-go assembler (Multi line!)
  • Support for Intel and AT&T syntax
  • Instruction parsers
  • Instruction renderers
  • Instructions represented as generic models
  • GNU/Linux and Windows support
  • Portable - written entirely in C
  • Supported instruction sets: MMX, 3D-Now!, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, SSE4A, AVX, AVX2, AES, TBM, BMI1, BMI2, HLE, ADX, CLMUL, RDRAND, RDSEED, FMA, FMA4, LWP, SVM, XOP, VMX, SMX

License

FCML (Free Code Manipulation Library).

Copyright (C) 2010-2014 Sławomir Wojtasiak

This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

Installation

The following chapters explain how to download and install FCML under GNU/Linux and Windows systems.

Downloading

Follow the Download link to get the current FCML distribution package. The distribution package contains the source code and prebuilt binaries for Windows.

GNU/Linux

FCML uses autotools, so its installation process is standardized and follows the usual autotools conventions. All you have to do is execute the following commands:

./configure
make && make install

Then, in order to check if everything works you should also execute unit tests using the following command:

make check

It should return a test execution report, like this one:

Test [Internal unit tests]: 
Run Summary:      Type         Ran     Passed     Failed 
                suites          13         13          0 
                 tests         184        184          0 
               asserts        1772       1772          0 
PASS: fcml_internal_check 
============= 
1 test passed 
============= 
…
Test [Public API tests]: 
Run Summary:      Type         Ran     Passed     Failed 
                suites          28         28          0 
                 tests         610        610          0 
               asserts        8096       8096          0 
PASS: fcml_public_check 
============= 
1 test passed 
============= 

FCML does not have any external dependencies, so the build process is straightforward and very unlikely to fail. If needed, you can customize the build by passing additional options to the configure script. For instance, to build only the static library, you can disable the shared one by adding the --disable-shared option as follows:

./configure --disable-shared

Use the --help parameter to display all available configuration options. The INSTALL file in the distribution archive describes the installation process in full detail.

If you have Doxygen installed, the API documentation will also be generated.

Windows

Building Windows binaries is a bit more complicated, which is why there are dedicated archives with pre-built libraries.

MinGW

In case of MinGW the build process is quite straightforward as long as we do not want to build a library with undecorated symbol names. In such a case the configure script should be used the same way as in the case of UNIX-like systems. For instance:

./configure
make && make install

Remember that such libraries are not compatible with Visual Studio, because Visual Studio uses different symbol decorations. You can generate undecorated symbols using the MinGW toolchain, but that is a more advanced task and is out of the scope of this manual (see dlltool and module definition files).

If you would like to build binaries for the x86_64 architecture, you have to install mingw-w64 and then set the appropriate host when running the configure script:

$ ./configure --host=x86_64-w64-mingw32
…
***************************************************
fcml version 1.0.0 
Host CPU.........: x86_64 
Host OS:.........: mingw32 
Prefix:..........: /usr/local 
Debug Build......: no 
Shared Library...: yes 
Compiler.........: x86_64-w64-mingw32-gcc -std=gnu99 -g -O2 
Linker...........: C:/mingw64/x86_64-w64-mingw32/bin/ld.exe 
Doxygen..........: NONE 
***************************************************

In the report "Host CPU" should point to the appropriate CPU architecture.

If you have Doxygen installed, the API documentation will also be generated. As you can see in the example above, Doxygen was not found.

Visual Studio

There are solutions prepared for Visual Studio in the win32/vs2008, win32/vs2010 and win32/vs2013 directories. All you need to do is load the solution in your version of Visual Studio and click build. All paths are relative to the distribution directory, so everything should build without any problems. You can also choose between a few configurations in order to build static or dynamic libraries. The solutions support x86 and x64 builds.

Remember that the header files available in ${DIST_DIR}/include have to be added as an include directory to the destination project in order to use the built libraries.

Visual Studio Express is fully supported, so if you do not have access to the full version, you can build the library using the express version.

Quick Guide

The following chapters show how to assemble and disassemble instructions with as little effort as possible, without digging into every detail and function supported by the library. These chapters are completely independent, so if you are interested only in disassembling, feel free to read only the disassembler section.

You will need working FCML binaries, so if you do not have any, head over to the Installation chapter first.

Assembler

The FCML assembler allows us to assemble instructions encoded in a generic instruction model (called GIM in the next sections) prepared by the user or returned as a result of the instruction parsing process. Therefore the first thing is the GIM. For the purpose of this chapter we will prepare it on our own, but you can also convert a textual instruction to a GIM instance using FCML parsers (see: Parser for more details about parsing).

So for instance the GIM for a simple instruction: "adc ax, 0x8042" can be encoded as follows:

#include <fcml_common.h>

fcml_st_instruction instruction = {0};
instruction.mnemonic = "adc";
instruction.operands[0] = FCML_REG( fcml_reg_AX );
instruction.operands[1] = FCML_IMM16( 0x8042 );
instruction.operands_count = 2;

The structure fcml_st_instruction is defined in the fcml_common.h header file. It is the main structure of the GIM model. Utility macros FCML_REG and FCML_IMM16 are defined in the fcml_common_utils.h header file and can be used just to make source code shorter. See the page API where all utility macros and functions are described.

The second line allocates space for the instruction model. In the third line we specify an instruction by its mnemonic (remember that the mnemonic is dialect dependent). Line four defines a register operand with the AX general purpose register. Line five sets the second operand to a 16-bit unsigned immediate value, and the last line specifies the number of operands used.

Now when the GIM is ready it is time to prepare the assembler which will be able to assemble the model. In order to initialize the assembler an initialized dialect instance is needed. Thanks to the dialects the library is able to use different instruction syntaxes like the Intel or AT&T. So let's prepare an instance of the Intel dialect:

#include <fcml_intel_dialect.h>

fcml_st_dialect *dialect;
fcml_ceh_error error = fcml_fn_dialect_init_intel( FCML_INTEL_DIALECT_CF_DEFAULT, &dialect );
if( error ) {
	printf( "Cannot initialize the Intel dialect, error: %d\n", error );
	exit(EXIT_FAILURE);
}

As you can see there are dedicated functions to create different dialects. Every function can also take additional parameters used to configure the initialized dialect instance. In theory it should be possible to implement every existing dialect for FCML but currently only the Intel and AT&T (called GAS) dialects are supported. For the sake of the example the Intel dialect (the preferred one) is used.

The dialect is ready, so let's initialize the assembler using the fcml_fn_assembler_init function:

fcml_st_assembler *assembler;
error = fcml_fn_assembler_init( dialect, &assembler );

To keep the code short, error handling has been omitted in this specific case, but it should be implemented in the same way as for dialect initialization. All possible error codes are defined in the fcml_errors.h header file.

Once both the dialect and the assembler are initialized, there is one last thing to prepare before assembling is possible: the assembler result structure. This structure is reusable, so it has to be prepared in the right way to allow the assembler to reuse it correctly. To do so, a manually allocated structure has to be passed to the fcml_fn_assembler_result_prepare function:

fcml_st_assembler_result asm_result;
fcml_fn_assembler_result_prepare( &asm_result );

Notice that we have not used the partial initialization to clear the memory held by the allocated structure this time. We could omit it because it is the mentioned function that clears the structure for us.

That is all, assembler is prepared to do its job, so let's try to assemble the model from the example above.

The main structure that has to be properly prepared for the assembler to work is fcml_st_assembler_context. It consists of the previously initialized assembler instance which should be used to assemble machine code, some configuration flags we can use to configure assembling process and an entry point which will be used to inform the assembler about the code segment the instruction is destined for. The assembler context itself can be initialized on the stack but it is very important to clear memory it uses before passing it to the assembler. We should do it just to set all configuration options and other parameters to their default values. For example the following source code shows the proper way to initialize the assembler context:

fcml_st_assembler_context context = {0};

Having the context set up, we should provide an assembler instance first:

context.assembler = assembler;

Now it is time to set the configuration flags. For beginners the most useful option is called enable_error_messages. It is used to enable support for textual error messages, which can be very useful when an error occurs, so let's set it to true:

context.configuration.enable_error_messages = FCML_TRUE;

As you have probably noticed, we used the constant FCML_TRUE in place of "true" for the boolean variable. This constant is defined in the fcml_types.h header file, and FCML_TRUE or FCML_FALSE should be used whenever a boolean value is set directly, although any non-zero value is treated as true.

There is one more configuration option that is interesting here, increment_ip, but we will describe it later.

The last thing to do is to set up the entry point correctly:

context.entry_point.op_mode = FCML_OM_32_BIT;
context.entry_point.ip = 0x401000;

As you can see, processor operating mode has been set to 32 bits and EIP register to 0x401000 which is the default address for many assemblers/compilers. We have not set the operand size attribute or address size attribute so they both are set to their default values, which are 32 bits for both of them in case of the chosen processor operating mode.
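
The default size attributes mentioned above can be sketched as a small lookup. This is only an illustration of the architectural defaults on x86; the enum and function names are hypothetical and are not part of the FCML API, and the sketch says nothing about how FCML resolves the defaults internally.

```c
/* Hypothetical stand-ins for the processor operating modes; not FCML names. */
enum op_mode { OM_16_BIT = 16, OM_32_BIT = 32, OM_64_BIT = 64 };

/* Default operand size in bits when no size override is present. */
static int default_operand_size(enum op_mode mode) {
    /* 64-bit mode still defaults to 32-bit operands. */
    return mode == OM_16_BIT ? 16 : 32;
}

/* Default address size in bits when no address-size prefix is present. */
static int default_address_size(enum op_mode mode) {
    switch (mode) {
    case OM_16_BIT: return 16;
    case OM_32_BIT: return 32;
    default:        return 64;
    }
}
```

For the 32-bit mode chosen in this chapter both functions return 32, which matches the defaults described above.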

The full source code for initializing the assembler context:

fcml_st_assembler_context context = {0};
context.assembler = assembler;
context.configuration.enable_error_messages = FCML_TRUE;
context.entry_point.op_mode = FCML_OM_32_BIT;
context.entry_point.ip = 0x401000;

Now we are ready to assemble the GIM we prepared earlier, so let's do that.

In order to assemble an instruction model, we have to use the fcml_fn_assemble function declared in the fcml_assembler.h header file. This is the declaration of the function:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_assemble( 
	fcml_st_assembler_context *context, 
	const fcml_st_instruction *instruction, 
	fcml_st_assembler_result *result );

We have everything we need to fill its arguments. The following code shows how to invoke the function with structures we have already prepared:

error = fcml_fn_assemble( &context, &instruction, &asm_result );
if( error ) {
    ...
}

If everything succeeded, the error variable is set to FCML_CEH_GEC_NO_ERROR and asm_result contains the assembled machine code.

Let's take a look at the fcml_st_assembler_result structure. The field errors contains textual error messages if the function failed. Assembled instructions are stored as a chain of fcml_st_assembled_instruction structures. A chain is used because some instructions can be assembled to more than one form; some of them can be assembled to as many as three different pieces of machine code. Fortunately you do not have to analyse all available forms in order to identify the best one for your processor operating mode, size attributes, etc. The most relevant piece of machine code is chosen by the assembler and returned in the chosen_instruction field. The last field, number_of_instructions, contains the number of instruction forms available in the chain. Although it can be calculated by walking all instructions in the chain, it is needed frequently enough that it is better to have it on hand.
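
To make the idea of a chain of instruction forms more concrete, here is a minimal sketch of walking such a chain. The struct below is a hypothetical mirror written for this example and is not the FCML declaration. Picking the shortest form is only one plausible selection policy, used here as an assumption; the manual does not spell out the rule the assembler applies when it fills chosen_instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one assembled form; not the FCML declaration. */
struct form {
    struct form   *next;         /* next form in the chain, NULL at the end */
    const uint8_t *code;         /* machine code bytes */
    size_t         code_length;  /* number of bytes in code */
};

/* Counts the forms in a chain by following the next pointers. */
static size_t count_forms(const struct form *head) {
    size_t n = 0;
    for (; head != NULL; head = head->next) {
        n++;
    }
    return n;
}

/* Returns the form with the shortest encoding, or NULL for an empty chain.
   Ties are resolved in favour of the form that appears first. */
static const struct form *shortest_form(const struct form *head) {
    const struct form *best = head;
    for (; head != NULL; head = head->next) {
        if (head->code_length < best->code_length) {
            best = head;
        }
    }
    return best;
}
```

With the real library neither function should be necessary, because the count and the chosen form are already provided in the result structure.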

The structure fcml_st_assembled_instruction holds information related to one assembled instruction and can contain optional warning messages if the instruction was assembled correctly, but FCML assembler had some objections to the generated machine code. The assembled machine code as a pointer to an array of bytes is available through the field code and the array length is stored in code_length field.
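
Since the machine code is exposed as a byte array plus a length, a small helper that turns it into a hex string is often handy. The following is a minimal sketch in plain C; format_code_hex is a hypothetical helper name, not an FCML function.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Writes code_length bytes from code as lowercase hex into out.
   Returns the number of characters written (excluding the terminator),
   or -1 if out cannot hold 2 * code_length + 1 characters. */
static int format_code_hex(const uint8_t *code, size_t code_length,
                           char *out, size_t out_size) {
    size_t i;
    if (out_size < 2 * code_length + 1) {
        return -1;
    }
    for (i = 0; i < code_length; i++) {
        /* %02x zero-pads, so 0x0f becomes "0f" rather than " f". */
        snprintf(out + 2 * i, 3, "%02x", (unsigned)code[i]);
    }
    out[2 * code_length] = '\0';
    return (int)(2 * code_length);
}
```

A caller would pass the code and code_length fields of an assembled instruction together with a buffer of at least twice the code length plus one byte.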

As mentioned earlier, the fcml_st_assembler_result structure is reusable, so the same structure can be used for every invocation of the fcml_fn_assemble function. This is very convenient, because we do not need to allocate and free the structure for every instruction being assembled. The assembler is responsible for freeing everything the assembler result contains, such as generated machine code or warning messages, before reusing it.

Since we are talking about assembling multiple instructions one by one and reusing assembler parameters, it is time to describe the increment_ip configuration flag mentioned earlier. This flag forces the assembler to increase the instruction pointer by the length of the chosen instruction (the length of the machine code generated for it, to be more specific) after every successful invocation of the assembler. It is very convenient when assembling instructions that follow each other in the code segment, because we do not need to calculate the instruction pointer for each of them.
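
The bookkeeping that increment_ip saves us from can be sketched in a few lines. The function below is a hypothetical illustration in plain C, not FCML code: given the length of each assembled instruction, it computes the address at which every instruction starts.

```c
#include <stddef.h>
#include <stdint.h>

/* Fills addrs[i] with the address of the i-th instruction, given the length
   of each instruction's machine code, and returns the address just past
   the last one. */
static uint64_t layout_addresses(uint64_t start_ip, const size_t *lengths,
                                 size_t count, uint64_t *addrs) {
    uint64_t ip = start_ip;
    size_t i;
    for (i = 0; i < count; i++) {
        addrs[i] = ip;       /* instruction i starts at the current ip */
        ip += lengths[i];    /* advance by the bytes it occupies */
    }
    return ip;
}
```

Starting at 0x401000 with instruction lengths 4, 2 and 5, the instructions land at 0x401000, 0x401004 and 0x401006, and the next free address is 0x40100b.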

When all the machine code is ready and there is nothing more to assemble, we should free all resources that are no longer needed. The first structure we have to free is fcml_st_assembler_result, because even if it is allocated on the stack it might still contain the assembled machine code and potential warning messages. Call fcml_fn_assembler_result_free to free this information. Note that this function does not free the structure itself, so the memory used by it still has to be released by you, or the structure can be reused by another invocation of the assembler (remember that you are the owner of the structure and are responsible for freeing it at some point).

fcml_fn_assembler_result_free( &asm_result );

We should also free the assembler itself and the dialect:

fcml_fn_assembler_free( assembler );
fcml_fn_dialect_free( dialect );

Remember that dialect has to be freed after the assembler.

The following source code assembles the generic instruction model from the example above:

#include <stdio.h>
#include <stdlib.h>

#include <fcml/fcml_intel_dialect.h>
#include <fcml/fcml_assembler.h>
#include <fcml/fcml_common_utils.h>

int main(int argc, char **argv) {

	fcml_ceh_error error;

	/* Initializes the Intel dialect instance. */
	fcml_st_dialect *dialect;
	if( ( error = fcml_fn_dialect_init_intel( FCML_INTEL_DIALECT_CF_DEFAULT, &dialect ) ) ) {
		fprintf( stderr, "Can not initialize Intel dialect: %d\n", error );
		exit(1);
	}

	fcml_st_assembler *assembler;
	if( ( error = fcml_fn_assembler_init( dialect, &assembler ) ) ) {
		fprintf( stderr, "Can not initialize assembler: %d\n", error );
		fcml_fn_dialect_free( dialect );
		exit(1);
	}

	fcml_st_instruction instruction = {0};
	instruction.mnemonic = "adc";
	instruction.operands[0] = FCML_REG( fcml_reg_AX );
	instruction.operands[1] = FCML_IMM16( 0x8042 );
	instruction.operands_count = 2;

	/* Prepares the result. */
	fcml_st_assembler_result asm_result;
	fcml_fn_assembler_result_prepare( &asm_result );

	fcml_st_assembler_context context = {0};
	context.assembler = assembler;
	context.entry_point.ip = 0x401000;
	context.entry_point.op_mode = FCML_OM_32_BIT;

	/* Assembles the given instruction. */
	if( ( error = fcml_fn_assemble( &context, &instruction, &asm_result ) ) ) {
		fprintf( stderr, "Can not assemble instruction: %d\n", error );
		fcml_fn_assembler_free( assembler );
		fcml_fn_dialect_free( dialect );
		exit(1);
	}

	/* Prints the instruction code. */
	if( asm_result.chosen_instruction ) {
		fcml_st_assembled_instruction *ins_code = asm_result.chosen_instruction;
		int i;
		printf("Chosen instruction code: ");
		for( i = 0; i < ins_code->code_length; i++ ) {
			printf("%02x", ins_code->code[i]);
		}
		printf("\n");
	} else {
		fprintf( stderr, "Hmm, where is the assembled instruction?\n" );
	}

	fcml_fn_assembler_result_free( &asm_result );
	fcml_fn_assembler_free( assembler );
	fcml_fn_dialect_free( dialect );

	return 0;
}

As you might have noticed, this example uses a slightly different location for the header files. The earlier snippets assumed that the header files are placed directly in the include directory, while in the last example they are located in the dedicated "fcml" directory. This depends on the configuration. By default, on GNU/Linux and MinGW the include files are installed in the dedicated subdirectory, but this can be changed. See the INSTALL file available in the distribution archive for details.

The example should print the following result:

Chosen instruction code: 66154280
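
To see where these four bytes come from, the encoding can be reproduced by hand. The sketch below is not how FCML works internally; it hard-codes the x86 encoding of this one instruction for a 32-bit code segment, and encode_adc_ax_imm16 is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes "adc ax, imm16" for a 32-bit code segment into out (4 bytes).
   Returns the number of bytes written. */
static size_t encode_adc_ax_imm16(uint16_t imm, uint8_t out[4]) {
    out[0] = 0x66;                   /* operand-size override: 16-bit operand in 32-bit mode */
    out[1] = 0x15;                   /* ADC accumulator, immediate */
    out[2] = (uint8_t)(imm & 0xff);  /* immediate, low byte first */
    out[3] = (uint8_t)(imm >> 8);    /* immediate, high byte */
    return 4;
}
```

Called with 0x8042, the function produces the bytes 66 15 42 80: the prefix, the opcode, and the immediate stored in little-endian order.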

Disassembler

FCML disassembler takes a piece of machine code as an argument and "converts" it to a GIM instance (see: Generic instruction model) which contains all information about the disassembled instruction. Such GIM can be used directly or for example can be rendered to the textual form of the instruction.

The first thing we have to do is to initialize a dialect which will be used by the disassembler to disassemble the provided machine code. Thanks to the dialects the library is able to use different instruction syntaxes like Intel or AT&T (remember that the GIM is dialect dependent). So let's prepare an instance of the Intel dialect:

#include <fcml_intel_dialect.h>

fcml_st_dialect *dialect;
fcml_ceh_error error = fcml_fn_dialect_init_intel( FCML_INTEL_DIALECT_CF_DEFAULT, &dialect );
if( error ) {
	printf( "Can not initialize dialect, error: %d\n", error );
	exit(EXIT_FAILURE);
}

As you can see, there are dedicated functions to create different dialects. Every function can also take additional parameters used to configure the initialized dialect. In theory it should be possible to implement every existing dialect for FCML library, but currently only the Intel and AT&T (called GAS) dialects are supported. For the sake of example the Intel dialect (the preferred one) is used.

The Intel dialect was the first dialect supported by the FCML library and as such it is more mature. The AT&T dialect is nevertheless fully supported and unit tested, and can safely be considered stable.

The dialect is ready, so let's initialize the disassembler instance using fcml_fn_disassembler_init function.

fcml_st_disassembler *disassembler;
error = fcml_fn_disassembler_init( dialect, &disassembler );

To keep the code short, error handling has been omitted in this case, but it should be implemented in the same way as for the dialect initialization. All possible error codes are defined in the fcml_errors.h header file.

Having initialized the dialect and disassembler, there is the last thing to be done before disassembling is possible. It is the disassembler result structure. This structure is reusable so it has to be prepared in the right way in order to allow the disassembler to reuse it correctly. To do so, a manually allocated structure has to be passed to fcml_fn_disassembler_result_prepare function:

fcml_st_disassembler_result result;
fcml_fn_disassembler_result_prepare( &result );

That is all, the disassembler is prepared to do its job, so let's try to disassemble a piece of example machine code.

The main structure which has to be properly prepared for the disassembler to work is fcml_st_disassembler_context. It consists of the previously initialized disassembler instance which should be used to disassemble the machine code, some configuration flags we can use to configure the disassembling process, an entry point which informs the disassembler about the code segment the instruction is located in, and a piece of the instruction machine code. The disassembler context itself can be allocated on the stack, but it is very important to clear the memory it uses before passing it to the disassembler, so that all configuration options and other parameters are set to their default values. For example the following source code shows the proper way to initialize the disassembler context:

fcml_st_disassembler_context context = {0};

Let's start by setting the configuration options. The first flag we are interested in is enable_error_messages, which enables textual error messages that help identify potential errors and are therefore very helpful for beginners. The second flag that might be interesting here is short_forms, which has to be set to true in order to instruct the disassembler to use short instruction forms whenever possible (for instance 'cmpsw' instead of 'cmps word ptr [si],word ptr [di]'). The short_forms flag affects the generated GIM, so it should be used carefully.

context.configuration.enable_error_messages = FCML_TRUE;
context.configuration.short_forms = FCML_TRUE;

The disassembler is configured but we still have not provided any machine code yet. It can be done by setting two additional context fields code and code_length:

context.code = code;
context.code_length = sizeof( code );

The code should be a pointer to an array of bytes which contains instruction machine code and code_length of course holds the length of the array in bytes.

The machine code is configured now, but we know nothing about the code section yet, so it is time to set the instruction pointer and processor operating mode by filling in the entry point structure correctly. (If you do not know what the instruction pointer, address size attribute or processor operating mode are, you should at least read the chapter Understanding entry point.)

The structure fcml_st_entry_point holds basic information about the code section and the instruction pointer of the instruction. Thanks to it we can set value of the IP/EIP or RIP register that points to the instruction machine code in the memory. It is very important to set it correctly, because this information is used to calculate relative offsets for example. The first required field is op_mode which describes the processor operating mode (FCML_OM_16_BIT, FCML_OM_32_BIT or FCML_OM_64_BIT). We can also set default values for the address size attribute and operand size attribute for our "virtual" code segment:

context.entry_point.op_mode = FCML_OM_32_BIT;
context.entry_point.address_size_attribute = FCML_DS_UNDEF;
context.entry_point.operand_size_attribute = FCML_DS_UNDEF;
context.entry_point.ip = 0x00401000;
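
To illustrate why the instruction pointer matters for relative offsets, here is a minimal sketch of how the target of a relative branch is derived on x86. It is plain C written for this example, not FCML code, and branch_target32 is a hypothetical helper name.

```c
#include <stdint.h>

/* Target of a relative branch: the displacement is relative to the address
   of the next instruction, i.e. ip plus the length of the branch itself.
   32-bit arithmetic wraps around, as it does for EIP. */
static uint32_t branch_target32(uint32_t ip, uint32_t length, int32_t rel) {
    return ip + length + (uint32_t)rel;
}
```

For a 5-byte branch located at 0x00401000 with a displacement of 0x10, the target is 0x00401015. The same bytes located at a different address would point somewhere else, which is why a wrong ip value leads to wrong branch targets.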

The disassembler context is almost initialized, we have left the most important thing at the end. It is the disassembler itself. It has to be also put into the context, because it will be used to do the whole job:

context.disassembler = disassembler;

The following piece of source code shows what the whole context initialization should look like:

fcml_st_disassembler_context context = {0};
context.disassembler = disassembler;
context.configuration.enable_error_messages = FCML_TRUE;
context.configuration.short_forms = FCML_TRUE;
context.code = code;
context.code_length = sizeof( code );
context.entry_point.op_mode = FCML_OM_32_BIT;
context.entry_point.address_size_attribute = FCML_DS_UNDEF;
context.entry_point.operand_size_attribute = FCML_DS_UNDEF;
context.entry_point.ip = 0x00401000;

Now we are ready to disassemble the first piece of machine code, so let's do it. In order to disassemble anything we have to call the fcml_fn_disassemble function (declaration below).

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_disassemble(
	fcml_st_disassembler_context *context, 
	fcml_st_disassembler_result *result );

The function gets the disassembler context and disassembler result as arguments:

error = fcml_fn_disassemble( &context, &result );
if( !error ) {
	…
}

If everything succeeded, the error code is FCML_CEH_GEC_NO_ERROR and the result contains the disassembled instruction in the form of the generic instruction model. Let's take a look at the fcml_st_disassembler_result structure. The field errors contains error messages if the function failed (or potential warnings in case of success).

There is also fcml_st_instruction_details structure which consists of additional information which is not relevant for the general instruction model but anyway can be useful through the process of the instruction analysis.

Now we have a generic instruction model, but what if we would like to print a textual representation of the instruction for the user? Nothing could be easier. You only have to configure an instruction renderer and render the GIM to the provided buffer. Let's first take a look at the function we will use to render our instruction model:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_render( fcml_st_dialect *dialect,
	fcml_st_render_config *config, 
	fcml_char *buffer, 
	fcml_usize buffer_len,
	fcml_st_disassembler_result *result );

This function needs quite a few parameters, but take into account that there is only one argument that has to be carefully prepared. It is fcml_st_render_config structure which configures some aspects of the rendering process. It is also not so complicated, because it contains only a few rendering flags (see: Instruction renderer) and padding configuration. So let's prepare the configuration:

fcml_st_render_config render_config = {0};
render_config.render_flags = FCML_REND_FLAG_HEX_IMM | 
							 FCML_REND_FLAG_HEX_DISPLACEMENT;

The next two parameters that follow the configuration, buffer and buffer_len, describe the output buffer where the textual representation of the instruction will be rendered. You can allocate this buffer in the following way:

fcml_char buffer[FCML_INSTRUCTION_SIZE];

Remember that the buffer is also reusable and does not have to be cleaned between multiple calls to the rendering function.

We have prepared all needed parameters, so let's render the instruction from the disassembler result:

error = fcml_fn_render( dialect, &render_config,
	buffer, sizeof( buffer ), &result );

As you can see we pass the whole disassembler result structure to the renderer. In order to render the instruction we need the whole result; a GIM alone is not enough, because the result also contains the fcml_st_instruction_details structure, which is used by the renderer. Although it is possible to prepare such a disassembler result by hand and pass it to the renderer, it would be very risky (mostly because some information, like hints, is destined only for renderers), so remember to use renderers only with structures prepared by the FCML disassembler. The last thing to remember is to use the same dialect that was used by the disassembler.

When resources are no longer needed they have to be freed. The following source code frees the disassembler result, disassembler itself and the dialect:

fcml_fn_disassembler_result_free( &result );
fcml_fn_disassembler_free( disassembler );
fcml_fn_dialect_free( dialect );

Remember that the fcml_fn_disassembler_result_free function does not free the result structure itself. It is only responsible for freeing all structures allocated by the disassembler which are accessible through the disassembler result, like error messages for instance. This is why you can still reuse the structure, even after it has been freed.
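
The ownership rule described here (the caller owns the structure, the library owns what it points at) is a common C pattern. The following minimal sketch shows it in isolation. All names are hypothetical stand-ins invented for this example; it does not reproduce the FCML implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result structure; the caller owns the structure itself,
   the producer owns what it points at. */
struct result {
    char *message;   /* heap-allocated by the producer, may be NULL */
};

/* Puts a freshly allocated structure into a known empty state. */
static void result_prepare(struct result *r) {
    r->message = NULL;
}

/* Releases the contents only and leaves the structure ready for reuse. */
static void result_free(struct result *r) {
    free(r->message);
    r->message = NULL;
}

/* Stand-in for the producer: drops old contents, then stores new ones.
   Returns 0 on success, -1 if allocation fails. */
static int result_fill(struct result *r, const char *text) {
    size_t n = strlen(text) + 1;
    result_free(r);
    r->message = malloc(n);
    if (r->message == NULL) {
        return -1;
    }
    memcpy(r->message, text, n);
    return 0;
}
```

Because result_free resets the pointer, the same stack-allocated structure can be filled again afterwards without another call to result_prepare.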

See the Instruction renderer chapter for a working example.

Manual

This manual covers every aspect of the FCML library in great detail. Good knowledge of the x86_64 architecture is required to understand the following chapters. Of course, if all you need is to disassemble a piece of machine code and render it to textual form, or to assemble a bit of machine code without playing much with the generic instruction models, the Quick Guide and maybe the Parser chapters should be enough for you.

Generic instruction model

The generic instruction model (GIM) is a common structure used by both the FCML assembler and disassembler to describe an instruction. In the FCML library, assembling and disassembling are symmetrical operations: the instruction model returned by the disassembler can be assembled back to get the same piece of machine code that was disassembled earlier. The generic instruction model consists of an instruction mnemonic, optional prefixes, a condition (in case of conditional instructions) and of course instruction operands. This chapter describes every field of the GIM structure in detail. The following code shows the GIM structure declaration:

typedef struct fcml_st_instruction {
	fcml_prefixes prefixes;
	fcml_hints hints;
	fcml_char *mnemonic;
	fcml_bool is_conditional;
	fcml_st_condition condition;
	fcml_st_operand operands[FCML_OPERANDS_COUNT];
	fcml_int operands_count;
} fcml_st_instruction;

I've decided to copy various structure declarations throughout the manual, because in my opinion it is a lot easier to remember structure details and the overall idea when the declarations are in front of our eyes while they are discussed. Nevertheless, all comments have been removed to avoid redundancy and increase readability.

Prefixes

The first field prefixes defines all explicitly set prefixes that are used by the instruction. This is the list of allowed prefixes:

#define FCML_PREFIX_LOCK            0x0001
#define FCML_PREFIX_REPNE           0x0002
#define FCML_PREFIX_REPNZ           FCML_PREFIX_REPNE
#define FCML_PREFIX_REP             0x0004
#define FCML_PREFIX_REPE            FCML_PREFIX_REP
#define FCML_PREFIX_REPZ            FCML_PREFIX_REP
#define FCML_PREFIX_XACQUIRE        0x0008
#define FCML_PREFIX_XRELEASE        0x0010
#define FCML_PREFIX_BRANCH_HINT     0x0020
#define FCML_PREFIX_NOBRANCH_HINT   0x0040

These definitions are self-describing, so there is no need to describe them further. One thing worth mentioning is that they are all defined as bit masks, so you can set more than one prefix for an instruction if needed.
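Because the prefixes are bit masks, they are combined with bitwise OR and tested with bitwise AND. The sketch below copies a few of the definitions listed above. It assumes, for illustration only, that fcml_prefixes is an unsigned 16-bit integer, and the helpers has_prefix and example_prefixes are hypothetical names that are not part of FCML.

```c
#include <stdint.h>

/* Definitions copied from the list above. */
#define FCML_PREFIX_LOCK      0x0001
#define FCML_PREFIX_REPNE     0x0002
#define FCML_PREFIX_REP       0x0004
#define FCML_PREFIX_XACQUIRE  0x0008

/* Assumption: stand-in for the library typedef. */
typedef uint16_t fcml_prefixes;

/* Hypothetical helper: non-zero if every bit of 'mask' is set in 'prefixes'. */
int has_prefix(fcml_prefixes prefixes, fcml_prefixes mask) {
    return (prefixes & mask) == mask;
}

/* Two prefixes set at once, as for an "xacquire lock ..." instruction. */
fcml_prefixes example_prefixes(void) {
    return FCML_PREFIX_LOCK | FCML_PREFIX_XACQUIRE;
}
```

The same pattern applies to the hints fields described later, which are bit masks as well.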

Instruction level hints

The second field hints is used to set instruction-level hints for the assembler/renderer (every disassembler instance also sets them, so feel free to use them whenever possible). Currently the following hints are supported:

typedef enum fcml_en_instruction_hints {
    FCML_HINT_FAR_POINTER = 0x0001,
    FCML_HINT_NEAR_POINTER = 0x0002,
    FCML_HINT_LONG_FORM_POINTER = 0x0004,
    FCML_HINT_INDIRECT_POINTER = 0x0008
} fcml_en_instruction_hints;

FCML_HINT_FAR_POINTER

This hint is set for instructions which use far pointers:

  • LDS, LSS, LES, LFS, LGS
  • CALL and JMP instructions which use far absolute indirect addresses.
  • CALL and JMP instructions which use far absolute direct addresses.
If the hint is set, the Intel renderer adds the "far" keyword just after the mnemonic, and the GAS/AT&T renderer adds "l" as a prefix to the call and jmp mnemonics. Parsers also set this hint if the appropriate keyword or mnemonic (in case of AT&T) is used.
Every time I refer to "Intel renderer" or "AT&T renderer" I mean a renderer called with an appropriate dialect.

FCML_HINT_NEAR_POINTER

This flag is always set if the ModR/M based near addressing is used.

FCML_HINT_LONG_FORM_POINTER

This hint is interpreted only by the assembler and can be used to force it to generate a three-byte VEX/XOP prefix even if the prefix fields fit into two bytes. For now it is supported by the Intel syntax only and can be set using the "long_form" keyword just after the instruction mnemonic. For instance, "vrcpps long_form xmm2,xmmword ptr [eax]" assembles to C4E1785310 (three-byte VEX prefix C4E178), whereas "vrcpps ymm2,ymmword ptr [rax]" assembles to C5FC5310 (two-byte VEX prefix C5FC). If you do not know what these prefixes are, just ignore this hint. It is very specialized functionality and you will probably never need it.
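To make the difference between the two prefix forms more tangible, here is a minimal sketch of how the VEX prefix bytes are laid out according to the Intel architecture manuals. It is not the FCML implementation; the vex_fields structure, the two constants and the vex_prefix function are hypothetical names introduced only for this illustration. The two-byte form is possible only when the X and B extension bits and the W bit are clear and the opcode map is 0F.

```c
#include <stdint.h>

/* Hypothetical container for the VEX fields, stored non-inverted. */
typedef struct vex_fields {
    unsigned r, x, b;  /* register extension bits */
    unsigned mmmmm;    /* opcode map selector, 1 = 0F map */
    unsigned w;        /* VEX.W */
    unsigned vvvv;     /* additional register operand, 0 when unused */
    unsigned l;        /* vector length: 0 = 128-bit, 1 = 256-bit */
    unsigned pp;       /* implied prefix: 0 = none, 1 = 66, 2 = F3, 3 = F2 */
} vex_fields;

/* Fields for "vrcpps xmm2, [eax]" and for its 256-bit (ymm) counterpart. */
const vex_fields VRCPPS_XMM = { 0, 0, 0, 1, 0, 0, 0, 0 };
const vex_fields VRCPPS_YMM = { 0, 0, 0, 1, 0, 0, 1, 0 };

/* Returns the prefix bytes packed into one integer, e.g. 0xC5F8. */
uint32_t vex_prefix(const vex_fields *f, int long_form) {
    /* Low seven bits shared by both forms: inverted vvvv, L, pp. */
    uint32_t tail = ((~f->vvvv & 0xFu) << 3) | (f->l << 2) | f->pp;
    int fits_two_bytes = !f->x && !f->b && !f->w && f->mmmmm == 1;

    if (fits_two_bytes && !long_form) {
        /* C5 | ~R ~vvvv L pp */
        return (0xC5u << 8) | ((f->r ^ 1u) << 7) | tail;
    }
    /* C4 | ~R ~X ~B mmmmm | W ~vvvv L pp */
    return (0xC4u << 16)
         | ((((f->r ^ 1u) << 7) | ((f->x ^ 1u) << 6)
            | ((f->b ^ 1u) << 5) | f->mmmmm) << 8)
         | (f->w << 7) | tail;
}
```

With long_form set, the xmm variant yields the C4E178 prefix quoted above. Without it, the same fields fit into the two-byte form, and the ymm variant yields C5FC.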

FCML_HINT_INDIRECT_POINTER

This hint is set for instructions which use indirect pointers:

  • CALL and JMP which use near absolute indirect addresses.
  • CALL and JMP which use far absolute indirect addresses.

Mnemonic

The next field mnemonic stores the instruction mnemonic, for instance "cmps", "mov", "vrcpps", etc. Take into account that the instruction mnemonic is allocated and built by the disassembler, so the disassembler is responsible for freeing it when the fcml_fn_disassembler_result_free function is called, or when the disassembler result is reused by the disassembler. So if you would like to store it somewhere, you have to make a duplicate.

Conditional instructions

The next two fields describe conditional instructions: is_conditional and condition. The first one is set to FCML_TRUE for all conditional instructions: CMOV, Jcc (JA, JG, JNG, ...), SETcc (SETG, SETNG, …). If the first one is set to FCML_TRUE, the second one describes the condition used by the instruction.

typedef struct fcml_st_condition {
    fcml_en_condition_type condition_type;
    fcml_bool is_negation;
} fcml_st_condition;

The condition type enumerator contains the following values:

typedef enum fcml_en_condition_type {
	/* 0 Overflow*/
	FCML_CONDITION_O = 0,
	/* 1 Below*/
	FCML_CONDITION_B,
	/* 2 Equal*/
	FCML_CONDITION_E,
	/* 3 Below or equal*/
	FCML_CONDITION_BE,
	/* 4 Sign*/
	FCML_CONDITION_S,
	/* 5 Parity*/
	FCML_CONDITION_P,
	/* 6 Less than*/
	FCML_CONDITION_L,
	/* 7 Less than or equal to*/
	FCML_CONDITION_LE
} fcml_en_condition_type;

As you can see, they correspond to the suffixes used by conditional mnemonics. Of course mnemonics can also negate these conditions, and this is where the next field is_negation plays its part in describing the condition as a whole.

For example the mnemonic: SETNBE has condition_type field set to FCML_CONDITION_BE and is_negation to FCML_TRUE.
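The enumerator values 0 to 7 line up with the upper three bits of the 4-bit condition code used by the x86 instruction encoding, where the lowest bit carries the negation. The following sketch shows that relationship. The declarations are copied from above with fcml_bool assumed to be a plain int, and the helper functions are hypothetical. Whether FCML computes the code this way internally is an assumption.

```c
#include <stdint.h>

typedef int fcml_bool; /* assumption: stand-in for the library typedef */

typedef enum fcml_en_condition_type {
    FCML_CONDITION_O = 0, FCML_CONDITION_B, FCML_CONDITION_E,
    FCML_CONDITION_BE, FCML_CONDITION_S, FCML_CONDITION_P,
    FCML_CONDITION_L, FCML_CONDITION_LE
} fcml_en_condition_type;

typedef struct fcml_st_condition {
    fcml_en_condition_type condition_type;
    fcml_bool is_negation;
} fcml_st_condition;

/* Hypothetical helper which builds a condition value. */
fcml_st_condition make_condition(fcml_en_condition_type type, fcml_bool negated) {
    fcml_st_condition c;
    c.condition_type = type;
    c.is_negation = negated;
    return c;
}

/* 4-bit hardware condition code: type in bits 3..1, negation in bit 0. */
uint8_t condition_code(fcml_st_condition c) {
    return (uint8_t)(((unsigned)c.condition_type << 1)
                     | (c.is_negation ? 1u : 0u));
}

/* Opcode of the short (rel8) Jcc form: 70h + condition code. */
uint8_t jcc_rel8_opcode(fcml_st_condition c) {
    return (uint8_t)(0x70u | condition_code(c));
}

/* Second opcode byte of SETcc (0Fh 90h + condition code). */
uint8_t setcc_second_opcode_byte(fcml_st_condition c) {
    return (uint8_t)(0x90u | condition_code(c));
}
```

For SETNBE (FCML_CONDITION_BE with negation) the condition code is 7, which gives the well-known opcode 0F 97.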

Operands

The next instruction field operands is an array of available operands. The number of operands available in the array is defined by another field operands_count. Every operand is an instance of the following structure:

typedef struct fcml_st_operand {
    fcml_en_operand_type type;
    fcml_hints hints;
    fcml_st_immediate immediate;
    fcml_st_far_pointer far_pointer;
    fcml_st_address address;
    fcml_st_register reg;
} fcml_st_operand;

The first field type stores the operand type:

typedef enum fcml_en_operand_type {
	FCML_OT_NONE,
	FCML_OT_IMMEDIATE,
	FCML_OT_FAR_POINTER,
	FCML_OT_ADDRESS,
	FCML_OT_REGISTER
} fcml_en_operand_type;
FCML_OT_NONE

Operand is not defined. It means that it is just not used.

FCML_OT_IMMEDIATE

Immediate integer value (Can be also used to specify near relative addressing in some cases).

FCML_OT_FAR_POINTER

Describes far absolute address given in operand.

FCML_OT_ADDRESS

Effective address.

FCML_OT_REGISTER

One of the supported registers.

Hints

We will get through all operands types later, for now let's take a look at operand hints which are counterparts to the instruction hints described in the case of fcml_st_instruction structure:

typedef enum fcml_en_operand_hints {
	FCML_OP_HINT_MULTIMEDIA_INSTRUCTION = 0x0001,
	FCML_OP_HINT_DISPLACEMENT_RELATIVE_ADDRESS = 0x0002,
	FCML_OP_HINT_PSEUDO_OPCODE = 0x0004,
	FCML_OP_HINT_ABSOLUTE_ADDRESSING = 0x0008,
	FCML_OP_HINT_RELATIVE_ADDRESSING = 0x0010,
	FCML_OP_HINT_SIB_ENCODING = 0x0020
} fcml_en_operand_hints;
FCML_OP_HINT_MULTIMEDIA_INSTRUCTION

All operands which use SIMD registers (mmx, xmm, ymm) have this flag set. It is for instance used by the Intel renderer for rendering following data size operators: mmword ptr, xmmword ptr, ymmword ptr. For more details head over to the following section: Size operators for Intel dialect.

FCML_OP_HINT_DISPLACEMENT_RELATIVE_ADDRESS

This hint is set for all branches whose target is calculated from a displacement relative to the instruction pointer of the next instruction. XBEGIN also uses this addressing mode and has this hint set (See RTM – Restricted Transactional Memory for more details.). This flag is set by the disassembler only, so you can safely ignore it in hand-made generic instruction models.
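The relation between such a displacement and the branch target is plain arithmetic: the displacement is the target address minus the address of the next instruction. A minimal sketch, with hypothetical helper names that are not part of FCML:

```c
#include <stdint.h>

/* Displacement of a relative branch located at 'ip' and occupying 'length'
 * bytes, so that it lands on 'target'. */
int64_t branch_displacement(uint64_t ip, unsigned length, uint64_t target) {
    return (int64_t)target - (int64_t)(ip + length);
}

/* Non-zero if 'value' fits into a signed field of 8, 16 or 32 bits. */
int fits_signed(int64_t value, unsigned bits) {
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    int64_t min = -max - 1;
    return value >= min && value <= max;
}
```

For example, a 5-byte jump at 0x401000 targeting 0x401020 needs the displacement 0x1B, and a 2-byte jump at 0x401010 targeting 0x401000 needs -0x12, which still fits into 8 bits.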

FCML_OP_HINT_PSEUDO_OPCODE

This hint is set for the last operand (Intel syntax) which contains the comparison predicate of the following instructions: CMPSD, VCMPSD, CMPSS, VCMPSS, VPCOMB, VPCOMW, VPCOMD, VPCOMQ, VPCOMUB, VPCOMUW, VPCOMUD, VPCOMUQ. It is used only internally by the disassembler in order to remove the mentioned operand if the pseudo-op forms of these instructions are used (See: Shortcuts). For instance, the instruction "cmpsd xmm1, xmm2, 4" can be written using the pseudo-op form "cmpneqsd xmm1, xmm2", which encodes the condition from the last operand (4) inside the instruction mnemonic. In the first case the comparison predicate is available in the last FCML_OT_IMMEDIATE operand and the operand has this hint set. In the pseudo-op form this operand is not available in the GIM at all. So in the end this hint is of limited use: it can only be used to check whether an operand contains the comparison predicate of the mentioned instructions when short forms are disabled.

FCML_OP_HINT_ABSOLUTE_ADDRESSING, FCML_OP_HINT_RELATIVE_ADDRESSING

As opposed to the previous hint these hints are really important and both the assembler and disassembler can make a good use of them. They are usable only in 64 bit mode where RIP addressing was introduced. In general, they can be used to hint the assembler whether certain immediate integer value should be interpreted as absolute offset or displacement relative to RIP (See Dialects chapter for more information about how to use these hints with Intel and AT&T dialects).

FCML_OP_HINT_SIB_ENCODING

There are instructions which can be encoded with or without the SIB byte. For example, the instruction "add dword ptr [eax+00000001h],eax" can be encoded without the SIB byte as 014001, or in a longer form with the SIB byte as 01442001. Using this hint together with an operand which contains an effective address, you can force the assembler to use the SIB byte even if it is superfluous. The FCML Intel parser also supports this hint, so for instance you can parse the following instruction correctly: "add dword ptr [sib eax+00000001h],eax". The AT&T dialect currently does not handle this syntax and probably will not in the future.
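Both encodings quoted above can be reproduced by hand from the ModR/M and SIB byte layouts. The sketch below does exactly that for this one instruction. The function names are hypothetical and the code only illustrates the byte layout, not how the FCML assembler works internally.

```c
#include <stdint.h>

/* ModR/M byte: mod (2 bits), reg (3 bits), r/m (3 bits). */
uint8_t modrm_byte(unsigned mod, unsigned reg, unsigned rm) {
    return (uint8_t)((mod << 6) | (reg << 3) | rm);
}

/* SIB byte: scale (2 bits), index (3 bits), base (3 bits). */
uint8_t sib_byte(unsigned scale_bits, unsigned index, unsigned base) {
    return (uint8_t)((scale_bits << 6) | (index << 3) | base);
}

/* Encodes "add dword ptr [eax+disp8], eax" (opcode 01 /r) and returns the
 * instruction bytes packed into one integer, most significant byte first. */
uint32_t encode_add_eax_disp8_eax(uint8_t disp8, int force_sib) {
    const unsigned EAX = 0;      /* register number of EAX */
    const unsigned RM_SIB = 4;   /* r/m = 100b: a SIB byte follows */
    const unsigned NO_INDEX = 4; /* index = 100b: no index register */
    const uint32_t opcode = 0x01;

    if (!force_sib) {
        /* mod = 01b: [reg + disp8] */
        return (opcode << 16)
             | ((uint32_t)modrm_byte(1, EAX, EAX) << 8) | disp8;
    }
    return (opcode << 24)
         | ((uint32_t)modrm_byte(1, EAX, RM_SIB) << 16)
         | ((uint32_t)sib_byte(0, NO_INDEX, EAX) << 8) | disp8;
}
```

Calling the function with force_sib set to 0 gives 014001, and with force_sib set to 1 it gives 01442001.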

Immediate value operand

Immediate operands are encoded using the following structure:

typedef struct fcml_st_integer {
    fcml_usize size;
    fcml_bool is_signed;
	fcml_int8_t int8;
	fcml_int16_t int16;
	fcml_int32_t int32;
	fcml_int64_t int64;
} fcml_st_integer;

The field size contains the immediate value size in bits, so it should be set to 8, 16, 32 or 64 (See: FCML_DS_16, FCML_DS_32, etc. macros). The value can be marked as signed or unsigned using the is_signed flag. This flag is very important in cases when the value has to be extended. For example, suppose we have the 8-bit value 0xFF set in the int8 field, but the instruction needs a 16-bit immediate operand. In such a case the value has to be extended to fit 16 bits. If the value is signed, it is extended to 0xFFFF, and if it is unsigned, it is extended to 0x00FF. The fields int8, int16, int32 and int64 hold the integer value we would like to set for the operand. If we have set the size to 8 bits we have to use the int8 field, and so on. Using different fields for every size has its advantages and disadvantages, but it can save us from certain problems related to type casting.

For instance if we would like to set 16-bit signed value -256 for the operand, we can do it like this:

/* Field names follow the fcml_st_integer declaration shown above. */
fcml_st_integer imm = {0};
imm.size = FCML_DS_16;
imm.is_signed = FCML_TRUE;
imm.int16 = -256;
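The extension rule described above (0xFF becomes 0xFFFF when signed and 0x00FF when unsigned) can be written down in a few lines. This is only an illustration of the rule with a hypothetical helper name, not the code used by the library:

```c
#include <stdint.h>

/* Widens a raw 8-bit pattern to 16 bits: sign extension for signed values,
 * zero extension otherwise. */
uint16_t extend_8_to_16(uint8_t raw, int is_signed) {
    if (is_signed && (raw & 0x80u)) {
        /* Negative value: replicate the sign bit into the upper byte. */
        return (uint16_t)(0xFF00u | raw);
    }
    return raw;
}
```

Note that a signed value with a clear sign bit, such as 0x7F, is extended to 0x007F in both cases.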

Far pointer operand

The structure below is used to specify the direct far pointer addressing for branch instructions (JMP, CALL):

typedef struct fcml_st_far_pointer {
	fcml_uint16_t segment;
	fcml_data_size offset_size;
	fcml_uint16_t offset16;
	fcml_uint32_t offset32;
} fcml_st_far_pointer;

A far pointer consists of a segment value and an offset relative to that segment. The field segment contains the segment value, whereas offset_size and offset16/offset32 hold the offset. The offset can be 16 or 32 bits long, depending on the processor operating mode: 16-bit code segments use 16-bit offsets and 32-bit code segments use 32-bit offsets. In 64-bit mode direct far pointers are not supported, so there is no need to provide an offset64 field.

Address operand

The address operand is without a doubt the most advanced operand type here. It is used to specify an absolute/relative offset or a full effective address which consists of several different fields describing the memory addressing. The following structure is used to describe it:

typedef struct fcml_st_address {
	fcml_data_size size_operator;
	fcml_en_effective_address_form address_form;
	fcml_st_segment_selector segment_selector;
	fcml_st_effective_address effective_address;
	fcml_st_offset offset;
} fcml_st_address;

Every field here has quite a long description, so let's break the overall convention and organise them as a simple list this time:

Field: size_operator

The first field defines the size of the data we would like to access using the encoded address (It corresponds to the "word ptr", "byte ptr" etc. size operators in case of the Intel syntax.).

Field: address_form

The second field address_form is used to inform the assembler about the type of the effective address being used:

typedef enum fcml_en_address_form {
	FCML_AF_UNDEFINED,
	FCML_AF_OFFSET,
	FCML_AF_COMBINED
} fcml_en_effective_address_form;
		
FCML_AF_OFFSET

The address type used when the operand represents a direct absolute offset (address) relative to the current code segment. Take into account that it tells nothing about the representation of the address on the instruction machine code level. It is up to the assembler to decide what addressing form would be the best to be used in a given case (Relative or absolute). For some instructions the assembler might decide to encode it as a displacement value relative to the instruction pointer, whereas for others it may be the best just to encode the offset as a direct absolute address. For 16 and 32 bit processor operating modes the offset is always encoded as an absolute address (In case of ModR/M addressing.), because it is the only way the assembler is able to encode it using the ModR/M fields, but in case of the 64-bit operating mode it may choose between an absolute addressing and RIP-relative addressing introduced for 64-bit processors.

So to sum up, we have to use it if we would like to pass a direct absolute address to the assembler and we do not care how it will be encoded at the end. If you choose this address form, use offset field of the fcml_st_address structure in order to specify the absolute offset.

The assembler can encode this operand type using one of the following forms:

  • The effective memory addressing encoded using the ModR/M fields.
  • A displacement relative to the instruction pointer encoded directly as the immediate operand of the encoded instruction (JMP, CALL etc.).
  • An absolute offset directly encoded as the immediate operand of the instruction (MOV).

I have pointed out that it is up to the assembler to choose the best way to encode the absolute offset. It can choose between absolute and relative addressing if there is such a possibility, but what if we would like to use a particular one? There are two ways to achieve this.

The first way is a global one and is based on the assembler configuration where choose_abs_encoding field can be used to choose the preferred way to encode offsets. Set it to FCML_TRUE to hint assembler to treat absolute addressing as the preferred one.

The second one is to use the following operand level hints: FCML_OP_HINT_ABSOLUTE_ADDRESSING, FCML_OP_HINT_RELATIVE_ADDRESSING. They are self-describing but for more information see: Hints.

The features above are especially useful when you need to generate position-independent machine code, for example, but this is a slightly tricky solution which should be used carefully. Notice that the configuration as well as the operand flags always refer to the single instruction being assembled at a given moment, so if a certain instruction cannot be encoded using the preferred addressing form, the one that is possible will be chosen, and currently there is no way to check the final decision made by the assembler.

There is one more thing you should be aware of. Even if FCML_AF_OFFSET address type has been used, in some cases you can still encode the absolute offset directly as the displacement value relative to the instruction pointer using fcml_st_effective_address structure, by setting the displacement directly in the displacement field. It is possible only for 16 and 32 bit processor operating modes, because in these modes the absolute offset is just encoded in the displacement (ModR/M field.) anyway. This trick is not possible in the 64-bit addressing mode. This feature is rather useless on day-to-day usage and should not be used due to its inconsistency! It just breaks the rule that FCML_AF_OFFSET is always encoded using fcml_st_offset structure.

FCML_AF_COMBINED

This type should be used if we would like to encode an effective address indirectly by computing following components: a displacement, base register, index register and scale factor. To encode the effective address use effective_address field of the fcml_st_address structure.

Field: segment_selector

The next field of fcml_st_address structure segment_selector contains a segment selector which can be used to define the segment selector register that should be used by the instruction. As you probably know there are default segment registers for certain segments. For example CS is the default register for code segments and DS is the default register for data segments, etc. Anyway in some cases these registers can be overridden using special instruction prefixes. If you would like to address the memory using a non-standard register, you are able to do it using this field. Take a look at the structure which describes the segment selector:

typedef struct fcml_st_segment_selector {
	fcml_st_register segment_selector;
	fcml_bool is_default_reg;
} fcml_st_segment_selector;

The first field segment_selector is the segment register and should be set to one of the available segment registers. Second field is_default_reg is filled only by the disassembler and can be used to check if register returned by the disassembler is the default one or maybe the one placed here as the result of the segment register overriding.

Field: effective_address

The next field effective_address has to be used when address_form field is set to FCML_AF_COMBINED. It consists of all components which take part in the effective address computation:

typedef struct fcml_st_effective_address {
	fcml_st_register base;
	fcml_st_register index;
	fcml_uint8_t scale_factor;
	fcml_st_integer displacement;
} fcml_st_effective_address;
		

There is nothing special about this structure. The fields base and index should be set to general purpose registers and the scale factor to one of the 8-bit integer values 2, 4 or 8.

The displacement field may look very familiar to you if you have already read about immediate operands before, because it uses the same fcml_st_integer structure (See: Immediate value operand.).

One more thing to notice: in general, the effective address is used to describe an indirect addressing but you are perfectly allowed to fill the displacement structure only, forcing the assembler to encode displacement as an absolute address in case of 16, 32 and as a relative address in case of the 64-bit addressing. The displacement is just encoded as it is, without any additional calculations.
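For reference, the address that the processor finally computes from these components is base + index * scale + displacement. The sketch below spells this out on plain integers standing in for the register contents. The function name is hypothetical, and treating a scale factor of 0 as "no scaling" is an assumption made for this example only.

```c
#include <stdint.h>

/* Computes base + index * scale + displacement with 64-bit wrap-around.
 * Assumption: scale_factor 0 means that no scaling was requested. */
uint64_t effective_address(uint64_t base, uint64_t index,
                           uint8_t scale_factor, int64_t displacement) {
    uint64_t scale = scale_factor ? scale_factor : 1u;
    return base + index * scale + (uint64_t)displacement;
}
```

With base 0x1000, index 3, scale factor 4 and displacement 1 the result is 0x100D. With only the displacement filled in, the result is the displacement itself, which corresponds to the displacement-only case mentioned above.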

Field: offset

The last field of the fcml_st_address structure is offset. It uses the following structure to describe the memory address:

typedef struct fcml_st_offset {
	fcml_usize size;
	fcml_bool is_signed;
	fcml_int16_t off16;
	fcml_int32_t off32;
	fcml_int64_t off64;
} fcml_st_offset;

This structure follows the same model as the immediate value operand and the displacement described above. There is only one difference: it does not allow 8-bit values to be specified, because the smallest size of an absolute address is 16 bits.

Register operand

The register operand specifies a register we would like to use as an instruction operand. The following structure describes it:

typedef struct fcml_st_register {
	fcml_en_register type;
	fcml_usize size;
	fcml_uint8_t reg;
	fcml_bool x64_exp;
} fcml_st_register;

The first field should be set to the appropriate register type:

typedef enum fcml_en_register {
	FCML_REG_UNDEFINED = 0,
	FCML_REG_GPR,
	FCML_REG_SIMD,
	FCML_REG_FPU,
	FCML_REG_SEG,
	FCML_REG_CR,
	FCML_REG_DR,
	FCML_REG_IP
} fcml_en_register;

All available types are described below:

FCML_REG_GPR

General purpose registers like FCML_REG_AL, FCML_REG_AX etc.

FCML_REG_SIMD

SIMD registers like FCML_REG_MM1, FCML_REG_XMM1, FCML_REG_YMM1.

FCML_REG_FPU

FPU registers FCML_REG_ST0, FCML_REG_ST2 etc.

FCML_REG_SEG

FCML_REG_ES, FCML_REG_CS, FCML_REG_SS etc.

FCML_REG_CR

Control registers FCML_REG_CR0, FCML_REG_CR2,FCML_REG_CR4 etc.

FCML_REG_DR

Debug registers FCML_REG_DR0, FCML_REG_DR1, FCML_REG_DR2 etc.

FCML_REG_IP

Instruction pointer (Used only with RIP addressing.)

The next two fields store the register size in bits and the register number (see the table below). You can use the following defines to set the appropriate size: FCML_DS_8, FCML_DS_16, FCML_DS_32, FCML_DS_64, FCML_DS_128, FCML_DS_256.

The last field x64_exp is an interesting one. In 64-bit mode the processor cannot reference the registers AH, CH, DH and BH in some circumstances. This is an architectural limitation. The rule comes down to the fact that they cannot be used when a REX prefix is present in the instruction. Whenever the REX prefix is present, the register numbers of AH, CH, DH and BH are interpreted as SPL, BPL, SIL and DIL (the low 8 bits of RSP, RBP, RSI and RDI). This field is set to FCML_TRUE by the disassembler when the rule affects the register, and should be set to FCML_TRUE when assembling if we would like to reference SPL, BPL, SIL or DIL.
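The following sketch shows how an 8-bit register number can be resolved to a name depending on such a flag. It follows the hardware register numbering (4 to 7 are AH, CH, DH, BH without REX and SPL, BPL, SIL, DIL with REX). The function gpr8_name is a hypothetical helper and is not part of FCML.

```c
/* Returns the name of the 8-bit general purpose register with the given
 * hardware number (0-7). 'x64_exp' selects the REX interpretation. */
const char *gpr8_name(unsigned reg, int x64_exp) {
    static const char *const legacy[8] =
        { "al", "cl", "dl", "bl", "ah", "ch", "dh", "bh" };
    static const char *const with_rex[8] =
        { "al", "cl", "dl", "bl", "spl", "bpl", "sil", "dil" };

    if (reg > 7) {
        return "?"; /* R8L-R15L are outside the scope of this sketch */
    }
    return x64_exp ? with_rex[reg] : legacy[reg];
}
```

Register number 4, for instance, resolves to "ah" without the flag and to "spl" with it, while numbers 0 to 3 are not affected at all.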

The following table shows all supported registers grouped by their types:

Registers supported by FCML library
Type Registers
8-bit GPR FCML_REG_AL, FCML_REG_CL, FCML_REG_DL, FCML_REG_BL, FCML_REG_AH, FCML_REG_SPL, FCML_REG_CH, FCML_REG_BPL, FCML_REG_DH, FCML_REG_SIL, FCML_REG_BH, FCML_REG_DIL, FCML_REG_R8L - FCML_REG_R15L
16-bit GPR FCML_REG_AX, FCML_REG_CX, FCML_REG_DX, FCML_REG_BX, FCML_REG_SP, FCML_REG_BP, FCML_REG_SI, FCML_REG_DI, FCML_REG_R8W - FCML_REG_R15W
32-bit GPR FCML_REG_EAX, FCML_REG_ECX, FCML_REG_EDX, FCML_REG_EBX, FCML_REG_ESP, FCML_REG_EBP, FCML_REG_ESI, FCML_REG_EDI, FCML_REG_R8D - FCML_REG_R15D
64-bit GPR FCML_REG_RAX, FCML_REG_RCX, FCML_REG_RDX, FCML_REG_RBX, FCML_REG_RSP, FCML_REG_RBP, FCML_REG_RSI, FCML_REG_RDI, FCML_REG_R8 - FCML_REG_R15
64-bit SIMD (MMX) FCML_REG_MM0 - FCML_REG_MM7
128-bit SIMD FCML_REG_XMM0 - FCML_REG_XMM15
256-bit SIMD FCML_REG_YMM0 - FCML_REG_YMM15
FPU FCML_REG_ST0 - FCML_REG_ST7
Segment registers FCML_REG_ES, FCML_REG_CS, FCML_REG_SS, FCML_REG_DS, FCML_REG_FS, FCML_REG_GS
Control registers FCML_REG_CR0, FCML_REG_CR2, FCML_REG_CR3, FCML_REG_CR4, FCML_REG_CR8
Debug registers FCML_REG_DR0, FCML_REG_DR1, FCML_REG_DR2, FCML_REG_DR3, FCML_REG_DR4, FCML_REG_DR5, FCML_REG_DR6, FCML_REG_DR7

Remember that these defines can be used only to specify the register number. They do not describe the register size in any way! This is why the register size has to be set in the size field anyway.

Understanding entry point

The entry point is a structure used widely by FCML library. It is used to define basic information about virtual code segments where disassembled/assembled instructions live. It describes such things as a processor operating mode, default address and operand size attributes and address of instruction in the memory so called "instruction pointer":

typedef struct fcml_st_entry_point {
	fcml_en_operating_mode op_mode;
	fcml_usize address_size_attribute;
	fcml_usize operand_size_attribute;
	fcml_ip ip;
} fcml_st_entry_point;

The entry point structure consists of four fields. The first one, op_mode, is used to configure the processor operating mode and can be set to one of the following three values: FCML_OM_16_BIT, FCML_OM_32_BIT, FCML_OM_64_BIT. They are self-describing, so let's spend a bit more time on the next two fields.

The fields address_size_attribute and operand_size_attribute define the default address size attribute and operand size attribute. Such attributes are set for every code segment when the processor is executing in protected mode. The rule is simple. There is a 'D' flag in the segment descriptor which specifies the default size of both attributes. In 32-bit protected mode, when the 'D' flag is 0 both attributes are set to 16 bits, and when it is 1 both are set to 32 bits. In 64-bit mode the 'D' flag has to be 0, and the default address size attribute is 64 bits while the default operand size attribute is 32 bits. The default values can be overridden by the instruction prefixes 0x66, 0x67 and REX. Using the entry point structure we can set the default operand and address size attributes as if they were defined by a code segment, but it is a bit more flexible because we can set every combination of values, even those which are not available in real environments, for instance a 16-bit operand size attribute with a 32-bit address size attribute. If they are set to 0 (FCML_DS_UNDEF), FCML derives them from the chosen op_mode, supposing that the 'D' flag is 0.
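How the prefixes interact with the default attributes can be sketched as follows. This is a simplified model of the architectural rules, not FCML code. The function names are hypothetical, and instructions whose operand size defaults to 64 bits in 64-bit mode (near branches, PUSH, POP and so on) are deliberately ignored.

```c
/* Effective operand size in bits after applying the 0x66 prefix and REX.W
 * to the default operand size attribute (16 or 32). */
unsigned effective_operand_size(int is_64_bit_mode, unsigned default_osa,
                                int has_66, int rex_w) {
    if (is_64_bit_mode && rex_w) {
        return 64; /* REX.W takes precedence over the 0x66 prefix */
    }
    if (has_66) {
        return default_osa == 16 ? 32 : 16; /* 0x66 toggles 16 <-> 32 */
    }
    return default_osa;
}

/* Effective address size in bits after applying the 0x67 prefix to the
 * default address size attribute (16, 32 or 64). */
unsigned effective_address_size(int is_64_bit_mode, unsigned default_asa,
                                int has_67) {
    if (!has_67) {
        return default_asa;
    }
    if (is_64_bit_mode) {
        return 32; /* 0x67 switches 64-bit addressing to 32-bit */
    }
    return default_asa == 16 ? 32 : 16; /* 0x67 toggles 16 <-> 32 */
}
```

For a 32-bit entry point the 0x66 prefix switches the operand size to 16 bits, which is the reason why some mnemonics assemble to the same opcode with and without that prefix.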

The last field ip contains a value for the instruction pointer register. In other words, an address in the memory at which the instruction should be located in the code segment. When we are in 16-bit mode it is IP register, for 32-bit mode EIP register and RIP register for 64-bit mode.

Dialects

Various x86-64 assemblers use different instruction syntaxes to encode their instructions. There are two widely used syntax branches. The first one is called the Intel syntax, because it is used in Intel's manuals for the x86-64 architecture. The Intel syntax is widely used in Microsoft Windows and MS-DOS environments. The second syntax is called the AT&T dialect and was created at AT&T Bell Labs. It is mainly used in UNIX-like environments. The main differences between the two are: the source and destination operand order, the way an operand size is specified, the format of the effective address encoding, and register naming. For instance, the following two instructions represent the same machine code using the Intel and AT&T dialects:

One instruction encoded using different dialects
Dialect Machine code Instruction
AT&T 4D11648901 adc %r12,0x0000000000000001(%r9,%rcx,4)
Intel 4D11648901 adc qword ptr [r9+rcx*4+0000000000000001h],r12

It is out of the scope of the manual to describe all the differences in details, so if you are interested in one of these dialects and you would like to learn it, look for documentation that is dedicated to this subject.

FCML library provides the abstraction that can be used to add support for almost every existing assembler syntax. It is called dialect. Currently FCML supports two main dialects: the Intel dialect and the AT&T dialect called also a GAS dialect, because at the moment GNU Assembler (GAS) is in fact the reference implementation of the AT&T dialect. In theory the dialect model should be flexible enough to implement every existing assembler syntax, but it evolved together with these two supported dialects so it is possible that new implementations would have to extend it a bit. Anyway, currently there are no plans to support more dialects in the future.

From the user's point of view the dialect is just a syntax they would like to use while working with FCML library. Dialects are widely used by almost every function available in the library. Even if a function does not need the dialect to be passed to it explicitly it can use it for instance implicitly getting it from the assembler or disassembler instance passed to the function.

It is why creating a dialect instance is often the first thing that has to be done when initializing FCML library to work.

Initializing Intel dialect

In order to initialize the Intel dialect you have to use the following function, which is available inside the fcml_intel_dialect.h header file:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_dialect_init_intel( 
	fcml_uint32_t config_flags, 
	fcml_st_dialect **dialect );

The first argument is not used by the Intel dialect and should always be set to FCML_INTEL_DIALECT_CF_DEFAULT. It is present only to make the function declaration the same as for the AT&T dialect, which can be very useful when we have to store them in function pointers.

The second parameter is an output pointer for the initialized dialect.

If the function succeeds, the return value is set to FCML_CEH_GEC_NO_ERROR.

The following code shows how to initialize the dialect in practice:

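fcml_st_dialect *dialect;
fcml_ceh_error error;

/* The Intel dialect takes no configuration flags. */
error = fcml_fn_dialect_init_intel( FCML_INTEL_DIALECT_CF_DEFAULT, &dialect );
if( error != FCML_CEH_GEC_NO_ERROR ) {
	printf( "Cannot initialize the Intel dialect, error: %d\n", error );
	exit(1);
}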

Initializing AT&T (GAS) dialect

Initialization process of the AT&T dialect follows the same convention as the Intel dialect, so we have to use the following function, which is defined in fcml_gas_dialect.h header file:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_dialect_init_gas( 
	fcml_uint32_t config_flags, 
	fcml_st_dialect **dialect );

In case of the GAS dialect there is only one flag that can be set in order to configure it. It is FCML_GAS_DIALECT_CF_SYSV_SVR32_INCOMPATIBLE, which can be used to disable compatibility with the broken implementation of the non-commutative arithmetic floating point operations with two register operands introduced in SystemV/386 SVR3.2 assembler and spread across the world. If you know nothing about the problem, you definitely want a dialect which is fully compatible with the modern GAS implementations, so use FCML_GAS_DIALECT_CF_DEFAULT instead.

The second parameter is an output pointer for an initialized dialect.

If the function succeeds, the return value is set to FCML_CEH_GEC_NO_ERROR.

The following code shows how to initialize the dialect in practice:

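fcml_st_dialect *dialect;
fcml_ceh_error error;

/* Default flags: compatible with modern GAS implementations. */
error = fcml_fn_dialect_init_gas( FCML_GAS_DIALECT_CF_DEFAULT, &dialect );
if( error != FCML_CEH_GEC_NO_ERROR ) {
	printf( "Cannot initialize the AT&T dialect, error: %d\n", error );
	exit(1);
}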

Disposing dialect

Dialects can be treated like singletons and are fully thread safe (in fact they are read only), but unfortunately they often allocate some resources internally, like hash maps used to map mnemonics to their instruction definitions. It is why they have to be disposed when they are not needed any more. In order to do so, use the following function:

	LIB_EXPORT void LIB_CALL fcml_fn_dialect_free(fcml_st_dialect *dialect);

The following code shows how to dispose the dialect in practice:

	fcml_fn_dialect_free( dialect );

Important differences between dialects

Generic instruction models are dialect dependent. Designing the model to be dialect independent was not among the project goals and probably never will be, so if this is a must-have feature for you, be aware that it is not possible with the FCML library. The main arguments against it are as follows:

  • Such a model would be harder to analyse for people who are interested in only one dialect

  • There are important differences even in instruction mnemonics, which makes it almost impossible to use instruction codes (which could identify the instructions) in a consistent way. The POPA instruction is a good example here. In the Intel syntax the mnemonic POPAD is used when the operand size is 32 bits and the mnemonic POPA is used when the operand size is 16 bits. So in 32-bit mode the first mnemonic is assembled to 0x61 and the second one to 0x66, 0x61, forcing a 16-bit operand size by means of the appropriate prefix. In the AT&T syntax, however, POPA is used when the operand-size attribute is 32 bits and POPAW is used for the 16-bit operand-size attribute. As you can see, there is an obvious conflict here. So it is definitely more convenient for programmers to use the mnemonic they are used to rather than an instruction code, for instance F_POPA, and then specify the needed effective operand-size attribute somewhere in the model. (Codes are also provided by the disassembler, because they might be more convenient when analysing the code, especially for AT&T assemblers where one instruction may have multiple different mnemonics.)

  • They differ in the operand order so even if the Intel like ordering would be chosen here it would be unnatural for programmers using the AT&T syntax; to say nothing about inconsistencies in AT&T assemblers where some instructions use the Intel ordering anyway (see: ENTER, BOUND, MONITOR etc.)

  • There are also other AT&T inconsistencies, like broken non-commutative arithmetic floating point operations with two register operands (the FDIVR instruction encoded using FDIV with reversed operands!)
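The POPA conflict described above can be written down as a small lookup table. This is only an illustrative sketch: the types and the function `find_popa_form` are hypothetical and are not part of the FCML API. The byte values are the 32-bit mode encodings quoted in the text.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper types for this illustration; not part of FCML. */
typedef enum { SKETCH_DIALECT_INTEL, SKETCH_DIALECT_ATT } sketch_dialect;

typedef struct {
    sketch_dialect dialect;
    const char *mnemonic;
    unsigned char code[2];
    size_t length;
} sketch_popa_form;

/* Encodings in 32-bit mode, as listed in the text above. */
static const sketch_popa_form popa_forms[] = {
    { SKETCH_DIALECT_INTEL, "popad", { 0x61, 0x00 }, 1 }, /* 32-bit operand size */
    { SKETCH_DIALECT_INTEL, "popa",  { 0x66, 0x61 }, 2 }, /* 16-bit, 0x66 prefix */
    { SKETCH_DIALECT_ATT,   "popa",  { 0x61, 0x00 }, 1 }, /* 32-bit operand size */
    { SKETCH_DIALECT_ATT,   "popaw", { 0x66, 0x61 }, 2 }  /* 16-bit, 0x66 prefix */
};

/* Returns the matching table entry, or NULL if the pair is not in the table. */
const sketch_popa_form *find_popa_form(sketch_dialect dialect, const char *mnemonic) {
    size_t i;
    for (i = 0; i < sizeof(popa_forms) / sizeof(popa_forms[0]); i++) {
        if (popa_forms[i].dialect == dialect &&
            strcmp(popa_forms[i].mnemonic, mnemonic) == 0) {
            return &popa_forms[i];
        }
    }
    return NULL;
}
```

Looking up "popa" gives a two-byte encoding for the Intel dialect and a one-byte encoding for the AT&T dialect, which is why the mnemonic alone cannot serve as a dialect independent instruction code.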

The sections below describe the most important differences you should be aware of.

Far pointers

First, let's see how direct far pointers are written in the Intel and AT&T syntaxes:

Far pointers encoding by the Intel and AT&T syntax
Intel call far 6655h:44332211h
AT&T lcall $0x6655,$0x44332211

Now let's try to look at these instructions from the parser point of view.

In case of the Intel syntax it is a simple matter. This colon based syntax is dedicated to the far pointers so the parser is able to interpret this operand as FCML_OT_FAR_POINTER easily.

The problem appears with AT&T parsers, because as you can see there are just two immediate integers separated by a comma. The parser is context-less, so it is not able to guess that we want these operands to be interpreted as one far pointer operand. We also cannot assume that two consecutive immediate integers are always a far pointer, because that is simply not true (see: ENTER, INSERTQ). So in the AT&T syntax they are parsed into two FCML_OT_IMMEDIATE operands, and the dialect is then responsible for correcting the model in the preprocessing phase, when the instruction being assembled is already known.

So to sum up:

In case of the Intel dialect you have to provide one FCML_OT_FAR_POINTER operand.

In case of the AT&T dialect you can use one FCML_OT_FAR_POINTER operand, but two FCML_OT_IMMEDIATE operands will also be interpreted correctly. Notice, however, that the disassembler always disassembles such an addressing mode as one FCML_OT_FAR_POINTER operand, so using immediate operands breaks the symmetry here. Just treat it as a workaround for the ambiguous syntax and do not use it in manually prepared generic instruction models.
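To make the far pointer example more tangible, the sketch below merges the two AT&T immediates into one segment:offset pair and encodes it as CALL ptr16:32 (opcode 0x9A) for 32-bit mode. The type `sketch_far_pointer` and both functions are hypothetical stand-ins, not FCML structures or functions; the real correction is done internally by the dialect.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a far pointer operand; not the FCML structure. */
typedef struct {
    uint16_t segment;
    uint32_t offset;
} sketch_far_pointer;

/* Mirrors the idea of the AT&T preprocessing step: the first immediate is
 * the segment selector and the second one is the offset. */
sketch_far_pointer merge_immediates(uint16_t first_imm, uint32_t second_imm) {
    sketch_far_pointer pointer;
    pointer.segment = first_imm;
    pointer.offset = second_imm;
    return pointer;
}

/* Encodes CALL ptr16:32 for 32-bit mode: the opcode 0x9A, the offset in
 * little-endian order, then the segment selector in little-endian order.
 * The buffer must hold 7 bytes; the function returns the bytes written. */
size_t encode_far_call32(sketch_far_pointer pointer, uint8_t out[7]) {
    size_t i;
    out[0] = 0x9A;
    for (i = 0; i < 4; i++) {
        out[1 + i] = (uint8_t)(pointer.offset >> (8 * i));
    }
    out[5] = (uint8_t)(pointer.segment & 0xFF);
    out[6] = (uint8_t)(pointer.segment >> 8);
    return 7;
}
```

For the operands from the table above (segment 0x6655, offset 0x44332211) the function writes the bytes 9A 11 22 33 44 55 66.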

Relative addresses

Immediate relative addressing is used by branch instructions like CALL or JMP. Let's take a look at how such instructions are written using the Intel and AT&T syntaxes:

How relative jumps are encoded by the Intel and AT&T dialects
Intel jmp 00401001h
AT&T jmp 0x00401001

In case of AT&T everything is simple because it can be easily interpreted as an absolute offset (Immediate integers use $ prefix).

The problem appears when it comes to the Intel syntax, because value 00401001h is interpreted as an immediate value (FCML_OT_IMMEDIATE) and in fact it is how the Intel parser does its job here.

Fortunately, in case of the immediate relative addressing both dialects support FCML_OT_IMMEDIATE as well as FCML_OT_ADDRESS (FCML_AF_OFFSET), so both GIMs will be properly assembled.

In this case the disassembler does not take the used dialect into account and always returns relative addresses as FCML_OT_ADDRESS (FCML_AF_OFFSET) operands. So if you build the GIM manually and would like to preserve symmetry you should always use FCML_OT_ADDRESS (FCML_AF_OFFSET).
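It may help to recall what "relative" means for the encoded bytes: the displacement stored in the instruction is measured from the end of the branch instruction, not from its beginning. The helpers below are a minimal sketch of that arithmetic; their names are hypothetical and the instruction pointer used in the usage note is an assumed value, not something taken from FCML.

```c
#include <stdint.h>

/* Computes the displacement stored in a relative branch.
 * ip:                 address at which the branch instruction is placed,
 * instruction_length: total length of the encoded branch in bytes,
 * target:             absolute destination address. */
int64_t relative_displacement(uint64_t ip, uint32_t instruction_length,
                              uint64_t target) {
    return (int64_t)(target - (ip + instruction_length));
}

/* Returns 1 if the displacement fits in a signed 8-bit field (short form). */
int fits_in_rel8(int64_t displacement) {
    return displacement >= INT8_MIN && displacement <= INT8_MAX;
}
```

Assuming the instruction is placed at 00401000h, "jmp 00401001h" needs a displacement of -4 in the five-byte rel32 form and -1 in the two-byte rel8 form.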

Direct and indirect addresses

The sections below describe some general rules and differences you should be aware of dealing with the direct and indirect addressing for branches.

Intel dialect

In case of the Intel dialect indirect near and far addresses for branches are encoded using the standard effective addressing pattern. Like in the following examples:

jmp dword [00401001h]
jmp far dword ptr [ebx+00000001h]

The Intel parser converts these operands to FCML_OT_ADDRESS, and everything would be fine were it not for a small conflict with direct immediate relative addressing. As you probably already know, direct immediate relative addresses can be encoded in two ways: using FCML_OT_IMMEDIATE or FCML_OT_ADDRESS operands. The first form (FCML_OT_IMMEDIATE) is used by the Intel parser (for addresses encoded as a relative displacement) and the second one (FCML_OT_ADDRESS) is, for instance, returned by the disassembler. This is exactly the problem: two different addressing modes encode their operands in exactly the same way. For example:

jmp dword [00401001h] (indirect)
jmp 00401001h (direct)

From the GIM point of view both of these instructions can be encoded like this:

fcml_st_instruction instruction = {0};
instruction.mnemonic = "jmp";
instruction.operands[0].type = FCML_OT_ADDRESS;
instruction.operands[0].address.address_form = FCML_AF_OFFSET;
instruction.operands[0].address.size_operator = FCML_DS_32;
instruction.operands[0].address.offset.off32 = 0x00401001;
instruction.operands[0].address.offset.size = FCML_DS_32;
instruction.operands_count = 1;

So how does the assembler distinguish these operands and how does it know which addressing mode to choose?

The only way to instruct the assembler which addressing mode should be used is to use one of the instruction hints: FCML_HINT_INDIRECT_POINTER or FCML_HINT_DIRECT_POINTER. So for example the following instruction will be always assembled using the direct immediate relative addressing:

fcml_st_instruction instruction = {0};
instruction.mnemonic = "jmp";
instruction.operands[0].type = FCML_OT_ADDRESS;
instruction.operands[0].address.address_form = FCML_AF_OFFSET;
instruction.operands[0].address.offset.off32 = 0x00401001;
instruction.operands[0].address.offset.size = FCML_DS_32;
instruction.hints |= FCML_HINT_DIRECT_POINTER;
instruction.operands_count = 1;

In the Intel dialect the FCML_HINT_INDIRECT_POINTER hint is always the default in case of the described conflict. This is why an instruction such as "jmp dword [00401001h]" will always be encoded using indirect memory addressing even if the hint is not present. If you would nevertheless like to encode a direct immediate relative address using such an effective addressing pattern, you can use the "direct" hint to do so (remember that the Intel parser always encodes direct relative addresses as FCML_OT_IMMEDIATE), for instance "jmp direct dword [00401001h]". Take into account, however, that this is not the standard notation and as such it should generally be avoided. At the time of writing there is only one case where it might be useful: you can use the following notation to force rel16 addressing (16-bit operand size attribute) for immediate relative addressing: "jmp direct word [00401001h]", because it cannot be achieved otherwise at the moment.

Also notice that in case of immediate relative addressing, size_operator can be safely omitted, because it is used only to force a specific effective operand size attribute for the instruction (this is why the trick above works). If it is not set, the most relevant effective operand size attribute is chosen (if the default optimizer is used) with respect to the chosen processor operating mode (16, 32 or 64 bits).

Remember that the disassembler always sets appropriate hints, even if they are optional in given context in order to avoid potential ambiguities.

AT&T dialect

In case of the AT&T dialect indirect addressing for branches as well as the direct relative addressing are encoded using the standard effective addressing pattern. Like in the following examples:

jmp 0x90d11004
jmp *0x90d11004

Unlike the Intel dialect, the AT&T parser converts both of these forms to the FCML_OT_ADDRESS operands. This rule follows the standard AT&T syntax so everything is consistent here. For the AT&T dialect we will never get a direct relative address encoded as FCML_OT_IMMEDIATE operand. Of course the disassembler supports it anyway, but you should never use it in practice, because it is unnatural for the AT&T syntax. To say nothing of the symmetry.

Like in the Intel dialect there is also a conflict here, because both of these instructions are encoded to the same GIM. Fortunately, even though there is a conflict on the operand level, the two instructions are not ambiguous at all thanks to the indirection operator (*). The AT&T parser will always set the FCML_HINT_INDIRECT_POINTER hint for the second instruction, and everything would be fine if we used only the parser to build generic instruction models. The problem appears when we want to use a manually prepared GIM, because in order to avoid conflicts hints are needed for both addressing modes. Direct relative immediate addressing needs the FCML_HINT_DIRECT_POINTER hint, whereas indirect memory addressing is forced using the FCML_HINT_INDIRECT_POINTER hint. To be consistent, the AT&T dialect uses FCML_HINT_DIRECT_POINTER as the default hint for branches which use direct relative addresses, so you only have to set the FCML_HINT_INDIRECT_POINTER hint in order to force indirect addressing.

The AT&T disassembler always sets appropriate hints for both operating modes in order to avoid potential ambiguities.
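The default rules for both dialects can be summarised as a tiny decision function. This is only a sketch of the behaviour described in this chapter for a branch with an FCML_OT_ADDRESS (FCML_AF_OFFSET) operand. All names below are hypothetical stand-ins, the numeric values of the real hint constants are not assumed, and the precedence used when both hints are set is an arbitrary choice of this sketch.

```c
/* Stand-ins for this sketch only; the real hints are defined by FCML. */
enum {
    SKETCH_HINT_INDIRECT_POINTER = 0x1,
    SKETCH_HINT_DIRECT_POINTER   = 0x2
};

typedef enum { SKETCH_INTEL, SKETCH_ATT } sketch_branch_dialect;

typedef enum {
    SKETCH_DIRECT_RELATIVE,  /* direct immediate relative addressing */
    SKETCH_INDIRECT_MEMORY   /* indirect memory addressing */
} sketch_branch_mode;

/* Chooses the addressing mode for an ambiguous offset operand. */
sketch_branch_mode resolve_branch_mode(sketch_branch_dialect dialect,
                                       unsigned hints) {
    /* An explicit hint always wins (direct first: a choice of this sketch). */
    if (hints & SKETCH_HINT_DIRECT_POINTER) {
        return SKETCH_DIRECT_RELATIVE;
    }
    if (hints & SKETCH_HINT_INDIRECT_POINTER) {
        return SKETCH_INDIRECT_MEMORY;
    }
    /* No hint: Intel defaults to indirect, AT&T defaults to direct. */
    return dialect == SKETCH_INTEL ? SKETCH_INDIRECT_MEMORY
                                   : SKETCH_DIRECT_RELATIVE;
}
```

The practical consequence is that a manually prepared GIM without hints is assembled differently depending on the dialect, so setting the hint explicitly is the safest option.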

Data types

The header file fcml_types.h defines all simple types and their nullable counterparts used by the implementation of the FCML library. These types are also used by structures and functions being part of the library interface, so you should take a look at them:

Simple data types supported by FCML
Type Size Description
fcml_char 8 bits One ASCII character.
fcml_string variable An array of ASCII characters.
fcml_float 32 bits Float values (Not used yet, reserved for future use.).
fcml_usize 32 bits Data type for all unsigned sizes.
fcml_size 32 bits Data type for all signed sizes.
fcml_flags 32 bits Data type for flags based on bit masks.
fcml_int Architecture dependent Counterpart of the standard int type.
fcml_bool Architecture dependent Boolean data type. Possible values: FCML_TRUE, FCML_FALSE.
fcml_int8_t 8 bits A signed 8 bits integer.
fcml_uint8_t 8 bits An unsigned 8 bits integer.
fcml_int16_t 16 bits A signed 16 bits integer.
fcml_uint16_t 16 bits An unsigned 16 bits integer.
fcml_int32_t 32 bits A signed 32 bits integer.
fcml_uint32_t 32 bits An unsigned 32 bits integer.
fcml_int64_t 64 bits A signed 64 bits integer.
fcml_uint64_t 64 bits An unsigned 64 bits integer.

Every integer type other than fcml_int has a nullable counterpart. They are structures which consist of an integer value itself and a boolean field indicating whether the value is set or not. For example:

typedef struct fcml_nuint32_t {
	fcml_uint32_t value;
	fcml_bool is_not_null;
} fcml_nuint32_t;
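A nullable value is typically read by checking the flag first. The following sketch redeclares the needed types locally so that it compiles without the FCML headers; in real code they come from fcml_types.h. The typedef of fcml_bool as int and the helper `nuint32_or_default` are assumptions made for this illustration only.

```c
#include <stdint.h>

/* Local stand-ins so the sketch compiles on its own; in real code these
 * come from fcml_types.h. The underlying type of fcml_bool is assumed. */
typedef uint32_t fcml_uint32_t;
typedef int fcml_bool;
#define FCML_TRUE 1
#define FCML_FALSE 0

typedef struct fcml_nuint32_t {
    fcml_uint32_t value;
    fcml_bool is_not_null;
} fcml_nuint32_t;

/* Hypothetical helper: returns the stored value, or the fallback if the
 * nullable value is not set. */
fcml_uint32_t nuint32_or_default(fcml_nuint32_t nullable,
                                 fcml_uint32_t fallback) {
    return nullable.is_not_null ? nullable.value : fallback;
}
```

Notice that a zero-initialized nullable value has is_not_null equal to FCML_FALSE in this sketch, so it is treated as "not set".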

There is one more important complex data type fcml_st_integer. It is a general purpose container for integer values that also holds the size and sign of the integer value:

typedef struct fcml_st_integer {
	fcml_data_size size;
	fcml_bool is_signed;
	fcml_int8_t int8;
	fcml_int16_t int16;
	fcml_int32_t int32;
	fcml_int64_t int64;
} fcml_st_integer;
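Only the field that matches the size is meaningful, so a consumer usually widens it to one common type. The sketch below does that with a hypothetical helper `integer_to_int64`. It assumes that size is expressed in bits (8, 16, 32 or 64) and redeclares the types locally so that it compiles on its own; check fcml_types.h and fcml_common.h for the real definitions.

```c
#include <stdint.h>

/* Local stand-ins; the real definitions come from the FCML headers.
 * The underlying types of fcml_bool and fcml_data_size are assumptions. */
typedef int fcml_bool;
typedef uint32_t fcml_data_size;
typedef int8_t fcml_int8_t;
typedef int16_t fcml_int16_t;
typedef int32_t fcml_int32_t;
typedef int64_t fcml_int64_t;

typedef struct fcml_st_integer {
    fcml_data_size size;
    fcml_bool is_signed;
    fcml_int8_t int8;
    fcml_int16_t int16;
    fcml_int32_t int32;
    fcml_int64_t int64;
} fcml_st_integer;

/* Hypothetical helper: widens the active field to 64 bits. Returns 1 on
 * success and 0 if the size is not one of 8, 16, 32 or 64. */
int integer_to_int64(const fcml_st_integer *integer, int64_t *out) {
    switch (integer->size) {
    case 8:
        *out = integer->is_signed ? (int64_t)integer->int8
                                  : (int64_t)(uint8_t)integer->int8;
        return 1;
    case 16:
        *out = integer->is_signed ? (int64_t)integer->int16
                                  : (int64_t)(uint16_t)integer->int16;
        return 1;
    case 32:
        *out = integer->is_signed ? (int64_t)integer->int32
                                  : (int64_t)(uint32_t)integer->int32;
        return 1;
    case 64:
        /* Unsigned 64-bit values above INT64_MAX do not fit in int64_t. */
        *out = integer->int64;
        return 1;
    default:
        return 0;
    }
}
```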

There are also constants with limits for integer values as well as patterns for text formatting functions from standard I/O libraries, see fcml_types.h for more details.

Error handling

The whole FCML library consistently follows the same error handling convention. All structures, functions and macros dedicated to error handling are defined in the fcml_errors.h header file. The most commonly used data type is without a doubt fcml_ceh_error. It is used to return an error code from almost every FCML library function. The possible error codes are defined by the fcml_en_ceh_error_globals enumeration, which can be found in the fcml_errors.h header file.

There is also another enumeration, fcml_en_ceh_message_errors, which contains error codes dedicated to textual messages. The global error codes can also be used as message error codes, which is why the two enumerations use different ranges.

The main structure behind error handling is fcml_st_ceh_error_container which is a container that holds one or more textual error messages:

typedef struct fcml_st_ceh_error_container {
	fcml_st_ceh_error_info *errors;
	fcml_st_ceh_error_info *last_error;
} fcml_st_ceh_error_container;

The field errors points to the first error available in the container. If the container is empty it is set to NULL. The next field, last_error, points to the last error on the list. The following structure describes one error message:

typedef struct fcml_st_ceh_error_info {
	struct fcml_st_ceh_error_info *next_error;
	fcml_string message;
	fcml_ceh_error code;
	fcml_en_ceh_error_level level;
} fcml_st_ceh_error_info;

The first field, next_error, points to the next error on the list. For the last one it is set to NULL. The field message contains a textual representation of the error whose code is available in the code field. Every error message also has a level associated with it. Currently there are two supported levels: FCML_EN_CEH_EL_WARN and FCML_EN_CEH_EL_ERROR. The first one is intended for warning messages, which might be returned even if a given function ended with success, and the second one is for critical errors.
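Since the container is a plain singly linked list, reporting all messages is a simple loop. The sketch below redeclares the structures locally so that it compiles without the library; in real code they come from fcml_errors.h. The underlying types of fcml_string and fcml_ceh_error, the numeric values of the level constants and the helper `print_errors` are assumptions of this illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Local stand-ins; the real definitions live in the FCML headers. */
typedef char *fcml_string;
typedef unsigned int fcml_ceh_error;
typedef enum {
    FCML_EN_CEH_EL_WARN,
    FCML_EN_CEH_EL_ERROR
} fcml_en_ceh_error_level;

typedef struct fcml_st_ceh_error_info {
    struct fcml_st_ceh_error_info *next_error;
    fcml_string message;
    fcml_ceh_error code;
    fcml_en_ceh_error_level level;
} fcml_st_ceh_error_info;

typedef struct fcml_st_ceh_error_container {
    fcml_st_ceh_error_info *errors;
    fcml_st_ceh_error_info *last_error;
} fcml_st_ceh_error_container;

/* Hypothetical helper: prints every message in the container and returns
 * the number of messages with the error (not warning) level. */
size_t print_errors(const fcml_st_ceh_error_container *container, FILE *out) {
    size_t error_count = 0;
    const fcml_st_ceh_error_info *info;
    for (info = container->errors; info != NULL; info = info->next_error) {
        int is_error = (info->level == FCML_EN_CEH_EL_ERROR);
        fprintf(out, "%s (code %u): %s\n", is_error ? "ERROR" : "WARN",
                (unsigned int)info->code, info->message);
        if (is_error) {
            error_count++;
        }
    }
    return error_count;
}
```

An empty container has errors set to NULL, so the loop body is simply never entered.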

There are also dedicated error codes for warnings, take a look at fcml_errors.h for more details.

Configuring environment

The header file fcml_env.h declares three functions that can be used to register dedicated handlers for memory management subsystem. By registering them you can provide your own implementation of the functions which allocate memory blocks, reallocate them and free the ones that are not needed any more. Every internal FCML component (Even Flex lexical analysers and Bison parsers) uses these handlers to manage the memory heap, so after changing them you should not see any direct calls to the standard heap management functions like malloc or free.

In order to register a new handler you have to prepare the implementation first. To do so you have to implement functions that match the following function declarations:

typedef fcml_ptr (*fcml_fp_env_memory_alloc_handler)( fcml_usize size );
typedef fcml_ptr (*fcml_fp_env_memory_realloc_handler)( fcml_ptr ptr, fcml_usize size );
typedef void (*fcml_fp_env_memory_free_handler)( fcml_ptr memory_block );

Then you have to use the following functions to register the new handlers:

fcml_fp_env_memory_alloc_handler fcml_fn_env_register_memory_alloc_handler(
	fcml_fp_env_memory_alloc_handler handler );
fcml_fp_env_memory_realloc_handler fcml_fn_env_register_memory_realloc_handler(
	fcml_fp_env_memory_realloc_handler handler );
fcml_fp_env_memory_free_handler fcml_fn_env_register_memory_free_handler(
	fcml_fp_env_memory_free_handler handler );

So for example, in order to register a new handler for the memory allocation function you should call the fcml_fn_env_register_memory_alloc_handler function passing the pointer to the new handler implementation as the function parameter. As a result the function returns the pointer to the old allocation handler so you can use it to restore the original handler if needed. The working example is available in "check/internal-tests/env_t.c" unit test.
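As an illustration, the handlers below wrap the standard heap functions and count the blocks that are still allocated, which can be handy when hunting for leaks. They only match the handler declarations shown above; fcml_ptr and fcml_usize are redeclared locally (their underlying types are assumed), and the names `counting_alloc`, `counting_realloc`, `counting_free` and `counting_live_blocks` are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Local stand-ins; in real code these come from fcml_types.h. */
typedef void *fcml_ptr;
typedef uint32_t fcml_usize;

/* Number of blocks handed out and not freed yet (not thread-safe). */
static long live_blocks = 0;

/* Matches fcml_fp_env_memory_alloc_handler. */
fcml_ptr counting_alloc(fcml_usize size) {
    fcml_ptr block = malloc(size);
    if (block != NULL) {
        live_blocks++;
    }
    return block;
}

/* Matches fcml_fp_env_memory_realloc_handler. */
fcml_ptr counting_realloc(fcml_ptr ptr, fcml_usize size) {
    fcml_ptr block = realloc(ptr, size);
    /* realloc(NULL, n) behaves like malloc(n), so it creates a new block.
     * A size of 0 is not handled specially in this sketch. */
    if (ptr == NULL && block != NULL) {
        live_blocks++;
    }
    return block;
}

/* Matches fcml_fp_env_memory_free_handler. */
void counting_free(fcml_ptr memory_block) {
    if (memory_block != NULL) {
        live_blocks--;
        free(memory_block);
    }
}

long counting_live_blocks(void) {
    return live_blocks;
}
```

With the library linked in, these functions would be passed to fcml_fn_env_register_memory_alloc_handler, fcml_fn_env_register_memory_realloc_handler and fcml_fn_env_register_memory_free_handler, and the returned pointers kept in order to restore the original handlers later.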

Pseudo operations

A pseudo-operation is an instruction to the assembler that does not generate any machine code. Every pseudo operation has its own independent and dedicated implementation. Currently only one pseudo operation is supported by FCML. It is db in case of the Intel dialect and .byte in case of the AT&T (GAS) dialect. This operation was added only in order to handle unknown instructions properly. So if the disassembler configuration field fail_if_unknown_instruction is set to FCML_FALSE, the db/.byte pseudo operation is generated for the current byte instead of returning FCML_CEH_GEC_UNKNOWN_INSTRUCTION. In both cases the instruction is encoded using the instruction mnemonic and one operand of the type FCML_OT_IMMEDIATE (8 bits). The following examples show the syntax for both dialects:

Pseudo operation
AT&T label: .byte 0x12
Intel label: db 0x12

The instruction label is of course optional and can be ignored.

Symbols

Symbols are used by parsers in place of constant values and to declare instruction labels. Generic instruction models need all values explicitly set, so all symbols have to be resolved on the parser layer. It is why the symbol tables are available only in the fcml_st_parser_context and fcml_st_lag_assembler_context structures (Remember that the multi-pass load-and-go assembler works with textual instructions, so they have to be parsed as well). The following example shows the proper way to use symbols:

label: adc eax, ( 128 + MAX_PATH ) * FACTOR

As you can see there are three symbols used. The first one "label" is a symbol declaration. The second and the third are used as constant values.

The main structure behind symbols is the symbol itself represented as the fcml_st_symbol structure and the data type fcml_st_symbol_table used to represent the symbol table. The symbol table is used to exchange symbols with parsers.

First of all let's take a look at the fcml_st_symbol structure:

typedef struct fcml_st_symbol {
	fcml_string symbol;
	fcml_int64_t value;
} fcml_st_symbol;

The first field stores a symbol name. The second field holds a symbol value. As it is described in the chapter about parsers, every parser works on 64-bit values. It is why the symbol value has 64 bits here.

The symbol table can be used in both directions. It means that you can use it in order to declare constants for the parser and to access labels declared by the parser. For more information about using symbols together with parsers do not hesitate to take a look at the Parser chapter.

For now, let's try to create our first symbol table and to declare the two symbols that have been used in the example above.

In order to create a new symbol table you have to use the fcml_fn_symbol_table_alloc function:

	LIB_EXPORT fcml_st_symbol_table LIB_CALL fcml_fn_symbol_table_alloc();

For instance having initialized the parser context you can declare it like this:

fcml_st_parser_context context = {0};
…
context.symbol_table = fcml_fn_symbol_table_alloc();

Of course you should check if the function succeeded, because it may return a NULL in case of a lack of memory.

Now feel free to add as many symbols as you want to the table, but there is one more important thing you have to know before allocating anything.

Every symbol should be allocated by the dedicated function fcml_fn_symbol_alloc. This is really important if you would like to use standard functions like fcml_fn_symbol_remove_all or fcml_fn_symbol_table_free to free symbols, because they have to use the same shared heap. It is especially important in case of Windows binaries which were statically linked with standard libraries.

The declaration of the mentioned function:

fcml_st_symbol* LIB_CALL fcml_fn_symbol_alloc(
	fcml_string symbol_name,
	fcml_int64_t value );

The function gets a symbol name and a symbol value as parameters and returns the allocated symbol. The symbol name is duplicated and can be safely freed as soon as the function finishes.

OK, we know how to allocate the symbol table and the symbols themselves. Now let's try to add the new symbol to the symbol table. In order to do so you have to use the following function:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_symbol_add( 
	fcml_st_symbol_table symbol_table, 
	fcml_st_symbol *symbol );

It adds a symbol to a symbol table passed in the first parameter.

Alternatively you may consider the use of the fcml_fn_symbol_add_raw function which allocates a new symbol and adds it to a symbol table in one step:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_symbol_add_raw( 
	fcml_st_symbol_table symbol_table, 
	fcml_string symbol,
	fcml_int64_t value );

The first parameter is the destination symbol table. The second parameter is the symbol name and the last one is the symbol value. Every symbol name is duplicated so feel free to deallocate it as soon as the function returns. If the function succeeds FCML_CEH_GEC_NO_ERROR is returned.

Allocating the constant values from the example above (Without error handling):

fcml_st_symbol_table table = fcml_fn_symbol_table_alloc();
fcml_st_symbol *symbol = fcml_fn_symbol_alloc( "MAX_PATH", 256 );
fcml_fn_symbol_add( table, symbol );
symbol = fcml_fn_symbol_alloc( "FACTOR", 2 );
fcml_fn_symbol_add( table, symbol );
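To see what the parser makes of these two constants, the sketch below resolves them from a plain array and evaluates the immediate operand from the earlier example, ( 128 + MAX_PATH ) * FACTOR. It deliberately does not use the FCML symbol table: the type `sketch_symbol` and both functions are hypothetical and only imitate the lookup.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name/value pair; not the fcml_st_symbol structure. */
typedef struct {
    const char *name;
    int64_t value;
} sketch_symbol;

/* Returns 1 and stores the value if the symbol exists, 0 otherwise. */
int sketch_symbol_get(const sketch_symbol *symbols, size_t count,
                      const char *name, int64_t *value) {
    size_t i;
    for (i = 0; i < count; i++) {
        if (strcmp(symbols[i].name, name) == 0) {
            *value = symbols[i].value;
            return 1;
        }
    }
    return 0;
}

/* Evaluates ( 128 + MAX_PATH ) * FACTOR using the given symbols.
 * Returns 0 if one of the symbols is not defined. */
int evaluate_example_operand(const sketch_symbol *symbols, size_t count,
                             int64_t *result) {
    int64_t max_path;
    int64_t factor;
    if (!sketch_symbol_get(symbols, count, "MAX_PATH", &max_path) ||
        !sketch_symbol_get(symbols, count, "FACTOR", &factor)) {
        return 0;
    }
    *result = (128 + max_path) * factor;
    return 1;
}
```

With MAX_PATH equal to 256 and FACTOR equal to 2 the operand evaluates to 768, and this is the explicit value that ends up in the generic instruction model.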

Now we know how to allocate a symbol table and fill it with symbols, but what if we would like to get a label declared by the parser? It is quite easy. There is a function fcml_fn_symbol_get which can be used to retrieve any symbol from a symbol table:

LIB_EXPORT fcml_st_symbol* LIB_CALL fcml_fn_symbol_get( 
	fcml_st_symbol_table symbol_table, 
	fcml_string symbol );

It gets a symbol table and a symbol name as parameters and returns the corresponding symbol or NULL if there is no such symbol in the symbol table.

There is one more useful function that can be used to manipulate symbol tables. It is fcml_fn_symbol_remove which removes a symbol of matching name from a symbol table. It only removes the symbol from the symbol table, but does not free it in any way:

	LIB_EXPORT void LIB_CALL fcml_fn_symbol_remove( 
		fcml_st_symbol_table symbol_table,
		fcml_string symbol );

If you would like to free it manually you have to use a different function:

LIB_EXPORT void LIB_CALL fcml_fn_symbol_free( fcml_st_symbol *symbol );

When a symbol table is not needed any more it should be freed. In order to do so you can use the fcml_fn_symbol_table_free function which frees all symbols available in the symbol table and then deallocates the symbol table itself.

You can also clean a symbol table by freeing all symbols available in it using the fcml_fn_symbol_remove_all function:

LIB_EXPORT void LIB_CALL fcml_fn_symbol_remove_all( 
	fcml_st_symbol_table symbol_table );

Assembler

The FCML assembler is a one-line load-and-go assembler implementation. It means that it is able to generate the machine code even directly to a real code segment where it can then be executed. It also means that it generates the machine code only, so you cannot generate a whole executable file with all headers and chunks using the library.

The FCML assembler works with the GIM (see: Generic instruction model), so in order to assemble anything you are obliged to prepare the correct GIM and then pass it to the assembler. Such a solution has the following advantages:

  • You are able to build instructions in a dynamic manner and it is more convenient than simple string concatenation.

  • You are able to disassemble a piece of machine code directly to the GIM, analyse it, then for instance you may decide to change the offset of a branch instruction and pass such a modified GIM directly to the assembler to assemble it back. If instructions were represented as strings it would be pretty close to being impossible, or at least really hard to achieve :)

  • You can parse an instruction to the GIM, analyse it (for example to check whether the user tried to use a restricted register) and then pass such a verified model to the assembler.

Initializing assembler instance

Before the assembler can be used, it needs to be initialized with an instance of the dialect (see: Dialects), so it is time to take a look at the initialization function provided by the FCML library:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_assembler_init( 
	fcml_st_dialect *context, 
	fcml_st_assembler **assembler );

After the function is successfully invoked the assembler parameter points to the initialized assembler.

Notice that every assembler instance is thread-safe, hence it can be used across many independent threads, but initialization should be synchronized as long as the assembler parameter points to a shared pointer.

Every instance of the assembler should be freed when it is no longer needed. We are able to achieve that using the following function:

LIB_EXPORT void LIB_CALL fcml_fn_assembler_free( 
	fcml_st_assembler *assembler );

Just pass the assembler instance as the function parameter and that is all.

There is one more thing that should be pointed out here. In the paragraph above you learnt that assemblers are thread-safe and their instances can be safely shared across independent threads and that initialization has to be synchronized when we would like to initialize only one assembler instance for multiple threads. This rule is also true in case of the assembler freeing process, which also has to be synchronized. So it should be enough to remember that only the fcml_fn_assemble function which shares an assembler instance is really thread-safe.

The best pattern for multi-threaded environment is to initialize an assembler instance on the main thread, then use multiple independent working threads to share the assembler and at the end when working threads are dead, free the assembler using the main thread again.

Initialization example:

fcml_st_assembler *assembler;
fcml_ceh_error error = fcml_fn_assembler_init( dialect, &assembler );
if( error ) {
	printf("Can not initialize the assembler: error %d.\n", error);
	exit(1);
}
fcml_fn_assembler_free( assembler );

Assembling generic instruction model

When we already have an assembler instance initialized, we can use it to assemble an example GIM. In order to do so, we have to use the following function:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_assemble( 
	fcml_st_assembler_context *context, 
	const fcml_st_instruction *instruction, 
	fcml_st_assembler_result *result );

As you can see it gets three parameters. The first one is an assembler context which contains some information about the environment as well as the assembler configuration that should be used to assemble one instruction. The second parameter is the generic instruction model describing an instruction and the last one is a place where the assembler puts all the results. So let's start with the assembler context first.

Assembler context

An assembler context holds information about the environment. It tells FCML which assembler instance it should use to assemble the instruction code, an assembler configuration to be used while assembling the instruction and an entry point describing a virtual code segment (see: Understanding entry point). The context is reusable and can be shared between multiple function calls, but it should not be used across multiple threads, mainly due to the consistency with the disassembler and the fact that the assembler can modify the entry point in some circumstances. Take a look at the fcml_st_assembler_context structure declaration:

typedef struct fcml_st_assembler_context {
    fcml_st_assembler *assembler;
    fcml_st_assembler_conf configuration;
    fcml_st_entry_point entry_point;
} fcml_st_assembler_context;

The first property assembler has to be set to the initialized assembler instance. The second property configuration consists of all configuration properties. The fcml_st_assembler_conf structure declaration is available in the fcml_assembler.h header file and is well documented there, so let's focus on the properties it defines only:

increment_ip

Set to true if you would like the instruction pointer to be incremented after an instruction is assembled. It is incremented by the length of the assembled instruction chosen as a best option (By the instruction chooser). It can be very useful when we are assembling multiple instructions (which follow each other) one by one using the same assembler context.

enable_error_messages

Set to true if you want textual error messages to be generated (See: Error handling).

choose_sib_encoding

This option is rarely useful. It tells the assembler to encode the ModR/M byte together with the SIB byte even if the SIB byte is optional in a certain case. In the day-to-day usage it should always be set to FCML_FALSE.

choose_abs_encoding

It is useful in case of the 64-bit processor operating mode. If it is set to FCML_TRUE it disables the RIP addressing and forces the assembler to encode addresses as absolute offsets; otherwise it tells the assembler to use the RIP addressing and to encode addresses as displacements relative to the instruction pointer.

force_rex_prefix

Sometimes the REX prefix is useless so it is just omitted in the final machine code. By setting this field to FCML_TRUE you can force the REX prefix to be added anyway. In the day-to-day usage it should be always set to FCML_FALSE.

force_three_byte_VEX

The assembler always chooses the shortest form of the VEX/XOP prefixes an instruction can be encoded with, however by setting this field to FCML_TRUE you can force the assembler to always encode this prefix on three bytes. In the day-to-day usage it should be always set to FCML_FALSE.

optimizer

This field allows us to set a custom optimizer that should be used by the assembler (See: Optimizers). Setting this property to NULL results in choosing the default optimizer implementation.

optimizer_flags

Optional flags which will be passed to the optimizer.

chooser

This field defines a custom instruction chooser (See: Instruction choosers). Setting this property to NULL results in choosing the default chooser implementation.

Looking at these configuration properties you might be under the impression that there are only two configuration properties which are really useful if you do not need any special power and it is of course true. In the day-to-day usage the configuration should look something like this, according to your need:

context.configuration.increment_ip = FCML_TRUE;
context.configuration.enable_error_messages = FCML_TRUE;

The last property inside the assembler context is the entry_point. This structure is widely used across the FCML library and has its own chapter, so do not hesitate to take a look at it (Understanding entry point).

Optimizers

To explain it in simple terms we need an example here. So let's take a look at this instruction: "call qword ptr [rdi+1h]", which should be assembled to the following code: 0xFF5701.

Although the effective operand size attribute is always forced to 64 bits for this instruction, we can still try to force the 16-bit operand size using the 0x66 prefix: 0x66FF5701, or even combine it with the optional REX.W bit set to 1 (which is not necessary, because the 64-bit operand size attribute is forced anyway): 0x6648FF5701. Although these encodings are perfectly allowed, they are superfluous and can even hurt performance. In general, optimizers are responsible for choosing the best combination of the effective operand size attribute and the effective address size attribute for assembled instructions. It is a slightly more advanced subject, and you should definitely spend a little while analysing the fcml_optimizers.h header file and the default optimizer implementation fcml_fn_asm_default_optimizer if you really need to write your own optimizer.

Instruction choosers

As you probably already know, assemblers are able to assemble one instruction to more than one piece of machine code. If there are any alternatives, it is up to the instruction chooser to choose the most relevant piece of machine code and mark it for the user. The default implementation just returns the shortest instruction form available, so it is a pretty simple implementation. If you would like to implement your own instruction chooser, do not hesitate to take a look at the fcml_choosers.h header file and the default implementation itself: fcml_fn_asm_default_instruction_chooser. There is no context available for the choosers yet, so the only thing you can analyse to make the decision is the machine code itself. It is a bit limited for the moment and the API will probably be extended in the future.
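The "shortest form wins" strategy can be sketched in a few lines. Notice that this is not the real chooser interface (see fcml_choosers.h for that): the node type below is a simplified, hypothetical stand-in for a chain of assembled instructions and `choose_shortest` is an illustrative name.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical node of an assembled instruction chain. */
typedef struct sketch_assembled_instruction {
    struct sketch_assembled_instruction *next;
    const uint8_t *code;
    size_t code_length;
} sketch_assembled_instruction;

/* Returns the shortest instruction in the chain, or NULL for an empty
 * chain. If several forms have the same length, the first one wins. */
sketch_assembled_instruction *choose_shortest(
        sketch_assembled_instruction *head) {
    sketch_assembled_instruction *best = head;
    sketch_assembled_instruction *current;
    for (current = head; current != NULL; current = current->next) {
        if (current->code_length < best->code_length) {
            best = current;
        }
    }
    return best;
}
```

Applied to the three encodings of "call qword ptr [rdi+1h]" discussed in the Optimizers chapter (0x6648FF5701, 0x66FF5701 and 0xFF5701), the function picks the three-byte form.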

Preparing assembler result

Every invocation of the assembler needs to return a piece of assembled code as the result. The structure fcml_st_assembler_result is a container for anything that assembler can produce:

typedef struct fcml_st_assembler_result {
	fcml_st_ceh_error_container errors;
	fcml_st_assembled_instruction *instructions;
	fcml_st_assembled_instruction *chosen_instruction;
	fcml_usize number_of_instructions;
} fcml_st_assembler_result;

The first field contains textual errors with the reason of the failure. It also may contain warning messages which might be available there even if the function succeeded. The remaining fields hold information about assembled instructions. The field instructions points to the head of the instruction chain. The best instruction chosen by the assembler (See: Instruction choosers) is returned in the chosen_instruction field and the number of available instructions is available in the last number_of_instructions field.

The most interesting thing here is the assembled instruction itself, so let's take a look at the fcml_st_assembled_instruction structure:

typedef struct fcml_st_assembled_instruction {
    struct fcml_st_assembled_instruction *next;
    fcml_st_ceh_error_container warnings;
    fcml_uint8_t *code;
    fcml_usize code_length;
} fcml_st_assembled_instruction;

As you can see, there are warnings related to the assembled instruction and a pointer to the next instruction in the chain, but it is the machine code that is most important for us. The field code points to the machine code representation of the assembled instruction and code_length is the length of the machine code in bytes.

Now we know how the structures are organized, so it is time to pay some attention to the general usage, but at first let's take a look at the declaration of the assembling function again:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_assemble( 
	fcml_st_assembler_context *context, 
	const fcml_st_instruction *instruction, 
	fcml_st_assembler_result *result );

As you can see, the assembler result is passed as a plain pointer to the result structure, so the structure cannot be allocated and returned by the assembler. The function could of course have been implemented to take a pointer to a pointer, which the assembler would use to return a newly allocated structure on every invocation. Such a solution would be impractical, because you would be obliged to free the structure after every invocation of the assembler and it could not be reused in any way. Hence, to avoid unnecessary allocations and deallocations, a reusable structure is used here. You can prepare a single assembler result and use it for every invocation of the assembler within one thread.

Of course the structure holds memory regions allocated by the assembler that have to be freed eventually, but this is done transparently by the assembler itself: when the assembler gets the same structure again, it checks whether anything is allocated in it and deallocates everything (warnings, assembled instructions) in order to prepare it for the new assembling process. You have to deallocate it on your own only when it is not needed any more. In such a case you are obliged to free the result of the last assembling process using the fcml_fn_assembler_result_free function. Remember that this function frees all internal structures, but the main container remains valid and you are responsible for freeing it (or reusing it). Consequently, even when we call the assembler for the first time, it checks the result and tries to deallocate everything it finds there. This is dangerous for uninitialized structures, because they may contain garbage. That is why the assembler result has to be prepared before the first call, and the fcml_fn_assembler_result_prepare function is responsible for that.

To sum up, let's take a look at the following piece of code. It shows the pattern we should follow when working with the result structure:

fcml_st_assembler_result asm_result;
fcml_fn_assembler_result_prepare( &asm_result );
while(...) {
    fcml_fn_assemble( &asm_context, &instruction, &asm_result );
    ...
}
fcml_fn_assembler_result_free( &asm_result );
Note: you have to copy the machine code of an assembled instruction if you need it later, because it will be deallocated by the next call to the assembler.

Although the assembler result structure is reusable, remember that it cannot be used by multiple threads at the same time, because this may lead to anything from memory leaks to unexpected crashes due to memory corruption.

Invoking assembler

Until now, we have prepared all instruction parameters, so let's see the assembling function declaration again:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_assemble( 
	fcml_st_assembler_context *context, 
	const fcml_st_instruction *instruction, 
	fcml_st_assembler_result *result );

In order to assemble anything, we have to call it with the parameters described in the previous chapters. There is nothing complicated about it, but if you would like to see example code, take a look at the quick guide for the assembler.

The assembler uses the mnemonic from the GIM to get all instruction forms available for it. Then it checks which forms can be used in order to assemble the given instruction model. When instruction forms are validated and accepted, the optimizer chooses the best combination of the address size attribute and operand size attribute for each accepted instruction. In the next step everything is passed to the instruction encoder which consists of a chain of instruction part encoders which in turn are responsible for encoding independent instruction parts like prefixes, ModR/M, operands etc. Finally, all encoded parts are assembled into the machine code of the instruction as a whole. The encoding process is repeated for every accepted instruction form. Of course the whole process is a bit more complicated but this description should provide a general view at a high level of abstraction.

This function is thread-safe so it can be safely called simultaneously by multiple threads but you have to be sure that at least the context and result are not shared across threads.

Multi pass assembler

The FCML library supports simple multi-pass load-and-go assembling, which has been designed to assemble multiple lines of source code at once. This is an experimental implementation and it has not been well tested yet, so I would be grateful for any feedback. There are a few unit tests for it. I have also used it for my own purposes, but I definitely have not spent enough time testing and using it to call it a stable and mature implementation.

Like the classic one-line assembler, it needs an initialized dialect instance to work. It is just a wrapper around the one-line assembler, so you have to initialize an assembler instance as well. Let's see the declaration of the function that assembles the source code:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_lag_assemble( 
	fcml_st_lag_assembler_context *context, 
	const fcml_string *source_code, 
	fcml_st_lag_assembler_result *result );

The first thing you have to prepare is an assembler context instance, but in this case it is a dedicated fcml_st_lag_assembler_context structure. It is almost the same structure as in case of the one-line assembler, so there is no need to describe it in every detail. The only difference is a symbol table which can be used to pass additional custom symbols to the assembler and to return symbols allocated by the assembler itself to the user (As you will see it may be really useful in practice.):

typedef struct fcml_st_lag_assembler_context {
    fcml_st_dialect *dialect;
    fcml_st_assembler *assembler;
    fcml_st_assembler_conf configuration;
    fcml_st_entry_point entry_point;
    fcml_st_symbol_table symbol_table;
} fcml_st_lag_assembler_context;

The symbol table is described in the dedicated chapter: Symbols.

The next parameter is source_code which points to an array of strings, where every string represents one instruction to assemble.

In case of the multi-pass assembler it would be quite inconvenient to work with general instruction models, so in this case we need to provide source code using the syntax which is supported by the chosen dialect. It is more natural and supports features like symbols, labels and mathematical operators.

The assembled code is returned in a dedicated structure too:

typedef struct fcml_st_lag_assembler_result {
	fcml_st_ceh_error_container errors;
	fcml_int error_line;
	fcml_st_assembled_instruction *instructions;
} fcml_st_lag_assembler_result;

The first field holds errors, then we have the number of the source code line where the assembler failed and a list of instructions if everything succeeded.

Every assembled instruction is represented by the following structure:

typedef struct fcml_st_lag_assembled_instruction {
    struct fcml_st_assembled_instruction *next;
    fcml_st_ceh_error_container warnings;
    fcml_uint8_t *code;
    fcml_usize code_length;
} fcml_st_lag_assembled_instruction;

The pattern used is similar to the one-line assembler, but in this case every position in the instructions chain represents one line of the assembled source code. There are no alternative encodings as in the case of the one-line assembler. They would be impossible here, because addresses used by the assembled code have to be consistent, and as such they are strictly connected with the size of the code between a branch instruction and the related label. So in order to generate a block of code which could then, for example, be executed, you have to iterate through the whole chain and copy the code of every instruction to the destination code segment.

The result structure has to be prepared in the same way as for the one-line assembler (For more details see: Preparing assembler result). It uses exactly the same model; the only difference is the functions used to prepare and free the result structure. In this case you have to use the function fcml_fn_lag_assembler_result_prepare to prepare the result and fcml_fn_lag_assembler_result_free to free it when it is no longer needed.

Let's take a look at the following piece of code:

fcml_string source_code[] = {
	"start:      mov ebx, 1",
	"loop_big:   inc ebx",
	"            cmp ebx, 10",
	"            je  finish",
	"loop_small: mov eax, 1",
	"increment:  inc eax",
	"            cmp eax, 10",
	"            je  finish_small",
	"            jmp increment",
	"finish_small:",
	"            jmp loop_big",
	"finish:     ret",
	NULL
};

fcml_uint8_t assembled_instructions[] = {
	0xBB, 0x01, 0x00, 0x00, 0x00,
	0x43,
	0x83, 0xFB, 0x0A,
	0x74, 0x0F,
	0xB8, 0x01, 0x00, 0x00, 0x00,
	0x40,
	0x83, 0xF8, 0x0A,
	0x74, 0x02,
	0xEB, 0xF8,
	0xEB, 0xEB,
	0xC3
};

The first array contains the source code being assembled and the second one contains the machine code of the assembled instructions.

Look at the fourth assembled instruction, "je finish". As you can see, it was assembled to the following piece of code: 0x740F. In this case the assembler decided that the jump can be encoded using an 8-bit displacement relative to the instruction pointer. The current implementation is a classic multi-pass assembler, which assembles the code in multiple passes in order to generate the best code possible.

This implementation is based on the same standard parsers as the one-line assembler, so everything they support is available here as well.

As it was mentioned above, there is a symbol table in the assembler context. The symbol table is bidirectional and can be used to pass additional symbols (which are not defined in the code) to the assembler as well as to access symbols defined in the assembled code.

It can be a very useful feature, because after code is compiled you can easily check the address of every label defined in the source code. For instance, if you have assembled the source code available above, you can then check the address of the "finish" label using fcml_fn_symbol_get function (See: Symbols for more information about how symbols can be accessed.).

Like almost every FCML function, the multi-pass load-and-go assembler is thread-safe and can be used across multiple threads as long as every thread uses separate assembler contexts and results.

Although this assembler can be really useful, as opposed to the one-line assembler, it is still an experimental implementation so I cannot wait to have feedback from you. If you are interested in the way it was implemented do not hesitate to check the fcml_lag_assembler.c source file. The source code is well commented, so it should not be a big problem to understand the whole idea behind it. Unit tests are available in "check/internal-tests/lag_assembler_t.c".

The following example assembles a few lines of source code and prints the symbols and the machine code to the output:

#include <stdio.h>
#include <stdlib.h>

#include <fcml/fcml_lag_assembler.h>
#include <fcml/fcml_intel_dialect.h>
#include <fcml/fcml_symbols.h>

fcml_string source_code[] = {
	"_start: push ebp",
	"    mov  ebp,esp",
	"    mov  eax,dword ss:[ebp+8]",
	"    cmp  eax, 1",
	"    je   _ignore_call",
	"    call sys_function",
	"_ignore_call:",
	"    mov  esp,ebp",
	"    pop  ebp",
	"    ret",
	NULL
};

void print_symbol( fcml_st_symbol_table symbol_table, fcml_string symbol_name ) {
	fcml_st_symbol *symbol = fcml_fn_symbol_get( symbol_table, symbol_name );
	if( symbol ) {
		printf("  %s: 0x%llx\n", symbol_name, (unsigned long long)symbol->value );
	} else {
		printf("  %s: Not found\n", symbol_name );
	}
}

int main(int argc, char **argv) {

	fcml_ceh_error error;

	fcml_st_dialect *dialect;

	error = fcml_fn_dialect_init_intel( FCML_INTEL_DIALECT_CF_DEFAULT, &dialect );
	if( error ) {
		fprintf( stderr, "Can not initialize the Intel dialect: %d", error );
		exit(1);
	}

	fcml_st_assembler *assembler;

	error = fcml_fn_assembler_init( dialect, &assembler );
	if( error ) {
		fprintf( stderr, "Can not initialize the assembler: %d", error );
		fcml_fn_dialect_free( dialect );
		exit(1);
	}

	fcml_st_symbol_table symbol_table = fcml_fn_symbol_table_alloc();
	if( !symbol_table ) {
		fprintf( stderr, "Can not allocate the symbol table: %d", error );
		fcml_fn_assembler_free( assembler );
		fcml_fn_dialect_free( dialect );
		exit(1);
	}

	error = fcml_fn_symbol_add_raw( symbol_table, "sys_function", 0x00405000 );
	if( error ) {
		fprintf( stderr, "Can not add a symbol to the symbol table: %d", error );
		fcml_fn_symbol_table_free( symbol_table );
		fcml_fn_assembler_free( assembler );
		fcml_fn_dialect_free( dialect );
		exit(1);
	}

	fcml_st_lag_assembler_result result;
	fcml_fn_lag_assembler_result_prepare( &result );

	fcml_st_lag_assembler_context context = {0};
	context.assembler = assembler;
	context.configuration.enable_error_messages = FCML_TRUE;
	context.entry_point.op_mode = FCML_OM_32_BIT;
	context.entry_point.ip = 0x00401000;
	context.symbol_table = symbol_table;

	error = fcml_fn_lag_assemble( &context, source_code, &result );
	if( !error ) {

		printf("Assembled code:\n");

		fcml_st_assembled_instruction *instruction = result.instructions;
		while( instruction ) {
			int i;
			printf("  ");
			for( i = 0; i < instruction->code_length; i++ ) {
				printf("%02x", instruction->code[i]);
			}
			printf("\n");
			instruction = instruction->next;
		}

		printf("Symbols:\n");

		print_symbol( symbol_table, "_start" );
		print_symbol( symbol_table, "_ignore_call" );
		print_symbol( symbol_table, "sys_function" );

		printf("\n");

	} else {
		fprintf( stderr, "Cannot assemble the source code: %d, line: %d, Message: %s\n", error, result.error_line, result.errors.errors ? result.errors.errors->message : "None." );
		fcml_fn_symbol_table_free( symbol_table );
		fcml_fn_assembler_free( assembler );
		fcml_fn_dialect_free( dialect );
		exit(1);
	}

	fcml_fn_lag_assembler_result_free( &result );

	fcml_fn_symbol_table_free( symbol_table );

	fcml_fn_assembler_free( assembler );

	fcml_fn_dialect_free( dialect );

	return 0;

}

The expected output:

Assembled code:
  55
  8bec
  8b4508
  83f801
  7405
  e8f03f0000
  8be5
  5d
  c3
Symbols:
  _start: 0x401000
  _ignore_call: 0x401010
  sys_function: 0x405000

Parser

The following chapters describe all important details of the parsing process.

Initialization and parsing

The possibility of assembling the general instruction model is without a doubt a very useful feature, especially in more advanced techniques, where we need the full control over dynamically generated instructions or when we modify disassembled instructions on the way and assemble them back.

But what if we work with textual forms of instructions, how to convert them to general instruction models?

An instruction parser is the answer. It is the FCML component responsible for converting one textual instruction into its GIM representation.

Parsers, like other FCML components, support different dialects, so we can parse both the Intel and AT&T syntax.

First of all let's take a look at the function declaration:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_parse( 
	fcml_st_parser_context *context, 
	fcml_string instruction, 
	fcml_st_parser_result *result );

It should look very familiar because it follows the same convention as the assembler and disassembler. It takes three parameters: a parser context, the instruction we would like to parse, and a structure which receives the result. As usual, let's start with the context:

typedef struct fcml_st_parser_context {
	fcml_st_dialect *dialect;
	fcml_st_parser_config config;
	fcml_ip ip;
	fcml_st_symbol_table symbol_table;
} fcml_st_parser_context;

The first field is of course a dialect instance used to support the syntax we would like to parse. It should be the same dialect instance as the one used by the assembler then (Remember that GIM is syntax-dependent). The next field is a configuration that also is not a surprise when we are familiar with the other FCML components.

So let's take a look at it:

typedef struct fcml_st_parser_config {
	fcml_bool ignore_undefined_symbols;
	fcml_bool disable_symbols_declaration;
	fcml_bool override_labels;
	fcml_bool alloc_symbol_table_if_needed;
	fcml_bool enable_error_messages;
} fcml_st_parser_config;

In this case, the configuration is not very sophisticated. The first field can be set to FCML_TRUE in order to force the parser to ignore all undefined symbols, or to FCML_FALSE if the parser should fail when such a symbol is found. If undefined symbols are allowed, they are converted to 0. This functionality can be useful if we would like, for example, to implement a multi-pass assembler (To be honest, it was designed for, and is currently used by, the multi-pass load-and-go assembler available in the FCML library).

The second field disable_symbols_declaration can be used in order to disable support for labels. If it is set to FCML_TRUE every label declaration will cause an error to occur (FCML_CEH_GEC_UNSUPPORTED_LABEL_DECLARATION). It can be very useful when you do not need symbols at all and would not like to care about the symbol table in the context which can be allocated by the parser and has to be manually freed in such a case.

The third field override_labels should be set to FCML_TRUE if labels defined in the code are allowed to override existing symbols that share the same name. For example, let's imagine that there is a global utility function with a strange name "printf". After passing it to the multi-pass assembler we could easily call this utility function using branch instructions, for example: "call printf". What if we would like to override it and define our own implementation of the function? It would be possible as long as override_labels was set to FCML_TRUE, because the first occurrence of the local "printf" label would override the one defined by default. Of course it is only a theory and in practice it is disabled in the FCML multi-pass assembler :)

The fourth field alloc_symbol_table_if_needed is a bit dangerous and you should understand it well before using it. By default the instruction parser ignores all symbol declarations if there is no symbol table provided in the parser context. By setting this value to FCML_TRUE you can force the parser to allocate a new symbol table when needed. Remember that you are then responsible for freeing it, so this functionality can be a bit tricky, because you have to check for the existence of the symbol table every time it should be deallocated.

The last field enable_error_messages enables textual error messages for the parser.

Let's get back to the parser context. The third field ip holds an instruction pointer (the address where the instruction is located in the code segment). It is used as the value of symbols that are defined by label declarations. So if your IP is 0x401000 and the instruction being parsed is "loop: add eax, 1", the new symbol loop will be defined with the value set to 0x401000.

The last but not least field is the symbol table itself. It is a very important structure because using it we are able to exchange symbols with the parser.

Let's imagine the following example:

You are the author of the best memory monitor in the world. Of course, as an advanced programmer you remember how hard it was to find symbols exported by modules loaded inside a monitored executable using all these simple and unsophisticated competitive products ;). Armed with this experience, you decided to load all symbols exported by the loaded modules and provide them as symbols to your one-line assembler. Since then, all your users have been able to call system functions as if they were defined directly in their code, for instance by typing the following code: "call CreateWindow" (CreateWindow is a Windows API function used to create a new window). Of course you can also define some standard constants like MAX_PATH and use them in mathematical computations: "mov eax, MAX_PATH+1". I do not need to say how useful it can be.

If you do not need symbols, just leave the symbol table unset (i.e. NULL) and set the configuration field disable_symbols_declaration to FCML_TRUE.

Let's get back to the function parameters.

The second parameter is instruction, which points to the textual representation of the instruction to be parsed. There is not much to explain here. It is just a plain character string. All one-byte encodings as well as UTF-8 are supported, but you cannot use UNICODE here (Currently the library cannot be compiled with UNICODE support under Windows, so you need to convert your UNICODE strings to UTF-8 on your own).

The last parameter result is a container for the function result. Surely it has to be carefully prepared before usage. It follows the same pattern as the other FCML components so first of all let's take a look at the following chapter if you are not familiar with the pattern yet: Preparing assembler result. In this case the following functions have to be used in order to prepare and free the result container: fcml_fn_parser_result_prepare, fcml_fn_parser_result_free.

This is the structure itself:

typedef struct fcml_st_parser_result {
	fcml_st_ceh_error_container errors;
	fcml_st_symbol *symbol;
	fcml_st_instruction *instruction;
} fcml_st_parser_result;

The first field errors contains information about the reason of the failure.

The field symbol contains a symbol defined by a label, if there was any. We have to spend a little more time here. You have to pay special attention when working with symbols, because even if you did not provide a symbol table inside the context, if there is a label defined by the parsed instruction, the symbol table will be allocated for the new symbol and returned in the context (Only if the alloc_symbol_table_if_needed configuration field is set to FCML_TRUE; otherwise symbols are just ignored). In such a case you are responsible for freeing it. Remember that it is the parser context that is responsible for freeing the allocated symbols (Not literally of course.), because it holds the symbol table that owns them. You should not rely on fcml_fn_parser_result_free, because it frees everything but the symbol returned in the symbol field. In other words, in order to free a symbol returned by the parser, free the symbol table from the parser context.

Once all parameters are prepared, we can then call the fcml_fn_parse function as follows:

error = fcml_fn_parse( &parser_context, instruction, &parser_result );
if( error ) {
	...
}

As you can see, the pattern is exactly the same as in the case of the other main FCML functions.

At this point we have to stop for a moment and take a look at the subject of structure ownership. Take into consideration the fact that the structure fcml_st_instruction located in the fcml_st_parser_result structure is allocated by the parser. It is very important to be aware that it is the parser that is responsible for freeing it. This can be a bit problematic in cases where you have to parse something but the interpretation is deferred, because until then you cannot free the whole parser result, which occupies memory unnecessarily.

Fortunately, there is a solution to this problem. It is the function called fcml_fn_cu_clone_instruction available in the header file fcml_common_utils.h. This function can be used in order to clone the instruction model prepared by the parser. After the instruction is cloned you are able to free the whole result without any negative consequences. Of course you are responsible for the cloned instance and have to free it using the next fcml_fn_cu_free_instruction function as soon as it is no longer needed.

Never try to free any FCML structures manually using the plain "free" function, because such structures may be the roots of bigger trees with nested structures that also have to be properly freed. There is one more argument for not freeing them on your own (Because in certain circumstances it is possible anyway.). FCML is quite a flexible library and it even lets the user replace the functions used for memory allocation and deallocation with their own implementations. Now try to imagine the consequences when you use the standard "free" function in order to deallocate a piece of memory which was allocated from a completely different dedicated memory heap. Statically linked libraries can also use different heaps than the executable that uses them, just by default.

How parser works

Although there is only one function used to parse instructions, which uses dialects in order to distinguish between supported syntaxes, every dialect has its own dedicated implementation of the parser. It is a very natural solution because parsers tend to be very complicated and they are very difficult, if not impossible, to implement in a generic way. The main problem here is the grammar itself. It can be compared to the problem with natural languages, where grammars differ on so many levels that it is just impossible to implement one generic tool which would be able to parse sentences in all languages using the same algorithm and set of rules. Every generic solution would need to impose a general set of restrictions on future dialects it would be able to support. I am pretty sure you will agree with me that it is not the best idea to restrict innovation in any way ;)

So FCML currently supports the Intel and AT&T syntax. Every syntax is supported by one dedicated parser. Every parser implements a different set of rules in order to support the given syntax, but they still share some common utility code. A good example is the set of functions used to build and interpret the abstract syntax tree built while an instruction is parsed. That is why some rules are still common to all available parsers.

FCML parsers are built using GNU Bison parser generator and Flex scanner (You do not need them installed to compile the project, as long as you do not change the grammar files).

Common rules

The following chapter covers areas where parsers share the same set of rules which can be described in one place.

Representation of numeric values

Numeric values can be represented in many different ways at the level of a particular syntax, but they have something in common. After they are properly parsed, they are represented identically in the abstract syntax tree. They are also computed using exactly the same set of rules, no matter which parser was used to parse the source expression (See: Expressions handling).

Every integer value is always parsed to a 64-bit unsigned value, because we never know how the assembler will use the final result. It can use it as an 8-bit immediate value, but it may just as well use it as a 64-bit immediate value for the MOV instruction. Remember that every parser works with generic instruction models without any knowledge about the context where the models will be used further (They are so-called context-less parsers).

Even if an integer value is written with the sign, for instance -10, it is still parsed to the positive value and then converted to -10 using the unary minus operator. It has to be taken into account especially in case of integer literals written as hexadecimal values. For instance let's take a look at the following example:

50 - 5

This example is very easy and there is nothing special to be done in the background. The integer literals 50 and 5 are parsed to 64-bit unsigned integer values. At this moment they are still treated as unsigned integers. Then they take part in the expression where the second value (5) is subtracted from the first one (50). Every integer value which is a part of a mathematical expression is first converted to a signed value. Then the computation is made using the two signed values, so the result is always a signed value as well.

The rule is simple. As long as a value does not take part in an expression, it is an unsigned value. Every result of an expression is always a signed value. Notice that the rule is still valid for the unary minus operator, because it is also an expression.

The problems start when you work with hexadecimal literals:

10 + 0xFF

Notice that the final result of this computation strictly depends on the size of the values that take part in the expression. If they were 8-bit values the result would be 9, because the second value would be treated as -1; if the values were, for instance, 16 bits in length, the result would be 265. So how will such an expression be evaluated by FCML parsers?

The first rule tells us that every integer literal is parsed to an unsigned 64-bit value at first. So the first value is parsed to 0x000000000000000A and the second one is parsed to 0x00000000000000FF. Take into consideration that they are still unsigned, because the expression is evaluated later. So we have something like this:

((signed)((unsigned)0x000000000000000A)) + ((signed)((unsigned)0x00000000000000FF))

The values themselves are unsigned but when the expression is about to be evaluated, they are cast to signed integers and then the computation takes place. So the result of the computation is (signed) 0x0000000000000109.

You may be under the impression that it is not really important whether the result is signed or not here, because the internal representation of the value is still the same, but there is one important difference. From time to time a value has to be extended (Yes extended, everything is explained two paragraphs below) to match the expected size of an immediate operand, for example. If the value is unsigned, then it is extended just by padding the value with zeros, so for instance an unsigned 8-bit integer 0xFF is extended to 0x00FF when a 16-bit value is expected. But in the case of a signed 8-bit 0xFF value (i.e. -1) the higher bits are filled with 1's, so the result would be 0xFFFF. As you can see, in certain circumstances you may get two completely different results.

Hence all you need to do is to remember the main general rule: Every integer value is parsed to a 64-bit unsigned value first, and then it can be eventually converted to the signed 64-bit value when it is a part of an expression.

Every parsed integer has 64 bits, right? So is it always returned as the 64-bit integer value in the generic instruction model? The answer is no...

Before a value is placed inside a GIM, the parser tries to convert it to the smallest value possible. So if the result of an expression is signed 64-bit value: -15 it will be converted to the 8-bit signed immediate value for the use of the GIM. Take into account that even if the assembler still expects a 64-bit value, it can be then extended with the sign to match the expected size. For instance:

The value -15 of the following instruction: "add eax, -15" is parsed to the following immediate operand ( FCML_EOT_IMMEDIATE):

fcml_st_operand operand = {0};
...
operand.immediate.size = FCML_DS_8;
operand.immediate.is_signed = FCML_TRUE;
operand.immediate.int8 = 0xF1;
...

Then the assembler interprets this operand as a 32-bit immediate value, so it has to extend the value from the GIM to the expected size. The operand is signed, so it is sign-extended to 32 bits: 0xFFFFFFF1.
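The two steps described above (choosing the smallest signed size and extending the stored pattern later) can be sketched as follows. The function names are hypothetical, and the code only mimics the behaviour described here; it is not taken from the parser or the assembler.

```c
#include <stdint.h>

/* Returns the smallest signed width (8, 16, 32 or 64 bits)
 * able to represent the given signed 64-bit value. */
int smallest_signed_bits(int64_t value) {
    if (value >= INT8_MIN && value <= INT8_MAX) return 8;
    if (value >= INT16_MIN && value <= INT16_MAX) return 16;
    if (value >= INT32_MIN && value <= INT32_MAX) return 32;
    return 64;
}

/* Sign-extends an 8-bit pattern to 32 bits. */
uint32_t sign_extend_8_to_32(uint8_t bits) {
    return (bits & 0x80u) ? (0xFFFFFF00u | bits) : bits;
}
```

For -15 the first function reports 8 bits, and extending the stored pattern 0xF1 gives 0xFFFFFFF1.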

If you want to be on the safe side, just remember the second rule: if you use a negative integer written as a hex literal in an expression, always use the unary minus operator. So instead of writing 0xFFFFFFF1 use -0x0000000F or -0xF.

All this theory might be a bit complicated, so take a look at the following examples:

Example expressions
Expression Explanation
0xFFFFFFFF + 0xFFFFFFFF

The literals are converted to two 64-bit unsigned 0x00000000FFFFFFFF values, which are then added to each other, resulting in the 64-bit signed value 0x00000001FFFFFFFE. The result is then returned as a 64-bit immediate value in the GIM, because only a 64-bit value is able to represent such a big number.

0xFFFFFFFF

Let us suppose that in this case an instruction needs a 32-bit signed integer value as an operand. The parser knows nothing about the context, so it will parse the literal to a 64-bit unsigned integer. The value does not take part in any expression, so it remains unsigned. Then it is converted to the smallest possible unsigned immediate integer. The smallest unsigned integer variable able to hold this value has to be 32 bits long, so inside the GIM it will be represented as an unsigned 32-bit integer variable set to 0xFFFFFFFF. Now let's get back to the instruction. It needs a 32-bit signed integer as the operand value. Notice that the size matches in this case (this is a really important fact!), so the value does not have to be extended in any way. In such a case a simple cast is made, so as a result the operand will also be set to the 32-bit integer value 0xFFFFFFFF, but this time it will be interpreted as -1. Keep in mind that it would not work if you typed 0xFFFF, because that unsigned 16-bit value would ultimately be zero-extended to 0x0000FFFF.

There is only one rule you need to follow when you use hexadecimal literals which are not used in expressions: if the value has to be interpreted as a negative one, write it using exactly as many digits as the given context requires.

100 - -0x0F

100 is parsed to the 64-bit value 0x0000000000000064. Then 0x0F is parsed to an unsigned 0x000000000000000F. The unary minus has higher precedence than the binary "-" (subtraction), so the second operand is first changed to a signed 0xFFFFFFFFFFFFFFF1. Then the first operand is changed to a signed value and the second operand is subtracted from it. Written in decimal, the expression is 100 + 15. Hence the result is 0x0000000000000073. The immediate operand returned inside the GIM will be an 8-bit signed 0x73 value.
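The arithmetic from the examples above can be checked with a few lines of C. This is just a sketch of the described rules with hypothetical helper names; the literals are passed in as unsigned 64-bit values and cast to signed ones only when they take part in an expression.

```c
#include <stdint.h>

/* a + b, where both literals were parsed as unsigned 64-bit values. */
int64_t add_literals(uint64_t a, uint64_t b) {
    return (int64_t)a + (int64_t)b;
}

/* a - -b: the unary minus is applied to b before the subtraction. */
int64_t subtract_negated(uint64_t a, uint64_t b) {
    return (int64_t)a - (-(int64_t)b);
}

/* Reinterprets a 32-bit pattern as a signed value (a plain cast,
 * no extension takes place because the sizes already match). */
int64_t as_signed_32(uint32_t bits) {
    return (bits & 0x80000000u) ? (int64_t)bits - INT64_C(0x100000000)
                                : (int64_t)bits;
}
```

Here add_literals(0xFFFFFFFF, 0xFFFFFFFF) gives 0x00000001FFFFFFFE, subtract_negated(100, 0x0F) gives 0x73, and as_signed_32 turns 0xFFFFFFFF into -1 while 0x0000FFFF stays positive.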

Expressions handling

An expression is a combination of symbols like numeric values, constants etc. formatted according to some general rules and grouped by using operators and brackets. For instance this is an example of an expression which can be appropriately parsed by the Intel parser as well as by AT&T:

( ( start - end ) * 2 ) + 1

It follows the general rules for mathematical expressions, which are also supported by almost every programming language, so this is not the place to describe such basics.

The one important thing here is a list of the operators currently supported by parsers (They are ordered by precedence):

Supported operators
Operator Name Description
+

Addition

Adds two operands to each other.

-

Subtraction

Subtracts the second operand from the first one.

*

Multiplication

Multiplies one operand by another.

/

Division

Divides the first operand by the second one.

-

Unary minus

Additive inverse.
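To make the precedence of these operators concrete, here is a small recursive-descent evaluator. It is only a sketch under a few assumptions: all names are hypothetical, symbols are not supported, literals are read with strtoull (so a leading 0 without "x" would be taken as octal), and it is not related to the Bison grammars used by the library.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Cursor over the expression text. */
typedef struct {
    const char *pos;
} expr_cursor;

static int64_t parse_sum(expr_cursor *c);

static void skip_spaces(expr_cursor *c) {
    while (isspace((unsigned char)*c->pos)) c->pos++;
}

/* primary := number | '(' sum ')' | '-' primary */
static int64_t parse_primary(expr_cursor *c) {
    char *end;
    uint64_t literal;
    skip_spaces(c);
    if (*c->pos == '-') {              /* unary minus binds tightest */
        c->pos++;
        return -parse_primary(c);
    }
    if (*c->pos == '(') {
        int64_t value;
        c->pos++;
        value = parse_sum(c);
        skip_spaces(c);
        if (*c->pos == ')') c->pos++;
        return value;
    }
    /* Literals are read as unsigned values and cast to signed ones. */
    literal = strtoull(c->pos, &end, 0);
    c->pos = end;
    return (int64_t)literal;
}

/* product := primary (('*' | '/') primary)* */
static int64_t parse_product(expr_cursor *c) {
    int64_t value = parse_primary(c);
    for (;;) {
        char op;
        int64_t rhs;
        skip_spaces(c);
        op = *c->pos;
        if (op != '*' && op != '/') return value;
        c->pos++;
        rhs = parse_primary(c);
        if (op == '*') value *= rhs;
        else if (rhs != 0) value /= rhs;  /* division by zero is skipped */
    }
}

/* sum := product (('+' | '-') product)* */
static int64_t parse_sum(expr_cursor *c) {
    int64_t value = parse_product(c);
    for (;;) {
        char op;
        int64_t rhs;
        skip_spaces(c);
        op = *c->pos;
        if (op != '+' && op != '-') return value;
        c->pos++;
        rhs = parse_product(c);
        value = (op == '+') ? value + rhs : value - rhs;
    }
}

/* Evaluates an expression built from literals, brackets and operators. */
int64_t evaluate_expression(const char *text) {
    expr_cursor c = { text };
    return parse_sum(&c);
}
```

Each grammar level calls the next tighter one, which is why multiplication and division bind more strongly than addition and subtraction, and unary minus more strongly than all of them.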

Intel parser

Parsers that follow the Intel syntax are based on the syntax used in the Intel manuals for the x86_64 architecture. The following chapters describe some specific aspects of the Intel syntax (from the point of view of the FCML Intel dialect). If you are interested in how the Intel parser works, do not hesitate to take a look at the following source files:

Rules for Flex scanner generator: ${dist}/src/fcml_intel_lexer.l
BNF grammar for Bison parser generator: ${dist}/src/fcml_intel_parser_def.y

Remember that it is out of the scope of this manual to describe every aspect of the Intel syntax in general. My objective is just to describe everything that is often implemented in different ways across various assembler/disassembler implementations.

Numeric values

Decimal literals are written in the standard mathematical way using the digits 0-9 (Regular expression: [0-9]+d?). For example: 12, 1 or 112. You can add the "d" suffix, but it is optional, so 12 and 12d are the same value. There is nothing special about it, so let us take a look at hexadecimal values.

There are two ways hexadecimal literals may be written. The first one uses a convention popularized by C-like programming languages with the prefix 0x, for example 0xFF or 0x12. The second convention is used mostly by Intel assemblers and uses the "h" suffix, for instance 12Fh or 0FFh. As you might have noticed, there are values that would start with a letter, for example FF. In order to distinguish them from named symbols, prefix them with a leading 0, for example 0FFh.
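The three literal forms can be recognized with a few string checks. The following sketch is an assumption-laden illustration (hypothetical names, suffixes accepted in both cases, no overflow checks) and not the Flex scanner used by the library.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int ok;          /* non-zero when the text is a valid literal */
    uint64_t value;  /* parsed value, meaningful only when ok is set */
} literal_result;

/* Accepts: 12, 12d (decimal), 0xFF (C style), 0FFh (Intel style). */
literal_result parse_intel_literal(const char *text) {
    literal_result result = { 0, 0 };
    size_t len = strlen(text);
    char *end;
    if (len == 0 || text[0] < '0' || text[0] > '9') {
        return result;  /* a literal has to start with a digit */
    }
    if (len > 2 && text[0] == '0' && (text[1] == 'x' || text[1] == 'X')) {
        result.value = strtoull(text + 2, &end, 16);
        result.ok = (*end == '\0');
    } else if (text[len - 1] == 'h' || text[len - 1] == 'H') {
        result.value = strtoull(text, &end, 16);
        result.ok = (end == text + len - 1);
    } else {
        result.value = strtoull(text, &end, 10);
        if (*end == 'd' || *end == 'D') end++;  /* optional suffix */
        result.ok = (*end == '\0');
    }
    return result;
}
```

Note how "FFh" is rejected because it starts with a letter, while "0FFh" is accepted, which mirrors the leading-zero rule.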

Float values are currently not supported, because they are not used directly in instructions, and pseudo operations other than "db", which is used for data allocation, are not supported by FCML yet.

Registers

The following table shows register symbols used by the Intel parser:

Registers supported by the Intel parser
Type Registers
8-bit general purpose registers

al, cl, dl, bl, ah, ch, dh, bh, r8l, r9l, r10l, r11l, r12l, r13l, r14l, r15l, spl, bpl, sil, dil.

16-bit general purpose registers

ax, cx, dx, bx, sp, bp, si, di, r8w, r9w, r10w, r11w, r12w, r13w, r14w, r15w

32-bit general purpose registers

eax, ecx, edx, ebx, esp, ebp, esi, edi, r8d, r9d, r10d, r11d, r12d, r13d, r14d, r15d

64-bit general purpose registers

rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi, r8, r9, r10, r11, r12, r13, r14, r15

SIMD(64) – MMX

mm0, mm1, mm2, mm3, mm4, mm5, mm6, mm7

SIMD(128) – XMM

xmm0, xmm1, xmm2, xmm3, xmm4, xmm5, xmm6, xmm7, xmm8, xmm9, xmm10, xmm11, xmm12, xmm13, xmm14, xmm15

SIMD(256) – YMM

ymm0, ymm1, ymm2, ymm3, ymm4, ymm5, ymm6, ymm7, ymm8, ymm9, ymm10, ymm11, ymm12, ymm13, ymm14, ymm15

FPU

st(0), st(1), st(2), st(3), st(4), st(5), st(6), st(7)
OR
st0, st1, st2, st3, st4, st5, st6, st7

Control registers

cr0, cr2, cr3, cr4, cr8

Debug registers

dr0, dr1, dr2, dr3, dr4, dr5, dr6, dr7

Instruction pointer register

RIP (Used only with RIP addressing.)

Size operators

Here is a list of size operators used to specify the size of memory data accessed by an instruction. If there are any alternatives they are separated by commas:

Size operators supported by the Intel parser
Operators Data size
byte, byte ptr 8 bits
word, word ptr 16 bits
dword, dword ptr 32 bits
fword ptr, pword ptr, fword, pword 48 bits
qword, qword ptr 64 bits
tbyte, tword, tbyte ptr, tword ptr 80 bits
dqword, oword, dqword ptr, oword ptr 128 bits
qqword, qqword ptr 256 bits
mmword, mmword ptr Multimedia 64 bits
xword, xmmword, xword ptr, xmmword ptr Multimedia 128 bits
yword, ymmword, yword ptr, ymmword ptr Multimedia 256 bits
14byte, 14byte ptr 14 bytes
28byte, 28byte ptr 28 bytes
94byte, 94byte ptr 94 bytes
108byte, 108byte ptr 108 bytes

All multimedia size operators additionally set FCML_OP_HINT_MULTIMEDIA_INSTRUCTION hint for the memory addressing operand. It is useful in one specific case:

In general, the multimedia size operators can be used interchangeably with the standard ones, because they share the same sizes. They are used mainly as discriminators that hint to the assembler that we are using SIMD instructions. For instance, "mmword ptr" can be changed to "qword ptr" and everything will work like a charm, but there is one case where it really does matter which one we have chosen.

There are instructions that are ambiguous, for example:

0F 6F /r MOVQ mm, mm/m64 MMX move quadword from mm/m64 to mm.
REX.W 0F 6E /r MOVQ mm, r/m64 MMX move quadword from r/m64 to mm.

In general they do exactly the same thing, but if you look closely you will see that they differ in the way they access registers when two register operands are used. The first one expects a MMX register while the second one expects a 64-bit general purpose register. Therefore in case of two registers everything is okay and there is no ambiguity here. But what if the second operand accesses the memory?

In such a case we have a little problem, because both of the instruction forms above can be used to encode the following instruction:

movq mm0,[rax] 0x0f, 0x6f, 0x00
0x48, 0x0f, 0x6e, 0x00

If the default instruction chooser is used the first instruction form will always be chosen just because it is the shorter one. But what if we would like to choose one of them explicitly?

The size operators are the answer here. Even though data size operators are interchangeable (multimedia with classic) for instructions which are not ambiguous, choosing a particular one helps in cases like this:

movq mm0, mmword ptr[rax] 0x0f, 0x6f, 0x00
movq mm0, qword ptr [rax] 0x48, 0x0f, 0x6e, 0x00

By using the specific data size operator you can choose between two instruction encoding forms.

This feature is available only in the case of the Intel dialect. The AT&T dialect does not support it.
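The selection described above can be modelled as a simple choice between two byte sequences. This is a toy model only: the type and function names are hypothetical, the multimedia_hint argument merely stands for the effect of FCML_OP_HINT_MULTIMEDIA_INSTRUCTION, and the real assembler obviously does much more than pick from a table.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t length;
    uint8_t bytes[4];
} movq_encoding;

/* Returns the encoding of "movq mm0, [rax]" for a memory operand
 * written with "mmword ptr" (multimedia_hint != 0) or "qword ptr". */
const movq_encoding *select_movq_encoding(int multimedia_hint) {
    /* 0F 6F /r: MOVQ mm, mm/m64 */
    static const movq_encoding mmx_form = { 3, { 0x0F, 0x6F, 0x00 } };
    /* REX.W 0F 6E /r: MOVQ mm, r/m64 */
    static const movq_encoding gpr_form = { 4, { 0x48, 0x0F, 0x6E, 0x00 } };
    return multimedia_hint ? &mmx_form : &gpr_form;
}
```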

Prefixes

The following explicit prefixes are supported by the Intel syntax:

Explicit prefixes
Prefix Value
lock 0xF0
repne/repnz 0xF2
repe/repz/rep 0xF3
xacquire 0xF2
xrelease 0xF3
branch 0x3E
nobranch 0x2E

If you do not know any of these, do not hesitate to look at the Intel or AMD architecture manual.
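One detail of the table worth noticing is that xacquire and xrelease reuse the byte values of the repne and rep prefixes. A small lookup table makes this visible. The sketch below uses hypothetical names, is not part of the FCML API, and leaves the branch hint keywords out to stay short.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    const char *keyword;
    uint8_t byte;
} prefix_entry;

static const prefix_entry PREFIXES[] = {
    { "lock",     0xF0 },
    { "repne",    0xF2 }, { "repnz", 0xF2 },
    { "repe",     0xF3 }, { "repz",  0xF3 }, { "rep", 0xF3 },
    { "xacquire", 0xF2 },
    { "xrelease", 0xF3 },
};

/* Returns the prefix byte for a keyword, or -1 if it is not listed. */
int prefix_byte(const char *keyword) {
    size_t i;
    for (i = 0; i < sizeof PREFIXES / sizeof PREFIXES[0]; i++) {
        if (strcmp(PREFIXES[i].keyword, keyword) == 0) {
            return PREFIXES[i].byte;
        }
    }
    return -1;
}
```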

Hints

In case of the Intel dialect more than one hint can be used for an instruction. For example this instruction is perfectly valid: "jmp indirect near dword ptr [eax]".

The following hints are used by the Intel syntax:

Hint keywords
Hint Example Description
far call far 6655h:44332211h

A far jump to the address located in a different code segment than the current one.

near call near dword ptr [edi+00000001h]

A near jump inside the current code segment.

long_form vmaxsd long_form xmm3,xmm7,xmm0

Encodes the instruction using the three-byte VEX prefix.

indirect jmp indirect dword ptr [eax]

An offset is specified indirectly by an effective address or general purpose register.

rel rcl byte ptr [rel 0000800000401007h],03h

Encodes given address using the RIP addressing. Useful in the 64-bit mode only.

abs rcl byte ptr [abs 0000000000401007h],03h

Encodes the given address as an absolute one. Take into account that the address here (0000000000401007h) differs from the address used with the "rel" hint. This is because absolute addressing allows you to address only 4GB of memory.

sib

32-bit mode:
rcl byte ptr [sib 00401007h],03h

64-bit mode:
rcl byte ptr [sib 00401007h],03h

Forces SIB based encoding to be used.

In the first example SIB is not necessary to encode this instruction, so it encodes to: 0xc0150710400003 (Only the ModR/M field and the displacement.). But when SIB is forced, the same instruction encodes to: 0xc014250710400003 (ModR/M, SIB and displacement). Remember that not every addressing mode can be encoded with the SIB byte.

The second example is an interesting combination. The RIP addressing should be used here by default, but there is the SIB hint defined. Hints have higher precedence so it implicitly forces absolute addressing to be used (Because absolute offset is encoded using the SIB byte). In this case the SIB hint works like an ABS one. However it does not mean you should use it this way, it is just a side effect of the way x86_64 architecture encodes absolute addresses :)
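The two encodings quoted above differ in the ModR/M byte (0x15 versus 0x14) and in the presence of the SIB byte (0x25). Splitting these bytes into their bit fields shows why. The helpers below are a hypothetical sketch based on the field layout from the Intel and AMD manuals; they are not FCML functions.

```c
#include <stdint.h>

typedef struct {
    uint8_t mod;  /* bits 7-6 */
    uint8_t reg;  /* bits 5-3 */
    uint8_t rm;   /* bits 2-0 */
} modrm_fields;

typedef struct {
    uint8_t scale;  /* bits 7-6 */
    uint8_t index;  /* bits 5-3 */
    uint8_t base;   /* bits 2-0 */
} sib_fields;

modrm_fields split_modrm(uint8_t byte) {
    modrm_fields f;
    f.mod = (byte >> 6) & 3;
    f.reg = (byte >> 3) & 7;
    f.rm = byte & 7;
    return f;
}

sib_fields split_sib(uint8_t byte) {
    sib_fields f;
    f.scale = (byte >> 6) & 3;
    f.index = (byte >> 3) & 7;
    f.base = byte & 7;
    return f;
}

/* With 32-bit or 64-bit addressing, r/m = 100b together with
 * mod != 11b means that a SIB byte follows the ModR/M byte. */
int has_sib(uint8_t modrm) {
    modrm_fields f = split_modrm(modrm);
    return f.mod != 3 && f.rm == 4;
}
```

For 0x15 the r/m field is 101b (displacement only), whereas 0x14 has r/m set to 100b, so the SIB byte 0x25 follows with no index (100b) and base 101b, which again selects a bare 32-bit displacement.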

AT&T parser

Parsers that follow the AT&T syntax are based on the syntax which was created at AT&T Bell Labs and is mainly used in UNIX-like environments. The following chapters describe some specific aspects of the AT&T syntax. If you are interested in how the AT&T parser works, do not hesitate to take a look at the following source files:

Rules for Flex scanner generator: ${dist}/src/fcml_gas_lexer.l
BNF grammar for Bison parser generator: ${dist}/src/fcml_gas_parser_def.y

You may wonder why the term GAS is used to describe the AT&T syntax. The answer is simple: AT&T cannot be used as a valid C symbol. Since GAS (the GNU assembler) is in fact the reference implementation of the syntax now, it is justified to call it GAS syntax as well, all the more so because the main goal of the dialect is to be compatible with the GNU assembler.

Numeric values

Decimal literals are written in the standard way using digits from 0-9 (Regular expression: ([1-9][0-9]*|0+)). For example: 12, 1 or 112.

Hexadecimal literals use a convention popularized by C-like programming languages with the prefix 0x, for example 0xFF or 0x12 (Regular expression: 0x[0-9a-f]+).

Float values are currently not supported, because they are not used directly with instructions, and pseudo operations other than ".byte", which is used for data allocation, are not supported by FCML yet.

Registers

All used registers are the same as in case of the Intel syntax (see: Registers) but they are preceded by "%" character.

Size operators

The AT&T syntax does not use size operators. This information is encoded in the instruction mnemonic as appropriate suffixes. FCML is GAS compatible in this case.

Prefixes

The AT&T dialect supports the same set of prefixes as the Intel syntax (see: Prefixes).

Hints

There are only two hints currently supported by the GAS dialect. Some instruction mnemonics like "lcall" add instruction-level hints such as FCML_HINT_FAR_POINTER and FCML_HINT_INDIRECT_POINTER. The FCML_HINT_INDIRECT_POINTER hint can also be specified explicitly using the '*' (asterisk) indirect operator, for example: jmpq *(%rax)

Disassembler

The FCML disassembler decodes machine code and converts it to the generic instruction model structure. Besides the GIM, it also returns some additional detailed information about the instruction in other dedicated structures. The FCML disassembler is able to disassemble one instruction at a time. The main goal here is to make the API as simple as possible, so in order to disassemble a whole block of code you have to invoke the disassembler multiple times. For now, there is no utility function which would make it easier.

The fact that the GIM is used as the result of the disassembling process allows us to analyse disassembled instructions easily. It is almost impossible (or at least very inconvenient) in case of disassemblers that return the textual instructions directly. Such a solution has the following advantages:

  • You are able to analyse instructions easily just by analysing the well known common generic instruction model.

  • You can assemble instructions back to the machine code using the FCML assembler. For instance you are able to disassemble a branch instruction, then change the offset and assemble it back.

  • You can avoid the rendering process if it is not needed.

Initializing disassembler instance

Let us initialize a disassembler instance using fcml_fn_disassembler_init function:

fcml_st_disassembler *disassembler;
error = fcml_fn_disassembler_init( dialect, &disassembler );

To make the code clearer, error handling has been omitted in this case, but it should be implemented in the same way as for the dialect initialization. All possible error codes are defined in the include file fcml_errors.h (See: Error handling)

Initializing disassembler result

Having initialized the dialect and disassembler, there is the last thing to be done before disassembling is possible. It is the disassembler result structure. This structure is reusable so it has to be prepared in the right way in order to allow the disassembler to reuse it correctly. To do so, a manually allocated structure has to be passed to the fcml_fn_disassembler_result_prepare function.

fcml_st_disassembler_result result;
fcml_fn_disassembler_result_prepare( &result );

That is all, the disassembler is fully prepared to do its job, so let us try to disassemble a piece of machine code.

Disassembling machine code

The first thing to do is to prepare a disassembler context structure. It consists of a disassembler instance which should be used to disassemble code, some configuration flags we can use to configure the disassembling process, entry point which will be used to inform the disassembler about the code segment and a piece of instruction machine code to be disassembled.

The disassembler context itself can be allocated on the stack, but it is very important to clear the memory it uses before initializing it and passing it to the disassembler. We do it to set all configuration options and other parameters to their default values. For example, the following code is the proper way to initialize the disassembler context:

fcml_st_disassembler_context context = {0};

Let us take a look at the fcml_st_disassembler_context structure first:

typedef struct fcml_st_disassembler_context {
	fcml_st_disassembler *disassembler;
	fcml_st_disassembler_conf configuration;
	fcml_st_entry_point entry_point;
	fcml_ptr code;
	fcml_data_size code_length;
} fcml_st_disassembler_context;

The first field, disassembler, should point to the disassembler instance we would like to use. The field entry_point describes the code segment used while disassembling code (See: Understanding entry point). The field code points to a memory buffer where the instruction machine code is located, and code_length contains the length of the code in the buffer. The last thing is the configuration structure. It contains a lot of configuration fields, so instead of copying its source code here, let us describe the fields only:

increment_ip

If the flag is set to FCML_TRUE, the instruction pointer is incremented and the buffer length is decremented by the size of the disassembled instruction every time the disassembling process succeeds. It can be very useful when we are disassembling multiple instructions one by one.

enable_error_messages

If the flag is set to FCML_TRUE textual error messages are generated and returned through fcml_st_disassembler_result structure.

carry_flag_conditional_suffix

If the flag is set to FCML_TRUE, it enables "carry suffixes" for conditional instructions. See the table with instruction suffixes in the Instruction renderer chapter.

conditional_group

This flag can be set to FCML_DASM_CONDITIONAL_GROUP_1 or to FCML_DASM_CONDITIONAL_GROUP_2 in order to choose a conditional suffixes group. See the table with instruction suffixes in the Instruction renderer chapter.

short_forms

If the flag is set to FCML_TRUE, then short instruction forms are used when disassembling code. So for instance "cmps byte ptr [si],byte ptr [di]" is disassembled as "cmpsb" (Operands are empty).

extend_disp_to_asa

If the flag is set to FCML_TRUE, the size of the displacement will always be the same as the effective address size attribute of the disassembled instruction.
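To show why increment_ip is convenient when a whole block of code is processed, here is a toy model of such a loop. Everything in it is a stand-in: the context type, the function names and the "decoder", which only knows two one-off encodings (0x90 as a one-byte nop and 0xB8 followed by a 32-bit immediate). It also assumes that the code pointer advances together with the instruction pointer. It does not call FCML.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of a context that move. */
typedef struct {
    uint64_t ip;
    const uint8_t *code;
    size_t code_length;
} toy_context;

/* Toy "decoder": returns the instruction size or 0 on failure. */
static size_t toy_instruction_size(const toy_context *ctx) {
    if (ctx->code_length >= 1 && ctx->code[0] == 0x90) return 1;
    if (ctx->code_length >= 5 && ctx->code[0] == 0xB8) return 5;
    return 0;
}

/* Walks the buffer one instruction at a time and returns the final
 * instruction pointer. After every success the context is advanced
 * by the instruction size, as described for increment_ip. */
uint64_t walk_code(uint64_t ip, const uint8_t *code, size_t code_length) {
    toy_context ctx;
    ctx.ip = ip;
    ctx.code = code;
    ctx.code_length = code_length;
    while (ctx.code_length > 0) {
        size_t size = toy_instruction_size(&ctx);
        if (size == 0) break;      /* stop at the first failure */
        ctx.ip += size;
        ctx.code += size;
        ctx.code_length -= size;
    }
    return ctx.ip;
}
```

Without such automatic bookkeeping the caller has to update the instruction pointer, the code pointer and the remaining length manually after every call.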

So at first let us prepare a configuration structure using the flags described above:

context.configuration.enable_error_messages = FCML_TRUE;
context.configuration.short_forms = FCML_TRUE;

The disassembler is configured but we still have not provided instruction machine code yet. It can be done by setting the following two additional context fields: code and code_length.

context.code = code;
context.code_length = sizeof( code );

The field code should be a pointer to an array of bytes which contains the instruction machine code and field code_length holds the length of the machine code in bytes.

The machine code has been provided, but we know nothing about the code section it is located in. This information can be supplied by setting the instruction pointer and processor addressing mode inside the entry point structure (If you do not know what the instruction pointer, address size attribute or processor operating mode are, you definitely should at least read the following chapter: Understanding entry point)

The first required field of the entry point structure is op_mode which describes the processor operating mode (16, 32 or 64 bit). We can also set default values for the address size attribute and operand size attribute for our virtual code segment.

context.entry_point.op_mode = op_mode;
context.entry_point.address_size_attribute = FCML_DS_UNDEF;
context.entry_point.operand_size_attribute = FCML_DS_UNDEF;
context.entry_point.ip = 0x00401000;

These defaults can be safely omitted, because they set the attributes to 0 anyway, which is the value the zero-initialized context already holds.
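The reason is plain C semantics: an aggregate initialized with {0} has all of its remaining members set to zero as well. The following sketch demonstrates it with stand-in types (the structures and the DS_UNDEF constant are hypothetical simplifications, not the FCML definitions).

```c
#include <stdint.h>

enum { DS_UNDEF = 0 };  /* stand-in for an "undefined size" value of 0 */

typedef struct {
    int op_mode;
    unsigned address_size_attribute;
    unsigned operand_size_attribute;
    uint64_t ip;
} sketch_entry_point;

typedef struct {
    sketch_entry_point entry_point;
    const void *code;
    unsigned code_length;
} sketch_context;

/* Only the fields that differ from zero are assigned explicitly. */
sketch_context make_context(int op_mode, uint64_t ip) {
    sketch_context context = {0};  /* every member starts as 0 / NULL */
    context.entry_point.op_mode = op_mode;
    context.entry_point.ip = ip;
    return context;
}
```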

The disassembler context is almost initialized, but we have left the most important thing at the end. It is the disassembler itself. It has to be also put into the context, because it will be used to do the whole of the work.

context.disassembler = disassembler;

The next piece of code shows what the whole context initialization code should look like:

fcml_st_disassembler_context context = {0};
context.disassembler = disassembler;
context.configuration.enable_error_messages = FCML_TRUE;
context.configuration.short_forms = FCML_TRUE;
context.entry_point.op_mode = FCML_OM_32_BIT;
context.entry_point.address_size_attribute = 0;
context.entry_point.operand_size_attribute = 0;
context.entry_point.ip = 0x00401000;
context.code = code;
context.code_length = sizeof( code );

Now we are ready to disassemble the first piece of code, hence let's do it. In order to disassemble anything we have to call the function fcml_fn_disassemble:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_disassemble( 
	fcml_st_disassembler_context *context, 
	fcml_st_disassembler_result *result );

The function gets a disassembler context and disassembler result as the function parameters, so let's set them to the instances we have prepared earlier:

error = fcml_fn_disassemble( &context, &result );
if( !error ) {
	…
}

The function returns FCML_CEH_GEC_NO_ERROR if successful; otherwise an appropriate error code is returned. The result structure contains the disassembled instruction in the form of the generic instruction model and potential warning messages or error messages if the function failed.

In addition there is a new structure which needs a little more attention here. It is the fcml_st_instruction_details structure which consists of additional information which is not relevant for the generic instruction model, but anyway can be useful through the process of the instruction analysis.

Analysing instruction details

As you probably already know, the disassembler decodes an instruction to the generic instruction model. It is very useful, because such a model is understandable by the assembler, so the instruction can be reassembled again. However, it is a somewhat limited set of information, so there is another structure, fcml_st_instruction_details, which can be used to access more details about disassembled instructions. The following structure describes them:

typedef struct fcml_st_instruction_details {
	fcml_bool is_shortcut;
	fcml_bool is_pseudo_op;
	fcml_uint8_t instruction_code[FCML_INSTRUCTION_SIZE];
	fcml_usize instruction_size;
	fcml_st_prefixes_details prefixes_details;
	fcml_st_operand_details operand_details[FCML_OPERANDS_COUNT];
	fcml_st_decoded_modrm_details modrm_details;
	fcml_bool opcode_field_s_bit;
	fcml_bool opcode_field_w_bit;
	fcml_en_instruction instruction;
	fcml_en_pseudo_operations pseudo_op;
	fcml_uint16_t addr_mode;
	fcml_uint64_t instruction_group;
} fcml_st_instruction_details;

Shortcuts

The first two fields, is_shortcut and is_pseudo_op, are related to the short instruction forms. The first one is set to FCML_TRUE whenever the disassembled instruction is in its short form, so it needs the flag short_forms from the fcml_st_disassembler_conf structure to be set.

The second flag, is_pseudo_op, can be set for these instructions: CMPSD, VCMPSD, CMPSS, VCMPSS, VPCOMB, VPCOMW, VPCOMD, VPCOMQ, VPCOMUB, VPCOMUW, VPCOMUD, VPCOMUQ, and it is set if and only if their short forms, called pseudo-ops, are returned. For example:

cmpsd xmm0,xmm1,06h
cmpnlesd xmm0,xmm1

These two instructions describe the same piece of machine code 0xF20FC2C106, but the field is_pseudo_op will be set to FCML_TRUE only for the second one (It also needs short_forms field to be set in the configuration.). Notice that they also differ on the level of operands. The second form has only two operands set in the GIM.
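The short form is selected by the comparison predicate stored in the immediate byte. For the SSE2 CMPSD instruction the Intel manual lists eight predicates, and the mapping can be sketched as a table. The function names below are hypothetical and the table covers only CMPSD, not the VEX or XOP variants.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the pseudo-op mnemonic for a CMPSD comparison predicate
 * (the imm8 operand), or NULL when there is no short form. */
const char *cmpsd_pseudo_op(uint8_t predicate) {
    static const char *const names[] = {
        "cmpeqsd", "cmpltsd", "cmplesd", "cmpunordsd",
        "cmpneqsd", "cmpnltsd", "cmpnlesd", "cmpordsd"
    };
    return predicate < 8 ? names[predicate] : NULL;
}

/* Non-zero when mnemonic is the short form for the given predicate. */
int is_cmpsd_pseudo_op(const char *mnemonic, uint8_t predicate) {
    const char *name = cmpsd_pseudo_op(predicate);
    return name != NULL && strcmp(name, mnemonic) == 0;
}
```

The last byte of the machine code above, 0x06, is the "not less than or equal" predicate, which is why the short form is cmpnlesd.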

Instruction machine code

The machine code of the instruction you have just disassembled is also accessible through two fields instruction_code and instruction_size. Obviously you have the same piece of machine code in your own array which is set in the disassembler context so this information is redundant, but it can be convenient to have it here if you collect disassembled instructions for further analysis.

Prefixes

Every instruction can have more than one instruction prefix. Most assembler programmers are aware of explicit prefixes like REPNE or LOCK, but there are also prefixes like REX, VEX, XOP etc. which are not specified explicitly by the programmer in the source code. Anyway, it would be nice to have some information about them if they exist.

Some additional information about prefixes is available through the field prefixes_details of the fcml_st_instruction_details structure shown above. These details are described by the following structure:

typedef struct fcml_st_prefixes_details {
	fcml_st_instruction_prefix prefixes[FCML_DASM_PREFIXES_COUNT];
	fcml_int prefixes_count;
	fcml_int prefixes_bytes_count;
	fcml_bool is_branch;
	fcml_bool is_nobranch;
	fcml_bool is_lock;
	fcml_bool is_rep;
	fcml_bool is_repne;
	fcml_bool is_xrelease;
	fcml_bool is_xacquire;
	fcml_bool is_vex;
	fcml_bool is_xop;
	fcml_bool is_rex;
	fcml_uint8_t vex_xop_first_byte;
	fcml_uint8_t r;
	fcml_uint8_t x;
	fcml_uint8_t b;
	fcml_uint8_t w;
	fcml_uint8_t l;
	fcml_uint8_t mmmm;
	fcml_uint8_t vvvv;
	fcml_uint8_t pp;
} fcml_st_prefixes_details;

The field prefixes is an array of fcml_st_instruction_prefix structures. The field prefixes_count holds the number of elements actually used in the array, and prefixes_bytes_count tells us how many bytes of the instruction code are prefixes.

So let's start with the fcml_st_instruction_prefix structure:

typedef struct fcml_st_instruction_prefix {
	fcml_uint8_t prefix;
	fcml_en_prefix_types prefix_type;
	fcml_bool mandatory_prefix;
	fcml_uint8_t vex_xop_bytes[2];
} fcml_st_instruction_prefix;

Every prefix consists of a prefix field, which is the byte interpreted as a prefix, the field prefix_type, which is the type of the prefix (see below), and a mandatory_prefix flag, which is set to FCML_TRUE for all mandatory prefixes (See Intel or AMD Architecture Manuals for more information about mandatory prefixes). The last field is vex_xop_bytes, which is used in the case of multi-byte VEX and XOP prefixes and contains the second and, optionally, the third byte of the whole prefix.

There are use cases when you are not really interested in any details about specific prefixes, but you would like to know whether they exist or not. You can achieve it using the following lookup fields: is_branch, is_nobranch, is_lock, is_rep, is_repne, is_xrelease, is_xacquire, is_vex, is_xop, is_rex.

The last category of fields exposed by the fcml_st_prefixes_details structure can be used to access common fields defined inside the REX, XOP and VEX prefixes. These are: r, x, b, w, l, mmmm, vvvv, pp. It is a bit more advanced subject, and if you are really interested in it do not hesitate to take a look at the Intel or AMD Architecture Manuals, but you should not need it in day-to-day usage.
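As an illustration of where the w, r, x and b values come from, the sketch below splits a REX prefix byte into its bits. The layout 0100WRXB is taken from the Intel and AMD manuals; the type and function names are hypothetical and the code is independent of FCML. Remember that REX prefixes exist only in 64-bit mode.

```c
#include <stdint.h>

typedef struct {
    uint8_t w;  /* 64-bit operand size */
    uint8_t r;  /* extension of the ModR/M reg field */
    uint8_t x;  /* extension of the SIB index field */
    uint8_t b;  /* extension of the ModR/M r/m or SIB base field */
} rex_fields;

/* A REX prefix has the form 0100WRXB, i.e. 0x40 to 0x4F. */
int is_rex(uint8_t byte) {
    return (byte & 0xF0) == 0x40;
}

rex_fields split_rex(uint8_t byte) {
    rex_fields f;
    f.w = (byte >> 3) & 1;
    f.r = (byte >> 2) & 1;
    f.x = (byte >> 1) & 1;
    f.b = byte & 1;
    return f;
}
```

For example, the byte 0x48 used by the "REX.W 0F 6E /r" form of MOVQ is a REX prefix with only the w bit set.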

Operand details

The next structure, fcml_st_operand_details, which is accessible through the field operand_details, contains additional information about all operands available in the generic instruction model. For now it consists of only one field, access_mode, which tells us if the operand is FCML_AM_READ, FCML_AM_WRITE or FCML_AM_READ_WRITE.

Details about ModR/M field

The next structure fcml_st_decoded_modrm_details which is accessible through the field modrm_details contains some details about the ModR/M:

typedef struct fcml_st_decoded_modrm_details {
	fcml_uint8_t modrm;
	fcml_nuint8_t sib;
	fcml_bool is_rip;
} fcml_st_decoded_modrm_details;

The first two fields are self-explanatory. The last one, is_rip, is set to FCML_TRUE if the RIP addressing is used (only in 64-bit mode).

Instruction code and opcode fields

The next four fields of the fcml_st_instruction_details structure described here are opcode_field_s_bit, opcode_field_w_bit, instruction and addr_mode. The first two reflect the opcode fields 's' and 'w', but they are set for informational purposes only and you should not use them for any critical functionality. You can treat them as deprecated fields, and as such they will probably be removed in future releases.

The third field instruction stores an instruction code. Every instruction has its own code which is declared in the fcml_instructions.h header file. For instance F_ADD, F_CALL, F_DIV. These codes are more convenient than mnemonics when we only need to identify the type of the instruction.

The fourth field, addr_mode, defines the instruction form used to disassemble the code, for instance FCML_AM_RM8_IMM8 or FCML_AM_RMO_RO. These values are used internally, but together with the instruction codes they can also be useful to identify not only the instruction but also its form. However, you should not rely on it, because there are instructions which do not specify the forms, even if there are any.

Instruction groups

The last field instruction_group can be used to identify an instruction group. The instruction group is used to classify instructions. Groups are just bit masks, so every instruction can be a member of more than one group. All groups are defined in the fcml_instructions.h header file.
#define 	FCML_AMT_UNDEF   0x0000000000000000UL
#define 	FCML_AMT_SSEx   0x0000000000000001UL
#define 	FCML_AMT_VEXx   0x0000000000000002UL
#define 	FCML_AMT_SIMD   0x0000000000000004UL
#define 	FCML_AMT_GPI   0x0000000000000008UL
#define 	FCML_AMT_FPU   0x0000000000000010UL
#define 	FCML_AMT_MMX   0x0000000000000020UL | FCML_AMT_SSEx
#define 	FCML_AMT_SSE   0x0000000000000040UL | FCML_AMT_SSEx
#define 	FCML_AMT_SSE2   0x0000000000000080UL | FCML_AMT_SSEx
#define 	FCML_AMT_SSE3   0x0000000000000100UL | FCML_AMT_SSEx
#define 	FCML_AMT_SSSE3   0x0000000000000200UL | FCML_AMT_SSEx
#define 	FCML_AMT_SSE41   0x0000000000000400UL | FCML_AMT_SSEx
#define 	FCML_AMT_SSE42   0x0000000000000800UL | FCML_AMT_SSEx
#define 	FCML_AMT_SSE4A   0x0000000000001000UL | FCML_AMT_SSEx
#define 	FCML_AMT_AVX   0x0000000000002000UL | FCML_AMT_VEXx
#define 	FCML_AMT_AVX2   0x0000000000004000UL | FCML_AMT_VEXx
#define 	FCML_AMT_AES   0x0000000000008000UL
#define 	FCML_AMT_SYSTEM   0x0000000000010000UL
#define 	FCML_AMT_3DNOW   0x0000000000020000UL | FCML_AMT_MMX
#define 	FCML_AMT_TBM   0x0000000000040000UL | FCML_AMT_VEXx
#define 	FCML_AMT_BMI1   0x0000000000080000UL
#define 	FCML_AMT_BMI2   0x0000000000100000UL
#define 	FCML_AMT_HLE   0x0000000000200000UL
#define 	FCML_AMT_ADX   0x0000000000400000UL
#define 	FCML_AMT_CLMUL   0x0000000000800000UL
#define 	FCML_AMT_F16C   0x0000000001000000UL | FCML_AMT_VEXx
#define 	FCML_AMT_RDRAND   0x0000000002000000UL
#define 	FCML_AMT_RDSEED   0x0000000004000000UL
#define 	FCML_AMT_PRFCHW   0x0000000008000000UL
#define 	FCML_AMT_LWP   0x0000000010000000UL | FCML_AMT_SIMD
#define 	FCML_AMT_SVM   0x0000000020000000UL
#define 	FCML_AMT_FSGSBASE   0x0000000040000000UL
#define 	FCML_AMT_FMA   0x0000000080000000UL | FCML_AMT_SIMD
#define 	FCML_AMT_FMA4   0x0000000100000000UL | FCML_AMT_SIMD
#define 	FCML_AMT_XOP   0x0000000200000000UL | FCML_AMT_SIMD
#define 	FCML_AMT_EDX   0x0000000400000000UL
#define 	FCML_AMT_ABM   0x0000000800000000UL
#define 	FCML_AMT_VMX   0x0000001000000000UL
#define 	FCML_AMT_SMX   0x0000002000000000UL
#define 	FCML_AMT_POPCNT   0x0000004000000000UL
#define 	FCML_AMT_RTM   0x0000008000000000UL
#define 	FCML_AMT_CTI   0x0000010000000000UL
#define 	FCML_AMT_BRANCH   0x0000020000000000UL
#define 	FCML_AMT_MMX_SIMD   FCML_AMT_MMX | FCML_AMT_SIMD
#define 	FCML_AMT_SSE_SIMD   FCML_AMT_SSE | FCML_AMT_SIMD
#define 	FCML_AMT_SSE2_SIMD   FCML_AMT_SSE2 | FCML_AMT_SIMD
#define 	FCML_AMT_SSE3_SIMD   FCML_AMT_SSE3 | FCML_AMT_SIMD
#define 	FCML_AMT_SSSE3_SIMD   FCML_AMT_SSSE3 | FCML_AMT_SIMD
#define 	FCML_AMT_SSE41_SIMD   FCML_AMT_SSE41 | FCML_AMT_SIMD
#define 	FCML_AMT_SSE42_SIMD   FCML_AMT_SSE42 | FCML_AMT_SIMD
#define 	FCML_AMT_AVX_SIMD   FCML_AMT_AVX | FCML_AMT_SIMD
#define 	FCML_AMT_AVX2_SIMD   FCML_AMT_AVX2 | FCML_AMT_SIMD
#define 	FCML_AMT_3DNOW_SIMD   FCML_AMT_3DNOW | FCML_AMT_SIMD

For instance, the CALL instruction is a member of the following groups: FCML_AMT_GPI (general purpose instruction), FCML_AMT_CTI (control transfer instruction) and FCML_AMT_BRANCH (branch instruction).

Notice that the groups are not always related to the CPUID flags.
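Because a group constant may combine several bits, a membership test is easiest to get right through a small helper. The following is only a minimal sketch: the three constants are local copies of values from the listing above so that the code builds without the FCML headers, and has_group and call_groups are hypothetical names which are not part of the library.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of three values from the listing above (illustrative names,
// not the FCML macros themselves).
constexpr std::uint64_t AMT_GPI    = 0x0000000000000008ULL;
constexpr std::uint64_t AMT_CTI    = 0x0000010000000000ULL;
constexpr std::uint64_t AMT_BRANCH = 0x0000020000000000ULL;

// Hypothetical helper: true if every bit of `group` is set in `groups`.
// The group is passed as an argument, so it is evaluated as one value
// before the bitwise AND is applied.
inline bool has_group(std::uint64_t groups, std::uint64_t group) {
    return (groups & group) == group;
}

// Group mask of the CALL instruction as described in the text.
inline std::uint64_t call_groups() {
    return AMT_GPI | AMT_CTI | AMT_BRANCH;
}
```

With the macros as listed, an inline test such as groups & FCML_AMT_MMX expands to groups & 0x0000000000000020UL | FCML_AMT_SSEx, which the compiler parses as (groups & 0x20) | 0x1 and which is therefore never zero. Wrapping the macro in parentheses, or passing it to a function as above, avoids this.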

Freeing resources

When resources used by the disassembler are no longer needed they have to be freed. The following code frees a disassembler instance, disassembler result and dialect.

fcml_fn_disassembler_result_free( &result );
fcml_fn_disassembler_free( disassembler );
fcml_fn_dialect_free( dialect );

Remember that the fcml_fn_disassembler_result_free function does not free the result structure itself. It is only responsible for freeing the structures allocated by the disassembler which are accessible through the disassembler result, such as error messages. That is why the structure can still be reused by the disassembler even after it has been freed.

Instruction renderer

Instruction renderers are used to generate a textual representation of disassembled instructions. All you need to do is configure the instruction renderer and render a disassembler result into a provided text buffer. Let's do it, but first take a look at the function we will use to render the instruction model:

LIB_EXPORT fcml_ceh_error LIB_CALL fcml_fn_render( 
	fcml_st_dialect *dialect, 
	fcml_st_render_config *config, 
	fcml_char *buffer, 
	fcml_usize buffer_len, 
	fcml_st_disassembler_result *result );

This function takes quite a few arguments, but only one of them has to be carefully prepared: the fcml_st_render_config structure, which configures some aspects of the rendering process:

typedef struct fcml_st_render_config {
	fcml_uint32_t render_flags;
	fcml_uint16_t prefered_mnemonic_padding;
	fcml_uint16_t prefered_code_padding;
} fcml_st_render_config;

The structure is not complicated: it contains only the rendering flags and the padding configuration for the mnemonic and the instruction code (if the instruction code is rendered). The padding fields are described together with the rendering flags FCML_REND_FLAG_CODE_PADDING and FCML_REND_FLAG_MNEMONIC_PADDING.

The following list describes all available rendering flags:

FCML_REND_FLAG_RENDER_CODE

If this flag is set, the instruction code is rendered. For instance: "666781d04280 adc ax,32834".

FCML_REND_FLAG_HEX_IMM

If this flag is set, the immediate operands are rendered as hexadecimal literals: "adc rax,0000000042806521h".

FCML_REND_FLAG_RENDER_DEFAULT_SEG

If this flag is set, the segment register is rendered even if there is only the default one: "call far fword ptr cs:[ebx+00000001h]".

FCML_REND_FLAG_HEX_DISPLACEMENT

The displacement value is rendered as hexadecimal literal: "adc byte ptr [ecx+eax+00000002h],03h".

FCML_REND_FLAG_COND_GROUP_1/2

These flags can be used to choose the set of conditional suffixes used when rendering conditional mnemonics. There are two groups available, "Group 1" and "Group 2", as shown in the table below. The two flags are mutually exclusive.

FCML_REND_FLAG_COND_SHOW_CARRY

Once a conditional suffix group has been chosen, you can also enable carry suffixes for two conditions. See the "Show carry" column in the table below.

FCML_REND_FLAG_RENDER_SIB_HINT

If this flag is set, the SIB hint will be rendered. Supported by the Intel syntax only: "add dword ptr [sib eax],eax". Ignored in case of the AT&T dialect.

FCML_REND_FLAG_RENDER_ABS_HINT

If this flag is set and the absolute addressing is used, "abs" hint will be rendered: "rcl byte ptr [abs 0000000000401007h],03h". Ignored in case of the AT&T dialect.

FCML_REND_FLAG_RENDER_REL_HINT

If this flag is set and the relative addressing is used, "rel" hint will be rendered: "rcl byte ptr [rel 0000800000401007h],03h". Ignored in case of the AT&T dialect.

FCML_REND_FLAG_RENDER_INDIRECT_HINT

If this flag is set and the indirect addressing mode is used, "indirect" hint will be rendered: "jmp indirect dword ptr [eax]". Ignored in case of the AT&T dialect.

FCML_REND_FLAG_CODE_PADDING

If this flag is set, code padding is rendered. The code padding is a fixed space between the instruction code and the mnemonic. For instance, the following instruction has been rendered with the code padding set to 10, so at least 10 * 2 characters have to be rendered before the mnemonic: "6681d04280           adc ax,32834". The code padding can be configured using the prefered_code_padding configuration field.

FCML_REND_FLAG_MNEMONIC_PADDING

If this flag is set a mnemonic padding is added. The mnemonic padding is a fixed space between the mnemonic and the first operand. For example the following instruction has been rendered with the mnemonic padding set to 8: "6681d04280 adc     ax,32834". Take into account that there are 8 characters between the start of the mnemonic and the first operand. The mnemonic padding can be configured using the prefered_mnemonic_padding configuration field.
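The two padding options can be pictured as minimum column widths. The following sketch is a self-contained model of that layout and not the FCML renderer: layout_line is a hypothetical function, and both the single separating space after the code column and the rule of at least one space before the operands are assumptions read off the two examples above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of the padding options; not FCML code.
// code_padding is counted in bytes, so the hex column is at least
// 2 * code_padding characters wide. mnemonic_padding is counted in
// characters, measured from the start of the mnemonic.
std::string layout_line(const std::string &code_hex,
                        const std::string &mnemonic,
                        const std::string &operands,
                        unsigned code_padding,
                        unsigned mnemonic_padding) {
    assert(!mnemonic.empty());

    // Pad the code column if needed, then add one separating space.
    std::string line = code_hex;
    if (line.size() < 2u * code_padding) {
        line.resize(2u * code_padding, ' ');
    }
    line += ' ';

    // Pad the mnemonic column, always leaving at least one space
    // between the mnemonic and the first operand.
    std::string column = mnemonic;
    column.resize(std::max<std::size_t>(mnemonic_padding, mnemonic.size() + 1), ' ');

    return line + column + operands;
}
```

For example, layout_line( "6681d04280", "adc", "ax,32834", 10, 0 ) places the mnemonic after a 20 character code column, while passing 0 and 8 places the first operand 8 characters after the start of the mnemonic.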

FCML_REND_FLAG_REMOVE_LEADING_ZEROS

If this flag is set, the renderer removes leading zeros from all integer literals. For example the following instruction: "call far fword ptr cs:[ebx+00000001h]" will be rendered as: "call far fword ptr cs:[ebx+1h]".

FCML_REND_DEFAULT_FLAGS

It can be used if we do not want to set any rendering flags.

The following table shows the conditional suffix groups mentioned above:

Conditional suffix groups
Group 1    Group 2    Show carry
o          o          Group 1 or Group 2.
no         no         Group 1 or Group 2.
b          nae        c
nb         ae         nc
e          z
ne         nz         Group 1 or Group 2.
be         na         Group 1 or Group 2.
nbe        a          Group 1 or Group 2.
s          s          Group 1 or Group 2.
ns         ns         Group 1 or Group 2.
p          pe         Group 1 or Group 2.
np         po         Group 1 or Group 2.
l          nge        Group 1 or Group 2.
nl         ge         Group 1 or Group 2.
le         ng         Group 1 or Group 2.
nle        g          Group 1 or Group 2.
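
The table above can be read as a simple lookup. The sketch below models it in a self-contained way; cond_suffix and its parameters are hypothetical names, not FCML functions. The rows are kept in the order of the table, which is also the order of the 4-bit x86 condition codes, and a row with no "Show carry" entry is assumed to follow the chosen group.

```cpp
#include <cassert>
#include <string>

// Hypothetical lookup mirroring the table above; not FCML code.
// Index 0 is "o", index 1 is "no", and so on down to index 15 ("nle"/"g").
static const char *const GROUP_1[16] = {
    "o", "no", "b", "nb", "e", "ne", "be", "nbe",
    "s", "ns", "p", "np", "l", "nl", "le", "nle"};
static const char *const GROUP_2[16] = {
    "o", "no", "nae", "ae", "z", "nz", "na", "a",
    "s", "ns", "pe", "po", "nge", "ge", "ng", "g"};

// Returns the suffix for the condition code `cc`.
// `use_group_2` models the choice between the two group flags and
// `show_carry` models FCML_REND_FLAG_COND_SHOW_CARRY, which changes
// only the rows with index 2 and 3.
std::string cond_suffix(unsigned cc, bool use_group_2, bool show_carry) {
    assert(cc < 16);
    if (show_carry && cc == 2) return "c";
    if (show_carry && cc == 3) return "nc";
    return use_group_2 ? GROUP_2[cc] : GROUP_1[cc];
}
```

Under this model a conditional jump with condition code 2 would be rendered with the suffix "b", "nae" or "c" depending on the selected flags.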

This is an example of a very basic configuration:

fcml_st_render_config render_config = {0};
render_config.render_flags = FCML_REND_FLAG_HEX_IMM | FCML_REND_FLAG_HEX_DISPLACEMENT;

Getting back to the rendering function, the two arguments that follow the configuration, buffer and buffer_len, describe the destination text buffer where the textual representation of the instruction will be written. You can allocate the destination buffer this way, for example:

fcml_char buffer[FCML_REND_MAX_BUFF_LEN];

Remember that this buffer is also a reusable one and does not have to be cleaned between multiple calls to the rendering function.

We have prepared all needed arguments, so let's render the result:

fcml_ceh_error error;
error = fcml_fn_render( dialect, &render_config, buffer, sizeof( buffer ), &result );

Notice that we pass the whole disassembler result structure to the renderer. The GIM alone is not enough to render an instruction, which is why the whole result is passed as the argument: it also contains the fcml_st_instruction_details structure, which is used by the renderer.

Although it is possible to prepare such a disassembler result by hand and pass it to the renderer, it would be very risky. Use the renderer only with structures prepared by the FCML disassembler. The last thing you need to know is that you have to use the same dialect that was used by the disassembler.

The following example disassembles a piece of code and renders it to the console:

#include <stdio.h>
#include <stdlib.h>

#include <fcml/fcml_disassembler.h>
#include <fcml/fcml_renderer.h>
#include <fcml/fcml_intel_dialect.h>

int main(int argc, char **argv) {

	fcml_uint8_t code[] = { 0x4D, 0x11, 0x64, 0x89, 0x01 };

	fcml_ceh_error error;

	/* Initializes Intel dialect instance. */
	fcml_st_dialect *dialect;
	if( ( error = fcml_fn_dialect_init_intel( FCML_INTEL_DIALECT_CF_DEFAULT, &dialect ) ) ) {
		fprintf( stderr, "Can not initialize Intel dialect: %d", error );
		exit(1);
	}

	/* Initializes a disassembler for the Intel dialect. */
	fcml_st_disassembler *disassembler;
	if( ( error = fcml_fn_disassembler_init( dialect, &disassembler ) ) ) {
		fprintf( stderr, "Can not initialize disassembler: %d", error );
		fcml_fn_dialect_free( dialect );
		exit(1);
	}

	/* Prepares a disassembler result. */
	fcml_st_disassembler_result dis_result;
	fcml_fn_disassembler_result_prepare( &dis_result );

	/* Disassembles the code. */
	fcml_st_disassembler_context context = {0};
	context.disassembler = disassembler;
	context.entry_point.ip = 0x401000;
	context.entry_point.op_mode = FCML_OM_64_BIT;
	context.code = code;
	context.code_length = sizeof( code );

	if( ( error = fcml_fn_disassemble( &context, &dis_result ) ) ) {
		fprintf( stderr, "Can not disassemble the code: %d", error );
		fcml_fn_disassembler_free( disassembler );
		fcml_fn_dialect_free( dialect );
		exit(1);
	}

	/* Renders the disassembled instruction. */
	fcml_char buffer[FCML_REND_MAX_BUFF_LEN];

	fcml_st_render_config config = {0};
	if( ( error = fcml_fn_render( dialect, &config, buffer, sizeof( buffer ), &dis_result ) ) ) {
		fprintf( stderr, "Can not render the instruction: %d", error );
		fcml_fn_disassembler_result_free( &dis_result );
		fcml_fn_disassembler_free( disassembler );
		fcml_fn_dialect_free( dialect );
		exit(1);
	}

	printf( "Instruction: %s\n", buffer );

	/* Free everything. */
	fcml_fn_disassembler_result_free( &dis_result );
	fcml_fn_disassembler_free( disassembler );
	fcml_fn_dialect_free( dialect );

	return 0;
}

It should print something like this:

Instruction: adc qword ptr [r9+rcx*4+1],r12

CPP Wrapper

Since FCML 1.1.0 there is a C++ wrapper available. The whole implementation is placed in header files which are located next to their C counterparts. The core of the FCML library still exports only C symbols, which are then used internally by the wrapper implementation. The main advantage of this approach is that anyone can use the wrapper even if only the C version of FCML is available on a particular system. The implementation is based on the following goals and assumptions:

  • The wrapper has to be easy to use.
  • Performance is not the main goal here. If you really need the best performance possible, you should definitely consider using the C API directly. This does not mean that the wrapper is slow. It can be just a bit slower than a proper C implementation, mainly because C++ data structures are often copied just to make the usage easier. See the next point.
  • If it is possible to copy or assign an object, you can be sure that a deep copy is made and that the two resulting objects are completely independent and can be freed whenever you want (there is one exception to this rule and it is documented later, see the CodeIterator class). This can hurt performance a bit, so bear in mind that C structures are always copied to the C++ objects. In most cases there are no wrappers around the structures directly. It is the most important assumption here and it will not be changed in future releases. For instance, the whole fcml_st_assembler_result is always copied to the fcml::AssemblerResult after every call to the assembler.
  • If for whatever reason an object cannot be deep copied, the C++ compiler simply will not allow you to do so. The Assembler class is a good example of such a class. It maintains a real instance of the fcml_st_assembler structure, which cannot be copied just like that.
  • Strings are always wrapped in the fcml::fcml_cstring, so you do not have to track them.
  • Built-in exceptions are ALWAYS thrown using references. It is not even possible to allocate them using the “new” operator.
  • Some of the C enumeration types are wrapped in the new types declared directly in the classes using them (just to make them more convenient to use).
  • There might be CPP files that can be included just to compile and link some definitions into the destination object files, but they are here due to the convenience and they are optional, so it is your decision to use them or not.
  • Most of the C structures have their CPP counterparts placed in header files with “.hpp” extensions.
  • Everything dedicated to the C++ is available in the fcml name space.
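
The copy rules from the list above can be pictured with two small stand-in classes. This is only a sketch under stated assumptions: ResultModel, NativeOwnerModel and copy_and_patch are hypothetical names and not FCML classes, and the compile-time check uses C++11 features which the wrapper itself may not rely on.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical value type: copies are deep and fully independent.
struct ResultModel {
    std::vector<unsigned char> code;
    std::string warning;
};

// Hypothetical resource owner: copying is rejected by the compiler,
// because the native handle it manages cannot be duplicated.
class NativeOwnerModel {
public:
    NativeOwnerModel() : handle_(new int(0)) {}
    ~NativeOwnerModel() { delete handle_; }
    NativeOwnerModel(const NativeOwnerModel &) = delete;
    NativeOwnerModel &operator=(const NativeOwnerModel &) = delete;
private:
    int *handle_;
};

static_assert(!std::is_copy_constructible<NativeOwnerModel>::value,
              "the owner must not be copyable");

// Copies `original`, appends one byte to the copy and returns it.
// The original is left untouched because the copy is deep.
ResultModel copy_and_patch(const ResultModel &original) {
    ResultModel copy = original;
    copy.code.push_back(0x90);
    assert(copy.code.size() == original.code.size() + 1);
    return copy;
}
```

The value type behaves like the result and model classes of the wrapper, while the owner behaves like the classes which manage native resources.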

The documentation of the wrapper is not as detailed as the rest of the FCML manual, so you should definitely read the C part if your goal is to master the library. The wrapper code is really simple, so when you are not sure whether something is allowed, do not hesitate to take a look at the source code and the API documentation. Do not be fooled by the size of the wrapper code. There are a lot of helper methods and accessors which are really simple, but they make the code significantly bigger. Of course you can always help improve the manual and make it more detailed if you have the time and the will.

Ok, that's enough talking. Let's get started!

Mnemonics

There are two header files prepared for the C++ wrapper: fcml_gas_mnemonics.hpp and fcml_intel_mnemonics.hpp. They declare constant strings with all mnemonics supported by the available dialects. The convention is really simple. Every mnemonic is prefixed with M_. For example, all of these are valid instruction mnemonics: M_ADD, M_CALL, M_SUB and so on. This prefix is a bit inconvenient and can definitely be seen as unnecessary here, but some of the mnemonics like OUT or IN conflict with pre-processor macros which are defined by Microsoft compilers, for example. Of course it is an open source project, so you are welcome to take the code and modify it for your purposes. As far as I remember there are no conflicts in case of GNU compilers. I haven't checked clang yet.

There is one more thing you have to do in order to make it work. As you probably already know, the wrapper is implemented using header files only, so there are no C++ symbols exported by the FCML library. Because of this restriction you have to include the definitions of these mnemonics exactly once in your own source code. It is as simple as including any regular header file:

#include <fcml_gas_mnemonics.cpp>
int main() {
    return 0;
}

Remember to include them only once. They do not have to be included in every file which uses mnemonics. In fact doing so would lead to duplicate symbol errors when the application is built.

For Intel mnemonics use fcml_intel_mnemonics.cpp.

If you really need to use GAS and Intel mnemonics together, feel free to do it, but be aware that they use different name spaces: fcml::intel and fcml::gas.

If you are sceptical about including implementation files, you can always use the plain C header files fcml_gas_mnemonics.h and fcml_intel_mnemonics.h, which declare mnemonics using preprocessor directives. Remember, however, that the mnemonics dedicated to C++ can give you some performance gain, because fcml_cstring instances do not need to be created implicitly when methods or operators expect them (every mnemonic is just a constant fcml_cstring).

Common types

This chapter describes important common types which are used by the rest of wrapper classes. They are not described in every detail because lots of them are full of various utility methods and operators, so the best way to master them is to head over to the API documentation.

Character types

There are two new type declarations for character types. The first one is fcml::fcml_cstring, which is just std::basic_string for the fcml_char type. It can be used interchangeably with the standard std::string as long as fcml_char is a standard C character type. For now UNICODE support is still only planned (and will probably be available soon), so it does not matter which configuration you use (ASCII/UNICODE in case of VS), because fcml::fcml_cstring is always a char based type like std::string. So regardless of the chosen configuration it should be possible to use std::string and fcml::fcml_cstring interchangeably.

However, you should bear in mind that using fcml::fcml_cstring is more portable, so choosing it can be a significant advantage if your source code is ported to UNICODE in the future. In the next releases fcml_char will probably be configuration dependent and might not be convertible directly to std::string. For now just remember that fcml::fcml_cstring is an 8-bit ASCII type, not a UNICODE one. Remember that UNICODE characters are two or more bytes in length and are represented as words.

The second type fcml::fcml_costream is just an output string stream based on the std::basic_ostringstream and fcml_char type.

EntryPoint class

The class holds information about an entry point. It is just a counterpart to the fcml_st_entry_point structure. In most cases this class is used indirectly and its usage is just hidden from the user. In order to create an entry point instance directly use the following piece of code:

    EntryPoint entryPoint( EntryPoint::OM_32_BIT, 0x401000 );

Both address size attribute and operand size attribute can be also provided as optional parameters.

Integer class

It is a counterpart to the fcml_st_integer structure. It is a fairly complex class and you should definitely take a look at the API as well as the source code in order to understand it deeply. In short, it wraps an integer value of a particular size and signedness. In most cases instances can be used like the ordinary integer types provided by the compiler, because implementations of the most important operators are provided. Take a look at the following piece of code:

Integer v1 = static_cast<fcml_uint8_t>( 10 );
Integer v2 = static_cast<fcml_uint8_t>( 20 );

fcml_uint8_t sum = v1 + v2;

sum == 30 // True

It creates two integers. The first one is an 8-bit unsigned integer with the value of 10. The second one is set to 20 and is also an 8-bit unsigned value. They are then added to each other and the result is converted to a plain fcml_uint8_t value.

Remember that the result is always of the size and type of the first operand. For instance:

Integer v1 = (fcml_uint8_t)20;
Integer v2 = (fcml_int64_t)10;


Integer sum = v1 + v2;

sum == 30 // True
!sum.isSigned() // True
sum.getSize() == 8 // True

In such a case the result “sum” is an 8 bit unsigned value set to 30.
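
The rule that the result inherits the size and signedness of the first operand can be modelled in a few lines. The sketch below is not the fcml::Integer class: IntModel and add are hypothetical names, and overflow handling is deliberately left out because the text above does not describe it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a sized integer; not the fcml::Integer class.
struct IntModel {
    std::int64_t value;
    unsigned size;      // size in bits: 8, 16, 32 or 64
    bool is_signed;
};

// Adds two values. The result takes the size and the signedness of the
// left (first) operand, as described in the text above.
inline IntModel add(const IntModel &lhs, const IntModel &rhs) {
    assert(lhs.size == 8 || lhs.size == 16 || lhs.size == 32 || lhs.size == 64);
    IntModel result = { lhs.value + rhs.value, lhs.size, lhs.is_signed };
    return result;
}
```

Adding an 8-bit unsigned 20 and a 64-bit signed 10 in this model gives an 8-bit unsigned 30, which corresponds to the example above.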

Register class

It is a direct counterpart to the fcml_st_register structure. It has the same members and they have the same meaning. In order to create a register instance you can use the constructor directly:

Register regCons( FCML_REG_AL, FCML_DS_8 );

You can also use static factory methods provided by the fcml::Register class:

Register reg = Register::AL();
const Register &reg = Register::AL();

It is a convenient way of creating all the registers, but there is an even simpler way of doing it. All you have to do is include the following files: fcml_registers.hpp and fcml_registers.cpp. It is very important to include the *.cpp file only once to avoid duplicated symbols. The main project file is the best place for including it. After doing so you gain access to constant variables for all registers, which can be accessed directly:

#include <fcml_registers.hpp>
#include <fcml_registers.cpp>

const Register &reg = AL;
if( reg == Register::AL() ) {
}

Unlike the .cpp file, .hpp header has to be visible in every source file which accesses the registers.

Generic instruction model

All classes related to generic instruction models are strict counterparts to the C structures: FarPointer, SegmentSelector, EffectiveAddress, Address, Operand, Condition, Instruction.

The following examples show how they can be allocated:

Far pointer

A far pointer class instance can be allocated using the constructor or two factory methods. The first method creates a far pointer for a 32-bit offset while the second one for a 16-bit one:

FarPointer fp = FarPointer(0x3000, 0x000401000 );
fp = FarPointer::off32( 0x3000, 0x000401000 );
fp = FarPointer::off16( 0x3000, 0x01212 );

Segment selector

Using the constructor to allocate segment selector:

SegmentSelector ss(Register::CS(),true);

A static factory method is also available here:

SegmentSelector::seg( Register::CS(),true );

The first parameter represents a segment register, the second one should be set to true if the segment selector sets the default segment register in the given context.

Effective address

Effective addresses can be created using constructors, but there are also factory methods which can be used to prepare every allowed combination of the effective address components. Take a look at the API to master all the ways of allocating it. The following code shows a few examples:

Using the constructor:

EffectiveAddress base( Register::EAX() );

Using the factory methods:

EffectiveAddress base_disp = EffectiveAddress::addr( EAX, Integer::int8(0x40) );
EffectiveAddress base_index = EffectiveAddress::addr( EAX, EDX );
EffectiveAddress base_index_scale = EffectiveAddress::addr( EAX, EDX, 4 );

Addresses

The Address class represents relative and absolute offsets as well as effective addresses. The constructor and factory methods can be used interchangeably to create instances of the class. Some of the factory methods use the fcml::EffectiveAddress class internally, so you do not have to allocate it explicitly in order to prepare the effective address:

A relative or absolute offset:

Address address(0x401000);

An effective address using factory method from the EffectiveAddress class:

Address effective(EffectiveAddress::addr(Register::EAX(), FCML_DS_32));

The same effective address using the factory method provided directly by the Address class:

Address effective = Address::effective(EAX, FCML_DS_32);

Conditions

fcml::Condition is a class that represents a condition used to describe conditional instructions like conditional branches. Instances of the Condition class rarely have to be created manually, but if you really need to do it, use the provided constructor or factory methods as follows:

Constructor:

Condition condition( Condition::CONDITION_LE, true );

Factory methods:

Condition condition = Condition::NLE();

The most important thing here is the fact that there are a lot of helpful methods which can be used to check the condition type. For instance:

Condition condition( Condition::CONDITION_LE, true );
if( condition.isNLE() ) {
}
if( condition == Condition::NLE() ) {
}
if( condition.isNegation() ) {
}

In the first and second example we just compare the whole condition. The third example checks if the condition is a negation of a particular condition type, LE in this case.

Operands

The fcml::Operand class is built in the same way as the fcml_st_operand structure. There is a type and aggregated objects describing every possible type of an operand. The following examples describe the ways operands can be created:

Constructors:

Operand imm( 0x201000 );
Operand far_ptr( FarPointer( 0x1020, (fcml_int16_t)0x3030 ) );

Some methods dedicated to set up certain types of the operands:

Operand operand;
operand.far_ptr( FarPointer( 0x1020, (fcml_int16_t)0x3030 ) );
operand.addr( EffectiveAddress( Register::EAX() ) );

Take a look at the API to see all of them.

There is also an operand builder class fcml::OB which is strictly dedicated to the creation of operands by exposing static factory methods:

Operand addr = OB::addr( EffectiveAddress( Register::EAX() ) );
Operand eff = OB::eff( EAX, 8 );
Operand reg = OB::reg( ECX );
Operand imm = OB::imm( 0x401000 );
Operand farp = OB::far_ptr( 0x1020, (fcml_int16_t)0x3030 );

Instructions

So far we have discussed all parts of the generic instruction model, so it is time to create some instructions using them. Of course instructions can be prepared using constructors directly, but unlike the other classes, fcml::Instruction has no factory methods. Instead, there is another class, fcml::IB. It is a stateful instruction builder which is designed to prepare instructions in a more convenient way. So let's see some examples:

For the sake of this example let's try to build the following instruction by hand: "mov byte ptr [eax], ecx".

The first way uses the constructor with an instruction mnemonic and some factory methods which add operands. If the builder state is complete, the instruction itself can be built using the "build()" method:

Instruction instruction;
instruction = IB( M_MOV ).eff( EAX, 8 ).reg( ECX ).build();

The last method is optional and the whole code might look like this:

instruction = IB::inst( M_MOV ).eff( EAX, 8 ).reg( ECX );

As you can see the instruction builder itself is created a bit differently in this case using the static factory method "inst()".

There is another way using shift operators:

instruction = IB(M_MOV) << OB::eff( EAX, 8 ) << OB::reg( ECX );

There is also a way to set prefixes and instruction hints, so let's set some of them:

instruction = IB(M_MOV) << IB::LOCK() << IB::REP() << IB::INDIRECT_PTR() 
                        << IB::NEAR_PTR() << OB::effb( EAX ) << OB::reg( ECX );

Take into account one significant benefit of building instructions in this way. They can be built in a dynamic manner, you are able to provide every operand, address etc. using language variables. It's also possible to build instructions in more than one step using conditional statements. For instance such a construction is perfectly valid:

IB builder = IB(M_JMP).off(0x404000);

if( isPositionIndependent ) {
    builder.set( OperandHint::RELATIVE_ADDRESSING() );
} else {
    builder.set( OperandHint::ABSOLUTE_ADDRESSING() );
}

Instruction instruction = builder.build();

Or even such one:

IB builder = IB(M_MOV).reg(ECX);

Address address;
if( offset ) {
    address = Address::off( offset, FCML_DS_32 );
} else {
    address = Address::eff( EAX, FCML_DS_32 );
}

Instruction instruction = builder.op( address );

Here we build a MOV instruction whose source operand is either an offset or an effective address with a base register, depending on whether the absolute offset is set.

Getting back to the hints, it is also possible to set operand hints using the builder. The rule is simple: every hint is added to the last operand which has been sent to the builder. So the following code:

IB(M_RCL) << OB::effb( 0x0000000000401007 ) << IB::ABS() << OB::imm( 3 );

or

IB( M_RCL ).effb( 0x0000000000401007 ).abs().imm( 3 );

can be interpreted as follows: "rcl byte ptr [abs 0000000000401007h],03h".

When the instruction is already built it can be still modified. For instance you are able to change the operands easily:

instruction[0] = OB::reg( EDX );

Assembler

Using C++ assembler is as easy as using the native C version. All you have to do is to allocate a dialect, assembler context and then use them together to create and use an assembler instance. The whole process is shown in the following code:

try {

   IntelDialect dialect;

   AssemblerContext ctx( EntryPoint::OM_32_BIT, 0x401000 );
   Assembler assembler( dialect );

   Instruction instruction = IB( M_MOV ) << OB::reg( EAX ) << OB::offd( 0x40302010 );
   
   AssemblerResult result;
   assembler.assemble( ctx, instruction, result );
   AssembledInstruction *chosenInstruction = result.getChosenInstruction();

} catch( BaseException &exc ) {
   std::cout << "Exception while assembling the code." << std::endl;
}

The dialect and assembler instances are non-copyable because they manage native FCML resources internally. In order to avoid complex resource management mechanisms and keep the whole wrapper easy and consistent, they cannot be copied like ordinary classes such as Instruction, AssemblerResult and so on. Just remember the rule mentioned earlier: if you are able to copy anything, it can be copied safely without creating any dependencies between the copied objects. You can be sure that a deep copy is made.

AssemblerResult consists of the same set of details as the plain fcml_st_assembler_result structure, so it does not have to be explained here. Anyway, head over to the API section, because there are a lot of useful methods and operators which of course are not available in the C version. There is one interesting method in the AssemblerResult class: getChosenInstruction. This method is special because, as you might have noticed, it returns a pointer, which is not common in the FCML wrapper. It is probably the only place where such a convention is used, so be aware that it is a pointer to one of the internally managed AssembledInstruction instances and that it is valid only as long as they are.

As you can see, by default exceptions are used if something fails. It is a global rule for the FCML wrapper classes, so bear in mind that in general you cannot avoid the usage of exceptions. However, in case of the "assemble()" method (more about that in the disassembler chapter) you can disable exceptions by setting the appropriate flag in the assembler configuration:

assemblerContext.getConfig().setThrowExceptionOnError( false );

In such a case the "assemble()" method returns errors in the same way as the fcml_fn_assemble function does. This flag is set to true by default.

Remember that the constructor of the IntelDialect and GASDialect can also throw an exception.

See the API documentation in order to learn more about how AssembledInstruction and AssemblerResult can be used.

Multiline assembler

The multiline assembler has almost the same API as the one-line assembler. The only differences are the source of the instructions and the way the result is returned. The following code example should be self-explanatory:


// Notice that it is an array of strings.
const fcml_string instructions[] = {
	"start:      mov ebx, 1",
	"loop_big:   inc ebx",
	"            cmp ebx, 10",
	"            je  finish",
	"loop_small: mov eax, 1",
	"increment:  inc eax",
	"            cmp eax, 10",
	"            je  finish_small",
	"            jmp increment",
	"finish_small:",
	"            jmp loop_big",
	"finish:     ret",
	NULL
};

…

try {

    IntelDialect dialect;
    MultiPassAssemblerContext ctx( EntryPoint::OM_32_BIT, 0x401000 );

    MultiPassAssembler assembler(dialect);

    MultiPassAssemblerResult result;
    assembler.assemble( ctx, instructions, result );

    CodeIterator it = result.getCodeIterator();
    while( it.hasNext() ) {
        cout << "Next byte: " << hex << it.next() << std::endl;
    }

} catch( BaseException &exc ) {
    cout << "Exception while assembling the code." << std::endl;
}

Notice the iterator here. Instead of iterating through the list of assembled instructions you can use this iterator to iterate through the whole assembled machine code without even touching the instructions. Remember that the iterator is a part of the MultiPassAssemblerResult instance and they have the same lifetime.

Of course instructions are also available and can be accessed using "getAssembledInstructions()" method of the MultiPassAssemblerResult class.

Stateful assembler

There is one more way to assemble code. It is a stateful assembler which can be used to assemble the code incrementally without having to care about the instruction pointer, code address incrementation etc. It is very easy to use and in fact it is also a very simple component. The following code shows an example usage:

try {

    IntelDialect dialect;
    AssemblerContext ctx( EntryPoint::OM_64_BIT, 0x404ddc );

    Assembler assembler( dialect );

    StatefulAssembler statefulAssembler( assembler, ctx );

    statefulAssembler << M_PUSH << RBP;
    statefulAssembler << M_MOV  << RBP << RSP;
    statefulAssembler << M_SUB  << RSP << 0x20;

    CodeIterator it = statefulAssembler.getCodeIterator();
    while( it.hasNext() ) {
        cout << "Next byte: " << hex << it.next() << std::endl;
    }

} catch( BaseException &exc ) {
    cout << "Exception while assembling the code." << endl;
}

As you can see, all you have to do to initialize it is pass an assembler instance and an assembler context to the constructor. Then you can use dedicated methods or shift operators to assemble instructions one by one; nothing else is required. As you might have noticed, using it is very similar to using the fcml::IB class described earlier. In fact it uses fcml::IB internally, so the rules are very similar.

There is one thing you have to be aware of when using it. As you can see, instructions are built mnemonic by mnemonic and operand by operand, but there is no explicit "now assemble the pending instruction" step. A pending instruction is assembled whenever something that starts a new instruction, such as a new mnemonic, is pushed into the assembler: the pending instruction is assembled first and only then is the new one placed into the instruction builder. The last instruction is assembled lazily, when you ask for the result by getting a code iterator or a vector of assembled instructions. You can also flush the pending instruction on demand, either by passing StatefulAssembler::FLUSH() to the builder using the shift operator or by calling the "flush()" method directly:

statefulAssembler << M_CALL << OB::offq(0x41d719) << StatefulAssembler::FLUSH();

This is quite an important rule, because such lazily invoked assembling may fail and throw an exception, so you should always be prepared for it.
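The "pending instruction" rule can be modelled in a few lines. The following is a minimal sketch and not FCML code: ToyStatefulAssembler and all of its methods are hypothetical, instructions are kept as plain strings, and a mnemonic starting with "bad" stands in for a real assembling failure. The sketch only illustrates when the assembling step runs and therefore where an exception may surface.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model of the "pending instruction" rule. Not FCML code.
class ToyStatefulAssembler {
public:
    ToyStatefulAssembler() : _hasPending( false ), _operandCount( 0 ) {}

    // Starting a new instruction first assembles the pending one.
    void begin( const std::string &mnemonic ) {
        flush();
        _pending = mnemonic;
        _operandCount = 0;
        _hasPending = true;
    }

    // Appends an operand to the pending instruction.
    void operand( const std::string &text ) {
        if( !_hasPending ) {
            throw std::logic_error( "operand without a mnemonic" );
        }
        _pending += ( _operandCount == 0 ? " " : ", " );
        _pending += text;
        ++_operandCount;
    }

    // Assembles the pending instruction, if any. This is the step that can throw.
    void flush() {
        if( !_hasPending ) {
            return;
        }
        _hasPending = false;
        // Stand-in for a real assembling failure.
        if( _pending.compare( 0, 3, "bad" ) == 0 ) {
            throw std::runtime_error( "cannot assemble: " + _pending );
        }
        _assembled.push_back( _pending );
    }

    // Asking for the result flushes lazily, so it can throw as well.
    const std::vector<std::string> &result() {
        flush();
        return _assembled;
    }

    // Number of instructions assembled so far, without forcing a flush.
    std::size_t assembledCount() const {
        return _assembled.size();
    }

private:
    std::string _pending;
    bool _hasPending;
    std::size_t _operandCount;
    std::vector<std::string> _assembled;
};
```

In this model, calling begin( "push" ) and operand( "rbp" ) leaves assembledCount() at zero; the instruction is assembled only when the next begin(), an explicit flush() or result() is called, and that is also the point where a failure is reported.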

There is also a way to assemble textual instructions after passing true as the third parameter to the StatefulAssembler constructor:

try {

    IntelDialect dialect;
    AssemblerContext ctx( EntryPoint::OM_64_BIT, 0x404ddc );

    Assembler assembler(dialect);

    // Passing true as the third argument enables the internal parser.
    StatefulAssembler statefulAssembler(assembler, ctx, true);

    statefulAssembler.getParserConfig().setDisableSymbolsDeclaration(false);

    SymbolTable st;
    st.add( _FT("_start"), 0x41d719 );
    
    statefulAssembler.setSymbolTable( &st );

    statefulAssembler << _FT( "push rbp" );
    statefulAssembler << _FT( "mov rbp, rsp" );
    statefulAssembler << _FT( "sub rsp, 0x20" );
    statefulAssembler << _FT( "_label: mov dword [rbp-0x14], edi" );
    statefulAssembler << _FT( "mov qword [rbp-0x20], rsi" );
    statefulAssembler << _FT( "mov esi, 0x62d260" );
    statefulAssembler << _FT( "mov edi, 0x41dbe8" );
    statefulAssembler << _FT( "call _start" );

    Symbol symbol = st.get( _FT("_label") );

    CodeIterator it = statefulAssembler.getCodeIterator();
    while( it.hasNext() ) {
        cout << "Next byte: " << hex << it.next() << std::endl;
    }

} catch( BaseException &exc ) {
    cout << "Exception while assembling the code." << endl;
}

In such a case an internal parser instance is allocated and used to parse instructions just before they are passed to the assembler.

This component is very simple, so take a look at the source code to understand how it works in detail.

Disassembler

The whole disassembler functionality is wrapped in the Disassembler class, which is a wrapper over the native fcml_fn_disassemble function. The class itself is rather small and consists of 100 lines of code, but there are a lot of data containers. Most of them are counterparts of the structures used to return information about disassembled instructions, so they are not described here in detail. Head over to the API documentation to master them. When looking at the API, pay attention to the overloaded operators, because in some cases they can be really useful, especially for common classes like Integer.

OK, let's get back to the disassembler. To prepare a working disassembler you have to allocate a dialect and pass it to the Disassembler constructor. That is all, as long as the constructor succeeds, but be aware that it might fail if there is any problem with the initialization of the FCML disassembler. Bear in mind that the majority of the wrappers here use C functions to allocate the resources they need. The disassembler is no exception: it uses the fcml_fn_disassembler_init function, which can also fail in some circumstances. Once the disassembler is initialized you only need a DisassemblerContext, and after initializing it you can start playing with the disassembler. The following code shows the disassembling process as a whole:

try {
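	// "code" is assumed to be a byte array holding the machine code to disassemble.
	IntelDialect dialect;
	Disassembler disassembler( dialect );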

	DisassemblerContext ctx( code , sizeof( code ) );
	ctx.setIP(0x401000);
	ctx.setOperatingMode( EntryPoint::OM_32_BIT );

	DisassemblerResult result;
	disassembler.disassemble( ctx, result );

	const Instruction &instruction = result.getInstruction();

	fcml_cstring instructionMnemonic;

	Renderer renderer( dialect );
	RenderConfig config;
	renderer.render( config, result, instructionMnemonic );

	cout <<  instructionMnemonic;

} catch( BaseException &exc ) {
	cout << "Disassembling failed." << endl;
}

As you might have noticed, the disassembled instruction is rendered immediately. The rendering process is so straightforward that it doesn't have a dedicated chapter. All you have to do is allocate a Renderer instance for a dialect and pass a configuration to it. DisassemblerResult carries the same set of details as the plain fcml_st_disassembler_result structure, so it is not explained again here. The only difference from the C version is one configuration flag that controls error handling. It can be used to disable exceptions for the "disassemble()" and "render()" methods. It is a consistent rule that the renderer, parser, assembler and disassembler can return a classic fcml_ceh_error error code instead of throwing exceptions. Bear in mind, however, that their constructors still throw exceptions if anything goes wrong, so you cannot avoid exceptions entirely.

As was pointed out in the first chapter, every object can be safely copied and doesn't need any special care. For example, you can copy every assembled instruction into a vector without having to deal with any allocation-related issues.

Stateful disassembler

The stateful disassembler automates the process of disassembling instructions one by one. It consists of a DisassemblerContext and the Disassembler itself, and acts as a kind of mediator which controls the whole process. The following code shows how to initialize the stateful disassembler and disassemble four instructions in a row:

try {

    GASDialect dialect;

    Disassembler disassembler( dialect );

    DisassemblerContext ctx( code_buffer , sizeof( code_buffer ) );
    ctx.setIP(0x404ddc);
    ctx.setOperatingMode(EntryPoint::OM_64_BIT);

    StatefulDisassembler disas( disassembler, ctx, true );

    Instruction instruction;

    // Disassemble four instructions one by one.
    disas >> instruction;
    disas >> instruction;
    disas >> instruction;
    disas >> instruction;

} catch( BaseException &exc ) {
    cout << "Exception while disassembling the code." << endl;
}

Take into account that you cannot avoid exceptions when using the stateful disassembler. It is a reasonable decision because, as you might have noticed, in some cases there is simply no way to return a classic error code.
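Why a stream-style interface leaves exceptions as the only error channel can be seen in a small stand-alone model. The sketch below is not FCML code: ToyStatefulDisassembler and DecodedSpan are hypothetical, and the "decoder" only knows the lengths of a handful of opcodes (push rbp, nop, ret, push imm8 and mov eax, imm32). It shows the state that such a component keeps between calls, namely the buffer offset and the instruction pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical result type: where the instruction starts and how long it is.
struct DecodedSpan {
    std::uint64_t ip;
    std::size_t length;
};

// Hypothetical model of a stateful disassembler. Not FCML code.
class ToyStatefulDisassembler {
public:
    ToyStatefulDisassembler( const std::uint8_t *code, std::size_t size, std::uint64_t ip )
        : _code( code ), _size( size ), _offset( 0 ), _ip( ip ) {
        assert( code != nullptr || size == 0 );
    }

    // Decodes the next instruction and advances the internal state.
    // operator>> returns the stream itself, so an error can only be thrown.
    ToyStatefulDisassembler &operator>>( DecodedSpan &out ) {
        if( _offset >= _size ) {
            throw std::out_of_range( "no more code" );
        }
        const std::size_t length = lengthOf( _code[_offset] );
        if( _offset + length > _size ) {
            throw std::out_of_range( "truncated instruction" );
        }
        out.ip = _ip;
        out.length = length;
        _offset += length;
        _ip += length;
        return *this;
    }

private:
    // Toy length table; a real decoder is far more involved.
    static std::size_t lengthOf( std::uint8_t opcode ) {
        switch( opcode ) {
        case 0x55: // push rbp
        case 0x90: // nop
        case 0xc3: // ret
            return 1;
        case 0x6a: // push imm8
            return 2;
        case 0xb8: // mov eax, imm32
            return 5;
        default:
            throw std::invalid_argument( "opcode not known to this toy decoder" );
        }
    }

    const std::uint8_t *_code;
    std::size_t _size;
    std::size_t _offset;
    std::uint64_t _ip;
};
```

Reading past the end of the buffer in this model throws std::out_of_range, which mirrors the point made above: the caller has to be ready for an exception on every extraction.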

The component itself is located in the fcml_stateful_disassembler.hpp header file. If you have any questions do not hesitate to take a look at the implementation. The component is really simple and consists of only about 200 lines of code.

Keep in mind that every class used here is allocated on the stack. It is one of the global rules: you can allocate everything without using the new operator. There are even classes which cannot be allocated using the new operator at all, like the exceptions.

Examples

The following sections describe three example applications available for the FCML library.

fcml-asm

It is a simple console application built during the main build process. It is a one-line assembler which can be used to assemble an instruction into machine code and print the assembled instruction to the console. This application is not installed in the system when "make install" is called, so in order to use it go to the "example/fcml-asm" directory when the build process is finished. This example application is not available for the Visual Studio build.

Example usage:

tas@tas ~/git/fcml/example/fcml-asm $ ./fcml_asm -m32 -ip 0x4001000 "add eax,1"
Number of assembled instructions: 3
Instruction: 1 
 Code: 83c001 
 Code length: 3 
Instruction: 2 
 Code: 81c001000000 
 Code length: 6 
Instruction: 3 
 Code: 0501000000 
 Code length: 5 
Best instruction chosen by the assembler: 1 

In order to get information about all supported options type "./fcml_asm --help".
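The output above lists three valid encodings of "add eax,1" and reports the first, shortest one as the best. A minimal stand-alone sketch of such a selection is shown below. It assumes that "best" simply means "shortest", which matches the output above but is not taken from the FCML sources; Encoding, chooseShortest and addEaxOneEncodings are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Encoding;

// Returns the index of the shortest encoding; the first one wins a tie.
// Assumption: "best" means "shortest". Precondition: candidates is not empty.
std::size_t chooseShortest( const std::vector<Encoding> &candidates ) {
    assert( !candidates.empty() );
    std::size_t best = 0;
    for( std::size_t i = 1; i < candidates.size(); ++i ) {
        if( candidates[i].size() < candidates[best].size() ) {
            best = i;
        }
    }
    return best;
}

// The three encodings of "add eax,1" listed in the output above.
std::vector<Encoding> addEaxOneEncodings() {
    std::vector<Encoding> candidates;
    candidates.push_back( Encoding{ 0x83, 0xc0, 0x01 } );                   // ADD r/m32, imm8
    candidates.push_back( Encoding{ 0x81, 0xc0, 0x01, 0x00, 0x00, 0x00 } ); // ADD r/m32, imm32
    candidates.push_back( Encoding{ 0x05, 0x01, 0x00, 0x00, 0x00 } );       // ADD EAX, imm32
    return candidates;
}
```

chooseShortest( addEaxOneEncodings() ) returns the zero-based index 0, which corresponds to "Instruction: 1" in the one-based numbering used by the tool.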

fcml-disasm

It is a one-instruction disassembler (built as a part of the main build process) which can be used to disassemble a piece of machine code and print the disassembled instruction to the console. This application is not installed in the system when "make install" is called, so in order to use it go to the "example/fcml-disasm" directory when the build process is finished. This example application is not available for the Visual Studio build.

Example usage:

tas@tas ~/git/fcml/example/fcml-disasm $ ./fcml_disasm -m32 -ip 0x4001000 0x678316010203
Basic information: 
 Disassembled instruction: adc dword ptr [0201h],3 
 Mnemonic: adc 
 Instruction hints: FCML_HINT_NEAR_POINTER 
 Number of operands: 2 
  Operands: 
   Operand: 1 
    Type: FCML_OT_ADDRESS 
    Address form: Offset. 
    Segment register: ds (default: true) 
    Size operator: 32 
    Offset: 
     Size: 16 
     Signed: true 
     Value: 0x0201 
   Operand: 2 
    Type: FCML_OT_IMMEDIATE 
    Signed: true 
    Size: 32 
    Value: 0x00000003 
Details: 
 Instruction code: 0x678316010203 
 Instruction code length: 6 
 Pseudo-op: false 
 Shortcut: false 
 ModR/M details: 
  Is RIP: false 
  ModR/M byte: 0x16 
 Prefixes details: 
  Prefixes size in bytes: 1 
  Number of available prefixes: 1 
  Available prefixes (flags): 
  Prefixes fields: 
  Available prefixes (details): 
   Byte: 0x67, Type FCML_PT_GROUP_4, Mandatory: false, XOP/VEX bytes: 0x00, 0x00. 
 Operands details: 
  Operand: 1 
   Access mode: FCML_AM_READ FCML_AM_WRITE 
  Operand: 2 
   Access mode: FCML_AM_READ 

In order to get information about all supported options type "./fcml_disasm --help".
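The ModR/M byte 0x16 and the offset 0x0201 reported above can be reproduced by hand. The sketch below is stand-alone and does not use FCML; ModRM, splitModRM and readLE16 are hypothetical helper names. It splits the ModR/M byte into its three fields and reads the 16-bit little-endian displacement that follows it in the sample code 67 83 16 01 02 03.

```cpp
#include <cassert>
#include <cstdint>

// The three fields of a ModR/M byte.
struct ModRM {
    unsigned mod; // bits 7..6
    unsigned reg; // bits 5..3
    unsigned rm;  // bits 2..0
};

// Splits a ModR/M byte into its mod, reg and r/m fields.
ModRM splitModRM( std::uint8_t byte ) {
    ModRM fields;
    fields.mod = ( byte >> 6 ) & 0x3;
    fields.reg = ( byte >> 3 ) & 0x7;
    fields.rm = byte & 0x7;
    return fields;
}

// Reads a 16-bit little-endian value from two consecutive bytes.
std::uint16_t readLE16( const std::uint8_t *bytes ) {
    assert( bytes != nullptr );
    return static_cast<std::uint16_t>( bytes[0] | ( bytes[1] << 8 ) );
}
```

For 0x16 the fields are mod = 0, reg = 2 and r/m = 6. With opcode 0x83 the reg field selects ADC, and with 16-bit addressing (forced here by the 0x67 prefix in 32-bit mode) mod = 0 with r/m = 6 means a bare 16-bit displacement, so the bytes 01 02 are read as the offset 0x0201 and the last byte 03 is the immediate value.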

hsdis

It is a fully functional disassembler for the Java just-in-time compiler. It can be registered inside the chosen Java process in order to print all the native code generated by the JIT. As opposed to the examples above, it does not take part in the main build process, so in order to build it you have to go to the "example/hsdis" directory and type: "make". After the build process is finished, go to the ".libs" directory where the shared library was built. You will find something like "libsdis.so.0.0.0" in there. It is the built disassembler. This project is also available for the Visual Studio build, and in the case of Visual Studio all you need to do is click build :)

Do not use MinGW/Cygwin to build this example if you do not know how to build it correctly, because by default FCML is not configured to build the library in the appropriate way using these compilers (the previously mentioned problem with symbols). Remember that there are pre-built binaries in the distribution archive.

The next thing to do is to install the library in the correct place. The following table shows the name the library should be renamed to and the place it has to be copied to. For the VS pre-built binaries renaming is not necessary, because the libraries are already available under the correct names.

System/Architecture   Library name      Destination
GNU/Linux-x86         hsdis-i386.so     ${JAVA_HOME}/jre/lib/i386
GNU/Linux-x86_64      hsdis-amd64.so    ${JAVA_HOME}/jre/lib/amd64
Windows-x86           hsdis-i386.dll    ${JAVA_HOME}\bin\server\hsdis-i386.dll
Windows-x86_64        hsdis-amd64.dll   ${JAVA_HOME}\bin\server\hsdis-amd64.dll

In case of Windows, the Java process looks for the library in the same directory where the used jvm.dll is located. So you can also copy the library into all places where different versions of the jvm.dll are located if you do not know which one is being used by the Java process.

Then you have to run Java with appropriate parameters in order to enable the installed disassembler, for instance:

java -XX:+UnlockDiagnosticVMOptions \
     -XX:+PrintAssembly \
     -XX:+LogCompilation \
     -XX:PrintAssemblyOptions=intel,mpad=10,cpad=10,code \
     -jar fcml-test.jar

As a result you should see something like this:

e:\>java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:+LogCompilation -XX:PrintAssemblyOptions=intel,mpad=10,cpad=10,code -jar fcml-test.jar 
Java HotSpot(TM) 64-Bit Server VM warning: PrintAssembly is enabled; turning on DebugNonSafepoints to gain additional output
Loaded disassembler from C:\Program Files\Java\jre7\bin\server\hsdis-amd64.dll
Decoding compiled method 0x000000000212a210:
Code:
RIP: 0x212a340 Code size: 0x00000158
[Disassembling for mach='amd64']
[Entry Point]
[Constants]
  # {method} 'hashCode' '()I' in 'java/lang/String'
  #           [sp+0x30]  (sp of caller)
  0x000000000212a340: 448b5208            mov       r10d,dword ptr [rdx+8h]
  0x000000000212a344: 493bc2              cmp       rax,r10
  0x000000000212a347: 0f8513d7fcff        jne       20f7a60h
                                                ;   {runtime_call}
  0x000000000212a34d: 666690              nop
[Verified Entry Point]
  0x000000000212a350: 89842400a0ffff      mov       dword ptr [rsp+0ffffffffffffa000h],eax
  0x000000000212a357: 55                  push      rbp
  0x000000000212a358: 4883ec20            sub       rsp,20h
                                                ;*synchronization entry
                                                ; - java.lang.String::hashCode@-1
  0x000000000212a35c: 4c8bea              mov       r13,rdx
  0x000000000212a35f: 8b4210              mov       eax,dword ptr [rdx+10h]
                                                ;*getfield hash
                                                ; - java.lang.String::hashCode@1
  0x000000000212a362: 85c0                test      eax,eax
  0x000000000212a364: 0f85da000000        jne       212a444h 

The following assembly options are supported by hsdis:

code
Print the machine code before the mnemonic.
intel
Use the Intel syntax.
gas
Use the AT&T assembler syntax (GNU assembler compatible).
dec
Print immediate values and displacements as decimal values.
mpad=XX
Padding for the mnemonic part of the instruction.
cpad=XX
Padding for the machine code.
seg
Show the default segment registers.
zeros
Show leading zeros in case of HEX values.
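To make the format of the option string more tangible, the sketch below parses a comma-separated list such as "intel,mpad=10,cpad=10,code" into a plain structure. It is only an illustration and is not how hsdis itself is implemented: ToyOptions and parseOptions are hypothetical names, the default values are assumptions, and unknown tokens are silently ignored.

```cpp
#include <sstream>
#include <string>

// Hypothetical container for the options listed above. Defaults are assumptions.
struct ToyOptions {
    bool code = false;
    bool intel = false;
    bool gas = false;
    bool dec = false;
    bool seg = false;
    bool zeros = false;
    int mpad = 0;
    int cpad = 0;
};

// Parses a comma-separated option string, for example "intel,mpad=10,code".
// std::stoi throws if the value after "mpad=" or "cpad=" is not a number.
ToyOptions parseOptions( const std::string &text ) {
    ToyOptions options;
    std::istringstream in( text );
    std::string token;
    while( std::getline( in, token, ',' ) ) {
        if( token == "code" ) {
            options.code = true;
        } else if( token == "intel" ) {
            options.intel = true;
        } else if( token == "gas" ) {
            options.gas = true;
        } else if( token == "dec" ) {
            options.dec = true;
        } else if( token == "seg" ) {
            options.seg = true;
        } else if( token == "zeros" ) {
            options.zeros = true;
        } else if( token.compare( 0, 5, "mpad=" ) == 0 ) {
            options.mpad = std::stoi( token.substr( 5 ) );
        } else if( token.compare( 0, 5, "cpad=" ) == 0 ) {
            options.cpad = std::stoi( token.substr( 5 ) );
        }
        // Unknown tokens are ignored in this sketch.
    }
    return options;
}
```

Parsing the string used in the Java command line above sets the intel and code flags and both paddings to 10, leaving every other option at its assumed default.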