Solar Assembler Reference Manual

for Sol_Asm version 0.38.40

updated 05.02.2018

©   Copyright 2007,2018 Bogdan Valentin Ontanu. All rights reserved.


Chapter 1. Introduction

This document presents an overview of the syntax and usage of Solar Assembler. It assumes that the reader is familiar with assemblers and with assembly language programming.

Throughout this document the following terms and abbreviations are used:

Abbreviation      Description
Sol_Asm           Solar Assembler
OS                Operating system
Win32 or Win64    Windows 32 or 64 bits operating system
PE32 or PE64      Portable Executable Format - 32 or 64 bits
DLL               Dynamic Link Library
CDECL             C default calling convention
STDCALL           Win32 API default calling convention
OMF               Object Module Format - OBJ format specification
COFF              Common Object File Format - OBJ format specification
ELF               Executable and Linking Format - OBJ format specification
HLL               High Level Language

Also in this document curly braces "{}" are used in syntax definitions to enclose a text, name or value that you have to specify.

One exception to this rule is in STRUCTURE initialization section where "{}" are part of the syntax.

1.1 Design Goals

SOL_ASM is designed from the point of view of a programmer who uses ASM as the main programming language. Hence Sol_Asm tries to ease the development of huge ASM-only projects.

However Sol_Asm can also be used as a low level assembler without the help from HLL directives.

Sol_Asm main features are:

In daily usage this means that:

It also means that SOL_ASM does contain a decent amount of HLL features like:

All HLL statements are implemented internally in SOL_ASM and code is generated for them at compile time (not by user included macros). This means that those features can be used to start development with minimal includes.

Of course Sol_Asm is written in assembly language and compiled by Sol_Asm itself. That is why it is named sol_asm2 ... because sol_asm is building sol_asm2 ;)

1.2 Targets

Short term targets

The short term targets until alpha stage have been:

All of the short term targets have been achieved.

Long term targets

The long term targets until beta stage are:

Most (but not all) of the long term targets have also been achieved.

1.3 Fair warning

Solar Assembler is stable and functional but still in development. This means that it still contains a few bugs and a few missing features. Caution is advised but it should be ok for big ASM projects.

Lately Sol_Asm has been used to develop Sol_Asm itself, Solar_OS and the Hostile Encounter RTS game. Those are big and complex projects and Sol_Asm has proved itself capable of handling them.

1.4 OS specific versions

This document assumes you are using the Win32 version of Sol_Asm. Other OS versions and details are not fully presented here.

However SOL_ASM OS specific versions are almost the same and share 99% of code with Win32 versions.

The only differences are:




Chapter 2. Running Solar Assembler

2.1 Invocation

You can execute SOL_ASM from the command line like this:


	sol_asm2 {input_file} {output_file} {-options}


	sol_asm2 {-options} {input_file} {output_file} 


	sol_asm2  -pe32  my_game.asm  my_game.exe 

2.2 Options

All command line options must be specified with the "-" prefix character. On Windows you can also use the "/" prefix character.

Help options:

Option Action Obs.
-h, -help This will print all help text and then exit  
-h0, -h1, -h2, -h3, -h4 This will print a limited part of help text and then exit
  • h0: help options
  • h1: format options
  • h2: PE sub-systems
  • h3: Other options
  • h4: Info options

Output options:

Option      Action
-pe32       Generate a Win32 Portable Executable.
-pe64       Generate a Win64 PE executable.
-console    Set the console sub-system (for Win32).
-dll        Set the DLL characteristics of the PE (for Win32), i.e. make a DLL.
-binary     Generate a plain binary, useful for OS development or handcrafted formats.
-omf32      Generate an OBJ file in OMF format. This OBJ can be linked with the ALINK linker.
-coff32     Generate a 32 bit OBJ file in MS-COFF format. This OBJ can be linked with MS link, Polink, GOlink and other linkers, and used in projects that link multiple modules together as OBJ.
-coff64     Generate an OBJ file in COFF 64 format.
-elf32      Generate a 32 bit OBJ file in ELF format. It can be linked with LD or GCC on Unix like systems.
-elf64      Generate a 64 bit OBJ file in ELF format. It can be linked with LD or GCC on Unix like systems.
-mac32      Generate an OBJ file in MACHO format. This is still experimental or not finished.

For Mac OS X it is still recommended that you use the ELF32/64 format and then convert from ELF to the MachO OBJ format before linking (e.g. by using objconv by Agner Fog).

Other options:

Option        Action
-q            Be quiet: only error messages are shown (useful for makefiles).
-d equ_name   Define the equ_name symbol at the command line. The value of the symbol is 1 (one) and can be tested later in source code.
-size         Optimize the output for size. Using this option will usually result in more passes being done.
-dbg          Generate debug info. Works for PE32, ELF and COFF OBJ; other debug formats and levels will follow.
-list         Generate a listing file named output_filename_list.lst. One extra pass will be done for the listing.
-list_pass    Generate a series of listing files named _list_1.lst list_2.lst ..., one file for each pass. Compile speed will be slower with this option.
-bench        Show a compiler/parser speed benchmark.

Info options:

Option        File name suffix   Action
-info                            Generate OllyDbg specific info. This can be loaded into OllyDbg with the Labelmaster plugin.
-info_proc    _proc.lst          Generate a list of all PROC's and their arguments.
-info_stru    _stru.lst          Generate a list of all STRUC's and their members info.
-info_equ     _equ.lst           Generate a list of all EQU items and their values.
-info_enum                       Generate a list of all ENUM items and their values as EQU definitions.
-info_tkn     _tkn.lst           Generate a list of known opcodes and directives.
-info_files   _files.lst         Generate a list of include files and folders.
-info_reloc   _reloc.lst         Generate a list of relocations.
-info_sect    _sections_N.lst    Generate a list of sections for each pass.
-info_all                        Generate all of the above info.

The info_files option

If you have a main ASM file that includes all other files then Sol_ASM will parse this tree and generate a list of all the files included in your project. The sub folders of your project's main ASM file will become "groups".

The generated list of files is in RadASM INI format. You can copy and paste it into a dummy project in order to convert your existing non-RadASM project into a RadASM project.

2.3 OllyDbg specific debug info

Sol_Asm can generate a file named: output_filename_info.txt that will contain a list of your application LABELS, PROC's and their addresses. This file can be loaded in OllyDbg by the LabelMaster plug-in and can help symbolic debugging a lot. You will be able to see familiar code labels, PROC names, variable names and call stack in OllyDbg.

You can obtain Labelmaster plugin here

The same result can be obtained with the -dbg command line option, which generates debug info inside the OBJ or PE32 files. However this simple ASCII format has some advantages (multiple addresses with the same name) and can easily be used with your own custom debugging utilities.

Chapter 3. Program Setup

A series of initial statements is required to make a valid program. This is usually called the "red tape" that wraps the package.

Here is a sample of the most simple Sol_Asm program:

; minimal test file
section "code" class_code


Only one declaration is absolutely required by SOL_ASM: the section declaration.

3.1 Sections

SOL_ASM divides a program into multiple sections.

You define a section like this:


	section {"section name"} {section_type}


	section "code" 	class_code
	section "data"  class_data
	section "idata" class_imports

At least one section must be defined before any code generation.

Section type    Description                 Attributes
class_code      for code                    CODE, EXECUTE, READ
class_data      for initialized data        INITIALIZED, READ, WRITE
class_bss       for not initialized data    READ, WRITE, RAW size = 0
class_imports   for imports                 INITIALIZED, READ, WRITE
class_relocs    relocations                 INITIALIZED, READ, WRITE
class_exports   exports                     INITIALIZED, READ
class_rsrc      for resources               work in progress

After you have defined your sections, you can switch between sections in the program body with ".section_name" like this:

	.code
	; enter your code here
	.data
	; enter some data definitions here
	.code
	; return / continue to code section

3.1.2 Section Name Alias

When defining a section you can provide an alias name like this:


	SECTION	{section_name_for program} ALIAS {section_name_for_OS}

This is useful for linkers that have default section naming conventions or like to unite sections based on section name.

For example:


This will allow to use the familiar ".code" and ".data" section selectors in your program and still output a section name according to your linker's preferences.

Alternatively you can name your section ".text" and select it with "..text".

3.2 Imports

If your program is in PE32 or PE64 format then you can specify the imported DLL's and the functions imported from each DLL.

3.2.1 Imports Definition

You define imports like this:


	FROM	 	{dll_name}
	IMPORT		{function_name} {[param_count]} {calling convention} ALIAS {alias_name} 


	from	 	kernel32.dll
	import		ExitProcess
	import		GetStdHandle [1] STDCALL ALIAS _GetStdHandle@4

	from		user32.dll
	import		MessageBox ALIAS MessageBoxA

The above example will import:

Each "import" statement belongs to the previous "from" statement.


Alternative names for import keywords are:

3.2.2 Imports Alias

When importing an API name you can provide an alias name like this:


	IMPORT	{function_name_for program} ALIAS {function_name_for_OS}

For example:

	import	MessageBox alias MessageBoxA

This will allow your program to refer to ASCII or UNICODE versions of API using a single API name across your code.

3.2.3 Calling convention for imports

By default Sol_Asm considers all imported functions to be STDCALL for binary and 32 bits format and WIN64 for 64 bits format.

You can establish a different calling convention of an imported function like this:


	import	Str_Printf CDECL/win64/stdcall/lin64

Additionally you can add the "varg" statement to mark an imported function that takes variable arguments. This is useful for the lin64 calling convention.

3.2.4 Argument count for Imports

Sol_Asm does not need procedure prototypes because it will extract this information from your PROC definition, even if the definition appears after the procedure is used. However imported API's are not defined in your sources and hence by default Sol_Asm will not check the argument count for imported functions.

You can define the argument count for imports like this


	IMPORT	{function_name} [{argument_count}]


	import	MessageBox [4] alias MessageBoxA

In this case Sol_Asm will check an INVOKE statement to have 4 parameters and the IMPORT statement acts as a mini prototype.


3.3 External symbols

With EXTERN you can define symbols external to your module (defined in other modules). This is useful when you generate OBJ files and link them together with an external linker.


	EXTERN	{function_name} [{argument_count}] ALIAS {function_alias}


	extern AddAtom [1] alias _AddAtomA@4

EXTERN is similar to IMPORT but it does not need a FROM_DLL statement.

EXTERN symbols will be solved by the linker at link time. Depending on your linker configuration they can be linked in statically or dynamically.


3.4 Exports

Using EXPORT you can export a procedure or a label from your program. It works for PE32, DLL, COFF and ELF output formats.

For OBJ output formats it has the effect of making your symbol PUBLIC.

3.4.1 Define Exports

You define exported functions like this:


	EXPORT {proc_or_label_name} ALIAS {export_name_for_output}

3.5 Entry point

By default the entry point is at the start of the first section defined.

However you can specify another location like this:

	.ENTRY {symbol_name}


	.entry App_Init
		call	Main
		invoke	ExitProcess


This will define "App_Init" label as the entry point of your program.


3.6 Base address, ORG and DISP

These keywords allow you to set up the address where your code will or should be placed in memory at run time.

3.6.1 Base address

The default base address is set up like this:

You can set up a base address like this:


	BASE	{absolute_or_virtual_address}


	BASE	02000_0000h

This will make your executable base address start at 512M. The base address is the address of the first section. Each section is aligned at 4K and PE files also have an additional 4K header before the first section starts.

BASE has the same effect as an ORG and a DISP with the same value.
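The arithmetic behind the 512M figure can be checked with a short Python sketch. The helper parse_hex and the constant PE_HEADER_SIZE are names invented for this illustration, and placing the first PE section at base + 4K is an assumption drawn from the description above, not a statement about Sol_Asm internals.

```python
PE_HEADER_SIZE = 4 * 1024  # 4K header before the first section (assumed from the text)


def parse_hex(literal: str) -> int:
    """Parse a hex literal written like '02000_0000h' (illustrative helper)."""
    digits = literal.rstrip("hH").replace("_", "")
    return int(digits, 16)


base = parse_hex("02000_0000h")
first_section = base + PE_HEADER_SIZE  # assumed address of the first PE section

print(base // (1024 * 1024), "MiB")  # 512 MiB
print(hex(first_section))            # 0x20001000
```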

3.6.2 ORG

If you want a piece of code to be located at an absolute address (for OS development) or if you want to "jump around" inside code positions you can use the ORG directive:


	ORG	{absolute_address}

The ORG directive moves both the current address counter and output pointer in output file to the specified address.

Code and data will be generated at the new address and offset in output file.

3.6.3 DISP

The DISP directive moves the output pointer backward by a certain amount. The reason for this is to avoid having zeroes at the start of the output file after an ORG directive.


	DISP	{negative_move_size}

For example:

	org	0B000h
	disp	0B000h

These are the first lines of the Solar_OS System32 module.

This means that code is made to run at absolute address 0xB000 but the output pointer remains at:

	output_position = + B000h (because of org) - B000h (because of disp) = 0

And this way the generated binary will not contain 0xB000 zeroes or garbage at start of file.
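The interaction of ORG and DISP can be modelled with two counters. The following Python sketch is only a toy model of the behaviour described above; the class OutputModel and its field names are invented for illustration.

```python
class OutputModel:
    """Toy model of the address counter and the output file pointer."""

    def __init__(self) -> None:
        self.address = 0    # run-time address of the next byte
        self.file_pos = 0   # position of the next byte in the output file

    def org(self, target: int) -> None:
        # ORG moves both the address counter and the output pointer.
        self.address = target
        self.file_pos = target

    def disp(self, amount: int) -> None:
        # DISP moves only the output pointer backward.
        self.file_pos -= amount


m = OutputModel()
m.org(0xB000)
m.disp(0xB000)
print(hex(m.address), m.file_pos)  # 0xb000 0
```

In this model the code still believes it runs at 0xB000, while the first generated byte lands at offset 0 of the output file.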

3.7 Encoding Modes

Sol_Asm can encode 16bits, 32bits and 64bits ASM.

You switch between 16/32/64 bits encoding with:

	.USE16		- encode 16 bits
	.USE32		- encode 32 bits (default)
	.USE64		- encode 64 bits 

3.8 Include files

Your main asm file can include other asm files and so on.


	include		{include_path_and_file_name}

Or alternatively you can include binary files.


	incbin		{include_path_and_file_name}

You can also fine tune the binary include:


	incfrom		{start_pos}, {size}, {include_path_and_file_name}

In this case all 3 parameters must be present.

Include size can be "?" if you want to include until the end of the file.


	incfrom		512,1027,help2.txt	; skip 512 bytes, include 1027 bytes
	incfrom		128,?,help2.txt		; skip 128 bytes, include rest of file
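The byte range selected by the two examples can be pictured as a slice of the file contents. This Python sketch assumes that start_pos is a zero-based byte offset; the function name incfrom merely mirrors the directive, and None stands in for the "?" size.

```python
from typing import Optional


def incfrom(data: bytes, start_pos: int, size: Optional[int]) -> bytes:
    """Return the bytes an 'incfrom start_pos,size,file' line would select.

    size=None stands in for the "?" form (include until end of file).
    """
    if size is None:
        return data[start_pos:]
    return data[start_pos:start_pos + size]


sample = bytes(range(256)) * 8          # 2048 bytes of stand-in file content
print(len(incfrom(sample, 512, 1027)))  # 1027
print(len(incfrom(sample, 128, None)))  # 1920
```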

Chapter 4. Language elements

4.1 Numbers

SOL_ASM accepts numbers in the following formats:



	111_000_101b		- binary
	0FFFF_C0_00h		- hexadecimal
	1_000_000		- decimal

	10.2345			- floating point

Numbers do not have to start with "0" or a digit but that is good practice.
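To make the values of the sample literals concrete, here is a small Python sketch that interprets the four formats shown above. It covers only those formats (suffix "b", suffix "h", plain decimal, and a decimal point), and parse_number is a hypothetical helper, not part of Sol_Asm.

```python
def parse_number(text: str):
    """Parse the number formats shown above; underscores are digit separators."""
    clean = text.replace("_", "")
    if "." in clean:
        return float(clean)          # floating point, e.g. 10.2345
    if clean[-1] in "hH":
        return int(clean[:-1], 16)   # hexadecimal, e.g. 0FFFF_C0_00h
    if clean[-1] in "bB":
        return int(clean[:-1], 2)    # binary, e.g. 111_000_101b
    return int(clean, 10)            # decimal, e.g. 1_000_000


print(parse_number("111_000_101b"))  # 453
print(parse_number("0FFFF_C0_00h"))  # 4294950912
print(parse_number("1_000_000"))     # 1000000
```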


4.2 Expressions

Expressions are statements like:

	<1 SHL 5>

Expressions can contain Numbers, Operators, Braces and Symbols

Operators are:

Operator    Description                                Priority
"*"         multiplication                             1
"/"         division                                   1
+           addition                                   2
-           subtraction                                2
x SHL n     shift x left n times                       1
x SHR n     shift x right n times                      1
x ROL n     rotate x left n times                      1
x ROR n     rotate x right n times                     1
x XOR y     binary XOR                                 1
x AND y     binary AND                                 1
x OR y      binary OR                                  1
NOT x       binary NOT                                 2
-           unary minus                                1
RND N       obtain a random number in range [0...N]    1

Variables recognized in expressions

Variable        Description                            Priority
$               current address                        2
$adr            current address                        2
$$              current section base addr              2
$$$             format base addr                       2
$ofs            current offset in section              2
$rva symbol     RVA of symbol                          2
$pass           current pass nr                        2
$style token    token style (token, string modrm)      2
$type token     token type (register, label, etc)      2
$size token     token size (8,16,32,64 bits)           2
$value token    token value (reg code, label addr)     2


For example:

	< 1 SHL 5 >

The above expression does contain spaces and was therefore enclosed in < and >

4.3 ModRM Expressions

These are expressions used by the CPU in complex effective address calculations. This kind of expression is handled differently from normal expressions by Sol_Asm.

The generic layout is:


	[{base_reg} + {scale}*{index_reg} + {displacement}]


	mov	eax,[esi + 4*ecx + 1234h]
	mov	eax,[esi + INFO_CTX.name_len]

In the first example above:

	base_reg 	= esi
	scale		= 4
	index_reg	= ecx
	displacement	= 1234h

As per CPU specifications, the scale can be omitted (which means 1) or be 2, 4 or 8 only.
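The address that the CPU computes from such an expression can be worked out by hand. The Python sketch below does so for the first example, using made-up register values (esi = 1000h, ecx = 3); effective_address is a hypothetical helper and the 32-bit wrap-around assumes 32-bit addressing.

```python
def effective_address(base: int, index: int = 0, scale: int = 1,
                      displacement: int = 0) -> int:
    """Compute base + scale*index + displacement, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1 (omitted), 2, 4 or 8")
    return (base + scale * index + displacement) & 0xFFFFFFFF


# mov eax,[esi + 4*ecx + 1234h] with sample values esi=1000h, ecx=3
print(hex(effective_address(0x1000, index=3, scale=4, displacement=0x1234)))  # 0x2240
```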


4.4 Small strings

This is a special kind of string that can be used as an instruction operand.

For example:

	mov	eax,"abcd"
	cmp	al,"-"

You can use the SWAP modifier to reverse the string

For example:

	cmp	eax," rox"		; compare with "xor " in reverse because of endian issues
	cmp	eax, swap "xor "	; same as above but much easier to read
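The endian issue mentioned in the comment can be illustrated in Python. This sketch assumes, based only on the comment in the example above, that the first character of a small string becomes the most significant byte of the operand; small_string and swap are illustrative names, not Sol_Asm functions.

```python
def small_string(text: str) -> int:
    """Operand value of a small string: first character is the most significant byte (assumed)."""
    return int.from_bytes(text.encode("ascii"), "big")


def swap(text: str) -> int:
    """Model of the SWAP modifier: reverse the string before encoding."""
    return small_string(text[::-1])


value = small_string(" rox")
print(value.to_bytes(4, "little"))           # b'xor '  (byte order as stored in memory)
print(swap("xor ") == small_string(" rox"))  # True
```

Under that assumption, comparing EAX with " rox" matches the four bytes "xor " as they sit in little-endian memory, which is why the SWAP form is easier to read.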

4.5 User defined symbols

They are used as names for labels, procedures, etc in the program.

For Example:


User defined symbols are case sensitive and can contain underscores "_", digits and special characters, but can not contain CR, LF, space, "<", ">" or comma.

They do not have to start with a letter... (but that is good practice).

The max symbol size is 128 bytes.


4.5.1 Line comments

This kind of comment starts with the ";" character and extends until the end of the line.

For example:

	; this is a single comment on a line

	mov	eax,1		; this comment is at end of line

4.5.2 Block comments

Block comments are made with: "/*" and "*/"

For example:

; comment out this debug code
	ODS_str	<13,10,"+++ Equ_Create">
	; notice here how block comments can be nested
	ODS_fmt	<13,10,09,"equ_create: value=%x">, eax


4.6 Very Long Lines

You can continue a long line on the next line with the "\" symbol.

For Example:

invoke	CreateWindowExA,0,class_name,wnd_title,\

4.7 Keywords

Keywords are:

For Example:

	MOV, XOR, ADD, SUB, JMP, CALL		- are opcode mnemonics 
	EAX, ECX, ST0, MM0, RAX, XMM1		- are register names
	PROC, STRUC, .entry, ORG, INVOKE	- are SOL_ASM directives

Keywords are case insensitive.

4.8 Special symbols

SOL_ASM rarely treats a symbol in a special way. All symbols are born equal :P However there are exceptions:

Special Character Description Notes
SPACE, TAB or "," are used as separators for tokens can not be part of user tokens
CR, LF line end and separators for tokens can not be part of user tokens
":" it defines a code label when used as suffix after a user symbol can be part of user tokens
"$" means current address counter can be part of user tokens
"?" means "do not care" / "non initialized" in data define statements can be part of user tokens
"." hints a section selection when followed by a section name
means a structure member name separator in {structure}.{member}
can be part of user tokens
" " (double quotes) encloses a string can be part of user tokens
' ' (single quotes) encloses a string can be part of user tokens
< > means LITERAL, multiple tokens enclosed by < > and separated by spaces or comma will be considered as one ;) has use restrictions
"[" "]" encloses a Mod_RM address expression has use restrictions
"{" "}" used to enclose structure initializations statements
also used instead of < >
can be part of user tokens but has use restrictions

The following symbols have a special meaning only inside a MACRO body:

Special Character Description Notes
"@" means MLOCAL when used as a prefix inside a MACRO can be part of tokens
"&" triggers MARG check and expansion even when inside a token can be part of tokens

As you can see Sol_Asm is relatively tolerant toward the use of special symbols inside user defined tokens.

4.8.1 Build Time

The special symbol "$time" means current build time in OS_TIME format and it creates a data definition.

	year		dw	?
	month		dw	?
	day_of_week	dw	?
	day_of_month	dw	?
	hour		dw	?
	minute		dw	?
	second		dw	?
	mili_sec	dw	?

When included in the source, this data definition will be updated by SOL_ASM at each compile.

For example:

	; compile time symbol, 
	; value is filled in by assembler
	db	0	

Chapter 5. Data definitions

5.1 Define initialized data

You can define initialized data like this:


	{label_name}	db	{data_item}	; define byte 	8 bits
	{label_name}	dw	{data_item}	; define word	16 bits
	{label_name}	dd	{data_item}	; define dword	32 bits

	{label_name}	dq	{data_item}	; define qword	64 bits
	{label_name}	dt	{data_item}	; define tword	80 bits
	{label_name}	do	{data_item}	; define oword	128 bits


Additionally "db" does accept ASCII strings.

For example:

	my_dwords	dd	1,2,7,0FACE_BABEh,1356789,11

	my_string	db	"This is a message",0 

You can use the "?" special character to define non initialized data.

For example:

	my_var_db	db	?
	my_var_dw	dw	?
	my_var_dd	dd	?

5.2 Define unicode strings

You can define a Unicode string like this:


	{label_name}	du		"type your utf-8 string here",0	

The parser will read and interpret utf-8 encoded code points from the quoted string and translate them to 16 bits words.
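The translation from UTF-8 code points to 16-bit words can be sketched in Python. This is only an illustration of the idea: the helper du_words is hypothetical, and the use of UTF-16 encoding (which would produce surrogate pairs for code points above FFFFh) is an assumption, since the text above does not say how such code points are handled.

```python
import struct


def du_words(utf8_bytes: bytes) -> list:
    """Decode UTF-8 source bytes and return the 16-bit words a 'du' line might emit."""
    text = utf8_bytes.decode("utf-8")
    raw = text.encode("utf-16-le")  # two bytes per 16-bit word
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


# 1-, 2- and 3-byte UTF-8 sequences: 'A', e-acute, euro sign
source = "A\u00e9\u20ac".encode("utf-8")
print([hex(w) for w in du_words(source) + [0]])  # ['0x41', '0xe9', '0x20ac', '0x0']
```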

5.3 Define Floating point data


	{label_name}	real4		{data_item}	; define REAL4 number	- 32bits
	{label_name}	real8		{data_item}	; define REAL8 number	- 64bits
	{label_name}	real10		{data_item}	; define REAL10 number	- 80bits	

For example:

	test1		real4		10.2345
	test2		real4		0.7785
	test3		real4		1_277_789.534
	test4		real4		999_123_456_789.37

	test1b		real8		10.2345
	test2b		real8		0.77854321773
	test3b		real8		1_277_789.534
	test4b		real8		999_123_456_789.37

	test1c		real10		10.2345
	test2c		real10		0.77854321773
	test3c		real10		1_277_789.534
	test4c		real10		999_123_456_789.37	

SOL_ASM performs real number conversions into the highest floating point precision available (80bits) and stores the result in requested format. Because of this "test4" above can not retain all defined digits but "test4c" can do it.
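The loss of digits in "test4" can be reproduced with Python's struct module, which stores a value in a 32-bit or 64-bit IEEE format and reads it back. Python has no native 80-bit type, so REAL10 is not modelled here; round_trip is an illustrative helper.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store a value in the given IEEE format ('<f' = 32 bits, '<d' = 64 bits) and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


original = 999_123_456_789.37
as_real4 = round_trip(original, "<f")
as_real8 = round_trip(original, "<d")

print(as_real4 == original)  # False: 24 significant bits cannot hold all digits
print(as_real8 == original)  # True: the Python float is already a 64-bit double
```

At this magnitude neighbouring 32-bit values are 65536 apart, so everything after the first seven or so significant digits is lost in REAL4.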

5.4 Reserve non initialized data

You can reserve data with the following keywords:

	{label_name}	rb	{count} 	; reserve byte(s)  =   8 x  bits
	{label_name}	rw	{count} 	; reserve word(s)  =  16 x  bits	
	{label_name}	rd	{count} 	; reserve dword(s) =  32 x  bits
	{label_name}	rq	{count} 	; reserve qword(s) =  64 x  bits
	{label_name}	rt	{count}		; reserve tbytes   =  80 x  bits
	{label_name}	ro	{count}		; reserve owords   = 128 x  bits

You can reserve structures like this:

	rs	{structure_name},{count} 	- reserve {count} structures	

For example:

	rb	1024		; reserve 1024 bytes	
	rw	17		; reserve   17 words
	rd	23		; reserve   23 dwords
	rs	WNDCLASS,77	; reserve   77 WNDCLASS structures	
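The amount of space reserved by each line is the element size multiplied by the count. The Python sketch below works this out for the examples above; the table ELEMENT_BYTES follows the bit widths listed for the reserve keywords, and the structure size of 40 bytes is only a stand-in for SIZE WNDCLASS.

```python
# Element sizes in bytes, derived from the bit widths listed above.
ELEMENT_BYTES = {"rb": 1, "rw": 2, "rd": 4, "rq": 8, "rt": 10, "ro": 16}


def reserved_bytes(keyword: str, count: int) -> int:
    """Bytes reserved by '{keyword} {count}'."""
    return ELEMENT_BYTES[keyword] * count


def reserved_struct_bytes(struct_size: int, count: int) -> int:
    """Bytes reserved by 'rs {structure},{count}' for a structure of the given size."""
    return struct_size * count


print(reserved_bytes("rb", 1024))     # 1024
print(reserved_bytes("rw", 17))       # 34
print(reserved_bytes("rd", 23))       # 92
print(reserved_struct_bytes(40, 77))  # 3080, with 40 as a stand-in structure size
```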

5.5 Fill data buffers

You can fill initialized data buffers with the following keywords:

 {label_name}	fb	{count},{fill_value} 	; fill bytes	
 {label_name}	fw	{count},{fill_value} 	; fill words		
 {label_name}	fd	{count},{fill_value} 	; fill dwords	
 {label_name}	fq	{count},{fill_value} 	; fill qwords	

And for structures:

 {label_name}	fs	{struc},{count},{fill_value}	; fill structures 

5.6 Structure data definitions

Structure definitions are automatically promoted as data types and you can define a structure like this


	{label_name} {structure_name}	{data_item}

For example:

	my_class	WNDCLASS	?	

This defines one WNDCLASS structure at label "my_class" with an unknown initial value, and one PAINTSTRUCT structure at label "my_ps".

5.7 Structure Member data initializations

Considering the structure:

		x	dd	?
		y	dd	?
		z	dd	?

You can initialize structure members like this:

	my_pt_1		POINT_3D	{ 1 2 3 }				

	my_pt_2		POINT_3D	{  y = 2  z = 7  x = 1 }

	my_pt_3		POINT_3D {
					x = 2
					y = 7
					z = 1
				}

You can initialize sub structure members by name like this:

		x	dd	?
		y	dd	?
		color		dd	?
		center		rs	POINT_2D,1
		radius		dd	?
	my_circle_var	CIRCLE_2D {
					color = 00_7F_FF_3Fh
					center.x = 100
					center.y = 200
					radius = 77
				}

You can nest {} like this

	my_circle_var	CIRCLE_2D {
					color = 00_7F_FF_3Fh
					{ 100 200 }
					radius = 77
				}
	; or by name
	my_circle_var	CIRCLE_2D {
					color = 00_7F_FF_3Fh
					{ x = 100 y = 200 }
					radius = 77
				}

You can also use {} for members that are made of multiple items but are not typed as structures (RB, RW, RD, RQ, RO) like this:

struc GUID
	dd1	dd	?
	dw1	dw	?
	dw2	dw	?
	bytes	rb	8

my_guid	GUID { aaaa_bbbbh,cccch,ddddh { 1 2 3 4 5 6 7 8 } }

Chapter 6. General Code Syntax

SOL_ASM follows Intel style ASM syntax as opposed to AT&T syntax. The syntax reflects my personal preferences, resulting from writing extensive applications in ASM. The following sub chapters present the most notable syntax issues.

6.1 Default ASM instruction syntax

Each ASM source line has this default layout:


	{label} {instruction} {parameter1} , {parameter2} {; comments}

For example:

read_pixel:	mov	eax,[esi]	; 32 bits ARGB format

All elements can be missing but if a {parameter} is present then {instruction} must also be present.

Directives do not have to follow this syntax.

6.2 Offset keyword and use of []

There is no "offset" keyword. The name of a variable or label automatically means "offset of". As a consequence you must always use brackets to obtain the "contents of" a variable.

For example:

		my_var		dd	37h
		mov	esi,my_var
		mov	edx,[my_var]
		mov	edx,[esi]		

In the above example the first MOV will fill ESI register with the offset of my_var (ie with 0x402000 for example)

The second MOV will fill EDX with the content of my_var (ie with 37h in this example). The third MOV will do the same but by using ESI as a pointer to my_var.

Notice the similarity of the second MOV with: MOV EDX,[ESI]

6.3 Size overrides

When needed or when wanted the user can override the operand size of encodings.

Available overrides are:

	byte		- force   8 bits
	word		- force  16 bits
	dword		- force  32 bits
	qword		- force  64 bits
	tbyte		- force  80 bits
	oword		- force 128 bits

	small		- force low word of symbol 

6.4 Structure members

Let us assume we have defined the following structure:

		info_name		rb	128
		info_dword		dd	?
		info_word		dw	?
		info_byte		db	?	

And then we reserve a vector of 1024 such structures:

	my_info		rs	INFO_CTX,1024

Then the following rules apply for accessing structure members:

	mov	esi,my_info
	mov	eax,[esi + INFO_CTX.info_dword]		
	mov	[esi + INFO_CTX.info_word],2		; will move WORD 2
	mov	[esi + INFO_CTX.info_byte],1		; will move BYTE 1

	; go to next item in vector
	add	esi, size INFO_CTX

Observe how the structure member size will hint instructions for operand size when possible. This greatly reduces the need for "dword / word / byte" modifiers.

For example:

	movzx	eax,[esi + INFO_CTX.info_byte]
	movzx	eax,[esi + INFO_CTX.info_word]

is equivalent to:

	movzx	eax,byte [esi + INFO_CTX.info_byte]
	movzx	eax,word [esi + INFO_CTX.info_word]

But you do not have to use "byte" and "word" hints because of the structure that provides this information.

However in this example:

	mov	byte [esi],4

SOL_ASM will require the "byte" user size override / hint because there is no structure member hint available
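The member offsets and sizes that drive these hints can be worked out by hand. The Python sketch below does this for the INFO_CTX members listed above, assuming the members are laid out back to back with no alignment padding (the text does not state the padding rules); layout is a hypothetical helper.

```python
def layout(members):
    """Return ({name: (offset, size)}, total_size), assuming no padding."""
    offsets, position = {}, 0
    for name, size in members:
        offsets[name] = (position, size)
        position += size
    return offsets, position


INFO_CTX = [
    ("info_name", 128),  # rb 128
    ("info_dword", 4),   # dd ?
    ("info_word", 2),    # dw ?
    ("info_byte", 1),    # db ?
]

offsets, total = layout(INFO_CTX)
print(offsets["info_word"])  # (132, 2): offset 132, operand size 2 bytes (WORD)
print(total)                 # 135 under the no-padding assumption
```

Each entry pairs the displacement used in the address expression with the size that lets the assembler pick the operand width without a "byte" or "word" override.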

6.5 Multiple instructions on the same line

You can write multiple assembly instructions on the same line. Sol_Asm will know when one instruction ends and the next one starts.

For Example

	push ebx  	push esi	push edi

	; init
	mov eax,1 	mov ecx,17	mov ebx,3
	xor ecx,ebx  	sub ebx,edx
	dec ecx 	jnz loop
	pop edi 	pop esi 	pop ebx

6.7 Empty Spaces

At this development stage Sol_Asm can be very annoying about white space requirements. This behaviour is in part because the parser always considers spaces as token separators no matter what. This helps parsing speed and eases debugging but it also causes some problems.

It is my intention to remove those limitations in later versions but for now you will have to know and respect them.

6.7.1 Expressions and spaces

The expression parser does handle white spaces, but the high level tokenizer breaks expressions on spaces. Because of this you must avoid spaces in expressions or, if you need spaces, enclose the whole expression in < and > or { and }.

; this is OK
mov	eax, (7*4)+(5*PACKET_SIZE)	; this is an expression with no spaces inside
mov	ecx, WND_CHILD+WND_MINIMIZE	; this is an expression	with no spaces inside
mov	ecx, size MY_STRU		; this is not an expression
mov	al, byte [esi]			; this is not an expression

; this is NOT OK because expressions can not contain spaces
mov	eax,(7*4) + (5 * PACKET_SIZE)
mov	eax,1 SHL 18					; this expression needs spaces

; this is made  OK by the use of < and >
mov	eax, < (7*4) + (5 * PACKET_SIZE) >
mov	eax, < 1 SHL 18 >
Notes for expressions

6.7.2 .IF and Spaces

Runtime conditionals like .IF, .While or .Repeat do need spaces around parentheses, conditions and logical operators.

; this is OK 
.if ( eax == 1 .and. ebx == 5 ) .or. ( [status] == 1 .and. [errors] == 0 )

; this is NOT OK
.if (eax==1 .and. ebx==5).or.([status]==1.and.[errors]==0)

; here use {} because < and > are conditional operators also
.if ( eax < { 7FFFFh SHR 5 } ) .and. ( edx > { 1 SHL 7} )
Notes for .IF

Chapter 7. Directives



7.1 Equates

	{symbol_name}	EQU	{value or expression}


	equ1			equ	40
	equ_28			EQU	< 1 SHL 28 >

Equates can not be redefined or double defined. However you can use the assignment operator for this:


	{symbol_name}	=	{value or expression}


	x = y + 1
	y = 7

For example the following code will force Sol_Asm to make 8 passes, until y = 7 and no longer changes its value:

#if $pass == 1
	y = 0

#if y < 7
	y = y+1

#echo " y=%x",y
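The pass count of 8 can be checked with a small simulation. This Python sketch assumes that the value of y carries over from one pass to the next and that assembly stops after the first pass in which y no longer changes; both points are inferred from the description above, and count_passes is an illustrative helper.

```python
def count_passes(limit: int = 7) -> int:
    """Simulate the conditional assignments above and count passes until y is stable."""
    y = None
    pass_nr = 0
    while True:
        pass_nr += 1
        previous = y
        if pass_nr == 1:   # #if $pass == 1
            y = 0
        if y < limit:      # #if y < 7
            y = y + 1
        if y == previous:  # nothing changed in this pass: done
            return pass_nr


print(count_passes())  # 8
```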


7.2 Labels

Labels are defined in two modes:


{label_name}	{data definition keyword}	{data_items}

For example:

	mov	ecx,nr_of_items
	mov	esi,items_ptr
my_loop:
	; perform some actions here
	add	[esi+ITEM.quantity],1
	; next item
	add	esi,size ITEM
	dec	ecx
	jnz	my_loop

In the above code sequence "my_loop" is a code label and serves as a target for the JNZ instruction.


	my_account_balance	dd	1234_5678h
	mov	ecx,nr_of_invoices
	mov	esi,invoices_ptr
my_loop:
	; perform some actions here
	mov	eax,[]
	sub	[my_account_balance],eax
	; next invoice
	add	esi,size INVOICE
	dec	ecx
	jnz	my_loop

In the above code sequence "my_account_balance" is a data label and serves as a parameter for the SUB instruction.

7.2.1 Labels scope

Labels defined outside of a procedure are global in name scope. Global labels can not be double defined.

Labels defined inside PROC ... ENDP construct are local in namespace to the procedure. Hence there can be multiple labels with the exact same name as long as they reside in different procedures.


7.3 Structures

Structures are defined like this:


STRUC {structure_name}
	{member_name1}	{data_definition_keyword}	{data_item}
	{member_name2}	{data_reserve_keyword}		{count}

For example:

		packet_ptr		dd	?
		packet_id		dd	?
		packet_mac_src		rb	16
		packet_mac_dest		rb	16

		drv_id			dd	?
		drv_name		rb	128
		eth_status		dd	?
		packets_buff		rs	ETH_PACKET,1024

As you can see structures can contain other structures. Once a structure is defined it can be used in subsequent data definitions.

Access to its members can be done like this: mov eax,[esi + ETH_DRV.eth_status]

And its size can be obtained like this: add esi, SIZE ETH_DRV

You can also obtain the offset of a member inside a structure like this: mov eax, ETH_DRV.eth_status


Hence this code is also valid:

	mov eax, ETH_DRV

And it will move the size of ETH_DRV structure into eax.

For clarity reasons the use of SIZE is recommended whenever possible.

You can access structure members like this:


	my_driver 	rs	ETH_DRV,16
	mov	esi,my_driver
	mov	eax,[esi + ETH_DRV.packets_buff.packet_id]

7.3.1 UNIONS

You can define unnamed UNIONS inside a structure.


		{member_name1}	{data_definition_keyword}	{data_item}
		{member_name2}	{data_reserve_keyword}		{count}

For Example:

	struc pixel_format
		flags1   dd   ?

			r_mask   dd   ?
			y_mask   dd   ?
				rx_mask      dd   ?
				ry_mask      dd   ?

		flags2   dd   ?

			g_mask   dd   ?
			u_mask   dd   ?      

And you can access any UNION member just like any other structure member.


7.4 Procedures

Procedures are defined like this:


PROC {proc_name} {proc_call_convention_type}
	USES	{uses_list}
	ARG	{arg_list}
	LOCAL	{local_list}

	; some code

ENDP


For example:

	PROC Test_01 stdcall
		USES	esi,edi
		ARG	wnd_handle, wnd_action
		LOCAL	count, my_var1, my_var2

		mov	esi,[wnd_handle]
		mov	ecx,100

	loop_here:
		mov	eax,[esi]
		test	eax,eax
		jz	finish

		add	[count],eax

		dec	ecx
		jnz	loop_here

	finish:
		mov	eax,[count]
		ret
	ENDP

Known calling conventions

Sol_Asm automatically generates the PROLOGUE and EPILOGUE code, together with the code that handles USES, ARG and LOCAL variables, as needed.

Known calling conventions are:

Additionally you can use the "varg" statement to mark a procedure that takes a variable number of arguments.


For PROCs defined as NOFRAME, Sol_Asm will not emit prologue and epilogue code, but it will still emit PUSH/POP code for USES statements if present. In this case you should write the prologue and epilogue code yourself.

Default arguments and locals sizes:

These defaults can be overridden by giving an argument or local a structure type, like this:

	PROC Test_02 stdcall
		USES	esi,edi
		ARG	wnd_handle, wnd_action
		LOCAL	my_var2 :MCTX,  l_point :POINT_3D

7.4.1 PROC Local buffers

You can define a local procedure buffer like this:

	PROC Test_02 stdcall
		USES	esi,edi
		ARG	wnd_handle, wnd_action
		LOCAL	my_var1,  my_buff [32],  my_var2 :MY_CTX [32]

This defines a buffer of 32 dwords starting at "my_buff" and a buffer (vector) of 32 * SIZE MY_CTX bytes starting at "my_var2".


For example, in PROC Test_02 above, incrementing an address that starts at "my_buff" will eventually reach "my_var1" and not "my_var2".


7.5 INVOKE

Procedures or imported functions can be called with the INVOKE syntax:


	INVOKE {function_name}, {param1},{param2}, ... {paramN}

For example:

	invoke	Str_Printf,ods_fname,ods_fname_fmt,[pass_nr]
	invoke	OS_File_Create,ods_fname
	mov	[ods_fhandle],eax

	mov	eax,[My_Dynamic_Proc]	
	invoke	eax,ecx,edx

Sol_Asm handles the calling convention details according to each procedure definition.


CINVOKE is a variation of INVOKE that assumes the CDECL convention and does not perform parameter count checking.
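
To make the difference between the two conventions concrete, here is a small Python model of the instruction sequence that an INVOKE style call expands to on 32 bit x86. It is an illustration only, not Sol_Asm's code generator. It assumes 4 byte arguments pushed right to left, with the caller cleaning the stack for CDECL and the callee cleaning it for STDCALL. The function name "expand_invoke" is hypothetical.

```python
def expand_invoke(func, args, convention="stdcall"):
    """Return the instructions for a 32 bit call with dword arguments."""
    # Arguments are pushed right to left, so the first one ends up on top.
    code = [f"push {arg}" for arg in reversed(args)]
    code.append(f"call {func}")
    if convention == "cdecl" and args:
        # CDECL: the caller removes the arguments after the call.
        code.append(f"add esp, {4 * len(args):08X}h")
    # STDCALL: the callee removes them on return, nothing to add here.
    return code

for line in expand_invoke("Str_Printf",
                          ["sz_tmp1", "sz_fmt_bld1", "eax", "ecx", "edx"],
                          convention="cdecl"):
    print(line)
```

For this Str_Printf call the model prints the same push, call and "add esp, 00000014h" sequence that appears in the listing example of Chapter 9.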


7.6 Macros

Sol_Asm contains a MACRO processor that supports nested and recursive macros with VARARG and checked arguments.

7.6.1 Define MACRO

A MACRO is defined like this:


	MACRO {macro_name}
		MARG	{ marg_list [:REQ] [:VARARG] }

		; some code

	ENDM


For example:

; define and output a simple string
; note: @ means local symbol for macros
MACRO ODS_str
	MARG	mpar1
	#ifdef SHOW_DEBUG

		jmp	@over1
			@mstring1	db	mpar1,0
		@over1:

		invoke	Str_Len,@mstring1
		invoke	OS_File_Write_Dbg,[ods_fhandle],@mstring1,eax
	#endif
ENDM


And can then be used like this:

ODS_str	<13,10,"-------- Listing Sections -------">

Inside a MACRO the "@" prefix means that the symbol is local to this MACRO and will get a different name each time the MACRO is expanded.



7.6.2 MACROS with :VARARG

A macro can have a variable number of arguments.

For example:

; define and output a formatted string
; note: @ means local symbol for macros
MACRO ODS_fmt
	MARG	mfmt, arg_list :VARARG

	jmp	@over1
		@mstring1	db	mfmt,0
	@over1:

	invoke	Str_Printf, sz_buff1, @mstring1, arg_list
	invoke	OS_File_Write_Dbg, [ods_fhandle], sz_buff1, eax
ENDM


And can then be used like this:

ODS_fmt	<13,10,"Section:%u RVA=%x VSIZE=%x Name=%s">,ecx,[esi+PE_SECT.rva],[esi+PE_SECT.vsize],esi

7.6.3 MACROS with :REQ

The ":REQ" MARG type can be used to force MACRO parameter number check up to a specific argument position.

For example:

	MARG	a1,a2,a3,a4 :req , a5

	mov	eax,a1
	mov	ebx,a2
	mov	ecx,a3
	mov	edx,a4


On macro invocation this checks that at least 4 macro arguments are present. Because of this "a5" can be missing but "a4" cannot.
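
The check described above can be modelled in a few lines of Python. This is a sketch of the rule, not of Sol_Asm's macro processor. The name "check_marg" is hypothetical, and the choice to expand a missing optional argument to blank text is an assumption.

```python
def check_marg(marg_names, req_count, call_args):
    """Bind macro arguments; the first req_count of them are required."""
    if len(call_args) < req_count:
        raise ValueError(
            f"macro needs at least {req_count} arguments, got {len(call_args)}")
    # Assumption: a missing optional argument expands to blank text.
    padded = list(call_args) + [""] * (len(marg_names) - len(call_args))
    return dict(zip(marg_names, padded))

names = ["a1", "a2", "a3", "a4", "a5"]   # a4 carries :REQ, so 4 are required
bound = check_marg(names, 4, ["1", "2", "3", "4"])
print(bound)                             # a5 is bound to blank text
```

Calling check_marg(names, 4, ["1", "2", "3"]) raises an error, because "a4" is missing.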

7.6.4 Nested MACROS

You can define a macro inside another macro... and so on.

For example:

MACRO M2
	MARG arg1,arg2

	mov	eax,arg1

	MACRO M3
		MARG arg3,arg4
		mov	eax,arg3
		push	arg4
	ENDM
	push	eax
	push	arg2
ENDM


On the first invocation of M2 only its body will be generated; M3 will be defined but not expanded.

7.6.5 Using "&" in MACROS

In MACRO body the "&" character will trigger a MARG check and expansion even if found in the middle of another token or string.

For example

	MARG arg1,arg2

	mov	eax,<&arg1>
	db	" In strings: &arg2",0


7.6.6 Recursive Macros

A macro can invoke itself recursively.

For example:

MACRO MPUSH
	MARG	p1,p2,p3,p4

	#ifnb <&p1>
		push	p1
		MPUSH	p2,p3,p4
	#endif
ENDM
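
The recursion in MPUSH ends when the first argument becomes blank. The Python function below models that expansion. It is a sketch of the behaviour described here, not of how Sol_Asm implements macros, and the function name "mpush" is hypothetical.

```python
def mpush(p1="", p2="", p3="", p4=""):
    """Model of the MPUSH macro: push p1, then recurse on the rest."""
    if p1 == "":
        # "#ifnb <&p1>" fails: a blank first argument ends the recursion.
        return []
    # Each recursive call shifts the argument list one position left.
    return [f"push {p1}"] + mpush(p2, p3, p4)

print(mpush("eax", "ebx", "ecx"))
```

With three arguments the model emits push eax, push ebx and push ecx, in that order, and then stops.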

7.6.7 Using EXITM in macros

EXITM can be used to return a token from a MACRO expansion.

For example

MACRO RV
	MARG func, params

	invoke func,params

	; return something from macro
	exitm eax
ENDM

; later on in code
	mov	ecx,RV GetModuleHandle
	invoke	ExitProcess, < RV GetModuleHandleA >
	push	RV GetModuleHandleA


7.6.8 The REPT Macro

You can use REPT to repeat a series of instructions.

For example

	x = 7

	REPT 12
		shl	eax,x
		add	ecx,3
		x = x+1
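
As an illustration of what the REPT example expands to, the Python sketch below repeats the body 12 times and updates x on every pass. It models the expansion only, and it assumes that x is an assembly time variable that keeps its value after the block. The function name "expand_rept" is hypothetical.

```python
def expand_rept(count, x):
    """Model: expand the REPT body `count` times, updating x each pass."""
    lines = []
    for _ in range(count):
        lines.append(f"shl eax,{x}")
        lines.append("add ecx,3")
        x = x + 1                     # the "x = x+1" line in the body
    return lines, x

lines, final_x = expand_rept(12, 7)
print(len(lines), lines[0], lines[-2], final_x)
```

The model produces 24 instructions, from "shl eax,7" up to "shl eax,18", and leaves x at 19.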

7.6.9 The FOR Macro

You can use FOR to repeat a series of instructions for each item in a list.


	FOR {item} IN: {items list} {REV} DO
		{ for macro body }

Sol_Asm will expand the {for macro body} once for each element in {items list} and will replace any occurrence of {item} in the {for macro body} with the current {items list} element.

The "REV" keyword is optional and if present then the {items list} will be parsed in reversed order.

FOR can be used to iterate the variable parameters of a MACRO.

For example

MACRO my_invoke
	MARG func :req, params :vararg

	FOR item IN: params REV  DO
		push   item

	call	func

The above sample defines your own INVOKE-like macro, which you can later use like this:

	my_invoke	My_Func,eax,0,1,"123",[ecx]
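
The effect of FOR with the REV keyword on this invocation can be modelled as follows. This Python sketch only imitates the text substitution described above. It is not Sol_Asm's implementation, and the name "expand_for" is hypothetical.

```python
import re

def expand_for(item, items, body, rev=False):
    """Expand body once per list element, substituting the item name."""
    pattern = re.compile(rf"\b{re.escape(item)}\b")
    order = list(reversed(items)) if rev else list(items)
    expanded = []
    for element in order:
        for line in body:
            # A function is used as replacement so the element is inserted literally.
            expanded.append(pattern.sub(lambda _m, e=element: e, line))
    return expanded

params = ["eax", "0", "1", '"123"', "[ecx]"]
code = expand_for("item", params, ["push item"], rev=True)
code.append("call My_Func")
print("\n".join(code))
```

Because of REV the last parameter is pushed first, so the model pushes [ecx], "123", 1, 0 and eax in that order before the call.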

7.7 Conditional Assembly

You can conditionally eliminate a block of source code at compile time by using the following directives:

Directive                Description
#ifdef {symbol}          if symbol is defined
#ifndef {symbol}         if symbol is not defined
#ifb {token}             if token is blank
#ifnb {token}            if token is not blank
#if_used {symbol}        if symbol is used in code
#if_not_used {symbol}    if symbol is not used in code
#if {condition}          if condition is true


	#ifdef {symbol_name}
		; code block for true
	#else
		; code block for false
	#endif


For example:

	; this checks the command line /binary option
	#ifdef /binary
		org	0B000h
		disp	0B000h
	#endif

	#if $ >= 512
		#echo "boot sector address overflow: %x", $
	#endif

Note how command line options are automatically promoted to EQU symbols and can be tested with #ifdef.

#ifdef can be nested on multiple levels so the following example is valid also.

		mov	eax,1
			mov	esi,32h
			#ifdef	LUCKY
				mov	ecx,33h
				mov	ecx,11h
			mov	esi,16h
		mov	edi,88h
		mov	eax,2

			mov	esi,32h
			#ifdef	LUCKY
				mov	ecx,33h
				mov	ecx,11h
			mov	esi,16h
		mov	edi,77h
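
How nested conditional blocks select lines can be shown with a small Python model. It is a sketch of the general technique, not of Sol_Asm's parser. It handles only #ifdef, #ifndef, #else and #endif, does no error checking for unbalanced blocks, and the symbol FAST and the function name "preprocess" are hypothetical.

```python
def preprocess(lines, defined):
    """Keep the lines whose enclosing #ifdef/#ifndef/#else branches are active."""
    out = []
    stack = []                      # one [condition, in_else] entry per open block
    for line in lines:
        words = line.split()
        head = words[0].lower() if words else ""
        if head in ("#ifdef", "#ifndef"):
            cond = words[1] in defined
            if head == "#ifndef":
                cond = not cond
            stack.append([cond, False])
        elif head == "#else":
            stack[-1][1] = True
        elif head == "#endif":
            stack.pop()
        elif words and all(cond != in_else for cond, in_else in stack):
            # Active only if every enclosing block selects this branch.
            out.append(line.strip())
    return out

source = [
    "#ifdef FAST",
    "    mov eax,1",
    "    #ifdef LUCKY",
    "        mov ecx,33h",
    "    #else",
    "        mov ecx,11h",
    "    #endif",
    "#else",
    "    mov eax,2",
    "#endif",
]

print(preprocess(source, {"FAST"}))
print(preprocess(source, {"FAST", "LUCKY"}))
print(preprocess(source, set()))
```

A line is kept only when all the blocks that enclose it are on their active branch, which is why an inner #ifdef is ignored entirely when its outer block is inactive.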

7.8 Runtime High Level .IF and friends

You can use runtime high level .IF .ELSEIF .ELSE .ENDIF constructs in SOL_ASM.

Sol_Asm will generate the needed compare and jump code and the labels internally. This internal code generation is performed much faster than a MACRO could do it.


	.IF {operand1} {condition_a} {operand2}

		; code block for {condition_a} true

	.ELSEIF  {operand3} {condition_b} {operand4}

		; code block for {condition_b} true

	.ELSE

		; code block for all above conditions false

	.ENDIF


For example:

	.if [parse_mode] == 1
		.if [parse_status] == 1
			mov	ecx,1
		.elseif [parse_status] == 2
			mov	ecx,2
		.elseif [parse_status] <= 7
			mov	ecx,7
		.else
			mov	ecx,-1
		.endif
	.elseif [parse_mode] == 2
		mov	edx,2
	.elseif eax == swap "xor "
		mov	edx,7
	.else
		mov	edx,-1
	.endif

Known condition operators are:

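Operator      Description                  Flags checked
"=="          equal                        ZF = 1
"!="          not equal                    ZF = 0
"<"           unsigned smaller             CF = 1
">"           unsigned greater             CF = 0 and ZF = 0 (NBE)
"<="          unsigned smaller or equal    CF = 1 or ZF = 1 (BE)
">="          unsigned greater or equal    CF = 0 (NC)
"zero?"       zero flag set                ZF = 1 (Z)
"!zero?"      zero flag clear              ZF = 0 (NZ)
"carry?"      carry flag set               CF = 1 (C)
"!carry?"     carry flag clear             CF = 0 (NC)
"sign?"       sign flag set                SF = 1
"!sign?"      sign flag clear              SF = 0
"overflow?"   overflow flag set            OF = 1
"!overflow?"  overflow flag clear          OF = 0
"parity?"     parity flag set              PF = 1
"!parity?"    parity flag clear            PF = 0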


7.8.1 Using multiple conditions

You can use multiple conditions in .IF like this:


	.if ( eax == 1 .or. ecx == 2 ) .and. esi != 7
	.elseif dl == "a" .or. dl == "b" .or. dl == "s"

7.8.2 Using signed conditions

By default all comparisons in a .IF are unsigned.
You can use signed conditions in .IF by prefixing the condition with the signed keyword like this:


.if signed edx >= [edi + HTML_CTX.wnd_dy]
	; flag done
	mov	eax,1
.endif
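
The difference between unsigned and signed conditions comes down to which conditional jump follows the compare. The Python sketch below shows one common way to lower a simple .IF condition: compare, then jump over the body when the condition is false. The code that Sol_Asm actually generates may differ, and the label name "@endif1" and the function name "lower_if" are hypothetical.

```python
# Jump taken when the condition is FALSE, so the .IF body is skipped.
SKIP_JUMP = {
    # operator: (unsigned, signed)
    "==": ("jne", "jne"),
    "!=": ("je", "je"),
    "<":  ("jae", "jge"),
    ">":  ("jbe", "jle"),
    "<=": ("ja", "jg"),
    ">=": ("jb", "jl"),
}

def lower_if(op1, cond, op2, label, signed=False):
    """Return a cmp plus the inverse conditional jump for `.if op1 cond op2`."""
    unsigned_jump, signed_jump = SKIP_JUMP[cond]
    jump = signed_jump if signed else unsigned_jump
    return [f"cmp {op1},{op2}", f"{jump} {label}"]

print(lower_if("edx", ">=", "[edi + HTML_CTX.wnd_dy]", "@endif1", signed=True))
print(lower_if("edx", ">=", "[edi + HTML_CTX.wnd_dy]", "@endif1"))
```

For the signed example the body is skipped with jl, while the default unsigned form of the same condition would use jb.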


7.8.3 .REPEAT .UNTIL

You can use high level REPEAT ... UNTIL constructs. Sol_Asm will generate the needed code.


	.REPEAT
		{repeat body}
	.UNTIL {condition}


	mov	ecx,17
	.repeat
		mov	edx,0
		.repeat
			inc	edx
		.until edx > 7

		dec	ecx
	.until ecx == 0

7.8.4 .WHILE .ENDW

You can use high level WHILE ... ENDW constructs. Sol_Asm will generate the needed code.


	.WHILE {condition}
		{while body}
	.ENDW


	mov	ecx,17
	.while ecx > 1
		mov	edx,0
		.while edx < 7
			inc	edx
		.endw
		dec	ecx
	.endw


7.9 ENUM

ENUM is a kind of auto generated EQU sequence. Sol_Asm will auto increment the values and will check for limits.

You can define ENUMS like this:


	ENUM {enum_name},{start_value},{max_value}
		{enum name items}
	ENDE


ENUM Modes,77h,ffh

Sol_Asm will generate MODE_1 EQU 77h, MODE_2 EQU 78h and so on for each ENUM item in sequence, and will check that the values stay within the limits.
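
The value assignment and limit check can be modelled with a short Python function. This is a sketch under assumptions, not Sol_Asm's implementation: the maximum value is treated as inclusive, the item MODE_3 is added only for illustration, and the function name "expand_enum" is hypothetical.

```python
def expand_enum(items, start_value, max_value):
    """Assign consecutive values from start_value and enforce max_value."""
    equates = {}
    value = start_value
    for name in items:
        # Assumption: max_value itself is still a legal value.
        if value > max_value:
            raise ValueError(f"ENUM item {name} exceeds limit {max_value:X}h")
        equates[name] = value
        value += 1
    return equates

modes = expand_enum(["MODE_1", "MODE_2", "MODE_3"], 0x77, 0xFF)
print({name: f"{value:X}h" for name, value in modes.items()})
```

With a start value of 77h the model assigns 77h, 78h and 79h, and it raises an error as soon as an item would go past the maximum.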


7.10 DEFINE text equates

DEFINE creates symbolic constants for text or strings. It behaves like a kind of EQU for strings and tokens.

This allows you to:


	DEFINE {symbolic_name},{text}

An alternative name for DEFINE is TEQU


	define	text1	"planet earth"
	define	text2	< swap "ecx" >
	define	text3	ebx
	define	text4	[esi+4]
	define	text5	STRCUT has_ebx_inside,5,3

	define	and	xor


	my_string	db	text1	; in fact "planet earth"

	mov	eax,text2	; in fact mov eax,swap "ecx"
	mov	eax,text3	; in fact mov eax,ebx
	mov	eax,text4	; in fact mov eax,[esi+4]

	mov	eax,text5	; in fact mov eax,ebx		

	and	eax,eax		; in fact XOR eax,eax			

Textequ Types

Defined text equates have some subtle types attached:

7.11 STRING Functions

String functions allow you to operate on strings in text equates.

The following functions are available:

Function    Description                           Notes
STRCUT      Extract a sub string from a string
STRADD      Add two strings
STRLEN      Obtain the length of a string         the result is a numeric token

7.11.1 STRCUT

STRCUT will extract a sub string from a source string


	STRCUT {source},{start_pos},{length}


 define	ebx1	STRCUT has_ebx_inside,5,3	; ebx		type token 
 define	ebx2	STRCUT "has_ebx_inside",6,3	; "ebx"		type string
 define	ebx3	STRCUT [ebx+ecx],1,3		; [ebx]		type ModRM

The result of STRCUT has the same type as the source

7.11.2 STRADD

STRADD will add two strings together.


	STRADD {string1},{string2}


 define	txt1	STRADD "planet"," earth"	; "planet earth" 	string 
 define	txt2	STRADD in,voke			; invoke		token
 define	txt3	STRADD [ebx],[+ecx]		; [ebx+ecx]		ModRM

The result of STRADD has the same type as string2

7.11.3 STRLEN

STRLEN will return the length of a string.


	STRLEN {string1}


 len1	equ	STRLEN "planet"	
 len2	equ	STRLEN invoke
 len3	equ	STRLEN [ebx+ecx]

 len4	equ	STRLEN STRADD "planet"," earth"	
 define	txt1	STRCUT "has_ebx_inside",6, STRLEN "ebx"

7.12 #ECHO

The #ECHO directive allows you to emit formatted message text at compile time. This can be used to debug macros or to inform the user of compile stages.


	#ECHO {format string},{arg1},{arg2},... 


MY_EQU equ 1234
define my_str " this is a string message"


 #echo "\n code end=%x section base=%x, my_equ=%u string=%s",$,$$$,MY_EQU,my_str 


As a format specifier you can use one of the following:

Format    Description
%x        hexadecimal number
%u        unsigned decimal number
%d        signed decimal number
%s        an ASCII null terminated string
\n        new line (CR+LF)
\t        TAB
%%        the "%" ASCII char itself
\\        the "\" ASCII char itself


7.13 OPTION

The OPTION directive is used to set optional compiler behaviour.


	OPTION {option_type}, [ {option_value} ] 

The following options are available:

Option                  Description
list_on                 activates listing output
list_off                deactivates listing output
proc_align { value }    sets the alignment for PROC (default is 16 bytes)

7.14 #LOAD

This directive allows you to read a value from compiled code or data at compile time.


	#LOAD {equ_name}, [byte/word/dword/qword] {address}  

For Example:

	my_db	db	1
	#load	x,byte my_db
	#echo " x=%x",x

7.15 #STORE

This directive allows you to write a value to compiled code or data at compile time.


	#STORE {address}, [byte/word/dword/qword] {value}  

For Example:

	my_db	db	1
	#store	my_db, byte 55h
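
Conceptually #LOAD and #STORE read and patch bytes that have already been assembled. The Python sketch below models this on a byte buffer. It is an illustration only: the offset of "my_db" in the image and the function names "load" and "store" are hypothetical, and little-endian byte order is assumed because the target is x86.

```python
import struct

# struct format codes for the documented sizes (little-endian, as on x86)
SIZES = {"byte": "<B", "word": "<H", "dword": "<I", "qword": "<Q"}

def load(image, address, size="byte"):
    """Model of #LOAD: read a value from already assembled bytes."""
    return struct.unpack_from(SIZES[size], image, address)[0]

def store(image, address, value, size="byte"):
    """Model of #STORE: overwrite already assembled bytes with a value."""
    struct.pack_into(SIZES[size], image, address, value)

image = bytearray([1, 0, 0, 0])   # "my_db db 1" followed by padding
my_db = 0                         # hypothetical offset of my_db in the image

x = load(image, my_db)            # like: #load x,byte my_db
store(image, my_db, 0x55)         # like: #store my_db, byte 55h
print(x, hex(load(image, my_db)))
```

In this model x receives 1, the value defined by "db 1", and after the store the same byte reads back as 55h.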

Chapter 8. Resource compiler

Sol_Asm does contain a mini resource compiler.

It can parse some RC script elements and can generate "in memory" templates for them.

In resource scripts Sol_ASM does support C style hexadecimal constants.

8.1 Resource ID's

You can define a resource ID like this:


	#define		{ID name}	{ID value}

For Example:

	#define IDD_DLG1 1000
	#define IDC_BTN1 1001
	#define IDC_EDT1 1002
	#define IDC_BTN2 1003

8.2 Dialogs

You can define a DIALOG like this:


	{dialog_id} 	DIALOGEX {dlg_x},{dlg_y},{dlg_dx},{dlg_dy}
	CAPTION		{caption string}
	STYLE		{style value}
	BEGIN
		{ control definitions }
	END

You can define a CONTROL like this:


	CONTROL {caption},{id},{"class"},{flags},{x},{y},{dx},{dy},{flags_ex}

For Example:

#define IDD_DLG1 1000

#define IDC_BTN1 1001
#define IDC_EDT1 1002
#define IDC_BTN2 1003
#define IDC_STC1 1004

IDD_DLG1 	DIALOGEX 	57,7,258,158
CAPTION 	"Sol_Asm Dialog 01"
STYLE		0x10CF0000
BEGIN
 CONTROL "Save",	IDC_BTN1,"Button",	0x50010000,	134,114,50,13,	0x00000000
 CONTROL "Exit",	IDC_BTN2,"Button",	0x50010000,	196,112,42,15,	0x00000000
 CONTROL "Name",	IDC_STC1,"Static",	0x50000000,	12,24,22,8,	0x00000000
 CONTROL "Text Edit",	IDC_EDT1,"Edit",	0x50010000,	50,22,134,11,	0x00000200
END


8.3 Menus

You can define a MENU like this:


	{menu_id} 	MENUEX
	BEGIN
		POPUP {"text"},{id}
		BEGIN
			MENUITEM {"text"},{id}
		END
	END

For Example:


#define IDR_MENU 	10000
#define IDM_File 	10001
#define IDM_File_Open 	10004
#define IDM_File_New 	10005
#define IDM_File_Exit 	10009
#define IDM_Edit	10002
#define IDM_Edit_Cut	10006
#define IDM_Edit_Copy	10007
#define IDM_Edit_Paste	10008

IDR_MENU	MENUEX
BEGIN
	POPUP "File",IDM_File
	BEGIN
		MENUITEM "Open",IDM_File_Open
		MENUITEM "New",IDM_File_New
		MENUITEM "Exit",IDM_File_Exit
	END

	POPUP "Edit",IDM_Edit
	BEGIN
		MENUITEM "Cut",IDM_Edit_Cut
		MENUITEM "Copy",IDM_Edit_Copy
		MENUITEM "Paste",IDM_Edit_Paste
	END
END

8.4 Emit Compiled Resources

You can emit a compiled resource as a data item like this:


	EMIT_RSRC {resource_id}

For Example:

align 32
my_dialog	EMIT_RSRC	IDD_DLG1

align 32
my_menu		EMIT_RSRC	IDR_MENU



and in your code you can write:

   	invoke	DialogBoxIndirectParamA,[hInstance],my_dialog,0,Dlg_Proc,0

	invoke	LoadMenuIndirectA,my_menu

Chapter 9. Listing

Sol_Asm can produce a listing file when the "-list" command line option is used.

Listing columns format:

{include_level} {macro_level} {flag} {address} {program text} {opcodes}

Include Level column

Shows the depth of include file nesting.

Macro level column

Shows the depth of macro expansion nesting

Flag column

It is an internal flag to Sol_Asm and changes often for debugging. Currently it shows if there is a need for a new pass to solve a symbol.

Address column

Shows the address for current line being assembled. For OBJ formats it shows the offset in section since the final address will be setup by the linker.

Program text column

Shows the program source text.

This includes:

Opcode Column

It shows the CPU opcodes or data generated by Sol_Asm for each source line as a series of hexadecimal bytes.

Opcode column is aligned to column 128 if possible and expands up to column 224.

If more opcodes are needed then a new row is generated. If more than 4 rows are needed then an ellipsis "..." is shown and further opcodes are not shown anymore.


Listing Example:

1 0 0 00401047	
1 0 0 00401047		;--------------------------
1 0 0 00401047		; make up a build date 
1 0 0 00401047		;--------------------------
1 0 0 00401047		mov	esi,build_time				BE 3C A4 42 00 
1 0 0 0040104C		
1 0 0 0040104C		xor	eax,eax					33 C0 
1 0 0 0040104E		xor	edx,edx					33 D2 
1 0 0 00401050		xor	ecx,ecx					33 C9 
1 0 0 00401052		
1 0 0 00401052		mov	ax,[esi + OS_TIME.year]			66 8B 46 00 
1 0 0 00401056		mov	cx,[esi + OS_TIME.month]		66 8B 4E 02 
1 0 0 0040105A		mov	dx,[esi + OS_TIME.day_of_month]		66 8B 56 06 
1 0 0 0040105E	
1 0 0 0040105E		invoke	Str_Printf,sz_tmp1,sz_fmt_bld1,eax,ecx,edx
1 0 0 0040105E	push edx  						52 
1 0 0 0040105F	push ecx  						51 
1 0 0 00401060	push eax  						50 
1 0 0 00401061	push sz_fmt_bld1  					68 69 A0 42 00 
1 0 0 00401066	push sz_tmp1  						68 00 16 43 00 
1 0 0 0040106B	call Str_Printf  					E8 B0 07 00 00 
1 0 0 00401070	add esp, 00000014h  					83 C4 14 

Appendix.1 Other issues

A1.1 Namespaces

Sol_Asm does use separated NAMESPACES for:

Because of this you can have a PROC with the same name as a STRUC but not two PROC's or two STRUC's with the same name.

However, for now this is not under the control of the programmer, and hence it is advised to avoid such coding practice because you cannot control the order in which Sol_Asm searches the separate namespaces.

It is my intention to provide a mechanism for controlling and defining namespaces to the user.

A1.2 System requirements


Sol_Asm requires a 386 CPU as a minimum and benefits from newer, more advanced CPUs.


Sol_Asm preallocates approximately 24 megabytes at startup.

Each section gets 1 MB when it is defined, and this is reallocated later if needed.

Additional memory is allocated when needed for files, imports, macros etc.


Sol_Asm was tested on WinXP, Solar OS and WinXP64, but it should also work on Win95, Win98, Win2k, Win2003 and Vista.

Starting from version 14.02 Sol_Asm also runs on Linux and on UNIX like OSes that can link Sol_Asm OBJ against a limited set of LIBC functions.

A version for Mac OS X is also available in ELF OBJ format. You can use Agner Fog's OBJCONV program to convert it to MACH-O and link to LIBC to obtain the executable on your Mac.

A1.3 Speed testing

Speed testing was performed on two big projects: Sol_Asm itself and Solar_OS.

Synthetic testing was performed on files with 10.000 or 100k PROC's

For Example:

Solar Assembler version 0.10.01
Copyright (C) 2004-2008 Bogdan Valentin Ontanu, All rights reserved.
Build on 2008_2_23  at 7:14:23

Assembling file: sol_asm2.asm
Assembler  pass: 1
Assembler  pass: 2
Assembler  pass: 3
Assembler  pass: 4
Assembler lines: 67866
Output    bytes: 192512
Assembler  time: 406 ms

4 pass x 67.866 lines = 271.464 lines in 406 ms --> 668.630 lines per second

For Example:

Solar Assembler version 0.10.01
Copyright (C) 2004-2008 Bogdan Valentin Ontanu, All rights reserved.
Build on 2008_2_23  at 7:14:23

Assembling file: system_32.asm
Assembler  pass: 1
Assembler  pass: 2
Assembler  pass: 3
Assembler lines: 111403
Output    bytes: 534016
Assembler  time: 578 ms

3 pass x 111.403 lines = 334.209 lines in 578 ms --> 578.216 lines per second
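
The throughput figures above can be reproduced from the assembler output with a few lines of Python. The function name "lines_per_second" is only illustrative, and the result is rounded down to a whole number of lines.

```python
def lines_per_second(passes, lines, time_ms):
    """Total lines processed over all passes, divided by elapsed seconds."""
    return passes * lines * 1000 // time_ms

print(lines_per_second(4, 67866, 406))    # sol_asm2.asm  -> 668630
print(lines_per_second(3, 111403, 578))   # system_32.asm -> 578216
```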

These are real projects with many PROCs, STRUCs, MACROs and code.

Testing was performed on a laptop with an Intel Core 2 Duo CPU at 2 GHz and with 1 GB of RAM, running WinXP 32.

Appendix.2 Known keywords




8 bit registers
"al"    "r8l"       "spl"
"cl"    "r9l"       "bpl"
"dl"    "r10l"      "sil"
"bl"    "r11l"      "dil"
"ah"    "r12l"
"ch"    "r13l"
"dh"    "r14l"
"bh"    "r15l"
16 bits registers
"ax"    "r8w"      "es" 
"cx"    "r9w"      "cs" 
"dx"    "r10w"     "ss" 
"bx"    "r11w"     "ds" 
"sp"    "r12w"     "fs" 
"bp"    "r13w"     "gs" 
"si"    "r14w" 
"di"    "r15w" 
32 bits registers
"eax"     "r8d"  
"ecx"     "r9d"  
"edx"     "r10d" 
"ebx"     "r11d" 
"esp"     "r12d" 
"ebp"     "r13d" 
"esi"     "r14d" 
"edi"     "r15d" 
64 bits registers
"rax"     "r0"      "r8" 
"rcx"     "r1"      "r9" 
"rdx"     "r2"      "r10"
"rbx"     "r3"      "r11"
"rsp"     "r4"      "r12"
"rbp"     "r5"      "r13"
"rsi"     "r6"      "r14"
"rdi"     "r7"      "r15"

MMX registers       
FPU registers       
XMM registers       

Instructions and directives

0	mov					 
1	lea					 
2	movzx					 
3	movsx					
4	bswap					 
5	xchg					 
6	xor					 
7	cmp					 
8	add					 
9	sub					 
10	or					 
11	and					 
12	sbb					 
13	adc					 
14	shl					 
15	shr					 
16	sar					 
17	rol					 
18	ror					 
19	rcl					 
20	rcr					 
21	sal					 
22	shld					 
23	shrd					 
24	test					 
25	not					 
26	neg					 
27	inc					 
28	dec					 
29	div					 
30	idiv					 
31	mul					 
32	imul					 
33	call					 
34	jmp					 
35	loop					 
36	ret					 
37	retn					 
38	int					 
39	int3					 
40	into					 
41	iret					 
42	iretd					 
43	hlt					 
44	leave					 
45	push					 
46	pushad					 
47	pusha					 
48	pushfd					 
49	pushf					 
50	pop					 
51	popad					 
52	popa					 
53	popfd					 
54	popf					 
55	jo					 
56	jno					 
57	jc					 
58	jnc					 
59	jb					 
60	jnb					 
61	jnae					 
62	jae					 
63	jz					 
64	jnz					 
65	je					 
66	jne					 
67	jbe					 
68	jnbe					 
69	jna					 
70	ja					 
71	js					 
72	jns					 
73	jpe					 
74	jpo					 
75	jl					 
76	jnl					 
77	jnge					 
78	jge					 
79	jle					 
80	jnle					 
81	jng					 
82	jg					 
83	rep					 
84	movsb					 
85	movsd					 
86	movsw					 
87	stosb					 
88	stosd					 
89	stosw					 
90	lodsb					 
91	lodsd					 
92	lodsw					 
93	scasb					 
94	scasd					 
95	nop					 
96	clc					 
97	stc					 
98	daa					 
99	das					 
100	cbw					 
101	cdq					 
102	cld					 
103	cmc					 
104	aaa					 
105	aas					 
106	lahf					 
107	lock					 
108	cpuid					 
109	rdtsc					 
110	aad					 
111	aam					 
112	out					 
113	in					 
114	finit					 
115	fninit					 
116	fld					 
117	fild					 
118	fst					 
119	fstp					 
120	fistp					 
121	fadd					 
122	faddp					 
123	fiadd					 
124	fsub					 
125	fisub					 
126	fdiv					 
127	fdivrp					 
128	fmul					 
129	fmulp					 
130	fimul					 
131	fxch					 
132	fucompp					 
133	fclex					 
134	fnclex					 
135	fnop					 
136	fchs					 
137	fabs					 
138	ftst					 
139	fxam					 
140	fld1					 
141	fldl2t					 
142	fldl2e					 
143	fldpi					 
144	fldlg2					 
145	fldln2					 
146	fldz					 
147	f2xm1					 
148	fyl2x					 
149	fptan					 
150	fpatan					 
151	fxtract					 
152	fprem1					 
153	fdecstp					 
154	fincstp					 
155	fprem					 
156	fyl2xp1					 
157	fsqrt					 
158	fsincos					 
159	frndint					 
160	fscale					 
161	fsin					 
162	fcos					 
163	emms					 
164	sidt					 
165	lidt					 
166	lgdt					 
167	sgdt					 
168	cli					 
169	sti					 
170	wbinvd					 
171	xlat					 
172	db					 
173	dw					 
174	dd					 
175	dq					 
176	dt					 
177	do					 
178	real4					 
179	real8					 
180	real10					 
181	rb					 
182	rw					 
183	rd					 
184	rq					 
185	rt					 
186	ro					 
187	rs					 
188	equ					 
189	align					 
190	proc					
191	uses					 
192	arg					 
193	local					 
194	endp					 
195	.if					 
196	.elseif					 
197	.else					 
198	.endif					 
199	#ifdef					 
200	#ifndef					 
201	#else					 
202	#endif					
203	#ifnb					 
204	#ifb					 
205	#if_used				 
206	#if_not_used				
207	macro					 
208	endm					 
209	exitm					 
210	rept					 
211	invoke					 
212	cinvoke					 
213	cdecl					 
214	stdcall					 
215	include					 
216	incbin					 
217	incfrom					 
218	import_dll				 
219	from_dll				 
220	import_lib				 
221	from_lib				 
222	import_func				 
223	import					 
224	extern					 
225	export					 
226	alias					 
227	struc					 
228	struct					 
229	ends					 
230	enum					 
231	ende					 
232	.entry					 
233	org					 
234	disp					 
235	.use16					 
236	.use32					 
237	.use64					 
238	section					 
239	class_code				 
240	class_data				 
241	class_imports				 
242	class_relocs				 
243	class_bss				 
244	class_exports				 
245	class_rsrc				 
246	#define					 
247	begin					 
248	end					 
249	dialogex				 
250	caption					 
251	style					 
252	control					 
253	menuex					 
254	popup					 
255	menuitem				 
256	emit_rsrc				 
257	.echo					 
258	$time					 

Appendix.3 Sample programs

A win32 sample application

; Sol_Asm assembler sample
; Copyright (c) 2004-2008, Bogdan Valentin Ontanu
; All rights reserved.

; define imports
from_dll 	kernel32.dll
	import	ExitProcess
	import	GetStdHandle

from_dll	user32.dll
	import	MessageBox alias MessageBoxA

; define sections
section "code" 		class_code
section "data"  	class_data
section "idata" 	class_imports

	sz_message	db	"First Win32 PE application",0
	sz_title	db	"Sol_ASM",0

	; define entry point
	.entry Start

	; the classical message box
	invoke	MessageBox, 0, sz_message, sz_title, 3

	; done here, exit nicely
	invoke	ExitProcess,0

Assuming the file is named test_win32.asm and Sol_Asm is in the path, you can build this sample with the following command:

	sol_asm2  test_win32.asm test_win32.exe -pe32

The resulting executable should display a message box when run.