asm6809—6809 cross-assembler
asm6809 [OPTION]… [SOURCE-FILE]…
asm6809 is a portable macro cross assembler targeting the Motorola 6809 and Hitachi 6309 processors. These processors are most commonly encountered in the Dragon and Tandy Colour Computer.
-B
, --bin
-D
, --dragondos
-C
, --coco
-S
, --srec
-H
, --hex
-e
, --exec
addr
-8
, -9
, --6809
-3
, --6309
-d
, --define
sym[=number]
--setdp
value
-P
, --max-passes
n
-o
, --output
file
-l
, --listing
file
-E
, --exports
file
-s
, --symbols
file
-q
, --quiet
-v
, --verbose
--help
--version
If more than one SOURCE-FILE is specified, they are assembled as though they were all in one file.
Text is read in and parsed, then as many passes are made over the parsed source as necessary (up to a limit), until symbols are resolved and addresses are stable. The fastest or smallest representation should always be chosen where there is ambiguity.
Output formats are: Raw binary, DragonDOS binary, CoCo RS-DOS (“DECB”) binary, Motorola SREC, Intel HEX.
Additional optional output files are:
EXPORT
pseudo-op. Suitable for
inclusion in subsequent source files.
Home page: <https://www.6809.org.uk/asm6809/>
Motorola syntax allows a comment to follow any operands, separated from them only by whitespace. To an extent, this assembler accepts that, but be aware that as spaces are allowed within expressions, if the comment looks like it is continuing an expression it will generate bad code (or raise an error if the result is syntactically incorrect). Example:
0000 8605 lda #5 0002 C60A ldb #5 * 2 twice first number
A strict Motorola assembler would generate bytes C6 05 for the second line,
as the “* 2” would be ignored. For consistency, it is best to introduce end of
line comments with a ;
character. An asterisk (*
)
can introduce whole line comments.
An unquoted semicolon always introduces a comment. The alternate form of the
6309 instructions AIM
, OIM
, etc. listed in some
documentation that uses a semicolon as a separator is not accepted.
A symbol may be forward referenced; any time a reference is unresolvable, another pass is triggered, up to some defined maximum.
In 6809 indexed addressing, the offset size will default to the fastest
possible form, e.g. if the offset is an expression that happens to evaluate to
zero, the no offset form will be used. Prepend <<
to coerce a 5 bit offset, <
to coerce 8 bits or
>
to coerce 16 bits.
asm6809 currently has no support for OS-9 modules or multiple object linking.
Program files are considered line by line. Each line contains up to three
fields, separated by whitespace: label, instruction and arguments. An unquoted
semicolon (;
) indicates that the rest of the line is to be
considered a comment. Whole line comments may be introduced with an asterisk
(*
). Motorola-style end of line comments without a
;
are accepted, but see the notes about assembler differences.
Any label must appear at the very beginning of the line. If a label is omitted, whitespace must appear before the operator field. Certain pseudo-ops may affect a label's meaning, but usually labels define a symbol referring to the current position in the code (Program Counter, or PC).
The instruction field contains either an instruction op-code (mnemonic), a pseudo-op (assembler directive), or a macro name for expansion.
Pseudo-ops allow conditional assembly and inline data, can affect code placement and symbol values and be used to include further files inline. See the section on Pseudo-ops for more information.
Arguments are a comma-separated list: either instruction operands or arguments to a pseudo-op or macro. Permitted arguments are specific to the instruction or pseudo-op, but in general they may be:
[
and ]
. This is
generally only used to indicate indirect indexed addressing.
In addition, any argument may be preceded by:
#
, indicate immediate value.
<<
, force 5-bit index offset.
<
, force direct addressing, 8-bit value or 8-bit index offset.
>
, force extended addressing, 16-bit value or 16-bit index offset.
Expressions are formed of:
@
or with a leading 0
.
%
or 0b
.
$
or 0x
.
.
).
-
) or plus (+
).
/
.
The assembler uses multiple passes to resolve expressions. If an expression
refers to a symbol that cannot currently be resolved, an extra pass is
triggered. Similarly, if a symbol is assigned a value (e.g. by an
EQU
pseudo-op) that differs to its value on the previous pass,
another is triggered until it becomes stable.
When not directly used for their contents (e.g. by FCC
),
strings can be used in place of integer values. The ASCII value of each
character is used to represent 8 bits of the integer result up to 32 bits.
Example:
0000 CC443A ldd #"D:"
The following operators are available, listed in descending order of precedence (where operators share a precedence, left-to-right evaluation is performed):
Operator | Description |
---|---|
+ | unary plus |
- | unary minus |
! ~ | logical, bitwise NOT |
* | multiplication |
/ | division |
% | modulo |
+ | addition |
- | subtraction |
<< | bitwise shift left |
>> | bitwise shift right |
< <= | relational operators |
> >= | relational operators |
== | relational equal |
!= | relational not equal |
& | bitwise AND |
^ | bitwise XOR |
| | bitwise OR |
&& | logical AND |
|| | logical OR |
?: | ternary operator |
Division always returns a floating point result. Other arithmetic operators return integers if both operands are integers, otherwise floating point. Bitwise operators and modulo all cast their operands to integers and return an integer. Relational and logical operators result in 0 if false, 1 if true. Integer calculations are performed using the platform's int64_t type, floating point uses double.
The pseudo-ops IF
, ELSIF
,
ELSE
and ENDIF
guide conditional assembly.
IF
and ELSIF
take one argument, which is
evaluated as an integer. If the result is non-zero, the following code will be
assembled, else it will be skipped. Undefined symbols encountered while
evaluating the condition are interpreted as zero (false) rather than raising an
error.
Conditional assembly pseudo-ops are permitted within macro definitions and will be evaluated at the time of expansion, therefore positional variables can be used to affect macro expansion.
Code can be placed into named sections with the SECTION
pseudo-op. This can make breaking source into multiple input files more
comfortable. Without ORG
or PUT
directives,
sections will follow each other in memory in the order they are first defined.
Within each section, there may exist multiple spans of discontiguous data. Certain output formats are able to represent this, for the others (e.g. DragonDOS), the spans are combined first, with the gaps between them padded with zero bytes.
Local labels are considered local to the current section. A local label is any decimal number used in the label field, and the same local label may appear mulitple times, unlike other labels.
As an operand, a decimal number followed by B
or F
is considered to be a back or forward reference to the previous or next
occurrence of that numerical local label in the section.
In this example, the 1
label occurs twice, but each use of
1B
refers to the closest one searching backwards:
0000 8E0400 scroll ldx #$0400 0003 EC8820 1 ldd 32,x 0006 ED81 std ,x++ 0008 8C05E0 cmpx #$05e0 000B 25F6 blo 1B 000D CC6060 ldd #$6060 0010 ED81 1 std ,x++ 0012 8C0600 cmpx #$0600 0015 25F9 blo 1B 0017 39 rts
An exclamation mark (!
) in the label field is treated as a
local label numbered zero. Operands of <
and >
are considered equivalent to 0B
and 0F
respectively,
and can therefore refer to the !
local label. This is included for
compatibility with other assemblers.
As local labels can be repeated, their position is used to distinguish them. For this reason, all file inclusions and macro expansion must occur during the first pass so that the absolute line count at which each local label is encountered remains the same between passes.
Start a macro definition by specifying a name for it in the label field, and
MACRO
in the instruction field. Finish the definition with
ENDM
in the instruction field.
Use a macro by specifying its name in the instruction field. Any arguments given will be available during expansion as a positional variable.
Positional variables can be used within strings, or pasted to form symbol names. In either case, they must be quoted or they will be passed by value, which will result in an error if they do not correspond to valid symbols by themselves.
The positional variables are referred to with \{1}
,
\{2}
, …, \{n}
. For the first nine
arguments, the braces are not required, so \1
, \2
,
…, \9
are valid alternatives. For compatibility with the
TSC Flex assembler, another form is accepted: &{1}
,
&{2}
, …, &{n}
. Within a
string, the shorter &1
, &2
, …,
&9
is still valid, but as this can be confused with bitwise
AND, it is not permitted elsewhere.
Here's a silly example demonstrating positional variables and symbol pasting. Consider the following macro definition and utilising code:
go_left equ -1 go_right equ +1 move macro lda x_position adda #go_\1 sta x_position endm do_move move "right" rts x_position rmb 1
The main code generated is as follows:
0000 do_move 0000 move "right" 0000 B60009 lda x_position 0003 8B01 adda #go_\1 0005 B70009 sta x_position 0008 39 rts
Conditional assembly:
IF
condition
ELSIF
condition
IF
and
ELSIF
pseudo-ops evaluated to false (zero) and
condition evaluates to true (non-zero).
ELSE
IF
and ELSIF
pseudo-ops evaluated to false (zero).
ENDIF
IF
statement.
Macro definition:
MACRO
ENDM
pseudo-op will not be
assembled until the macro is expanded. Macro definitions may be nested; that
is, using a macro may define another macro.
ENDM
MACRO
.
Inline data:
FCB
value[,value]…
FCC
value[,value]…
Historically, FCB
handled bytes and FCC
(Form
Constant Character string) handled strings. asm6809 treats
them as synonymous, but is rather more strict about what is allowed as a string
delimiter.
FCN
value[,value]…
FCC
, but a terminating zero byte is stored
after the data. Included to increase compatibility with other assemblers.
FCS
value[,value]…
Like FCC
, but the last byte in each value has its
top bit set. This is the format used to represent keywords in the Dragon and
Tandy Colour Computer BASIC ROMs.
FCV
value[,value]…
Like FCC
, but ASCII is translated into the values typically
required for display by the MC6847 VDG as present in the Dragon and Tandy
Colour Computer.
FCI
value[,value]…
Like FCV
, but inverts bit 6 for inverse video.
FDB
value[,value]…
FQB
value[,value]…
FILL
value,count
RZB
with its arguments swapped.
RZB
count[,value]
ZMB
count[,value]
BSZ
count[,value]
FILL
with its arguments swapped.
ZMB
and BSZ
are alternate forms
recognised for compatibility with other assemblers.
Code placement & addressing:
ALIGN
alignment[,value]…
RMB
instead.
ORG
address
PUT
pseudo-op, this will also be
the instruction's actual address in memory. A label on the same line will
define a symbol with a value of the specified address.
PUT
address
RMB
count
SECTION
name
CODE
DATA
BSS
RAM
AUTO
Each of CODE
, DATA
, BSS
,
RAM
, and AUTO
switches to a section named
after the pseudo-op. They are recognised for compatibility with other
assemblers.
SETDP
page
DP
) register to
page for subsequent instructions. Any non-negative page
is truncated to 8 bits, or specify a negative number to disable automatic
direct addressing.
See the section on Direct Page addressing for more information.
Symbols:
EQU
value
EXPORT
name[,name]…
SET
value
EQU
, this must be used with a label and defines a
symbol with the specified value. Unlike EQU
, you can
use SET
multiple times to assign different values to the same
symbol without error.
Files:
END
[address]
Optionally specifies an EXEC address to be included in the output, where supported by the output format. An EXEC address specified on the command line will override any value specified here.
INCLUDE
filename
/
characters.
INCLUDEBIN
filename
INCLUDE
must be a delimited string) directly.
The 6809 extends the zero page concept from other processors by allowing
fast accesses to whichever page is selected by the Direct Page register
(DP
). An assembler is not able to keep track of what the code has
set this register to, but the information is useful when deciding which
addressing mode to use for an instruction. The SETDP
pseudo-op, or
--setdp
option, informs the assembler that the supplied value is
to be assumed for DP
. Set this to a negative number to undefine
it and disable automatic use of direct addressing (this is the default).
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.