THE UNOFFICIAL ULTIMA UNDERWORLD SPECIFICATIONS
===============================================



0.1 Table of contents

0.1	Table of contents (you're looking at it)
1.1	Summary of data files
2.1	Game strings
2.2	Level maps
2.2.1	Level map header
2.2.2	Level map
2.2.3
2.2.4	Level texture list
2.3
2.3.x	Object IDs
2.4	Graphics files
2.4.1	Graphics file header
2.4.2	Bitmap format
2.5	Conversations



1.1 SUMMARY OF DATA FILES

Filename	    Section	Description

data/3dwin.gr		2.4
data/animo.gr		2.4
data/armor_f.gr		2.4	Armour bitmaps (worn) (female)
data/armor_m.gr		2.4	Armour bitmaps (worn) (male)
data/bodies.gr		2.4
data/buttons.gr		2.4
data/chains.gr		2.4
data/charhead.gr	2.4
data/chrbtns.gr		2.4
data/cnv.ark		2.5	Conversation scripts
data/comobj.dat			Common object properties
data/compass.gr		2.4	Your compass (bottom of view screen)
data/converse.gr	2.4
data/cursors.gr		2.4
data/doors.gr		2.4
data/dragons.gr		2.4	Those cute dragons below the viewscreen
data/eyes.gr		2.4
data/flasks.gr		2.4
data/gempt.gr	 (UW2)	2.4
data/genhead.gr		2.4
data/ghed.gr	 (UW2)	2.4
data/heads.gr		2.4
data/inv.gr		2.4
data/lev.ark		2.2	Level map archive
data/lfti.gr		2.4
data/objects.gr		2.4
data/opbtn.gr		2.4
data/optb.gr		2.4
data/optbtns.gr		2.4
data/panels.gr		2.4
data/power.gr		2.4
data/question.gr	2.4
data/scrledge.gr	2.4
data/spells.gr		2.4	Active spell icons
data/strings.pak	2.1	Game strings
data/tmflat.gr		2.4
data/tmobj.gr		2.4
data/views.gr		2.4
data/weap.gr	 (UW2)	2.4
data/weapons.gr	 (UW1)	2.4


2.1 GAME STRINGS, file data/strings.pak

This file uses a Huffman compression scheme to store its strings. The first 2
bytes of the file give the number of nodes in the tree. Then follow the
nodes themselves, 4 bytes each:

0000	char	Symbol
0001	int8	Parent node
0002	int8	Left child
0003	int8	Right child

The last node stored in the file is the head of the tree.
Following the nodes is a 16-bit word giving the number of string blocks in the
file. Then follows the block directory, 6 bytes per block as follows:

0000	int16	Block ID
0002	int32	Offset in file of start of block
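
A minimal sketch of reading the node table and block directory described
above. It assumes little-endian byte order (not stated above, but usual for a
DOS game). The node links are read as unsigned bytes so that 0xFF stands for
the -1 "no child" value. The name parse_strings_header is mine, not anything
from the game.

```python
import struct

NO_CHILD = 0xFF  # the same byte that reads as -1 when taken as a signed int8


def parse_strings_header(data):
    """Return (nodes, directory) parsed from the start of strings.pak.

    nodes is a list of (symbol, parent, left, right) tuples; directory is a
    list of (block_id, file_offset) tuples. Little-endian is an assumption.
    """
    (node_count,) = struct.unpack_from("<H", data, 0)
    pos = 2
    nodes = []
    for _ in range(node_count):
        # 4 bytes per node: symbol, parent, left child, right child
        nodes.append(struct.unpack_from("<4B", data, pos))
        pos += 4

    (block_count,) = struct.unpack_from("<H", data, pos)
    pos += 2
    directory = []
    for _ in range(block_count):
        # 6 bytes per entry: 16-bit block ID, 32-bit file offset
        directory.append(struct.unpack_from("<HI", data, pos))
        pos += 6
    return nodes, directory
```

The last entry of nodes is the head of the tree, as noted above.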

Each block contains a variable number of strings. The block header is:

0000	int16	No. of strings
0002	int16	Relative offset from end of block header

Strings are compressed using the Huffman tree in the usual way. Bits are
extracted big-endian i.e. rotated out of the top of each byte in turn.
Starting with the root node (last node), if a 1 bit is encountered the right
branch is taken, otherwise take the left. Repeat until a leaf (node with -1
for its children) is reached, at which point output the symbol for that node.
For the next bit we start again from the root. End of string is marked with a
`|' character.
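
The decoding walk can be sketched as below. This is only an illustration: it
assumes the nodes are held as (symbol, parent, left, right) tuples with 0xFF
standing for the -1 child marker, and the function name and the latin-1 text
encoding are my own choices.

```python
NO_CHILD = 0xFF  # -1 when the byte is read as a signed int8


def decode_string(nodes, data, start=0):
    """Decode one string from data, beginning at byte offset start.

    nodes is a list of (symbol, parent, left, right) tuples; the last entry
    is the root. Decoding stops at the '|' terminator or at end of data.
    """
    root = len(nodes) - 1
    node = root
    out = []
    for byte in data[start:]:
        # Bits come out of the top of each byte first.
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            _symbol, _parent, left, right = nodes[node]
            node = right if bit else left
            symbol, _parent, left, right = nodes[node]
            if left == NO_CHILD and right == NO_CHILD:
                if symbol == ord("|"):
                    return bytes(out).decode("latin-1")
                out.append(symbol)
                node = root  # next bit starts again from the root
    return bytes(out).decode("latin-1")
```

For a three-node tree whose root has 'a' on the left and '|' on the right,
the single byte 0x20 (bits 0, 0, 1, ...) decodes to "aa".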






2.2   LEVEL MAPS, file data/lev.ark


2.2.1 Level map header

2.2.2 Level map

Object information, normal object

This is 8 bytes long; it is convenient to consider it as 2 32-bit words split
along non-byte boundaries as follows:

Word 1		 0- 9	Object ID (see below)
(bytes 0-3)	10-15	[unknown] Alistair calls this Unk1
		16-22	Object Z position (0-127)
		23-25	Orientation (*45 deg)
		26-28	Object Y position (0-7)
		29-31	Object X position (0-7)

Word 2		 0- 5	Quality
(bytes 4-7)	 6-15	Link1 (object chain)
		16-21	[unknown] Alistair calls this Unk2
		22-31	Quantity / Link2

Note: I got this from examining the dump from Alistair Brown's rather excellent
UW2 editor. Get it at http://
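
As an illustration, the bit fields can be pulled apart like this. The sketch
assumes both words are stored little-endian and that bit 0 is the least
significant bit. It takes Z as the 7 bits 16-22 and orientation as the 3 bits
23-25, since a 0-127 range needs 7 bits and eight 45-degree steps need 3. The
function and key names are mine.

```python
import struct


def bits(word, lo, hi):
    """Extract bits lo..hi (inclusive, bit 0 = least significant) of word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def unpack_object(raw):
    """Split an 8-byte normal object record into its fields."""
    w1, w2 = struct.unpack("<II", raw[:8])  # little-endian is an assumption
    return {
        "object_id": bits(w1, 0, 9),
        "unk1": bits(w1, 10, 15),
        "z": bits(w1, 16, 22),
        "orientation_deg": bits(w1, 23, 25) * 45,
        "y": bits(w1, 26, 28),
        "x": bits(w1, 29, 31),
        "quality": bits(w2, 0, 5),
        "link1": bits(w2, 6, 15),
        "unk2": bits(w2, 16, 21),
        "quantity_or_link2": bits(w2, 22, 31),
    }
```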


2.2.3

2.2.4 Level texture list

For Underworld I this contains 48 16-bit words for the wall textures (64x64
texture ID in data/w64.tr), followed by 10 16-bit words for the floor textures
(32x32 texture ID in data/f32.tr), followed by 6 bytes whose meaning I haven't
yet deciphered.

For Underworld II this simply consists of 64 16-bit words giving the main
(64x64) texture ID, in data/t64.tr, to use for each possible map texture.
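
A short sketch of reading both layouts, assuming little-endian words and that
the list starts at offset 0 of the buffer handed in. The function names are
made up for this example.

```python
import struct


def parse_texture_list_uw1(data):
    """Return (wall_ids, floor_ids, unknown_bytes) for Underworld I."""
    walls = list(struct.unpack_from("<48H", data, 0))    # 48 words = 96 bytes
    floors = list(struct.unpack_from("<10H", data, 96))  # 10 words = 20 bytes
    unknown = bytes(data[116:122])                       # 6 undeciphered bytes
    return walls, floors, unknown


def parse_texture_list_uw2(data):
    """Return the 64 texture IDs for Underworld II."""
    return list(struct.unpack_from("<64H", data, 0))
```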






2.3

2.3.x Object IDs

0000-001F	Weapons and missiles
0020-003F	Armour and clothing
0040-007F	Monsters
0080-008F	Containers
0090-0097	Light sources
0098-009F	Wands
00A0-00AF	Treasure
00B0-00BF	Comestibles
00C0-00DF	Scenery and junk
00E0-00FF	Runes and bits of the Key of Infinity
0100-010F	Keys, lockpick, lock
0110-011F	Quest items
0120-012F	Inventory items, misc stuff
0130-013F	Books and scrolls
0140-014F	Doors
0150-015F	Furniture
0160-016F	Pillar, some decals, force field, special tmap (?)
0170-017F	Switches
0180-019F	Traps
01A0-01BF	Triggers
01C0-01CF	Explosions/splats, fountain, silver tree, moving things







2.4 GRAPHICS FILES, data/*.gr

2.4.1 Graphics file header

0000		int8	?? always seems to be 1
0001		int16	no. bitmaps
0003 - xxxx	int32	offset to bitmap
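
A minimal reader for this header might look like the following. It assumes
little-endian values and one offset per bitmap; whether the offsets count
from the start of the file is also an assumption. parse_gr_header is a
made-up name.

```python
import struct


def parse_gr_header(data):
    """Return (unknown_byte, offsets) from the start of a .gr file image."""
    unknown = data[0]                       # always seems to be 1
    (count,) = struct.unpack_from("<H", data, 1)
    offsets = list(struct.unpack_from("<%dI" % count, data, 3))
    return unknown, offsets
```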





2.4.2 Bitmap format

00	int8	type	:	04 8-bit uncompressed
				08 4-bit run-length
				0A 4-bit uncompressed
01	int8	width
02	int8	height

For the 4-bit formats (08 and 0A) there follows a byte indicating which of the
auxiliary palettes in data/allpals.dat to use. This file is simply a set of
16-byte tables containing, for each possible nybble, the index in the main
palette it represents in this bitmap.

Then follows, for all formats, a 16-bit word giving the size of the data
stored in the file. NOTE however that this depends on the word length; for
8-bit formats it is the number of bytes but for 4-bit formats it is the number
of nybbles.

After that follows the bitmap data itself. Palette index 0 is transparent.
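
The header layout just described can be read roughly as follows. This is a
sketch assuming a little-endian size word; the function name and the
dictionary keys are mine.

```python
import struct


def parse_bitmap_header(data, offset):
    """Parse one bitmap header starting at offset in a .gr file image."""
    kind, width, height = struct.unpack_from("<3B", data, offset)
    pos = offset + 3
    aux_palette = None
    if kind in (0x08, 0x0A):
        aux_palette = data[pos]   # which 16-byte table of allpals.dat to use
        pos += 1
    elif kind != 0x04:
        raise ValueError("unknown bitmap type 0x%02X" % kind)
    (size,) = struct.unpack_from("<H", data, pos)
    pos += 2
    return {
        "type": kind,
        "width": width,
        "height": height,
        "aux_palette": aux_palette,
        "size": size,                           # bytes (8-bit) or nybbles (4-bit)
        "word_bits": 8 if kind == 0x04 else 4,
        "data_offset": pos,                     # bitmap data starts here
    }
```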

UNCOMPRESSED BITMAPS (type 04)

These are straightforward and should require no additional explanation 8-)

4-BIT RUN-LENGTH COMPRESSED BITMAPS (type 08)

Type 8 (4-bit run-length) bitmaps are a little more interesting. The word
length is 4 (nybbles); we take the high nybble first if we only need one from
a byte. (In general for LG files, if we only need part of a byte take the high
bits first and save the low for later).
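
The nybble order above, together with the count and record rules given in the
rest of this section, can be sketched as a decoder like the one below. It is
an illustration only. It stops once width*height pixels have been produced,
because the end marker is not always present. It treats the records inside a
"multiple repeats" group as plain (count, nybble) pairs, which is my reading
of the text. All names are made up.

```python
def nibbles(data):
    """Yield 4-bit words from data, high nybble of each byte first."""
    for byte in data:
        yield byte >> 4
        yield byte & 0x0F


def read_count(nibs, wordsize=4):
    """Read a count occupying 1, 3 or 6 nybbles."""
    c = next(nibs)
    if c == 0:
        n1, n2 = next(nibs), next(nibs)
        c = (n1 << wordsize) + n2
    if c == 0:
        n1, n2, n3 = next(nibs), next(nibs), next(nibs)
        c = (n1 << 2 * wordsize) + (n2 << wordsize) + n3
    return c


def decode_rle4(data, aux_palette, pixel_count):
    """Decode type 08 data into pixel_count main-palette indices."""
    nibs = nibbles(data)
    out = []
    try:
        while len(out) < pixel_count:
            # Repeat record (counts 1 and 2 are special).
            count = read_count(nibs)
            if count == 1:
                pass                                  # skip: go straight to a run
            elif count == 2:
                for _ in range(read_count(nibs)):     # several repeats in a row
                    n = read_count(nibs)
                    out.extend([aux_palette[next(nibs)]] * n)
            else:
                out.extend([aux_palette[next(nibs)]] * count)
            if len(out) >= pixel_count:
                break
            # Run record: count literal nybbles.
            for _ in range(read_count(nibs)):
                out.append(aux_palette[next(nibs)])
    except StopIteration:
        pass                                          # ran out of input data
    return bytes(out[:pixel_count])
```

Here aux_palette is the 16-entry table selected by the bitmap header, so the
output bytes are indices into the main palette.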

A _count_ is obtained as follows: take a nybble, call it c. If c == 0, take the
next 2, then c = (n1 << wordsize) + n2. [wordsize is 4 in this case] If
c == 0 still, take the next 3 nybbles, then
c = (n1 << 2*wordsize) + (n2 << wordsize) + n3. I haven't encountered any
case where more than this is needed. A count is therefore between 1 and 6
nybbles long.

There are 2 types of record: run of bytes and repeated byte. (This should come
as no surprise). A run record consists of a count followed by that number of
nybbles; for each of these the byte output is the palette index in the
auxiliary palette corresponding to that nybble. A repeat record consists of a
count followed by a single nybble; the corresponding palette index is written
(count) times.

We start off with a repeat record and (normally) alternate between repeats and
runs. However, as there is no point in repeating a nybble fewer than 3 times,
counts 1 and 2 in a repeat record are special.

1	Skip this record. No repeat is performed, the next run follows
	immediately. This is used only at the very beginning of the compressed
	data if it should start with a run rather than a repeat.
2	Multiple repeats. Get another count, then extract that number of
	repeat records before the next run.
3+	Normal repeat record, this is the repeat count.

It also looks as if a run record containing a single zero byte (10) marks the
end of the compressed data, but this is not always present.

Note that there also exists a 5-bit compressed format which is exactly the same
as the above except that the word length is 5 bits instead of 4. This is used
for critter animation frames in the crit/ subdirectory. The auxiliary palette
contains 32 entries and is stored with the animation.

4-BIT UNCOMPRESSED BITMAPS (type 0A)

These are simple enough: for each nybble in the file, the colour index in the
bitmap proper is the corresponding index in the auxiliary palette.






2.5 CONVERSATION SCRIPTS, data/cnv.ark

This file controls the conversations with the various NPCs in the game. The
header is very simple, and consists of the number of available conversation
slots followed by the file offset to each conversation.

0000	int16	Number of conversation slots (not all need be used)
0002 ..	int32	File offsets to conversations

The name of the NPC involved in conversation n is in string (n+16) in chunk 6
of strings.pak. If a conversation is absent its offset will appear as zero in
the file header.

Each conversation has a header of its own as follows:

0000	int32	??
0004	int32	Length of code in 16-bit words

Following the imported functions list is the code itself. Conversation code is
run on a 16-bit stack-based virtual machine.

Opcode		No. immediate operands
|   Name	| No. stack operands
|   |		| | No. values saved to stack
|   |		| | |  Action
|   |		| | |  |
00  NOP		0 0 0  Do nothing.
01  OPADD	0 2 1  Push s[0] + s[1]
02  OPMUL	0 2 1  Push s[0] * s[1]
03  OPSUB	0 2 1  Push s[1] - s[0]
04  OPDIV	0 2 1  Push s[1] / s[0]
05  OPMOD	0 2 1  Push s[1] % s[0]
06  OPOR	0 2 1  Logical OR of top two values.
07  OPAND	0 2 1  Logical AND of top two values.
08  OPNOT	0 1 1  Logical NOT of top value.
09  TSTGT	0 2 1  Greater-than. Nonzero if s[1] > s[0].
0A  TSTGE	0 2 1  Greater-than-or-equal.
0B  TSTLT	0 2 1  Less-than.
0C  TSTLE	0 2 1  Less-than-or-equal.
0D  TSTEQ	0 2 1  Equality. Nonzero if s[1] == s[0].
0E  TSTNE	0 2 1  Non-equal.
0F  JMP		1 0 0  Jump absolute. Address is measured in words from the
		       start of the code.
10  BEQ		1 1 0  Branch on equal. Pop a value, branch relative if zero.
11  BNE		1 1 0  Branch on Not Equal. As BEQ but branch if the value
		        popped is non-zero.
12  BRA		1 0 0  Branch. Always branch relative to the offset address.
13  CALL	1 0 1  Call subroutine. Push the next instruction address and
		        jump to the absolute address (in words) given.
14  CALLI	1 0 0  Call imported subroutine.
15  RET		0 1 0  Return from subroutine. Pop the return address off the
		        stack and jump to it.
16  PUSHI	1 0 1  Push immediate value onto the stack.
17  PUSHI_EFF	1 0 1  Push effective address onto the stack. The value pushed
		        is the current frame pointer address plus the immediate
		        operand. This allows local variables and function
		        parameters.
18  POP		0 1 0  Pop a value from the stack (and throw it away).
19  SWAP	0 2 2  Swap the top two stack values.
1A  PUSHBP	0 0 1  Push the current frame pointer onto the stack.
1B  POPBP	0 1 0  Pop the frame pointer from the stack
1C  SPTOBP	0 0 0  New frame. Set the frame pointer to the stack pointer.
1D  BPTOSP	0 0 0  Exit frame. Set the stack pointer to the frame pointer.
1E  ADDSP	0 1 *  Pop a value, add (subtract) to the stack pointer. Used
		        to reserve stack space for variables.
1F  FETCHM	0 1 1  Pop address, push the value of the variable pointed to.
20  STO		0 2 0  Store s[0] in the variable pointed to by s[1].
21  OFFSET	0 2 1  Array offset. Add s[1] - 1 to the effective address in
		        s[0], push this as a new effective address.
22  START	0 0 0  Start program.
23  SAVE_REG	0 1 0  Pop a value from the stack and store it in the result
		        register.
24  PUSH_REG	0 0 1  Push the value of the result register on the stack.
25  STRCMP	? ? ?  String compare.
26  EXIT_OP	0 0 0  End program (?)
27  SAY_OP	0 1 0  NPC says something. Print a conversation string (from
		        the stack).
28  RESPOND_OP 	? ? ?  Respond (?)
29  OPNEG	0 1 1  Negate. s[0] -> -s[0].

(*) ADDSP, of course, doesn't actually push anything onto the stack, but its
     effect on the stack pointer is of pushing as many values as its operand
     specifies.
(?) I haven't yet encountered these in the wild, so don't know exactly what
     they do.
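
To make the stack conventions concrete, here is a toy interpreter for the
arithmetic, logic and comparison opcodes plus PUSHI, POP, SWAP, OPNEG and
EXIT_OP. It is a sketch, not the game's interpreter. It assumes s[0] is the
top of the stack and s[1] the value below it, that each opcode and each
immediate operand takes one 16-bit word, that tests and logical operations
yield 1 or 0, and that TSTGE, TSTLT, TSTLE and TSTNE compare s[1] against
s[0] in the same order as TSTGT. OPSUB is taken as s[1] - s[0] by analogy
with OPDIV and OPMOD. Python's floor division stands in for whatever
rounding the real machine uses, and no 16-bit wrap-around is applied.

```python
PUSHI, POP, SWAP, EXIT_OP = 0x16, 0x18, 0x19, 0x26

# Two-operand opcodes; each function receives (s1, s0), s0 being the top.
BINARY = {
    0x01: lambda s1, s0: s0 + s1,                    # OPADD
    0x02: lambda s1, s0: s0 * s1,                    # OPMUL
    0x03: lambda s1, s0: s1 - s0,                    # OPSUB
    0x04: lambda s1, s0: s1 // s0,                   # OPDIV (rounding assumed)
    0x05: lambda s1, s0: s1 % s0,                    # OPMOD
    0x06: lambda s1, s0: int(bool(s1) or bool(s0)),  # OPOR
    0x07: lambda s1, s0: int(bool(s1) and bool(s0)), # OPAND
    0x09: lambda s1, s0: int(s1 > s0),               # TSTGT
    0x0A: lambda s1, s0: int(s1 >= s0),              # TSTGE
    0x0B: lambda s1, s0: int(s1 < s0),               # TSTLT
    0x0C: lambda s1, s0: int(s1 <= s0),              # TSTLE
    0x0D: lambda s1, s0: int(s1 == s0),              # TSTEQ
    0x0E: lambda s1, s0: int(s1 != s0),              # TSTNE
}
UNARY = {
    0x08: lambda s0: int(not s0),                    # OPNOT
    0x29: lambda s0: -s0,                            # OPNEG
}


def run(code, max_steps=10000):
    """Execute a list of code words and return the final stack."""
    stack, pc = [], 0
    for _ in range(max_steps):
        if pc >= len(code):
            break
        op = code[pc]
        pc += 1
        if op == 0x00:                               # NOP
            continue
        if op == EXIT_OP:
            break
        if op == PUSHI:
            stack.append(code[pc])                   # immediate follows opcode
            pc += 1
        elif op == POP:
            stack.pop()
        elif op == SWAP:
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op in BINARY:
            s0 = stack.pop()
            s1 = stack.pop()
            stack.append(BINARY[op](s1, s0))
        elif op in UNARY:
            stack.append(UNARY[op](stack.pop()))
        else:
            raise NotImplementedError("opcode 0x%02X not in this sketch" % op)
    return stack
```

For example, run([0x16, 7, 0x16, 3, 0x03, 0x26]) pushes 7 and 3, subtracts,
and leaves [4] on the stack.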