Reverse engineering "Frank Herbert's Dune"

One of my foremost interests outside of my professional work is enjoying science fiction entertainment. Despite having never read the books, I find the Dune universe by Frank Herbert to be amazing. The strangest work within the Dune universe is the video game "Frank Herbert's Dune" released in 2001 by Cryo Interactive. This game is interesting in that it was the last title released by Cryo Interactive. Wikipedia has some more information about this game. Spoiler: it isn't very good.

Attempting to understand how the video game engine works turned out to be much more interesting.

The game comes with only a few files that the installer places in C:\Program Files (x86)\Frank Herbert's Dune by default. Apart from the obvious dune.exe, which is the game's executable, all the game data is stored in the globals.dun and locale.dun files. So I set out to unpack these files and figure out what they contain.

Binary archive formats - a crash course

Anytime you want to pack multiple entries into a single file, you are going to need some sort of container format. That can be a standardized format like zip or tar, or an ad-hoc format.

It really doesn't matter what format you use; all archive formats basically store data using one of the following strategies:

  1. Null or sentinel separated values
  2. Length prefixed values
  3. Type & length prefixed values
  4. Name, type, & length prefixed values

The first option, "null separated", only works when you're doing something simple like trying to store a list of ASCII strings in a file. We can be fairly sure that such an approach isn't going to be used for any asset file made for a video game.

Length prefixed values are much more common. Effectively, each value is preceded by its length in the file. This works well, but you have no idea how to tell what a given entry represents. It could be an image file for a graphics texture or a WAV file for game audio.

If you prefix each entry with a type & length, it is much simpler to work with. This is commonly called a "TLV" (type-length-value) format. Of course, you are still stuck referring to everything by its position in the file. "Entry #632" is not particularly descriptive compared to characters.skins.business_suit. So most archives wind up storing the name of an entry along with the type & length.
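As a rough illustration of the type & length prefixed strategy, here is a minimal Python sketch. The 1-byte type tag, the 4-byte little-endian length, and the tag values themselves are arbitrary choices of mine for the example; they are not taken from any real archive format.

```python
import struct

# Hypothetical type tags, made up for this example
TYPE_TEXT = 1
TYPE_AUDIO = 2

PREFIX = '<BI'  # 1-byte type tag, then 4-byte little-endian length

def pack_entries(entries):
    """Serialize (type_tag, payload) pairs one after the other."""
    out = bytearray()
    for type_tag, payload in entries:
        out += struct.pack(PREFIX, type_tag, len(payload))
        out += payload
    return bytes(out)

def unpack_entries(blob):
    """Walk the blob, using each length prefix to find the next entry."""
    entries = []
    pos = 0
    while pos < len(blob):
        type_tag, length = struct.unpack_from(PREFIX, blob, pos)
        pos += struct.calcsize(PREFIX)
        entries.append((type_tag, blob[pos:pos + length]))
        pos += length
    return entries

blob = pack_entries([(TYPE_TEXT, b'hello'), (TYPE_AUDIO, b'RIFF....')])
print(unpack_entries(blob))
```

The entries can still only be addressed by their position in the blob, which is exactly the limitation that storing names is meant to solve.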

The one monkey wrench that gets thrown into this is that almost all archive formats include a "header" of some sort. Sometimes the name, type & length are stored in the header along with an offset into the file. So it is possible to quickly scan the header to locate any single item and then jump to its location in the archive.

Reverse engineering the asset archives

I needed to figure out exactly what the developers of this game used. Thankfully, they gave us two files which hopefully use the same binary file format. So I used hd to view the start of each file

Start of globals.dun

00000000  72 10 ea f4 00 00 00 00  d4 69 af 1a 52 0c 00 00  |r........i..R...|
00000010  0a 23 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |.#--------------|
00000020  2d 2d 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |----------------|

Start of locale.dun

00000000  72 10 ea f4 00 00 00 00  3b 5f 8a 06 07 07 00 00  |r.......;_......|
00000010  23 20 43 49 4e 45 4d 41  54 49 51 55 45 5f 31 5f  |# CINEMATIQUE_1_|
00000020  53 43 45 4e 45 31 5f 31  0a 43 49 4e 45 4d 41 54  |SCENE1_1.CINEMAT|
00000030  49 51 55 45 5f 31 5f 53  43 45 4e 45 31 5f 31 5f  |IQUE_1_SCENE1_1_|
00000040  31 5f 54 58 54 20 20 20  20 20 52 3a 5c 64 75 6e  |1_TXT     R:\dun|
00000050  65 5c 64 61 74 61 5c 61  6c 6c 5c 6c 6f 63 61 6c  |e\data\all\local|
00000060  65 5c 75 73 5c 74 78 74  5c 54 45 58 54 30 30 30  |e\us\txt\TEXT000|
00000070  31 2e 74 78 74 20 20 20  20 20 30 0a 23 20 43 49  |1.txt     0.# CI|

The only thing you can quickly figure out by looking at this is that both files start with the hex sequence 72 10 ea f4 00 00 00 00. This is just a magic number and has no useful information in it. So just ignore the first 8 bytes of each file when trying to reverse engineer this.

After looking over the files for a while, I was able to tell I was looking at a bunch of entries but couldn't make any sense of them. I decided to jump to the end of each file next.

End of globals.dun

1ab285d8  dd c3 ec 15 5c 06 00 00  52 3a 5c 64 75 6e 65 5c  |....\...R:\dune\|
1ab285e8  6e 65 77 5f 72 75 6e 74  69 6d 65 5c 73 6f 75 6e  |new_runtime\soun|
1ab285f8  64 5c 6d 61 75 6c 61 5f  63 6c 69 63 6b 2e 77 61  |d\maula_click.wa|
1ab28608  76 0a 32 79 47 17 9a 04  00 00 52 3a 5c 64 75 6e  |v.2yG.....R:\dun|
1ab28618  65 5c 6e 65 77 5f 72 75  6e 74 69 6d 65 5c 73 6f  |e\new_runtime\so|
1ab28628  75 6e 64 5c 6d 65 6e 75  5f 63 6c 69 63 6b 2e 77  |und\menu_click.w|
1ab28638  61 76 0a cc 7d 47 17 60  12 00 00 72 65 73 6f 75  |av..}G.`...resou|
1ab28648  72 63 65 2e 64 61 74 0a  10 00 00 00 5d 8c 02 00  |rce.dat.....]...|
1ab28658

End of locale.dun

068bde62  52 3a 5c 64 75 6e 65 5c  67 61 6d 65 5c 63 68 61  |R:\dune\game\cha|
068bde72  72 61 63 74 65 72 73 5c  50 46 5c 46 52 5f 50 46  |racters\PF\FR_PF|
068bde82  5f 48 45 41 44 2e 73 6b  6c 0a ee ed 62 00 6f 03  |_HEAD.skl...b.o.|
068bde92  00 00 52 3a 5c 64 75 6e  65 5c 67 61 6d 65 5c 63  |..R:\dune\game\c|
068bdea2  68 61 72 61 63 74 65 72  73 5c 66 72 5f 63 68 61  |haracters\fr_cha|
068bdeb2  5c 46 52 5f 63 68 61 5f  48 45 41 44 5f 73 6b 6c  |\FR_cha_HEAD_skl|
068bdec2  2e 73 6b 6c 0a bb d7 bf  00 6f 03 00 00 6c 6f 63  |.skl.....o...loc|
068bded2  61 6c 65 2e 64 61 74 0a  10 00 00 00 37 1c 02 00  |ale.dat.....7...|
068bdee2

It appears that the end of globals.dun references resource.dat and that the end of locale.dun references locale.dat. The other odd thing here is all the strings that reference a complete Microsoft Windows file path. This left me scratching my head. Almost all video games seem to wind up with references to paths on the build machine used to make the final product. It is usually inconsequential. In this case, each and every file seemed to be stored by its absolute path.

I am the world's laziest reverse engineer. So I installed the game in Wine and launched it under strace on Linux. All I really had to do was start the first level then immediately end the game. This generates a huge log file, but all that I really care about is finding any indication of how globals.dun was accessed.

 strace_log.zip 3.8 MB

This is the complete output of strace I got when running the game via wine

Note: The game itself is a 32-bit Windows executable. I couldn't get it to run under any modern version of Windows.

I searched for globals.dun in the log and found this

[pid 28743] stat64("/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", {st_mode=S_IFREG|0644, st_size=447907416, ...}) = 0
[pid 28743] lstat64("/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", {st_mode=S_IFREG|0644, st_size=447907416, ...}) = 0
[pid 28743] stat64("/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", {st_mode=S_IFREG|0644, st_size=447907416, ...}) = 0
[pid 28746] openat(AT_FDCWD, "/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", O_RDONLY|O_NONBLOCK) = 30
[pid 28743] stat64("/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun",  <unfinished ...>
[pid 28746] openat(AT_FDCWD, "/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", O_RDONLY|O_NONBLOCK) = 100
[pid 28743] stat64("/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun",  <unfinished ...>
[pid 28746] openat(AT_FDCWD, "/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", O_RDONLY|O_NONBLOCK <unfinished ...>
[pid 28743] stat64("/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun",  <unfinished ...>
[pid 28746] openat(AT_FDCWD, "/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", O_RDONLY|O_NONBLOCK) = 100

Apparently openat was called several times for the file that I cared about. The number on the right hand side is the file descriptor number that was assigned as a result of the call to openat. So basically 30 or 100 is the file descriptor in question. I couldn't actually find anything interesting being done with those file descriptors. But I remembered seeing a bunch of strings starting with R: in both files, and the log contains reads like this

[pid 28743] read(13, "PF_HEAD_SKL_PHONEM_O.anm\n\0\207J\0304\7\0"..., 4096) = 4096
[pid 28743] read(13, "AD_labial_a.anm\nB\352M\30\4\22\0\0R:\\dune\\"..., 4096) = 4096
[pid 28743] read(13, "une\\data\\all\\interface\\frontend\\"..., 4096) = 4096
[pid 28743] read(13, "\0R:\\dune\\data\\all\\interface\\micr"..., 4096) = 4096

The strace utility is nice enough to show the first few bytes read as a result of each call to read that the game made. At this point I decided that file descriptor 13 was the one I actually cared about. The problem is file descriptors are generally reused by a program as it opens & closes files. So I couldn't just look at every usage of file descriptor 13. So I looked in the general vicinity of the lines I had already found and noticed this

[pid 28743] recvmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="X\0\0\0", iov_len=4}], msg_iovlen=1, msg_control=[{cmsg_len=16, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[13]}], msg_controllen=16, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 4
[pid 28743] fcntl64(13, F_SETFD, FD_CLOEXEC) = 0

It appears that file descriptor 13 comes across as a result of a call to recvmsg which is receiving a file descriptor in this case. So I'm just going to guess that here 13 is referring to either globals.dun or locale.dun. They appear to both be the same format, so either one is fine to try and understand the format. I gathered up all the calls to the read function from the strace log in the region right after the file descriptor is opened.

[pid 28743] read(13, "r\20\352\364\0\0\0\0\324i\257\32R\f\0\0\n#--------------"..., 4096) = 4096
[pid 28743] _llseek(13, 447703508, [447703508], SEEK_SET) = 0
[pid 28743] read(13, "..\\code_scenarique\\Actors\\ActorF"..., 512) = 512
[pid 28743] read(13, "Gard\\__init__.pyo\n\273.\5\0x\0\0\0..\\cod"..., 4096) = 4096
[pid 28743] read(13, "que\\Behaviours\\followOnPathBhv\\f"..., 4096) = 4096
[pid 28743] read(13, "ematic03\\Behaviours\\cinematicFre"..., 4096) = 4096
[pid 28743] read(13, "tates.pyo\n\202\26\316\22(\25\0\0..\\code_scenar"..., 4096) = 4096
[pid 28743] read(13, "de_scenarique\\Cinematic\\Cinemati"..., 4096) = 4096
[pid 28743] read(13, "atic15\\Behaviours\\__init__.pyo\n\271"..., 4096) = 4096
[pid 28743] read(13, "cenarique\\Cinematic\\Cinematic20\\"..., 4096) = 4096
[pid 28743] read(13, "\\code_scenarique\\Mission01_In.py"..., 4096) = 4096
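Gathering these calls by hand gets tedious, so here is a rough Python sketch of how the read and _llseek calls for one file descriptor could be replayed to work out which file offset each read started at. It assumes the descriptor starts at position 0, only understands SEEK_SET, and ignores file descriptor reuse, so it should only be fed the slice of the log right after the descriptor shows up. The function name replay is my own.

```python
import re

READ_RE = re.compile(r'\bread\((\d+), .*\) = (\d+)$')
SEEK_RE = re.compile(r'_llseek\((\d+), (\d+), \[(\d+)\], SEEK_SET\) = 0$')

def replay(lines, fd):
    """Return (file_offset, byte_count) for every read on the given descriptor."""
    pos = 0  # assumption: the descriptor was freshly opened
    reads = []
    for line in lines:
        line = line.strip()
        seek = SEEK_RE.search(line)
        if seek and int(seek.group(1)) == fd:
            pos = int(seek.group(3))  # resulting position reported by the kernel
            continue
        read = READ_RE.search(line)
        if read and int(read.group(1)) == fd:
            count = int(read.group(2))
            reads.append((pos, count))
            pos += count
    return reads

# A few lines copied from the log above
sample_log = [
    r'[pid 28743] read(13, "r\20\352\364\0\0\0\0\324i\257\32R\f\0\0\n#--------------"..., 4096) = 4096',
    r'[pid 28743] _llseek(13, 447703508, [447703508], SEEK_SET) = 0',
    r'[pid 28743] read(13, "..\\code_scenarique\\Actors\\ActorF"..., 512) = 512',
    r'[pid 28743] read(13, "Gard\\__init__.pyo\n\273.\5\0x\0\0\0..\\cod"..., 4096) = 4096',
]

for offset, count in replay(sample_log, 13):
    print('read %d bytes at offset %d' % (count, offset))
```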

The first call to read is probably reading from the very start of the file. Both files started with the hex sequence 72 10 ea f4 00 00 00 00. The strace log attempts to display each byte as it was read. If we look at the first call to read we can see that it returned r\20\352\364\0\0\0\0. This is the same value as at the head of the file, with the printable byte shown as a character (r is 0x72) and the rest given as octal escapes. So we can be sure that the call to read here is at the start of the file. The game read 4096 bytes and then immediately called _llseek to reposition to offset 447703508 in the same file, then read 512 bytes with another call to read. I looked at the offset 447703508 in globals.dun using hd

1aaf69d4  2e 2e 5c 63 6f 64 65 5f  73 63 65 6e 61 72 69 71  |..\code_scenariq|
1aaf69e4  75 65 5c 41 63 74 6f 72  73 5c 41 63 74 6f 72 46  |ue\Actors\ActorF|
1aaf69f4

We can see that the value read by the call to read is obviously coming from globals.dun. So at this point we can be sure that file descriptor 13 is reading from globals.dun here. The other interesting thing is that the offset 447703508 is almost the end of globals.dun. So the following happened

  1. globals.dun is opened as file descriptor 13
  2. 4096 bytes at the start are read from the file
  3. The offset 447703508 is jumped to
  4. More data is read from the file
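The comparison between the escaped strings in the strace log and the output of hd can also be done mechanically. This is a small sketch that decodes the C-style escapes strace prints by default (octal escapes plus a handful of single-letter ones) back into bytes. The function name is my own, and it does not handle the hexadecimal escapes strace emits when run with -x or -xx.

```python
import re

# Single-character escapes used in strace's default string output
_SIMPLE = {'n': 0x0a, 't': 0x09, 'r': 0x0d, 'f': 0x0c, 'v': 0x0b,
           '\\': 0x5c, '"': 0x22}

# An octal escape (1 to 3 digits), any other escaped character, or a literal character
_TOKEN = re.compile(r'\\([0-7]{1,3})|\\(.)|(.)', re.DOTALL)

def decode_strace_string(text):
    """Turn the quoted part of a strace read() line back into raw bytes."""
    out = bytearray()
    for octal, simple, literal in _TOKEN.findall(text):
        if octal:
            out.append(int(octal, 8))
        elif simple:
            out.append(_SIMPLE[simple])  # raises KeyError on an escape not listed above
        else:
            out.append(ord(literal))
    return bytes(out)

# The string from the first read() on file descriptor 13
sample = r'r\20\352\364\0\0\0\0\324i\257\32R\f\0\0\n#--------------'
decoded = decode_strace_string(sample)
print(' '.join('%02x' % b for b in decoded[:16]))
# prints: 72 10 ea f4 00 00 00 00 d4 69 af 1a 52 0c 00 00
```

The printed bytes line up with the first row of the hd output for globals.dun.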

I basically guessed that the offset 447703508 must be stored somewhere in the first 4096 bytes of globals.dun. I used the tool okteta which is a graphical hex editor that shows you everything at your cursor in multiple formats

I put my cursor at offset 8 in the file and it showed me that a 32-bit unsigned little-endian integer is stored there with a value of 447703508. So I can basically be sure that bytes 8-12 of the file tell you an offset to jump to at the start. Finding this value was basically just dumb luck. The authors of this game made no real effort to obfuscate the archive format whatsoever. Even a simple obfuscation trick like masking the stored values with a bitmask can make finding things like this incredibly difficult.
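The same check can be done without a hex editor. This short sketch takes the first 16 bytes of globals.dun exactly as they appear in the hd output earlier and decodes the integer at offset 8 with Python's struct module.

```python
import struct

# First 16 bytes of globals.dun, copied from the hd output
header = bytes.fromhex('7210eaf4' '00000000' 'd469af1a' '520c0000')

# '<I' is an unsigned 32-bit little-endian integer
index_offset, = struct.unpack_from('<I', header, 8)
print(index_offset)       # prints 447703508
print(hex(index_offset))  # prints 0x1aaf69d4
```

The hexadecimal form matches the address hd shows for the first line of file paths, 1aaf69d4.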

I had to wonder why it was necessary to open a file and then read an offset which seeks to the end. After some consideration, I tried to think what I would have done if it was the year 2000 and I had to write a script to package up a bunch of arbitrary files. The simplest algorithm I could think of was

  1. Open a new file, skip forward a fixed amount to leave some bytes for a header
  2. Write every single asset into the file, one after the other, keeping track of it in a data structure somewhere
  3. Dump the data structure that describes each asset that was packed into this archive
  4. Go back and write the header, including the offset to the thing stored in step 3

This works and is probably very simple to implement. Cryo Interactive was effectively bankrupt as this game was developed, so I doubt anyone wasted time optimizing anything. My hypothesis was the start of each .dun is laid out like this

Byte position   Format   Contents
0-4             uint32   Magic number
4-8             uint32   Always zero
8-12            uint32   Offset into this file, where each resource is identified
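To make the guess about the packaging script concrete, here is a minimal sketch of the four steps in Python, writing a header with this layout. It is my own reconstruction and not the tool Cryo or Widescreen Games actually used. The 16-byte header size matches what the unpacking script further down reads, the meaning of the fourth header field was never worked out so it is written as zero, and the index uses newline-terminated names followed by an offset and a length.

```python
import io
import struct

MAGIC = bytes.fromhex('7210eaf4')
HEADER_SIZE = 16

def pack_dun(assets):
    """assets is a list of (path, payload) pairs. Returns the archive as bytes."""
    out = io.BytesIO()
    # Step 1: skip forward to leave room for the header
    out.write(b'\0' * HEADER_SIZE)
    # Step 2: write every asset, remembering where it went
    index = []
    for path, payload in assets:
        index.append((path, out.tell(), len(payload)))
        out.write(payload)
    # Step 3: dump the data structure describing each asset
    index_offset = out.tell()
    for path, offset, length in index:
        out.write(path.encode('ascii') + b'\n')
        out.write(struct.pack('<II', offset, length))
    # Step 4: go back and fill in the header
    out.seek(0)
    unknown = 0  # fourth field: purpose not determined, so just zero here
    out.write(MAGIC + struct.pack('<III', 0, index_offset, unknown))
    return out.getvalue()

archive = pack_dun([('a.txt', b'hello'), ('b.bin', b'\x01\x02')])
print(struct.unpack_from('<I', archive, 8)[0])  # offset of the index
```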

So I grabbed everything in globals.dun past offset 447703508 and it goes like this

1aaf69d4  2e 2e 5c 63 6f 64 65 5f  73 63 65 6e 61 72 69 71  |..\code_scenariq|
1aaf69e4  75 65 5c 41 63 74 6f 72  73 5c 41 63 74 6f 72 46  |ue\Actors\ActorF|
1aaf69f4  69 6c 65 2e 70 79 6f 0a  d5 18 3d 02 77 31 00 00  |ile.pyo...=.w1..|
1aaf6a04  2e 2e 5c 63 6f 64 65 5f  73 63 65 6e 61 72 69 71  |..\code_scenariq|
1aaf6a14  75 65 5c 41 63 74 6f 72  73 5c 42 6f 78 5c 5f 5f  |ue\Actors\Box\__|
1aaf6a24  69 6e 69 74 5f 5f 2e 70  79 6f 0a 86 97 09 00 77  |init__.pyo.....w|
1aaf6a34  00 00 00 2e 2e 5c 63 6f  64 65 5f 73 63 65 6e 61  |.....\code_scena|
1aaf6a44  72 69 71 75 65 5c 41 63  74 6f 72 73 5c 42 6f 78  |rique\Actors\Box|
1aaf6a54  5c 62 61 73 69 63 42 6f  78 2e 70 79 6f 0a fd 97  |\basicBox.pyo...|
1aaf6a64  09 00 e1 09 00 00 2e 2e  5c 63 6f 64 65 5f 73 63  |........\code_sc|
1aaf6a74  65 6e 61 72 69 71 75 65  5c 41 63 74 6f 72 73 5c  |enarique\Actors\|
1aaf6a84  43 61 6d 65 72 61 5c 5f  5f 69 6e 69 74 5f 5f 2e  |Camera\__init__.|
1aaf6a94  70 79 6f 0a 2e 1c 07 00  7a 00 00 00 2e 2e 5c 63  |pyo.....z.....\c|
1aaf6aa4  6f 64 65 5f 73 63 65 6e  61 72 69 71 75 65 5c 41  |ode_scenarique\A|
1aaf6ab4  63 74 6f 72 73 5c 43 61  6d 65 72 61 5c 62 61 73  |ctors\Camera\bas|
1aaf6ac4  69 63 43 61 6d 65 72 61  2e 70 79 6f 0a a8 1c 07  |icCamera.pyo....|
1aaf6ad4

It apparently is nothing more than a bunch of file paths with 9 bytes of other data in between. After each path is the hex value 0a, which is the newline in ASCII. So 0a is just marking the end of the path string. This leaves 8 bytes. Of those, we can basically be sure that 4 bytes are a position within this file. After looking at them a while it turns out what is stored along each path is the offset into the file and then the entry length. So each of these entries has a format like

Byte position   Format         Contents
0 to (N-1)      ASCII string   File path
N               byte           Always newline (0x0a)
N+1 to N+4      uint32         Offset into file
N+5 to N+8      uint32         Length of entry
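As a quick sanity check of this layout, the sketch below parses the first two entries from the hex dump above. The bytes are copied from the dump; the function name parse_index is my own, and it assumes the blob contains nothing but whole entries.

```python
import struct

def parse_index(blob):
    """Parse back-to-back entries of: path, newline, uint32 offset, uint32 length."""
    entries = []
    pos = 0
    while pos < len(blob):
        end = blob.index(b'\n', pos)              # the path runs up to the newline
        path = blob[pos:end].decode('ascii')
        offset, length = struct.unpack_from('<II', blob, end + 1)
        entries.append((path, offset, length))
        pos = end + 1 + 8                         # skip newline plus two uint32 values
    return entries

sample = (
    b'..\\code_scenarique\\Actors\\ActorFile.pyo\n' + bytes.fromhex('d5183d0277310000') +
    b'..\\code_scenarique\\Actors\\Box\\__init__.pyo\n' + bytes.fromhex('8697090077000000')
)

for path, offset, length in parse_index(sample):
    print('%s offset=0x%x length=%d' % (path, offset, length))
```

One encouraging detail is that the second entry's offset plus its length (0x99786 + 0x77) equals 0x997fd, which is the offset stored for basicBox.pyo in the dump. So the entries really do sit back to back in the file.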

So to dump the full contents of each archive, you only need to do the following

  1. Skip the first 8 bytes
  2. Read bytes 8-12, jump to that offset
  3. Read path until 0x0A is reached
  4. Read 4 bytes, this is the location of the entry in the file
  5. Read 4 bytes, this is the length of that entry
  6. Keep repeating 3-5 until the end of the file is reached.

I combined this into a Python 3 script and used it to dump everything out

 dun_dumper.py 2.1 kB

This python script can unpack a "dun" file

import struct
import sys
import os

HEADER_SIZE = 16

fin = open(sys.argv[1], 'rb')

fin.seek(0)
header_data = fin.read(HEADER_SIZE)
magic, _, header_start, _ = struct.unpack('<IIII', header_data)

fin.seek(header_start)

class HeaderParser(object):
  def __init__(self):
    self.state = 'name'
    self.filename = []
    self.offset = None
    self.entry_length = None
    self.entries = []

  def feed(self, data):
    for x in data:
      if self.state == 'name':
        if x == 0x0a:
          self.state = 'offset'
          self.offset = []
          self.filename = str(bytes(self.filename), 'ascii')
        else:
          self.filename.append(x)

      elif self.state == 'offset':
        self.offset.append(x)
        if len(self.offset) == 4:
          self.offset, = struct.unpack('<I', bytes(self.offset))
          self.entry_length = []
          self.state = 'entry_length'

      elif self.state == 'entry_length':
        self.entry_length.append(x)
        if len(self.entry_length) == 4:
          self.entry_length, = struct.unpack('<I', bytes(self.entry_length))
          self.emit()

  def emit(self):
    entry = (self.filename, self.offset, self.entry_length)
    sys.stdout.write("File: %r\tOffset: %x\tLength: %x\n" % entry)
    self.entries.append(entry)
    self.state = 'name'
    self.filename = []

parser = HeaderParser()
while True:
  data = fin.read(32)
  if len(data) == 0:
    break
  parser.feed(data)

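def expand_path(p):
  parts = p.split('\\')
  # Check for drive letter, e.g. "R:" becomes "drive_r"
  if len(p) > 1 and p[1] == ':':
    parts[0] = 'drive_%s' % (parts[0][0].lower(),)
  else:
    # Drop every leading '.' or '..' so nothing lands outside the output directory
    while len(parts) > 1 and parts[0] in ('.', '..'):
      parts = parts[1:]

  return os.path.join(*parts)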

for filename, offset, entry_length in parser.entries:
  converted_filename = expand_path(filename)
  dst = os.path.join(os.environ['HOME'], 'tmp', 'dune2001_assets', converted_filename)
  contained_in, _ = os.path.split(dst)
  os.makedirs(contained_in, exist_ok = True)
  with open(dst, 'wb') as fout:
    fin.seek(offset)
    fout.write(fin.read(entry_length))

You can run this script with python3 ./dun_dumper.py /path/to/globals.dun. This dumps everything into your home directory under ~/tmp/dune2001_assets. Since the "name" of each entry in the archive is actually a path it does some work to try and un-uglify all that stuff and make it friendly to being unpacked onto a unix machine.

Figuring out the assets

Looking through the assets, the following interesting things jump out

  1. 269 WAV files, totalling 272 megabytes
  2. 597 BMP files, totalling 70 megabytes
  3. 112 PNG files, totalling 9.1 megabytes
  4. 85 TIFF files, totalling 3.5 megabytes
  5. 1030 Python object code files, totalling 5.3 megabytes
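Counts like these can be produced straight from the list of index entries without unpacking anything. Below is a rough sketch of such a tally, grouping entries by file extension. The three sample entries are the last ones in globals.dun as decoded from the hex dump earlier; the function name is my own, and with the dumper script above you would pass parser.entries instead.

```python
import collections
import os

def tally_by_extension(entries):
    """Count entries and total bytes per lowercased file extension."""
    counts = collections.Counter()
    sizes = collections.Counter()
    for path, _offset, length in entries:
        # Normalize Windows separators so splitext behaves the same everywhere
        ext = os.path.splitext(path.replace('\\', '/'))[1].lower()
        counts[ext] += 1
        sizes[ext] += length
    return counts, sizes

sample_entries = [
    ('R:\\dune\\new_runtime\\sound\\maula_click.wav', 0x17477932, 0x049a),
    ('R:\\dune\\new_runtime\\sound\\menu_click.wav', 0x17477dcc, 0x1260),
    ('resource.dat', 0x10, 0x28c5d),
]

counts, sizes = tally_by_extension(sample_entries)
for ext in sorted(counts):
    print('%s: %d files, %d bytes' % (ext, counts[ext], sizes[ext]))
```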

Media assets

It's really bizarre that uncompressed audio was used in the year 2001. The MP3 format had been around since the 90s, but I guess it's possible Cryo Interactive did not want to pay the licensing fees around it. The 70 megabytes of uncompressed BMP files is really weird when you consider that PNG and TIFF are also used in this project. I can't really see why someone would put uncompressed images in this project. So out of the 428 megabytes that make up globals.dun there are 342 megabytes of uncompressed WAV and BMP assets. That makes up 79.9% of the file size! So if you played this game back when it was released and wondered why it was slow to load, then it was probably because no one could be bothered to compress the assets.

This funny image was found in the game's assets

This image is one of the higher resolution asset files and appears to have been used as a skybox

Other

One of the stranger files was the file found at the path R:\dune\game\ps2memc\mc_icon.sys. This file starts with PS2D as the first four bytes. It's actually some sort of icon file for the PlayStation 2. I guess basically everything got packaged up into the PC release, including things which weren't needed at all.

Python object code

What I found really odd was 1030 Python object code files. Why are Python object code files in a video game asset archive? Here is a quick sample of some of the file names

./code_scenarique/Missions/Mission05/States/sergeantMajor/SergeantMajorStates.pyo
./code_scenarique/Missions/Mission05/States/sergeantMajor/__init__.pyo
./code_scenarique/Missions/Mission05/States/cinematic05.pyo
./code_scenarique/Missions/Mission05/States/rabban/RabbanStates.pyo
./code_scenarique/Missions/Mission05/States/rabban/__init__.pyo
./code_scenarique/Missions/Mission05/States/StatueState.pyo
./code_scenarique/Missions/Mission05/States/__init__.pyo
./code_scenarique/Missions/Mission05/Objectives/Objectives05.pyo
./code_scenarique/Missions/Mission05/Objectives/__init__.pyo
./code_scenarique/Missions/Mission05/Actors/SergeantMajor/sergeantMajor.pyo

It appears that these files are used to define almost everything there is about levels, missions, cinematics, etc. in the game. This is unexpected! I used the file command line utility to confirm that the files are in fact Python 1.5/1.6 byte-compiled. So this is actually cutting edge Python code from back in the year 2000. I did some more looking around and realized that the resource.dat file is in fact a big manifest, referencing other assets from the file. It has lines like this in it

MISSION05_LEVEL_CODE            ..\code_scenarique\Mission05.py         0

This line references the entire directory under ./code_scenarique/Mission05.pyo, since that defines a full Python module. I wanted to confirm that the game actually contains a full Python interpreter, so I used strings to quickly search through dune.exe. I found a bunch of strings like this

PyInterpreterState_Delete: invalid interp
PyThreadState_Delete: invalid tstate
PyThreadState_Delete: NULL interp
PyThreadState_Delete: tstate is still current
PyThreadState_Delete: NULL tstate
PyThreadState_Clear: warning: thread still has a frame
PyThreadState_Get: no current thread
PyThreadState_GetDict: no current thread
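For anyone without the strings utility at hand, roughly the same search can be done in a few lines of Python. This is a simplified sketch that only looks for runs of printable ASCII, with a minimum run length of 4 that I picked as the default. The sample blob is made up for the example; on the real file you would pass open('dune.exe', 'rb').read() instead.

```python
import re

def ascii_strings(blob, min_length=4):
    """Return every run of printable ASCII characters of at least min_length."""
    pattern = re.compile(rb'[\x20-\x7e]{%d,}' % min_length)
    return [match.group().decode('ascii') for match in pattern.finditer(blob)]

# Made-up binary data with two of the strings above embedded in it
blob = (b'\x00\x01PyThreadState_Get: no current thread\x00\xff\x10abc\x00'
        b'PyInterpreterState_Delete: invalid interp\x00')

hits = [s for s in ascii_strings(blob) if s.startswith('Py')]
for hit in hits:
    print(hit)
```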

So yes, the game does in fact contain a complete copy of the Python interpreter. This is interesting because Python 1.6.1 was released under CNRI's own license, a revision of the Python 1.6 license that was meant to make it compatible with the GPL. The actual license is quite complex and not like anything else used in the open source world. Specifically, these clauses B.2 and B.3 from the license are interesting

2. Subject to the terms and conditions of this License Agreement, CNRI
hereby grants Licensee a nonexclusive, royalty-free, world-wide
license to reproduce, analyze, test, perform and/or display publicly,
prepare derivative works, distribute, and otherwise use Python 1.6.1
alone or in any derivative version, provided, however, that CNRI's
License Agreement and CNRI's notice of copyright, i.e., "Copyright (c)
1995-2001 Corporation for National Research Initiatives; All Rights
Reserved" are retained in Python 1.6.1 alone or in any derivative
version prepared by Licensee.  Alternately, in lieu of CNRI's License
Agreement, Licensee may substitute the following text (omitting the
quotes): "Python 1.6.1 is made available subject to the terms and
conditions in CNRI's License Agreement.  This Agreement together with
Python 1.6.1 may be located on the Internet using the following
unique, persistent identifier (known as a handle): 1895.22/1013.  This
Agreement may also be obtained from a proxy server on the Internet
using the following URL: http://hdl.handle.net/1895.22/1013".

3. In the event Licensee prepares a derivative work that is based on
or incorporates Python 1.6.1 or any part thereof, and wants to make
the derivative work available to others as provided herein, then
Licensee hereby agrees to include in any such work a brief summary of
the changes made to Python 1.6.1.

This license text suggests that Cryo Interactive was responsible for giving notice that their software contains Python and, furthermore, for including a summary of any changes they made. I'm pretty sure Cryo Interactive is in violation of the terms of the license. Given the 20+ years they've been gone, I don't think anything is going to change at this point.

The reason why each of the Python files is named .pyo is because they are compiled Python code. Python source code doesn't compile like C or C++, but it still gets compiled to an intermediate representation before being executed by the interpreter. I was considering trying to build Python 1.6.1 to see if I could use it to disassemble the object code. This turns out to be entirely unnecessary. The pycdc project is able to read object code from basically any Python version. Running the pycdc decompiler doesn't produce much in the way of helpful output here, but the pycdas disassembler from the same project provides a human readable form of the Python object code. I didn't bother going through and trying to understand how everything works, but it appears that the mission Python files get total control to set up everything.

Looking at the file code_scenarique/Missions/Mission03/Objectives/Objectives03.pyo using pycdas produces the following disassembly
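The version of a .pyo file can also be checked by hand, since compiled Python files begin with a two-byte little-endian magic number followed by a carriage return and newline. Here is a small sketch of that check. As far as I know the magic numbers are 20121 for Python 1.5.x and 50428 for Python 1.6, which is how they are listed in the comments of CPython's import machinery, but treat the table as my assumption. The sample headers are constructed for the example, not copied from the game.

```python
import struct

# Assumed magic numbers, per the list in CPython's import code
KNOWN_MAGIC = {20121: 'Python 1.5.x', 50428: 'Python 1.6'}

def pyo_version(header):
    """Guess the Python version from the first 4 bytes of a .pyc/.pyo file."""
    if len(header) < 4 or header[2:4] != b'\r\n':
        return None  # does not look like compiled Python at all
    number, = struct.unpack_from('<H', header, 0)
    return KNOWN_MAGIC.get(number)

# Constructed sample headers
print(pyo_version(struct.pack('<H', 50428) + b'\r\n'))
print(pyo_version(struct.pack('<H', 20121) + b'\r\n'))
print(pyo_version(b'RIFF'))
```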

                0       LOAD_GLOBAL             0: ClassObjectivesManager
                3       LOAD_GLOBAL             1: scenaric
                6       LOAD_ATTR               2: AddObjective
                9       LOAD_GLOBAL             1: scenaric
                12      LOAD_ATTR               3: SetObjectiveComplete
                15      LOAD_GLOBAL             4: success
                18      CALL_FUNCTION           3
                21      STORE_FAST              0: objectiveManager
                24      LOAD_FAST               0: objectiveManager
                27      LOAD_ATTR               6: addObjective
                30      LOAD_CONST              1: 'PiloteDialog'
                33      CALL_FUNCTION           1
                36      POP_TOP                 
                37      LOAD_FAST               0: objectiveManager
                40      LOAD_ATTR               6: addObjective
                43      LOAD_CONST              2: 'FindFlightPlan'
                46      CALL_FUNCTION           1
                49      POP_TOP                 
                50      LOAD_FAST               0: objectiveManager
                53      LOAD_ATTR               6: addObjective
                56      LOAD_CONST              3: 'RobotRoom'
                59      CALL_FUNCTION           1
                62      POP_TOP                 
                63      LOAD_FAST               0: objectiveManager
                66      LOAD_ATTR               6: addObjective
                69      LOAD_CONST              4: 'DestroyRobot'
                72      CALL_FUNCTION           1
                75      POP_TOP                 
                76      LOAD_FAST               0: objectiveManager
                79      LOAD_ATTR               6: addObjective
                82      LOAD_CONST              5: 'MeetTheAmbassador'
                85      CALL_FUNCTION           1
                88      POP_TOP                 
                89      LOAD_FAST               0: objectiveManager
                92      LOAD_ATTR               6: addObjective
                95      LOAD_CONST              6: 'CitadelOutput'
                98      CALL_FUNCTION           1
                101     POP_TOP                 
                102     LOAD_FAST               0: objectiveManager
                105     LOAD_ATTR               6: addObjective
                108     LOAD_CONST              7: 'GetOutTheCitadel'
                111     CALL_FUNCTION           1
                114     POP_TOP                 
                115     LOAD_FAST               0: objectiveManager
                118     LOAD_ATTR               7: addNameTable
                121     LOAD_CONST              2: 'FindFlightPlan'
                124     LOAD_GLOBAL             8: resource
                127     LOAD_ATTR               9: MISSION3_OBJECTIFS_2_1_TXT
                130     LOAD_GLOBAL             8: resource
                133     LOAD_ATTR               10: MISSION3_OBJECTIFS_2_2_TXT

This file appears to configure a number of objectives for Mission 3. Interestingly, some of the paths are in French but most of the objectives are in English. This makes me think that the programming team was based in France, but the creative team responsible for game design was probably based elsewhere. It appears that the development work was done by Widescreen Games which was based in Lyon, France.
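Reading the disassembly by hand, the original source probably looked something like the sketch below. This is my own reconstruction, not decompiler output. The class ClassObjectivesManager, the scenaric object and the success function are provided by the game, so the versions here are hypothetical stand-ins that only exist to make the example runnable. The function name setup_objectives is also invented, since the listing does not show the name of the function the bytecode belongs to.

```python
# Hypothetical stand-ins for names the engine provides
class ClassObjectivesManager:
    def __init__(self, add_callback, complete_callback, success_callback):
        self.callbacks = (add_callback, complete_callback, success_callback)
        self.objectives = []

    def addObjective(self, name):
        self.objectives.append(name)

class scenaric:
    @staticmethod
    def AddObjective(*args):
        pass

    @staticmethod
    def SetObjectiveComplete(*args):
        pass

def success(*args):
    pass

def setup_objectives():
    # Offsets 0-21: call the class with three arguments, store the result in a local
    objectiveManager = ClassObjectivesManager(
        scenaric.AddObjective, scenaric.SetObjectiveComplete, success)
    # Offsets 24-114: seven method calls whose results are discarded (POP_TOP)
    objectiveManager.addObjective('PiloteDialog')
    objectiveManager.addObjective('FindFlightPlan')
    objectiveManager.addObjective('RobotRoom')
    objectiveManager.addObjective('DestroyRobot')
    objectiveManager.addObjective('MeetTheAmbassador')
    objectiveManager.addObjective('CitadelOutput')
    objectiveManager.addObjective('GetOutTheCitadel')
    # The listing is cut off partway through the addNameTable call at offset 115,
    # so it is left out. The return statement is my addition for the demo.
    return objectiveManager

manager = setup_objectives()
print(manager.objectives)
```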

One unique aspect of how this game is packaged is the reference to MISSION3_OBJECTIFS_2_1_TXT. The Python bytecode using this is LOAD_GLOBAL followed by LOAD_ATTR. In Python code this is an attribute lookup on a global named resource, that is resource.MISSION3_OBJECTIFS_2_1_TXT, presumably after an import resource somewhere in the module. The only other reference I could find was from the asset file locale.dat with a line like this

MISSION3_OBJECTIFS_2_1_TXT     R:\dune\data\all\locale\us\txt\TEXT0369.txt     0

The file R:\dune\data\all\locale\us\txt\TEXT0369.txt was unpacked from locale.dun and is just a single line:

Finding the maps of the secret base

It seems that the Python runtime was extended so it could import constants from the locale.dun file as if they were Python code.
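I don't know how the engine actually does this, but the idea is easy to sketch: parse the manifest and expose each name as an attribute of an object called resource. The sample lines are the ones quoted in this article. The parsing rules are my guesses from looking at them (lines starting with # are comments, the first column is the name, the last column is a number, and everything in between is the path).

```python
import types

def parse_manifest(text):
    """Map each name in a .dat manifest to its (path, trailing number) pair."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # blank line or comment
        name, rest = line.split(None, 1)
        path, flag = rest.rsplit(None, 1)
        table[name] = (path, int(flag))
    return table

sample = r'''
# CINEMATIQUE_1_SCENE1_1
CINEMATIQUE_1_SCENE1_1_1_TXT     R:\dune\data\all\locale\us\txt\TEXT0001.txt     0
MISSION3_OBJECTIFS_2_1_TXT     R:\dune\data\all\locale\us\txt\TEXT0369.txt     0
'''

manifest = parse_manifest(sample)

# Stand-in for the game's resource module: each manifest name becomes an attribute
resource = types.SimpleNamespace(
    **{name: path for name, (path, _flag) in manifest.items()})

print(resource.MISSION3_OBJECTIFS_2_1_TXT)
```

With something like this in place, the LOAD_GLOBAL and LOAD_ATTR pair from the disassembly resolves to a path that can then be looked up in the archive index.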

This game was quite a long way ahead of its time, because it was a complete 3D game engine that used Python as a scripting language to set up everything for each mission the user played through. Most game engines work something like this, but the tools for setting things up are generally closed source. In the case of Dune, the only real tooling needed is the ability to edit media files and to repackage Python code. I haven't actually tried this, however.

So what game engine is this?

The Wikipedia entry for this video game lists "RenderWare" as the game engine. RenderWare isn't so much a video game engine as it is a portable library for 3D graphics, sound, input, etc. On PC it predates the creation of DirectX. However, the use of the Python interpreter has nothing to do with the RenderWare engine. This appears to be a custom project developed by someone. Logically, I wanted to know who. The credits for the game list Pierre Deltour & Matthieu Imbert as "Technical Directors" for the game. Both are still active developers in France. I couldn't find anything connecting Pierre Deltour to the Python project. However, Matthieu Imbert is still active in the Python community, maintaining some open source modules. I found a complete CV for him. It contains this

Widescreen Games WSG, Lyon, France - 1999-2002

Project lead

Lead the R&D team. Managed the development of software libraries and tools used internally at the WSG game development studio.

In charge of the 3D rendering engine, the sound engine, the software services layer, the logic engine for the development of games on PC and Playstation 2. Designed and implemented the 3D engine (built on top of Renderware) and the sound engine on both PC and Playstation 2. Main technologies: C/C++, Visual Studio, GCC, CodeWarrior, VSS, GNU tools, Python, DirectX, Win32, STL, MFC, RenderWare.

So, my best guess is that Matthieu was responsible for the architecture & implementation that uses Python to tie together everything.


Copyright Eric Urban 2021, or the respective entity where indicated