Reverse engineering "Frank Herbert's Dune"
- Monday May 31 2021
- python video-games reverse-engineering
One of my foremost interests outside of my professional work is enjoying science fiction entertainment. Despite having never read the books, I find the Dune universe by Frank Herbert to be amazing. The strangest work within the Dune universe is the video game "Frank Herbert's Dune" released in 2001 by Cryo Interactive. This game is interesting in that it was the last title released by Cryo Interactive. Wikipedia has some more information about this game. Spoiler: it isn't very good.
Attempting to understand how the video game engine works turned out to be much more interesting.
The game comes with only a few files that the installer places in C:\Program Files (x86)\Frank Herbert's Dune
by default. Outside of the obvious dune.exe
which is the game's executable, all the game data is stored in the globals.dun
and locale.dun
files. So I set out to unpack these files and see what they contain.
Binary archive formats - a crash course
Anytime you want to pack multiple entries into a single file, you need some sort of container format: either a standardized format like zip or tar, or an ad-hoc one.
It really doesn't matter which format you use; all archive formats basically store data using one of the following strategies:
- Null or sentinel separated values
- Length prefixed values
- Type & length prefixed values
- Name, type, & length prefixed values
The first option, "null separated", only works when you're doing something simple like storing a list of ASCII strings in a file. We can basically be sure that such an approach isn't going to be used for an asset file made for a video game.
Length prefixed values are much more common. Effectively, each value is preceded by its length in the file. This works well, but you have no idea how to tell what a given entry represents. It could be an image file for a graphics texture or a WAV file for game audio.
If you prefix each entry with a type & length, it is much simpler to work with. Of course, you are still stuck referring to everything by its position in the file. "Entry #632" is not particularly descriptive compared to characters.skins.business_suit
. So most archives wind up storing the name of an entry, the type, & length. Layouts built around a type and a length in front of each value are often called "TLV" (type-length-value) formats.
The one monkey wrench that gets thrown into this is that almost all archive formats include a "header" of some sort. Sometimes the name, type & length are stored in the header along with an offset into the file. That makes it possible to quickly scan the header to locate any single item and then jump to its location in the archive.
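To make the strategies above concrete, here is a minimal sketch of a name, type & length prefixed encoding. The field widths (a one-byte name length, a 4-byte type tag, a 4-byte little-endian payload length) and the helper names `pack_entry` and `unpack_entries` are assumptions chosen for illustration; they do not describe any particular real format.

```python
import struct

def pack_entry(name: str, type_tag: bytes, payload: bytes) -> bytes:
    """Encode one entry as: name length, name, 4-byte type tag, payload length, payload."""
    raw_name = name.encode("ascii")
    if len(raw_name) > 255 or len(type_tag) != 4:
        raise ValueError("name too long or type tag is not 4 bytes")
    return (struct.pack("<B", len(raw_name)) + raw_name
            + type_tag + struct.pack("<I", len(payload)) + payload)

def unpack_entries(blob: bytes):
    """Walk the blob front to back, yielding (name, type_tag, payload) tuples."""
    pos = 0
    while pos < len(blob):
        (name_len,) = struct.unpack_from("<B", blob, pos)
        pos += 1
        name = blob[pos:pos + name_len].decode("ascii")
        pos += name_len
        type_tag = blob[pos:pos + 4]
        pos += 4
        (length,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        yield name, type_tag, blob[pos:pos + length]
        pos += length

# Two made-up entries packed back to back, then read back in order.
archive = (pack_entry("characters.skins.business_suit", b"BMP ", b"\x01\x02\x03")
           + pack_entry("sounds.click", b"WAV ", b"\xff" * 5))
entries = list(unpack_entries(archive))
for name, type_tag, payload in entries:
    print(name, type_tag, len(payload))
```

Because every length is known before the value is read, the reader never has to scan the payload itself, which is why this style works for binary assets where a sentinel byte could appear anywhere.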
Reverse engineering the asset archives
I needed to figure out exactly what the developers of this game used. Thankfully, they gave us two files which hopefully use the same binary file format. So I used hd
to view the start of each file
Start of globals.dun

    00000000  72 10 ea f4 00 00 00 00  d4 69 af 1a 52 0c 00 00  |r........i..R...|
    00000010  0a 23 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |.#--------------|
    00000020  2d 2d 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |----------------|

Start of locale.dun

    00000000  72 10 ea f4 00 00 00 00  3b 5f 8a 06 07 07 00 00  |r.......;_......|
    00000010  23 20 43 49 4e 45 4d 41  54 49 51 55 45 5f 31 5f  |# CINEMATIQUE_1_|
    00000020  53 43 45 4e 45 31 5f 31  0a 43 49 4e 45 4d 41 54  |SCENE1_1.CINEMAT|
    00000030  49 51 55 45 5f 31 5f 53  43 45 4e 45 31 5f 31 5f  |IQUE_1_SCENE1_1_|
    00000040  31 5f 54 58 54 20 20 20  20 20 52 3a 5c 64 75 6e  |1_TXT     R:\dun|
    00000050  65 5c 64 61 74 61 5c 61  6c 6c 5c 6c 6f 63 61 6c  |e\data\all\local|
    00000060  65 5c 75 73 5c 74 78 74  5c 54 45 58 54 30 30 30  |e\us\txt\TEXT000|
    00000070  31 2e 74 78 74 20 20 20  20 20 30 0a 23 20 43 49  |1.txt     0.# CI|
The only thing you can quickly figure out by looking at this is that both files start with the hex sequence 72 10 ea f4 00 00 00 00
. This is just a magic number and has no useful information in it, so the first 8 bytes of each file can be ignored when trying to reverse engineer the format.
After looking over the files for a while, I was able to tell I was looking at a bunch of entries but couldn't make any sense of them. I decided to jump to the end of each file next.
End of globals.dun

    1ab285d8  dd c3 ec 15 5c 06 00 00  52 3a 5c 64 75 6e 65 5c  |....\...R:\dune\|
    1ab285e8  6e 65 77 5f 72 75 6e 74  69 6d 65 5c 73 6f 75 6e  |new_runtime\soun|
    1ab285f8  64 5c 6d 61 75 6c 61 5f  63 6c 69 63 6b 2e 77 61  |d\maula_click.wa|
    1ab28608  76 0a 32 79 47 17 9a 04  00 00 52 3a 5c 64 75 6e  |v.2yG.....R:\dun|
    1ab28618  65 5c 6e 65 77 5f 72 75  6e 74 69 6d 65 5c 73 6f  |e\new_runtime\so|
    1ab28628  75 6e 64 5c 6d 65 6e 75  5f 63 6c 69 63 6b 2e 77  |und\menu_click.w|
    1ab28638  61 76 0a cc 7d 47 17 60  12 00 00 72 65 73 6f 75  |av..}G.`...resou|
    1ab28648  72 63 65 2e 64 61 74 0a  10 00 00 00 5d 8c 02 00  |rce.dat.....]...|
    1ab28658

End of locale.dun

    068bde62  52 3a 5c 64 75 6e 65 5c  67 61 6d 65 5c 63 68 61  |R:\dune\game\cha|
    068bde72  72 61 63 74 65 72 73 5c  50 46 5c 46 52 5f 50 46  |racters\PF\FR_PF|
    068bde82  5f 48 45 41 44 2e 73 6b  6c 0a ee ed 62 00 6f 03  |_HEAD.skl...b.o.|
    068bde92  00 00 52 3a 5c 64 75 6e  65 5c 67 61 6d 65 5c 63  |..R:\dune\game\c|
    068bdea2  68 61 72 61 63 74 65 72  73 5c 66 72 5f 63 68 61  |haracters\fr_cha|
    068bdeb2  5c 46 52 5f 63 68 61 5f  48 45 41 44 5f 73 6b 6c  |\FR_cha_HEAD_skl|
    068bdec2  2e 73 6b 6c 0a bb d7 bf  00 6f 03 00 00 6c 6f 63  |.skl.....o...loc|
    068bded2  61 6c 65 2e 64 61 74 0a  10 00 00 00 37 1c 02 00  |ale.dat.....7...|
    068bdee2
It appears that the end of globals.dun
references resource.dat
and that the end of locale.dun
references locale.dat
. The other odd thing here is all the strings that reference a complete Microsoft Windows file path. This left me scratching my head. Almost all video games seem to wind up with references to paths on the build machine used to make the final product. This is usually inconsequential. In this case, though, each and every file seemed to be stored under its absolute path.
I am the world's laziest reverse engineer. So I installed the game in Wine and used strace
to launch the game under Linux. All I really had to do was start the first level and then immediately end the game. This generates a huge log file, but all that I really care about is finding any indication of how globals.dun
was accessed.
Note: The game itself is a 32-bit Windows executable. I couldn't get it to run under any modern version of Windows.
I searched for globals.dun
in the log and found this
    [pid 28743] stat64("/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", {st_mode=S_IFREG|0644, st_size=447907416, ...}) = 0
    [pid 28743] lstat64("/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", {st_mode=S_IFREG|0644, st_size=447907416, ...}) = 0
    [pid 28743] stat64("/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", {st_mode=S_IFREG|0644, st_size=447907416, ...}) = 0
    [pid 28746] openat(AT_FDCWD, "/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", O_RDONLY|O_NONBLOCK) = 30
    [pid 28743] stat64("/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", <unfinished ...>
    [pid 28746] openat(AT_FDCWD, "/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", O_RDONLY|O_NONBLOCK) = 100
    [pid 28743] stat64("/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", <unfinished ...>
    [pid 28746] openat(AT_FDCWD, "/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", O_RDONLY|O_NONBLOCK <unfinished ...>
    [pid 28743] stat64("/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", <unfinished ...>
    [pid 28746] openat(AT_FDCWD, "/home/ericu/.wine/dosdevices/c:/Program Files (x86)/Frank Herbert's Dune/globals.dun", O_RDONLY|O_NONBLOCK) = 100
Apparently openat
was called several times for the file that I cared about. The number on the right-hand side is the file descriptor returned by the call to openat
. So basically 30
or 100
is the file descriptor in question. I couldn't actually find anything interesting being done with those file descriptors. But I remembered seeing a bunch of strings starting with R:
in both files, and the log contained reads like this

    [pid 28743] read(13, "PF_HEAD_SKL_PHONEM_O.anm\n\0\207J\0304\7\0"..., 4096) = 4096
    [pid 28743] read(13, "AD_labial_a.anm\nB\352M\30\4\22\0\0R:\\dune\\"..., 4096) = 4096
    [pid 28743] read(13, "une\\data\\all\\interface\\frontend\\"..., 4096) = 4096
    [pid 28743] read(13, "\0R:\\dune\\data\\all\\interface\\micr"..., 4096) = 4096
The strace
utility is nice enough to show the first few bytes read as a result of each call to read
that the game made. At this point I decided that file descriptor 13
was the one I actually cared about. The problem is that file descriptors are generally re-used by a program as it opens & closes files. So I couldn't just look at every usage of file descriptor 13
. So I looked in the general vicinity of the lines I had already found and noticed this

    [pid 28743] recvmsg(4, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="X\0\0\0", iov_len=4}], msg_iovlen=1, msg_control=[{cmsg_len=16, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[13]}], msg_controllen=16, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 4
    [pid 28743] fcntl64(13, F_SETFD, FD_CLOEXEC) = 0
It appears that file descriptor 13
comes across as a result of a call to recvmsg
which is receiving a file descriptor in this case. So I'm just going to guess that here 13
is referring to either globals.dun
or locale.dun
. They appear to both be the same format, so either one is fine to try and understand the format. I gathered up all the calls to the read
function from the strace log in the region right after the file descriptor is opened.
    [pid 28743] read(13, "r\20\352\364\0\0\0\0\324i\257\32R\f\0\0\n#--------------"..., 4096) = 4096
    [pid 28743] _llseek(13, 447703508, [447703508], SEEK_SET) = 0
    [pid 28743] read(13, "..\\code_scenarique\\Actors\\ActorF"..., 512) = 512
    [pid 28743] read(13, "Gard\\__init__.pyo\n\273.\5\0x\0\0\0..\\cod"..., 4096) = 4096
    [pid 28743] read(13, "que\\Behaviours\\followOnPathBhv\\f"..., 4096) = 4096
    [pid 28743] read(13, "ematic03\\Behaviours\\cinematicFre"..., 4096) = 4096
    [pid 28743] read(13, "tates.pyo\n\202\26\316\22(\25\0\0..\\code_scenar"..., 4096) = 4096
    [pid 28743] read(13, "de_scenarique\\Cinematic\\Cinemati"..., 4096) = 4096
    [pid 28743] read(13, "atic15\\Behaviours\\__init__.pyo\n\271"..., 4096) = 4096
    [pid 28743] read(13, "cenarique\\Cinematic\\Cinematic20\\"..., 4096) = 4096
    [pid 28743] read(13, "\\code_scenarique\\Mission01_In.py"..., 4096) = 4096
The first call to read
is probably reading from the very start of the file. Both files started with the hex sequence 72 10 ea f4 00 00 00 00
. The log from strace
attempts to display each byte as read. If we look at the first call to read
we can see that it returned r\20\352\364\0\0\0\0
. This is the same value as at the head of the file, but with the non-printable bytes written as octal escapes. So we can be sure that the call to read here is at the start of the file. It also read 4096 bytes and then immediately called _llseek
to reposition to offset 447703508
in the same file, then read 512 bytes with a call to read
. I looked at the offset 447703508
in globals.dun
using hd
    1aaf69d4  2e 2e 5c 63 6f 64 65 5f  73 63 65 6e 61 72 69 71  |..\code_scenariq|
    1aaf69e4  75 65 5c 41 63 74 6f 72  73 5c 41 63 74 6f 72 46  |ue\Actors\ActorF|
    1aaf69f4
We can see that the value read by the call to read
is obviously coming from globals.dun
. So at this point we can be sure that file descriptor 13
is reading from globals.dun
here. The other interesting thing is that the offset 447703508
is almost the end of globals.dun
. So the following happened:
- globals.dun is opened as file descriptor 13
- 4096 bytes at the start are read from the file
- The offset 447703508 is jumped to
- More data is read from the file
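The claim that the first read starts at byte 0 of the file can be checked mechanically rather than by eye. The sketch below converts the escaped string strace printed back into raw bytes and compares it with the bytes hd showed at the start of globals.dun. The helper name `decode_strace_string` is made up for this example, and it only handles the escapes that appear in this log (octal escapes plus a few single-letter ones), so treat it as an illustration rather than a full parser for strace output.

```python
import re

OCTAL = re.compile(r"[0-7]{1,3}")
SIMPLE = {"n": 0x0A, "t": 0x09, "r": 0x0D, "f": 0x0C, "v": 0x0B, "\\": 0x5C, '"': 0x22}

def decode_strace_string(text: str) -> bytes:
    """Turn a C-style escaped string, as printed by strace, back into bytes."""
    out = bytearray()
    pos = 0
    while pos < len(text):
        if text[pos] != "\\":
            out.append(ord(text[pos]))       # plain printable character
            pos += 1
            continue
        octal = OCTAL.match(text, pos + 1)
        if octal:
            out.append(int(octal.group(), 8))  # e.g. \352 is 0xea
            pos = octal.end()
        else:
            out.append(SIMPLE[text[pos + 1]])  # e.g. \n or \f
            pos += 2
    return bytes(out)

# String copied from the first read() in the log, up to the '#' character.
logged = r"r\20\352\364\0\0\0\0\324i\257\32R\f\0\0\n#"
# Bytes copied from the hd output for the start of globals.dun.
from_hexdump = bytes.fromhex("72 10 ea f4 00 00 00 00 d4 69 af 1a 52 0c 00 00 0a 23")

decoded = decode_strace_string(logged)
print(decoded.hex(" "))
print(decoded == from_hexdump)
```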
I basically guessed that the offset 447703508
must be stored somewhere in the first 4096 bytes of globals.dun
. I used the tool okteta
which is a graphical hex editor that shows you the value at your cursor in multiple formats.
I put my cursor on the 8th byte in the file and it showed me that a 32-bit unsigned integer is stored there with a value of 447703508
. So I can basically be sure that bytes 8-12 of the file tell you an offset to jump to at the start. Finding this value was basically just dumb luck. The authors of this game made no real effort to obfuscate the archive format whatsoever. Even a simple obfuscation trick like masking the stored values with a bitmask can make finding things like this incredibly difficult.
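The same lookup can be done without a hex editor. This short worked example takes the first 16 bytes of globals.dun from the hd output and decodes bytes 8-12 as a little-endian 32-bit unsigned integer. The comparison against the file size (taken from the st_size value in the strace log) is an extra sanity check added here: the big-endian reading of the same four bytes points far past the end of the file, so it cannot be the intended one.

```python
import struct

# First 16 bytes of globals.dun, copied from the hex dump.
header = bytes.fromhex("72 10 ea f4 00 00 00 00 d4 69 af 1a 52 0c 00 00")
FILE_SIZE = 447907416  # st_size reported by stat64 in the strace log

(index_offset,) = struct.unpack_from("<I", header, 8)   # little-endian
(big_endian,) = struct.unpack_from(">I", header, 8)     # big-endian, for comparison

# The same little-endian value, assembled by hand from d4 69 af 1a.
by_hand = 0xD4 + (0x69 << 8) + (0xAF << 16) + (0x1A << 24)

print(index_offset, hex(index_offset))
print("little-endian fits in file:", index_offset < FILE_SIZE)
print("big-endian fits in file:", big_endian < FILE_SIZE)
```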
I had to wonder why it was necessary to open a file and then read an offset which seeks to the end. After some consideration, I tried to think what I would have done if it was the year 2000 and I had to write a script to package up a bunch of arbitrary files. The simplest algorithm I could think of was
1. Open a new file, skip forward a fixed amount to leave some bytes for a header
2. Write every single asset into the file, one after the other, keeping track of it in a data structure somewhere
3. Dump the data structure that describes each asset that was packed into this archive
4. Go back and write the header, including the offset to the thing stored in step 3
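A minimal sketch of that packing algorithm, written against an in-memory buffer so it runs anywhere. This is a guess at how such a tool could look, not the tool Cryo or Widescreen Games actually used. The function name `pack_archive` is made up, the index layout (name, newline, offset, length) is just one simple choice for step 3, and the fourth header field is written as zero because its meaning is unknown.

```python
import io
import struct

MAGIC = bytes.fromhex("7210eaf4")   # first four bytes seen in both .dun files
HEADER_SIZE = 16                    # assumption: four 32-bit fields

def pack_archive(assets: dict) -> bytes:
    """Pack {name: payload} into a single blob with a trailing index."""
    out = io.BytesIO()
    out.seek(HEADER_SIZE)                       # step 1: leave room for the header
    index = []
    for name, payload in assets.items():        # step 2: assets back to back
        index.append((name, out.tell(), len(payload)))
        out.write(payload)
    index_offset = out.tell()
    for name, offset, length in index:          # step 3: dump the bookkeeping
        out.write(name.encode("ascii") + b"\n" + struct.pack("<II", offset, length))
    out.seek(0)                                 # step 4: go back, fill in the header
    out.write(MAGIC + struct.pack("<III", 0, index_offset, 0))
    return out.getvalue()

blob = pack_archive({"a.txt": b"hello", "b.bin": b"\x00\x01"})
print(len(blob), struct.unpack_from("<I", blob, 8)[0])
```

The appeal of this order is that the packer never needs to know the total size of the assets up front: the index is written last, and only one small fixed-size header has to be patched afterwards.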
This works and is probably very simple to implement. Cryo Interactive was effectively bankrupt as this game was developed, so I doubt anyone wasted time optimizing anything. My hypothesis was that the start of each .dun
file is laid out like this
Byte position | Format | Contents |
---|---|---|
0-4 | uint32 | Magic number |
4-8 | uint32 | Always zero |
8-12 | uint32 | offset into this file, where each resource is identified |
So I grabbed everything in globals.dun
past offset 447703508 and it goes like this
    1aaf69d4  2e 2e 5c 63 6f 64 65 5f  73 63 65 6e 61 72 69 71  |..\code_scenariq|
    1aaf69e4  75 65 5c 41 63 74 6f 72  73 5c 41 63 74 6f 72 46  |ue\Actors\ActorF|
    1aaf69f4  69 6c 65 2e 70 79 6f 0a  d5 18 3d 02 77 31 00 00  |ile.pyo...=.w1..|
    1aaf6a04  2e 2e 5c 63 6f 64 65 5f  73 63 65 6e 61 72 69 71  |..\code_scenariq|
    1aaf6a14  75 65 5c 41 63 74 6f 72  73 5c 42 6f 78 5c 5f 5f  |ue\Actors\Box\__|
    1aaf6a24  69 6e 69 74 5f 5f 2e 70  79 6f 0a 86 97 09 00 77  |init__.pyo.....w|
    1aaf6a34  00 00 00 2e 2e 5c 63 6f  64 65 5f 73 63 65 6e 61  |.....\code_scena|
    1aaf6a44  72 69 71 75 65 5c 41 63  74 6f 72 73 5c 42 6f 78  |rique\Actors\Box|
    1aaf6a54  5c 62 61 73 69 63 42 6f  78 2e 70 79 6f 0a fd 97  |\basicBox.pyo...|
    1aaf6a64  09 00 e1 09 00 00 2e 2e  5c 63 6f 64 65 5f 73 63  |........\code_sc|
    1aaf6a74  65 6e 61 72 69 71 75 65  5c 41 63 74 6f 72 73 5c  |enarique\Actors\|
    1aaf6a84  43 61 6d 65 72 61 5c 5f  5f 69 6e 69 74 5f 5f 2e  |Camera\__init__.|
    1aaf6a94  70 79 6f 0a 2e 1c 07 00  7a 00 00 00 2e 2e 5c 63  |pyo.....z.....\c|
    1aaf6aa4  6f 64 65 5f 73 63 65 6e  61 72 69 71 75 65 5c 41  |ode_scenarique\A|
    1aaf6ab4  63 74 6f 72 73 5c 43 61  6d 65 72 61 5c 62 61 73  |ctors\Camera\bas|
    1aaf6ac4  69 63 43 61 6d 65 72 61  2e 70 79 6f 0a a8 1c 07  |icCamera.pyo....|
    1aaf6ad4
It apparently is nothing more than a bunch of file paths with 9 bytes of other data in between. After each path is the hex value 0a
, which is the newline in ASCII. So 0a
is just marking the end of the path string. This leaves 8 bytes. Of those, we can basically be sure that 4 bytes are a position within this file. After looking at them for a while, it turns out that what is stored alongside each path is the offset into the file and then the entry length. So each of these entries has a format like
Byte position | Format | Contents |
---|---|---|
0 to (N-1) | ASCII String | File path |
N | Byte | Always newline |
N+1, N+2, N+3, N+4 | uint32 | offset into file |
N+5, N+6, N+7, N+8 | uint32 | length of entry |
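As a worked check of this layout, the sketch below parses two consecutive index entries whose bytes are copied from the hex dump of globals.dun (the `Box\__init__.pyo` and `Box\basicBox.pyo` entries). The function name `parse_index` is made up for the example. If the offset and length fields are read correctly, the first entry's offset plus its length should land exactly on the second entry's offset, since these two assets happen to be stored back to back.

```python
import struct

# Two index entries copied byte for byte from the hex dump.
sample = (
    b"..\\code_scenarique\\Actors\\Box\\__init__.pyo\n"
    + bytes.fromhex("86 97 09 00 77 00 00 00")
    + b"..\\code_scenarique\\Actors\\Box\\basicBox.pyo\n"
    + bytes.fromhex("fd 97 09 00 e1 09 00 00")
)

def parse_index(data: bytes):
    """Return a list of (path, offset, length) tuples from raw index bytes."""
    entries = []
    pos = 0
    while pos < len(data):
        end = data.index(b"\n", pos)              # path runs up to the newline
        path = data[pos:end].decode("ascii")
        offset, length = struct.unpack_from("<II", data, end + 1)
        entries.append((path, offset, length))
        pos = end + 1 + 8                         # skip newline and both uint32s
    return entries

entries = parse_index(sample)
for path, offset, length in entries:
    print("%-45s offset=0x%06x length=0x%x" % (path, offset, length))
print("contiguous:", entries[0][1] + entries[0][2] == entries[1][1])
```

Searching for the newline only from the start of each path matters: the offset and length bytes can legitimately contain 0x0a, so the parser must skip them by count rather than scan through them.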
So to dump the full contents of each archive, you only need to do the following
1. Skip the first 8 bytes
2. Read bytes 8-12, jump to that offset
3. Read path until 0x0A is reached
4. Read 4 bytes, this is the location of the entry in the file
5. Read 4 bytes, this is the length of that entry
6. Keep repeating 3-5 until the end of the file is reached.
I combined this into a Python 3 script and used it to dump everything out
    import struct
    import sys
    import os

    HEADER_SIZE = 16

    fin = open(sys.argv[1], 'rb')
    fin.seek(0)
    header_data = fin.read(HEADER_SIZE)
    # Header fields: magic, zero, offset of the index, and a fourth value not used here
    magic, _, header_start, _ = struct.unpack('<IIII', header_data)
    fin.seek(header_start)


    class HeaderParser(object):
        """Byte-at-a-time state machine: name -> offset -> entry_length -> name ..."""

        def __init__(self):
            self.state = 'name'
            self.filename = []
            self.offset = None
            self.entry_length = None
            self.entries = []

        def feed(self, data):
            for x in data:
                if self.state == 'name':
                    if x == 0x0a:
                        # Newline terminates the path
                        self.state = 'offset'
                        self.offset = []
                        self.filename = str(bytes(self.filename), 'ascii')
                    else:
                        self.filename.append(x)
                elif self.state == 'offset':
                    self.offset.append(x)
                    if len(self.offset) == 4:
                        self.offset, = struct.unpack('<I', bytes(self.offset))
                        self.entry_length = []
                        self.state = 'entry_length'
                elif self.state == 'entry_length':
                    self.entry_length.append(x)
                    if len(self.entry_length) == 4:
                        self.entry_length, = struct.unpack('<I', bytes(self.entry_length))
                        self.emit()

        def emit(self):
            entry = (self.filename, self.offset, self.entry_length)
            sys.stdout.write("File: %r\tOffset: %x\tLength: %x\n" % entry)
            self.entries.append(entry)
            self.state = 'name'
            self.filename = []


    # The index runs from header_start to the end of the file
    parser = HeaderParser()
    while True:
        data = fin.read(32)
        if len(data) == 0:
            break
        parser.feed(data)


    def expand_path(p):
        parts = p.split('\\')
        # Check for drive letter (slice so one-character names don't raise IndexError)
        if p[1:2] == ':':
            parts[0] = 'drive_%s' % (parts[0][0].lower(),)
        elif parts[0] == '.' or parts[0] == '..':
            parts = parts[1:]
        return os.path.join(*parts)


    for filename, offset, entry_length in parser.entries:
        converted_filename = expand_path(filename)
        dst = os.path.join(os.environ['HOME'], 'tmp', 'dune2001_assets', converted_filename)
        contained_in, _ = os.path.split(dst)
        os.makedirs(contained_in, exist_ok=True)
        with open(dst, 'wb') as fout:
            fin.seek(offset)
            fout.write(fin.read(entry_length))
You can run this script with python3 ./dun_dumper.py /path/to/globals.dun
. This dumps everything into your home directory under ~/tmp/dune2001_assets
. Since the "name" of each entry in the archive is actually a path it does some work to try and un-uglify all that stuff and make it friendly to being unpacked onto a unix machine.
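As an alternative sketch of that path cleanup, the standard library's `pathlib.PureWindowsPath` can parse Windows-style names on any operating system. The helper name `safe_relative_path` is made up here. Note one deliberate difference from the script: this version drops every `.` and `..` component, not just a leading one, so that a hostile archive name could not climb out of the destination directory.

```python
from pathlib import PurePosixPath, PureWindowsPath

def safe_relative_path(archive_name: str) -> PurePosixPath:
    """Map a Windows-style archive name onto a relative POSIX path."""
    win = PureWindowsPath(archive_name)
    parts = []
    if win.drive:
        # "R:" becomes a top-level directory such as "drive_r"
        parts.append("drive_" + win.drive[0].lower())
    for part in win.parts:
        if part in (win.anchor, ".", ".."):
            continue                      # drop the drive root and any dot segments
        parts.append(part)
    return PurePosixPath(*parts)

for name in (r"R:\dune\new_runtime\sound\menu_click.wav",
             r"..\code_scenarique\Actors\ActorFile.pyo",
             "resource.dat"):
    print(name, "->", safe_relative_path(name))
```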
Figuring out the assets
Looking through the assets, the following interesting things jump out
- 269 WAV files, totalling 272 megabytes
- 597 BMP files, totalling 70 megabytes
- 112 PNG files, totalling 9.1 megabytes
- 85 TIFF files, totalling 3.5 megabytes
- 1030 Python object code files, totalling 5.3 megabytes
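A tally like the one above can be produced with a short walk over the dumped directory. The sketch below is one way to do it; the function name `tally_by_extension` is made up, and the demo at the bottom runs against a throwaway temporary directory rather than the real assets, so the numbers it prints are not the ones from the game.

```python
import os
import tempfile
from collections import defaultdict

def tally_by_extension(root: str) -> dict:
    """Return {extension: (file count, total bytes)} for every file under root."""
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            ext = os.path.splitext(filename)[1].lower()   # ".WAV" and ".wav" together
            totals[ext][0] += 1
            totals[ext][1] += os.path.getsize(os.path.join(dirpath, filename))
    return {ext: tuple(pair) for ext, pair in totals.items()}

# Demo on three tiny placeholder files.
with tempfile.TemporaryDirectory() as demo_root:
    for name, size in (("a.WAV", 10), ("b.wav", 5), ("c.bmp", 3)):
        with open(os.path.join(demo_root, name), "wb") as f:
            f.write(b"\0" * size)
    demo = tally_by_extension(demo_root)

for ext, (count, size) in sorted(demo.items()):
    print(ext, count, size)
```

Pointing `tally_by_extension` at the directory the dump script wrote to (for example `~/tmp/dune2001_assets`, expanded with `os.path.expanduser`) gives the per-type counts and sizes.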
Media assets
It's really bizarre that uncompressed audio was used in the year 2001. The MP3 format had been around since the 90s, but I guess it's possible Cryo Interactive did not want to pay the licensing fees around it. The 70 megabytes of uncompressed BMP files is really weird when you consider that PNG and TIFF are also used in this project. I can't really see why someone would put uncompressed images in this project. So out of the 428 megabytes that make up globals.dun
there are 342 megabytes of uncompressed assets. That makes up 79.9% of the file size! So if you played this game back when it was released and wondered why it was slow to load, then it was probably because no one could be bothered to compress the assets.
Other
One of the stranger files was the file found at the path R:\dune\game\ps2memc\mc_icon.sys
. This file starts with PS2D
as the first four bytes. It's actually some sort of icon file for the PlayStation 2. I guess basically everything got packaged up into the PC release, including things which weren't needed at all.
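Checking the first few bytes of each dumped file is a quick way to spot oddities like this one. The sketch below is a tiny signature sniffer, assuming the well-known leading bytes for RIFF, BMP, PNG and TIFF files, plus the `PS2D` prefix seen in mc_icon.sys. The table and the function name `sniff` are made up for this example and are nowhere near a complete file identification tool.

```python
# (leading bytes, label) pairs; checked in order, first match wins.
SIGNATURES = [
    (b"RIFF", "RIFF container (WAV audio uses this)"),
    (b"BM", "BMP image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"II*\x00", "TIFF image, little-endian"),
    (b"MM\x00*", "TIFF image, big-endian"),
    (b"PS2D", "PS2D (PlayStation 2 icon data, as seen in mc_icon.sys)"),
]

def sniff(data: bytes) -> str:
    """Guess a file type from the first bytes of its contents."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"

print(sniff(b"PS2D\x00\x00\x00\x00"))
print(sniff(b"RIFF\x24\x00\x00\x00WAVEfmt "))
print(sniff(b"\x00\x01\x02\x03"))
```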
Python object code
What I found really odd was 1030 Python object code files. Why are Python object code files in a video game asset archive? Here is a quick sample of some of the file names
    ./code_scenarique/Missions/Mission05/States/sergeantMajor/SergeantMajorStates.pyo
    ./code_scenarique/Missions/Mission05/States/sergeantMajor/__init__.pyo
    ./code_scenarique/Missions/Mission05/States/cinematic05.pyo
    ./code_scenarique/Missions/Mission05/States/rabban/RabbanStates.pyo
    ./code_scenarique/Missions/Mission05/States/rabban/__init__.pyo
    ./code_scenarique/Missions/Mission05/States/StatueState.pyo
    ./code_scenarique/Missions/Mission05/States/__init__.pyo
    ./code_scenarique/Missions/Mission05/Objectives/Objectives05.pyo
    ./code_scenarique/Missions/Mission05/Objectives/__init__.pyo
    ./code_scenarique/Missions/Mission05/Actors/SergeantMajor/sergeantMajor.pyo

It appears that these files are used to define almost everything there is about levels, missions, cinematics, etc. in the game. This is unexpected! I used the file
command line utility to confirm that the files are in fact python 1.5/1.6 byte-compiled
. So this is actually cutting edge Python code from back in the year 2000. I did some more looking around and realized that the resource.dat
file is in fact a big manifest, referencing other assets from the file. It has lines like this in it
MISSION05_LEVEL_CODE ..\code_scenarique\Mission05.py 0
This line references the entire directory under ./code_scenarique/Mission05.pyo
, since that defines a full Python module. I wanted to confirm that the game actually contains a full Python interpreter, so I used strings
to quickly search through dune.exe
. I found a bunch of strings like this
    PyInterpreterState_Delete: invalid interp
    PyThreadState_Delete: invalid tstate
    PyThreadState_Delete: NULL interp
    PyThreadState_Delete: tstate is still current
    PyThreadState_Delete: NULL tstate
    PyThreadState_Clear: warning: thread still has a frame
    PyThreadState_Get: no current thread
    PyThreadState_GetDict: no current thread

So yes, the game does in fact contain a complete copy of the Python interpreter. This is interesting because Python 1.6.1 was the first version released under an open source license. The actual license is quite complex and not like anything else used in the open source world. Specifically, clauses B.2 and B.3 from the license are interesting

2. Subject to the terms and conditions of this License Agreement, CNRI hereby grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, analyze, test, perform and/or display publicly, prepare derivative works, distribute, and otherwise use Python 1.6.1 alone or in any derivative version, provided, however, that CNRI's License Agreement and CNRI's notice of copyright, i.e., "Copyright (c) 1995-2001 Corporation for National Research Initiatives; All Rights Reserved" are retained in Python 1.6.1 alone or in any derivative version prepared by Licensee. Alternately, in lieu of CNRI's License Agreement, Licensee may substitute the following text (omitting the quotes): "Python 1.6.1 is made available subject to the terms and conditions in CNRI's License Agreement. This Agreement together with Python 1.6.1 may be located on the Internet using the following unique, persistent identifier (known as a handle): 1895.22/1013. This Agreement may also be obtained from a proxy server on the Internet using the following URL: http://hdl.handle.net/1895.22/1013".

3. In the event Licensee prepares a derivative work that is based on or incorporates Python 1.6.1 or any part thereof, and wants to make the derivative work available to others as provided herein, then Licensee hereby agrees to include in any such work a brief summary of the changes made to Python 1.6.1.

This license text suggests that Cryo Interactive was responsible for giving notice that their software contains Python and, furthermore, for including a brief summary of any changes they made. I'm pretty sure Cryo Interactive is in violation of the terms of the license. Given the 20+ years they've been gone, I don't think anything is going to change at this point.
The reason why each of the Python files is named .pyo
is because they are compiled Python code. Python source code doesn't compile like C or C++, but it still gets compiled to an intermediate representation before being executed by the interpreter. I was considering trying to compile the Python 1.6.1 compiler to see if I could use it to disassemble the object code. This turns out to be entirely unnecessary. The pycdc project is able to decompile basically any Python object code. Running the pycdc
executable doesn't produce much in the way of helpful output, but the pycdas
tool provides a human-readable disassembly of the Python object code. I didn't bother going through and trying to understand how everything works, but it appears that the mission Python files get total control to set up everything.
Looking at the file code_scenarique/Missions/Mission03/Objectives/Objectives03.pyo
using pycdas
produces the following disassembly
      0 LOAD_GLOBAL    0: ClassObjectivesManager
      3 LOAD_GLOBAL    1: scenaric
      6 LOAD_ATTR      2: AddObjective
      9 LOAD_GLOBAL    1: scenaric
     12 LOAD_ATTR      3: SetObjectiveComplete
     15 LOAD_GLOBAL    4: success
     18 CALL_FUNCTION  3
     21 STORE_FAST     0: objectiveManager
     24 LOAD_FAST      0: objectiveManager
     27 LOAD_ATTR      6: addObjective
     30 LOAD_CONST     1: 'PiloteDialog'
     33 CALL_FUNCTION  1
     36 POP_TOP
     37 LOAD_FAST      0: objectiveManager
     40 LOAD_ATTR      6: addObjective
     43 LOAD_CONST     2: 'FindFlightPlan'
     46 CALL_FUNCTION  1
     49 POP_TOP
     50 LOAD_FAST      0: objectiveManager
     53 LOAD_ATTR      6: addObjective
     56 LOAD_CONST     3: 'RobotRoom'
     59 CALL_FUNCTION  1
     62 POP_TOP
     63 LOAD_FAST      0: objectiveManager
     66 LOAD_ATTR      6: addObjective
     69 LOAD_CONST     4: 'DestroyRobot'
     72 CALL_FUNCTION  1
     75 POP_TOP
     76 LOAD_FAST      0: objectiveManager
     79 LOAD_ATTR      6: addObjective
     82 LOAD_CONST     5: 'MeetTheAmbassador'
     85 CALL_FUNCTION  1
     88 POP_TOP
     89 LOAD_FAST      0: objectiveManager
     92 LOAD_ATTR      6: addObjective
     95 LOAD_CONST     6: 'CitadelOutput'
     98 CALL_FUNCTION  1
    101 POP_TOP
    102 LOAD_FAST      0: objectiveManager
    105 LOAD_ATTR      6: addObjective
    108 LOAD_CONST     7: 'GetOutTheCitadel'
    111 CALL_FUNCTION  1
    114 POP_TOP
    115 LOAD_FAST      0: objectiveManager
    118 LOAD_ATTR      7: addNameTable
    121 LOAD_CONST     2: 'FindFlightPlan'
    124 LOAD_GLOBAL    8: resource
    127 LOAD_ATTR      9: MISSION3_OBJECTIFS_2_1_TXT
    130 LOAD_GLOBAL    8: resource
    133 LOAD_ATTR     10: MISSION3_OBJECTIFS_2_2_TXT
This file appears to configure a number of objectives for Mission 3. Interestingly, some of the paths are in French but most of the objectives are in English. This makes me think that the programming team was based in France, but the creative team responsible for game design was probably based elsewhere. It appears that the development work was done by Widescreen Games which was based in Lyon, France.
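Read by hand, the disassembly up to offset 114 corresponds to Python source along the following lines. This is a manual reconstruction, not decompiler output. The names `ClassObjectivesManager`, `scenaric`, `success` and the objective strings come from the bytecode, but their implementations below are hypothetical stand-ins so that the sketch can run, and the enclosing function name `setup_objectives` is invented (the `STORE_FAST` only tells us the code lives inside some function). The `addNameTable` call is left out because the listing is cut off before its arguments end.

```python
class ClassObjectivesManager:
    """Hypothetical stand-in: just records what the mission script registers."""
    def __init__(self, add_callback, complete_callback, success_callback):
        self.callbacks = (add_callback, complete_callback, success_callback)
        self.objectives = []

    def addObjective(self, name):
        self.objectives.append(name)


class scenaric:
    """Hypothetical stand-in for the engine object named in the bytecode."""
    @staticmethod
    def AddObjective(*args):
        pass

    @staticmethod
    def SetObjectiveComplete(*args):
        pass


def success(*args):
    """Hypothetical stand-in for the global named `success`."""


def setup_objectives():
    # Offsets 0-21: push three arguments, CALL_FUNCTION 3, STORE_FAST
    objectiveManager = ClassObjectivesManager(
        scenaric.AddObjective, scenaric.SetObjectiveComplete, success)
    # Offsets 24-114: seven LOAD_ATTR / LOAD_CONST / CALL_FUNCTION 1 / POP_TOP groups
    objectiveManager.addObjective('PiloteDialog')
    objectiveManager.addObjective('FindFlightPlan')
    objectiveManager.addObjective('RobotRoom')
    objectiveManager.addObjective('DestroyRobot')
    objectiveManager.addObjective('MeetTheAmbassador')
    objectiveManager.addObjective('CitadelOutput')
    objectiveManager.addObjective('GetOutTheCitadel')
    return objectiveManager  # added for the demo; not visible in the listing


manager = setup_objectives()
print(manager.objectives)
```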
One unique aspect of how this game is packaged is the reference to MISSION3_OBJECTIFS_2_1_TXT
. The Python bytecode using this is LOAD_GLOBAL
followed by LOAD_ATTR
. In Python code this is basically an attribute lookup on a global named resource, i.e. resource.MISSION3_OBJECTIFS_2_1_TXT
. The only other reference I could find was from the asset file locale.dat
with a line like this
MISSION3_OBJECTIFS_2_1_TXT R:\dune\data\all\locale\us\txt\TEXT0369.txt 0
The file R:\dune\data\all\locale\us\txt\TEXT0369.txt
was unpacked from locale.dun
and is just a single line
    Finding the maps of the secret base
It seems that the Python runtime was extended so it could import constants from the locale.dun
file as if they were Python code.
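How the engine actually wires this up is not something I verified, but the idea can be sketched in a few lines: read the manifest, and expose each name as an attribute on an object called `resource`. Everything below is an assumption made for illustration. The function name `build_resource_namespace` is invented, the manifest is assumed to be whitespace-separated with `#` comment lines (which matches the start of locale.dun shown earlier), the meaning of the trailing `0` is unknown and ignored, and the `archive` dictionary stands in for the unpacked contents of locale.dun.

```python
import types

# One comment line and one real entry, in the style of locale.dat.
MANIFEST = (
    "# MISSION3 objectives\n"
    "MISSION3_OBJECTIFS_2_1_TXT     R:\\dune\\data\\all\\locale\\us\\txt\\TEXT0369.txt     0\n"
)

def build_resource_namespace(manifest_text, load=lambda path: path):
    """Expose every manifest name as an attribute; `load` maps a path to a value."""
    namespace = types.SimpleNamespace()
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                          # skip blanks and comments
        name, path, _flag = line.split()      # assumes paths contain no spaces
        setattr(namespace, name, load(path))
    return namespace

# Stand-in for the archive contents: path -> text of the unpacked file.
archive = {
    "R:\\dune\\data\\all\\locale\\us\\txt\\TEXT0369.txt": "Finding the maps of the secret base",
}
resource = build_resource_namespace(MANIFEST, load=archive.__getitem__)
print(resource.MISSION3_OBJECTIFS_2_1_TXT)
```

With an object like this bound to the global name `resource`, the `LOAD_GLOBAL` / `LOAD_ATTR` pair in the bytecode would resolve to the localized string without the mission script knowing which language is installed.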
This game was quite a long way ahead of its time: it was a complete 3D game engine that used Python as a scripting language to set up everything for each mission the user played through. Most game engines work something like this, but the tooling used to set things up is generally closed source. In the case of Dune, the only real tooling needed is the ability to edit media files and to repackage Python code. I haven't actually tried this, however.
So what game engine is this?
The Wikipedia entry for this video game lists "RenderWare" as the game engine. RenderWare isn't so much a video game engine as a portable library for 3D graphics, sound, input, etc. On PC it predates the creation of DirectX. However, the use of the Python interpreter has nothing to do with the RenderWare engine. This appears to be a custom project developed by someone. Logically, I wanted to know who. The credits for the game list Pierre Deltour & Matthieu Imbert as "Technical Directors" for the game. Both are still active developers in France. I couldn't find anything connecting Pierre Deltour to the Python project. However, Matthieu Imbert is still active in the Python community, maintaining some open source modules. I found a complete CV for him. It contains this
Widescreen Games WSG, Lyon, France - 1999-2002
Project lead
Lead the R&D team. Managed the development of software libraries and tools used internally at the WSG game development studio.
In charge of the 3D rendering engine, the sound engine, the software services layer, the logic engine for the development of games on PC and Playstation 2. Designed and implemented the 3D engine (built on top of Renderware) and the sound engine on both PC and Playstation 2. Main technologies: C/C++, Visual Studio, GCC, CodeWarrior, VSS, GNU tools, Python, DirectX, Win32, STL, MFC, RenderWare.
So, my best guess is that Matthieu was responsible for the architecture & implementation that uses Python to tie together everything.