Some help reverse engineering a piece of code?


Serious Callers Only
Got a warning
Posts: 166
Joined: Thu Feb 25, 2010 7:44 am

Post by Serious Callers Only »

It's not for an adventure game, but for an RPG.

Vampire: The Masquerade - Bloodlines.

I want to replace the version of Python that the game uses (2.1, cut down, without libraries) with 2.7 and its libraries, so modding is not so painful (e.g. using win32 for mouse events instead of clobbering Half-Life 2 settings).

My (small, pathetic) findings are on this page:
http://forum.bloodlinesresurgence.com/s ... php?tid=75

Basically, I've found so far that the version of Python Troika used has three new functions. One is a no-op (Py_SetGameInterface, just a RETN, though I'm not sure about its arguments either). The other two are used to send Python commands from the game engine to the interpreter.

I thought no-op-ing them wasn't the cause of the crash, but the latest disassembly (of the code that sets up Python) uses those functions to add the path of the game's Python scripts to the interpreter path, so they obviously need to be implemented.

Problem is, I suck at assembly/C++. I'm not even sure I have the right function signatures.

Halp

Post by Serious Callers Only »

No interest at all?

MusicallyInspired
Posts: 1042
Joined: Fri Mar 02, 2007 8:03 am
Location: Manitoba, Canada

Post by MusicallyInspired »

Doesn't that game use the Source engine? You could probably post on the Source Development Community board on the Steam forums.

Post by Serious Callers Only »

I don't think it's very related. Troika never used the released SDK, and it looks (to me) like the code in that DLL is just Python plus reading the Half-Life console stream (likely an argument I can't figure out), tokenizing it somehow, sending the commands to Python, and sending the response to another DLL (the &tier0.Msg call in Py_FlushConsoleOutput).

What is really needed is a reverse-engineering wizard.

(Another problem: I have no idea how to compile the replacement DLL so that it makes that &tier0.Msg call without the source code for the other DLLs.)

KuroShiro
Posts: 456
Joined: Thu May 15, 2008 7:42 am
Location: Miyazaki, Japan

Post by KuroShiro »

It IS the Source engine, an early version.

Post by Serious Callers Only »

Sure, but the Source engine itself never used Python. This is the Python DLL that Troika glued to the engine, and these functions are the ones at the border (inside the Python DLL).

As I said, they probably take a string from the console (or an input stream), tokenize it, pass it to the Python interpreter, and output the response to the console again (that tier0.Msg call).

I actually know now that PyRun_ConsoleString takes a char *, since I managed to print it (and an int, 256) from the function I replaced it with in the new DLL. (It is what you typed in the console.)

Afterwards it crashes.

If you type "quit" or some other Half-Life 2 command it doesn't crash, so the separation of what goes to the Source console and what goes to the Python interpreter is being done outside of the DLL.

Maybe there is additional code that is not called yet, so I don't know if it needs to be implemented. But I don't think so: the only intermodular call that is not a Windows (msvcrt, for example) call is the tier0.dll call in Py_FlushConsoleOutput (tier0.dll is indeed a DLL in the same folder).

If only Troika had kept all their own code out of the DLL, this would probably be much easier (drag and drop, even).

Post by Serious Callers Only »

This is how I'm doing it, if anyone wants to try:

Install Visual Studio (I'm using 2003).

Get the Python sources here:
http://www.python.org/ftp/python/2.7.1/Python-2.7.1.tgz

Extract them and open Visual Studio. The project to open depends on the Visual Studio version you have.

PCbuild is for 2008; the others (for Windows) are in PC\
I used PC\VS7.1 (for VS 2003).
Open VS, then
File -> Open Project and go to the Pythonsource\PC\VS7.1\pcbuild.sln file.

Not all of the projects this opens will build, because of missing dependencies, but you don't need them, just pythoncore (which makes python27.dll).

To avoid crashes when you replace vampire_python21.dll, open the pythoncore project and edit pythonrun.c, adding these three functions and their declarations:
PyAPI_FUNC(void) Py_SetGameInterface(void);
void Py_SetGameInterface(void){
    Py_NoSiteFlag = 1;
    return;
}

PyAPI_FUNC(void) Py_FlushConsoleOutput(void);
void Py_FlushConsoleOutput(void){
    return;
}

PyAPI_FUNC(void) PyRun_ConsoleString(char *, int);
void PyRun_ConsoleString(char *str, int size){
    printf("%s %d\n", str, size);
    return;
}

(I'm sure the arguments, and possibly the return types, are wrong and will crash the game; however, the exported names are needed for the game to start up.)

If you're on Windows (I was testing this on Wine, which is more lax, so I didn't notice), edit the object.c file and find these lines:
/* for binary compatibility with 2.2 */
#undef _PyObject_Del
void
_PyObject_Del(PyObject *op)
{
    PyObject_FREE(op);
}

Add a declaration line, like so:
/* for binary compatibility with 2.2 */
#undef _PyObject_Del
PyAPI_FUNC(void) _PyObject_Del(PyObject *);
void
_PyObject_Del(PyObject *op)
{
    PyObject_FREE(op);
}

(The function was deprecated in Python 2.3, which is why it is not exported, but it still exists and is used by Bloodlines.)


Don't forget to change from "Debug" to "Release" in the drop-down box in the main VS window.
Now build pythoncore (right-click the pythoncore subproject, Build).
Wait for it to finish.
In the PC\VS7.1 folder there should be a python27.dll.

Back up the game's vampire_python21.dll, then move/rename the python27.dll file to Game\Bin\vampire_python21.dll
My preferred way to move/rename it into the game folder is on the command line.

Then start the game with the -console 1 argument. It should crash right away if you enter a non-Source command in the game console, but it should print what you entered (that printf).

Post by Serious Callers Only »

stackoverflow.com says that C++ function signatures can be recovered with Dependency Walker (unfortunately these functions are plain C).

It also says that you can link against a DLL you don't have the source for:
* open the DLL in depends.exe (shipped with Visual Studio)
* verify the signature of the function you want to call
* use LoadLibrary() to load the DLL (be careful about the path)
* use GetProcAddress() to get a pointer to the function you want to call
* use this pointer-to-function to make a call with valid arguments
* use FreeLibrary() to release the handle

BTW: This method is also commonly referred to as runtime dynamic linking, as opposed to compile-time dynamic linking, where you compile your sources with the associated lib file.
But I'm not sure this will work as intended, since the DLL is probably already loaded in the process.
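
A minimal sketch of that runtime-linking recipe in C, assuming that tier0.dll exports a cdecl, printf-style function named Msg (as far as I know that matches the public Source SDK headers, but it is unverified for the Bloodlines build). The names find_loaded_export, bind_msg and console_write are made up for illustration. Since the engine has already loaded tier0.dll, the sketch uses GetModuleHandleA rather than LoadLibrary: it returns the module that is already mapped without touching its reference count, so there is nothing to FreeLibrary. (LoadLibrary on an already loaded DLL should also be fine; it only bumps the reference count.) Outside Windows the lookup is stubbed to fail, so only the fallback path runs:

```c
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* ASSUMPTION: Msg is a cdecl, printf-style export of tier0.dll. */
typedef void (*msg_fn)(const char *fmt, ...);

static msg_fn g_msg = NULL;

/* Look up an export in a module that is ALREADY loaded in this process. */
static msg_fn find_loaded_export(const char *module, const char *symbol)
{
#ifdef _WIN32
    HMODULE h = GetModuleHandleA(module);
    return h ? (msg_fn)GetProcAddress(h, symbol) : NULL;
#else
    (void)module;
    (void)symbol;
    return NULL;  /* stub: there is no tier0.dll outside Windows */
#endif
}

/* Resolve Msg once. Returns 1 on success, 0 if the module or export is missing. */
int bind_msg(void)
{
    g_msg = find_loaded_export("tier0.dll", "Msg");
    return g_msg != NULL;
}

/* Send text to the engine console if Msg was found, otherwise to stdout.
   Returns 1 if the text went through Msg, 0 if it went to stdout. */
int console_write(const char *text)
{
    if (g_msg != NULL) {
        g_msg("%s", text);
        return 1;
    }
    fputs(text, stdout);
    return 0;
}
```

With something like this, Py_FlushConsoleOutput in the replacement DLL could call bind_msg() once and then pass whatever output it has buffered to console_write().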

Post by Serious Callers Only »

OK, this is the code that calls the function that is crashing
(PyRun_ConsoleString; posted to check the return value and arguments):

CPU Disasm
Address   Hex dump          Command                                                             Comments
200DAB20  /$  68 D8961A20   PUSH OFFSET 201A96D8                                                ; ASCII "__main__"
200DAB25  |.  FF15 B0331720 CALL DWORD PTR DS:[<&vampire_python21.PyImport_AddModule>]
200DAB2B  |.  83C4 04       ADD ESP,4
200DAB2E  |.  85C0          TEST EAX,EAX
200DAB30  |.  74 4C         JE SHORT 200DAB7E
200DAB32  |.  50            PUSH EAX
200DAB33  |.  FF15 8C331720 CALL DWORD PTR DS:[<&vampire_python21.PyModule_GetDict>]
200DAB39  |.  50            PUSH EAX
200DAB3A  |.  50            PUSH EAX
200DAB3B  |.  8B4424 10     MOV EAX,DWORD PTR SS:[ESP+10]
200DAB3F  |.  68 00010000   PUSH 100
200DAB44  |.  50            PUSH EAX
200DAB45  |.  FF15 90331720 CALL DWORD PTR DS:[<&vampire_python21.PyRun_ConsoleString>]
200DAB4B  |.  83C4 14       ADD ESP,14
200DAB4E  |.  85C0          TEST EAX,EAX
200DAB50  |.  75 08         JNE SHORT 200DAB5A
200DAB52  |.  FF15 94331720 CALL DWORD PTR DS:[<&vampire_python21.PyErr_Print>]
200DAB58  |.- EB 1E         JMP SHORT <JMP.&vampire_python21.Py_FlushConsoleOutput>             ; Jump to vampire_python21.Py_FlushConsoleOutput
200DAB5A  |>  FF08          DEC DWORD PTR DS:[EAX]
200DAB5C  |.  75 0A         JNE SHORT 200DAB68
200DAB5E  |.  8B48 04       MOV ECX,DWORD PTR DS:[EAX+4]
200DAB61  |.  50            PUSH EAX
200DAB62  |.  FF51 18       CALL DWORD PTR DS:[ECX+18]
200DAB65  |.  83C4 04       ADD ESP,4
200DAB68  |>  FF15 98331720 CALL DWORD PTR DS:[<&vampire_python21.Py_FlushLine>]
200DAB6E  |.  85C0          TEST EAX,EAX
200DAB70  |.- 74 06         JE SHORT <JMP.&vampire_python21.Py_FlushConsoleOutput>              ; Jump to vampire_python21.Py_FlushConsoleOutput
200DAB72  |.  FF15 9C331720 CALL DWORD PTR DS:[<&vampire_python21.PyErr_Clear>]
200DAB78  |>- FF25 BC331720 JMP DWORD PTR DS:[<&vampire_python21.Py_FlushConsoleOutput>]
200DAB7E  \>  C3            RETN

It seems obvious that the return value is a boolean (int) saying whether Python could execute the code or not; the TEST on EAX points that way. But what about the arguments?

(And why is it still crashing?)

Post by Serious Callers Only »

Hmm, the code adjusts ESP by 14 after the call. That 14 is hex (OllyDbg prints hex), so it is 20 bytes: five 32-bit pushes, not 3 dwords + a word.

4+4+4+4+4

So five 32-bit values get pushed, though not necessarily all of them for this call?

Something is wrong with my current signature, since it appears (from printing inside the function) that the third 32-bit argument has garbage values.

If PyModule_GetDict works in the usual way, its return value should be in EAX.

It is pushed twice, then the constant 100 (hex, so 256, I think), then a char * ([ESP+10]) holding the user input. By this reasoning, the third argument should be a PyObject * (the GetDict return value).

And it is pushed twice to prevent clobbering?
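
A rough way to tally the stack, assuming cdecl (arguments pushed right to left, caller pops) and that the 14 in ADD ESP,14 is hexadecimal, as OllyDbg prints it. The struct and function names are made up for illustration; the addresses come from the listing above. There is no ADD ESP,4 after the PyModule_GetDict call, so its single argument appears to be popped together with the rest:

```c
#include <stddef.h>

/* One PUSH made by the caller before ADD ESP,14, in program order. */
struct push {
    unsigned address;  /* address of the PUSH instruction in the listing */
    int for_console;   /* 1 if it is an argument of PyRun_ConsoleString */
};

static const struct push pushes[] = {
    { 0x200DAB32, 0 },  /* EAX = module, the argument of PyModule_GetDict */
    { 0x200DAB39, 1 },  /* EAX = dict (4th argument, pushed first) */
    { 0x200DAB3A, 1 },  /* EAX = dict (3rd argument) */
    { 0x200DAB3F, 1 },  /* 0x100 = 256 (2nd argument) */
    { 0x200DAB44, 1 },  /* EAX = [ESP+10], the console string (1st argument) */
};

enum { PUSH_BYTES = 4 };  /* each PUSH here moves ESP by one dword */

/* Total bytes the caller has to pop; should match the operand of ADD ESP. */
unsigned bytes_to_clean(void)
{
    return (unsigned)(sizeof pushes / sizeof pushes[0]) * PUSH_BYTES;
}

/* Number of pushes that belong to PyRun_ConsoleString itself. */
unsigned console_string_args(void)
{
    unsigned n = 0;
    size_t i;
    for (i = 0; i < sizeof pushes / sizeof pushes[0]; i++)
        n += (unsigned)pushes[i].for_console;
    return n;
}
```

Under those assumptions the tally comes to 0x14 bytes in total, four of the five pushes being arguments of PyRun_ConsoleString: the string, 256, and the same dict twice.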

Post by Serious Callers Only »

YES! That worked.

md5
ScummVM Developer
Posts: 2261
Joined: Thu Nov 03, 2005 9:31 pm
Location: Athens, Greece

Post by md5 »

Out of curiosity... what worked?

Post by Serious Callers Only »

Just executing my code without crashing, thanks to the changed function signature.

The DLL I was replacing had a function with the signature:
PyAPI_FUNC(int) PyRun_ConsoleString(const char *, int, PyObject*);

And I was trying various variations of:
PyAPI_FUNC(int) PyRun_ConsoleString(const char *, int, char*);

The stack was smashed and it crashed after the call.

I still need help if you're interested.

Now that I think about it, I'm not so sure it isn't:

PyAPI_FUNC(int) PyRun_ConsoleString(const char *, int, PyObject*, PyObject*);

It is, after all, pushed twice...
and that would follow the pattern of some other Python interpreter functions. Those dicts hold the globals and locals.

Post by Serious Callers Only »

Hmm, it is still crashing when it returns -1 (failure).

It is because of that PyErr_Print.

http://docs.python.org/c-api/veryhigh.h ... tringFlags

PyErr_Print actually crashes (fatal error) if it is called when no error is set, and the function I'm using as a kludge to execute the statement doesn't save exceptions.

I also have another problem: the functions that do save errors only execute part of the statement (more like expressions, I suspect).
For instance, with a compound statement
<code>;<code>
which is extremely common, they only execute the first part, fail at the second, and still return true (!)

Maybe that is what the PyRun_ConsoleString/Py_FlushConsoleOutput separation is: one tokenizes and executes statements one by one, returning if they fail; the other prints to the console.

Clumsy as hell. I don't want to parse Python code.

Can you tell me whether, in that PyRun_ConsoleString call, the return value of GetDict (the PyObject *) is passed as an argument once or twice? It doesn't crash either way...

Post by Serious Callers Only »

OK, there is something pretty strange occurring if I return -1 (failure parsing the Python expression).
Maybe you can help me; the calling function is posted above.

My function now seems to work OK in the success case: it prints and executes the code as long as it is correct.

int PyRun_ConsoleString(const char *str, int typeOfExpression, PyObject* globals, PyObject* locals){
	PyObject * code;

	//read this
	//http://docs.python.org/c-api/veryhigh.html#Py_eval_input
	//typeOfExpression is always equal to 256, i.e. Py_single_input.
	//So the input for the console is a series of statements, not expressions. BUT
	//one of the primitive python statements is the expression_stmt:
	//http://docs.python.org/reference/simple_stmts.html#grammar-token-expression_stmt
	//that indeed evaluates expressions like 1+2 so this can be used for exec and output
	assert(typeOfExpression == Py_single_input);
	code = PyRun_String(str, typeOfExpression, globals, locals);
	if(!code){
		//test only: check that the error indicator really is set
		if(PyErr_Occurred())
			printf("OK\r\n");
		else
			printf("WTF\r\n");
		return -1;
	}else{
		printf("BRANCH 2\r\n");
		//print the console evaluation result to std out (flags = 0 prints the repr)
		PyObject_Print(code, stdout, 0);
		printf("\r\n");
		Py_DECREF(code);
		code = NULL;
		return 0;
	}
}

The if-else in the first branch was just a test that the Python "exception" really is set (because the C API docs warn that printing the error when no error occurred crashes). PyErr_Occurred has no side effects on that.

However, what happens is that (as expected) there is an error.

The error:
Unhandled exception: page fault on write access to 0xffffffff in 32-bit code (0x200dab5a).

0xffffffff is -1 in two's complement, right? Is it treating the return value as a pointer? It's possible that the return value is not -1 but the resulting PyObject, that the test in the assembly above checks pyObject == NULL {do nothing} else {do something}, and that the print wasn't even supposed to be there.
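
One reading that fits: the fault address 0x200dab5a is the DEC DWORD PTR DS:[EAX] line in the listing above, which looks like an inlined Py_DECREF on whatever PyRun_ConsoleString returned (refcount at offset 0, type pointer at +4, then a call through the type at +18, which is where tp_dealloc sits in a 32-bit CPython 2.x type object). Under that assumption, returning -1 makes the caller decrement memory at 0xffffffff. The sketch below restates that caller logic in plain C with stand-in structs; mock_object, mock_type and the function names are hypothetical, not the real Python.h types:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the 32-bit CPython 2.x object header (NOT the real types). */
struct mock_type;
struct mock_object {
    long refcnt;             /* offset 0: what DEC DWORD PTR [EAX] touches */
    struct mock_type *type;  /* offset 4 on 32-bit: MOV ECX,[EAX+4] */
};
struct mock_type {
    /* in the real 32-bit type object this pointer sits at offset 0x18 */
    void (*dealloc)(struct mock_object *);
};

static int freed_count = 0;

/* Test deallocator: only counts how often it was called. */
void count_dealloc(struct mock_object *o)
{
    (void)o;
    freed_count++;
}

/* C rendering of 200DAB4E..200DAB65.
   Returns 0 on the error path (where the game calls PyErr_Print), 1 otherwise. */
int consume_result(struct mock_object *result)
{
    if (result == NULL)             /* TEST EAX,EAX / JNE */
        return 0;
    if (--result->refcnt == 0)      /* DEC [EAX] / JNE */
        result->type->dealloc(result);  /* PUSH EAX / CALL [ECX+18] */
    return 1;
}

/* Bit pattern of an int -1 when it is read as a 32-bit address. */
uint32_t minus_one_as_address(void)
{
    return (uint32_t)-1;
}
```

If this reading is right, the replacement function would hand back the PyObject * from PyRun_String unchanged (NULL on failure, with the error indicator still set) instead of 0 or -1, and leave the Py_DECREF to the caller.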
