Expanding function size in UPK

wghost81 · December 11, 2013

Here is what I finally came to.

The MoveObject function first reads the object data. If the object is of Function type, MoveObject alters its header and body to reflect the new size. If not, MoveObject simply appends zeros to the original data to fit the new object size. Then MoveObject appends the new object data to the end of the upk file and alters the MovedObject.ObjectFileSize and MovedObject.DataOffset fields inside the Object Table.

After that, the MoveObject function appends additional info to the end of the new object data:

16 bytes of PatchUPK ID, which is the md5 hash of "PatchUPK": 7A A0 56 C9 60 5F 7B 31 72 5D 4B C4 7C D2 4D D9
4 bytes of OldObjectFileSize
4 bytes of OldDataOffset

The new UndoMoveObject function seeks to the offset (MovedObject.DataOffset + MovedObject.ObjectFileSize) and reads the next 16 bytes to determine whether they are the PatchUPK ID. If so, UndoMoveObject reads the following 8 bytes (OldObjectFileSize and OldDataOffset) and rewrites the corresponding MovedObject.ObjectFileSize and MovedObject.DataOffset fields inside the Object Table. No data is removed by this operation, as there could be other expanded functions, and removing object data would break their offsets.

I haven't updated src/release yet. I will update it once the actual uninstaller is implemented.
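
The 24-byte uninstall record described above can be sketched in Python as follows. This is only an illustration, not the UPK Utils code: the function names are made up, and the two integers are assumed to be unsigned 4-byte little-endian values, as for other numbers in the upk.

```python
import struct

# 16-byte PatchUPK ID quoted in the post above.
PATCHUPK_ID = bytes.fromhex("7AA056C9605F7B31725D4BC47CD24DD9")

def build_undo_record(old_size: int, old_offset: int) -> bytes:
    """ID followed by OldObjectFileSize and OldDataOffset (assumed little-endian)."""
    return PATCHUPK_ID + struct.pack("<II", old_size, old_offset)

def read_undo_record(upk: bytes, data_offset: int, object_size: int):
    """Return (old_size, old_offset) if a record follows the object, else None."""
    pos = data_offset + object_size
    if len(upk) < pos + 24 or upk[pos:pos + 16] != PATCHUPK_ID:
        return None
    return struct.unpack_from("<II", upk, pos + 16)
```

An undo would then write the returned pair back into the object's table entry and leave the appended data where it is.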

Amineri · December 21, 2013

With wghost's permission (and hopefully blessing), I'm working on adding this functionality, which expands (and incidentally relocates) a given function to the end of the upk, to the new UPKmodder tool that XMTS and I are working on.

In order to maintain compatibility I plan on incorporating the same md5hash above : 7A A0 56 C9 60 5F 7B 31 72 5D 4B C4 7C D2 4D D9

as well as the old objects file size and file offset.

Just so that I'm certain, you are planning on writing the old object's size and position in LITTLE_ENDIAN? (which would be consistent with all other numbers in UPK)

wghost81 · December 21, 2013

Yes, numbers are in LITTLE_ENDIAN. And md5hash is a sequence of bytes which are saved directly as they are.

I added some comments to the UPKUtils::MoveObject function after you mentioned you'll be re-writing it in Java. But I don't know which will be easier: trying to re-write the C++ code in Java, or using the general idea and writing new Java code. I used a couple of "low-level" C++ functions, and I don't know if the same functionality is available in Java.

In any case, I'd be happy to help in any way I could.

Amineri · December 21, 2013

I'm trying to make sure we're compatible: did you put the md5hash and original size after the 15-byte footer data? It seems they have to go there, because if I put them before the footer, the md5hash entries get read as the flags and everything is all messed up.

I've got a prototype working now (just running with a hard-coded upk and function from a test function), but I think it would be quite worthwhile to make things compatible.

As a side note, what happens if a user runs the "expand and relocation" function a second time? I presume it would take the 2nd, bigger function and make a 3rd even bigger copy at the tail of the upk. In this case I'm not sure how the uninstall would work.

I'm not suggesting that anyone would deliberately do this -- it's just when managing a bunch of files (or with people that aren't as intimately familiar with the upk as we are) mistakes can happen. I'm just considering what would happen...

------

Edit:

Also, necessity is the mother of invention here. I've been working on the EW version of dynamic upgrades for aliens but Firaxis changed some of the functions around, and got rid of my old helper function XGTacticalGameCore.GenerateArmorFragments.

So I'm faced with the choice between having to modify 4 functions to get limited functionality, or make 1 small change to 1 function and expand a second in order to get full functionality. The latter definitely looks more attractive :)

Edited December 21, 2013 by Amineri

wghost81 · December 21, 2013

Expanding function procedure:

1. Set Object.ObjectFileSize = NewFileSize

2. Set Object.DataOffset = upkFileSize (i.e. end of upk file)

3. Construct new function from the old one by changing memory and file size info and adding 0x0B tokens to fill the newly added space. Not necessary if you're not planning to use the old code.

4. Write new data at the end of upk.

5. Write uninstall info AFTER newly added object data.
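
A minimal Python sketch of steps 1, 2, 4 and 5, under some assumptions: ExportEntry is a hypothetical in-memory stand-in for the object's table entry (writing the entry back into the upk is not shown), integers are little-endian, and new_data is the already constructed object from step 3.

```python
import struct
from dataclasses import dataclass

PATCHUPK_ID = bytes.fromhex("7AA056C9605F7B31725D4BC47CD24DD9")

@dataclass
class ExportEntry:
    """Hypothetical view of one table entry."""
    size: int     # ObjectFileSize
    offset: int   # DataOffset

def move_object(upk: bytearray, entry: ExportEntry, new_data: bytes) -> None:
    old_size, old_offset = entry.size, entry.offset
    if len(new_data) <= old_size:
        raise ValueError("new size must be greater than old size")
    entry.size = len(new_data)    # step 1
    entry.offset = len(upk)       # step 2: current end of the upk
    upk.extend(new_data)          # step 4
    # step 5: uninstall info goes AFTER the new object data
    upk.extend(PATCHUPK_ID + struct.pack("<II", old_size, old_offset))
```

For an object that is not a function, new_data could simply be the old bytes followed by zeros up to the new size.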

Some clarification on terms.

By ObjectFileSize I understand the whole object size, as it is set in Export Table. If object is of "Function" type, its ObjectFileSize includes function header (0x30 bytes) + function body + footer. Last 4 bytes of function header define the value which is often called "function file size". It is not object file size, it is function body size (including 0x53 token). Function file size must not be confused with object file size!

To move/expand a function you need to set its full object data, including header, body and footer.
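
As a rough sketch of what constructing the expanded function could look like in Python, assuming the layout described here (0x30-byte header with memory size at 0x28 and body file size at 0x2C, then body, then footer). Two further assumptions are mine and not confirmed by the UPK Utils source: the padding 0x0B tokens are placed just before the final 0x53 token, and each one adds one byte to both the memory size and the file size.

```python
import struct

HEADER_SIZE = 0x30   # function header length
MEM_SIZE_AT = 0x28   # memory size, followed at 0x2C by the body file size

def expand_function(obj: bytes, new_object_size: int) -> bytes:
    """Return a copy of a function object padded with 0x0B tokens."""
    grow = new_object_size - len(obj)
    if grow <= 0:
        raise ValueError("new size must be greater than old size")
    mem_size, body_size = struct.unpack_from("<II", obj, MEM_SIZE_AT)
    body_end = HEADER_SIZE + body_size
    body, footer = obj[HEADER_SIZE:body_end], obj[body_end:]
    if not body.endswith(b"\x53"):
        raise ValueError("body does not end with the 0x53 token")
    header = bytearray(obj[:HEADER_SIZE])
    # Assumption: every 0x0B token costs one byte in memory and in the file.
    struct.pack_into("<II", header, MEM_SIZE_AT, mem_size + grow, body_size + grow)
    return bytes(header) + body[:-1] + b"\x0B" * grow + b"\x53" + footer
```

Because the footer is taken as "everything after the body", the sketch does not depend on the footer being a fixed number of bytes.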

In UPK Utils it works like that:

1. You specify the function name and its new size.

2. The program checks whether new size > old size; if not, it stops and outputs an error message. That way you can't apply the same mod twice, which effectively prevents upk growth. Even if I removed this check, it still wouldn't make a bigger function, just a copy of the existing one with the same size, as the NEW_SIZE parameter is an absolute object size, not a relative one!

3. If the specified object is of "Function" type, the program will construct a new function object from the old one. If not, it will just fill the difference with zeros.

4. After that, the program will write the new table data for the specified object (size and offset) and the new object data, and add the uninstall info after the newly added object data:

end of vanilla upk
expanded object 1
expanded object 1 uninstall info
expanded object 2
expanded object 2 uninstall info
...

When using expanded functions, you must stop thinking in terms of absolute file offsets and start thinking in terms of relative offsets. As you can't predict where your function will end up (the user may have more than one "expanding" mod installed), you must use its full name to look up its size and offset in the Export Table and then determine its actual location inside the upk.

I noticed that your modded hex starts with memory size (0x28 offset relative to function offset in upk) and ends with 0x53 token. If you don't want to keep header/footer data inside your modded hex, you need to reconstruct the expanded function and then write your data at newFunctionOffset + 0x28.
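
To illustrate the "relative offsets" idea, here is a small Python sketch. The export_table dictionary (full name mapped to a (size, offset) pair) is a hypothetical stand-in for whatever a tool reads from the real Export Table; only the 0x28 offset comes from the thread.

```python
MEM_SIZE_AT = 0x28  # modded hex starts at the memory size field

def patch_function(upk: bytearray, export_table: dict, full_name: str,
                   modded: bytes) -> int:
    """Write modded hex at (current function offset + 0x28); return that position."""
    size, offset = export_table[full_name]   # look up by name, never hard-code
    if MEM_SIZE_AT + len(modded) > size:
        raise ValueError("modded hex does not fit inside the object")
    start = offset + MEM_SIZE_AT
    upk[start:start + len(modded)] = modded  # same-length overwrite
    return start
```

The same call works whether the function still sits at its vanilla position or has been moved to the end of the upk, since the offset is read from the table each time.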

Edited December 21, 2013 by wghost81

Drakous79 · December 21, 2013

2. The program checks whether new size > old size; if not, it stops and outputs an error message. That way you can't apply the same mod twice, which effectively prevents upk growth. Even if I removed this check, it still wouldn't make a bigger function, just a copy of the existing one with the same size, as the NEW_SIZE parameter is an absolute object size, not a relative one!

Here's a scenario: a mod that expands a function is installed. An updated version of the mod becomes available later. The tool could identify what to do: patch, or expand and patch.

If new size > old size - expand and patch.

Else if the tool's hash is present, skip expanding and just patch.

Else output error.

wghost81 · December 21, 2013

Drakous79, yes, I agree. I will add this functionality to PatchUPK.

Amineri · December 21, 2013

I've found a case where the "Expand Function" algorithm might not work.

Unreal Engine supports a thing called "States".

For example in XComGame.upk, if you look at XGAIBehavior, in addition to the functions, there are a bunch of 'States' (e.g. Active, AttackState, Berserk).

States have function code within them but also have functions which can override the functions in the containing class. So I guess they kind of act like embedded child classes. Each state also has a default BeginState and EndState function which generally has to be instantiated.

Looking at the hex buffer for XGAIBehavior.state_Active, it is:

//header
E2 9D 00 00 76 5F 00 00 00 00 00 00 C7 9D 00 00 00 00 00 00 00 00 00 00 CB 9D 00 00 00 00 00 00 99 10 00 00 9C 3B 02 00 5D 01 00 00 19 01 00 00 

//body
0F 01 55 9A 00 00 1D 58 02 00 00 07 B7 00 82 82 77 2E FB 7F 00 00 19 1C 54 FC FF FF 16 09 00 7E F9 FF FF 00 01 7E F9 FF FF 2A 16 18 3F 00 19 2E FB 7F 00 00 19 1C 54 FC FF FF 16 09 00 7E F9 FF FF 00 01 7E F9 FF FF 0A 00 35 7B 00 00 00 2D 01 35 7B 00 00 16 18 22 00 81 19 01 8F 9A 00 00 0A 00 83 3A 00 00 00 1B 18 45 00 00 00 00 00 00 16 16 16 71 21 BD 29 00 00 00 00 00 00 4A 4A 4A 16 06 42 01 1B F9 41 00 00 00 00 00 00 16 19 01 90 9A 00 00 0A 00 00 00 00 00 00 1B B7 68 00 00 00 00 00 00 16 0F 01 55 9A 00 00 1D 59 02 00 00 07 25 01 1B 0C 44 00 00 00 00 00 00 1F 41 49 42 65 68 2D 41 63 74 69 76 65 2D 42 61 74 74 6C 65 2E 49 73 50 61 75 73 65 64 00 16 61 00 1E 00 00 00 00 16 06 EF 00 0F 01 55 9A 00 00 1D 5A 02 00 00 71 21 79 30 00 00 00 00 00 00 4A 4A 4A 16 08 0C B2 09 00 00 00 00 00 00 00 00 00 00 76 5F 00 00 00 00 00 00 FF FF 00 00 53 

//footer
00 04 00 00 44 01 00 00 00 00 02 00 00 00 C8 29 00 00 00 00 00 00 CB 9D 00 00 C7 09 00 00 00 00 00 00 C9 9D 00 00

In this case the footer is 38 bytes long instead of the 15 for "regular" functions. Presumably this is to handle the extra subfunctions within the state.

And yes, I have had to modify code within states before :)

Amineri · December 21, 2013

Some clarification on terms.

By ObjectFileSize I understand the whole object size, as it is set in Export Table. If object is of "Function" type, its ObjectFileSize includes function header (0x30 bytes) + function body + footer. Last 4 bytes of function header define the value which is often called "function file size". It is not object file size, it is function body size (including 0x53 token). Function file size must not be confused with object file size!

I agree that clarification is good.

So there are really three values here :

1) Function object file size (which is defined in the object table entry) -- file size of entire function object, including function header + body + footer

2) Function body file size (which is defined at position 44 = 0x2C in the function header) -- file size of just the body portion, ending at the 0x53 EOS token

3) Function body memory size (which is defined at position 40 = 0x28 in the function header) -- the memory size needed to load the function body. Includes +4 bytes for each export/import table reference.
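
As a quick check of these positions, the following Python sketch decodes the header of XGAIBehavior.state_Active quoted earlier in the thread. The function name is made up for illustration, and the fields are assumed to be 4-byte little-endian integers; the footer size is simply whatever remains of the object after the header and body.

```python
import struct

# Header bytes of XGAIBehavior.state_Active as posted above (0x30 bytes).
STATE_ACTIVE_HEADER = bytes.fromhex(
    "E29D0000 765F0000 00000000 C79D0000 00000000 00000000"
    "CB9D0000 00000000 99100000 9C3B0200 5D010000 19010000"
)

def function_sizes(header: bytes, object_size: int):
    """Return (memory size, body file size, footer size)."""
    mem_size, body_size = struct.unpack_from("<II", header, 0x28)
    footer_size = object_size - 0x30 - body_size
    return mem_size, body_size, footer_size
```

For the header above, the memory size field decodes to 0x15D = 349 and the body file size to 0x119 = 281.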

I noticed that your modded hex starts with memory size (0x28 offset relative to function offset in upk) and ends with 0x53 token. If you don't want to keep header/footer data inside your modded hex, you need to reconstruct the expanded function and then write your data at newFunctionOffset + 0x28.

The reason I'm doing this is to try and maintain compatibility between game versions. The function header contains some references and non-references that can change from one game version to another. I haven't deciphered what exactly the rest of the function header does, so I'm not able to update it, hence restricting myself to just the last two values, memory size and file size (primarily because most mods have to update the memory size).

In general the BEFORE and AFTER hex work on a limited Search & Replace mechanic but confined within the scope of the size/offset of the function object (as defined in the object/export table).

It's possible to change out a single line this way without having to replace the entire function. Or even to make 2 or 3 isolated changes within a function.

In general I'd probably break this up into 2 separate operations:

1st) Perform any required function expansion, simply copying the hex present

2nd) Perform S&R on expanded function

This would mean that the BEFORE section would need to include the correct number of 0B tokens prior to the 53 EOS token to match properly, but I think this is reasonable.
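
A minimal Python sketch of the second step, a Search & Replace confined to one function object. The function name and the rule that BEFORE and AFTER must have equal length are assumptions for this illustration, not a description of how UPKmodder actually does it.

```python
def scoped_replace(upk: bytearray, offset: int, size: int,
                   before: bytes, after: bytes) -> int:
    """Replace the first BEFORE match inside [offset, offset + size)."""
    if len(before) != len(after):
        raise ValueError("BEFORE and AFTER must be the same length")
    # find() only reports matches lying entirely inside the given range
    pos = upk.find(before, offset, offset + size)
    if pos < 0:
        raise LookupError("BEFORE hex not found inside the function")
    upk[pos:pos + len(before)] = after
    return pos
```

After an expansion, BEFORE would contain the run of 0B tokens followed by 53, and AFTER would contain the new code of the same total length.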

I'm thinking that something like an EXPAND=<newSize> tag at the beginning would define the target size. Still mulling over possibilities.

---------------------

I'm re-considering the option of expanding the function "in-place" and adjusting all subsequent object entry positions. I think this has the advantage of being more reversible plus not leaving copies of function code hanging about.

I think the "undo" operation would resize down the target function and then adjust all subsequent object positions down by the adjustment, effectively allowing in-place sizing up and down. Being able to size up or down "in place" is what would enable relatively seamless undo operations.

The downside is of course that the replace operation requires more drastic surgery to the upk, potentially altering many thousands of object positions.
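
The in-place idea can be sketched in Python like this. It is only a toy model: table is a hypothetical dictionary mapping object names to (size, offset) pairs, and any other absolute offsets stored elsewhere in a real upk are not handled.

```python
def resize_in_place(upk: bytearray, table: dict, name: str,
                    new_data: bytes) -> int:
    """Swap an object's data in place and shift every later object's offset."""
    size, offset = table[name]
    delta = len(new_data) - size              # positive = grow, negative = shrink
    upk[offset:offset + size] = new_data      # bytearray grows or shrinks here
    table[name] = (len(new_data), offset)
    for other, (o_size, o_offset) in list(table.items()):
        if other != name and o_offset > offset:
            table[other] = (o_size, o_offset + delta)
    return delta
```

Calling it again with the original data gives a negative delta and moves everything back, which is the "seamless undo" being considered.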

I'll have to experiment and see if I can get anything working...

Edited December 21, 2013 by Amineri

Drakous79 · December 21, 2013

Drakous79, yes, I agree. I will add this functionality to PatchUPK.

Thank you wghost81 :smile: I hope it makes sense, because I suck when it comes to if...else...elseif statements.

// apply for EXPAND_FUNCTION
if (new size > old size) {
    expand and patch
}
else if ((new size == old size) && (hash is present)) {
    patch
}
else {
    output error
}

<I'm line break - ignore me>

I've found a case where the "Expand Function" algorithm might not work.

Unreal Engine supports a thing called "States".

Expanding functions within states should work; I tried it yesterday and it seemed OK. But the buffer of the XComGame.XComTacticalInput.ActiveUnit_Moving state proves your point that a state's footer varies in length.

Edited December 21, 2013 by Drakous79
