Expanding function size in UPK

Amineri · December 21, 2013

Expanding functions within states should work - tried yesterday and seemed ok. But seeing XComGame.XComTacticalInput.ActiveUnit_Moving state's buffer proves your point, that state's footer is flexible.

Right... expanding a function defined within a state should work. However the state itself contains script code as well, so potentially one might wish to expand the state itself.

UE Explore decompiles XGAIBehavior.state_Active as :

state Active
{
    ignores BeginState;

    simulated event EndState(name N)
    {
        //return;        
    }

Begin:
    m_iDebugHangLocation = 600;
    // End:0xB7
    if(((XComTacticalCheatManager(GetALocalPlayerController().CheatManager) != none) && XComTacticalCheatManager(GetALocalPlayerController().CheatManager).bSkipNonMeleeAI) && !m_kUnit.IsMeleeOnly())
    {
        GotoState('EndOfTurn');
    }
    // End:0x142
    else
    {
        InitTurn();
        m_kPlayer.ResetHangTimer();
        m_iDebugHangLocation = 601;
        J0xEF:
        // End:0x125 [Loop If]
        if(IsBattlePaused("AIBeh-Active-Battle.IsPaused"))
        {
            Sleep(0.0);
            // [Loop Continue]
            goto J0xEF;
        }
        m_iDebugHangLocation = 602;
        GotoState('ExecutingAI');
    }
    stop;        
}

The part after "Begin:" is the script part of "Active" itself, and is not part of a function within the Active state.

Amineri · December 21, 2013

I'm going to preface this post by saying that appending the expanded function to the end of the upk is a great, straightforward, and stable way to expand a single function.

-------

Upon some thought about expanding functions within the context of a larger mod (with dozens or hundreds of function changes), I've come to the conclusion that appending the expanded function poses some problems.

1) An expansion via append doesn't allow for an "undo"

In general the upk can't be reverted back to its original state.

Ideally the sequence of user operations:

a) Expand function 1

b) Expand function 2

c) Undo/revert function 1

would result in an identical upk to the user operation:

a) Expand function 2

However the append operation leaves behind its "dead" copy of the expanded function 1 that can't easily be garbage collected from the file.

2) Expansion via appending doesn't allow reverting and resizing up a second time

Again, the user operations :

a) Expand function 1 to size A

b) Undo/revert function 1 to original size

c) Expand function 1 to size B, where B > A

should result in an identical upk state as performing:

a) Expand function 1 to size B

In the context of a smaller mod with only 1 or 2 changes that won't ever require much rework, these limitations are perfectly okay.

However in the context of a larger mod, where functions may have to be changed multiple times, these limitations become much more troublesome.
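The first limitation can be illustrated with a toy model. This is only a sketch: it assumes the upk is a plain byte blob plus a name -> (offset, size) table, which is far simpler than a real export table, and the names `append_expand` and `revert` are hypothetical.

```python
def append_expand(blob, table, name, new_body):
    """Append `new_body` to the blob and point `name` at the new copy."""
    table = dict(table)
    table[name] = (len(blob), len(new_body))
    return blob + new_body, table


def revert(blob, table, name, original_entry):
    """Point `name` back at its original location; the appended copy stays."""
    table = dict(table)
    table[name] = original_entry
    return blob, table


# Hypothetical package: two 4-byte functions.
blob = b"AAAABBBB"
table = {"f1": (0, 4), "f2": (4, 4)}

# Sequence 1: expand f1, expand f2, then revert f1.
b1, t1 = append_expand(blob, table, "f1", b"A" * 6)
b1, t1 = append_expand(b1, t1, "f2", b"B" * 6)
b1, t1 = revert(b1, t1, "f1", table["f1"])

# Sequence 2: expand f2 only.
b2, t2 = append_expand(blob, table, "f2", b"B" * 6)

print(len(b1), len(b2))  # 20 14
print(b1 == b2)          # False: the dead copy of f1 is still in the file
```

In this model f1 points at its original bytes again after the revert, but the file is still 6 bytes longer than the "expand function 2 only" result.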

-----------------

So, I'm thinking of trying to implement an "in place" function resizer. This wouldn't create a copy of the function but would instead insert extra space into the middle of the upk. This requires much more drastic corrective action to the upk than the "append" resizer. Every Export table entry that points to a file position after the insertion point would have that file position increased by the amount of the function size increase.

However, adding the capability to do this also allows for sizing the function back down, reverting the upk back to its original state.
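The offset correction might look roughly like this. It is a minimal sketch that assumes the export table has already been parsed into a plain list of file positions; `shift_export_offsets` and the sample offsets are hypothetical.

```python
def shift_export_offsets(offsets, insert_pos, delta):
    """Return new file offsets after `delta` bytes are inserted at `insert_pos`.

    Entries strictly after the insertion point move; earlier ones stay put.
    A negative `delta` models sizing the function back down.
    """
    return [off + delta if off > insert_pos else off for off in offsets]


# Hypothetical export-table offsets, not taken from a real upk.
offsets = [100, 250, 400]

grown = shift_export_offsets(offsets, insert_pos=250, delta=5)
restored = shift_export_offsets(grown, insert_pos=250, delta=-5)

print(grown)                # [100, 250, 405]
print(restored == offsets)  # True
```

Applying the same shift with the opposite sign restores the original positions, which is what makes the in-place approach reversible.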

Drakous79 · December 22, 2013

In the current modding environment, where we are putting pieces of hex together, every way is good to go. Also, not every function needs expanding.

Move and resize

In case of garbage collection, there were 6 patches in the past that took care of it.

4 kB (XGStrategyAI.GetPossibleAliens' size on disk) * 100 (hundred resized functions) = +400 kB appended... that is not much in today's world.

Resize in place

Feels more correct, but is more demanding. You will have to master updating of indexes so a mod doesn't take ages to un/install. How long does it take to compute and update 200000 indexes?

Have to say, you guys are true UPK surgeons. When you get to it, bytes are flying around :smile:

Amineri · December 22, 2013

Well, I had my first success. I was able to insert 5 new bytes (9 memory bytes) in-place into a function in XGUnit.

Original function :

simulated function int GetFragGrenades()
{
    return m_iFragGrenades;
    return ReturnValue;
}

resized function

simulated function int GetFragGrenades()
{
    return m_iFragGrenades;
    return ReturnValue;
    m_iFragGrenades    
}

For this modest test I was only inserting 5 bytes. My code that updates the 1000s of object entry file positions (currently quite inefficient so I could debug it more easily) takes ~14 sec on my modest computer.

The .mod_upk file I used for this is :

MODFILEVERSION=4
UPKFILE=XComGame.upk
GUID=5B 06 B8 18 67 22 12 44 85 9B A8 5B 9D 57 1D 4B  // XComGame_EW_patch1.upk
FUNCTION=GetFragGrenades@XGUnit
RESIZE=05 // amount to resize in hex

[BEFORE_HEX]
[HEADER]
15 00 00 00 0D 00 00 00 
[/HEADER]
[CODE]
//return m_iFragGrenades
04 01 D8 39 00 00 

//return ReturnValue
04 3A 04 C4 00 00 

//EOS
53 
[/CODE]
[/BEFORE_HEX]


[AFTER_HEX]
[HEADER]
1E 00 00 00 12 00 00 00 
[/HEADER]
[CODE]
//return m_iFragGrenades
04 01 D8 39 00 00 

//return ReturnValue
04 3A 04 C4 00 00 

//m_iFragGrenades
01 D8 39 00 00 

//EOS
53 
[/CODE]
[/AFTER_HEX]
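The header values in the two blocks can be checked by hand. The sketch below assumes each 8-byte header holds two little-endian 32-bit integers, read here as (memory size, on-disk size); that reading is an interpretation, but it matches the "5 new bytes (9 memory bytes)" figure given earlier. `parse_sizes` is a hypothetical helper.

```python
import struct


def parse_sizes(header_hex):
    """Decode a [HEADER] hex string as two little-endian uint32 values."""
    return struct.unpack("<II", bytes.fromhex(header_hex))


# Assumed field order: (memory size, on-disk size).
before = parse_sizes("15 00 00 00 0D 00 00 00")
after = parse_sizes("1E 00 00 00 12 00 00 00")

# Cross-check the on-disk size against the BEFORE code bytes.
code_before = bytes.fromhex("04 01 D8 39 00 00 04 3A 04 C4 00 00 53")

print(before, after)                               # (21, 13) (30, 18)
print(after[0] - before[0], after[1] - before[1])  # 9 5
print(len(code_before) == before[1])               # True
```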

What I implemented was a combination "resize and replace" function that can be undone. That is, if the BEFORE_HEX and AFTER_HEX are different sizes (as above, with RESIZE= for explicit error checking), then doing an "apply" operation resizes the upk and replaces the BEFORE_HEX with the AFTER_HEX. Doing a "revert" operation resizes the upk down and replaces the AFTER_HEX with the BEFORE_HEX.

Note that in this case it is perfectly legitimate to make the function smaller. Say ... if you wanted to avoid a ton of 0x0B operations at the end of the function.

My plan is that if RESIZE= is omitted, then any difference in size between the BEFORE and AFTER blocks is assumed to be an error and is reported as such.
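A rough sketch of that size check, using the bytes from the modfile above (header plus code for each block). `check_resize` is a hypothetical name and this is not UPKmodder's actual code.

```python
def check_resize(before, after, resize=None):
    """Raise ValueError unless the block sizes agree with the declared RESIZE.

    `resize=None` models an omitted RESIZE= line, where any size
    difference between the blocks is treated as an error.
    """
    expected = 0 if resize is None else resize
    actual = len(after) - len(before)
    if actual != expected:
        raise ValueError(f"size change {actual} does not match RESIZE {expected}")
    return actual


before = bytes.fromhex(
    "15 00 00 00 0D 00 00 00 04 01 D8 39 00 00 04 3A 04 C4 00 00 53"
)
after = bytes.fromhex(
    "1E 00 00 00 12 00 00 00 04 01 D8 39 00 00 04 3A 04 C4 00 00 01 D8 39 00 00 53"
)

print(check_resize(before, after, resize=0x05))   # 5  (apply)
print(check_resize(after, before, resize=-0x05))  # -5 (revert)

try:
    check_resize(before, after)  # RESIZE= omitted
except ValueError as err:
    print("rejected:", err)
```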

We already had (as has wghost, I believe) functions that parse through the upk's header and extract the object list entries.

I'm also including lots of error checking to try and avoid as much accidental mangling of the upk as possible :smile:

The steps are :

1) Verify that the object exists and is an object list entry (and not on the import table, for example)
2) Verify that BEFORE_HEX.length + resizeAmount == AFTER_HEX.length
3) Find the file position of BEFORE_HEX (failing if it is not found)
4) Rename the old upk with a .bak extension (similar to how HxD works)
5) Create a new file with the same name as the target upk
6) Use .bak as the source file and .upk as the destination. This step inserts the new hex (resizing up or down) into a new copy of the upk :
   a) Copy from start of file to file position of BEFORE_HEX
   b) Write AFTER_HEX to destination file
   c) Copy from source (file position after BEFORE_HEX) to destination (append to end of file)
7) Update the changed object's referenced size by resizeAmount; its position remains unchanged
8) For every object in the object list (and that's ~55,000 for EW XComGame.upk) :
   a) If the object's file position is > the inserted position, add resizeAmount to the object's position

Because resizing up or down is allowed, the operation is reversible, which I've verified. Reverting the change results in a upk file that is identical at the byte level to the original (verified using HxD "compare files" tool).
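The copy/write/copy step and the byte-level round trip can be sketched in memory. This is a simplified illustration: it works on a hypothetical toy buffer instead of a .bak file, skips the export-table fix-up, and `resize_in_place` is a made-up name.

```python
def resize_in_place(data, before, after):
    """Replace the first occurrence of `before` in `data` with `after`.

    Returns the new bytes plus the position and size change, so a caller
    could fix up the export-table offsets (not modeled here).
    """
    pos = data.find(before)
    if pos < 0:
        raise ValueError("BEFORE_HEX not found")
    # Copy up to BEFORE_HEX, write AFTER_HEX, then copy the remainder.
    patched = data[:pos] + after + data[pos + len(before):]
    return patched, pos, len(after) - len(before)


# Hypothetical toy "upk": some leading bytes, a function body, trailing bytes.
before = bytes.fromhex("04 01 D8 39 00 00 53")
after = bytes.fromhex("04 01 D8 39 00 00 01 D8 39 00 00 53")
original = b"HEAD" + before + b"TAIL"

patched, pos, delta = resize_in_place(original, before, after)
reverted, _, back = resize_in_place(patched, after, before)

print(pos, delta, back)      # 4 5 -5
print(reverted == original)  # True
```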

wghost81 · December 22, 2013

First: I've updated UPK Tools to skip expanding a function if it already has the correct size.

Second: expanding a function inside a state does work, but I haven't looked into states themselves. Anyway, if you try to expand anything that is not a function by type, the program will not try to reconstruct the object's structure. It will fill the new space with zeros, leaving the actual reconstruction to the modder.

Now about in-place resize. I really don't see any need for so much work. Primary argument: it will certainly make ToolBoks mods incompatible with expanded upk, as ToolBoks changes are offset-based. Any type of expansion will "break" upk structure anyway, but I think appending functions to the end of upk is a lesser evil, as it keeps compatibility with smaller mods.

As Drakous79 pointed out, "garbage" is not much of a problem: even 100 MB of garbage won't cause a considerable size growth, compared to 8 GB of game size. And with Steam we do have a perfect tool for garbage collection: cache verification. Plus, an installer can make a "clean" backup and offer complete uninstall by reverting to that backup. Bigger mods tend to require vanilla game for installation and have their own installers, so I don't see any problems with reverting to exact same vanilla copy.

One thing to understand: function offset is irrelevant to the game engine. In fact, only header matters. Everything else is offset/size based, which means you can move things around, add any kind of "garbage" — and the game will still work. Any changes to upk create a compatibility problem, but since we now know upk structure, those problems could be solved. We need to think in object-oriented terms: base unit is an object and an object knows itself. We modify objects and let the tools take care of the rest.

BTW, figuring out data formats for other (non-functions) types is number one priority to make expand feature more safe.

PS Impressive work, Amineri!

PPS If you really want to be able to undo function move, you (or me :smile: ) can add a custom header before function data, which will point to this object offset in Export Table. This way you can remove newly added function and shift all other newly added functions (if there are any) to new positions. But I still think moving other functions around is bad for compatibility.

Edited December 22, 2013 by wghost81

Amineri · December 22, 2013

I agree with you 100% that doing 'in-place' is totally unnecessary for small mods, and has the really bad side effect of breaking ToolBoks' position-oriented game patching. It may be preferable to develop the mods using 'in-place' but for final installation use the appending, since they typically won't be messed with as much.

However for bigger mods (and while doing the actual development) using the in-place method is going to save us a lot of headaches. Both johnnylump and I are quite excited by it. :geek: However when it comes to actually patching, the format I've supplied doesn't specify whether the change is in-place or appended. It just specifies the basic information :

upk
function
size change
before hex
after hex

It could be applied either in-place or appended.

Advantage of appended is that it preserves file positions so a positional-patching tool like ToolBoks won't break upks by writing to the wrong position. (of course if it's trying to patch a function that has been resized then it will have no effect)

Advantage of in-place is that it is a reversible operation. The benefit of reducing the upk file size is (as you say) pretty marginal, as the sizes we are dealing with are negligible compared to the movie files.

wghost81 wrote: "Bigger mods tend to require vanilla game for installation and have their own installers, so I don't see any problems with reverting to exact same vanilla copy."

The way Long War (and Warspace, which it derived the installer from) works is to verify that the XCOM game exists and then make backups of the original upks, putting new ones (included in the installer) in their place.

However we still have to create those modded upks in the first place, and with 100s of function changes we typically don't start from a vanilla copy. Instead we iterate forward on top of previous changes, so for us this is quite a big deal :)

-------

One interesting thing I've noted is that around 40% of the space in the upk is actually taken up with the "header" information -- the namelist, object list, and import list. Just barely over half of the upk actually consists of actual objects (this is for XComGame.upk). Then when you add in the individual function header/footers contained within the actual objects I'd say that less than half of the upk is executable script and more than half is the cataloguing/indexing mechanisms.

-------

Also, my testing code was really unoptimized; a really simple change to move some stuff out of the object-list loop resulted in the test running in 0.759 seconds, which includes :

1) reading the modfile and parsing any Unreal bytecode it contains (quasi decompiling for highlighting purposes)

2) reading the upk and parsing the header, constructing all lookup-able names (import list and export list)

3) creating the backup file, copying over the contents (resizing it) and updating all needed object list entries.

PS Amineri, my offer still stays. :wink: If you focus your utility on making mod script, I can add any needed functionality to PatchUPK to handle those scripts and perform the actual patching for you.

Fortunately for me johnnylump handles all the actual details of creating the actual installer, so I get to focus on (what is for me, anyhow) the "fun" stuff. So my suggestion (in terms of Long War installer) would be to talk to him.

But yes, the UPKmodder tool is focussed on creating mods, not on installing them. And medium-to-large mods at that. I generally avoid trying to release "directly installable" mods, since providing support for them could easily take away all my time. Avoiding that leaves me time to write new mods and perform exploratory upk-surgery!

Funny story --

I think I might have been one of the first people to notice the EU bug that was causing the dropship screen to appear forever. Basically every time you launched the game it was appending some load info from the Slingshot DLC to one of the config files.

It was at the point where it was taking ~45 seconds to switch from the tactical to strategy games. When I dug into the config file I found that it was loading the same assets ~400 times, meaning that I had closed and relaunched the game 400 times since my last Steam verify.

Sooo ... yes, I end up doing some crazy things :ermm: , and being able to undo these types of changes will make it a lot easier to iterate forward. That said, when new patches come out we do have to start from scratch, which is why it's such a headache.

Bertilsson · December 22, 2013

Amineri wrote: "However we still have to create those modded upks in the first place, and with 100s of function changes we typically don't start from a vanilla copy. Instead we iterate forward on top of previous changes, so for us this is quite a big deal :smile:"

Not sure I follow the logic on this one...

Regardless of append or in-place expansion, wouldn't you still need exactly same number of backup files?

So we are still talking about only the dead weight itself being the problem? And the problem being that when you eventually have several hundreds of those backup files, a fraction of a percent would be dead weight?

To me a process to limit the number of backups or getting rid of no longer relevant backups seems astronomically more storage effective than eliminating dead weight inside the files.

If the concern was that you may need to expand each function multiple times during development, resulting in multiplied dead weight for each expanded function, then again it would be more logical to change that development process.

Instead of only expanding the exact needed amount of bytes, which will not be enough for next version, it would seem more logical to make it over-sized the first time, so that multiple expansion is not necessary... Or to make the tool in-place expand only if the function is the very last function in the file (end of function + 24 bytes == end of file).

If it was up to me, I would simply make the tool add 640 extra bytes whenever expansion is needed to prevent iterative expansion needs. That way you have a buffer which "ought to be big enough for anybody" (sorry could not resist). Then we are talking about maybe 50 expanded functions in a gigantic sized mod à la Long War == 32KB added buffer + size of the original functions which I would expect to be less than 640 bytes on average, resulting in less than 64KB in total with buffer included.
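The estimate works out as follows. This is a quick check using only the figures from the paragraph above, with the "less than 640 bytes on average" guess taken as an upper bound.

```python
BUFFER_BYTES = 640           # extra bytes added per expanded function
EXPANDED_FUNCTIONS = 50      # rough count for a very large mod
MAX_AVG_ORIGINAL_SIZE = 640  # assumed upper bound on average original size

buffer_total = BUFFER_BYTES * EXPANDED_FUNCTIONS
upper_bound_total = buffer_total + MAX_AVG_ORIGINAL_SIZE * EXPANDED_FUNCTIONS

print(buffer_total)       # 32000 bytes, i.e. about 32 KB of added buffer
print(upper_bound_total)  # 64000 bytes, i.e. under 64 KB appended in total
```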

But then again, if up to me, I would actually prefer to have ANY function change resulting in a new appended function, since that would allow me to incrementally rollback unwanted changes to any function without the need to have full file backups in other than very special situations. No need for multiple full file backups would in that situation save lots of disk space instead of wasting it on backups of unchanged irrelevant data.

BUT I still like the fact that you have been experimenting with re-sizing the function as it gives hope to the future possibility to add new objects, which would actually be a very good reason to offset things in the upk files :smile:

Amineri · December 22, 2013

Bertilsson wrote: "Not sure I follow the logic on this one... Regardless of append or in-place expansion, wouldn't you still need exactly same number of backup files?"

I can't say that I can fully answer what my concerns are, as I don't fully understand them myself :)

I don't understand all of the long-term ramifications of trying to manage a large-ish mod where there were a lot of appended functions, and I suppose that's my concern. I haven't been able to model in my head what's going to happen, which gives me a definite sense of unease.

I imagine scenarios like the following :

Working on new mod, I think I need to use a particular function and make it bigger, so I do so
I finish up that version of that mod
I then add more changes
Two weeks later a user reports a bug, and I discover that I changed the wrong function, or something makes me have to roll back that change (it happened with both the ammo and the alien base mods)
There's no possible way I can go to a backup file (it would be two weeks old and missing all of my subsequent changes)
The remaining option is to revert the function back to vanilla but leave the extra dead object in place -- this is probably okay but less than satisfactory, as this is the file that has to be distributed with the installer

So, like I said it's probably okay, but I have an ill-defined sense of unease.

Regarding backups ... I really don't keep very many backups. I'm very careful about always being able to undo my changes (when I've been working with Notepad++ and HxD) and so have almost never had to revert to a backup, despite the crazy amount of mods that I do. I guess I'm just a little bit extra-particular about these types of things, and that is extending to my design philosophy here :)

My "backup philosophy" is then basically to keep really detailed notes about each of the hex changes that I make (before/after) so I can always go back (even months later) and undo a specific change that I'd made. I do this instead of actually making backups of the files themselves.

------------------

I currently have both resizing options (append and in-place) coded and working in a utility class in the UPKmodder tool (which JL has been beta-testing while working on the new version of LW for EW). I can easily make both options available side-by-side.

Currently the append version copies the existing function and fills out the excess space with 0Bs, appending the 24 byte header as specified by wghost.

The in-place version can completely revert the upk without any extra uninstall data -- everything is present in the modfile.

Bertilsson wrote: "BUT I still like the fact that you have been experimenting with re-sizing the function as it gives hope to the future possibility to add new objects, which would actually be a very good reason to offset things in the upk files :smile:"

Since I can insert into the file and fix up all of the object list entries, it should be possible to insert additional object entries (instead of the extra space going into the function area, it would go into the object entry table). Of course some of the primitive header info (namelist size/pos, importlist size/pos, and object size/pos) may have to be updated as well.

This can cascade to a lot of required changes, since both the functions in a class and the locals for a function are defined by a range. Inserting a new object means that all references all over the upk that come after that object would have to be updated. This includes the actual script as well as various headers. The compiler never has to do this, as it just iteratively adds new objects as each new class is added to the package. In this case it's much harder to insert something than it was to build it in the first place.

The trick is I still don't quite understand the adjacency structuring of the functions in classes and variables/parameters in functions. Once that is sorted out, though, it may actually be possible to both insert new function in existing classes and possibly insert entirely new classes. In general adding a class (or even a function) is going to involve inserting a fair number of new objects. Each class seems to require at least two objects, each class variable requires one, each function seems to require 2, and each function localvar/ReturnValue/parameter requires one.
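To get a feel for how many objects a new class would need, here is a small estimator. It uses the tentative per-item counts from the paragraph above ("seems to require"), so treat the result as a lower-bound guess; `objects_needed` is a hypothetical helper.

```python
def objects_needed(class_vars, functions):
    """Estimate the number of new objects for one new class.

    `functions` lists, per function, the number of locals + parameters +
    ReturnValue it declares. Per-item costs are the tentative figures above:
    2 per class, 1 per class variable, 2 per function, 1 per function slot.
    """
    total = 2 + class_vars
    for slots in functions:
        total += 2 + slots
    return total


# Hypothetical class: 3 class variables and three functions.
print(objects_needed(class_vars=3, functions=[2, 0, 4]))  # 17
```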

I'm not entirely sure the need as yet, since Firaxis did leave us a lot of debug classes lying around that can be taken over :D I will be keeping it in mind, though!

Amineri · December 22, 2013

One more note about adding new objects... In most cases it should be possible to get around the need with the following trick:

Each function has a set of local variables, ReturnValue (possibly) and parameters. They form a contiguous block of reference numbers, so contiguous objects, basically. However it is possible to shift a local variable to be a parameter, or vice versa. Or to shift a parameter/local to become a ReturnValue or vice versa.

This doesn't violate the "rule of object conservation" ;)

Further, local variables can be "stolen" from other functions. The trick is just to make sure that the local won't ever be called in your current call chain (so two functions accessing it at the same time) and to make sure that it isn't a watched variable that triggers some sort of processing when changed.

So, if another function parameter or a Return Value is needed, a local variable can be transformed into it, and the function rewritten to use a local variable from another class/function.

What can't be shared in this manner are the class variables (since they are dynamically instanced), nor can entirely new functions be easily added.

Bertilsson · December 22, 2013

Amineri wrote: "I can't say that I can fully answer what my concerns are, as I don't fully understand them myself :smile:
....
I'm not entirely sure the need as yet, since Firaxis did leave us a lot of debug classes lying around that can be taken over :D I will be keeping it in mind, though!"

I think you are sort of semi-stuck in thinking in low-level and high-level at the same time, which wouldn't be strange at all since you are excellent at doing both.

The general idea is to stop making mods which reference absolute storage file positions (or exact before/after byte strings) and instead make high-level references which will be fairly patch-resistant.

On top of this the possibility to expand function size at a cost has come up (cost being either added dead space or offset absolute positions).

The offsetting file positions would basically be exactly what EW patch 1 and EU patch 6 did... Simply put not very popular :smile:

The waste of space alternative seems really trivial by comparison. Especially since you would be making high level mod files which would only result in at most a single dead space function when applied to a fresh copy of the game which you would sooner or later have to revert to in any case.

Even if you during development had to make 100 changes to 100 functions and each change resulted in loss of hundreds of bytes, we are still talking about temporary waste of a few MB data at most, and realistically speaking probably a lot less as I can never imagine close to 100 iterations needed for more than a very few functions.

In addition to this there is the perfectionist side of things. Perfectionist is usually a very good trait for any programmer but in this case I think a tad of "just getting it done with least amount of effort" approach may be more rational :smile:

On the topic of reasons for adding new objects, I can't think of any current need either, it would only be a cool possibility and possibly a catalyst into someone finding it useful :smile:

On a somewhat semi-unrelated note I do now have a few weeks of vacation time, so perhaps I will finally get around to doing some experiments on my own instead of pestering everyone about theoretical possibilities :smile:

Edited December 22, 2013 by Bertilsson
