UPK file format


wghost81

 

// name                 //??                    //size                  //value
82 32 00 00 00 00 00 00 ED 33 00 00 00 00 00 00 04 00 00 00 00 00 00 00 00 00 C0 3F 

 

The basic property fields are:

UNameIndex NameIdx

UNameIndex TypeIdx

int32 Size

int32 ArrayIndex

 

ArrayIndex is zero for variables and counts from 1 for arrays.

 

The value format is determined by the property type: it can be a 1-byte value, an object reference, an integer, a float, etc.

 

The property list ends with a NameIdx that refers to "None".
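
As a rough sketch, the sample bytes at the top of this post can be read field by field like this. The helper names (readU32, readPropertyHeader, readFloat, kSample) are made up for illustration, and treating the 4-byte value as a float is an assumption, since the type name is not resolved against the name table here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper types: field names follow the post, the rest is illustrative.
struct UNameIndex {
    uint32_t Index;   // index into the package name table
    uint32_t Numeric; // numeric suffix
};

struct PropertyHeader {
    UNameIndex NameIdx;
    UNameIndex TypeIdx;
    int32_t Size;
    int32_t ArrayIndex;
};

// Read a little-endian 32-bit value at pos and advance pos.
uint32_t readU32(const std::vector<uint8_t>& buf, std::size_t& pos) {
    assert(pos + 4 <= buf.size());
    uint32_t v = uint32_t(buf[pos]) | (uint32_t(buf[pos + 1]) << 8) |
                 (uint32_t(buf[pos + 2]) << 16) | (uint32_t(buf[pos + 3]) << 24);
    pos += 4;
    return v;
}

// Read NameIdx, TypeIdx, Size and ArrayIndex in the order listed above.
PropertyHeader readPropertyHeader(const std::vector<uint8_t>& buf, std::size_t& pos) {
    PropertyHeader h{};
    h.NameIdx.Index = readU32(buf, pos);
    h.NameIdx.Numeric = readU32(buf, pos);
    h.TypeIdx.Index = readU32(buf, pos);
    h.TypeIdx.Numeric = readU32(buf, pos);
    h.Size = static_cast<int32_t>(readU32(buf, pos));
    h.ArrayIndex = static_cast<int32_t>(readU32(buf, pos));
    return h;
}

// Assumption: the 4-byte value is an IEEE-754 float.
float readFloat(const std::vector<uint8_t>& buf, std::size_t& pos) {
    uint32_t bits = readU32(buf, pos);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// The 28 sample bytes quoted at the top of this post.
const std::vector<uint8_t> kSample = {
    0x82, 0x32, 0, 0, 0, 0, 0, 0,  // NameIdx
    0xED, 0x33, 0, 0, 0, 0, 0, 0,  // TypeIdx
    0x04, 0, 0, 0,                 // Size
    0, 0, 0, 0,                    // ArrayIndex
    0, 0, 0xC0, 0x3F};             // value
```

Calling readPropertyHeader(kSample, pos) with pos = 0 and then readFloat(kSample, pos) consumes all 28 bytes; the last four bytes (00 00 C0 3F) decode to 1.5 when interpreted as a float.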

 

Sorry, I still don't have time to put all this into the UPK format document. But here is some more good news:

 

 

Attempting deserialization:
UObject:
ObjRef = 0x000051E1
UDefaultProperty:
Stub!
UField:
NextRef = 0x000051DF
ParentRef = 0x00000000
UStruct:
ScriptTextRef = 0x00000000
FirstChildRef = 0x000051E1
CppTextRef = 0x00000000
Line = 0x00000CF5
TextPos = 0x00018739
ScriptMemorySize = 0x0000007E
ScriptSerialSize = 0x0000006E
Script decompiler is not implemented!
UFunction:
NativeToken = 0x0000
OperPrecedence = 0x00
FunctionFlags = 0x00020002
NameIdx = 0x00001131 (Index) 0x00000000 (Numeric)

 

 

This is a deserialization example. I'm not trying to re-create UE Explorer; I'm trying to output every possible field in an object's serialized data so that it will be possible to analyze the values and construct our very own objects.

This is what enums look like from the inside:

 

 

FindObjectEntry
Name to find: XGStrategyActorNativeBase.EItemCategory
Found Export Object:
0x000000B9 (185): Enum'XGStrategyActorNativeBase.EItemCategory'
	TypeRef: 0xFFFFFEEB -> Enum
	ParentClassRef: 0x00000000 -> 
	OwnerRef: 0x00000146 -> XGStrategyActorNativeBase
	NameIdx: 0x00000C79 (Index) 0x00000000 (Numeric) -> EItemCategory
	ArchetypeRef: 0x00000000 -> 
	ObjectFlagsH: 0x00000000
	ObjectFlagsL: 0x00070004
		0x00000004: Public
		0x00010000: LoadForClient
		0x00020000: LoadForServer
		0x00040000: LoadForEdit
	SerialSize: 0x00000064 (100)
	SerialOffset: 0x001FEF9E
	ExportFlags: 0x00000000
	NetObjectCount: 0
	GUID: 00000000000000000000000000000000
	Unknown1: 0x00000000
Attempting deserialization:
UObject:
	ObjRef = 0x000000B8 -> ObjRef + 1 = 0x000000B9 -> EItemCategory
UDefaultProperty:
	End of property list: 0x00002594 (Index) 0x00000000 (Numeric) -> None
UField:
	NextRef = 0x000000B8 -> TUFORecord
UEnum:
	NumNames = 0x0000000A (10)
	Names[0]:
		0x00000C70 (Index) 0x00000000 (Numeric) -> eItemCat_All
	Names[1]:
		0x00000C78 (Index) 0x00000000 (Numeric) -> eItemCat_Weapons
	Names[2]:
		0x00000C71 (Index) 0x00000000 (Numeric) -> eItemCat_Armor
	Names[3]:
		0x00000C76 (Index) 0x00000000 (Numeric) -> eItemCat_Vehicles
	Names[4]:
		0x00000C77 (Index) 0x00000000 (Numeric) -> eItemCat_VehicleUpgrades
	Names[5]:
		0x00000C6F (Index) 0x00000000 (Numeric) -> eItemCat_Alien
	Names[6]:
		0x00000C72 (Index) 0x00000000 (Numeric) -> eItemCat_Corpses
	Names[7]:
		0x00000C73 (Index) 0x00000000 (Numeric) -> eItemCat_Facilities
	Names[8]:
		0x00000C75 (Index) 0x00000000 (Numeric) -> eItemCat_Staff
	Names[9]:
		0x00000C74 (Index) 0x00000000 (Numeric) -> eItemCat_MAX

 

Those are just names with no associated objects. So I think it is theoretically possible to expand this list using some weird names from the name table.
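
Here is a small sketch of that idea, assuming the UEnum part of the data is nothing more than NumNames followed by that many 8-byte name indices, as the dump suggests. The function names and the extra index 0x1234 in the usage note are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct UNameIndex {
    uint32_t Index;   // index into the package name table
    uint32_t Numeric; // numeric suffix
};

// Append a little-endian 32-bit value.
void writeU32(std::vector<uint8_t>& out, uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<uint8_t>((v >> shift) & 0xFF));
}

// Read a little-endian 32-bit value at pos and advance pos.
uint32_t readU32(const std::vector<uint8_t>& buf, std::size_t& pos) {
    assert(pos + 4 <= buf.size());
    uint32_t v = uint32_t(buf[pos]) | (uint32_t(buf[pos + 1]) << 8) |
                 (uint32_t(buf[pos + 2]) << 16) | (uint32_t(buf[pos + 3]) << 24);
    pos += 4;
    return v;
}

// Serialize the enum names: NumNames, then one UNameIndex per name.
std::vector<uint8_t> writeEnumNames(const std::vector<UNameIndex>& names) {
    std::vector<uint8_t> out;
    writeU32(out, static_cast<uint32_t>(names.size()));
    for (const UNameIndex& n : names) {
        writeU32(out, n.Index);
        writeU32(out, n.Numeric);
    }
    return out;
}

// Parse the same layout back.
std::vector<UNameIndex> readEnumNames(const std::vector<uint8_t>& buf, std::size_t& pos) {
    uint32_t count = readU32(buf, pos);
    std::vector<UNameIndex> names;
    for (uint32_t i = 0; i < count; ++i) {
        UNameIndex n{};
        n.Index = readU32(buf, pos);
        n.Numeric = readU32(buf, pos);
        names.push_back(n);
    }
    return names;
}

// Name-table indices of EItemCategory, copied from the dump above.
const std::vector<UNameIndex> kItemCategory = {
    {0xC70, 0}, {0xC78, 0}, {0xC71, 0}, {0xC76, 0}, {0xC77, 0},
    {0xC6F, 0}, {0xC72, 0}, {0xC73, 0}, {0xC75, 0}, {0xC74, 0}};
```

The fields shown in the dump add up to the reported SerialSize: 4 (ObjRef) + 8 (None) + 4 (NextRef) + 4 (NumNames) + 10 × 8 (names) = 100 bytes. Inserting one more entry, say a hypothetical {0x1234, 0} before eItemCat_MAX, grows the data by 8 bytes, so the object would presumably have to be resized as well.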

 

It seems that the object reference (the first 4 bytes of any object's serialized data) points either to the previous object or, counting from base 0, to this object itself. At least for scripts. For maps, this value points to some very weird objects, seemingly unrelated to this object.

 

PS: Another update of the UPK format document, with errors corrected!

 

[attachment deleted]

Edited by wghost81

Interesting find on the enums :)

 

We've found that at the hex-editing level, we can pretty much just ignore the enums, as they've already been compiled out of the bytecode itself. The only way that enums manifest themselves is when setting up the configuration file, as they allow you to use the enum names instead of just numbers (I suspect that this is the only reason why the enums get separate objects baked into the upk).

 

In our EW LW development, we've simply extended the localization array size and stripped off the enum association, which has proven sufficient to allow the addition of more localization variables in the .int files. This has allowed us to (so far) expand the number of foundry projects, techs, and soldier names (since there are going to be effectively 16 subclasses in EW LW).


I hate default properties. :smile: Deserializing them is a pain. But I always was a stubborn one. :smile:

 

UPK format document moved from alpha to beta stage. UPK Utils 2.0 too. :smile: But I'm not ready to release a full utilities set and source files just yet. Need to clean up some mess in default properties. :smile: After that I'll move to upgrading PatchUPK for more object-oriented patching.

 

I decided not to add new objects/names functionality to PatchUPK. It is not safe to handle such things in a patch manner. If some mod someday requires something like this, it will be BIG and will require uploading the entire upk for installation anyway. And right now I'm just happy to know we're not bound by any limitations. :smile: Well... at least not by many. :smile:

 

BTW, I was able to deserialize some map objects, like MainSequence and WaveSystem. May still return to modding DLC/final mission. :smile:

 

[attachment deleted]

Edited by wghost81

Interesting find on the enums :smile:

 

We've found that at the hex-editing level, we can pretty much just ignore the enums, as they've already been compiled out of the bytecode itself. The only way that enums manifest themselves is when setting up the configuration file, as they allow you to use the enum names instead of just numbers (I suspect that this is the only reason why the enums get separate objects baked into the upk).

 

In our EW LW development, we've simply extended the localization array size and stripped off the enum association, which has proven sufficient to allow the addition of more localization variables in the .int files. This has allowed us to (so far) expand the number of foundry projects, techs, and soldier names (since there are going to be effectively 16 subclasses in EW LW).

 

One of many reasons.

 

For example it can be used at runtime in UnrealScript through the help of casting and native functions.

Secondly it is also usable by the engine's console commands such as "EditObj" "EditActor" "EditDefault" (might contain typos), and "Set", "Get". And then the Unreal Editor.

Edited by EliotVU

I want to summarize a part of my conversation with Drakous79 here. It is quite interesting and concerns package compression.

 

Thanks to the info from Gildor's forums, I was able to read the compressed chunk header.

 

Now I can say for sure that packages with _size files are in fact one giant compressed chunk. They hold the entire compressed package inside and have no compression info attached. It seems they are referenced from some other "startup" package (I suspect GlobalPersistentCookerData.upk), which holds the necessary information and manages those files.

 

GlobalPersistentCookerData.upk seems to hold one object of PersistentCookerData with package names and other stuff. Exact structure unknown.

 

Normal packages can be either compressed or uncompressed, and the compression info is stored in the package flags and compression flags. Compressed packages have the usual header, followed by NumCompressedChunks compressed chunks of data.

 

Here are the C++ structures that describe the compressed header and data:

#include <cstdint>
#include <vector>

// Sizes of one compressed block
struct FCompressedChunkBlock
{
    uint32_t CompressedSize;
    uint32_t UncompressedSize;
};

// Compressed chunk header, followed by its list of blocks
struct FCompressedChunkHeader
{
    uint32_t Signature;  // equal to the package signature (0x9E2A83C1)
    uint32_t BlockSize;  // maximum size of an uncompressed block, always the same
    uint32_t CompressedSize;
    uint32_t UncompressedSize;
    uint32_t NumBlocks;
    std::vector<FCompressedChunkBlock> Blocks;
};

// Location and size of one compressed chunk
struct FCompressedChunk
{
    uint32_t UncompressedOffset;
    uint32_t UncompressedSize;
    uint32_t CompressedOffset;
    uint32_t CompressedSize;
};
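
One thing the FCompressedChunk entries are handy for is finding which chunk holds a given object. The sketch below assumes each entry covers the half-open uncompressed range [UncompressedOffset, UncompressedOffset + UncompressedSize); findChunk and the sample table in the usage note are made up for illustration and do not come from a real package.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same fields as the FCompressedChunk structure above.
struct FCompressedChunk {
    uint32_t UncompressedOffset;
    uint32_t UncompressedSize;
    uint32_t CompressedOffset;
    uint32_t CompressedSize;
};

// Return the index of the chunk whose uncompressed range contains the given
// offset, or -1 if no chunk covers it (for example, the uncompressed header).
int findChunk(const std::vector<FCompressedChunk>& chunks, uint32_t uncompressedOffset) {
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        const FCompressedChunk& c = chunks[i];
        if (uncompressedOffset >= c.UncompressedOffset &&
            uncompressedOffset - c.UncompressedOffset < c.UncompressedSize) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

With a hypothetical two-entry table {0x1000, 0x100000, 0x1000, 0x40000} and {0x101000, 0x100000, 0x41000, 0x3C000}, an object at uncompressed offset 0x1FEF9E falls into the second chunk (index 1), so only that chunk would need to be unpacked to get at its data.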

So I've been digging a bit into structures, and am trying to line up what's in your latest rev with what I'm seeing in the upk.

 

My goal here was to change the structure:

struct TPsiTrainee
{
    var XGStrategySoldier kSoldier;
    var int iHoursLeft;
    var bool bPsiGift;

    structdefaultproperties
    {
        kSoldier=none
        iHoursLeft=0
        bPsiGift=false
    }
};

into:

struct TPsiTrainee
{
    var XGStrategySoldier kSoldier;
    var int iHoursLeft;
    var int bPsiGift;

    structdefaultproperties
    {
        kSoldier=none
        iHoursLeft=0
        bPsiGift=0
    }
};

(My goal is to re-purpose the bPsiGift variable into storing the psi perk that is being trained)

 

This appears to need to happen in 2 parts:

1) Change the XGFacility_PsiLabs.TPsiTrainee.bPsiGift Object Table entry from BoolProperty to IntProperty.

Doing this changes the type of the value in the structure, but leaves the default property bPsiGift=false.

 

2) Change the default properties in XGFacility_PsiLabs.TPsiTrainee to set default property for bPsiGift.

 

The raw hex for the XGFacility_PsiLabs.TPsiTrainee object is:

79 38 00 00 94 25 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
79 38 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
0F 1A 00 00 00 00 00 00 D2 25 00 00 00 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 
9C 15 00 00 00 00 00 00 A3 16 00 00 00 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 
AF 04 00 00 00 00 00 00 95 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
94 25 00 00 00 00 00 00

which appears to correspond to:

{|TPsiTrainee|} <|None|> 0 0 0 
{|TPsiTrainee|} 0 0 0 0 0 0 
<|kSoldier|> <|ObjectProperty|> 4 0 0 
<|iHoursLeft|> <|IntProperty|> 4 0 0 
<|bPsiGift|> <|BoolProperty|> 0 0 00
<|None|>

I use <| |> to represent 8-byte name indices and {| |} to represent 4-byte object indices. Singleton 0's represent a 4-byte value, while the 00 represents a 1-byte value.

 

The individual DefaultProperties do mostly correspond to the UDefaultProperty definition laid out in the document, although it is not quite clear from it that the 1-byte int8 BoolValue line (marked "BoolProperty only") is omitted entirely except for boolean default properties. It also appears that you've mixed together items that appear only once per Default Properties list and items that appear for each individual Default Property.

 

Each individual Default Property appears to look like:

  • 8 bytes / UNameIndex / NameIdx / Property name
  • 8 bytes / UNameIndex / TypeIdx / Property type
  • 4 bytes / int32 / Size / Property size in bytes -- 0 for BoolProperty
  • 4 bytes / int32 / ArrayIdx / -- 0 for BoolProperty
  • (optional) 1 byte / int8 / BoolValue / BoolProperty only -- omitted for non-BoolProperty
  • (optional) Size / variable / PropertyValue / Size bytes of property value -- omitted for BoolProperty (since Size=0)

The StructNameIdx field doesn't appear to be defined for each individual Default Property value (as it seems in the latest rev), but is only placed once at the end, and is a name reference to "None".
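
To make the layout above concrete, here is a sketch of a reader that walks such a list until it hits "None". It is only a sketch under assumptions: the name-table indices for "None" and "BoolProperty" are passed in by the caller (0x2594 and 0x0495 in the TPsiTrainee hex quoted earlier in this post, but they differ between packages), and the type and function names are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory form of one default property.
struct DefaultProperty {
    uint32_t NameIdx;           // name-table index of the property name
    uint32_t TypeIdx;           // name-table index of the property type
    int32_t Size;               // value size in bytes, 0 for BoolProperty
    int32_t ArrayIdx;
    bool IsBool;
    uint8_t BoolValue;          // only meaningful when IsBool is true
    std::vector<uint8_t> Value; // raw value bytes, empty for BoolProperty
};

// Read a little-endian 32-bit value at pos and advance pos.
uint32_t readU32(const std::vector<uint8_t>& buf, std::size_t& pos) {
    assert(pos + 4 <= buf.size());
    uint32_t v = uint32_t(buf[pos]) | (uint32_t(buf[pos + 1]) << 8) |
                 (uint32_t(buf[pos + 2]) << 16) | (uint32_t(buf[pos + 3]) << 24);
    pos += 4;
    return v;
}

// Walk the property list until the terminating "None" name is consumed.
std::vector<DefaultProperty> readPropertyList(const std::vector<uint8_t>& buf, std::size_t& pos,
                                              uint32_t noneIdx, uint32_t boolTypeIdx) {
    std::vector<DefaultProperty> props;
    for (;;) {
        DefaultProperty p{};
        p.NameIdx = readU32(buf, pos);
        readU32(buf, pos);                 // numeric part of the name, ignored here
        if (p.NameIdx == noneIdx) break;   // end of list
        p.TypeIdx = readU32(buf, pos);
        readU32(buf, pos);                 // numeric part of the type, ignored here
        p.Size = static_cast<int32_t>(readU32(buf, pos));
        p.ArrayIdx = static_cast<int32_t>(readU32(buf, pos));
        if (p.TypeIdx == boolTypeIdx) {
            assert(pos < buf.size());
            p.IsBool = true;
            p.BoolValue = buf[pos++];      // single byte, even though Size is 0
        } else {
            assert(p.Size >= 0 && pos + static_cast<std::size_t>(p.Size) <= buf.size());
            auto first = buf.begin() + static_cast<std::ptrdiff_t>(pos);
            p.Value.assign(first, first + p.Size);
            pos += static_cast<std::size_t>(p.Size);
        }
        props.push_back(p);
    }
    return props;
}

// Property-list part of the TPsiTrainee hex quoted earlier (last four rows).
const std::vector<uint8_t> kPsiTraineeProps = {
    0x0F, 0x1A, 0, 0, 0, 0, 0, 0, 0xD2, 0x25, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0x9C, 0x15, 0, 0, 0, 0, 0, 0, 0xA3, 0x16, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0xAF, 0x04, 0, 0, 0, 0, 0, 0, 0x95, 0x04, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0x94, 0x25, 0, 0, 0, 0, 0, 0};
```

Running readPropertyList(kPsiTraineeProps, pos, 0x2594, 0x0495) from pos = 0 yields three properties (kSoldier, iHoursLeft, bPsiGift) and leaves pos at 89, the end of the data.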

 

The overall Struct composition appears to follow:

  • 4 bytes UObjectReference StructObject
  • 8 bytes UNameIndex TypeIdx Property type -- "None"
  • 4 bytes int32 -- unknown
  • 4 bytes int32 -- unknown
  • 4 bytes int32 -- unknown
  • 4 bytes UObjectReference StructObject
  • 4 bytes int32 -- unknown
  • 4 bytes int32 -- unknown
  • 4 bytes int32 -- unknown
  • 4 bytes int32 -- unknown
  • 4 bytes int32 -- unknown
  • 4 bytes int32 -- unknown
  • variable list of individual Default Properties, terminated with 8 bytes UNameIndex "None"

Some of the unknown 4-byte fields may of course be joined and represent 8-byte name fields. This only roughly appears to correspond to the UStruct description in the latest rev (which has a lot of 'variable' size objects).

 

-------------------------

 

Defining a boolean default property takes 3 fewer bytes than defining a null class or integer default property (25 bytes instead of 28). Fortunately, the in-place resizing method works beautifully here, allowing me to resize the structure up by 3 bytes in order to properly define the new integer member's default value. (I realize that you could also duplicate the entire object, resize the copy up by 3 bytes, and re-link the object table entry to the new object.)
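
The 3-byte figure falls straight out of the per-property layout listed earlier in this post. A tiny sketch of the arithmetic (the function names are made up):

```cpp
#include <cassert>
#include <cstdint>

// Serialized size of one default property:
// 8 (NameIdx) + 8 (TypeIdx) + 4 (Size) + 4 (ArrayIdx),
// then either a single bool byte or valueSize bytes of value.
constexpr int32_t serializedPropertySize(bool isBool, int32_t valueSize) {
    return 8 + 8 + 4 + 4 + (isBool ? 1 : valueSize);
}

static_assert(serializedPropertySize(true, 0) == 25, "BoolProperty entry");
static_assert(serializedPropertySize(false, 4) == 28, "IntProperty or ObjectProperty entry");

// Bytes to add when a bool default property is rewritten as a property
// whose value takes valueSize bytes.
int32_t boolToValueResize(int32_t valueSize) {
    assert(valueSize >= 0);
    return serializedPropertySize(false, valueSize) - serializedPropertySize(true, 0);
}
```

boolToValueResize(4) gives 3, which is the amount the structure has to grow by when bPsiGift becomes an integer.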

 

Currently I'm doing raw hex replacements, as the UPKmodder parser can't handle default properties formatting well enough to let me use named references. The UPKmodder file that resizes the structure and puts in the new hex is:

 

 

MODFILEVERSION=4
UPKFILE=XComStrategyGame.upk
GUID=31 9C 3B 3F 9C 5D E4 40 AB AF 92 8E 25 65 74 F2 // XComStrategyGame_EW_patch1.upk
FUNCTION=TPsiTrainee@XGFacility_PsiLabs
RESIZE=3

// change default value of bPsiGift to 0 to reflect changed variable type

[BEFORE_HEX]
//<|bPsiGift@TPsiTrainee@XGFacility_PsiLabs|> 00 00 00 00 {|Core:BoolProperty@Core|} 00 00 00 00 00 00 00 00 00 00 00 00 00 
AF 04 00 00 00 00 00 00 95 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
[/BEFORE_HEX]


[AFTER_HEX]
//<|bPsiGift@TPsiTrainee@XGFacility_PsiLabs|> 00 00 00 00 {|Core:IntProperty@Core|} 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
AF 04 00 00 00 00 00 00 A3 16 00 00 00 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 
[/AFTER_HEX]
 

 

 

Only the raw hex is being replaced; the line above it is a comment noting what is being replaced.

