johnnylump Posted December 24, 2013 (edited)
NVM, wrong place
wghost81 (Author) Posted December 24, 2013 (edited)
Amineri, yes, namelist indexes are 8 bytes long. I'm almost ready to write up a description of the full UPK object data format. It will answer a LOT of questions, as there are a LOT of things I've learned lately. :smile:
The first and most important thing is: packages are data streams. This is the place where all programmers grin and leave. :smile: The rest may be clear to programmers, but I'm not a programmer, :smile: so I needed a great deal of time to understand what it means.
Basically, all objects have persistent data, which are serialized into a binary data stream (i.e. the package). If a class inherits from another class, the parent class data are serialized first. For example, UState inherits from UStruct, which inherits from UField, which inherits from UObject (all UE classes inherit from UObject, BTW). So the first bytes in a State's serial data describe the UObject persistent data, the next describe the UField data, and so on.
UObject data typically consist of an ObjectIndex and a list of serialized properties (for non-Class objects), which ends with "None". And that's exactly what you can see in the function "header": 4 bytes of ObjectReference + 8 bytes of uint64 namelist index to "None", as functions don't typically have properties and "None" indicates the end of the property list. And so on. Memory and file sizes and script tokens are actually part of the UStruct serialized data, while the function "footer" is UFunction data.
Another important thing: almost everything in UE is variable-size! So, to be on the safe side, we can't assume anything; we need to rebuild the object data if, for example, we're altering its type.
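The function "header" described in the post above can be read with a few lines of Python. This is only a minimal sketch under stated assumptions: the data is little-endian, the 8-byte namelist index is read as two uint32 values (table index plus numeric suffix), and the name table and sample bytes are made up for illustration. `read_function_header` is a hypothetical helper name, not part of any existing tool.

```python
import struct

def read_function_header(data, names, offset=0):
    """Read the 12-byte UObject part of a function's serial data.

    Layout taken from the post: 4 bytes of object reference, then an
    8-byte namelist index that should resolve to "None" because
    functions normally carry no serialized properties.
    Assumption: little-endian, index split into two uint32 values.
    """
    obj_ref, name_idx, name_count = struct.unpack_from("<iII", data, offset)
    if names[name_idx] != "None" or name_count != 0:
        # A real property list follows, so the header is longer than
        # 12 bytes and this sketch does not try to parse it.
        raise ValueError("object has serialized properties")
    return obj_ref, offset + 12

# Hypothetical name table and sample bytes, for illustration only.
names = ["Begin", "None"]
sample = struct.pack("<iII", -5, 1, 0) + b"\xff"

obj_ref, next_offset = read_function_header(sample, names)
print(obj_ref, next_offset)
```

The returned offset is where the UStruct data would start, which is why the function hands it back instead of assuming a fixed position.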
Amineri Posted December 24, 2013
At least everything in Unreal bytecode is byte-aligned... In ActionScript the designers went so far in trying to squeeze out every drop of bandwidth that they use variables with variable bit-width. So when I was decoding/re-encoding the perk tree sprites, sometimes the objects were 13-bit integers and sometimes 14-bit integers. And not guaranteed to start on a byte boundary! :sick:
I think, though, it's safe to assume that all the Unreal bytecode is at least byte-aligned ;)
XMarksTheSpot Posted December 24, 2013 (edited)
Quote (Amineri): "In actionscript the designers went so far in trying to squeeze out every drop of bandwidth that they use variables with variable bit-width. So when I was decoding/re-encoding the perk tree sprites sometimes the objects were 13 bit integers and sometimes 14 bit integers. And not guaranteed to start on a byte boundary! :sick:"
Actually, the ActionScript portions of a Flash file are byte-aligned, as they are stored as bytecode. The various other SWF tags, like sprite and shape definitions, however, are bit-packed to the max :smile: </smart-ass>
wghost81 (Author) Posted December 25, 2013 (edited)
While I'm still working on the format description document, I figured I would post some interesting info.
A name index is actually part of a structure (or class). I called it UNameIndex, as I don't know what it is actually called in the UE sources. :smile: UNameIndex serializable data are:
    uint32 NameTableIdx: 4 bytes, index into the Name Table;
    uint32 NameCount: 4 bytes, numeric suffix.
This structure is used in the Import/Export Tables as well as in serialized object data. If NameCount is zero, the name string is equal to the Name Table name extracted by NameTableIdx. Otherwise, a numeric suffix equal to NameCount-1 is added to the name string. Example: "SomeObjectName_0", "SomeObjectName_1", ...
"Text" variables are also part of a structure/class. I called it UString. UString serializable data are:
    int32 StringLength: 4 bytes, length of the string;
    variable-size String: ASCII or Unicode string.
If StringLength > 0, String is an ASCII null-terminated string. If StringLength < 0, String is a Unicode string.
With this notation, a Name Table Entry consists of two fields:
    UString Name
    uint64 NameFlags
Import Table Entry:
    UNameIndex PackageIdx
    UNameIndex TypeIdx
    int32 OwnerRef
    UNameIndex NameIdx
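The structures listed in the post above translate fairly directly into Python `struct` reads. The following is a rough sketch, not a verified parser. It assumes little-endian data, assumes StringLength counts characters including the null terminator, and assumes a negative StringLength means UTF-16LE with two bytes per character. All function names and sample values are hypothetical.

```python
import struct

def read_name_index(data, offset):
    """UNameIndex: uint32 NameTableIdx + uint32 NameCount (8 bytes)."""
    idx, count = struct.unpack_from("<II", data, offset)
    return (idx, count), offset + 8

def resolve_name(names, idx, count):
    """Apply the NameCount rule: 0 means plain name, else '_<count-1>'."""
    base = names[idx]
    return base if count == 0 else f"{base}_{count - 1}"

def read_ustring(data, offset):
    """UString: int32 StringLength followed by the characters."""
    (length,) = struct.unpack_from("<i", data, offset)
    offset += 4
    if length > 0:
        # ASCII, null-terminated (assumed: length includes the terminator)
        raw = data[offset:offset + length]
        return raw.rstrip(b"\x00").decode("ascii"), offset + length
    if length < 0:
        # Unicode (assumed: UTF-16LE, -length characters incl. terminator)
        nbytes = -length * 2
        raw = data[offset:offset + nbytes]
        return raw.decode("utf-16-le").rstrip("\x00"), offset + nbytes
    return "", offset

def read_name_entry(data, offset):
    """Name Table Entry: UString Name + uint64 NameFlags."""
    name, offset = read_ustring(data, offset)
    (flags,) = struct.unpack_from("<Q", data, offset)
    return (name, flags), offset + 8

def read_import_entry(data, offset, names):
    """Import Table Entry: PackageIdx, TypeIdx, OwnerRef, NameIdx (28 bytes)."""
    p_idx, p_cnt, t_idx, t_cnt, owner, n_idx, n_cnt = struct.unpack_from(
        "<IIIIiII", data, offset)
    entry = {
        "package": resolve_name(names, p_idx, p_cnt),
        "type": resolve_name(names, t_idx, t_cnt),
        "owner_ref": owner,
        "name": resolve_name(names, n_idx, n_cnt),
    }
    return entry, offset + 28

# Made-up sample data, for illustration only.
name_bytes = struct.pack("<i", 5) + b"None\x00" + struct.pack("<Q", 7)
name_entry, after_name = read_name_entry(name_bytes, 0)

wide_bytes = struct.pack("<i", -3) + "Hi\x00".encode("utf-16-le")
wide_text, after_wide = read_ustring(wide_bytes, 0)

names = ["Core", "Class", "Foo"]
import_bytes = struct.pack("<IIIIiII", 0, 0, 1, 0, -3, 2, 1)
import_entry, after_import = read_import_entry(import_bytes, 0, names)
print(name_entry, wide_text, import_entry)
```

Every reader returns the offset just past what it consumed, since the variable-size UString means the position of the next field cannot be computed in advance.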
wghost81 (Author) Posted December 25, 2013
Quote: "This next-to-last hex line starts with 0x0C, which is a "Label Table Token" and is then followed by 24 bytes. Looking for possible interpretations, the B2 09 00 00 looks like it could be a namelist reference to "Begin", while 76 5F 00 00 looks like it could be a namelist reference to "None". So in general it looks like the token parses as 0C <namelist reference> 00 00 00 00 <4 bytes> <namelist reference> <4 bytes>"
This is the LabelTable. One of the state binary data fields defines LabelTableOffset, which is the memory offset of the LabelTable. A LabelTable entry is:
    UNameIndex NameTableIdx
    uint32 LabelMemoryPos
The LabelTable begins with the 0x0C token. The last LabelTable entry contains a NameTableIdx pointing to "None" with LabelMemoryPos = 0x0000FFFF (which is ignored, obviously).
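As an illustration of the LabelTable layout described in the post above, here is a short Python sketch that walks the table until it reaches the "None" terminator. It assumes little-endian data and 12-byte entries (8-byte UNameIndex plus uint32 LabelMemoryPos). The name indexes 0x09B2 and 0x5F76 are the ones guessed at in the quoted text, the LabelMemoryPos of 0 for "Begin" is a made-up value, and `read_label_table` is a hypothetical helper name.

```python
import struct

LABEL_TABLE_TOKEN = 0x0C
ENTRY = struct.Struct("<III")  # NameTableIdx, NameCount, LabelMemoryPos

def read_label_table(data, offset, names):
    """Return ([(label_name, memory_pos), ...], offset_after_table)."""
    if data[offset] != LABEL_TABLE_TOKEN:
        raise ValueError("expected 0x0C label table token")
    offset += 1
    labels = []
    while True:
        idx, count, pos = ENTRY.unpack_from(data, offset)
        offset += ENTRY.size
        if names[idx] == "None":
            # Terminating entry; its LabelMemoryPos (0x0000FFFF) is ignored.
            break
        # NameCount is not applied here, to keep the sketch short.
        labels.append((names[idx], pos))
    return labels, offset

# Hypothetical name table holding only the two indexes from the quote.
names = {0x09B2: "Begin", 0x5F76: "None"}

# 0x0C token, then "Begin" at position 0 (made up), then the terminator.
sample = bytes.fromhex(
    "0C"
    "B2090000 00000000 00000000"
    "765F0000 00000000 FFFF0000"
)

labels, end = read_label_table(sample, 0, names)
print(labels, end)
```

With one real label plus the terminator, the token is followed by 24 bytes, which matches the length noted in the quoted observation.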
Amineri Posted December 25, 2013
Is the uint32 LabelMemoryPos filled out at run time (i.e. is it a position in actual memory), or is it relative to the function start, like most of the uint16 memory positions used (e.g. by the 0x07 JumpIfNot token)?
I'm wondering whether I should decipher that uint32 as a jump offset that is (in all the examples I saw in XCOM code) pointing to the head of the function, or just leave it alone. It doesn't seem like a value that'd be likely to be modded, but you never know...
wghost81 (Author) Posted December 26, 2013
The LabelTableOffset is located in the state "header", and if you want to expand the state script, you'll most certainly need to change this value. LabelMemoryPos is similar to a jump offset, i.e. it points to a memory position relative to the start of the current script.
All this will be described in the UPK format document. I'm not ready to release it yet, as it is a bit messy. :smile:
wghost81 (Author) Posted December 27, 2013 (edited March 17, 2014)
OK, I'm ready to release an alpha version of the UPK format description document. It is incomplete and may contain errors, so be careful. I'm not taking any credit for discovering all these things; I just gathered the info from different sources.
For the last two days I tried to use this info to add new objects to the package, but I had no success. :sad: Even adding a new name to the Name Table causes the game to crash. The interesting thing is that UE Explorer decompiles my modified UPK correctly: I was able to add new names and objects and even add new variables to existing structures... but none of this works with the actual game. :sad: I don't know whether I made some mistakes or XCOM just doesn't allow it. :sad:
[attachment deleted]
wghost81 (Author) Posted December 27, 2013 (edited)
OK, I'm the biggest idiot on this planet. :smile: Everything works: I am able to add new objects as child objects of existing objects! In short: I am able to add new variables! Hooray! :smile: Christmas miracle. :smile:
PS You will laugh, but I forgot to delete the uncompressed_size files from my game files. :smile: I had verified the game recently to check for bugs in the meld arrows mod, and I completely forgot about it. :smile: