Updating wiki articles

wghost81 · March 23, 2015

Since there are many new modders joining our community I suggest reviewing and updating wiki articles on XCOM modding as some of them are not up to date.

I took the liberty of updating the article on hex values: http://wiki.tesnexus.com/index.php/Hex_values_XCOM_Modding

I didn't hand-craft those tables (I mean the wiki formatting) :smile: but reformatted them with regexps from the document I created while I was working on the pseudo-code decompiler: https://drive.google.com/open?id=0B5MAcyqYBx4dSjQ3Nk05akFfVEE&authuser=0

Any suggestions on how this can be improved are welcome.

I also suggest updating this article: http://wiki.tesnexus.com/index.php/Hex_editing_UPK_files

It's getting a bit old, and there is a lot of new data we can add to it to make the learning curve for newcomers a bit smoother. :smile:

Edited March 23, 2015 by wghost81

dubiousintent · March 24, 2015

If someone :whistling: will provide the updated information (especially identifying what needs to be replaced if necessary), I'll be glad to get it into the wiki articles. I think it really takes someone who has been deeply involved in the modding process to understand what could stand to be updated.

I took a quick look at the "Hex values XCOM Modding" article. Other than a couple of lines that have hex values like "0x61" in the Name field and nothing else on the line (that needs some explanation like "Native Function call"), which look like they are intended to be some sort of section separators like you did with "Cast Tokens" (0x38), what would you like formatted? Basically make it into your Tokens table from that pseudo-compiler document?

I would suggest making the Native Functions "0x61", "0x62", "0x65", "0x69", "0x6A", "0x6F", and "Primitive Cast" into their own labelled tables. Shouldn't we also show the function call syntax and parameters?

The "Hex Codes" page, however, will wind up with too much empty space if we use a separate column for each Native Function to show the different meanings of the code in that circumstance. Looking them over, I don't see more than two Native Function or Primitive Cast meanings at a time for any value. So we could probably get by with two "Native Function/Primitive Cast (x)" column labels, and put the "0x6#: usage" in the field.

We could also make each "0x6#: usage" entry into a link to the appropriate "Native Functions/Primitive Cast" table. There is considerable overlap between those pages, so linking between specialized tables that are on the same page may work better and make the relationships clearer.

-Dubious-

Edited March 24, 2015 by dubiousintent

wghost81 · March 24, 2015

If you download the ods table I linked (or open it with Google Sheets), you will see that table in two different forms (Google gives a print preview, and this one wasn't made for printing). I believe the first page is close to one of the variants you suggested. But IMO it isn't that easy to understand.

The thing with the 0x61-0x6F tokens is that they allow the native table to be extended beyond 256 functions.

Since tokens are 1 byte in size, they can hold values from 0x00 to 0xFF, which gives 256 possible values, and there are far more than 256 native functions in UE. To solve this problem, UE uses tokens 0x60-0x6F as markers which indicate that the function's index is outside the 0x00-0xFF range and that the next byte must be read as well to identify the native function. This effectively makes extended native function tokens 2 bytes long and increases the possible index range up to 0xFFF.

Token 0x60 marks the beginning of the extended native tokens group and is never actually used in scripts. I believe they did it this way to save memory, as the most frequently used tokens are 1 byte long and the less frequently used extended tokens are 2 bytes long.

So, indexes 0x00-0x5F belong to the most frequently used tokens, which define the basic unrealscript operators. Those operators are defined in C++ code and are not declared inside the Core and Engine unrealscript classes. Indexes 0x70 to 0xFF are used for common operations, like comparison and summation. The corresponding functions are declared in base unrealscript classes (like Object and Actor) but defined in C++ code. All the other functions, with indexes from 0x100 to 0xFFF, are also declared in unrealscript classes and defined in C++ code. Such functions have the native(XXX) keyword, which indicates that they are part of the native table (XXX is the native table index).
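To make the ranges above easier to see at a glance, here is a minimal Python sketch. The function name classify_token_index and the label strings are made up for this example and do not come from any existing tool; only the range boundaries are taken from the description in this post.

```python
def classify_token_index(index: int) -> str:
    """Return a rough label for the range a token / native table index falls in.

    The boundaries follow the forum description; the labels are illustrative only.
    """
    if not 0x00 <= index <= 0xFFF:
        raise ValueError(f"index out of range: {index:#x}")
    if index <= 0x5F:
        return "basic operator (1 byte)"
    if index <= 0x6F:
        return "extended native marker (first byte of a 2-byte token)"
    if index <= 0xFF:
        return "common native function (1 byte)"
    return "extended native function (2 bytes)"


if __name__ == "__main__":
    # Print one sample index from each range.
    for sample in (0x1B, 0x63, 0x9A, 0x399):
        print(f"{sample:#05x}: {classify_token_index(sample)}")
```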

Basic unrealscript operators each have their own specific hex data format, while all the native functions follow the same pattern:

<ExtendedNativeToken> <FunctionToken> <Expression1> ... <ExpressionN> <EndParmExp>

For tokens 0x70 through 0xFF, <ExtendedNativeToken> is omitted to save space. For tokens 0x100 through 0xFFF:

<ExtendedNativeToken> = 0x60 + ((0xF00 & NativeTableIndex) >> 8)
<FunctionToken> = 0x0FF & NativeTableIndex

So, for the function with index 921 = 0x399

<ExtendedNativeToken> = 0x60 + ((0xF00 & 0x399) >> 8) = 0x60 + (0x300 >> 8) = 0x60 + 0x3 = 0x63
<FunctionToken> = 0x0FF & NativeTableIndex = 0x0FF & 0x399 = 0x99

The shortened version of this explanation is given inside tokens table on wiki page. Since all the extended functions follow the same pattern, their hex data format is identical.
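As a rough illustration of that pattern, the encoding could be sketched in Python like this. The function name encode_native_token is hypothetical. The sketch assumes that the high nibble of the index (bits 8-11) is shifted down by 8 bits and added to 0x60, which reproduces the 0x63 0x99 pair for index 921 (0x399).

```python
def encode_native_token(native_index: int) -> bytes:
    """Encode a native table index as the 1- or 2-byte token described above.

    Assumption: the marker byte is 0x60 plus the high nibble of the index,
    and the second byte is the low 8 bits of the index.
    """
    if 0x70 <= native_index <= 0xFF:
        # Common native functions: the marker is omitted, one byte is enough.
        return bytes([native_index])
    if 0x100 <= native_index <= 0xFFF:
        marker = 0x60 + ((native_index & 0xF00) >> 8)
        function_token = native_index & 0x0FF
        return bytes([marker, function_token])
    raise ValueError(f"not a native table index: {native_index:#x}")


if __name__ == "__main__":
    # Index 921 is the worked example from the post.
    print(encode_native_token(921).hex(" ").upper())  # 63 99
```

The parentheses around the mask-and-shift matter: without them the addition would be evaluated before the shift.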

So, to make a long story short, the table is indeed long, and the 0x61, 0x62, etc. tokens mark the different "pages" of the table. I agree there has to be more explanation of how it works, but right now I have only copied my working data from the ods document.
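The "pages" idea can be sketched in Python by reading the marker byte as a page number. decode_native_index is a hypothetical helper, not a function of any existing tool. It assumes markers 0x61-0x6F select pages 1 to 15 of 256 entries each, and that a single byte in 0x70-0xFF is its own index.

```python
def decode_native_index(buffer: bytes, offset: int = 0) -> tuple[int, int]:
    """Read a native function token at `offset`.

    Returns (native_table_index, number_of_bytes_consumed).
    Assumption: 0x60 itself never appears in scripts, so it is rejected here.
    """
    first = buffer[offset]
    if 0x61 <= first <= 0x6F:
        page = first - 0x60           # which 256-entry "page" of the table
        entry = buffer[offset + 1]    # position inside that page
        return (page << 8) | entry, 2
    if first >= 0x70:
        return first, 1               # single-byte native function
    raise ValueError(f"byte {first:#04x} is not a native function token")


if __name__ == "__main__":
    # 63 99 is the encoded form of index 921 from the worked example.
    index, size = decode_native_index(bytes.fromhex("63 99"))
    print(index, hex(index), size)
```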

In the same way, cast tokens belong to a different table, as they are not reused tokens but the second part of the 2-byte token 0x38.

Since tokens 0x00 through 0x5F do not have an associated Package.Class, have a different hex format, and have a lot of notes, we can make a separate table just for them. Then we can put all the explanations about the 0x60-0x6F group, and all the other tokens can go into a separate table with Dec, Hex, Name, Expression and Package.Class columns.

What do you think?

UPD: Yes, I agree to provide an update on hex editing upk files. I won't make it PatchUPK-specific and will try to stick to simple hex examples. But it's a lot of work, and I hope others will share their thoughts and experience on this matter as well. :wink:

Edited March 24, 2015 by wghost81

dubiousintent · March 25, 2015

I was indeed looking at a downloaded copy of the ODS file and trying to make sense of it. With that additional explanation I think we have a basis to start from.

So first we point out that this is a "tokens" table, that "tokens" are for basic UnrealScript and Native Functions, and the table is intended to be used with both UE and direct hex editing. Then explain the token size limit and how the "Extended Native tokens" take the function indexes beyond the 256 limit and are listed in separate tables. And then we explain the Cast Tokens.

Keeping tables separate helps both with suitable formatting and explanations. Some examples of how each table would be used to identify and create a hex change would probably help as well.

I think I have enough to put together an improved draft version. Then you can straighten out anything I get confused, and I'll proofread for clarity. I tend to put the article together "offline" (especially with tables) and make final adjustments "online", so you may not see any changes immediately.

I don't think there is any problem with putting PatchUPK-specific information into the article, as long as it is identified as such and it is made clear that it is not the only possible method. Outside of the Long War team, PatcherGUI/PatchUPK is probably the most utilized patching tool, if only because of its versatility.

-Dubious-

wghost81 · March 26, 2015

Here's my first attempt at updating the hex editing article: https://docs.google.com/document/d/1nCVBdUvEof8m4pjvFq2R9W5bDikwgay0ogy6OQiU0pY/edit?usp=sharing

tracktwo · March 27, 2015

I tried to update the voice/sound editing wiki page but I had a lot of trouble with timeouts. I did put together a more complete document/tutorial on creating voice packs, so dubiousintent, if you feel like it, feel free to take any useful information/files/screenshots from here and put them up on the wiki.

The tutorial folder is here: https://drive.google.com/open?id=0BylHgRB_52WHflZSVXd4T18xSlU4bjA4bFdTQlpOQUpCWkZWdHBocUhlckNxeEFuZGhiV0k&authuser=0

With the main document here: https://drive.google.com/open?id=1prH2gxib8uiTGEy6fdNVayioQXsAn97x_qaomJNNz1U&authuser=0

Edit: It's mostly geared toward Long War, but the other information in the voice info article and/or thread should hopefully be enough to enable custom voices in vanilla too.

Edited March 27, 2015 by tracktwo

dubiousintent · March 28, 2015

@tracktwo: Thanks. Added to the list of updates.

@wghost81: I'm tackling the "Hex Editing" article now. The "Tokens table" update will be next after that.

-Dubious-

wghost81 · March 28, 2015

dubiousintent, thanks!

dubiousintent · March 28, 2015

The initial pass at updating the "Hex editing UPK files" article is now up.

-Dubious-

Edited March 28, 2015 by dubiousintent

wghost81 · March 28, 2015

Thanks, dubiousintent!

IMO, these sections are obsolete, as manipulating the object's buffer is no longer needed.

Manipulating hex code
Once we've located the function we want to change and we have its hex code, we'll want that hex code in a place where we can edit it. So while in View Buffer in UE Explorer, we click Edit, then Dump Bytes. This copies the hex buffer to the clipboard. Then, open Notepad++ and paste it into a text document. It will show each byte separated by "-", so to make it more readable we can select the whole text (<Ctrl+Home>, then <Shift+Ctrl+End>), and make a search and replace (<Ctrl+H> opens the search-and-replace window). There in "find what" type a hyphen, and in "replace for" enter a blank space. Finally, select replace all and it's done. It may be useful as well to break the code into lines of 16 bytes, to match UE Explorer's buffer view, but beware it may not perform searches as desired because of the line breaks.

Locating a hex value within a function
Now comes the fun part. There are three possible approaches to find a specific value inside a function hex code:
...

The original article was created when UEE didn't have the decompiled/disassembled tokens view, so modders had to practically repeat all the decompilation steps manually to locate a hex value inside the buffer. Although it's a good historical reference, leaving it as it is alongside the decompiled/disassembled tokens functionality mixes things up and potentially confuses readers.
