CerebralPolicy Posted October 17, 2023 (edited)
I've been working on some retextures for my own pseudo-faction, and now that SF1Edit has been released as a beta, I'd like to make them as non-replacements. In previous games this would have been easy: even if Material Editor hasn't been updated, a .bgsm file is simply a .json config. I could edit it in Notepad++ and boom, done. One tiny issue. What black magic is this, Bethesda? I can't open this. I know for a fact that the material files are in this database, since the nifs refer to a .mat file. I can generate a .mat file using Blender, but it would be easier if I could crack open this database. Hopefully someone will make a tool that allows this, since I can't know what the _mask textures actually do until I can see Bethesda's .mats.
UPDATE: Blender can't really generate a .mat, and I have no idea how the .mats are formatted to begin with.
Edited October 17, 2023 by CerebralPolicy
StandardTacticalKnight Posted October 19, 2023
Yeah, from opening it with a hex editor, it is some sort of hashmap for all of these materials. If you open it in a hex or text editor, it is all stored in plaintext, so you can Ctrl+F for the relevant .dds filenames you need, I guess, but knowing the proper format for the .mat files would be nice... For a standalone item, though, yeah, good luck. I guess you could try to find some unused or test mat and hijack its material references?
CerebralPolicy Posted October 19, 2023 (Author)
Yeah, I'll see if I can do that. Essentially, I'd like to have uniforms for each section of my mercenary company; right now I can only have four, since that's the number of First outfits available.
grizbane Posted November 30, 2023 (edited)
Just ran into this myself. Navigated from the ESM file, which has a record that points to a NIF file that resides in a BA2 file. But the NIF file has a reference to a *.mat file, which I assume must be in this CDB file. I think this may be a Constant Database file (see https://docs.fileformat.com/database/cdb/ ). Sadly, the original CDB reader doesn't compile on my CentOS system (I think the code is too old and relies on out-of-date libraries), and TinyCDB compiles but apparently chokes on the materials.cdb file. Still not sure how to map the filename in the NIF file to a key in this (supposed) CDB KV store, but heck, I still cannot even find a way to read the CDB file. Did Bethesda have some sort of "find the most obtuse file formats known to man" contest when designing their games? Why BA2 files when we have tar and zip files already? Why CDB files that somehow hold *.MAT files (whatever the heck they are) instead of just packing the MAT files into a BA2 file? Anyway, going to go read the specs for the CDB format and see if I can cobble together a Java program that might read the file. Assuming it even is a Constant DB file...
Edited November 30, 2023 by grizbane
grizbane Posted November 30, 2023
Hm. Nope. Whatever that "materials.cdb" file is, it's not a "Constant Database" file. Found information on the CDB format here: https://www.unixuser.org/~euske/doc/cdbinternals/index.html A Constant Database file begins with 4-byte binary integers (two of them in the first 8 bytes), followed by a bunch of further 4-byte offsets into the file. This file starts with "BETH" and then lots of ASCII characters. So no idea what type of file we've got here.
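The check described in this post amounts to looking at the first four bytes of the file. A minimal Python sketch of that test follows; the function name and the sample byte strings are made up for illustration, and the second sample is only a stand-in for a header made of binary integers, not a real Constant Database file.

```python
import struct

def starts_with_beth(data: bytes) -> bool:
    """Return True if the data begins with the ASCII tag "BETH"."""
    return data[:4] == b"BETH"

# Hypothetical sample headers, built by hand for this example.
beth_like = b"BETH" + struct.pack("<I", 8)
binary_ints = struct.pack("<II", 2048, 0)

print(starts_with_beth(beth_like))    # tag found
print(starts_with_beth(binary_ints))  # no tag, just binary integers
```

For a real file, the same function could be applied to the first few bytes read with `open(path, "rb").read(4)`.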
fo76utils Posted November 30, 2023
It is unrelated to other file formats with a ".cdb" extension; the format is actually the same as that of the REFL subrecords in Starfield.esm. These are used to serialize various data structures and are not limited to materials. It is possible to dump the information using the mat_info tool from here, or with esmview or esmdump in the case of the ESM.
captsensib1e Posted November 30, 2023
Yeah not a "sg-cdb" database, tried opening it with Java. Also thought it might be some sort of .db sqlite file, but it doesn't seem to be that either.
fo76utils Posted November 30, 2023 (edited)
For reference, here is a short description of the reflection data format. It consists of a set of chunks that begin with the chunk type (4 bytes: BETH, STRT, TYPE, CLAS, OBJT, DIFF, LIST, MAPC, USER or USRD), followed by the data size as a 32-bit integer, then the chunk data. The first three chunks are always BETH, STRT and TYPE, in this order:
- BETH: Header. The size is always 8 bytes, and the data consists of two 32-bit integers: a version number that is currently always 4, and the total number of chunks in the stream (this includes the BETH itself).
- STRT: String table. It is a set of C style NUL terminated strings concatenated to a single data block. In the rest of the stream, type and variable names are 32-bit signed integer offsets into the string table. There is also a set of pre-defined types that can be referenced with negative string table offsets (see below).
- TYPE: It contains a single 32-bit integer that is the number of CLAS chunks to follow.
These are followed by class definitions (CLAS), the same number as specified in TYPE. The format of CLAS data is:
- Class name as string table offset.
- Class version/ID as a 32-bit integer, typically 0, 1 or 2.
- Flags as 16-bit integer. If bit 2 (0x0004) is set, then a USER or USRD chunk will be used to store the structure data. Bit 3 (0x0008) is set on certain structures, but its exact purpose is unknown. Other flag bits seem to be currently unused.
- Number of field definitions to follow (16-bit integer).
The definition of a single field consists of the name and type (both as 32-bit string table offsets), then the data offset and size as 16-bit integers. The latter two refer to how the structure data is stored in memory in the game (with alignment etc. taken into account), and are not required for decoding.
The class definitions are followed by the actual data.
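As a rough illustration of the chunk layout described so far, here is a minimal Python sketch that walks the chunks of a reflection stream and decodes BETH, STRT lookups and CLAS. Little-endian byte order is an assumption (the description does not state it), the function names are made up for this example, and the test stream at the end is synthetic; this is not the code used by mat_info.

```python
import struct

def read_chunks(data: bytes):
    """Yield (chunk_type, chunk_data) pairs from a reflection stream."""
    pos = 0
    while pos + 8 <= len(data):
        chunk_type = data[pos:pos + 4].decode("ascii")
        # Assumption: sizes are little-endian and count only the chunk data.
        (size,) = struct.unpack_from("<I", data, pos + 4)
        yield chunk_type, data[pos + 8:pos + 8 + size]
        pos += 8 + size

def parse_beth(chunk: bytes):
    """BETH: version number and total number of chunks."""
    version, chunk_count = struct.unpack("<II", chunk)
    return version, chunk_count

def string_at(strt: bytes, offset: int) -> str:
    """Look up a NUL terminated string by its offset in the STRT data."""
    end = strt.index(b"\0", offset)
    return strt[offset:end].decode("ascii", errors="replace")

def parse_clas(chunk: bytes, strt: bytes):
    """CLAS: class name, version, flags, then the field definitions."""
    name_ofs, version, flags, field_count = struct.unpack_from("<iIHH", chunk, 0)
    fields = []
    pos = 12
    for _ in range(field_count):
        f_name, f_type, mem_ofs, mem_size = struct.unpack_from("<iiHH", chunk, pos)
        # Negative type offsets refer to built-in types, so keep them as numbers.
        type_ref = string_at(strt, f_type) if f_type >= 0 else f_type
        fields.append((string_at(strt, f_name), type_ref, mem_ofs, mem_size))
        pos += 12
    return {
        "name": string_at(strt, name_ofs),
        "version": version,
        "uses_user_chunk": bool(flags & 0x0004),
        "fields": fields,
    }

def make_chunk(tag: bytes, payload: bytes) -> bytes:
    """Helper for building the synthetic test stream below."""
    return tag + struct.pack("<I", len(payload)) + payload

# Synthetic stream: one hypothetical class "Example" with one int32 field "Value".
strt = b"Example\0Value\0"
clas = struct.pack("<iIHH", 0, 1, 0x0004, 1) + struct.pack("<iiHH", 8, -244, 0, 4)
stream = (make_chunk(b"BETH", struct.pack("<II", 4, 4)) + make_chunk(b"STRT", strt)
          + make_chunk(b"TYPE", struct.pack("<I", 1)) + make_chunk(b"CLAS", clas))

chunks = list(read_chunks(stream))
print([tag for tag, _ in chunks])
print(parse_clas(chunks[3][1], chunks[1][1]))
```

The memory offset and size of each field are kept in the result only for completeness; as noted above, they are not needed for decoding.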
Each object is stored as an OBJT or DIFF chunk, which begins with the data type (as string table offset). The difference between OBJT and DIFF is that the former just contains all data fields as defined in the CLAS, while the latter uses a "differential" format that allows for encoding only a subset of the fields. In DIFF, each field is stored as a 16-bit signed field number (0 = first) followed by the data. A negative field number denotes the end of the structure. The differential format is not used within simple built-in types (like integers and floats) that are not structures, but it is inherited by sub-structures that are stored in separate LIST, MAPC, USER or USRD chunks.
Regular structures can be nested within OBJT and DIFF; however, certain data types require additional chunks, which are stored separately after the parent object. These special types include:
- LIST: A list of objects. It begins with the element type (string table offset) and the number of elements (32-bit integer), followed by the element data.
- MAPC: A map of objects, similar to LIST, but it contains key, value pairs, and it begins with the key and value types (two string table offsets) and the number of pairs.
- USER, USRD: Used for sub-structures with the "user" (0x0004) flag set, USER if the parent is OBJT, and USRD if it is DIFF. These allow for type conversions, and always begin with the class name (similarly to OBJT and DIFF) and end with an unknown 32-bit integer that seems to be always a small non-negative value, typically 0, 1 or 2. After the class name, the data is stored as type, value pairs. If the type is the same as the class name (no actual conversion is performed), then the encoding of the data is identical to OBJT and DIFF. Otherwise, a type, value pair is stored for each field of the structure, and the type seems to be always a built-in one (negative string table offset) based on the data I could test in the ESM and CDB.
Note that in this case, there seems to be no difference between USER and USRD, and a value can be assigned even to an empty structure (0 fields).
Finally, here is the table of built-in types:
- 0xFFFFFF01 (-255): Null, no data.
- 0xFFFFFF02 (-254): String, a 16-bit length value followed by the C style string data (including a terminating NUL character).
- 0xFFFFFF03 (-253): List, requires a separate LIST chunk.
- 0xFFFFFF04 (-252): Map, requires a separate MAPC chunk.
- 0xFFFFFF05 (-251): Pointer/reference to anything, stored as a pair of type and data (the type is a string table offset).
- 0xFFFFFF08 (-248): 8-bit signed integer.
- 0xFFFFFF09 (-247): 8-bit unsigned integer.
- 0xFFFFFF0A (-246): 16-bit signed integer.
- 0xFFFFFF0B (-245): 16-bit unsigned integer.
- 0xFFFFFF0C (-244): 32-bit signed integer.
- 0xFFFFFF0D (-243): 32-bit unsigned integer.
- 0xFFFFFF0E (-242): 64-bit signed integer.
- 0xFFFFFF0F (-241): 64-bit unsigned integer.
- 0xFFFFFF10 (-240): Boolean (0 or 1 as an 8-bit integer).
- 0xFFFFFF11 (-239): 32-bit float.
- 0xFFFFFF12 (-238): 64-bit float.
The above is only a description of the general reflection data. However, the material database can be dumped in a human readable format with mat_info -dump_db, which could be of help understanding the structures it uses. The 32-bit hashes in resource IDs use CRC32; this sample code correctly calculates them (paths are expected to use lower case characters only, and backslashes as separators).
Edited December 2, 2023 by fo76utils
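To make the type table and the differential encoding more concrete, here is a hedged Python sketch. It covers only the fixed-size built-in scalars and decodes DIFF style field data for a flat class whose fields are all scalars; strings, lists, maps, pointers and nested structures are left out. Little-endian byte order is an assumption, the function names are invented for this example, and the input starts after the leading data type of the DIFF chunk.

```python
import struct

# Built-in scalar types from the table: signed ID -> (name, struct format).
# Assumption: values are little-endian; the post does not state the byte order.
SCALAR_TYPES = {
    -248: ("int8", "<b"),
    -247: ("uint8", "<B"),
    -246: ("int16", "<h"),
    -245: ("uint16", "<H"),
    -244: ("int32", "<i"),
    -243: ("uint32", "<I"),
    -242: ("int64", "<q"),
    -241: ("uint64", "<Q"),
    -240: ("bool", "<?"),
    -239: ("float32", "<f"),
    -238: ("float64", "<d"),
}

def to_signed32(value: int) -> int:
    """Convert an unsigned 32-bit type ID such as 0xFFFFFF0C to its signed form."""
    return value - 0x100000000 if value & 0x80000000 else value

def read_scalar(data: bytes, pos: int, type_id: int):
    """Decode one built-in scalar at pos; return (value, new_pos)."""
    _, fmt = SCALAR_TYPES[type_id]
    (value,) = struct.unpack_from(fmt, data, pos)
    return value, pos + struct.calcsize(fmt)

def decode_diff_fields(data: bytes, pos: int, field_types: list):
    """Decode DIFF style field data for a class that has only scalar fields.

    field_types lists the built-in type ID of each field, in CLAS order.
    Returns ({field_number: value}, new_pos).
    """
    values = {}
    while True:
        (field_no,) = struct.unpack_from("<h", data, pos)
        pos += 2
        if field_no < 0:  # a negative field number ends the structure
            return values, pos
        values[field_no], pos = read_scalar(data, pos, field_types[field_no])

# Hypothetical class with fields (float32, uint8, int32); only fields 2 and 0 are set.
payload = struct.pack("<hi", 2, 7) + struct.pack("<hf", 0, 0.5) + struct.pack("<h", -1)
values, end = decode_diff_fields(payload, 0, [-239, -247, -244])
print(values, end)
print(to_signed32(0xFFFFFF0C))
```

Field 1 is absent from the result because the differential format only stores the fields that were written; a fuller decoder would take the missing values from the base object.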
fo76utils Posted December 1, 2023 (edited)
More on the material database: it consists of two objects describing the list of materials, material objects and components, followed by all components. The first object in the CDB is of type BSMaterial::Internal::CompiledDB:
    BSMaterial::Internal::CompiledDB
    {
        String BuildVersion
        Map HashMap
        List Collisions
        List Circular
    }
BuildVersion is the version of the game (currently "1.8.86.0"), and HashMap is a map from BSResource::ID to uint64_t. It maps material paths (represented as CRC32 hashes of the base name without the .mat extension and the directory name, and the extension that is always "mat\0") to an unknown 64-bit hash. Note that while the definition of BSResource::ID has the fields in Dir, File, Ext order, File is actually the first in the data.
The second object is BSComponentDB2::DBFileIndex:
    BSComponentDB2::DBFileIndex
    {
        Map ComponentTypes
        List Objects
        List Components
        List Edges
        Bool Optimized
    }
'ComponentTypes' maps 16-bit component type IDs to string format class names. 'Objects' is a list of all material objects, in this format:
    BSComponentDB2::DBFileIndex::ObjectInfo
    {
        BSResource::ID PersistentID
        BSComponentDB2::ID DBID
        BSComponentDB2::ID Parent
        Bool HasData
    }
PersistentID is similar to the keys used in HashMap above, and contains the same information for the externally visible layered material (.mat) objects. DBID is the internal 32-bit ID of the object (it cannot be 0), while Parent is used as the base object to construct this object from. HasData is true for all except 6 "root" objects from which all others are derived, and only for those, the Parent is 0. These 6 objects are for the 6 material object types, denoted by the base names "layeredmaterials", "blenders", "layers", "materials", "texturesets" and "uvstreams".
'Components' links material components to material objects:
    BSComponentDB2::DBFileIndex::ComponentInfo
    {
        BSComponentDB2::ID ObjectID
        UInt16 Index
        UInt16 Type
    }
ObjectID is one of the DBID values from the object list. Index is a component slot for component types of which there can be more than one for a single object (e.g. a TextureSet object may have texture files associated with it at Index = 0 to 20); otherwise it is 0. Type is one of the 16-bit component type IDs previously defined in ComponentTypes.
Finally, 'Edges' describes how the material objects are organized in a tree structure:
    BSComponentDB2::DBFileIndex::EdgeInfo
    {
        BSComponentDB2::ID SourceID
        BSComponentDB2::ID TargetID
        UInt16 Index
        UInt16 Type
    }
Index seems to be always 0, and Type is always the ID of BSComponentDB2::OuterEdge. This defines TargetID as logically the parent of SourceID.
After BSMaterial::Internal::CompiledDB and BSComponentDB2::DBFileIndex, all material components are stored as OBJT and DIFF chunks; the total number and order of these is exactly the same as in 'Components' above. All components of derived objects are always stored after all components of their base object (the Parent in ObjectInfo), so they can be copy constructed using the data that has already been read.
Edited December 1, 2023 by fo76utils
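The relationships between the index tables can be sketched with a few small Python structures. This is only an illustration of how ObjectInfo, ComponentInfo and EdgeInfo refer to each other; the class and function names, the opaque PersistentID placeholder and the sample IDs are all made up, and nothing here reads an actual CDB file.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class ObjectInfo:
    persistent_id: tuple  # BSResource::ID, kept opaque in this sketch
    dbid: int             # internal ID, never 0
    parent: int           # base object, 0 only for the root objects
    has_data: bool

@dataclass
class ComponentInfo:
    object_id: int  # one of the DBID values
    index: int      # component slot
    type: int       # key into ComponentTypes

@dataclass
class EdgeInfo:
    source_id: int
    target_id: int  # logically the parent of source_id

def base_chain(objects, dbid):
    """Return the DBIDs from the given object up to its root object."""
    by_id = {obj.dbid: obj for obj in objects}
    chain = []
    while dbid != 0:
        chain.append(dbid)
        dbid = by_id[dbid].parent
    return chain

def components_by_object(components):
    """Group (index, type) pairs by the object they belong to, in file order."""
    grouped = defaultdict(list)
    for comp in components:
        grouped[comp.object_id].append((comp.index, comp.type))
    return dict(grouped)

def children_from_edges(edges):
    """Map each TargetID to the SourceIDs that list it as their logical parent."""
    children = defaultdict(list)
    for edge in edges:
        children[edge.target_id].append(edge.source_id)
    return dict(children)

# Made-up sample: a root object (1) and two objects derived from it in sequence.
objects = [ObjectInfo((), 1, 0, False), ObjectInfo((), 2, 1, True),
           ObjectInfo((), 3, 2, True)]
components = [ComponentInfo(2, 0, 5), ComponentInfo(3, 0, 5), ComponentInfo(3, 1, 5)]
edges = [EdgeInfo(3, 2)]

print(base_chain(objects, 3))
print(components_by_object(components))
print(children_from_edges(edges))
```

Because components of a base object always come before those of objects derived from it, a reader could walk `components` once in file order and start each derived object from a copy of what `base_chain` says it inherits from.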
captsensib1e Posted December 1, 2023 (edited)
From the description above, it does sound like a "sg-cdb" file, and it would make sense to use one, as it is basically a fast-read DB file. But it is quite possible Bethesda made some tweaks or modifications so that it's not straightforward to open programmatically like a standard .cdb file. The materials file is called materialsbeta.cdb after all (note the "beta" part). FWIW, I used the "sg-cdb" Java program (Java is what I'm most familiar with) to try to open materialsbeta.cdb, and it throws an IllegalArgumentException "invalid cdb format", similar to trying to open a completely different and random file like "word.exe" or "my-dog-pics.jpeg" (it works fine on a sample .cdb file I created). I didn't want to spend too much time digging around the source code of sg-cdb (as this could very well be Bethesda black magic), but this exception is thrown in a nextElement method in class CDB, which tries to parse out a data entry - not surprising that this method will choke if the .cdb is not formatted as expected (or is a completely different file type).
Edited December 1, 2023 by captsensib1e