Bundle_schema

Old: ChuanHsing 2023-01-24 13:13:51

New: ChuanHsing 2023-09-05 21:35:37

Old New Differences
33 [source](https://github.com/poe-tool-dev/ggpk.discussion/wiki/Bundle-scheme)
44
55 As of patch 3.11.2 for Harvest, in preparation for the launch of Heist, the distribution method and patching of Path of Exile changed from the previous monolithic `Content.ggpk` file to a system of multiple bundles, more akin to what had previously been deployed for console game clients.
6+
7+The path hashing scheme was updated in patch 3.21.2, and the differences are explicitly noted on this page.
68
79 On Steam there are tens of thousands of bundle files, each containing related assets, which makes it easier for Steam to do its write-a-new-file type of atomic patching because the files are smaller. In the standalone client the bundles are contained in a `Content.ggpk` with a "node count" of 3 that retains the existing `PDIR` and `FILE` structure for patching.
810
9698
9799 The list of `file_info` is ordered by `bundle_index`. For groups with equal `bundle_index` the beginning of the list is ordered by distinct values of `file_offset`. Following those there may be several files with the same `(bundle_index, file_offset, file_size)`, denoting distinct files that have the same payload.
98100
99-The hash field is the [FNV1a](https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function#FNV-1a_hash) hash of the full file path in lower case. This hash is also salted with `++` suffixed at the end of the file name, thus taking the format `<lower_file_name>++`.
101+The hash field is generated by one of two different algorithms and schemes depending on the game version:
102+
103+Up until 3.21.2 the hash is the [FNV1a](https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function#FNV-1a_hash) hash of the full file path in lower case. This hash is also salted with `++` suffixed at the end of the file name, thus taking the format `<lower_file_name>++`.
100104
101105 For example, the file `Art/UIDivinationImages.txt` will become `art/uidivinationimages.txt++` with the resulting hash `0x574cc9062dcda786`. This can be used to look up a file's corresponding section of a bundle.
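
The legacy scheme can be sketched in a few lines of Python. This is a minimal sketch, not the game's own code: the function names are hypothetical, the constants are the standard 64-bit FNV-1a offset basis and prime, and UTF-8 encoding of the path is an assumption.

```python
# Standard 64-bit FNV-1a parameters.
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """Return the 64-bit FNV-1a hash of data."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h ^= byte                        # xor first ...
        h = (h * FNV64_PRIME) & MASK64   # ... then multiply, wrapped to 64 bits
    return h


def legacy_file_hash(path: str) -> int:
    """Hash a file path the pre-3.21.2 way: lower case plus a '++' suffix.

    Assumption: the path is encoded as UTF-8 before hashing.
    """
    return fnv1a_64((path.lower() + "++").encode("utf-8"))


if __name__ == "__main__":
    # Compare the printed value with the hash quoted in the text above.
    print(hex(legacy_file_hash("Art/UIDivinationImages.txt")))
```

Running the script prints the hash for the example path, which can be compared against the value quoted above.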
102106
107+Since 3.21.2 the scheme is instead [MurmurHash64A](https://en.wikipedia.org/wiki/MurmurHash#MurmurHash2) ([C](https://github.com/hhrhhr/MurmurHash-for-Lua/blob/master/MurmurHash64A.c), [Python](https://github.com/Project-Path-of-Exile-Wiki/PyPoE/blob/eb909d8f0826f93b9e58601dc4eb7a050c9fc10b/PyPoE/shared/murmur2.py#L104)) with a seed of `0x1337b33f`, applied to the full file path in lower case but without the `++` suffix of the previous scheme.
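
For illustration, here is a pure-Python sketch of MurmurHash64A applied to a path as described above. The function names are hypothetical; the sketch assumes little-endian block reads (as the C reference behaves on x86) and UTF-8 encoding of the path, neither of which is stated by this page.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
MURMUR_M = 0xC6A4A7935BD1E995  # multiplier used by MurmurHash64A
MURMUR_R = 47                  # shift used by MurmurHash64A
PATH_SEED = 0x1337B33F         # seed quoted in the text above


def murmurhash64a(data: bytes, seed: int) -> int:
    """Return the 64-bit MurmurHash64A (MurmurHash2 variant) of data."""
    length = len(data)
    h = (seed ^ (length * MURMUR_M)) & MASK64

    # Mix all complete 8-byte blocks, read as little-endian integers.
    full = length - (length % 8)
    for i in range(0, full, 8):
        k = int.from_bytes(data[i:i + 8], "little")
        k = (k * MURMUR_M) & MASK64
        k ^= k >> MURMUR_R
        k = (k * MURMUR_M) & MASK64
        h ^= k
        h = (h * MURMUR_M) & MASK64

    # Mix the remaining 1..7 tail bytes, if any.
    tail = data[full:]
    if tail:
        h ^= int.from_bytes(tail, "little")
        h = (h * MURMUR_M) & MASK64

    # Final avalanche.
    h ^= h >> MURMUR_R
    h = (h * MURMUR_M) & MASK64
    h ^= h >> MURMUR_R
    return h


def file_hash(path: str) -> int:
    """Hash a file path the 3.21.2+ way: lower case, no '++' suffix.

    Assumption: the path is encoded as UTF-8 before hashing.
    """
    return murmurhash64a(path.lower().encode("utf-8"), PATH_SEED)


if __name__ == "__main__":
    print(hex(file_hash("Art/UIDivinationImages.txt")))
```

Python integers are unbounded, so every multiplication is masked back to 64 bits to mimic the wrapping arithmetic of the C implementation.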
108+
103109 The bundle at the end of the binary index file contains information on how to generate a set of paths from base paths and append operations. The entries in `path_rep` indicate the specification extents for this payload. Each element slices out a part of the payload at offset `payload_offset` of size `payload_size`.
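
As a rough sketch of the slicing step, the snippet below cuts one directory's specification out of the payload. `PathRep` and `slice_payload` are hypothetical names that model only the two fields used here, and `payload` is assumed to be the already decompressed contents of the trailing bundle.

```python
from dataclasses import dataclass


@dataclass
class PathRep:
    """Hypothetical, reduced model of one path_rep entry."""
    payload_offset: int
    payload_size: int


def slice_payload(payload: bytes, rep: PathRep) -> bytes:
    """Return the part of payload that describes this entry's own files."""
    start = rep.payload_offset
    end = start + rep.payload_size
    if start < 0 or rep.payload_size < 0 or end > len(payload):
        raise ValueError("path_rep extent lies outside the payload")
    return payload[start:end]


if __name__ == "__main__":
    demo_payload = bytes(range(16))  # toy data, not a real index payload
    print(slice_payload(demo_payload, PathRep(payload_offset=4, payload_size=8)).hex())
```

Checking the bounds before slicing matters because Python silently truncates out-of-range slices, which would hide a corrupt or misparsed index.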
104110
105-The hash field of `path_rep` is also a [FNV1a](https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function#FNV-1a_hash) hash. The path name is keep in upper & lowercase unlike the `file_info` hash but has the trailing `/` stripped off the end and suffixed with `++`. For example `Art/2DArt/SkillIcons/passives/Assassin/4K/` will become `Art/2DArt/SkillIcons/passives/Assassin/4K++` and have a hash of `0xe8deca74810f821f`.
111+The hash field of `path_rep` is generated with the same algorithm as the one for file paths, over the directory path without its trailing `/`. In the legacy scheme (introduced in 3.11.2) the path name keeps its original mixed case and has `++` appended to the end.
112+
113+From 3.21.2 onward the directory path is unconditionally lowercased and has no `++` suffix. For example, under the legacy scheme `Art/2DArt/SkillIcons/passives/Assassin/4K/` becomes `Art/2DArt/SkillIcons/passives/Assassin/4K++` with a hash of `0xe8deca74810f821f`, whereas from 3.21.2 onward the string that is hashed is `art/2dart/skillicons/passives/assassin/4k`.
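
The two directory-path normalizations can be compared side by side. The helper names below are hypothetical and UTF-8 encoding is an assumption; each helper only builds the byte string that would then be fed to the hash function of the matching scheme (FNV1a for the legacy scheme, MurmurHash64A from 3.21.2 onward).

```python
def legacy_directory_key(path: str) -> bytes:
    """Pre-3.21.2: strip the trailing '/', keep the case, append '++'."""
    return (path.rstrip("/") + "++").encode("utf-8")


def current_directory_key(path: str) -> bytes:
    """3.21.2 onward: strip the trailing '/', lowercase, no suffix."""
    return path.rstrip("/").lower().encode("utf-8")


if __name__ == "__main__":
    directory = "Art/2DArt/SkillIcons/passives/Assassin/4K/"
    print(legacy_directory_key(directory))
    print(current_directory_key(directory))
```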
106114
107115 When using `payload_size` to generate paths for all slices, the number of paths is equal to the number of files.
108116
109117 The `payload_recursive_size` field in `path_rep_t` is similar to `payload_size` but also generates file paths from all subdirectories of the directory entry. Some directory entries generate no paths of their own when using `payload_size` as they have no files of their own, just subdirectories.
110118
111119 The name of a directory is not explicitly expressed, but for non-empty directories it can be obtained by finding the last slash in any generated path in that directory. All generated file paths in a directory entry share the same parent directory; path generation may not introduce additional subdirectories.
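
A small sketch of recovering the directory name from the paths generated for one entry follows. `directory_of` is a hypothetical helper; it also checks the shared-parent property described above, treating a violation as an error.

```python
def directory_of(paths: list[str]) -> str:
    """Return the parent directory shared by all generated paths.

    Raises ValueError for an empty list (the name cannot be recovered)
    or when the paths do not all share one parent directory.
    """
    if not paths:
        raise ValueError("cannot derive a directory name from no paths")
    # Everything before the last slash is the parent; no slash means the root.
    parents = {p.rsplit("/", 1)[0] if "/" in p else "" for p in paths}
    if len(parents) != 1:
        raise ValueError(f"paths span several directories: {sorted(parents)}")
    return parents.pop()


if __name__ == "__main__":
    # Made-up file names, used only to demonstrate the helper.
    print(directory_of(["art/2dart/a.dds", "art/2dart/b.dds"]))
```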
120+
121+Note that since 3.21.2 all paths generated are forced lower-case, losing previously known path information and requiring applications to adapt to this change in path lookups.
112122
113123 ## Path specification encoding
114124 A path specification section has two kinds of elements: unsigned 32-bit integers and null-terminated narrow strings.