Meet the Next-Generation UUID for Keys in High-Load Systems

On March 31, 2022, the text of the working draft New UUID Formats (copy 1, copy 2), hereinafter referred to as the standard, was officially posted on the IETF website. Formally it updates RFC 4122, but in practice it replaces that long-outdated and originally flawed document.
The standard introduces new formats for universally unique identifiers (UUIDs) with the following properties:
intended for use in high-load applications and databases, both monolithic and distributed,
increasing with generation time (leap seconds are not counted),
containing a timestamp, a counter whose segments are initialized to zero and to a pseudo-random value, and a separate pseudo-random value,
able to be combined with metadata.
The standard recommends that DBMS vendors make sure UUIDs are created and stored in the new formats for use as identifiers or as the leftmost part of identifiers, such as, but not limited to:
A long and heated debate produced a standard of essentially impeccable quality. Although some vague wording and earlier unsuccessful technical decisions remain in the text, original and elegant solutions were found for all the problems. The creative contribution of a resident of Japan who goes by the pseudonym LiosK deserves special mention, as do the reasonable decisions of the initiators of the standard, Brad Peabody and Kyzer Davis from the USA. The standard provides the maximum possible speed when searching for records by the UUID value they contain. It contains many sound recommendations, each with a justification. Its only significant flaw is the wasteful use of 6 of the 128 UUID bits (the ver and var segments) solely for compatibility with the obsolete RFC 4122. The standard is superior to ULID, KSUID, CUID and other analogues, all of which were examined and are listed in the standard.
The standard has not yet been approved, but DBMS vendors can already start implementing it. It is hard to imagine that another, better and significantly different version of the standard will appear. The C prototypes attached to the standard are highly simplified and therefore cannot serve as a good basis for development. Of the three proposed formats, UUIDv7 has the greatest practical value.
Due to time constraints, the current version of the standard does not include alternative UUID text encodings. The initiators of the standard want to include them in the next version, and they have already approved the Crockford’s Base32 encoding.
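To illustrate what such an alternative text encoding could look like, here is a minimal sketch of Crockford's Base32 applied to a UUID. It assumes the 128 bits are treated as one big-endian integer and written as 26 characters with two leading zero bits of padding (the convention ULID uses); the standard itself does not define this mapping, and the function names `encode_crockford` and `decode_crockford` are made up for the example.

```python
import uuid

# Crockford's Base32 alphabet: digits and uppercase letters without I, L, O and U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def encode_crockford(u: uuid.UUID) -> str:
    """Encode a UUID as 26 Crockford Base32 characters (assumed layout)."""
    n = u.int
    chars = []
    for _ in range(26):            # 26 * 5 = 130 bits; the top 2 bits are zero padding
        chars.append(CROCKFORD[n & 0x1F])
        n >>= 5
    return "".join(reversed(chars))

def decode_crockford(s: str) -> uuid.UUID:
    """Decode a string produced by encode_crockford back into a UUID."""
    n = 0
    for ch in s.upper():
        n = (n << 5) | CROCKFORD.index(ch)   # raises ValueError on a foreign character
    return uuid.UUID(int=n)

u = uuid.UUID("017f22e2-79b0-7cc3-98c4-dc0c0c07398f")   # sample value
text = encode_crockford(u)
assert decode_crockford(text) == u
```

Because the alphabet is in ascending ASCII order and the output has a fixed length, the encoded strings sort in the same order as the binary UUIDs, so time ordering is preserved in text form. This sketch skips the decoding aliases that Crockford's scheme allows (for example, reading `O` as `0`).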
While the standard gives the developers of UUID generators a great deal of freedom within the outlined limits, the reference UUID structure that was discussed during the development of the standard is as follows:
Designation in the standard | Segment position in UUID, left to right | Length, bits | Binary value or calculation algorithm | Purpose |
unix_ts_ms | 1 | 48 | Number of milliseconds since midnight (00:00:00) on January 1, 1970, Coordinated Universal Time (UTC), not counting leap seconds | Ensures monotonicity of the UUIDs being written. A timestamp with millisecond precision that lags behind UTC by tens of seconds. A millisecond is the finest precision at which UUIDs coming from different sources can realistically be ordered by generation time |
ver | 2 | 4 | “0111” | UUIDv7 version. This segment exists only for compatibility with RFC 4122 |
rand_a | 3 | 1 | Counter segment initialized to zero every millisecond | Protects the counter against overflow in the unlikely case that it is initialized with a large pseudo-random value |
rand_a | 4 | 11 | Counter segment initialized to a pseudo-random value every millisecond | The counter ensures monotonicity of UUIDs generated by a single source within one millisecond. Initializing the counter with a pseudo-random value reduces the chance of UUID collisions |
var | 5 | 2 | “10” | The variant specified in the standard and in RFC 4122, as opposed to the other variants mentioned in RFC 4122. This segment exists only for compatibility with RFC 4122 |
rand_b | 6 | 12 | Counter segment initialized to a pseudo-random value every millisecond | The counter should be long enough to protect against overflow, but not too long, so that looking up the desired UUID by its high-order bits stays fast. Within a millisecond, the entire counter is incremented by one for each subsequent UUID |
rand_b | 7 | 50 | Pseudo-random value generated separately for each UUID | Unlike the counter segment, which is initialized with a pseudo-random value every millisecond, this segment makes it difficult to guess neighboring UUIDs that share a timestamp |
(none) | To the right of the UUID, within an identifier used as a unique or surrogate key | Any | Custom segment, possibly composite | See the list of possible custom segment elements below the table |
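The layout in the table can be sketched as a small generator. This is only an illustration under stated assumptions, not the standard's prototype: the class name `UUIDv7Generator` and the `clock_ms` hook are hypothetical, the use of the `secrets` module for random bits is a choice made here, and the handling of counter overflow and of a clock that steps backwards is one possible policy among several.

```python
import secrets
import time
import uuid

class UUIDv7Generator:
    """48-bit ms timestamp, 24-bit counter (rand_a + 12 bits of rand_b), 50 random bits."""

    def __init__(self, clock_ms=None):
        # clock_ms is a hypothetical hook that lets a test replace the clock.
        self._clock_ms = clock_ms or (lambda: time.time_ns() // 1_000_000)
        self._last_ms = -1
        self._counter = 0

    def generate(self) -> uuid.UUID:
        now = self._clock_ms()
        if now > self._last_ms:
            self._last_ms = now
            # New millisecond: the top counter bit is 0, the lower 23 bits are random.
            self._counter = secrets.randbits(23)
        else:
            # Same millisecond (or the clock stepped back): increment the whole counter.
            self._counter += 1
            if self._counter >= 1 << 24:
                # Assumed overflow policy: borrow the next millisecond.
                self._last_ms += 1
                self._counter = secrets.randbits(23)
        ts = self._last_ms & ((1 << 48) - 1)
        counter_hi = self._counter >> 12      # 1 + 11 bits, stored in rand_a
        counter_lo = self._counter & 0xFFF    # 12 bits, stored at the start of rand_b
        value = (
            (ts << 80)                        # unix_ts_ms
            | (0b0111 << 76)                  # ver
            | (counter_hi << 64)
            | (0b10 << 62)                    # var
            | (counter_lo << 50)
            | secrets.randbits(50)            # fresh for every UUID
        )
        return uuid.UUID(int=value)

gen = UUIDv7Generator()
first, second = gen.generate(), gen.generate()
assert first < second                         # later UUIDs compare greater
```

Keeping the previous timestamp when the clock does not advance means that identifiers from one generator instance never decrease, at the cost of the timestamp running slightly ahead of the wall clock during bursts.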
Possible custom segment elements:
additional pseudo-random value
entity type or database table code
namespace
shard (segment) or partition (section)
data source code
operation type or message type code
checksum
other application-specific elements
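As an illustration of a composite custom segment placed to the right of the UUID, the sketch below appends an entity type code, a shard number and a checksum. The field widths (2 + 2 + 4 bytes), the choice of CRC-32 and the function names `build_key` and `parse_key` are all assumptions made for this example; the standard leaves the custom segment entirely to the application.

```python
import struct
import uuid
import zlib

def build_key(u: uuid.UUID, entity_type: int, shard: int) -> bytes:
    """Return the 16 UUID bytes followed by a hypothetical 8-byte custom segment."""
    body = u.bytes + struct.pack(">HH", entity_type, shard)
    checksum = zlib.crc32(body)               # CRC-32 over the UUID and the metadata
    return body + struct.pack(">I", checksum)

def parse_key(key: bytes):
    """Split a key made by build_key and verify its checksum."""
    body = key[:20]
    (checksum,) = struct.unpack(">I", key[20:24])
    if zlib.crc32(body) != checksum:
        raise ValueError("checksum mismatch")
    entity_type, shard = struct.unpack(">HH", body[16:20])
    return uuid.UUID(bytes=body[:16]), entity_type, shard

u = uuid.UUID("017f22e2-79b0-7cc3-98c4-dc0c0c07398f")   # sample value
key = build_key(u, entity_type=3, shard=12)
assert parse_key(key) == (u, 3, 12)
```

Since the UUID occupies the leftmost bytes, keys built this way still sort by UUID first, so the time ordering of the identifier carries over to the whole key.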