I looked into this when trying to make a piece of toy software to serialize and de-serialize *all* OpenSSH-supported public key formats.

While OpenSSH uses standardized formats for private keys, the `ssh-* AAAAB3NzaC1…` format you're used to pasting into remote servers is actually a "proprietary" (though freely licensed, in both spec and implementation) encoding: it's not JSON, and not any standardized BER/DER codec; instead, it's *mostly* a Length-Value encoding (think TLV without the T) with fixed-length length fields. (Technically, the underlying encoding is basically ad hoc, but in practice the only two of its datatypes anyone ever uses are `string`, which holds octet strings of runtime-variable length up to $2^{32}-1$ bytes long; and `mpint`, which holds two's-complement integers in the range $[-2^{(2^{32}-1)\times 8 - 1}, 2^{(2^{32}-1)\times 8 - 1})$; both of which are, obviously, encoded as a `uint32` length followed by the raw value.)
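
For concreteness, here's a minimal sketch of what a reader for those two datatypes might look like. It's written from the description above rather than from any official API; the class and method names (`_OsshReader`, `readString`, `readMpint`, `readBytes`) simply match what my fragment further below assumes, and error handling is elided:

```
// Minimal reader for the two workhorse datatypes; `buf` is a Uint8Array
// holding the base64-decoded key blob.
class _OsshReader {
  constructor(buf) {
    this.buf = buf;
    this.off = 0;
  }
  readUint32() {
    // The fixed-length (4-octet, big-endian) length field.
    const dv = new DataView(this.buf.buffer, this.buf.byteOffset + this.off, 4);
    this.off += 4;
    return dv.getUint32(0);
  }
  readBytes() {
    // `string`: a uint32 length followed by that many raw octets.
    const len = this.readUint32();
    const v = this.buf.subarray(this.off, this.off + len);
    this.off += len;
    return v;
  }
  readString() {
    // Same wire type, interpreted as text (key-type names are ASCII).
    return new TextDecoder().decode(this.readBytes());
  }
  readMpint() {
    // `mpint`: a uint32 length followed by a big-endian two's-complement bignum.
    const bytes = this.readBytes();
    let v = 0n;
    for (const b of bytes) v = (v << 8n) | BigInt(b);
    // A set sign bit means the value is negative: subtract 2^(8 * length).
    if (bytes.length > 0 && (bytes[0] & 0x80)) v -= 1n << BigInt(8 * bytes.length);
    return v;
  }
}
```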

You can see from the following code fragment (which I wrote, and which is correct as regards OpenSSH-generated `~/.ssh/id_*.pub` files) that, while OpenSSH actually encodes RSA's (and DSS's) key values itself, it just stashes elliptic-curve keys of all species as foreign "data blobs" that *have* to be parsed out by the respective functions `unpack_bernstein_compressed_point` and `unpack_sec1ec_point`:

```
function _ossh2obj(buf) {
  let reader = new _OsshReader(buf);
  let type = reader.readString();
  switch (type) {
    case "ssh-rsa": {
      let e = reader.readMpint();
      let n = reader.readMpint();
      return {type: 'rsa', value: {n, e}};
    }
    case "ssh-dss": {
      let p = reader.readMpint();
      let q = reader.readMpint();
      let g = reader.readMpint();
      let y = reader.readMpint();
      return {type: 'dss', value: {p, q, g, y}};
    }
    case "ssh-ed25519":
    case "ssh-ed448": {
      // 1. Curve
      let identifier = type.match(/^ssh-(ed\S+)$/)[1];
      let params = getEdDSAParams(identifier);
      // 2. Point
      let A = reader.readBytes();
      if (A.length !== get_bernstein_compressed_length(params))
        throw new Error(`Invalid key (wrong ${type} length).`);
      let P = unpack_bernstein_compressed_point(A, params);
      // let {x, y} = P;
      return {type: 'eddsa', value: {identifier, point: P}};
    }
    case "ecdsa-sha2-nistp256":
    case "ecdsa-sha2-nistp384":
    case "ecdsa-sha2-nistp521": {
      // https://www.rfc-editor.org/rfc/rfc5656#section-3.1
      // 1. Curve
      let [hash_name, expected_identifier] = type.match(/^ecdsa-(\w+)-(\w+)$/).slice(1);
      let identifier = reader.readString();
      if (identifier !== expected_identifier)
        throw new Error("Invalid key (mismatched type field and SEC 1 identifier).");
      let params = getECDSAParams(identifier);
      // 2. Point
      let Q = reader.readBytes();
      let P = unpack_sec1ec_point(Q, params);
      // let {x, y} = P;
      return {type: 'ecdsa', value: {identifier, h: hash_name, point: P}};
    }
    default:
      throw new Error(`Unsupported OpenSSH key type: ${type}`);
  }
}
```
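
Hypothetically, hooking the fragment above up to a real `id_*.pub` line looks something like this (the base64 line below is truncated and made-up, and the line-splitting logic is my own illustration, not OpenSSH's):

```
// One line of an OpenSSH public-key file: "<type> <base64-blob> [comment]".
let line = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAI... user@host"; // placeholder, not a real key
let [declaredType, b64] = line.trim().split(/\s+/);
// atob() is available in browsers and in Node >= 16.
let blob = Uint8Array.from(atob(b64), c => c.charCodeAt(0));
let key = _ossh2obj(blob);
// The type name appears twice: once in plaintext, and again as the blob's
// first `string`. A strict parser should check that both match `declaredType`.
console.log(key.type, key.value);
```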

These formats (SEC 1, Bernstein) were both "borrowed" wholesale from the non-OpenSSH elliptic-curve software ecosystem.

What's interesting to me (and why I wrote this post) is how *nearly* identical these two formats are. Actually, instead of saying how they're similar, I'll just enumerate their differences *comprehensively*:

- Bernstein encoding lacks the header byte, and supports *only* compressed format. Bernstein-encoded elliptic-curve points are always exactly $\lceil(\lceil\log_2 p\rceil + 1)/8\rceil$ octets long; in contrast to SEC 1, Bernstein encoding *defines* octet-strings of length $0$ and $2\lceil(\log_2 p - 1)/8\rceil$ to be invalid.
- For point compression, Bernstein encoding truncates the *first* co-ordinate, $x$, rather than SEC 1's truncation of $y$.
- Technically, the Bernstein protocols are defined over Montgomery curves (curves of the form $By^2 = x^3 + Ax^2 + x$) and twisted Edwards curves (of the form $ax^2 + y^2 = 1 + dx^2y^2$, which is where Ed25519/Ed448 points actually live), while the general EC protocols are defined over short Weierstrass curves (of the form $y^2 = x^3 + ax + b$), so this is a bit of an apples-to-oranges comparison. But it still bears noting if you're trying to write actual serializers/deserializers for these codecs.
- Bernstein encoding stores both possible values of the retained sign bit trivially, as $0$ and $1$, while SEC 1 maps the compressed point's sign bit through a transformation, $0 \mapsto 2;\ 1 \mapsto 3$, to form the header byte.
- Bernstein encoding concatenates $Y \mathbin\Vert X$ **bitwise**; this allows saving an octet for some curves. This contrasts with SEC 1, which serializes the co-ordinates independently into whole "padded" octet strings before concatenating them.
- Bernstein encoding serializes the non-truncated co-ordinate into one **fewer** bit than would be required to encode $p-1$, making a leap of faith on the extremely conservative assumption that the truncated co-ordinate will be $\ge 2$.
- *Technically*, the legal lengths for SEC 1-encoded points are $\{1,\ 1 + \lceil\lceil\log_2 p\rceil/8\rceil,\ 1 + 2\lceil\lceil\log_2 p\rceil/8\rceil\}$ (including the header byte, not pinching a bit off non-compressed co-ordinates, and not pinching an octet in bitwise concatenations). The formulas named earlier are what *would* be the lengths if you generalized Bernstein encoding to support uncompressed points and the point at infinity (see the sketch after the editor's note below for these lengths in code).

[NOTE FROM EDITOR: The above formulas are all based on prime $p\neq 2$. While no Bernstein protocol is *yet* based on a characteristic-$2$ field, for SEC 1, some of these formulas are actually different for characteristic-$2$.]
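
To make those length formulas concrete, here's a tiny sketch (my own illustration; the function name is made up) that evaluates them for two familiar field sizes, and shows where the bitwise packing actually saves an octet. For the generalized "defined-invalid" lengths, $\lceil\log_2 p\rceil - 1$ stands in for $\log_2 p - 1$; the ceilings come out equal because $p$ isn't a power of two:

```
// Evaluate the length formulas for a field prime with the given bit length
// ceil(log2 p). Illustration only; mirrors the formulas in the bullets above.
function pointEncodingLengths(bits) {
  const fieldOctets = Math.ceil(bits / 8);
  return {
    // Bernstein: the single legal length (sign bit packed in bitwise)...
    bernstein_compressed: Math.ceil((bits + 1) / 8),
    // ...and the generalized, defined-invalid lengths from the first bullet:
    bernstein_infinity: 0,
    bernstein_uncompressed: 2 * Math.ceil((bits - 1) / 8),
    // SEC 1's three legal lengths:
    sec1_infinity: 1,                       // the lone 0x00 octet
    sec1_compressed: 1 + fieldOctets,       // header 0x02/0x03, then padded x
    sec1_uncompressed: 1 + 2 * fieldOctets, // header 0x04, then padded x and y
  };
}

console.log(pointEncodingLengths(255)); // Ed25519: p = 2^255 - 19
// => { bernstein_compressed: 32, ..., sec1_compressed: 33, sec1_uncompressed: 65 }
console.log(pointEncodingLengths(521)); // NIST P-521: a 521-bit prime
// => { bernstein_compressed: 66, ..., sec1_compressed: 67, sec1_uncompressed: 133 }
```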

The first difference is the most significant; it was a design choice made precisely to create the following "knock-on" effects:

- $\mathcal{O}$ (the point-at-infinity) has no representation; it simply cannot be encoded, and therefore cannot be produced during decoding!
- Since compressed format is forced, you have to solve the curve equation for the missing co-ordinate during deserialization; this means that no element of $\mathbb{Z}_p^2 \setminus E$ (a "point-*off*-the-curve") can be produced during decoding either! (See the sketch after this list.)
- The only supported format has a fixed length, which allows simplifying codepaths and ossifying netcode.
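
To make the second knock-on effect concrete, here's a sketch of the work `unpack_bernstein_compressed_point` ends up doing, specialized to Ed25519 and following RFC 8032's decoding recipe. Apart from that function's role, every name here is my own, and a real implementation needs constant-time field arithmetic that this BigInt toy doesn't attempt:

```
const P = 2n ** 255n - 19n;

function modpow(b, e, m) {
  let r = 1n;
  for (b %= m; e > 0n; e >>= 1n, b = (b * b) % m)
    if (e & 1n) r = (r * b) % m;
  return r;
}
const inv = (a) => modpow(a, P - 2n, P);    // Fermat inversion; P is prime
const D = (P - 121665n) * inv(121666n) % P; // Edwards d = -121665/121666

function unpack_ed25519_point(A /* Uint8Array(32), little-endian */) {
  // Bitwise Y || X: the low 255 bits are y; the top bit is the sign of x.
  const sign = A[31] >> 7;
  let y = 0n;
  for (let i = 31; i >= 0; i--) y = (y << 8n) | BigInt(i === 31 ? A[i] & 0x7f : A[i]);
  if (y >= P) throw new Error("Invalid key (y out of range).");

  // Solve -x^2 + y^2 = 1 + d*x^2*y^2 for x: x^2 = (y^2 - 1)/(d*y^2 + 1).
  const u = (y * y - 1n + P) % P, v = (D * y * y + 1n) % P;
  let x = (u * modpow(v, 3n, P)) % P;
  x = (x * modpow((u * modpow(v, 7n, P)) % P, (P - 5n) / 8n, P)) % P;
  const vxx = (v * x * x) % P;
  if (vxx === (P - u) % P) x = (x * modpow(2n, (P - 1n) / 4n, P)) % P; // times sqrt(-1)
  else if (vxx !== u) throw new Error("Invalid key (point off the curve).");

  if (x === 0n && sign === 1) throw new Error("Invalid key (bad sign bit).");
  if (Number(x & 1n) !== sign) x = P - x;
  return { x, y }; // by construction, always a point *on* the curve
}
```

Note how the two "impossible" outputs fall out of the structure: there's no input that decodes to $\mathcal{O}$, and any $y$ whose candidate $x$ fails the curve-equation check is rejected rather than passed along.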

These are security features: doing arithmetic with the point at infinity, or with points-off-the-curve, can leak your private key or otherwise damage the security of the handshake if the software you're using wasn't specifically designed to handle those inputs (see CVE-2022-21449 and CVE-2017-16007 for representative examples), and it's very easy for library developers to just *assume* that the public key they've been handed is a fully legitimate one. By categorically preventing the deserializer from emitting such values, a Bernstein protocol totally saves the library developer from being required to **even think about** these cases, and totally saves the application (and, therefore, the user) from library developers who neglected to think about them.