Many crypto protocols contain variable length fields: the names of the participants, different sizes of public key, and so on.
In my previous post, I mentioned how Liqun Chen has (re)discovered the attack that many protocols are broken if you don’t include the field lengths in MAC or signature computations (and, more to the point, a bunch of ISO standards fail to warn the implementor about this issue).
The problem applies to confidentiality, as well as integrity.
Many protocol verification tools (ProVerif, for example) will assume that the attacker is unable to distinguish enc(m1, k, iv) from enc(m2, k, iv) if they don’t know k.
If m1 and m2 are of different lengths, this may not be true: the length of the ciphertext leaks information about the length of the plaintext. With Cipher Block Chaining, you can tell the length of the plaintext to the nearest block, and with stream ciphers you can tell the exact length. So you can have protocols that are “proved” correct but are still broken, because the idealized protocol doesn’t properly represent what the implementation is really doing.
If you want different plaintexts to be observationally equivalent to the attacker, you can pad the variable-length fields to a fixed length before encrypting. But if there is a great deal of variation in length, this may be a very wasteful thing to do.
The alternative approach is to change your idealization of the protocol to reflect the reality of your encryption primitive. If your implementation sends m encrypted under a stream cipher, you can idealize it as sending an encrypted version of m together with L_m (the length of m) in the clear.