PHP RFC: Improve hash_hkdf() parameter order and handling
- Version: 0.9
- Create Date: 2017-02-05
- Author: Yasuo Ohgaki yohgaki@ohgaki.net
- Status: Draft (or Under Discussion or Accepted or Declined)
- First Published at: http://wiki.php.net/rfc/improve_hash_hkdf_parameter
Introduction
HKDF is an informational Internet standard defined by RFC 5869. HKDF is designed to generate secure keys for encryption, validation, authentication, etc. from other key information such as output from password_hash(), an API key, etc. The RFC states “designers of applications are therefore encouraged to provide salt values to HKDF if such values can be obtained by the application.”, but the current PHP implementation discourages use of the “salt” parameter by its signature.
This PHP RFC does not explain HKDF in detail; please refer to RFC 5869 or other references for details.
Terms
- IKM - Input Key Material which is secret.
- salt - An entropy value that may be either secret or non-secret. Ideally a random value of the size of the hash used. Poor entropy such as a timestamp is acceptable when IKM is strong. An example of poor IKM is a user-defined plain password.
- info - Context and application specific information such as a protocol number, algorithm identifiers, user identities, etc. These are non secret values by definition.
- length - Controls HKDF result length.
- strong key - Cryptographically secure (true) random bytes. An example 256-bit strong key is random_bytes(32).
- weak key - Anything that is not cryptographically secure (true) random bytes. An example 256-bit weak key is hash('sha256', 'myrandompassword', FALSE). Such a key is extremely weak.
Typical HKDF usage with PHP would be:
- Generate new encryption keys for user, session, domain, groups, etc, from “IKM”, “salt” and “info”.
- Generate security tokens for CSRF protection, Object Access, etc, from “IKM”, “salt” and “info”.
hash_hkdf() has already been added to master without a PHP RFC. It currently has the following signature.
string hash_hkdf(string algo, string ikm [, int length = 0 [, string info = '' [, string salt = '']]])
“salt” is the last optional parameter, so users are likely to omit it blindly.
In most use cases, IKM is a strong key. However, in the real world, IKM could be a poor user-defined plain text password.
RFC 5869 “Notes for HKDF Users” states,
3.1. To Salt or not to Salt
HKDF is defined to operate with and without random salt. This is done to accommodate applications where a salt value is not available. We stress, however, that the use of salt adds significantly to the strength of HKDF, ensuring independence between different uses of the hash function, supporting “source-independent” extraction, and strengthening the analytical results that back the HKDF design.
The primary purpose of “salt” is to generate a stronger key from IKM by adding the entropy of the “salt”. “Salt” is also often used as a pre-shared key, i.e. the salt is combined into the final key. In some cases, “salt” could be a password for users.
3.2. The 'info' Input to HKDF
While the 'info' value is optional in the definition of HKDF, it is often of great importance in applications. Its main objective is to bind the derived key material to application- and context-specific information.
The primary purpose of “info” is to distinguish key context so that a generated key is usable only in a specific context. Users should not use a secret value for “info”.
Summary of salt and info parameter
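- salt - Adds entropy to strengthen the derived key. May be secret or non-secret.
- info - Binds the derived key to a specific context. Non-secret by definition.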
hash_hkdf() behavior
MD5 is used here to obtain shorter results from hash_hkdf(). Note that the first two results share the same leading 32 hex digits.
[yohgaki@dev PHP-master]$ ./php-bin -r 'var_dump(bin2hex(hash_hkdf("md5","123456")));'
string(32) "1a4f9cd30ab214082d93ba850f1fa2b0"
[yohgaki@dev PHP-master]$ ./php-bin -r 'var_dump(bin2hex(hash_hkdf("md5","123456", 20)));'
string(40) "1a4f9cd30ab214082d93ba850f1fa2b054cfcd49"
[yohgaki@dev PHP-master]$ ./php-bin -r 'var_dump(bin2hex(hash_hkdf("md5","123456", 20, "1")));'
string(40) "d0d1bbee08810d08a1e54f3a401308353cedd30b"
[yohgaki@dev PHP-master]$ ./php-bin -r 'var_dump(bin2hex(hash_hkdf("md5","123456", 20, "1", "1")));'
string(40) "ca16de591ad40f02e599428bf9f50772ebead3ff"
The result is affected by both the “salt” and “info” parameters. Although hash_hkdf() performs complex hash calculations to derive a secure key from IKM, salt and info, from the user's point of view it looks like a simple hash calculation over separate parameters. The “length” parameter works in a way that can weaken the derived key: it only truncates or extends the output, so a shorter result is a prefix of a longer one.
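The behavior described above can be checked with a short script. This is only a sketch against the current signature; the variable names and the "context-a"/"context-b" strings are made up for illustration.

```php
<?php
$ikm = random_bytes(32); // example input key material (assumption)

// Same IKM, different lengths: the shorter output is a prefix of the longer one.
$short = hash_hkdf('sha256', $ikm, 16);
$long  = hash_hkdf('sha256', $ikm, 32);
var_dump(hash_equals($short, substr($long, 0, 16))); // bool(true)

// Same IKM, different "info": the derived keys are unrelated.
$a = hash_hkdf('sha256', $ikm, 32, 'context-a');
$b = hash_hkdf('sha256', $ikm, 32, 'context-b');
var_dump(hash_equals($a, $b)); // bool(false)
```

In other words, “length” selects how much of the same output stream is returned, while “info” and “salt” change the stream itself.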
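hash_hkdf() uses HMAC to compute its output. The following calls are equivalent in the sense that both key an HMAC with the salt and feed it the IKM.
// The returned values differ because hash_hkdf() runs an additional expand step and returns raw bytes, but salt and IKM play the same roles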
$key = hash_hmac('sha256', $ikm, $salt);
$key = hash_hkdf('sha256', $ikm, 0, '', $salt);
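To make the relationship between hash_hmac() and hash_hkdf() concrete, the following is a minimal userland sketch of the extract-and-expand construction from RFC 5869, using the current parameter order. The function name hkdf_sketch() is hypothetical and the code is for illustration only; real applications should call hash_hkdf() itself.

```php
<?php
// Illustrative re-implementation of HKDF (RFC 5869) on top of hash_hmac().
function hkdf_sketch(string $algo, string $ikm, int $length = 0, string $info = '', string $salt = ''): string
{
    $hashLen = strlen(hash($algo, '', true));
    if ($length === 0) {
        $length = $hashLen; // default: one hash-sized block
    }
    if ($length < 0 || $length > 255 * $hashLen) {
        throw new InvalidArgumentException('Invalid output length');
    }
    if ($salt === '') {
        $salt = str_repeat("\0", $hashLen); // RFC 5869 default salt
    }

    // Extract: PRK = HMAC(key = salt, data = IKM)
    $prk = hash_hmac($algo, $ikm, $salt, true);

    // Expand: T(i) = HMAC(key = PRK, data = T(i-1) | info | i)
    $okm = '';
    $block = '';
    for ($i = 1; strlen($okm) < $length; $i++) {
        $block = hash_hmac($algo, $block . $info . chr($i), $prk, true);
        $okm .= $block;
    }
    return substr($okm, 0, $length);
}

$ikm  = random_bytes(32);
$salt = random_bytes(32);
var_dump(hkdf_sketch('sha256', $ikm, 42, 'demo', $salt) === hash_hkdf('sha256', $ikm, 42, 'demo', $salt)); // bool(true)
```

The first hash_hmac() call is the extract step shown above; everything after it is the expand step that mixes in “info” and produces the requested length.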
hash_hkdf() applications with PHP
A typical PHP HKDF application can use “salt”, and in many cases it provides better security by doing so. This PHP RFC describes only 3 examples here, but there are many PHP HKDF usages that can, should or must use salt. Developers must consider using salt for better security rather than omitting it without proper consideration. Salt is often a part of the final key, and sometimes it is the key users need to access resources. Whenever possible, developers should use a strong salt. When IKM is weak, developers must use a strong salt to protect the IKM.
Web page with expiration that requires password
Step1: Setting up keys
- Get the application's secret, strong master key stored in a secure place, e.g. $_ENV. ($ikm)
- Generate a random password, e.g. base64_encode(random_bytes(8)). ($salt)
- Set the expiration. ($timestamp)
- Get and set the URL for the page and the expiration time, e.g. http://example.com/the_page?expire=$timestamp. ($info)
- Generate the HKDF hash value ($hkdf_key) from $ikm, $info and $salt as a part of the combined key. hash_hkdf('sha256', $ikm, 0, $info, $salt)
- Send the final URL including the HKDF hash (e.g. http://example.com/the_page?expire=$timestamp&hk=$hkdf_key) via mail, etc.
- Notify the user of the password ($salt) via SMS, etc.
Step2: Validating keys
- Ask the user to enter the password ($salt) generated in Step1.
- Get the application's secret, strong master key stored in a secure place, e.g. $_ENV. ($ikm)
- Get and set the URL and expiration time. ($info)
- Generate the HKDF hash value from $ikm, $info and $salt. hash_hkdf('sha256', $ikm, 0, $info, $salt)
- Compare the HKDF hash value received in the URL with the generated value, and allow access to the page only when they match, i.e. when the password ($salt) is correct.
With this method, the system does not have to store each combination of $_GET['hk'] (HKDF derived key), $_GET['expire'] (part of info) and $_POST['password'] (salt). Access is cryptographically secured. No user registration is required.
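The two steps above could be sketched as follows, using the current hash_hkdf() signature. The function names, the explicit $now argument and the example URL handling are assumptions made for illustration; a real application would read these values from $_GET, $_POST and its configuration.

```php
<?php
// Step1: derive the key that is appended to the URL as "hk".
function page_key_create(string $ikm, string $pageUrl, int $expire, string $password): string
{
    $info = $pageUrl . '?expire=' . $expire;   // public context
    return bin2hex(hash_hkdf('sha256', $ikm, 0, $info, $password));
}

// Step2: recompute the key from the request and compare in constant time.
function page_key_verify(string $ikm, string $pageUrl, int $expire, string $password, string $hk, int $now): bool
{
    if ($expire < $now) {
        return false;                          // link has expired
    }
    $expected = page_key_create($ikm, $pageUrl, $expire, $password);
    return hash_equals($expected, $hk);
}

$ikm      = random_bytes(32);                  // stands in for the master key
$password = base64_encode(random_bytes(8));    // sent to the user via SMS, etc.
$expire   = time() + 3600;
$hk       = page_key_create($ikm, 'http://example.com/the_page', $expire, $password);

var_dump(page_key_verify($ikm, 'http://example.com/the_page', $expire, $password, $hk, time())); // bool(true)
```

Because the expiration time is part of “info”, changing the expire parameter in the URL invalidates the key.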
Per user data encryption
- Get the application's secret, strong master key stored in a secure place, e.g. $_ENV. ($ikm)
- Get the user ID. ($info)
- Generate the HKDF hash value from $ikm and $info. hash_hkdf('sha256', $ikm, 0, $info)
- Encrypt the user data with the generated key.
Suppose your application had an SQL injection vulnerability and your data was stolen, including password hashes and encrypted user data. Secret encrypted data can be decrypted by attackers if “salt” is NOT used.
See also the Security Note section below. The same technique can be used for per user encryption keys.
Better way to encrypt per user
- Get the application's secret, strong master key stored in a secure place, e.g. $_ENV. ($ikm)
- Get a strong random salt as a combined key and store it as the secret key for the user. ($salt)
- Generate the HKDF hash value from $ikm and $salt. hash_hkdf('sha256', $ikm, 0, '', $salt)
- Encrypt the user data with the generated key.
Note that both methods use “secret” information for $ikm. However, there is a notable difference between the two. The former method uses only 1 secret key (the master encryption key) and its info (user ID) is known to the public, so one stolen key allows attackers to decrypt all encrypted data. The latter method requires 2 pieces of secret information (the master encryption key and the secret salt as a combined key) to attack.
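The difference between the two methods can be sketched as below, with the current hash_hkdf() signature. The helper names and the 'user-1234' ID are hypothetical, and the encryption step itself is omitted; only the key derivation is shown.

```php
<?php
// Stands in for the master key loaded from secure configuration (assumption).
$masterKey = random_bytes(32);

// Former method: public user ID as "info", no salt.
function user_key_info_only(string $masterKey, string $userId): string
{
    return hash_hkdf('sha256', $masterKey, 32, $userId);
}

// Latter method: per-user random salt, stored as a second secret.
function user_key_with_salt(string $masterKey, string $userSalt): string
{
    return hash_hkdf('sha256', $masterKey, 32, '', $userSalt);
}

$userSalt = random_bytes(32);
$k1 = user_key_info_only($masterKey, 'user-1234');
$k2 = user_key_with_salt($masterKey, $userSalt);

// Issuing a new salt yields a new, unrelated key for the same user.
$rotated = user_key_with_salt($masterKey, random_bytes(32));
var_dump(hash_equals($k2, $rotated)); // bool(false)
```

With the former method the key for a given user is fixed as long as the master key is fixed; with the latter, replacing the stored salt replaces the key.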
CSRF token
When the session ID is used as a CSRF token, there is a risk that the session ID leaks to others through saved or sent HTML pages, through malicious web browser plugins that read page content, etc. Therefore, the session ID should not be used as a CSRF token. HKDF with “salt” can be used to generate a CSRF token that belongs to a specific session.
Setting up CSRF token
Generate strong CSRF token seed, store it in $_SESSION. ( $_SESSION['CSRF_TOKEN_SEED'] = random_bytes(32) )
- Use CSRF_TOKEN_SEED as secret key. ( $ikm )
- Get expiration timestamp. ( $salt ) Weak salt is OK, since $ikm is strong.
- Generate the HKDF value from $ikm and $salt. hash_hkdf('sha256', $ikm, 0, '', $salt)
- Send the generated key and the timestamp to the browser as the CSRF token.
Verifying CSRF token
- Get HKDF key and timestamp ( $salt ) value from request.
- Check if timestamp is not expired.
- Generate HKDF value from CSRF_TOKEN_SEED ( $ikm ) and timestamp ( $salt ). hash_hkdf('sha256', $ikm, 0, '', $salt)
- Compare the HKDF hash value sent by the browser with the server-generated HKDF hash value, and accept the request only if they match.
Secure CSRF token expiration can be defined with this method regardless of session ID lifetime.
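A minimal sketch of this CSRF scheme with the current hash_hkdf() signature follows. The function names and the "timestamp:hex" token format are assumptions; the seed is passed as an argument here, whereas an application would read it from $_SESSION['CSRF_TOKEN_SEED'].

```php
<?php
// Build a token of the form "<expiry timestamp>:<hex HKDF value>".
function csrf_token_create(string $seed, int $expires): string
{
    $salt = (string) $expires;                       // weak salt is OK, the seed is strong
    return $salt . ':' . bin2hex(hash_hkdf('sha256', $seed, 0, '', $salt));
}

// Check expiry first, then recompute and compare in constant time.
function csrf_token_verify(string $seed, string $token, int $now): bool
{
    $parts = explode(':', $token, 2);
    if (count($parts) !== 2 || preg_match('/\A\d+\z/', $parts[0]) !== 1) {
        return false;                                // malformed token
    }
    [$salt, $mac] = $parts;
    if ((int) $salt < $now) {
        return false;                                // token has expired
    }
    $expected = bin2hex(hash_hkdf('sha256', $seed, 0, '', $salt));
    return hash_equals($expected, $mac);
}

$seed  = random_bytes(32);
$token = csrf_token_create($seed, time() + 600);
var_dump(csrf_token_verify($seed, $token, time())); // bool(true)
```

Since the timestamp is the salt, a client cannot extend the lifetime of a token by editing the timestamp without also invalidating the HKDF value.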
Proposal
Change hash_hkdf() signature from
string hash_hkdf(string algo, string ikm [, int length = 0 [, string info = '' [, string salt = '' ]]])
to
string hash_hkdf(string algo, string ikm, string salt [, string info = '' [, int length = 0 [, bool raw = FALSE]]])
- Return value: HEX string hash value by default, i.e. raw = FALSE
- algo: Hash algorithm. e.g. "sha1", "sha256"
- ikm: Secret Input Key Material. Some kind of key. e.g. Super secret master key, password, API key, etc.
- salt: Secret or non-secret salt value. Set NULL to use without salt; an exception is raised for an empty string. e.g. Random value such as nonce, timestamp, etc.
- info: Generated key context. e.g. Protocol number, user ID, group ID, applicable URI, etc.
- length: Output length. If omitted, the default output size of the specified hash function is used.
Note: The parameter order and the internal salt parameter handling are changed, i.e. the type and salt length checks are changed. The function returns a HEX string by default, considering PHP use cases and consistency with existing hash functions.
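For illustration, the proposed behavior can be approximated in userland on top of the current hash_hkdf(). The wrapper name hkdf_proposed() is hypothetical, and the choice of InvalidArgumentException is an assumption; the proposal above only says that an exception is raised for an empty salt.

```php
<?php
// Userland approximation of the proposed signature (sketch only).
function hkdf_proposed(string $algo, string $ikm, ?string $salt, string $info = '', int $length = 0, bool $raw = false): string
{
    if ($salt === '') {
        // Force an explicit decision: pass null to derive without salt.
        throw new InvalidArgumentException('Empty salt; pass NULL to omit salt explicitly');
    }
    $okm = hash_hkdf($algo, $ikm, $length, $info, $salt ?? '');
    return $raw ? $okm : bin2hex($okm);
}

$ikm  = random_bytes(32);
$salt = random_bytes(32);

var_dump(strlen(hkdf_proposed('sha256', $ikm, $salt)));                    // int(64), HEX string
var_dump(strlen(hkdf_proposed('sha256', $ikm, $salt, 'ctx', 0, true)));    // int(32), raw bytes
```

Because salt is the third required argument, a caller who does not want a salt has to write NULL explicitly instead of simply leaving the parameter out.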
From the user's perspective, the “salt” and “info” parameters could be used interchangeably. However, it is a good idea to follow RFC 5869 semantics. Unlike “info” (context), which can be optional in many cases, “salt” (entropy) cannot be optional for keys with expiration, for instance. This kind of expiration is common, e.g. Amazon AWS S3 uses HKDF-like expiration to allow access to objects.
Typical PHP applications that need HKDF can use (or should use) salt for security reasons, as in the examples above. Users should consider whether salt can be used or not. If the salt can be secret, it should be kept secret. Users shouldn't omit salt blindly, as this could lead to serious security issues: when the input key is weak, it can be guessed from hash_hkdf() output generated without salt. Salt makes derived keys considerably stronger.
Salt is better as a required parameter because typical PHP applications can supply salt. Salt should be left empty only when salt cannot be used.
Salt summary for typical PHP HKDF usage:
- Always use salt when it is possible.
- Use strong salt if possible.
- Use secret salt if possible.
- When the input key is weak, a strong secret salt must be used.
- Omit salt only when salt cannot be used.
- Omitting salt is likely to result in a weaker implementation, see the 'Other Use Cases' section.
Return value:
- Most use cases with PHP will require string return values, so this is the default.
- A HEX string return value and a “raw” flag parameter are required to be consistent with existing hash functions.
Discussions
Salt is optional. - Salt could be optional, but it should be provided always whenever it is possible.
On Mon, Jan 16, 2017 at 8:16 PM, Andrey Andreev narf@devilix.net wrote:
There's no comment from you on the PR, inline or not, but I can assure you this was not overlooked.
Salt is optional because RFC 5869 allows it to be optional. There's a reason for each of the current defaults work as they do, as well as the parameter positions:
- Length is in no way actually described as optional, and that makes sense as the function's purpose is to create cryptographic keys, which by nature have fixed lengths. The only reason we could make Length optional is because hash functions' output sizes are known values, and matching the desired OKM length with the hash function size makes for better performance.
- Info can be empty, but the algorithm is pretty much meaningless without it. The purpose of HKDF is to derive 2+ outputs from a single input, with the Info parameter serving as the differentiating factor.
- Salt is … while recommended, the only thing actually optional.
Incorrect argument. Salt could be optional, but the RFC states that “salt adds significantly to the strength of HKDF” and that “designers of applications are therefore encouraged to provide salt values to HKDF if such values can be obtained by the application.”, so it is “info” that is actually optional. “Salt” should be used whenever it is possible, as the RFC recommends.
In order to obtain a strong output key, either the input key or the salt must be cryptographically strong, i.e. when the input key is weak, a strong salt is mandatory by HKDF definition.
In any case, 'info' (context) is at least less important than 'salt' (entropy).
On Mon, Jan 16, 2017 at 8:08 PM, Nikita Popov nikita.ppv@gmail.com wrote:
Making the salt required makes no sense to me.
HKDF has a number of different applications:
a) Derive multiple strong keys from strong keying material. Typical case for this is deriving independent encryption and authentication keys from a master key. This requires only specification of $length. A salt is neither necessary nor useful in this case, because you start with strong cryptographic keying material.
b) Generating per-session (or similar) keys from a (strong cryptographic) master key. For this purpose you can specify the $info parameter. again, a salt is neither necessary nor useful in this case. (You could probably also use $salt instead of $info in this case, but the design of the function implies that $info should be used for this purpose.)
c) Extracting strong cryptographic keying material from weak cryptographic keying material. Standard example here is extracting strong keys from DH g^xy values (which are non-uniform) and similar. This is the usage that benefits from a $salt.
d) Combinations thereof.
Remember that HKDF is an extract-and-expand algorithm, and the extract step (which uses the salt) is only necessary if the input keying material is weak. We always include the extract step for compatibility with the overall HKDF construction (per the RFCs recommendation), but it's essentially just an unnecessary operation if you work on strong keying material.
The only thing that we may want to discuss is whether we should swap the $info and the $salt parameters. This depends on which usage (b or c) we consider more likely.
a) When deriving keys, “salt” should be supplied whenever it's possible. Simply deriving another key without salt would not be typical usage with PHP, or at least not recommended usage, because PHP is not used to implement basic cryptographic algorithms.
b) If I assume 'user identity' is used for “info”, then the derived key wouldn't be a “per session” key, but a “per user” key. So let us assume the session ID is used as “info”. While it works, the RFC describes “info” as non-secret information, i.e. “a protocol number, algorithm identifiers, user identities, etc.”. The session ID is a secret key. We don't always have to follow RFC recommendations, but storing a secret key in “info” (context) violates the RFC.
For per-session encryption, etc., a simple choice would be a per-session master key stored in $_SESSION as the secret input key (IKM) and a random string as the “salt”, which is a part of the final key. The optional “info” (context) could be used for additional information such as “confidential”, “public”, etc.
We are implementing RFC 5869. Not following the RFC recommendation does not make sense.
c) True, but according to the RFC, “salt” is not only good for generating a strong key from a weak key. “salt” can be used as a part of the final key, just like crypt() calculates a password hash from “salt and password”.
d) True, but you seem to have missed the “non secret salt” usage. The 'Other Use Cases' section includes many “non secret salt” values that are used as part of the final key.
“the extract step (which uses the salt) is only necessary if the input keying material is weak”: this cannot be true according to the RFC.
What the RFC states is
Yet, even a salt value of less quality (shorter in size or with limited entropy) may still make a significant contribution to the security of the output keying material
It does not say salt is only good for weak input keys; it says the generated key will have significantly better security.
Should not be used with weak key. - It is ok by HKDF definition.
On Sun, Feb 5, 2017 at 1:20 AM, Tom Worster fsb@thefsb.org wrote: The salt defends against certain attacks on predictable input key material, i.e. weak passwords. But HKDF should not normally be used for passwords because it is unsuitable.
A strong key is preferred, but the input key does not have to be strong. For weak keys, a strong salt should be used though.
Other Use Cases
The following use case examples use the new hash_hkdf() signature. $ikm could be any valid key. A secret master key is assumed for convenience. Generally speaking, a secret master key is difficult to maintain, so developers are better off avoiding it if possible.
As the bad examples show, treating salt as an optional parameter and omitting it results in suboptimal implementations.
Create Strong Key From Weak Key For a User
Bad example first
- Get plain text password. ($ikm)
- Get random string to make strong password. ($info)
- Generate encryption key. hash_hkdf('sha256', $ikm, NULL, $info);
Although it works, developers shouldn't do this because $info is intended for context, which is public information as per the RFC.
Correct way is
- Get plain text password for user. ($ikm)
- Get secret random string to make strong password. ($salt)
- Generate encryption key. hash_hkdf('sha256', $ikm, $salt);
Note: As per the RFC, “salt” may be secret or non-secret, but here the salt should be secret. Since $ikm is very weak, the salt must be strong.
Per User Encryption Key
Bad example first
- Get secure secret master key. ($ikm)
- Get user ID. ($info)
- Generate encryption key. hash_hkdf('sha256', $ikm, NULL, $info);
Developers shouldn't do this because once the encryption key is stolen, they cannot issue a new encryption key.
Correct way is
- Get secure secret master key. ($ikm)
- Generate secret or non-secret random salt. ($salt) Save the salt and, if needed, also provide it to the user as a part of the key.
- Get user ID. ($info)
- Generate encryption key. hash_hkdf('sha256', $ikm, $salt, $info);
This way, developers can issue as many new encryption keys for the user as they need.
The combined key (output key and salt) may be disclosed to the user as the encryption key.
Per Session Encryption Key
Bad example first
- Get secure secret master key. ($ikm)
- Get session ID. ($info)
- Generate encryption key. hash_hkdf('sha256', $ikm, NULL, $info);
Although it works, developers shouldn't do this because $info is intended for context (public information) as per the RFC.
Correct way is
- Get secure secret master key. ($ikm)
- Store a secret random salt in the session and use it as the salt. ($salt)
- Generate encryption key. hash_hkdf('sha256', $ikm, $salt);
URL access key
Bad example first
- Get secure secret master key. ($ikm)
- Get the URL that should be protected. ($info)
- Generate access key. hash_hkdf('sha256', $ikm, NULL, $info);
Developers shouldn't do this unless they are absolutely sure that it is acceptable for the URL to remain accessible with the generated key even after the key is stolen.
Better way is
- Get secure secret master key. ($ikm)
- Get random salt. ($salt) Save the salt and provide it to the user as a part of the key.
- Get the URL that should be protected. ($info)
- Generate access key. hash_hkdf('sha256', $ikm, $salt, $info);
By keeping track of valid $salt values, developers can control key validity.
The combined key (output key and salt) may be disclosed to the user as the access key.
Limited URL access key
Good example only
- Get secure secret master key. ($ikm)
- Get timestamp for access control as non-secret salt. ($salt) Provide the salt as a part of the key.
- Get the URL that should be protected. ($info) The accessed URL is known to the server when the user accesses it.
- Generate access key. hash_hkdf('sha256', $ikm, $salt, $info);
Any key that has a timeout can be built similarly.
Generate Key with Timeout
Good example only
- Get secure secret master key. ($ikm)
- Get timestamp + random string for access control as non-secret salt. ($salt) Provide the salt as a part of the key.
- Generate access key. hash_hkdf('sha256', $ikm, $salt);
Generate Key Whatever Purpose For a User
Bad example first
- Get user's password information. ($ikm)
- Get user ID. ($info)
- Generate access key. hash_hkdf('sha256', $ikm, NULL, $info);
Developers shouldn't do this because they cannot issue a new key for the user and cannot revoke keys. Since the input is weak, there must also be a strong salt.
Better way is
Note: the salt in this example is partially secret and partially non-secret.
- Get user's password information. ($ikm)
- Get random secret master salt (per system or per user; per user is preferred) + get random salt (user key) as combined key. ($salt) Save the user key salt and provide it as a part of the key.
- Generate access key. hash_hkdf('sha256', $ikm, $salt);
By keeping track of salt values, developers can control key validity.
Generate Key from Whatever ID
Bad example first
- Get secret master key. ($ikm)
- Get ID. ($info) Could be a non-secret ID such as a group ID displayed in the URL.
- Generate access key. hash_hkdf('sha256', $ikm, NULL, $info);
Developers shouldn't do this because they cannot issue a new key for the ID and cannot revoke keys.
Better way is
- Get secret master key. ($ikm)
- Get random secret salt for ID ($salt)
- Get ID ($info) Could be a non-secret ID such as a group ID displayed in the URL.
- Get key version ($info)
- Generate key. hash_hkdf('sha256', $ikm, $salt, $info);
Use the output key, ID and key version as the combined key. Keys can be revoked by checking the ID and/or key version.
Generate Key for Group
Good example only
- Get secure secret master key. ($ikm)
- Get random salt. ($salt) Save the salt and provide it as a part of the key.
- Get group ID. ($info)
- Generate access key. hash_hkdf('sha256', $ikm, $salt, $info);
By keeping track of valid $salt values, developers can control key validity and can issue as many new keys as needed.
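The salt-tracking idea used in several of these use cases might look like the sketch below. It uses the current hash_hkdf() parameter order rather than the proposed one, and the function names, the in-memory $validSalts array and the 'group-42' ID are assumptions; a real application would keep the valid salts in persistent storage.

```php
<?php
// Issue a new key for a group and remember its salt as valid.
function group_key_issue(string $masterKey, string $groupId, array &$validSalts): array
{
    $salt = random_bytes(32);
    $validSalts[bin2hex($salt)] = true;
    $key = hash_hkdf('sha256', $masterKey, 0, $groupId, $salt);
    return ['salt' => bin2hex($salt), 'key' => bin2hex($key)];
}

// A key is accepted only if its salt is still tracked and the HKDF value matches.
function group_key_check(string $masterKey, string $groupId, array $validSalts, string $saltHex, string $keyHex): bool
{
    if (!isset($validSalts[$saltHex])) {
        return false;                          // unknown or revoked salt
    }
    $expected = bin2hex(hash_hkdf('sha256', $masterKey, 0, $groupId, hex2bin($saltHex)));
    return hash_equals($expected, $keyHex);
}

$masterKey  = random_bytes(32);
$validSalts = [];
$issued     = group_key_issue($masterKey, 'group-42', $validSalts);

var_dump(group_key_check($masterKey, 'group-42', $validSalts, $issued['salt'], $issued['key'])); // bool(true)

unset($validSalts[$issued['salt']]);           // revoke the key
var_dump(group_key_check($masterKey, 'group-42', $validSalts, $issued['salt'], $issued['key'])); // bool(false)
```

Revocation here is just forgetting the salt, so the master key never has to change when a single derived key is withdrawn.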
Backward Incompatible Changes
None. hash_hkdf() is a new function.
Proposed PHP Version(s)
Next PHP 7.x
RFC Impact
None.
Open Issues
Please comment if any.
Unaffected PHP Functionality
Nothing is affected. hash_hkdf() is a new function and does not affect any existing functionality.
Future Scope
Please comment if any
Proposed Voting Choices
State whether this project requires a 2/3
Patches and Tests
TBD
Implementation
After the project is implemented, this section should contain
- the version(s) it was merged to
- a link to the git commit(s)
- a link to the PHP manual entry for the feature
- a link to the language specification section (if any)
References
Links to external references, discussions or RFCs
Rejected Features
Keep this updated with features that were discussed on the mail lists.