No normative description of the password hashing algorithm is provided, so interoperability of this feature cannot be assumed. In an informative section, five pages of C-language source code are provided as “an example”, and this code appears to involve machine-dependent bit manipulations.
Provide a normative, cross-platform definition of the hashing algorithm. Cross-platform source code can be given as an example, but the normative text should be in English, not in a programming language.
pp. 1917-1922 Part 4, Section 03.02.29
ED
Proposed Disposition of DIS 29500 Comment CA-0039 (Modified: 2008-01-13) Agreed; the source code provided is machine dependent, and the hashing algorithm requires a normative textual description. As well, to ensure that each instance is identical, we will create one normative definition of this algorithm and reference it throughout the text. Before providing the proposed disposition, it should be noted that based on multiple national body comments, the current hashing mechanism and all of its attributes will be deprecated in favour of a new mechanism which utilizes only well-accepted hashing algorithms. Accordingly, we will remove this and other legacy hashing mechanisms from their current location in the specification, and place them into a new annex for deprecated features. Following the precedent set by other ISO standards (such as SQL’s ISO 9075:2003 Part 1 and C++’s ISO/IEC 14882:1998), we will make use of a new Annex that contains normative descriptions of all deprecated features. The intent of this Annex is to enable a transitional period during which existing binary documents being migrated to DIS 29500 can make use of those deprecated features to preserve their fidelity, while noting that new documents should not use them. Accordingly, the Conformance clause will also be changed to state that newly created documents (those not created by migrating existing binary documents) should not use deprecated features. All deprecated features will be removed from their current locations in the standard, but will be fully defined in this new Annex. 
However, even though this legacy hashing mechanism is deprecated, we agree that it should be normatively defined (as some national bodies pointed out, the algorithm is already defined in §2.15.1.28); accordingly, the following changes will be made:

Part 4, §3.2.12, page 1,896, attribute reservationPassword:

reservationPassword (Write Reservation Password)

Specifies the legacy hash of the password required for editing this workbook. This hash is optional and may be ignored. The hash is generated from an 8-bit wide character. 16-bit Unicode characters must be converted down to 8 bits before the hash is computed, using the logic defined in the revisionsPassword attribute of the workbookProtection element (§3.2.29). The resulting value is hashed using the algorithm defined below.

[Note: An example algorithm to hash the user input into the value stored is as follows:

    // Function Input:
    //   szPassword: NULL terminated C-Style string
    //   cchPassword: The number of characters in szPassword (not including the NULL terminator)
    WORD GetPasswordHash(const CHAR *szPassword, int cchPassword)
    {
        WORD wPasswordHash;
        const CHAR *pch;

        wPasswordHash = 0;
        if (cchPassword > 0)
        {
            pch = &szPassword[cchPassword];
            while (pch-- != szPassword)
            {
                wPasswordHash = ((wPasswordHash >> 14) & 0x01) | ((wPasswordHash << 1) & 0x7fff);
                wPasswordHash ^= *pch;
            }
            wPasswordHash ^= (0x8000 | ('N' << 8) | 'K');
        }
        return(wPasswordHash);
    }

end note]

The possible values for this attribute are defined by the ST_UnsignedShortHex simple type (§3.18.87).

Part 4, §3.2.29, pages 1,916-1,922, attribute revisionsPassword:

revisionsPassword (Legacy Revisions Password)

Specifies the legacy hash of the password required for unlocking revisions in this workbook. The hash is generated from an 8-bit wide character. 16-bit Unicode characters must be converted down to 8 bits before the hash is computed, using the following logic.

For SpreadsheetML password hash purposes, Unicode UTF-16 input code points are converted to an “ANSI” single or double byte character set from the following list:

    874   windows-874     ANSI/OEM Thai (same as 28605, ISO 8859-15); Thai (Windows)
    932   shift_jis       ANSI/OEM Japanese; Japanese (Shift-JIS)
    936   gb2312          ANSI/OEM Simplified Chinese (PRC, Singapore); Chinese Simplified (GB2312)
    949   ks_c_5601-1987  ANSI/OEM Korean (Unified Hangul Code)
    950   big5            ANSI/OEM Traditional Chinese (Taiwan; Hong Kong SAR, PRC); Chinese Traditional (Big5)
    1250  windows-1250    ANSI Central European; Central European (Windows)
    1251  windows-1251    ANSI Cyrillic; Cyrillic (Windows)
    1252  windows-1252    ANSI Latin 1; Western European (Windows)
    1253  windows-1253    ANSI Greek; Greek (Windows)
    1254  windows-1254    ANSI Turkish; Turkish (Windows)
    1255  windows-1255    ANSI Hebrew; Hebrew (Windows)
    1256  windows-1256    ANSI Arabic; Arabic (Windows)
    1257  windows-1257    ANSI Baltic; Baltic (Windows)
    1258  windows-1258    ANSI/OEM Vietnamese; Vietnamese (Windows)

Code points with no representation in the target character set are replaced with Unicode character 0x3f (?). Valid values are names and aliases listed in the IANA CHARACTER SETS listing found at http://www.iana.org/assignments/character-sets. The necessary mapping tables can be found at the following location: http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/ .

Code pages 932, 936, 949, and 950 are “Double Byte” code pages. The remainder of the “ANSI” code pages supported by Windows are “Single Byte” code pages. For single byte character sets, each Unicode code point is replaced by a single byte, or by 0x3f if an appropriate character doesn’t exist in the character set. For double byte character sets, each Unicode code point is replaced by either a single byte or a two byte sequence, depending on the input character, or by 0x3f if an appropriate character doesn’t exist in the character set.

In our tables the target is a single byte sequence if the most significant byte is 0x00; otherwise it is a double byte sequence, with the lead byte being the most significant byte. To convert, first check whether conversion is being done to a single or double byte character set and load the appropriate WCTABLE. For each input character, look up the code point in the WCTABLE. There are 3 possibilities: not found, single byte, or double byte. If the input character is not found, append 0x3f and continue to the next character. If the result is a single byte, check to make sure the entry in the MBTABLE matches the input. If it matches, append the single byte to the output. If it does not match, append 0x3f to the output. If the result is a double byte, check to make sure the entry in the DBCSENTRY table for the appropriate lead byte matches the input character. If it matches, append the lead byte and trail byte to the output. If it does not match, append 0x3f to the output.
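The conversion rule described above (look up each UTF-16 code unit, accept the result only if it round-trips, otherwise emit 0x3f) can be sketched in Python. This is a minimal illustration, not the normative definition: the function name `to_legacy_bytes` is hypothetical, and it assumes that Python's built-in `cp…` codecs are an acceptable stand-in for the WindowsBestFit tables, which may differ for a few code points.

```python
def to_legacy_bytes(password: str, codec: str = "cp1252") -> bytes:
    """Convert text to a single/double-byte string, using '?' for unmappable input.

    Assumption: the Python codec named by `codec` approximates the
    WCTABLE/MBTABLE/DBCSENTRY data referenced in the text.
    """
    out = bytearray()
    raw = password.encode("utf-16-le")
    # Walk the input one 16-bit code unit at a time, as the text does.
    for i in range(0, len(raw), 2):
        unit = int.from_bytes(raw[i:i + 2], "little")
        ch = chr(unit)
        try:
            encoded = ch.encode(codec)  # strict: raises if there is no mapping
        except UnicodeEncodeError:
            out.append(0x3F)            # not found: append '?'
            continue
        # Round-trip check: the bytes must map back to the same character,
        # so best-fit substitutions are rejected.
        try:
            matches = encoded.decode(codec) == ch
        except UnicodeDecodeError:
            matches = False
        out.extend(encoded if matches else b"?")
    return bytes(out)
```

For example, `to_legacy_bytes("abc")` yields `b"abc"`, while a character with no representation in code page 1252 becomes a single `?` byte. Because the loop works on 16-bit code units, a character outside the Basic Multilingual Plane contributes two `?` bytes in this sketch.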
The following pseudocode describes how this conversion should be done:

    int WideCharToMultiByte(wchar_t* wszInput, byte* szOutput)
    {
        // Remember output start so we can return length
        byte* szOutputStart = szOutput;

        // Ask the system for the current ANSI code page, which
        // on Windows is a system setting.
        int iCodePage = GetCurrentAnsiCodePage();

        // Load character set tables and determine double/single byte nature.
        // This will depend on how the character sets are represented on
        // the target machine. TABLECLASS represents some abstract
        // representation of this structure here.
        TABLECLASS* pTables = LoadCharacterSetTables(iCodePage);
        bool bDoubleByte = false;
        if (iCodePage == 932 || iCodePage == 936 ||
            iCodePage == 949 || iCodePage == 950)
            bDoubleByte = true;

        while (*wszInput != 0)
        {
            if (bDoubleByte)
                szOutput = AppendDoubleByte(pTables, *wszInput, szOutput);
            else
                szOutput = AppendSingleByte(pTables, *wszInput, szOutput);
            // Read next input wchar_t
            wszInput++;
        }

        // Null terminate the output
        *szOutput = 0;

        // Return output length
        return szOutput - szOutputStart;
    }

    byte* AppendSingleByte(TABLECLASS* pTables, wchar_t wcIn, byte* szOutput)
    {
        // Look up byte that we want to append.
        byte bOut = pTables->LookUpSingleByte(wcIn);

        // Make sure that bOut matches the input, otherwise use ?
        // (ie: no best fit behavior allowed)
        if (wcIn != pTables->LookUpWideChar(bOut))
            bOut = 0x3f;

        *szOutput = bOut;
        szOutput++;
        return szOutput;
    }

    byte* AppendDoubleByte(TABLECLASS* pTables, wchar_t wcIn, byte* szOutput)
    {
        // Look up bytes that we want to append.
        UINT16 bytesOut = pTables->LookUpDoubleByte(wcIn);

        // See if it is a single or double byte sequence
        if (bytesOut & 0xFF00)
        {
            // It is a double byte sequence
            // Make sure that bytesOut matches the input, otherwise use ?
            // (ie: no best fit behavior allowed)
            if (wcIn != pTables->LookUpWideChar(bytesOut))
            {
                // Use ?, it will be added below
                bytesOut = 0x003f;
            }
            else
            {
                // It matched, use the lead byte we found
                // trail byte will be added below
                *szOutput = bytesOut >> 8;
                szOutput++;
            }
        }
        else
        {
            // It is a single byte sequence
            // Make sure that bytesOut matches the input, otherwise use ?
            // (ie: no best fit behavior allowed)
            if (wcIn != pTables->LookUpWideChar((byte)(bytesOut & 0xFF)))
                bytesOut = 0x003f;
        }

        // Add the single or trail byte
        *szOutput = bytesOut & 0xFF;
        szOutput++;
        return szOutput;
    }

    class TABLECLASS
    {
        // Construction depends on how you choose to store & load the
        // table files

        byte LookUpSingleByte(wchar_t wcIn)
        {
            // How you access the table depends on your storage mechanism.
            // Look up the line in WCTABLE where the first column matches wcIn,
            // and then return the byte value from the second column.
            if (exists WCTABLE{wcIn})
                return WCTABLE{wcIn}.SecondColumn;
            // If it doesn't exist, return ?
            return 0x3f;
        }

        UINT16 LookUpDoubleByte(wchar_t wcIn)
        {
            // How you access the table depends on your storage mechanism.
            // Look up the line in WCTABLE where the first column matches wcIn,
            // and then return the double byte value from the second column.
            if (exists WCTABLE{wcIn})
                return WCTABLE{wcIn}.SecondColumn;
            // If it doesn't exist, return ?
            return 0x003f;
        }

        // Overload that looks up wide chars from single byte code points.
        wchar_t LookUpWideChar(byte bIn)
        {
            // How you access the table depends on your storage mechanism.
            // Look up the line in MBTABLE where the first column matches bIn,
            // and then return the wchar_t value from the second column.
            if (exists MBTABLE{bIn})
                return MBTABLE{bIn}.SecondColumn;
            // If it doesn't exist, return ?
            return 0x003f;
        }

        // Overload that looks up wide chars from double byte code points
        wchar_t LookUpWideChar(UINT16 bytesIn)
        {
            // How you access the table depends on your storage mechanism.
            // First find the DBCSTABLE where the LeadByte matches
            // the lead (most significant) input byte.
            if (exists DBCSTABLE{bytesIn >> 8})
            {
                DbcsTable = DBCSTABLE{bytesIn >> 8};
                // Look up the line in DbcsTable where the first column
                // matches the input trail (least significant) byte,
                // and then return the wchar_t value from the second column.
                if (exists DbcsTable{bytesIn & 0xFF})
                    return DbcsTable{bytesIn & 0xFF}.SecondColumn;
            }
            // Either the lead byte table or specific trail byte
            // doesn't exist in the table, return ?
            return 0x003f;
        }
    }

The resulting value is hashed using the low-order word algorithm defined in §2.15.1.28.

[Example: The algorithm to hash the resulting single-byte input into the value stored can be represented by the following pseudocode:

    // Function Input:
    //   szPassword: NULL terminated C-Style string
    //   cchPassword: The number of characters in szPassword (not including the NULL terminator)
    unsigned short GetPasswordHash(const char *szPassword, int cchPassword)
    {
        unsigned short wPasswordHash;
        const char *pch;

        wPasswordHash = 0;
        if (cchPassword > 0)
        {
            pch = &szPassword[cchPassword];
            while (pch-- != szPassword)
            {
                wPasswordHash = ((wPasswordHash >> 14) & 0x01) | ((wPasswordHash << 1) & 0x7fff);
                wPasswordHash ^= *pch;
            }
            wPasswordHash = ((wPasswordHash >> 14) & 0x01) | ((wPasswordHash << 1) & 0x7fff);
            wPasswordHash ^= cchPassword;
            wPasswordHash ^= (0x8000 | ('N' << 8) | 'K');
        }
        return(wPasswordHash);
    }

end example]

The possible values for this attribute are defined by the ST_UnsignedShortHex simple type (§3.18.87).

Part 4, §3.2.29, page 1,922, attribute workbookPassword:

workbookPassword (Legacy Workbook Password)

Specifies the legacy hash of the password required for unlocking revisions in this workbook. The hash is generated from an 8-bit wide character. 16-bit Unicode characters must be converted down to 8 bits before the hash is computed, using the logic defined in the revisionsPassword attribute of the workbookProtection element (§3.2.29). The resulting value is hashed using the algorithm defined below.

[Note: An example algorithm to hash the user input into the value stored is as follows:

    // Function Input:
    //   szPassword: NULL terminated C-Style string
    //   cchPassword: The number of characters in szPassword (not including the NULL terminator)
    WORD GetPasswordHash(const CHAR *szPassword, int cchPassword)
    {
        WORD wPasswordHash;
        const CHAR *pch;

        wPasswordHash = 0;
        if (cchPassword > 0)
        {
            pch = &szPassword[cchPassword];
            while (pch-- != szPassword)
            {
                wPasswordHash = ((wPasswordHash >> 14) & 0x01) | ((wPasswordHash << 1) & 0x7fff);
                wPasswordHash ^= *pch;
            }
            wPasswordHash ^= (0x8000 | ('N' << 8) | 'K');
        }
        return(wPasswordHash);
    }

end note]

The possible values for this attribute are defined by the ST_UnsignedShortHex simple type (§3.18.87).

Part 4, §3.3.1.69, page 2,004, attribute password:

password (Legacy Password)

Specifies the legacy hash of the password required for editing this range. The hash is generated from an 8-bit wide character. 16-bit Unicode characters must be converted down to 8 bits before the hash is computed, using the logic defined in the revisionsPassword attribute of the workbookProtection element (§3.2.29). The resulting value is hashed using the algorithm defined below.

[Note: An example algorithm to hash the user input into the value stored is as follows:

    // Function Input:
    //   szPassword: NULL terminated C-Style string
    //   cchPassword: The number of characters in szPassword (not including the NULL terminator)
    WORD GetPasswordHash(const CHAR *szPassword, int cchPassword)
    {
        WORD wPasswordHash;
        const CHAR *pch;

        wPasswordHash = 0;
        if (cchPassword > 0)
        {
            pch = &szPassword[cchPassword];
            while (pch-- != szPassword)
            {
                wPasswordHash = ((wPasswordHash >> 14) & 0x01) | ((wPasswordHash << 1) & 0x7fff);
                wPasswordHash ^= *pch;
            }
            wPasswordHash ^= (0x8000 | ('N' << 8) | 'K');
        }
        return(wPasswordHash);
    }

end note]

The possible values for this attribute are defined by the ST_UnsignedShortHex simple type (§3.18.87).

Part 4, §3.3.1.81, page 2,022, attribute password:

password (Legacy Password)

Specifies the legacy hash of the password required for editing this worksheet. This protection is optional and may be ignored by applications that choose not to support this functionality. The hash is generated from an 8-bit wide character. 16-bit Unicode characters must be converted down to 8 bits before the hash is computed, using the logic defined in the revisionsPassword attribute of the workbookProtection element (§3.2.29). The resulting value is hashed using the algorithm defined below.

[Note: An example algorithm to hash the user input into the value stored is as follows:

    // Function Input:
    //   szPassword: NULL terminated C-Style string
    //   cchPassword: The number of characters in szPassword (not including the NULL terminator)
    WORD GetPasswordHash(const CHAR *szPassword, int cchPassword)
    {
        WORD wPasswordHash;
        const CHAR *pch;

        wPasswordHash = 0;
        if (cchPassword > 0)
        {
            pch = &szPassword[cchPassword];
            while (pch-- != szPassword)
            {
                wPasswordHash = ((wPasswordHash >> 14) & 0x01) | ((wPasswordHash << 1) & 0x7fff);
                wPasswordHash ^= *pch;
            }
            wPasswordHash ^= (0x8000 | ('N' << 8) | 'K');
        }
        return(wPasswordHash);
    }

end note]

The possible values for this attribute are defined by the ST_UnsignedShortHex simple type (§3.18.87).

Part 4, §3.3.1.82, pages 2,024-2,025, attribute password:

password (Password)

Specifies the legacy hash of the password required for editing this chart sheet. This protection is optional and may be ignored by applications that choose not to support this functionality. The hash is generated from an 8-bit wide character. 16-bit Unicode characters must be converted down to 8 bits before the hash is computed, using the logic defined in the revisionsPassword attribute of the workbookProtection element (§3.2.29). The possible values for this attribute are defined by the ST_UnsignedShortHex simple type (§3.18.87).

Similar Comments: BR-0042, CL-0199, CL-0200, CO-0142, CO-0149, DE-0090, FR-0339, FR-0344, GB-0295, GB-0297, GR-0075, GR-0080, IN-0065, IR-0046, KE-0059, PT-0092, US-0137, US-0142
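As a platform-neutral illustration of the revised revisionsPassword hash (the form that includes the final rotation and the XOR with the password length), the following Python sketch may help. The function name `legacy_password_hash` is hypothetical, and the sketch assumes each input byte is treated as unsigned; the C text leaves this to the signedness of `char`, which is one source of the machine dependence raised in the comment.

```python
def legacy_password_hash(password_bytes: bytes) -> int:
    """Return the 16-bit legacy hash of an already-converted byte string.

    Assumption: bytes are unsigned (0..255); a C build with signed `char`
    could give different results for bytes >= 0x80.
    """
    if not password_bytes:
        return 0  # mirrors the `cchPassword > 0` guard in the text

    h = 0
    # Process the bytes from last to first, as the C loop does.
    for b in reversed(password_bytes):
        # Rotate the low 15 bits left by one position.
        h = ((h >> 14) & 0x01) | ((h << 1) & 0x7FFF)
        h ^= b

    # Final rotation, then mix in the length and the fixed constant.
    h = ((h >> 14) & 0x01) | ((h << 1) & 0x7FFF)
    h ^= len(password_bytes)
    h ^= 0x8000 | (ord("N") << 8) | ord("K")  # 0xCE4B
    return h & 0xFFFF


print(f"{legacy_password_hash(b'test'):04X}")  # prints CBEB
```

The value is what would be stored in an ST_UnsignedShortHex attribute. Working with explicit masks rather than fixed-width C types keeps the result independent of word size and `char` signedness.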

Dupe of Ecma 50