Lack of support for RFC 3987. As a result, the specification does not allow Arabic characters in web addresses. Since Ghana is about a fifth Muslim, a decent portion of our population can be assumed to be at least basically literate in Arabic. This is therefore a problem.
OOXML should include support for RFC 3987.
te
Proposed Disposition of DIS 29500 Comment GH-0012 (Modified: 2007-12-11)

Agreed; the use of IRIs as part names will be allowed by making the following changes to Part 2, §8.1.1, starting on page 11:

8.1.1 Part Names

Each part has a name. Part names refer to parts within a package.

[Example: The part name "/hello/world/doc.xml" contains three segments: "hello", "world", and "doc.xml". The first two segments in the sample represent levels in the logical hierarchy and serve to organize the parts of the package, whereas the third contains actual content. Note that segments are not explicitly represented as folders in the package model, and no directory of folders exists in the package model. end example]

8.1.1.1 Part Name Syntax

A Part name shall be an IRI and shall be encoded as either a Part IRI or a Part URI. A Part IRI is a physical representation that permits direct use of Unicode characters. A Part URI is a physical representation that uses a percent-encoding for non-ASCII Unicode characters.

[Note: Not all versions of the ZIP specification support a Part name represented as a Part IRI. To preserve interoperability, implementers are encouraged to use the currently more prevalent Part URI representation. end note]

8.1.1.1.1 Part IRI Syntax

The part IRI grammar is defined as follows:

    part-IRI = 1*( "/" isegment )
    isegment = 1*( ipchar )

ipchar is defined in RFC 3987:

    ipchar      = iunreserved / pct-encoded / sub-delims / ":" / "@"
    iunreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" / ucschar
    ucschar     = %xA0-D7FF / %xF900-FDCF / %xFDF0-FFEF
                / %x10000-1FFFD / %x20000-2FFFD / %x30000-3FFFD
                / %x40000-4FFFD / %x50000-5FFFD / %x60000-6FFFD
                / %x70000-7FFFD / %x80000-8FFFD / %x90000-9FFFD
                / %xA0000-AFFFD / %xB0000-BFFFD / %xC0000-CFFFD
                / %xD0000-DFFFD / %xE1000-EFFFD
    pct-encoded = "%" HEXDIG HEXDIG
    sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

The part IRI grammar implies the following constraints. The package implementer shall neither create any part that violates these constraints nor retrieve any data from a package as a part if the purported part IRI violates these constraints.

A part IRI shall not be empty. [M1.1]
A part IRI shall not have empty isegments. [M1.3]
A part IRI shall start with a forward slash ("/") character. [M1.4]
A part IRI shall not have a forward slash as the last character. [M1.5]
An isegment shall not hold any characters other than ipchar characters. [M1.6]

Part IRI isegments have the following additional constraints. The package implementer shall neither create any part with a part IRI comprised of an isegment that violates these constraints nor retrieve any data from a package as a part if the purported part IRI contains an isegment that violates these constraints.

An isegment shall not contain percent-encoded forward slash ("/"), or backward slash ("\") characters. [M1.7]
An isegment shall not contain percent-encoded iunreserved characters. [M1.8]
An isegment shall not end with a dot (".") character. [M1.9]
An isegment shall include at least one non-dot character. [M1.10]

8.1.1.1.2 Part URI Syntax

The part URI grammar is defined as follows:

    part-URI = 1*( "/" segment )
    segment  = 1*( pchar )

pchar is defined in RFC 3986:

    pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
    unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
    pct-encoded = "%" HEXDIG HEXDIG
    sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

The part name URI grammar implies the following constraints. The package implementer shall neither create any part that violates these constraints nor retrieve any data from a package as a part if the purported part name URI violates these constraints.

A part name URI shall not be empty. [M1.1]
A part name URI shall not have empty segments. [M1.3]
A part name URI shall start with a forward slash ("/") character. [M1.4]
A part name URI shall not have a forward slash as the last character. [M1.5]
A segment shall not hold any characters other than pchar characters. [M1.6]

Part URI segments have the following additional constraints. The package implementer shall neither create any part with a part name URI comprised of a segment that violates these constraints nor retrieve any data from a package as a part if the purported part name URI contains a segment that violates these constraints.

A segment shall not contain percent-encoded forward slash ("/"), or backward slash ("\") characters. [M1.7]
A segment shall not contain percent-encoded unreserved characters. [M1.8]
A segment shall not end with a dot (".") character. [M1.9]
A segment shall include at least one non-dot character. [M1.10]

[Example:
Example 8-1. A part name
    /a/%D1%86.xml
    /xml/item1.xml
Example 8-2. An invalid part name
    //xml/.
end example]

8.1.1.2 Part IRI and Part URI Mapping

A Part IRI can be converted to a Part URI by converting ucschar characters to percent-encoded triplets, as defined in Step 2 in §3.1 of RFC 3987. A Part URI can be converted to a Part IRI by converting percent-encoded triplets to ucschar characters, as defined in §3.2 of RFC 3987.

8.1.1.3 Part Names Equivalence

Part names shall be mapped to either the Part IRI or Part URI form for comparison. Part names represented in different forms cannot be compared.

[Note: Equivalence rules for the Part IRI and Part URI forms guarantee uniformity of the comparison result for Part Names converted either to Part IRI or to Part URI form. end note]

Packages shall not contain equivalent Part Names, and package implementers shall neither create nor recognize packages with equivalent Part Names. [M1.12]

8.1.1.3.1 Part IRI Equivalence

Part IRI equivalence is determined by comparing part IRIs character-by-character:

pct-encoded and ALPHA characters as case-insensitive ASCII
ucschar characters as case-sensitive Unicode

8.1.1.3.2 Part URI Equivalence

Part URI equivalence is determined by comparing part URIs as case-insensitive ASCII strings.

8.1.1.4 Part Naming

A package implementer shall neither create nor recognize a part with a part name derived from another part name by appending segments to it. [M1.11]

[Example: If a package contains a part named "/segment1/segment2/.../segmentn", then other parts in that package shall not have names such as: "/segment1", "/segment1/segment2", or "/segment1/segment2/.../segmentn-1". end example]

Similar Comments: CZ-0041, ECMA-0015, JP-0004, JP-0042, US-0043
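
Informative illustration: the Part URI constraints, the Part IRI to Part URI mapping, and the Part URI equivalence rule proposed above could be sketched in Python roughly as follows. This is a minimal sketch and not part of the proposed normative text. The function names (`is_valid_part_uri`, `part_iri_to_uri`, `part_names_equivalent`) are hypothetical, the mapping assumes its input is already a syntactically valid Part IRI or Part URI, and the naming rule [M1.11] is not covered.

```python
import re

# Characters allowed in a segment by the pchar rule (RFC 3986),
# either literally or as a percent-encoded triplet.
_SEGMENT = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})+")
_UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def is_valid_part_uri(name: str) -> bool:
    """Check a candidate Part URI against constraints M1.1 to M1.10."""
    # Not empty, leading slash, no trailing slash: [M1.1], [M1.4], [M1.5]
    if not name or not name.startswith("/") or name.endswith("/"):
        return False
    for segment in name[1:].split("/"):
        if not segment:  # empty segment: [M1.3]
            return False
        if not _SEGMENT.fullmatch(segment):  # non-pchar character: [M1.6]
            return False
        for triplet in re.findall(r"%([0-9A-Fa-f]{2})", segment):
            decoded = chr(int(triplet, 16))
            # Percent-encoded slash, backslash, or unreserved character:
            # [M1.7], [M1.8]
            if decoded in "/\\" or decoded in _UNRESERVED:
                return False
        # A trailing dot violates [M1.9]; an all-dot segment also ends
        # with a dot, so this check covers [M1.10] as well.
        if segment.endswith("."):
            return False
    return True


def part_iri_to_uri(name: str) -> str:
    """Percent-encode every non-ASCII character as UTF-8 triplets.

    Assumption: the input is already a valid Part IRI (or Part URI),
    so every non-ASCII character in it is a ucschar.
    """
    out = []
    for ch in name:
        if ord(ch) < 0x80:
            out.append(ch)
        else:
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


def part_names_equivalent(a: str, b: str) -> bool:
    """Map both names to Part URI form, then compare them as
    case-insensitive ASCII strings."""
    uri_a = part_iri_to_uri(a).encode("ascii").lower()
    uri_b = part_iri_to_uri(b).encode("ascii").lower()
    return uri_a == uri_b
```

With these helpers, the valid names from Example 8-1 are accepted and the invalid name "//xml/." is rejected because it contains an empty segment. Mapping both names to the Part URI form before comparing keeps ucschar characters case-sensitive, since differently cased non-ASCII letters produce different UTF-8 triplets.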
