@rxaviers
Created October 9, 2018 15:02
Uniqueness of unit part of unit identifier
UTS#35 defines "unit identifier" (https://unicode.org/cldr/trac/changeset/14503) and implies that the unit identifier is unique, but it does not discuss whether the unit part on its own is unique.
Uniqueness of the unit part is nevertheless implied by:
- All units are unique among all existing data (CLDR and spec examples).
- 6.1 per Unit patterns algorithms [1].
Recommended spec updates:
- https://gist.github.com/rxaviers/39223b302264cc4028f46884403da4a0/revisions
Basically:
i. Add paragraph "Implementations can use either the <em>unit identifier</em> or its <em>unit</em> part (e.g., <code>day</code>) to identify a unit." after unit identifier definition.
ii. Clarify the existing unit examples in 6.1 per Unit patterns:
ii.a. kilometer-per-hour for the case where a form for N/D is already available.
ii.b. kilogram-per-second for the otherwise case.
Thanks
[1]: Details on why 6.1 per Unit patterns implies unit uniqueness:
- Which D (denominator) should be used if a compound form for N/D isn't available? The spec text defines how to format a compound form, which is basically "use a direct match, or generate it yourself by picking the parts individually and composing them using a certain pattern". For example, let's format throughput-megabyte-per-second. CLDR provides digital-megabyte and duration-second, but no precomputed form for megabyte per second. We need to pick a second for the D, so we pick duration-second. What if unit parts weren't guaranteed to be unique? Suppose we also had a foo-second. Which of the two would we pick? Conclusion: duplicate unit parts would make the result ambiguous here.
- Still looking at the above example, all types (throughput, digital and duration) are completely irrelevant. They are not helpful for identifying the N/D parts.
- How to name a unit id? Note that I made up throughput in the example above. CLDR doesn't have any throughput example. Nevertheless, the example is completely valid and works fine using existing CLDR data. If only the full unit identifier is accepted, the type becomes mandatory. How would a user guess the type of something that is not even documented? The unit megabyte-per-second is unambiguous on its own, but figuring out its type can be a challenge. Is it throughput? Is it bandwidth? Note that this problem is not particular to "custom" compound units. Take meter-per-second: if I use speed-meter-per-second, I get a direct match. If I use velocity-meter-per-second (a type chosen by mistake), there is no direct match, and the composed output could differ from the direct one.
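
To make the argument above concrete, here is a minimal Python sketch of a lookup keyed by the unit part alone. Everything in it is an assumption for illustration: the UNITS table, its patterns, PER_PATTERN, and the helper names index_by_unit and format_unit are hypothetical, and the data is sample data rather than actual CLDR content. It uses a direct match when one exists and otherwise composes N and D, and it rejects a hypothetical foo-second because the D could no longer be chosen deterministically.

```python
# Illustrative sample data keyed by full unit identifier ("type-unit").
# The patterns are assumptions for this sketch, not authoritative CLDR values.
UNITS = {
    "speed-kilometer-per-hour": {"pattern": "{0} km/h"},
    "digital-megabyte": {"pattern": "{0} MB"},
    "mass-kilogram": {"pattern": "{0} kg"},
    "duration-second": {"pattern": "{0} s", "per_unit_pattern": "{0}/s"},
}
PER_PATTERN = "{0}/{1}"  # generic compound "per" pattern (assumed)


def index_by_unit(units):
    """Map each unit part (e.g. "second") to its data; fail on duplicates."""
    index = {}
    for identifier, data in units.items():
        # The type is the first hyphen-separated token; the rest is the unit.
        _type, unit = identifier.split("-", 1)
        if unit in index:
            raise ValueError(f"ambiguous unit part: {unit!r}")
        index[unit] = data
    return index


def format_unit(value, unit, index):
    """Format value with a unit part such as "megabyte-per-second"."""
    if unit in index:  # direct match, e.g. kilometer-per-hour
        return index[unit]["pattern"].format(value)
    numerator, sep, denominator = unit.partition("-per-")
    if not sep:
        raise KeyError(unit)
    formatted_n = index[numerator]["pattern"].format(value)
    d = index[denominator]
    if "per_unit_pattern" in d:  # D has its own "per" pattern
        return d["per_unit_pattern"].format(formatted_n)
    # Otherwise strip the placeholder from D's pattern and compose.
    d_symbol = d["pattern"].replace("{0}", "").strip()
    return PER_PATTERN.format(formatted_n, d_symbol)
```

Note that the type (speed, digital, duration, or a made-up throughput) never takes part in the lookup; the caller passes only kilometer-per-hour or megabyte-per-second.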
@faisalopidan

old
