You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Improvement of NIP-45 Event Counts by Linear Counting
Abstract
NIP-45 Event Count10 doesn't provide the correct number of events if different events exist in different relays.
Introduction
NIP-45 Event Count enables clients to obtain the number of events matched to filters. This can be used for counting followers, the number of reposts and the number of reactions. However, NIP-45 has a problem: results may vary by relay.
For example, if relay X has events A and B, relay Y has events B and C, and we send a COUNT query with filters matches those events to relay X and Y, the both of COUNT results would be 2. This can happen in several situations: the user successfully publish an event to some relays but failed for other relays; the user publish to the set of relays (kind:30002) rather than the usual relay list (kind:10002); or the user start or stop to use some relays.
One solution is just to include event ids in the COUNT response20. However, if there are a lot of matched events, the list of ids would be large after all. If there are 1000 events and each id is 64 letters in hex, the size of the list is about 64KB. And, if we receive this from 5 relays, the total transferred bytes is about 256KB. To use first 8 letters of id like previous tag in NIP-2930 will reduce the size to one-eighth but the size is still big.
Other solution is Data vending machines (DVM, NIP-90)40. DVM allows user to ask some provider to process the user's request. The DVM job to count events is already defined50. This will be the one of the most efficient way to count events correctly because the result has only number without extra data. The provider can collect the data frequently from relays around the world and count in advance of the request. However, providers may choose to collect data at the time the request is published, then it may take some time to get the result and it will have a negative impact on the user experience. The user may need to choose which provider to trust and use before use.
Currently, we need to use REQ to count events if we want to know almost exact amount. This means the client will receives a lot of events from relays. If every single event has 350 bytes, there are 1000 events and the client request to 5 relays, the amount of data the client will receive would be 1.75 MB.
In this paper, we propose a method to solve this problem using linear counting algorithm. The results of linear counting from multiple relays can be integrated. The linear counting provides good estimated counts with a small amount of data.
Linear Counting
Linear counting60 is a probabilistic algorithm to count unique items. The core idea is to use bitmap (bitset) and hash function.
The following describes how to add an item:
calculate the hash value of the data to be added with hash function.
calculate the remainder of the hash value divided by the bitmap length.
bit in that position is set to 1.
To estimate the number of items, use following formula60:
- m * log2(1 - n/m)
m is the size of bitset and n is the number of bits set to 1.
The algorithm has :
To get the
The remainder values
The remainder values can be the same for different data, namely hash collision.
The algorithm has :
merge-able
If we want combine two results, we just need to calculate bit-or of two bit-sets.
Use linear counting in COUNT
To apply linear counting to COUNT,
id is a SHA-256 hash value of content and considered as unique, this can be used for linear counting.
Note that Proof of Work [NIP-13]
Proposal
Request
This proposal introduces these special prefixes:
size
prefix
bitset size
id bits
0
lc0'
128 bytes (1024 bits)
10
1
lc1'
256 bytes (2048 bits)
11
2
lc2'
512 bytes (4096 bits)
12
3
lc3'
1024 bytes (8192 bits)
13
4
lc4'
2048 bytes (16384 bits)
14
5
lc5'
4096 bytes (32768 bits)
15
6
lc6'
8192 bytes (65536 bits)
16
"size": values for request format B.
"prefix": values for request format A.
"bitset size": size of bitsets
"id bits": last n bits are used for bitset
If the result bitset is almost full and the error is considered large or willing to know more accurate number, client CAN retry with larger size (refer "Accuracy" section for more detail).
If server doesn't support this linear counting, it is expected to ignore the linear counting parameter and respond with the original NIP-45 response.
If there are no matching events, the server SHOULD omit "linear_counting" field from response and set "count" to 0. The client can interpret it as either no data or the client received data with all bits zero (unset).
Consideration
Bitset size
The client can choose the bitset size according to the expected number of events.
For example, if the client expects the number of events to be less than 100, the client can choose size=0. This could be suitable for the case of the number of reposts and the number of reactions.
If the client expects the number of events to be less than 2000, the client can choose size 1. This can be used for the case of the number of followers.
Of course, the client can choose a larger size if the client wants more accurate results.
Transferred data size
Base64 encoded data will be 33% bigger than original size.
Original size
Base64 size
128 bytes (1024 bits)
171 chars
256 bytes (2048 bits)
341 chars
512 bytes (4096 bits)
680 chars
1024 bytes (8192 bits)
1362 chars
The larger the bitset size, the more accurate the result will be. However, the larger the bitset size, the more data will be transferred.
Database load
The database is needed to return ids rather than just COUNT(id). The traffic between the relay and the database will increase and database load will also increase a bit.
To reduce the amount of traffic, we can write a database query that take the last few bytes of the id instead of the entire id. For example, a client requests COUNT where size is 2, only the last 12 bits is needed to calculate.
Accuracy (error and bitset size)
If the error is considered to be large, the client can request with a larger size again.
According to the linear counting paper,
Non suitable use-cases
NIP-2570 allows user to use emojis in their reactions. It is hard to count emojis with this mechanism. If we simply create bitset for each emoji, the size of the result will be large and it is not desirable.
The relay might tell a lie about the count by setting/un-setting some bits in the bitmap. The client can verify the result by comparing the COUNT result and the number of REQ results. It is also possible to compare a linear counting bitmap generated by the client from the REQ results with the bitmap generated by relay. If the bits in the client's bitmap were much different compared to the relay's bitmap, the server seems to lie. In that case, The client can choose to ignore the result from the relay for this use-case and inform the user of it. This lie can be verifiable but the problem is that it needs data transferring. However, there is more simple way to tell a lie -- just creating such events with random keys.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
/* Visit https://aka.ms/tsconfig to read more about this file */
/* Projects */
// "incremental": true, /* Save .tsbuildinfo files to allow for incremental compilation of projects. */
// "composite": true, /* Enable constraints that allow a TypeScript project to be used with project references. */
// "tsBuildInfoFile": "./.tsbuildinfo", /* Specify the path to .tsbuildinfo incremental compilation file. */
// "disableSourceOfProjectReferenceRedirect": true, /* Disable preferring source files instead of declaration files when referencing composite projects. */
// "disableSolutionSearching": true, /* Opt a project out of multi-project reference checking when editing. */
// "disableReferencedProjectLoad": true, /* Reduce the number of projects loaded automatically by TypeScript. */
/* Language and Environment */
"target": "esnext", /* Set the JavaScript language version for emitted JavaScript and include compatible library declarations. */
// "lib": [], /* Specify a set of bundled library declaration files that describe the target runtime environment. */
// "jsx": "preserve", /* Specify what JSX code is generated. */
// "experimentalDecorators": true, /* Enable experimental support for legacy experimental decorators. */
// "emitDecoratorMetadata": true, /* Emit design-type metadata for decorated declarations in source files. */
// "jsxFactory": "", /* Specify the JSX factory function used when targeting React JSX emit, e.g. 'React.createElement' or 'h'. */
// "jsxFragmentFactory": "", /* Specify the JSX Fragment reference used for fragments when targeting React JSX emit e.g. 'React.Fragment' or 'Fragment'. */
// "jsxImportSource": "", /* Specify module specifier used to import the JSX factory functions when using 'jsx: react-jsx*'. */
// "reactNamespace": "", /* Specify the object invoked for 'createElement'. This only applies when targeting 'react' JSX emit. */
// "noLib": true, /* Disable including any library files, including the default lib.d.ts. */
// "useDefineForClassFields": true, /* Emit ECMAScript-standard-compliant class fields. */
// "moduleDetection": "auto", /* Control what method is used to detect module-format JS files. */
/* Modules */
"module": "commonjs", /* Specify what module code is generated. */
// "rootDir": "./", /* Specify the root folder within your source files. */
// "moduleResolution": "node10", /* Specify how TypeScript looks up a file from a given module specifier. */
// "baseUrl": "./", /* Specify the base directory to resolve non-relative module names. */
// "paths": {}, /* Specify a set of entries that re-map imports to additional lookup locations. */
// "rootDirs": [], /* Allow multiple folders to be treated as one when resolving modules. */
// "typeRoots": [], /* Specify multiple folders that act like './node_modules/@types'. */
// "types": [], /* Specify type package names to be included without being referenced in a source file. */
// "moduleSuffixes": [], /* List of file name suffixes to search when resolving a module. */
// "allowImportingTsExtensions": true, /* Allow imports to include TypeScript file extensions. Requires '--moduleResolution bundler' and either '--noEmit' or '--emitDeclarationOnly' to be set. */
// "resolvePackageJsonExports": true, /* Use the package.json 'exports' field when resolving package imports. */
// "resolvePackageJsonImports": true, /* Use the package.json 'imports' field when resolving imports. */
// "customConditions": [], /* Conditions to set in addition to the resolver-specific defaults when resolving imports. */
// "maxNodeModuleJsDepth": 1, /* Specify the maximum folder depth used for checking JavaScript files from 'node_modules'. Only applicable with 'allowJs'. */
/* Emit */
// "declaration": true, /* Generate .d.ts files from TypeScript and JavaScript files in your project. */
// "declarationMap": true, /* Create sourcemaps for d.ts files. */
// "emitDeclarationOnly": true, /* Only output d.ts files and not JavaScript files. */
// "inlineSourceMap": true, /* Include sourcemap files inside the emitted JavaScript. */
// "outFile": "./", /* Specify a file that bundles all outputs into one JavaScript file. If 'declaration' is true, also designates a file that bundles all .d.ts output. */
// "outDir": "./", /* Specify an output folder for all emitted files. */
// "declarationDir": "./", /* Specify the output directory for generated declaration files. */
// "preserveValueImports": true, /* Preserve unused imported values in the JavaScript output that would otherwise be removed. */
/* Interop Constraints */
// "isolatedModules": true, /* Ensure that each file can be safely transpiled without relying on other imports. */
// "verbatimModuleSyntax": true, /* Do not transform or elide any imports or exports not marked as type-only, ensuring they are written in the output file's format based on the 'module' setting. */
// "allowSyntheticDefaultImports": true, /* Allow 'import x from y' when a module doesn't have a default export. */
"esModuleInterop": true, /* Emit additional JavaScript to ease support for importing CommonJS modules. This enables 'allowSyntheticDefaultImports' for type compatibility. */
// "preserveSymlinks": true, /* Disable resolving symlinks to their realpath. This correlates to the same flag in node. */
"forceConsistentCasingInFileNames": true, /* Ensure that casing is correct in imports. */
/* Type Checking */
"strict": true, /* Enable all strict type-checking options. */
// "noImplicitAny": true, /* Enable error reporting for expressions and declarations with an implied 'any' type. */
// "strictNullChecks": true, /* When type checking, take into account 'null' and 'undefined'. */
// "strictFunctionTypes": true, /* When assigning functions, check to ensure parameters and the return values are subtype-compatible. */
// "strictBindCallApply": true, /* Check that the arguments for 'bind', 'call', and 'apply' methods match the original function. */
// "strictPropertyInitialization": true, /* Check for class properties that are declared but not set in the constructor. */
// "noImplicitThis": true, /* Enable error reporting when 'this' is given the type 'any'. */
// "useUnknownInCatchVariables": true, /* Default catch clause variables as 'unknown' instead of 'any'. */