elichai/gist:19f23049108b4c66624c032c0e13facb

## gistfile1.txt
[2021-10-13 02:26:15] <real_or_random> sipa gmaxwell andytoshi: it would be nice to hear your opinion on https://github.com/bitcoin-core/secp256k1/pull/988#issuecomment-938650194 ... also wrt rust-secp256k1 which I believe is the main user user of  _context_no_precomp
[2021-10-13 17:45:02] <elichai2> Hmm I really want `secp256k1_context_no_precomp` to work for signing mostly for the ergonomics of rust-secp in a fully no-std mode
[2021-10-13 17:45:08] <elichai2> But I can see the arguments against it
[2021-10-13 17:45:42] <elichai2> at the very least the name should change, although that's a breaking change
[2021-10-13 17:45:44] — elichai2 sigh
[2021-10-13 17:58:42] <real_or_random> well, let's say that's an opportunity then to change the semantics ...
[2021-10-13 20:51:02] <gmaxwell> real_or_random: I don't really understand the question on the no-precomp.  The change to static tables should make the functions that need the additional setup check for that, rather than the tables... so that they don't run while not setup.  Having a messed up half formed context that couldn't be randomized wouldn't be particularly desirable.
[2021-10-13 20:52:24] <gmaxwell> real_or_random: or stated differently, it's perfectly possible to complete the static table change without making any change to the API at all. So that should be done.  If there is some reason to change the api, thats a seperate consideration.
[2021-10-13 21:01:14] <gmaxwell> oh. re-initilizers. I'm not sure whats required there to satisfy C++.  I think the issue is that {0} is supposed to always work to zero init any type in C, but in C++ they have {} for this purposes (which you can't use in C).  To add fun, both clang and GCC after 4.x have frequently had bugs in the warnings.
[2021-10-13 21:02:45] <cfields> (not in a few min of trivial hacking with no rearchitecting, anyway)
[2021-10-14 00:09:10] <real_or_random> gmaxwell: re no-precomp: so you're saying that you don't understand the question but you gave an answer
[2021-10-14 00:10:35] <real_or_random> so you're in favor of my second bullet point "Effectively retain the distinction between signing contexts and non-signing contexts."
[2021-10-14 00:20:37] <gmaxwell> yeah well I don't really understand why it's a question. :) which maybe means that I didn't understand the question.
[2021-10-14 00:21:15] <gmaxwell> maintain the api. if there is an argument for changing it, thats okay, but make it seperately from the precomp change and don't hold the precomp change up for it.
[2021-10-14 00:22:34] <gmaxwell> on that subject, will someone at least throw in the code to not build openssl when opensslv3 is installed? I think someone dropped in the relevant configure test.  That's not exclusive with going and removing openssl testing entirely, but if it can go in faster thats good-- it's not good to leave the library in a state where the tests fail for some users.
[2021-10-14 00:23:10] <gmaxwell> (it's a kind of indirect signal that the library is under-maintained that I'd advise people to look for when they can't independantly assess the quality of the library code they're considering using)
[2021-10-14 00:29:26] <real_or_random> gmaxwell: yeah I think it's a question because the similar PR for the verification tables changed the API. and maybe that was what people expect from this PR too.
[2021-10-14 00:29:40] <real_or_random> but I agree it's just more conversative not to change the API
[2021-10-14 00:31:24] <gmaxwell> oh how did the verification table change change the API?
[2021-10-14 00:33:54] <real_or_random> gmaxwell: the verification flag is irrelevant now. it's still there but ignored
[2021-10-14 00:35:08] <real_or_random> see https://github.com/bitcoin-core/secp256k1/pull/956/files#diff-e207f09eee090dbd881abf15bcdfa9f7e59453c2b542755ab11985e802839b4e for example
[2021-10-14 00:35:14] <gmaxwell> Hm. Not sure that this is ideal. But it makes more sense that you've asked!
[2021-10-14 00:36:13] <real_or_random> yeah I'm not sure either. my take was not too to care to much because I still think that after making the tables static, we should reconsider the entire context API
[2021-10-14 00:36:39] <real_or_random> so I think it's an intermediate thing, and it looked "ok enough"
[2021-10-14 00:37:04] <real_or_random> (what I wrote in https://github.com/bitcoin-core/secp256k1/pull/956#issuecomment-874607474)
[2021-10-14 00:37:15] <gmaxwell> There absolutely should be a way to build the library without support for verifying (and/or without support for signing)-- but perhaps ultimately that would best be accomplished with defines, and just making it so it won't compile/link if you're trying to call functions that are excluded.
[2021-10-14 00:37:48] <real_or_random> right. but our conclusion was that we want the compiler to handle this
[2021-10-14 00:37:52] <sipa> there are roughly 3 pieces of "context ish" data that make sense going forward: randomization (which needs r/w or threadlocal context), perhaps cpu autodetection (which would be r/o, and can be optional), and stash space for batch stuff (only for those calls)
[2021-10-14 00:37:54] <real_or_random> (compiler/linker
[2021-10-14 00:38:38] <sipa> so with all the precomputation gone, do we envision all signing functions e.g. to take a mutable context?
[2021-10-14 00:38:48] <real_or_random> sipa: how is cpu autodetection a context thing?
[2021-10-14 00:39:04] <gmaxwell> Because it can be expensive to run, so you want it run once.
[2021-10-14 00:39:09] <sipa> yeah
[2021-10-14 00:39:27] <sipa> it may involve running benchmarks to figure out which implementation is fastest
[2021-10-14 00:39:37] <sipa> but even if not, on some platforms it involves system calls
[2021-10-14 00:39:39] <gmaxwell> also on arm you're supposed to trawl through /proc/cpuinfo ... because there aren't standard instructions to detect cpu features which are available to userspace.
[2021-10-14 00:39:56] <real_or_random> sipa: https://github.com/bitcoin-core/secp256k1/issues/780#issuecomment-670531851 you suggested here that we could have _sign_and_randomize and I think that's a good idea
[2021-10-14 00:40:05] <gmaxwell> (which requries opening a file, which means you need to have file descriptors available)
[2021-10-14 00:40:17] <real_or_random> cpu: ok makes sense
[2021-10-14 00:40:30] <sipa> real_or_random: an alternative is having a FLAG on creation that marks the context as mutable
[2021-10-14 00:41:21] <sipa> so signing could automatically update rng data on such
[2021-10-14 00:41:51] <real_or_random> a 4th thing is the callbacks but here I believe they're not essential... (I said in the same issue that the ability to set them at runtime is not important. we anyway have a way to replace them at build time and this should be enough)
[2021-10-14 00:41:59] <gmaxwell> newly rerandomizing every signing may blow away a lot of the randomization benefits. The problem is that if you can read a side channel off the nonce generation, you can also do a similar read for the randomization.  The existing randomization is cumulative, so any unpredictibility gets amplified every time its used.
[2021-10-14 00:42:41] <real_or_random> gmaxwell: do you remember the discussion in the issue? the idea was to use a single bit to update the randomization after signing
[2021-10-14 00:42:53] <real_or_random> maybe we could do the same for key gen
[2021-10-14 00:43:46] <sipa> yeah, that single bit update is what i was implicitly thinking.of
[2021-10-14 00:43:50] <gmaxwell> real_or_random: you can use the public output to update the context state after each operation, yes. But that doesn't answer the api question-- because you need a place to store the state.
[2021-10-14 00:44:30] <real_or_random> sure, it's just a different question
[2021-10-14 00:44:32] <gmaxwell> sorry, I assumed sign_and_randomize was proposing an interface that didn't have a randomized state but just took a random input and randomized right before use.
[2021-10-14 00:45:26] <gmaxwell> I think with the static context it would be reasonable to drop contexts entirely except for the things that need mutable data.  ... except for the runtime autodetection, sadly.
[2021-10-14 00:45:40] <real_or_random> in general, we need to see which of the 3 (or 4) pieces we want to have in a generic context object or which could be separate  objects
[2021-10-14 00:46:10] <real_or_random> right, so I think CPU stuff would be more similar to the current context, where scratch spaces could be separate objects
[2021-10-14 00:46:21] <gmaxwell> I think e.g. for batch it makes total sense for batches to just take a scratch buffer as an argument. That seems like the most natural interface to me, in fact.
[2021-10-14 00:46:39] <gmaxwell> And I think it would be reasonable to make signing have a mutable local randomization context. Sure.
[2021-10-14 00:46:43] <real_or_random> and also the "ecmult randomization state" could be a separate object
[2021-10-14 00:47:04] <gmaxwell> And I suppose at least runtime detection is const.
[2021-10-14 00:48:12] <real_or_random> yeah that all makes sense.
[2021-10-14 00:50:11] <real_or_random> Then CPU detection is the only case where I'm not sure. I think this belongs to a "general-purpose" context (as we have it now)
[2021-10-14 00:50:41] <gmaxwell> Indeed, but at least its const.
[2021-10-14 00:50:47] <real_or_random> But we don't have it right now... I don't know. It would be bad to remove the contexts now just to reintroduce them in a later version for CPU detection
[2021-10-14 00:51:29] <gmaxwell> it doesn't help that we don't have any runtime cpu stuff implemented. It may turn out that the only reasonable way to do that is to effectively compile the whole library N times, and then just have the api functions just dispatching to the real version.
[2021-10-14 00:52:05] <gmaxwell> It'll almost certantly be a losing proposition to make every call to fe_add fe_mul indirect through a function pointer.
[2021-10-14 00:52:14] <real_or_random> hehe yeah
[2021-10-14 00:52:19] <gmaxwell> since it'll kill the heavy inlining that the compiler currently does.
[2021-10-14 00:52:54] <real_or_random> given that this is const, would it be absurd to have a global/singleton even?
[2021-10-14 00:53:18] <gmaxwell> geee. thats really considered bad form in libraries.
[2021-10-14 00:54:13] <gmaxwell> for good reason. The issue is this, say you have a system copy of libsecp256k1.  Your app is using it. Great.  Now you call into some other library.. which calls into some other library which then needs its own copy of libsecp256k1. And it calls the init functions, and now you've created a race condition.
[2021-10-14 00:54:14] <real_or_random> yeah that's what my conscience tells me immediately.
[2021-10-14 00:54:33] <real_or_random> that's a good example
[2021-10-14 00:54:37] <gmaxwell> and potentially the app doesn't know about any libsecp256k1 use at all, and the libraries have no knowedlge of each other.
[2021-10-14 00:54:40] <sipa> we may be able to get away with duplicating at the ge/scalar level
[2021-10-14 00:54:47] <sipa> if not, maybe at the ecmult level
[2021-10-14 00:56:06] <gmaxwell> sipa: yeah, unfortunately I think we do fe/scalar opersations that aren't strictly encapsulated.   ecmult level is helpful too because it'll be compatble with vectorization.
[2021-10-14 00:56:18] <real_or_random> it's all speculation at the moment
[2021-10-14 00:57:47] <gmaxwell> There is also an alternative of just compiling N ways and having the caller dl open the right one... but it's ugly where e.g. callers might call into the library from constructors or whatever.
[2021-10-14 00:58:39] <sipa> i'd rather have a bunch of macro logic around every function which results in it being duplicated N times, once for each arch
[2021-10-14 00:59:12] <sipa> and then compiling everything N times, but doing dispatch at library entrance manually
[2021-10-14 00:59:35] <sipa> relying on dl feels just... gross
[2021-10-14 00:59:35] <gmaxwell> ::nods::
[2021-10-14 01:00:23] <sipa> an advantage of such an approach is that it may turn out to be flexibly granulated
[2021-10-14 01:00:30] <gmaxwell> right the API could just be wrapper functions that simply call into an N-way recompiled copy of the library. The compiler will even remove the indirection if there is only one possiblity.
[2021-10-14 01:00:44] <real_or_random> that sounds reasonable
[2021-10-14 01:01:03] <real_or_random> I wonder how other libs deal with this
[2021-10-14 01:01:50] <real_or_random> it's certainly a rather advanced thing but I guess we're not the first project in the world to think about this
[2021-10-14 01:01:55] <gmaxwell> in video codecs, the way it's handled is that there is a table of function pointers set at setup. And the functions are sized large enough that its reasonable to dispatch through a function pointer.
[2021-10-14 01:02:09] <gmaxwell> real_or_random: my comments above aren't based on speculation but expirence.
[2021-10-14 01:02:57] <real_or_random> ok yes. I meant we can only speculate on where we'll have CPU specific code in the future (if at all)
[2021-10-14 01:03:02] <gmaxwell> For some things, they're handled by compiling multiple versions with -msse etc into seperate libraries and then there is a way to get the linker to open the right one.  This is how libspeex is done on debian/ubuntu, for example.
[2021-10-14 01:03:06] <gmaxwell> real_or_random: it's already written
[2021-10-14 01:03:12] <gmaxwell> and it's a pretty decent speedup.
[2021-10-14 01:03:15] <real_or_random> the ASM?
[2021-10-14 01:03:33] <gmaxwell> https://github.com/bitcoin-core/secp256k1/pull/967
[2021-10-14 01:03:38] <gmaxwell> has haswell+ specific code
[2021-10-14 01:04:06] <gmaxwell> which is commen enough to be very worthwhile, but not so common that it can just be a default without runtime detection. :(
[2021-10-14 01:04:16] <real_or_random> ok idk, I didn't look at it given it says WIP WIP WIP
[2021-10-14 01:04:21] <real_or_random> indeed
[2021-10-14 01:04:32] <gmaxwell> yeah, thats why I told you. :P
[2021-10-14 01:04:40] <real_or_random> thanks :P
[2021-10-14 01:04:53] <gmaxwell> I assume part of the reason it's WIP WIP WIP is specifically because of the open question of how the @#$@# to integrate it.
[2021-10-14 01:05:44] <real_or_random> first convince CI, then you can ask humans to look your code :PPP
[2021-10-14 01:06:28] <real_or_random> well but ok, so that means that CPU specific code is on the horizon
[2021-10-14 01:06:46] <real_or_random> which is a good argument in favor of not just removing contexts now
[2021-10-14 01:07:02] <gmaxwell> stuff like simd is more speculative because we don't know for sure that it would actually be faster (though some ed25519 libraries have seen substantial speedups from it).
[2021-10-14 01:07:17] <gmaxwell> but at least the bmi2/adcx stuff is concrete, the code exists and works.
[2021-10-14 01:07:23] <gmaxwell> and is a nice speedup.
[2021-10-14 01:07:34] <gmaxwell> (20% over stock? IIRC?)
[2021-10-14 01:08:05] <cfields> re function dispatch, couldn't the cpu check function instead return a struct full of function pointers to be called to avoid the indirection?
[2021-10-14 01:08:48] <real_or_random> it could but isn't that  just pushing the indirection to the user?
[2021-10-14 01:09:45] <real_or_random> and then you really need LTO to optimize the call in case the compiler can figure out that in fact only one function can be called
[2021-10-14 01:10:08] <gmaxwell> maybe the word indirection was confusing?  Any time you make a function call through a function pointer you stall the pipeline. It's not as bad as it used to be but it's still obviously slow, esp compared to code that was previously inline.. and in some cases like a fe_add is litterally just doign 5 interger adds in the function.
[2021-10-14 01:10:18] <gmaxwell> real_or_random: it can't optimize it away if its runtime detected. :P
[2021-10-14 01:10:27] <sipa> real_or_random: the CI issue with the optimized asm is something on macOS i haven't figured out (maybe it's acrually doing some runtime dispatch automatically)
[2021-10-14 01:11:10] <sipa> but the integration besides that is not the problem... it's just leaving selection of optimized asm a compile-time option which is useless for generic purposes, but enough to CI.it
[2021-10-14 01:11:35] <sipa> i can prioritize fixing that if there's interested in moving forward with it
[2021-10-14 01:11:38] <real_or_random> gmaxwell: I can imagine you compile with some cpu flags that imply that a specific function will be used... but ok, I see that this is hard for the compiler to see this
[2021-10-14 01:12:26] <cfields> gmaxwell: I just meant skipping adding a forwarding helper function and giving the user the pointer to the optimized impl directly after the test. but maybe that's what you had in mind anyway.
[2021-10-14 01:12:46] <gmaxwell> cfields: it is indeed.
[2021-10-14 01:12:48] <sipa> cfields: yeah, that's how that's done in practice
[2021-10-14 01:13:10] <cfields> ah, ok. don't mind me then :)
[2021-10-14 01:13:13] <sipa> function pointer is faster than a bunch of switch case functions
[2021-10-14 01:13:40] <gmaxwell> It's a bit obnoxious because these vtables are a vector for control flow attacks too, but ::shrugs::
[2021-10-14 01:13:56] <real_or_random> sipa: I think my comment about CI was mostly a joke. but  to be honest, I think we lack bandwidth on all fronts, so prioritizing things is a good idea. but then we need to agree on priorities
[2021-10-14 01:14:20] <sipa> you'd store the function pointers to the specialized functions ib the context object
[2021-10-14 01:16:11] * meshcollider changed host: meshcollid@meshcollider.jujube.ircnow.org → meshcollid@user/meshcollider
[2021-10-14 01:19:29] <gmaxwell> sipa: if the dispatch were done at the api enterance level I don't think it would matter how it was done. A switch statement there would end up well predicted and would work fine.
[2021-10-14 01:20:41] <sipa> real_or_random: i'm happy to prioritize fixing the static/precomp tables stuff first
[2021-10-14 01:20:44] <sipa> since it sounds like it could break/interfere witha  lot of.things
[2021-10-14 01:21:37] <gmaxwell> the static precomp stuff just has a lot of benefits now that it's been decided to go that way, the big delay on it happening I think was just making the decision to finally do it.
[2021-10-14 01:22:14] <real_or_random> sipa: true yeah but I wasn't suggesting any priorities.
[2021-10-14 01:22:31] <gmaxwell> should make things much more clear for embedded users, which is increasingly important with taproot. ... since they now have a choice of trying to bolt schnorr support on top of their maybe-very-yolo implementation, or switching to libsecp256k1.
[2021-10-14 01:23:04] <real_or_random> I guess I'm pointing out that the reason why many PRs don't get attention is that we lack bandwidth in general. and we can only move attention around, but it always means some things won't get enough attention
[2021-10-14 01:24:13] <gmaxwell> well not likely to change quickly. Most people think they're not qualified to contribute.
[2021-10-14 01:24:55] <sipa> would it help to have say a biweekly or monthly IRC meeting... not necessarily more than people sayihg what they're working on (if anything) and/or blocked on?
[2021-10-14 01:25:18] <real_or_random> I was discussing this with nickler a few hours ago. I think it would
[2021-10-14 01:25:32] <sipa> sometimes getting interest in working on something only needs knowing that.others care about it too
[2021-10-14 01:25:50] <real_or_random> I was thinking of a call (we're still a small crowd), an IRC meeting would be fine too
[2021-10-14 01:26:03] <sipa> call is fine by me too
[2021-10-14 01:26:12] <real_or_random> and yes, biweekly or monthly is what I had in mind
[2021-10-14 01:27:08] <real_or_random>  it would help to get people unblocked sometimes, or talk about priorities, so I believe we should give it a try
[2021-10-14 01:28:02] <real_or_random> in the long term, I'd be nice to attract some more people. @gmaxwell and you're right, people believe they're not qualified
[2021-10-14 01:28:27] <real_or_random> because they don't have a background in crypto. but often all we need is a careful C programmer
[2021-10-14 01:28:38] <gmaxwell> I know just because I've tried. E.g. tried pretty hard to get bluematt to work on the library, since he was (and continues to be) pretty burned out by bitcoin core process issues etc.
[2021-10-14 01:29:36] <gmaxwell> the 'crypto' is actually pretty easy in any case, too.  ( I mean, it's not like people are writing simulation proofs or whatever just to develop in libsecp256k1 ).
[2021-10-14 01:31:25] <real_or_random> indeed. and okay, there's an aspect of having experience in crypto engineering but 1) that's nothing that people need to be afraid of and 2) a lot of tasks won't even need this experience
[2021-10-14 01:31:55] <gmaxwell> right.
[2021-10-14 01:32:42] <sipa> right, indeed
[2021-10-14 01:33:02] <sipa> like we've had this open issue about writing example/demo code, that elichai2 has been working on here and there
[2021-10-14 01:34:27] <real_or_random> let me try to do some advertising here tomorrow, it won't hurt :D
[2021-10-14 01:34:48] <gmaxwell> heh. it's hard to write code that does something useful without running into some platform specific gotacha.  Like someone could make a little key generator utility that generates a private key and spits out a taproot address.... probably just 50 lines of code over calling the library. Except, woops, you need a random number.
[2021-10-14 01:38:12] <sipa> still a lot more people who can do that than just those who consider themselves worthy of writing cryptographic code
[2021-10-14 01:41:28] <real_or_random> yep. also small things can help like a "good first issue" tag or I don't know
[2021-10-14 01:43:03] <gmaxwell> I'm dubious at encouraging drivebys though.
[2021-10-14 01:43:26] <gmaxwell> The project doesn't lack for people who swing by with no investment and suggest some changes and then vanish. :P
[2021-10-14 01:43:42] <real_or_random> right, but it's hard to target the "right" people ^^
[2021-10-14 01:44:19] <real_or_random> I guess in the end you can try to encourage people and hope that some of them will stay
[2021-10-14 03:14:03] → roconnor joined (~roconnor@host-45-58-217-8.dyn.295.ca)
[2021-10-14 03:14:20] <roconnor> What's the race condition with the copies of libsecp256k1?
[2021-10-14 03:16:41] <gmaxwell> roconnor: real_or_random: was suggesting creating a global context object for run-once setup information.
[2021-10-14 03:17:14] <gmaxwell> roconnor: an I was pointing out that if an application called libraries which called libsecp, they'd potentially race in setting up that global data.
[2021-10-14 03:17:42] <roconnor> Yes, but with a global call_once lock, and the cpu detection ... er maybe being idempotent, I don't immediately see the problem.
[2021-10-14 03:17:57] <roconnor> But I don't really understand, like linkers, or computers and stuff.
[2021-10-14 03:18:24] <sipa> if libsecp is thread-aware that's not a problem, but that brings in dependencies on thread stuffz
[2021-10-14 03:18:55] <sipa> I guess C11 has that natively
[2021-10-14 03:19:08] <roconnor> treads.h is optional.
[2021-10-14 03:19:21] — sipa treads lightly
[2021-10-14 03:20:20] <gmaxwell> roconnor: that could potentially work, but it would depend on the specifics of the implementation of that "global call_once lock" -- which is a non-portable thing.
[2021-10-14 03:20:59] <gmaxwell> If we were willing to use system specific stuff there are all kinds of extra tools, like thread local storage.
[2021-10-14 03:21:07] <roconnor> I mean, it is standard in C11, but yes I get it.
[2021-10-14 03:21:29] <roconnor> OTOH, didn't Lamport write a semaphore from nothing somehow?
[2021-10-14 03:22:07] <sipa> you really don't want a spinlock here :p
[2021-10-14 03:22:25] <roconnor> did he use a spinlock really?
[2021-10-14 03:22:40] — sipa googlesd
[2021-10-14 03:22:49] <roconnor> ` while (Entering[j]) { /* nothing */ }`
[2021-10-14 03:23:00] <gmaxwell> I don't think the language gives you strong enough guarentees that you can write this stuff from nothing in any case. Like the language (well prior to C11) makes absolutely no promises about the synchronization of memory across multiple threads.
[2021-10-14 03:23:02] <roconnor> ... looks kinda suspicious ...
[2021-10-14 03:23:32] <gmaxwell> like, I think it would be totally valid to have your C89 machine have threads that communicated over a blockchain or something. :P
[2021-10-14 03:23:46] <sipa> roconnor: i guess "/* nothing */" is a special keyword in the language used :p
[2021-10-14 03:24:09] <sipa> gmaxwell: DONT GIVE THEM IDEAS
[2021-10-14 03:25:11] <roconnor> sipa: I think it is FORTRAN.
[2021-10-14 03:25:40] <gmaxwell> Even peppering your code with volitile doesn't give strong enough guarentees because it doesn't necessarily sync across cores (and e.g. on PPC it won't, you need hw specific sync instructions -- x86 has an extremely strong memory model, at a high hardware / communications expense).
[2021-10-14 03:25:59] <sipa> roconnor: i think it's assembly pseudocode, using a C-lookalike language
[2021-10-14 03:26:13] <roconnor> That's true.  The multi-core memory model is something ... that doesn't even exist.
[2021-10-14 03:26:20] <sipa> i guess at some point in history those two were the same
[2021-10-14 03:28:33] <sipa> C11 threads.h also has thread local storage, mutexes, and condition variables, FWIW
[2021-10-14 03:28:56] <gmaxwell> once you have any kind of memory model it's pretty much reasonable to provide all the things. :P
[2021-10-14 03:29:12] <sipa> no read-write lock!
[2021-10-14 03:29:30] <sipa> (though you can implement that on top of the existing primitives afaik)\
[2021-10-14 03:30:20] <gmaxwell> huh, well people seem to rarely use reader writer locks for whatever reason.
[2021-10-14 03:30:28] <sipa> indeed
[2021-10-14 03:30:30] <gmaxwell> (maybe I'm just prone to overuse them)
[2021-10-14 03:31:04] <sipa> i think the C11 threads.h is pretty much pthreads, with s/pthreads_/thrd_/g applied to the API
[2021-10-14 03:31:27] <gmaxwell> pthreads as a reader/writer lock.
[2021-10-14 03:32:41] <sipa> oh, indeed
[2021-10-14 03:39:41] ⇐ roconnor quit (~roconnor@host-45-58-217-8.dyn.295.ca): Ping timeout: 245 seconds
[2021-10-14 04:36:56] ⇐ jamesob quit (uid180710@id-180710.helmsley.irccloud.com): Quit: Connection closed for inactivity
[2021-10-14 05:47:51] <cfields> sipa/roconnor: Linus's userspace spinlock rant is a fun read, if you haven't already: https://www.realworldtech.com/forum/?threadid=189711&curpostid=189723
[2021-10-14 05:48:31] <cfields> (the spinlock keyword above triggered that)
[2021-10-14 05:49:45] <cfields> tldr: "I repeat: do not use spinlocks in user space, unless you actually know what you're doing. And be aware that the likelihood that you know what you are doing is basically nil."
[2021-10-14 05:50:24] <gmaxwell> there are a lot of concurrency primitives that work a lot better in the kernel than in userspace, sadly. :(
[2021-10-14 05:54:54] <gmaxwell> but then again since almost no one uses reader writer locks, the fact that RCU in userspace is kinda ugly, I guess doesn't matter much. :P
[2021-10-14 05:56:27] <cfields> gmaxwell: I recently used rw locks in another project with great success. To your point about unuses, it exposed a nasty pthreads bug, though :p
[2021-10-14 05:56:49] <sipa> LD
[2021-10-14 05:56:50] <sipa> :D
[2021-10-14 05:57:22] <cfields> gmaxwell: c++17 exposes rw locks, which presumably exposed the pthreads api to real fw use for the first time.
[2021-10-14 05:58:00] <cfields> *real rw
[2021-10-14 05:58:04] <gmaxwell> cfields: lol. Fully expected.
[2021-10-14 05:58:36] <gmaxwell> (this is of course why I whine that no one uses them, otherwise why care what anyone else does? ... I need them to find the bugs. :P )
[2021-10-14 05:59:24] <cfields> (trying to hunt down that bug report now, it was a fun one)
[2021-10-14 06:01:13] <gmaxwell> But at least there is a hope for RW locks, vs just about ever data structure people call lock free is actually broken. :P
[2021-10-14 06:06:25] <cfields> gmaxwell: found it: https://sourceware.org/bugzilla/show_bug.cgi?id=23861
[2021-10-14 06:06:30] <cfields> we ran into that in real code.
[2021-10-14 06:07:05] <cfields> all good now. but was a real bummer at the time.
[2021-10-14 06:07:40] → roconnor joined (~roconnor@host-45-58-217-8.dyn.295.ca)
[2021-10-14 06:08:02] <roconnor> I thought Lamport didn't use a spin lock.
[2021-10-14 06:08:21] <roconnor> I thought making a semaphore from a spin lock was easy, and whatever Lamport did was not easy.
[2021-10-14 06:09:24] <cfields> roconnor: like I said, I have no clue about lamport's semaphor, I merely pattern matched on "spinlock". And I assumed the userspace part :)
[2021-10-14 06:09:30] <cfields> *semaphore
[2021-10-14 06:13:58] <roconnor> googling suggests that my memory of concurrency from my CS classes is kinda sketchy.
[2021-10-14 07:19:27] ⇐ kanzure quit (~kanzure@user/kanzure): Ping timeout: 240 seconds
[2021-10-14 07:21:31] → kanzure joined (~kanzure@user/kanzure)
[2021-10-14 07:45:22] → roconnor_ joined (~roconnor@host-45-58-217-8.dyn.295.ca)
[2021-10-14 07:46:17] ⇐ roconnor quit (~roconnor@host-45-58-217-8.dyn.295.ca): Ping timeout: 265 seconds
[2021-10-14 08:09:52] <fanquake> sipa: I may or may not have an opinion once I read all the backlog
[2021-10-14 08:10:00] <fanquake> I do see musl and LTO
[2021-10-14 12:55:48] <real_or_random> roconnor_: this https://en.wikipedia.org/wiki/Lamport%27s_bakery_algorithm ?
[2021-10-14 12:59:38] <real_or_random> if we're looking at C11, then atomic data structures may be more suitable for this use case https://en.cppreference.com/w/c/atomic
[2021-10-14 13:00:37] <real_or_random> it wouldn't be crazy to add have this as an optional feature (with an #ifdef)
[2021-10-14 19:03:32] <sipa> real_or_random: for initializing a global, call_once is the appropriate approach i think
[2021-10-14 19:25:18] <elichai2> about platform detection, because cpuid is only x86 then we can use a global `int` and use x86 specific code for writing/reading it (more specifically we can use a simple asm `mov` instruction to/from it which is defined to be atomic in x86)
[2021-10-14 19:26:43] <elichai2> Anyway this discussion became too long here to follow, so if someone wants to open an issue summarizing the different kind of contexts and reasons that would be great. otherwise I could try to spend some time tonight reading everything here and summarizing it
[2021-10-14 19:27:11] <elichai2> (actually this can prob be done in #780)
[2021-10-14 20:07:50] <real_or_random> yeah I think it should go to 780, would be nice if you could do it
[2021-10-14 20:08:51] ⇐ siv2r[m] quit (~siv2rmatr@2001:470:69fc:105::fed3): Quit: Bridge terminating on SIGTERM
[2021-10-14 20:12:28] → siv2r[m] joined (~siv2rmatr@2001:470:69fc:105::fed3)
[2021-10-14 20:14:10] <sipa> real_or_random, gmaxwell: a possibility is having a flag at context creation that indicates the context will only be used from a single thread; then signing operations etc can permute its randomness state without a separate sign_and_rerandomize; if secp is compiled with thread support, it could instead *always* do this, even without the flag present; that way the thread-awareness of the lib doesn't leak
[2021-10-14 20:14:17] <sipa> into the API
[2021-10-14 20:18:11] <real_or_random> call_once sounds right. there's also _Thread_local in C11
[2021-10-14 20:20:26] <sipa> hmm, using thread local storage for randomization state would be even better, i guess
[2021-10-14 20:20:45] <sipa> though iirc it comes with a significant performance cost on some platforms
[2021-10-14 20:22:05] <real_or_random> PRGs are a typical use case of thread locals