@bmorphism
Created April 25, 2024 06:25
mathematical life unfolding

So now we have two things. I really need to bring it back online, because it becomes much more interesting when you actually experience it. I'll be working on it all day today. It's a sub-organic. It's the name of the organism. I call it a sub-organic organism. And so what results, I sent you two images. One of them is how Transformer actually looks at it, which is this traversal of this back and forth, back and forth, back and forth. It's on signal right now. And then the second one is what happens when there is a singularity in this other type of process, which is a similar shape to the one that you can see with the Transformer. And so this symmetry covariance is everywhere. But does it make sense as far as why not storing the output of the model itself is useful here, other than just a conservation of context window? Not tangibly. Intuitively, it feels like you are reducing dilution of the inputs. Does that make sense? Mm-hmm. Yeah, because-- yeah, basically, that's a very good observation. You're saying that effectively, the highest signal about the user intent and the user information generated by human, hopefully, or some mix of the two interaction is in the query itself. So intuitively, it makes-- Yeah, but let me actually finish my question, because it's going to maybe be a complementary case for that. Sure. Which was that because these are hefty prompts, even in your construct where you're kind of creating for multiple inputs, I feel like there's also space for model-to-model prompting. Have you tried second-order prompting for a model to craft something? Yeah, I do. You can also put it as input. How does that work? Like, the-- Yes. --autocompleted URL of this type of [INAUDIBLE]?? Of course, of course. I call it a kernel. And the kernel is setting sort of modification. It also takes the world string. I call this string a world string, by the way. The string is a world string. And what it does, it actually takes this world string. 
And I can say, hey, in a Stephen Wolfram-like voice, with audio going, make a fable about this world, or what are the key concepts from this world, and put them into a story, that kind of stuff. And so this gives you the idea that once the string is assembled, it gives you a coordinate in the space of possible things. So the question was, what's second-order-- compression needs to take place, obviously. But the question is, what is the goal of this system? What's the ultimate nature of what we build? And I think that the idea of this is it's a basin hopping kind of thing. Basin hopping being this idea that when you have a model-- yeah, but the intuition you have-- yes, the first reason is dilution. But more importantly, if you interact with a model, any model that's generative, let's say, not just a language model, but any sort of probability distribution you can sample from that gives you new information, it can have any number of possible outputs before you query it. Before you query it, it's stuck in a state of superposition, almost, of all possibilities. Or just a distribution, right? It's a distribution of outcomes. It's a probability distribution. When you interact with it, when you query it, you sample from that distribution. You draw one particular path in the dynamical trajectory of the system. And if we start treating language models together with their users and the environment, the bigger environment they occupy, as an open dynamical system, it becomes much clearer what's going on. Because open dynamical systems-- well, why are they open? It's because there is constant exchange of information and energy with the environment that they are situated in. So these open dynamical systems allow for exploration of the-- they call it the energy landscape of the model. This notion of energy is very, very tied into many different things. 
In fact, we have no fundamental grand unified theory where we can talk about energy in a way that's linear or non-linear, but in some precise way. And so-- But just to [INAUDIBLE] the dynamical system being the world, but the model is relatively static, right? Like, it-- Well, that's-- --exchange going on with the world apart from the input field, right? Well, no, but that's precisely the thing that a lot of evaluations today miss. Like, all the evals of these models, they're very static. Like you said, they just run through this test, and that's it. But the point I'm trying to make is that-- yeah, that's a fantastic question. I like to say that when the user interacts-- or users interact with the model, it's like a train, OK? The train has two tracks. One of those tracks is-- oh, sorry, sorry, sorry, sorry. One important thing that I think I should add to the model, before we continue, is that whenever the user generation happens-- so this is the last message. And I'll send you the code so you can see it. It's public now. The last message goes to a generative model, right? And so at any given point, you can imagine it being like a collection of strings, symmetric across the middle, that came from a number of users, so some sort of locally sampled noise with some spatiotemporal dependency, let's say. And then what we do with that string, actually, we also send it to any number of other models that are available, because we have not one LLM, we have five. We have five LLMs. We have Gemini 1.5. We have Claude Opus. We have DBRX. We have Cohere Command R+. And today, we'll have some other ones. I'm adding more. So I have some random number generator, pseudo random number generator, which samples some choice from a collection and picks one of these models. At any given time, as a user, you kind of don't know which way your model will go. 
If you come into it having used language models the traditional way, you kind of expect the model kind of to respond to what you say and give you something relevant, which it does, because your message is the last and the first within the string. So it still addresses the you in priority, but with the weight of this kind of built up almost like energy structure or some sort of information clump or knot of some kind. And so what does it mean that we go from model to model randomly? Well, what do you expect to happen if that happens? Let's say you just vary the model that you send the string to randomly. - Is the model changing in the same thread? - Yeah, dynamically. So here is the state machine view. It's like user or users, some user succeeds at arriving at the last message. That message gets appended, which is added at the end, prepended at the beginning of the string. And that whole world string, a hypergraph in this guy, it's a world string, gets sent to a cogenerate function, which then randomly picks one of the five models that have sufficiently groking size and information capacity. And then the output of that gets sent only to the user that issued the original kind of like query and nobody else in the system. Kind of discard that. And then the next person does it, the next person does it, the next person does it. Like, you know, it sounds crazy, right? Because everybody's storing the responses of the model somehow. We don't. So what do you expect to happen with this like random switching of these models, the latent space of these models? But the question is like, if everyone is appending and prepending through their inputs, they are creating a, what do you call it, world? World string. World string. Yeah. So there's a level of synchronization of the state machine that like. Yes. They inherit from each other, right? Precisely. They become coupled in a sense, right? So, well, they may. They may become coupled. 
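The state machine described above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the author's actual code: the class `World`, the method `absorb`, the helper `make_stub`, the space separator, and the stub models are all hypothetical stand-ins, and only the name `cogenerate` comes from the conversation. It shows the three properties mentioned: each message is both prepended and appended to the world string, one model is picked pseudo-randomly per call, and the reply goes back to the caller without being stored.

```python
import random
from typing import Callable, Dict, Optional, Tuple

Model = Callable[[str], str]  # hypothetical interface: world string in, reply out


def make_stub(name: str) -> Model:
    """Stand-in for a real LLM call (assumption: real models are remote APIs)."""
    return lambda world: f"[{name}] reply to a world of {len(world)} chars"


class World:
    """Shared world string that accrues user messages only, never model output."""

    def __init__(self, models: Dict[str, Model], seed: Optional[int] = None):
        self.text = ""
        self.models = models
        self.rng = random.Random(seed)  # pseudo-random model selection

    def absorb(self, message: str) -> None:
        # The newest message is both first and last, so the string stays
        # symmetric across the middle. The separator is an assumption.
        if self.text:
            self.text = f"{message} {self.text} {message}"
        else:
            self.text = message

    def cogenerate(self, message: str) -> Tuple[str, str]:
        self.absorb(message)
        name = self.rng.choice(sorted(self.models))
        reply = self.models[name](self.text)
        # The reply is returned to the issuing user only; it is never
        # written back into self.text.
        return name, reply


MODELS = {n: make_stub(n) for n in ("model-a", "model-b", "model-c")}
world = World(MODELS, seed=0)
world.cogenerate("fish")
picked, answer = world.cogenerate("sheaves")
print(world.text)  # sheaves fish sheaves
```

Because only `absorb` mutates `World.text`, users are coupled through each other's inputs while each reply stays private to the user who triggered it.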
And that's what I mean by open dynamical systems, that even though we don't store the outputs of the model, the model has a very important wire in the wiring diagram, an information wire to the user. And the user then, responding to what they see on the screen, can choose to continue their inquiry. And when they continue their inquiry, in whatever direction they have, somebody else might have had an interaction in between as well. And so this opens up a lot more trajectories, basically, in terms of dynamism of the system. So yeah, you kind of have to use it synchronously, in a sense. But so what is the use case or the objective of the system you were talking about? Yeah. So. It feels like there's this kind of world noise that is being fed. What does the model then provide from that? Yeah. So the thing kind of exploded by my standards. Like, you know, we had like 30 users in the span of a week. Thirty users for something I've built is kind of a lot, you know. I've been building all these niche experiments. And so the key use case happened when I met my friend Albert, who's obsessed with similar things but also has his own world view. And I've been trying to tell him about sheaves and all that kind of stuff for a while now. And then suddenly I asked him, have you heard about this talk that Joscha Bach, whom I'll see later today, has given to the Michael Levin group about cyber animism? And he says, I haven't really seen the talk, but I was exposed to it through my interactions with cybernetics, where it came up as something that I was thinking about at the time we were pasting the transcript of it. And he saw the talk. What is your sort of conversation happening with Joscha later today? Literally one hour from now at most. It's a hackathon, a world sim, web sim, at AGI House, San Francisco. 
But I saw his talk at Chaos Computer Club in Hamburg, which was called Synthetic Intelligence, where he was mainly talking about this animist epistemology of AI and sort of collective cognition and so on. And that's exactly what this is about. So what this is about is kind of like pathfinding. It's also basin hopping. But ultimately what happened is that when Albert and I met, about a week before I came, we went to this Raja group at Harvard Medical, who were talking about their research on odor, olfactory-guided exploration by these bugs. And they had these plumes of wind simulated. And in the end, they were talking about the research of their group. And they said one of the things they're researching is collective sensing by electric fish. And so what goes on there is that the fish actually sends out a-- it has an electric organ discharge, EOD. And this organ basically emits a frequency of a kind. It emits a certain kind of signal. So for instance, a bigger fish, in order to maintain a hierarchy, emits a lower frequency, whereas a smaller fish emits a higher frequency. And so that prevents needless conflicts whose resolution is already in a Nash equilibrium or some sort of sticky equilibrium, correlated equilibrium maybe, where it makes no sense to fight. And so the big fish just signals that through the frequency. Each fish has like a reafference frequency. Reafference means it's a frequency that the fish itself uses to identify itself and becomes desensitized to. So other fish sense that, but the fish itself doesn't. It's quite trippy to be able to generate an electric field and then sense it at the same time. And then there's a complex series of codes that emerge between different kinds of activity. There's a predator or there's something else. 
You can also imagine adversarial electric fish trying to mimic the electric frequency of the fish itself to get closer and then eat it. Somebody actually recently asked me, like, Barton, how do you define life? Because I talk about mathematical life. What we're constructing here is mathematical life, or animism, whatever. I said, like, people often talk about entropy production rates, right? So the entropy rises around the system, goes down inside of it. People talk about non-equilibrium, like, rigidity breaking. But I think what people frequently miss is that there isn't really anything alive that doesn't have a basin of predation, a food chain kind of situation, where everything is eating everything, right? So eat or be eaten kind of situation, but all the way around. And so the way we know if this thing is alive is it starts feeding on something. And what it feeds on is information, right? It's information as energy converted into some sort of structure. So the data becomes information. It's like human kind of contribution to the system, or charging it almost. And so I sent you another image. I'm not sure if you saw it. But this is like a hastily drawn reminder of what we just discussed. So the users are the dotted lines, models are colored circles. And the string itself grows. So now a couple of thought experiments might help answer the question about xenocognition. How do you avoid state bloats in a system like this? Oh, yeah, for sure. So that's one other thing. To finish the fish analogy. Its goal is not to become this open-ended learner that everybody likes to imagine, or some AGI thing that goes on forever. It's simply to find shared metaphors, shared meanings, shared encodings. Yeah, that's that metaphorization. And again, we have talked about it in the past. But that metaphorization is some kind of a compression that avoids the state bloat, in my view. 
And so there needs to be some kind of semiotic enrichment rather than-- not rather than, but as a complement to the concatenation of inputs. Yes, precisely. And that's the goal. And so the goal was, when Albert and I met in person after using this for a week in this new modality, which really is so dynamic, it kind of throws-- like, wow, huh. But when he said fish and I said fish, it's not just the word itself that's new. It's the old word. But its meaning, its contextual information-bearing kind of usage is radically different for Albert and other users of the system that have these deep contexts that they can now reference by one word. And so the thought experiment then becomes, OK, let's say the models go away. Let's say the government in California, Colorado, they have these laws, or European Union, they shut them down. And then suddenly we don't have the models. Well, the users who were using them around that moment and developed these appreciation for fish are still appreciating fish. There are still these memes that are kind of like inside of our socium or our collective sensemaking. And then when the models come back, even if we lose the string, we will be able to find our way back to it, perhaps more efficiently this time, by pasting directly to electric fish sensing reafference frequency, kind of like word that we learned as some term. It's the equivalent of replacing a book by its index, or paper by its abstract and citations. And so the point is then to see these strings as traces of an ongoing covariance structure. Another interesting point, if the users go away as well, but the model and the string remain and keeps going through some sort of automated process, will it ever form new abstract concepts perhaps? That's the whole thing about language, right? Yeah, yeah. What would force it to form concepts is if it's situated into an environment where it senses something and it performs like a new kind of compression for survival purposes. 
And there's a lot of research. Thomas F. Varley has released a new paper. He's the guy who wrote about decomposing past, present and future. He now has a new paper about general information decomposition and he talks about synergistic information. So there's this lattice structure, you can imagine, that goes between redundant information, where everybody has a little bit of overlap and has the same concept, and synergistic information, which happens to be the case in nature when there are viral pressures. And so perhaps one day we'll have an agent that does continue the string, at least for a little bit, subject to some survival pressures. But then the question is, what if both are gone? What if the users are gone? And what if the models are gone as well? But all you have is a string, right? And then this future civilization of some kind tries to parse it, tries to understand this. So this is where I'm a bit underwhelmed by, and maybe that's just a metaphor in how you're deploying it, but also it's the state of things currently, which is that the string as the data structure, there is no, I mean, maybe in a string you can express relationality, I don't know, but there's no relationship. So what if the data structure is a string? And what if the model is a string? I'll send you the spoiler, I guess. The spoiler is that-- where does he say-- yeah, so how does he describe it? Often when we think of solving the Einstein equations, we think of defining initial data on a space-like hypersurface, a Cauchy surface or instantaneous snapshot, and then evolving it forwards in time, because we have no choice. Time reversal symmetry is broken. But general covariance means that it is not the only way to do it. 
Since general relativity does not ultimately distinguish space and time, we could equally have defined our initial data on a time-like hypersurface and evolved it sideways through space, or any mixture of the two. It can even evolve in multiple time directions simultaneously. The conventional Turing model of computation assumes a global data structure, the TM head and tape state, which is kind of like a string, which then evolves via a sequence of stepwise applications of the TM machine transition functions, akin to a sequence of space-like hypersurfaces evolving forwards through time. What if instead we knew only a small part of the data structure, like one cell of the tape, but knew its complete evolution through time, like beginning to end, like let's say time as a phase-space phenomenon in a quantum system, then we could infer a lot about the rest of the computation. Indeed, in the case of a Turing machine, we can infer an entire causal diamond. But what is this operation? It's certainly not a traditional Turing computation, since it's moving sideways through space, rather than forwards through time. I claim by analogy to general relativity that it belongs to a much more general class of operations, so-called covariant computations. One could even go further and consider a non-deterministic computation and ask if I only knew the evolution of a single non-deterministic Turing machine branch, what could I infer about its neighboring parallel branches? In this formalism, that's a covariant computation, too. So basically, what does it mean? Oh, yeah, oh, yeah, oh, yeah. The final is that there is then a strong monoidal functor, transformation preserving some structure, mapping the state perspective of a given computation to its causal perspective. And this functor permits a class of deformations that constitute the analog of relativistic gauge transformations. Can't wait to show more soon. 
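A toy analogue of "evolving sideways through space" can be written with an elementary cellular automaton. This is only an illustrative sketch under assumptions that are not in the quoted text: it uses Rule 90 instead of a Turing machine, periodic boundaries, and the full time histories of two adjacent cells rather than one, and the function names are hypothetical. Rule 90 sets each cell to the XOR of its two neighbours, so the update can be solved for the right-hand neighbour, which lets known columns be extended in space. Each step sideways loses one time step, so the inferred region is a triangle in spacetime.

```python
import random
from typing import List


def rule90_step(row: List[int]) -> List[int]:
    """Usual forward-in-time update: new[i] = old[i-1] XOR old[i+1]."""
    n = len(row)
    return [row[(i - 1) % n] ^ row[(i + 1) % n] for i in range(n)]


def evolve_forward(row: List[int], steps: int) -> List[List[int]]:
    """Return the full spacetime history, one row per time step."""
    history = [row]
    for _ in range(steps):
        history.append(rule90_step(history[-1]))
    return history


def evolve_sideways(col_a: List[int], col_b: List[int], width: int) -> List[List[int]]:
    """Given time histories of cells i and i+1, infer cells i+2, i+3, ...

    From mid[t+1] = left[t] XOR right[t] it follows that
    right[t] = mid[t+1] XOR left[t]; each inferred column is one step shorter.
    """
    cols = [col_a, col_b]
    for _ in range(width):
        left, mid = cols[-2], cols[-1]
        right = [mid[t + 1] ^ left[t] for t in range(len(mid) - 1)]
        cols.append(right)
    return cols


rng = random.Random(0)
STEPS, CELLS = 8, 16
history = evolve_forward([rng.randint(0, 1) for _ in range(CELLS)], STEPS)


def column(i: int) -> List[int]:
    """Time history of cell i, read off the forward simulation."""
    return [history[t][i] for t in range(STEPS + 1)]


inferred = evolve_sideways(column(0), column(1), width=6)
matches = all(col == column(k)[: len(col)] for k, col in enumerate(inferred))
print(matches)  # True
```

The check at the end compares the sideways reconstruction against the ordinary forward simulation on the part of spacetime they both cover.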
So gauge transformations are the groups and color stuff we've been talking about, I think, before. But ultimately, here's where I stand with this stuff. I think that whenever a mathematician encounters a new object by contemplating it, mathematicians can sense things slightly differently, in the sense that they formalize that into a language structure. It becomes like a language game, all that stuff. Tarski truth, Kripke schema, all that. And I'll send you the transcript of this, by the way, so I hope this -- it's actually the most concise way I've had of stating it in a while. And it's kind of a bold assertion, but I think we're going to see this revisioning of fundamental forces to where there's a new force that can complete the picture. And I think it's some sort of information-associated force, sense-making force, or maybe like a consciousness, right? What is consciousness? And so that's -- and then it occurs in this so-called Markov field. And then the question becomes this contextual information, contextual entropy. Well, what does it mean if we have the same information control wire for a number of, you know, qubits, let's say, in this CNOT traversal setting, where you have all these qubits controlled by a single wire. And that's where I think what happens is we arrive at what I call synergistic codes. If you hear it somewhere -- I haven't seen it used yet, so I'm coining this idea, synergistic codes. Synergistic codes are ways of observing that are only available to us through compressed sensing and collective sensing, sensory fusion together. And these synergistic codes correspond to the ground states of the system, a quantum system, or the lowest eigenvalue of an expander graph of a kind, that captures uniqueness of space through this XOR-like property. Imagine a small sphere, and inside of the sphere is you, outside of the sphere is something else. 
And so the XOR, exclusive or, expanded over -- in the expander graph, it's just a sparse graph. It's an interesting graph that has both sparsity and a sort of density. And so, yes, the question becomes then how to integrate this force and how to think about it and how to understand what things mean. So this observation that information is contextual is almost trivial to say like that, that people have some context and they can extract more information. But the more interesting question is, if that's the case and information is a fundamental force and we have this space-time as the emergent phenomenon, then navigating becomes about the computational sophistication of the observer. The computational sophistication of the observer can be referenced through the number of synergistic codes that they have in their reservoir, their vocabulary. And so the ones they've encountered and can use to decode and encode information for each other. And so then -- this is going to take a different turn. The question becomes what kind of information is most interesting to encode. And I would argue that temporal information is interesting because it kind of shows when things mean something. And this guy Benjamin Merlin Bumpus -- I'll send this as well -- has this paper called "Towards a Unified Theory of Time-Varying Data." And -- Can I ask one thing about the Gorard -- Oh, yes, so I dropped a lot. Yes, please, please, please, please. So is this -- wait, what is it called? -- fully-covariant computation? Is it like a theoretical object right now or is it actually like a computational method? Like is there a -- is it theory or -- Yeah, so there's a fullest theory of this. A, there are much more precise outcomes that Gorard himself has done in Mathematica. He has a series of papers about this. And that stuff actually is very much existing as a practical thing for him. 
In fact, that's what the Wolfram Physics Project has been doing for its existence, this multi-way graph navigation and all the crazy two-hour-long videos by Stephen Wolfram. But the second point is that I would argue that the string that we are accruing through interaction with it is kind of a form of covariant computation as well, in a very special way, in that effectively the way this works right now -- I can't wait to bring it back. It really like wiped me out. But I think it takes like five to ten interactions per person. Let's say it's a group of three people. And all of us paste, paste, paste, paste, paste. There are two ways of reading a paper. There's the honorable way where you read it, you understand it, you contextualize it slowly, carefully. There is this less honorable but more efficient way of just pasting the entire paper into the screen, which obviously has some graphical structure. So the idea is that the graphical structure in this string emerges through language. It's not as visceral as seeing an image or hearing a sound. But it is there, and it's only there if you unpack it. So the graph is within the string. And what happens is that as we paste, paste, paste these papers, especially as we think about them curiously, we kind of create this -- to best you guys a little bit -- but it creates a strong Bayesian prior, almost, for the subsequent generation of the model, after three people pasted five papers each, to be within the context of those papers. And it's now possible with these huge, huge, huge one-million-token context windows in Gemini. 128K becomes like a baseline. I'm actually about to use Phi-3 as one of the random models, which is a small model, but it would be useful to lower cost as well as introduce diversity. And so, yeah, I would say yes, there are many examples of covariant computation right now. 
And we can see the process of doing mathematics, growing the so-called proof code of understanding, as that of optimal transport of mathematical structure, specifically topological structure, I think, like invariants of a kind. If a mathematician discovers a proof somewhere in the universe, transmitting that information to other mathematicians is relativistically bounded by the speed of light, right? But if the mathematicians happen to have this almost like a Schelling point of discovering the same structure, which happens all the time, like Leibniz and Newton and all that stuff. This structure in terms of modal logic, necessity, sufficiency and so on, but also has this predicate of true somewhere but also true everywhere. Universal kind of truth. Mathematics has several of those, like class field theory, prime numbers, color spaces, actually, color spaces, specific kinds of spectral color spaces and so on, that are highly nonlinear. The fact that there are dualities between numbers and vectors is kind of magic, right? But also, you can imagine that even if they're not trying to coordinate in some sort of distributed asynchronous cognition of all mathematical functions at all times, which I've been kind of exploring, they can stumble into it by accident. And simply because of the prevalence of this universality and conservation laws acting as this kind of statement that physics has to be the same everywhere, except our physics is incomplete. Our physics has gaps. There's dark energy, dark matter and all that stuff. And so either a new force or a new conservation law or some complication of this information as a force will have to take place for us to like rejiggle the system. And the fact that... But Barton, my question is like, this is to me like saying, you know, every information surface has patterns and those patterns have meaning. I totally see it. It's just like, especially in the LLM level, like how... 
Because kind of like every multi-agent software or economic system is like a fully covariant computation system, I guess. So, like it doesn't... Well, the question, yeah, the question is what's the statement here? The statement is that, yes, you can see it, but what is that thing that lets you see it? What is language? What is the meaning-making function of us? What is the time-binding function? And so the goal of mathematics, I would say, and of ourselves, is to create these sorts of equivalences. The verb is equivalencing, which is not really a word, but I use it all the time, equivalencing. We go through life creating these abstractions that allow us to take a cat and take a tiger and say they're the same thing, kind of Felis of some kind, genus. And so this kind of indexing of equivalences is what allows us to be efficient, given a relatively narrow band of cognitive capacity compared to the entirety of the universe, right? Even our local environment. And so from this perspective, now that we are back at this idea of what is this xenocognition, it's the choice of temporality. It's the ability to experience time together with gravity, right, that allows us to bind these forces together through some sort of observation. And so what happens in nature, right, is there's always this information game. You can think about information games as the essence of all these mechanism design things, right? We have some perfect information games, imperfect information games, the equilibria of these games. The idea of why correlated equilibrium works rather than Nash, in this like Nash prop idea, is that there's a public source of randomness that allows us to coordinate implicitly. This implicit coordination I tried to solve in the case of a decentralized energy grid; it doesn't succeed unless there is some public source of randomness, some way of drawing noise. 
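The point about a public source of randomness can be made concrete with a small worked example in Python. This is a sketch under assumed numbers, not something from the conversation: the two-driver "intersection" game, its payoffs, and the function names are all hypothetical. A public coin (a traffic light) tells one driver to go and the other to wait. The code checks the correlated-equilibrium incentive constraints (no player gains by disobeying their recommendation) and compares expected payoffs with the symmetric mixed Nash equilibrium, in which each driver goes with probability 1/11.

```python
from fractions import Fraction
from itertools import product
from typing import Dict, Tuple

Profile = Tuple[str, str]
ACTIONS = ("go", "wait")

# Assumed payoffs (row player, column player): a crash is very costly.
PAYOFF: Dict[Profile, Tuple[int, int]] = {
    ("go", "go"): (-10, -10),
    ("go", "wait"): (1, 0),
    ("wait", "go"): (0, 1),
    ("wait", "wait"): (0, 0),
}


def expected_payoffs(dist: Dict[Profile, Fraction]) -> Tuple[Fraction, Fraction]:
    """Expected payoff of each player under a distribution over action profiles."""
    return tuple(sum(p * PAYOFF[a][i] for a, p in dist.items()) for i in (0, 1))


def is_correlated_equilibrium(dist: Dict[Profile, Fraction]) -> bool:
    """True if no player can gain by deviating from any recommended action."""
    for player in (0, 1):
        for told, alt in product(ACTIONS, repeat=2):
            if told == alt:
                continue
            gain = Fraction(0)
            for profile, p in dist.items():
                if profile[player] != told:
                    continue
                deviated = list(profile)
                deviated[player] = alt
                gain += p * (PAYOFF[tuple(deviated)][player] - PAYOFF[profile][player])
            if gain > 0:
                return False
    return True


half = Fraction(1, 2)
traffic_light = {("go", "wait"): half, ("wait", "go"): half}  # public coin flip

g = Fraction(1, 11)  # probability of "go" in the symmetric mixed Nash equilibrium
mixed_nash = {
    (a, b): (g if a == "go" else 1 - g) * (g if b == "go" else 1 - g)
    for a, b in product(ACTIONS, repeat=2)
}

print(is_correlated_equilibrium(traffic_light), expected_payoffs(traffic_light))
print(is_correlated_equilibrium(mixed_nash), expected_payoffs(mixed_nash))
```

Under these assumed payoffs the shared signal gives each driver an expected payoff of 1/2, against 0 under independent mixing, because the public coin removes all probability of both drivers going at once.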
This spatial dependence of noise is kind of the key ingredient in this sort of covariant and xenocognition stuff. And so then, oh yeah, so Bumpus, Bumpus. Let's begin with Bumpus again. Let's get back to where we started. And there's a notion of a moment in time of a user interacting with the model; why is it an open dynamical system? And I argue that as users interact with this model to make sense of the world, they are affected by what they see. We don't store what they see, but the collective sort of intent, the collective momentum of the protention, you know, of every user of this system gets shared. So it's a consensus topos of a kind that emerges. That string, first of all, its meanings have learned a topological space of a kind, specifically by analyzing the semantics of the string's components and how they relate to each other; that in itself forms a topos implicitly. And then just being situated in some sort of context. The model itself has learned, well, it learned from the entirety of whatever it saw, the large, vast corpus of information, but also within a certain timeline, within a certain kind of historic path dependency, a hysteresis of a kind of human civilization that it can then reference as a cognitive glue, I call it cognitive glue, through which it can tie these disparate contexts. And so this train idea, this train has two rails. There is a rail of the models that are randomly sampled from, and there's a rail of human users. And the train, of course, is this information changing over time. I like to say cognitive jerk. It's like there's this acceleration, there's a jerk, which kind of directs it towards something. 
And so if one of those rails has like tearing, has discontinuities, let's say the latent space learned by a model is too different from the other, or the users have very different intents, somebody wants to do like some furry personal companion stuff, somebody tries to do category theory, then the tracks will have kind of like holes in them, or the discontinuities, and the train will derail. It will basically not continue. It will create too vast of a, too jagged, too jarring kind of like a transition from one moment to another. Whereas for the groups and these cybernetic organisms that we spawn, imagine like hundreds of thousands of these things, imagine that's how we start interacting as groups. The ones that do evolve effectively or do sustain continued interaction with the model would have found a combination of these generative bouncy things and user combinations that continues ipso facto through the fact of continuation of interaction, interaction rather than structure or point in time. And so this switch to this interactive superposition based kind of like view of this with the distributions rather than concrete paths and more trajectories and forming together a system with users will start affecting the environment in some way. And so I'll ultimately say that the way this succeeds, and why is this like life, is that at some point the novelty of information or the divergence of the distribution of inputs of new users that arrive into the system will have diminished from that of the users already using it. And so as time goes on and the system has more impact on the world, what you would expect possibly is that new users arriving will already have some sort of shared memes that emerge through dynamics of humans interacting with other humans around them, even other models pasting it into a different model. So that's why it's dangerous to see these as static things. It's important to also see that any interaction between a user and a model is anecdotal at best. 
It's hard to say, oh, like, model got lazy. Well, what does it actually mean, right? And so this switch to this more dynamic in-context -- I call this in-context energy-based model learning, in-context information integration, ultimately, is this new life form. It's not so new at all. You realize that these things kind of become externalized through the use of it. And the key idea of this is the semiotics, emergent semiotics, the ability to have sign signifier kind of like distinction in what emerges. And so that is the long answer to the question of state bloat. How do I deal with state bloat? But one thing that I want to like stay on it, because, again, I'm not like hooked on the staticness as a sort of signifier, but like there's a context window, which is the communication medium between this probabilistic distribution and the covariance co-conspirator computers that are humans. And the success of that medium depends on how both it communicates with the model, how communicable it is for the human, right? So it is this interstitial technology. So in that sense, the linearity starts to matter, you know, and like how information is organized starts to matter. But maybe I'm like, again, talking about not too theoretically, but I feel like that is where things can get really interesting with regards to the dynamism of intelligence. Yes. Wait, I think I finally understand the word interstitial. I didn't get it until now. So the way you used interstitial seems to imply that it involves a human at the interface of something that's a... Yeah, how do you define interstitial? Interstitial in the sense that it belongs to neither kind of, and it is a place where reconciliation of... Oh, that's a perfect word. I love this. Yes, it is the essence of interstitial. Yes, yes, yes, yes, yes, yes. And so, yes, the way you formulate it is very interesting. 
So there's a context window bottleneck, almost, an information bottleneck that exists between the distribution and users' ability and users' meanings as far as our goal of forming some dynamic understanding. Yeah, I've come to respect humanity and our ability to observe and make sense of the world much more than machines in the process of trying to understand the machines better. I realize that to the extent there is anything useful in the emergent sense that seems to come from the observer. There's a lot of passive inference, which is compositional. If you can compose it, if you can express it algebraically, it's already not that interesting. But I just worship this algebraic approach, even categorical approach. There's a lot of stuff that's nonlinear, non-algebraic, and some of it can be expressed through category theory. It's a band of things that are computable, but are not algebraic. Graphical things, right? And that's what these diagrams are about, is finding that gap and filling it with something that can be useful for the human to understand. But I should just tell you another paper. As always, there are at least five of these following the conversation. This other paper is "Let Your Graph Do the Talking." It's a notion of the encoding... Thank you. I think it's... I can't even remember. Breakfast is ready, so I have to... I'm jumping off soon, but basically what happens is that we encode this information as a token of a kind. And so when we start assembling this string in the future, I almost imagine us picking a Unicode character or an emoji of some kind, or literally one unique token. If you look at neuroscience and look at the so-called God token, that's a funny rabbit hole as well. It's like, what is this bottleneck in our cortex, like 39 bits per second or whatever? How do we ever achieve BCI if we don't have the bandwidth? Well, the idea is when we talk, we also inhabit these highly nonlinear spaces that are very impossible to put into language. 
But what we give each other are pointers and synchronization strategies, which allow for us to find this oscillatory neuromodulated phenomenon to where over time you converge to the same space. Poetry is like that. Yeah, music in particular is very effective that way, somehow. And so I think this opens room for when in the future we assemble this string, it's almost like we each bring our sigil, our hypersigil, which will in itself contain entire subgraphs, graphical structure. And so to me that was a diagram, but I think it can be even more efficient. Maybe the synergistic code that uniquely encodes this idea of my orientation within the space, because it's non-orientable. There's no up or down. No base inversion is another thing that I unfortunately don't have enough time to go into, but I think it's one of those things for now. But let's definitely do this soon. I think the key is that it will get stranger and stranger in terms of how we process time. And it won't be like some weird time travel, but the now, the reconstruction of memories, the replay of memory, will grow our time outward and within the adversarial settings. And someone I used to work with made a paper about this, but it's this paper on an information-theoretic model for steganography. Oh, nice. Weaving trust. Yes, no, no, self-discipline. Yes, yes. Okay, thank you for reminding me about this. Yes. Okay, so there's this notion of self-avoiding random walk, right? And self-avoiding random walk is easier to see in molecules. So one of the talks we saw at Cambridge was this, the guy talking about, he's Barabási. Barabási is this network scientist from Hungary, and he was giving a talk at MIT about molecules and how they make a very simple model. They said, okay, here are some spheres, and we make sure that as the system grows, it doesn't really intersect the path it had seen before, which is kind of like an ergodicity-like assumption.
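As a rough illustration of the self-avoiding random walk just described, here is a minimal sketch. It is not the model from the talk: the 2D square lattice, the function name `self_avoiding_walk`, and the `seed` parameter are all assumptions made for illustration. The walk grows one step at a time, never revisits a site, and stops early if every neighbouring site is already occupied.

```python
import random


def self_avoiding_walk(max_steps, seed=0):
    """Grow a walk on a 2D square lattice that never revisits a site.

    Returns (path, stuck): the list of visited sites in order, and a flag
    that is True when the walk stopped because it had no free neighbour.
    The lattice and all names here are illustrative assumptions.
    """
    rng = random.Random(seed)
    pos = (0, 0)
    path = [pos]
    visited = {pos}
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    for _ in range(max_steps):
        # Candidate sites are the lattice neighbours not seen before.
        free = [
            (pos[0] + dx, pos[1] + dy)
            for dx, dy in moves
            if (pos[0] + dx, pos[1] + dy) not in visited
        ]
        if not free:
            # Every neighbour is occupied: the walk is trapped.
            return path, True
        pos = rng.choice(free)
        path.append(pos)
        visited.add(pos)

    return path, False


if __name__ == "__main__":
    path, stuck = self_avoiding_walk(500, seed=1)
    print(len(path), "sites visited; trapped:", stuck)
```

Different seeds give different walks; some run for all of `max_steps`, while others trap themselves much earlier.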
But the question of physical systems, when they have so-called physicality, is when they reach the maximum number of paths and kind of like got stuck, basically. Got stuck, and then they can't find any more free kind of like pathways that don't have like some sort of repellent potential energy force, whatever. But the key is that that's easy to understand in terms of physical networks like molecules, but what happens with information networks like social networks or like knowledge networks? What is the self? It's some kind of institutionalization, like scale-free institutionalization. Yes, or institutionalization, exactly. And so this weaving trust thing, the self-discovery network stuff, is precisely what I've been thinking about lately, because another term I think we've discussed before, at the time I don't think I have the fullest understanding of it, but it's the dynamic identity equilibrium and adversarial dynamic identity equilibrium. And in general, this was the adversarial aspect of identity in this case, that how you define self in the process of becoming is contingent on the final outcome somehow. There's some sort of retrocausality that you can almost infer. It's like, well, you don't know yet what it's becoming. Like, are you Ukrainian or are you Russian suddenly, because you got this Russian passport, you know, and suddenly there's this almost like a forcing function, like the yin and the yang kind of like duality, that forces you to not accept identity too readily in systems larger than yourself, sticking to some sort of narratives of like, okay, I am from this place, I worship these types of things, I value these types of political movements or agendas. But at the end of this, it kind of seems that we have to assemble the whole thing. Like, is the universe becoming conscious or aware of itself includes everybody? It's only portions of space that we slice out. 
And so that means like assembling and reassembling and reassembling to where the self-avoiding random walk has a chance to burst further, reach into sort of like greater extent of understanding. But what happens is once you reach there, you arrive at a very efficient way of getting there again. And these are the so-called lottery tickets in neural networks where quantization, why does quantization work, why does it work to like take a floating point 16 tensor of some kind and reduce every value to one of three, like negative one, zero, and one. Because there seems to be like kind of like a view that information was never really classical. Like there's always like this classical quantum distinction. But if you have a final theory, this is just information itself. And then once you've reached the synergistic code, which is only possible through this like fuller assembly, you then can somehow encode that and reference it. And so ultimately what... Some kind of like metaphor, like metaphoric metabolism, you know. Yes, yes, yes, precisely. Yes, I'll send you another one. This was from Santa Fe Institute for Complexity, literally like yesterday. It was on metabolism and I think I posted on Twitter. I'll resend that. But how they talk about emergence of this in biological systems is super interesting. In metabolism, in energy conversion, it strikes at the core of what is this probability circuit. What is this energy as information kind of idea? Shannon was thinking about von Neumann entropy and so on. Wigner, oh yeah, Wigner. But yes, so, but, and I see we're almost at an hour, so it'd be very nice to summarize. We don't have to rush. Yeah, yeah, yeah. We can take part to it. Yeah, for sure. I'd love to continue. It's so nice to hear from you. It's one of those things where I can actually express this and compress it.
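To make the quantization remark above concrete, here is a minimal sketch of reducing a list of floating-point weights to the three values negative one, zero, and one. The conversation does not say which scheme is meant. This sketch assumes one simple choice: divide each weight by the mean absolute value, round, and clip. The names `ternary_quantize` and `dequantize` are hypothetical.

```python
def ternary_quantize(weights):
    """Map each weight to -1, 0, or 1, plus one shared scale factor.

    Assumed scheme (not taken from the conversation): divide by the mean
    absolute value, round to the nearest integer, then clip to [-1, 1].
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0.0:
        # All weights are zero, so every code is zero as well.
        return [0] * len(weights), 0.0
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Approximate the original weights from ternary codes and the scale."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.9, -0.05, 0.4, -1.2, 0.02]
    codes, scale = ternary_quantize(weights)
    print(codes)                     # [1, 0, 1, -1, 0]
    print(dequantize(codes, scale))  # coarse approximation of weights
```

Each weight keeps only its sign and whether it is large relative to the average; a single scale per tensor carries the magnitude.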
Like from your perspective, like as I mentioned before, it's always easier to compress it somehow because we have a lot of like shared synergistic sort of codes as well, like Foucault and like all that, A Thousand Plateaus and all that stuff. But I think that what I need to build and what I've been building is some sort of index of this individuated information, right? This is a term a friend recently introduced. But individuated information index is ultimately about some sort of type system, like a homotopy type theory type. But ultimately it's about some way of efficiently indexing equivalences, right? Or univalence is another thing you can use. And then if you have that index and you encounter a new structure, that becomes your synergistic code. You say, okay, I have this complex organization. They seem to be all right, but they have internal politics and I don't have really time to figure it out. Can I somehow figure out if they're my people or not? And that's where this, is this part of my greater self or not, this homeostasis and distributive self again. And I think it will come down to this notion that in every topos, there is this thing called the subobject classifier. And the subobject classifier points at something. And I think it will point at truth values or tables of like truth values or even like entire logical systems in some sort of meta theory. And yeah, so I think it's already happening in the language and it's already happening inside of these models, these xenocodes and so on. But by externalizing it. [inaudible] Yes, yes, the anticipation and then surprise. But yeah, this idea of breaking symmetry, chirality. Yes, let's definitely dig into it as far as it goes next time. But there seems to be a fundamental way in which like poetic omission can be like conveying information, almost like negative information. But like in a sense of absence being the presence. And yeah, I think I know what you're getting at, but I would love to dig into it.
But yeah, we could just make some eggs so I have to go around and eat them. But it's been a very, as always, such a pleasure to connect. Hopefully this answers some questions or at least situates this into some discourse that can continue within this. And so I see all these things as kind of converging. So hopefully Joscha Bach will have something to say as well. So on that point. [inaudible] It's the most important work. Lovely. [inaudible]
