Created February 27, 2024 09:49

Bibliography

[AAG03] Michael Abbott, Thorsten Altenkirch, and Neil Ghani. Categories of Containers. In Andrew D. Gordon, editor, Foundations of Software Science and Computation Structures, Lecture Notes in Computer Science, pages 23–38, Berlin, Heidelberg, 2003. Springer.

[AAGM03] Michael Abbott, Thorsten Altenkirch, Neil Ghani, and Conor McBride. Derivatives of Containers. In Gerhard Goos, Juris Hartmanis, Jan Van Leeuwen, and Martin Hofmann, editors, Typed Lambda Calculi and Applications, volume 2701, pages 16–30. Springer Berlin Heidelberg, Berlin, Heidelberg, 2003. Series Title: Lecture Notes in Computer Science.

[ACB17] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein GAN, December 2017. arXiv:1701.07875 [cs, stat].

[ADG+16] Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, and Nando de Freitas. Learning to learn by gradient descent by gradient descent, November 2016. arXiv:1606.04474 [cs].

[AGG+21] Robert Atkey, Bruno Gavranović, Neil Ghani, Clemens Kupke, Jérémy Ledent, and Fredrik Nordvall Forsberg. Compositional Game Theory, Compositionally. Electronic Proceedings in Theoretical Computer Science, 333:198–214, February 2021. arXiv:2101.12045 [cs, math].

[ALS10] Thorsten Altenkirch, Paul Levy, and Sam Staton. Higher-Order Containers. In David Hutchison, Takeo Kanade, Josef Kittler, Jon M. Kleinberg, Friedemann Mattern, John C. Mitchell, Moni Naor, Oscar Nierstrasz, C. Pandu Rangan, Bernhard Steffen, Madhu Sudan, Demetri Terzopoulos, Doug Tygar, Moshe Y. Vardi, Gerhard Weikum, Fernando Ferreira, Benedikt Löwe, Elvira Mayordomo, and Luís Mendes Gomes, editors, Programs, Proofs, Processes, volume 6158, pages 11–20. Springer Berlin Heidelberg, Berlin, Heidelberg, 2010. Series Title: Lecture Notes in Computer Science.

[AP20] Mario Alvarez-Picallo. Change actions: from incremental computation to discrete derivatives, June 2020. arXiv:2002.05256 [cs].

[APGSZ21] Mario Alvarez-Picallo, D. Ghica, David Sprunger, and F. Zanasi. Functorial String Diagrams for Reverse-Mode Automatic Differentiation, 2021.

[Bak] Igor Bakovic. Grothendieck construction for bicategories.

[BBCV21] Michael M. Bronstein, Joan Bruna, Taco Cohen, and Petar Veličković. Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges, May 2021. arXiv:2104.13478 [cs, stat].

[BCC+22] John C. Baez, Simon Cho, Daniel Cicala, Nina Otter, and Valeria de Paiva. Applied Category Theory in chemistry, computing, and social networks, 2022.

[BCG+21] Dylan Braithwaite, Matteo Capucci, Bruno Gavranović, Jules Hedges, and Eigil Fjeldgren Rischel. Fibre optics, December 2021. arXiv:2112.11145 [math].

[BCS09] R F Blute, J R B Cockett, and R A G Seely. Cartesian differential categories. Theory and Applications of Categories, 22:622–672, January 2009.

[BDK+21] Yasaman Bahri, Ethan Dyer, Jared Kaplan, Jaehoon Lee, and Utkarsh Sharma. Explaining Neural Scaling Laws, February 2021. arXiv:2102.06701 [cond-mat, stat].

[BE15] John C. Baez and Jason Erbele. Categories in Control, May 2015. arXiv:1405.6881 [quant-ph].

[BF18] John C. Baez and Brendan Fong. A Compositional Framework for Passive Linear Networks, November 2018. arXiv:1504.05625 [math-ph].

[BFH+18] James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. JAX: composable transformations of Python+NumPy programs, 2018.

[BG22] Jacob Buckman and Carles Gelada. Bad ML Abstractions I (Generative vs Discriminative Models), April 2022.

[BGMS21] John C. Baez, Fabrizio Genovese, Jade Master, and Michael Shulman. Categories of Nets, April 2021. arXiv:2101.04238 [cs, math].

[BHS23] Dylan Braithwaite, Jules Hedges, and Toby St Clere Smithe. The Compositional Structure of Bayesian Inference, May 2023. arXiv:2305.06112 [cs, math].

[BK22] Anne Broadbent and Martti Karvonen. Categorical composable cryptography. In Patricia Bouyer and Lutz Schröder, editors, Foundations of Software Science and Computation Structures, Lecture Notes in Computer Science, pages 161–183, Cham, 2022. Springer International Publishing.

[BKH16] Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton. Layer Normalization, July 2016. arXiv:1607.06450 [cs, stat].

[BL95] Y. Bengio and Yann Lecun. Convolutional Networks for Images, Speech, and Time-Series. The Handbook of Brain Theory and Neural Networks, 1995.

[BLB16] Aleksandar Botev, Guy Lever, and David Barber. Nesterov’s Accelerated Gradient and Momentum as approximations to Regularised Update Descent, July 2016. arXiv:1607.01981 [cs, stat].

[BLM23] John C. Baez, Owen Lynch, and Joe Moeller. Compositional Thermostatics. Journal of Mathematical Physics, 64(2):023304, February 2023. arXiv:2111.10315 [math-ph].

[BMR+20] Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language Models are Few-Shot Learners, July 2020. arXiv:2005.14165 [cs].

[Boi20] Guillaume Boisseau. String Diagrams for Optics, May 2020. arXiv:2002.11480 [math].

[BP17] John C. Baez and Blake S. Pollard. A Compositional Framework for Reaction Networks. Reviews in Mathematical Physics, 29(09):1750028, October 2017. arXiv:1704.02051 [math-ph].

[BPRS18] Atilim Gunes Baydin, Barak A. Pearlmutter, Alexey Andreyevich Radul, and Jeffrey Mark Siskind. Automatic differentiation in machine learning: a survey, February 2018. arXiv:1502.05767 [cs, stat].

[BPT23] Mugurel Barcau, Vicentiu Pasol, and George C. Turcas. Composing Bridges, May 2023. arXiv:2305.16435 [cs, math].

[BQL21] Joshua Bassey, Lijun Qian, and Xianfang Li. A Survey of Complex-Valued Neural Networks, January 2021. arXiv:2101.12249 [cs, stat].

[Bra21] Tai-Danae Bradley. Entropy as a Topological Operad Derivation. Entropy, 23(9):1195, September 2021. arXiv:2107.09581 [cs, math].

[Bre18] Spencer Breiner. Workshop Introduction, April 2018.

[BS07] John C. Baez and Michael Shulman. Lectures on n-Categories and Cohomology, October 2007. arXiv:math/0608420.

[BS22] Guillaume Boisseau and Pawel Sobociński. String Diagrammatic Electrical Circuit Theory. Electronic Proceedings in Theoretical Computer Science, 372:178–191, November 2022. arXiv:2106.07763 [cs].

[BSZ17] Filippo Bonchi, Pawel Sobociński, and Fabio Zanasi. The Calculus of Signal Flow Diagrams I: Linear relations on streams. Information and Computation, 252:2–29, February 2017.

[Cap22a] Matteo Capucci. Diegetic Representation Of Feedback In Open Games, December 2022. arXiv:2206.12338 [cs, math].

[Cap22b] Matteo Capucci. Seeing double through dependent optics, April 2022. arXiv:2204.10708 [math].

[CBD16] Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. BinaryConnect: Training Deep Neural Networks with binary weights during propagations, April 2016. arXiv:1511.00363 [cs].

[CC14] J. R. B. Cockett and G. S. H. Cruttwell. Differential Structure, Tangent Structure, and SDG. Applied Categorical Structures, 22(2):331–417, April 2014.

[CCG+19] Robin Cockett, Geoffrey Cruttwell, Jonathan Gallagher, Jean-Simon Pacaud Lemay, Benjamin MacAdam, Gordon Plotkin, and Dorette Pronk. Reverse derivative categories, October 2019. arXiv:1910.07065 [cs, math].

[CCL21] J. R. B. Cockett, G. S. H. Cruttwell, and J. S. P. Lemay. Differential Equations in a Tangent Category I: Complete Vector Fields, Flows, and Exponentials. Applied Categorical Structures, 29(5):773–825, October 2021.

[CEG+22] Bryce Clarke, Derek Elkins, Jeremy Gibbons, Fosco Loregian, Bartosz Milewski, Emily Pillmore, and Mario Román. Profunctor Optics, a Categorical Update, March 2022. arXiv:2001.07488 [cs, math].

[CG22] Matteo Capucci and Bruno Gavranović. Actegories for the Working Amthematician, March 2022. arXiv:2203.16351 [math].

[CGG+22] Geoffrey S. H. Cruttwell, Bruno Gavranović, Neil Ghani, Paul Wilson, and Fabio Zanasi. Categorical Foundations of Gradient-Based Learning. In Ilya Sergey, editor, Programming Languages and Systems, Lecture Notes in Computer Science, pages 1–28, Cham, 2022. Springer International Publishing.

[CGHR22] Matteo Capucci, Bruno Gavranović, Jules Hedges, and Eigil Fjeldgren Rischel. Towards Foundations of Categorical Cybernetics. Electronic Proceedings in Theoretical Computer Science, 372:235–248, November 2022. arXiv:2105.06332 [math].

[CGLF22] Matteo Capucci, Neil Ghani, Jérémy Ledent, and Fredrik Nordvall Forsberg. Translating Extensive Form Games to Open Games with Agency. Electronic Proceedings in Theoretical Computer Science, 372:221–234, November 2022. arXiv:2105.06763 [cs, math].

[CGLP22] Geoffrey Cruttwell, Jonathan Gallagher, Jean-Simon Pacaud Lemay, and Dorette Pronk. Monoidal Reverse Differential Categories, September 2022. arXiv:2203.12478 [cs, math].

[CK17] Bob Coecke and Aleks Kissinger. Picturing Quantum Processes: A First Course in Quantum Theory and Diagrammatic Reasoning. Cambridge University Press, Cambridge, 2017.

[Cla20] Bryce Clarke. Internal lenses as functors and cofunctors. Electronic Proceedings in Theoretical Computer Science, 323:183–195, September 2020. arXiv:2009.06835 [math].

[Cla22] Bryce Clarke. Delta lenses as coalgebras for a comonad, March 2022. arXiv:2108.00390 [math].

[CS11] Robin Cockett and R.A.G. Seely. The Faà di Bruno construction. Theory and Applications of Categories [electronic only], 25, January 2011.

[CUV06] Venanzio Capretta, Tarmo Uustalu, and Varmo Vene. Recursive coalgebras from comonads. Information and Computation, 204(4):437–468, April 2006.

[CXZG16] Tianqi Chen, Bing Xu, Chiyuan Zhang, and Carlos Guestrin. Training Deep Nets with Sublinear Memory Cost, April 2016. arXiv:1604.06174 [cs].

[DB16] Alexey Dosovitskiy and Thomas Brox. Inverting Visual Representations with Convolutional Networks, April 2016. arXiv:1506.02753 [cs].

[DBK+21] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, June 2021. arXiv:2010.11929 [cs].

[Del19] Jean-Charles Delvenne. Category Theory for Autonomous and Networked Dynamical Systems. Entropy, 21(3):302, March 2019.

[DGGA+22] Andrew Dudzik, Bruno Gavranović, João Guilherme Araújo, Petar Veličković, and Pim de Haan. Categories for AI, October 2022.

[DH06] Benjamin Dauvergne and Laurent Hascoët. The Data-Flow Equations of Checkpointing in Reverse Automatic Differentiation. In Vassil N. Alexandrov, Geert Dick van Albada, Peter M. A. Sloot, and Jack Dongarra, editors, Computational Science – ICCS 2006, Lecture Notes in Computer Science, pages 566–573, Berlin, Heidelberg, 2006. Springer.

[DHS11] John Duchi, Elad Hazan, and Yoram Singer. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. The Journal of Machine Learning Research, 12:2121–2159, July 2011.

[Dis20] Zinovy Diskin. General Supervised Learning as Change Propagation with Delta Lenses. In Jean Goubault-Larrecq and Barbara König, editors, Foundations of Software Science and Computation Structures, Lecture Notes in Computer Science, pages 177–197, Cham, 2020. Springer International Publishing.

[DKL19] Zinovy Diskin, Harald König, and Mark Lawford. Multiple Model Synchronization with Multiary Delta Lenses with Amendment and K-Putput, November 2019. arXiv:1911.11302 [cs].

[DL19] David Dalrymple and Eliana Lorch. Dioptics: a Common Generalization of Open Games and Gradient-Based Learners, 2019.

[Doz16] Timothy Dozat. Incorporating Nesterov Momentum into Adam. In Workshop track, 2016.

[DP89] V. C. V. De Paiva. The Dialectica categories. In John W. Gray and Andre Scedrov, editors, Contemporary Mathematics, volume 92, pages 47–62. American Mathematical Society, Providence, Rhode Island, 1989.

[DV22] Andrew Dudzik and Petar Veličković. Graph Neural Networks are Dynamic Programmers, October 2022. arXiv:2203.15544 [cs, math, stat].

[Ell18a] Conal Elliott. The simple essence of automatic differentiation, October 2018. arXiv:1804.00746 [cs].

[Ell18b] Conal Elliott. Video recording of “The Simple Essence of Automatic Differentiation”, July 2018.

[FAL17] Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. In Proceedings of the 34th International Conference on Machine Learning, pages 1126–1135. PMLR, July 2017. ISSN: 2640-3498.

[FC19] Jonathan Frankle and Michael Carbin. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, March 2019. arXiv:1803.03635 [cs].

[FDRC20] Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, and Michael Carbin. Stabilizing the Lottery Ticket Hypothesis, July 2020. arXiv:1903.01611 [cs, stat].

[FS18] Brendan Fong and David I. Spivak. Seven Sketches in Compositionality: An Invitation to Applied Category Theory, October 2018. arXiv:1803.05316 [math].

[FST19] Brendan Fong, David I. Spivak, and Rémy Tuyéras. Backprop as Functor: A compositional perspective on supervised learning, May 2019. arXiv:1711.10455 [cs, math].

[Fuj19] Soichiro Fujii. A 2-Categorical Study of Graded and Indexed Monads, April 2019. arXiv:1904.08083 [cs, math].

[Fuk75] Kunihiko Fukushima. Cognitron: A self-organizing multilayered neural network. Biological Cybernetics, 20(3):121–136, September 1975.

[Gav19] Bruno Gavranović. Compositional Deep Learning, July 2019. arXiv:1907.08292 [cs, math].

[Gav20a] Bruno Gavranović. Category Theory in Machine Learning, July 2020. original-date: 2020-07-09T14:16:58Z.

[Gav20b] Bruno Gavranović. Learning Functors using Gradient Descent. Electronic Proceedings in Theoretical Computer Science, 323:230–245, September 2020. arXiv:2009.06837 [cs, math].

[Gav21] Bruno Gavranović. Meta-learning and Monads, October 2021.

[Gav22a] Bruno Gavranović. Lenses to the left of me, Prisms to the right, January 2022.

[Gav22b] Bruno Gavranović. Space-time tradeoffs of lenses and optics via higher category theory, September 2022. arXiv:2209.09351 [cs, math].

[Gav23a] Bruno Gavranović. Theory and Applications of Lenses and Optics, May 2023. original-date: 2022-04-12T13:53:16Z.

[Gav23b] Bruno Gavranović. Two kinds of Prisms, February 2023.

[GBC16] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT press, 2016.

[GDM+14] Karol Gregor, Ivo Danihelka, Andriy Mnih, Charles Blundell, and Daan Wierstra. Deep AutoRegressive Networks, May 2014. arXiv:1310.8499 [cs, stat].

[GHWZ18] Neil Ghani, Jules Hedges, Viktor Winschel, and Philipp Zahn. Compositional game theory, February 2018. arXiv:1603.04641 [cs].

[GJL17] Dan R. Ghica, Achim Jung, and Aliaume Lopez. Diagrammatic Semantics for Digital Circuits, March 2017. arXiv:1703.10247 [cs].

[GK96] C. Goller and A. Kuchler. Learning task-dependent distributed representations by backpropagation through structure. In Proceedings of International Conference on Neural Networks (ICNN’96), volume 1, pages 347–352 vol.1, June 1996.

[GK13] Nicola Gambino and Joachim Kock. Polynomial functors and polynomial monads. Mathematical Proceedings of the Cambridge Philosophical Society, 154(1):153–192, January 2013. arXiv:0906.4931 [math].

[GLP21] Fabrizio Genovese, Fosco Loregian, and Daniele Palombi. Escrows are optics, May 2021. arXiv:2105.10028 [math].

[GPAM+14] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative Adversarial Networks, June 2014. arXiv:1406.2661 [cs, stat].

[GS20] Fabrizio Genovese and David I. Spivak. A Categorical Semantics for Guarded Petri Nets. In Fabio Gadducci and Timo Kehrer, editors, Graph Transformation, Lecture Notes in Computer Science, pages 57–74, Cham, 2020. Springer International Publishing.

[GV22] Bruno Gavranović and Mattia Villani. Graph Convolutional Neural Networks as Parametric CoKleisli morphisms, December 2022. arXiv:2212.00542 [cs, math].

[GW00] Andreas Griewank and Andrea Walther. Algorithm 799: revolve: an implementation of checkpointing for the reverse or adjoint mode of computational differentiation. ACM Transactions on Mathematical Software, 26(1):19–45, March 2000.

[GWD14] Alex Graves, Greg Wayne, and Ivo Danihelka. Neural Turing Machines, December 2014. arXiv:1410.5401 [cs].

[GWR+16] Alex Graves, Greg Wayne, Malcolm Reynolds, Tim Harley, Ivo Danihelka, Agnieszka Grabska-Barwińska, Sergio Gómez Colmenarejo, Edward Grefenstette, Tiago Ramalho, John Agapiou, Adria Puigdomenech Badia, Karl Moritz Hermann, Yori Zwols, Georg Ostrovski, Adam Cain, Helen King, Christopher Summerfield, Phil Blunsom, Koray Kavukcuoglu, and Demis Hassabis. Hybrid computing using a neural network with dynamic external memory. Nature, 538(7626):471–476, October 2016. Number: 7626 Publisher: Nature Publishing Group.

[HAMS20] Timothy Hospedales, Antreas Antoniou, Paul Micaelli, and Amos Storkey. Meta-Learning in Neural Networks: A Survey, November 2020. arXiv:2004.05439 [cs, stat].

[Har19] Kenneth D. Harris. Characterizing the invariances of learning algorithms using category theory, May 2019. arXiv:1905.02072 [cs, math, stat].

[Hed18] Jules Hedges. Lenses for philosophers, August 2018.

[Hed19] Jules Hedges. Limits of bimorphic lenses, August 2019. arXiv:1808.05545 [cs, math].

[HG20] Dan Hendrycks and Kevin Gimpel. Gaussian Error Linear Units (GELUs), July 2020. arXiv:1606.08415 [cs].

[Hin12] Ralf Hinze. Kan Extensions for Program Optimisation Or: Art and Dan Explain an Old Trick. In Jeremy Gibbons and Pablo Nogueira, editors, Mathematics of Program Construction, Lecture Notes in Computer Science, pages 324–362, Berlin, Heidelberg, 2012. Springer.

[HKSY17] Chris Heunen, Ohad Kammar, Sam Staton, and Hongseok Yang. A Convenient Category for Higher-Order Probability Theory. In 2017 32nd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), pages 1–12, June 2017. arXiv:1701.02547 [cs, math].

[HL21] Chris Heunen and Jean-Simon Pacaud Lemay. Tensor-Restriction Categories. Theory and Applications of Categories, 37:635–670, 2021.

[HPKH20] Awni Hannun, Vineel Pratap, Jacob Kahn, and Wei-Ning Hsu. Differentiable Weighted Finite-State Transducers, October 2020. arXiv:2010.01003 [cs, stat].

[HS97] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 9:1735–1780, 1997.

[HS22] Jules Hedges and Riu Rodríguez Sakamoto. Value iteration is optic composition, June 2022. arXiv:2206.04547 [math].

[HSH+23] Tyler Hanks, Baike She, Matthew Hale, Evan Patterson, Matthew Klawonn, and James Fairbanks. A Compositional Framework for Convex Model Predictive Control, May 2023. arXiv:2305.03820 [math].

[HSV20] Mathieu Huot, Sam Staton, and Matthijs Vákár. Correctness of Automatic Differentiation via Diffeologies and Categorical Gluing, April 2020. arXiv:2001.02209 [cs].

[HVHV19] Chris Heunen and Jamie Vicary. Categories for Quantum Theory: An Introduction. Oxford Graduate Texts in Mathematics. Oxford University Press, Oxford, New York, November 2019.

[HY20] Reinhard Heckel and Fatih Furkan Yilmaz. Early Stopping in Deep Networks: Double Descent and How to Eliminate it, September 2020. arXiv:2007.10099 [cs, stat].

[IL23] Sacha Ikonicoff and Jean-Simon Pacaud Lemay. Cartesian Differential Comonads and New Models of Cartesian Differential Categories, January 2023. arXiv:2108.04304 [math].

[IS15] Sergey Ioffe and Christian Szegedy. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, March 2015. arXiv:1502.03167 [cs].

[JGH20] Arthur Jacot, Franck Gabriel, and Clément Hongler. Neural Tangent Kernel: Convergence and Generalization in Neural Networks, February 2020. arXiv:1806.07572 [cs, math, stat].

[JS14] Bart Jacobs and Alexandra Silva. Automata Learning: A Categorical Perspective. In Franck van Breugel, Elham Kashefi, Catuscia Palamidessi, and Jan Rutten, editors, Horizons of the Mind. A Tribute to Prakash Panangaden: Essays Dedicated to Prakash Panangaden on the Occasion of His 60th Birthday, Lecture Notes in Computer Science, pages 384–406. Springer International Publishing, Cham, 2014.

[JY20] Niles Johnson and Donald Yau. 2-Dimensional Categories, June 2020. arXiv:2002.06055 [math].

[Kas22] Rohan V. Kashyap. A survey of deep learning optimizers-first and second order methods, November 2022. arXiv:2211.15596 [cs, math].

[KB17] Diederik P. Kingma and Jimmy Ba. Adam: A Method for Stochastic Optimization, January 2017. arXiv:1412.6980 [cs].

[KMH+20] Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario Amodei. Scaling Laws for Neural Language Models, January 2020. arXiv:2001.08361 [cs, stat].

[KV06] Jevgeni Kabanov and Varmo Vene. Recursion Schemes for Dynamic Programming. In Tarmo Uustalu, editor, Mathematics of Program Construction, Lecture Notes in Computer Science, pages 235–252, Berlin, Heidelberg, 2006. Springer.

[Lei22] Tom Leinster. Entropy and Diversity: The Axiomatic Approach, October 2022. arXiv:2012.02113 [cs, math, q-bio].

[Lew19] Martha Lewis. Compositionality for Recursive Neural Networks, January 2019. arXiv:1901.10723 [cs, math].

[Lor20] Fosco Loregian. Coend calculus, December 2020. arXiv:1501.02503 [math].

[Maa13] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models. In Proceedings of the 30th International Conference on Machine Learning, volume 28, 2013.

[MDG13] Samuel Mimram and Cinzia Di Giusto. A Categorical Theory of Patches. Electronic Notes in Theoretical Computer Science, 298:283–307, November 2013. arXiv:1311.3903 [cs, math].

[MGY+21] Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, Joe Wenjie Jiang, Ebrahim Songhori, Shen Wang, Young-Joon Lee, Eric Johnson, Omkar Pathak, Azade Nazi, Jiwoo Pak, Andy Tong, Kavya Srinivasa, William Hang, Emre Tuncer, Quoc V. Le, James Laudon, Richard Ho, Roger Carpenter, and Jeff Dean. A graph placement methodology for fast chip design. Nature, 594(7862):207–212, June 2021. Number: 7862 Publisher: Nature Publishing Group.

[Mil19] Bartosz Milewski. Category Theory for Programmers. Blurb, 2019.

[Mil21a] Bartosz Milewski. Dependent Optics, September 2021.

[Mil21b] Bartosz Milewski. PolyLens, December 2021.

[Mil22] Bartosz Milewski. Compound Optics, March 2022. arXiv:2203.12022 [math].

[MOT15] Alexander Mordvintsev, Christopher Olah, and Mike Tyka. Inceptionism: Going Deeper into Neural Networks, June 2015.

[MP23] Sean Moss and Paolo Perrone. A category-theoretic proof of the ergodic decomposition theorem. Ergodic Theory and Dynamical Systems, pages 1–27, February 2023. arXiv:2207.07353 [cs, math].

[MV14] Aravindh Mahendran and Andrea Vedaldi. Understanding Deep Image Representations by Inverting Them, November 2014. arXiv:1412.0035 [cs].

[Mye22a] David Jaz Myers. Categorical Systems Theory. unpublished book draft, February 2022.

[Mye22b] David Jaz Myers. The Para Construction as a Distributive Law, December 2022.

[Nes93] Yurii Evgenievich Nesterov. A method of solving a convex programming problem with convergence rate $O\bigl(\frac{1}{k^2}\bigr)$. Dokl. Akad. Nauk SSSR, 269(3):543–547, 1993.

[NKB+19] Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, and Ilya Sutskever. Deep Double Descent: Where Bigger Models and More Data Hurt, December 2019. arXiv:1912.02292 [cs, stat].

[nLa23a] nLab authors. locally graded category, June 2023.

[nLa23b] nLab authors. quasi-limit, June 2023.

[NS23] Nelson Niu and David I. Spivak. Polynomial Functors: A Mathematical Theory of Interaction. Preprint, June 2023.

[NW22] Minh Nguyen and Nicolas Wu. Folding over Neural Networks. In Ekaterina Komendantskaya, editor, Mathematics of Program Construction, Lecture Notes in Computer Science, pages 129–150, Cham, 2022. Springer International Publishing.

[NYC15] Anh Nguyen, Jason Yosinski, and Jeff Clune. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images, April 2015. arXiv:1412.1897 [cs].

[Ola15] Christopher Olah. Neural Networks, Types, and Functional Programming – colah’s blog, September 2015.

[OWEI20] Dominic Orchard, Philip Wadler, and Harley Eades III. Unifying graded and parameterised monads. Electronic Proceedings in Theoretical Computer Science, 317:18–38, May 2020. arXiv:2001.10274 [cs, math].

[Pav14] Dusko Pavlovic. Chasing Diagrams in Cryptography. In Claudia Casadio, Bob Coecke, Michael Moortgat, and Philip Scott, editors, Categories and Types in Logic, Language, and Physics: Essays Dedicated to Jim Lambek on the Occasion of His 90th Birthday, Lecture Notes in Computer Science, pages 353–367. Springer, Berlin, Heidelberg, 2014.

[Per22] Paolo Perrone. Markov Categories and Entropy, December 2022. arXiv:2212.11719 [cs, math, stat].

[PGM+19] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32, pages 8024–8035. Curran Associates, Inc., 2019.

[PGW17] Matthew Pickering, Jeremy Gibbons, and Nicolas Wu. Profunctor Optics: Modular Data Accessors. The Art, Science, and Engineering of Programming, 1(2):7, April 2017. arXiv:1703.10857 [cs].

[PMB13] Razvan Pascanu, Tomas Mikolov, and Yoshua Bengio. On the difficulty of training Recurrent Neural Networks, February 2013. arXiv:1211.5063 [cs].

[Pol64] B. T. Polyak. Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics, 4(5):1–17, January 1964.

[PRS22] João Paixão, Lucas Rufino, and Pawel Sobociński. High-level axioms for graphical linear algebra. Science of Computer Programming, 218:102791, June 2022.

[PSHM23] Mathilde Papillon, Sophia Sanborn, Mustafa Hajij, and Nina Miolane. Architectures of Topological Deep Learning: A Survey on Topological Neural Networks, April 2023. arXiv:2304.10031 [cs].

[PVM+21] Wei Peng, Tuomas Varanka, Abdelrahman Mostafa, Henglin Shi, and Guoying Zhao. Hyperbolic Deep Neural Networks: A Survey, February 2021. arXiv:2101.04562 [cs].

[PYY+19] Zhaoqing Pan, Weijie Yu, Xiaokai Yi, Asifullah Khan, Feng Yuan, and Yuhui Zheng. Recent Progress on Generative Adversarial Networks (GANs): A Survey. IEEE Access, 7:36322–36333, 2019. Conference Name: IEEE Access.

[QGL+20] Haotong Qin, Ruihao Gong, Xianglong Liu, Xiao Bai, Jingkuan Song, and Nicu Sebe. Binary Neural Networks: A Survey. Pattern Recognition, 105:107281, September 2020. arXiv:2004.03333 [cs].

[Ril18] Mitchell Riley. Categories of Optics. arXiv:1809.00738 [math], September 2018. arXiv: 1809.00738.

[RVHV19] Lectured David Reutter, Jamie Vicary, Notes Chris Heunen, and Jamie Vicary. Categorical Quantum Mechanics, 2019.

[SA20] Florian Schäfer and Anima Anandkumar. Competitive Gradient Descent, June 2020. arXiv:1905.12103 [cs, math].

[SB18] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. A Bradford Book, Cambridge, MA, USA, October 2018.

[Sch19] Robin M. Schmidt. Recurrent Neural Networks (RNNs): A gentle Introduction and Overview, November 2019. arXiv:1912.05911 [cs, stat].

[Sel11] P. Selinger. A Survey of Graphical Languages for Monoidal Categories. In Bob Coecke, editor, New Structures for Physics, Lecture Notes in Physics, pages 289–355. Springer, Berlin, Heidelberg, 2011.

[SG22] Razin A. Shaikh and Stefano Gogioso. Categorical Semantics for Feynman Diagrams, May 2022. arXiv:2205.00466 [quant-ph].

[Shi22] Dan Shiebler. Kan Extensions in Data Science and Machine Learning, July 2022. arXiv:2203.09018 [cs, stat].

[Shi23] D. Shiebler. Compositionality and functorial invariants in machine learning. University of Oxford, 2023.

[SK16] Tim Salimans and Diederik P. Kingma. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, June 2016. arXiv:1602.07868 [cs].

[SK19] David Sprunger and Shin-ya Katsumata. Differentiable Causal Computations via Delayed Trace, March 2019. arXiv:1903.01093 [cs, math].

[Smi22] Toby St Clere Smithe. Mathematical Foundations for a Compositional Account of the Bayesian Brain, December 2022. arXiv:2212.12538 [cs, math, q-bio, stat].

[Spi20] David I. Spivak. Poly: An abundant categorical setting for mode-dependent dynamics, June 2020. arXiv:2005.01894 [math].

[Spi22a] David I. Spivak. Generalized Lens Categories via functors $\mathcal{C}^{\rm op}\to\mathsf{Cat}$, March 2022. arXiv:1908.02202 [cs, math].

[Spi22b] David I. Spivak. Learners’ Languages. Electronic Proceedings in Theoretical Computer Science, 372:14–28, November 2022. arXiv:2103.01189 [cs, math].

[Spi23] David I. Spivak. Functorial aggregation, April 2023. arXiv:2111.10968 [cs, math].

[SPW+13] Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher D. Manning, Andrew Ng, and Christopher Potts. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1631–1642, Seattle, Washington, USA, October 2013. Association for Computational Linguistics.

[Str12] Ross Street. Monoidal categories in, and linking, geometry and algebra, October 2012. arXiv:1201.2991 [math].

[Stu15] Kirk Sturtz. Categorical Probability Theory, March 2015. arXiv:1406.6030 [math].

[SVZ14] Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps, April 2014. arXiv:1312.6034 [cs].

[TDBM22] Yi Tay, Mostafa Dehghani, Dara Bahri, and Donald Metzler. Efficient Transformers: A Survey, March 2022. arXiv:2009.06732 [cs].

[TYdF21] Alexis Toumi, Richie Yeung, and Giovanni de Felice. Diagrammatic Differentiation for Quantum Machine Learning. Electronic Proceedings in Theoretical Computer Science, 343:132–144, September 2021. arXiv:2103.07960 [quant-ph].

[US20] Henning Urbat and Lutz Schröder. Automata Learning: An Algebraic Approach, August 2020. arXiv:1911.00874 [cs].

[VB21] Petar Veličković and Charles Blundell. Neural Algorithmic Reasoning. Patterns, 2(7):100273, July 2021. arXiv:2105.02761 [cs, math, stat].

[VC22] Andre Videla and Matteo Capucci. Lenses for Composable Servers, March 2022. arXiv:2203.15633 [cs].

[Vel22] Petar Veličković. Message passing all the way up, February 2022. arXiv:2202.11097 [cs, stat].

[Ver22] Pietro Vertechi. Dependent Optics, May 2022. arXiv:2204.09547 [cs, math].

[vONR+22] Johannes von Oswald, Eyvind Niklasson, Ettore Randazzo, João Sacramento, Alexander Mordvintsev, Andrey Zhmoginov, and Max Vladymyrov. Transformers learn in-context by gradient descent, December 2022. arXiv:2212.07677 [cs].

[VS22] Matthijs Vákár and Tom Smeding. CHAD: Combinatory Homomorphic Automatic Differentiation, June 2022. arXiv:2103.15776 [cs].

[VSP+17] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention Is All You Need, December 2017. arXiv:1706.03762 [cs].

[Wad93] Philip Wadler. Monads for functional programming. In Manfred Broy, editor, Program Design Calculi, NATO ASI Series, pages 233–264, Berlin, Heidelberg, 1993. Springer.

[WDCC19] Erwei Wang, James J. Davis, Peter Y. K. Cheung, and George A. Constantinides. LUTNet: Rethinking Inference in FPGA Soft Logic, April 2019. arXiv:1904.00938 [cs, stat].

[Win23] Glynn Winskel. Making Concurrency Functional, April 2023. arXiv:2202.13910 [cs].

[WMLC23] Vincent Wang-Mascianica, Jonathon Liu, and Bob Coecke. Distilling Text into Circuits, January 2023. arXiv:2301.10595 [cs, math].

[WPC+21] Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S. Yu. A Comprehensive Survey on Graph Neural Networks. IEEE Transactions on Neural Networks and Learning Systems, 32(1):4–24, January 2021.

[WZ21] Paul Wilson and Fabio Zanasi. Reverse Derivative Ascent: A Categorical Approach to Learning Boolean Circuits. Electronic Proceedings in Theoretical Computer Science, 333:247–260, February 2021. arXiv:2101.10488 [cs].

[ZPIE20] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, August 2020. arXiv:1703.10593 [cs].
