@JohnReedLOL
Created August 28, 2023 03:36
Dear Neil deGrasse Tyson...
I just wrote the following letter to Neil deGrasse Tyson via https://neildegrassetyson.com/general-comments/ (this version has been edited for clarity):
Hey Neil,
I heard you say online that we've had AI (which, to computer people, currently means the same thing as the phrase "Machine Learning") for many decades. No. The meaning of the term "AI" has changed. It used to mean any program that tries to mimic what a human would do (like a computer that does calculations to determine which move to make in chess). That's not what AI means anymore. Nowadays there is a division between traditional, unambiguous, concrete, literal computer code (i.e. "IF this happens THEN do that") and Machine Learning based computer code. When you talk to Apple's Siri and it converts what you just said into typed text, that is Machine Learning, not traditional computer coding.

The distinction is that Machine Learning requires massive samples of clean, pre-prepared, usually pre-labeled data that are used to train an algorithm. Internally, the trained algorithm is (typically) a "black box": it cannot be read or debugged by stepping through its code one line at a time, the way a programmer would with normal computer code typed by hand on a laptop.

When you hand-write your address on a letter that is sent in the mail, the letter is put on a conveyor belt, a photo of it is taken, and a trained "black box" algorithm converts that photo into typed text that can be searched to find the address you wrote. That algorithm was trained on a massive collection of data: countless clean photos of handwritten letters and numbers, paired with the typed text corresponding to what was handwritten. If traditional computer coding methods were used to recognize handwritten letters instead, the hand-drawn strokes would be placed on a graph and the angles of the lines would be used to determine which letter was written. For example, a handwritten lowercase "l" is a single near-vertical stroke, one that would meet the bottom edge of the paper at an angle between roughly 80 and 110 degrees.
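The traditional, rule-based approach just described can be sketched in a few lines of Python. This is a minimal illustration, not real postal software: the 80-110 degree range comes from the example above, and the function names are made up for this sketch.

```python
import math

def stroke_angle(x1, y1, x2, y2):
    """Angle of a hand-drawn stroke relative to the horizontal, in degrees (0-180)."""
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180

def looks_like_lowercase_l(x1, y1, x2, y2):
    """Hard-coded rule: a single near-vertical stroke (80-110 degrees) reads as 'l'."""
    return 80 <= stroke_angle(x1, y1, x2, y2) <= 110

# A nearly vertical stroke qualifies; a 45-degree diagonal does not.
print(looks_like_lowercase_l(0, 0, 1, 20))   # -> True
print(looks_like_lowercase_l(0, 0, 10, 10))  # -> False
```

Notice that the constants 80 and 110 were typed by a programmer. That hand-typed constant is exactly what Machine Learning does away with.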
That's not how the software at the post office works: it is not drawing lines and checking their angles against hard-coded values that a programmer typed by hand. Any internal numerical constants are arrived at by an algorithm that performs some sort of combinatorial analysis of all the training/sample data; the constants are NOT hard-coded the way traditional computer code is. Machine Learning is "fuzzy", or imprecise, while traditional computer coding is totally precise and non-fuzzy.

It is possible for a Machine Learning based program and a program written entirely via traditional hard-coding to do the exact same thing. For example, one program might play chess via hard-coded calculations (if you can take this chess piece but you will lose that piece, then don't make that move), while another uses Machine Learning based methods (by storing and pre-processing billions of positions from countless professional chess games, along with the outcomes of those games, and having the algorithm figure out how to reach a winning position without a programmer ever telling it exactly what to do in a given position or how to do it).

When you see something like ChatGPT or Google's AI, such systems ("large language models", or LLMs) are normally created by using the entirety of Wikipedia and/or similar corpora as the stored training data that gets pre-processed to build the "black box" algorithm that produces the output. This is made possible by the advent of "Big Data": NoSQL databases and Online Analytical Processing (OLAP) databases that total dozens or even hundreds of petabytes and function as a single database despite being made up of thousands of separate computers connected over some networking medium, such as a Local Area Network (LAN) running over Ethernet cables.
Hope that clears things up.
- John Michael Reed