Skip to content

Instantly share code, notes, and snippets.

@simonw
Created February 27, 2025 23:59
Show Gist options
  • Select an option

  • Save simonw/0c6f071d9ad1cea493de4e5e7a0986bb to your computer and use it in GitHub Desktop.

Select an option

Save simonw/0c6f071d9ad1cea493de4e5e7a0986bb to your computer and use it in GitHub Desktop.

Here's a breakdown of the themes and opinions expressed in the comments, along with illustrative quotes.

Naming Conventions for LLMs

Several users commented on the seemingly arbitrary or uninspired naming schemes used for LLMs.

  • Naming Critique: Users find the naming conventions of LLMs, especially OpenAI's, to be nonsensical and lacking creativity, comparing it to poor variable naming in programming.
    • "At this point I think the ultimate benchmark for any new LLM is whether or not it can come up with a coherent naming scheme for itself. Call it “self awareness.”" - throwup238
    • "The people naming them really took the "just give the variable any old name, it doesn't matter" advice from Programming 101 to heart." - lenerdenator
    • "Still more coherent than the OpenAI lineup." - smallmancontrov
    • "3,3.5,4,4o,4.5 I had my money on 4oz" - nopelynopington

Underwhelming Performance and Value of GPT-4.5

A central theme revolves around the perception that GPT-4.5 is underwhelming, especially considering its high cost. Many users express confusion and disappointment regarding its capabilities compared to existing models like GPT-4o and Claude 3.7.

  • Performance vs. Cost: Many users feel GPT-4.5 doesn't justify its cost, finding other models offer better performance at a fraction of the price.
    • "Considering both this blog post and the livestream demos, I am underwhelmed...What has been shown feels like it could be achieved using a custom system prompt on older versions of OpenAIs models" - Topfi
    • "It seems clearly worse than Claude Sonnet 3.7, yet costs 30x as much?" - jasonjmcghee
    • "In their model evaluation scores in the appendix, 4.5 is, on average, 26% better. I don't understand the value here." - MattSayar
    • "Wow you aren't kidding, 30x input price and 15x output price vs 4o is insane." - jdprgm
    • "...the difference in usefulness is so lacking(?) they're not even gonna keep serving it in the API for long" - zaptrem
  • Questionable Justification for Release: Some speculate OpenAI released GPT-4.5 due to pressure to demonstrate progress, despite its lackluster improvements.
    • "“We don't really know what this is good for, but spent a lot of money and time making it and are under intense pressure to announce new things right now. If you can figure something out, we need you to help us." - swatcoder (interpreting OpenAI's statement)
    • "This strikes me as the opposite of rushed. I get the impression that they've been sitting on this for a while and couldn't make it look as good as previous improvements." - bitshiftfaced
    • "...if they have some amazing capabilities that make a 30-fold pricing increase justifiable, why not show it? and if not, why release this at all?" - Topfi

Pricing Concerns and Compute Costs

The exorbitant pricing of GPT-4.5 is a major point of contention. Users express shock at the token costs and speculate about the reasons behind it, including GPU scarcity and a strategy to discourage casual use.

  • Extremely High Prices: The pricing is considered unreasonable, with some suggesting it may be due to GPU scarcity or an attempt to limit usage.
    • "$75.00 / 1M tokens for input $150.00 / 1M tokens for output That's crazy prices." - ekojs
    • "API price is crazy high. This model must be huge. Not sure this is practical" - brokensegue
    • "It sounds like it's so expensive and the difference in usefulness is so lacking(?) they're not even gonna keep serving it in the API for long" - zaptrem
  • GPU Availability and OpenAI's Consumption: Concerns are raised about OpenAI's consumption of GPUs and the impact on availability for other users and consumer products.
    • "OpenAI may be eating up the GPU supply..." - bhouston
    • "Nvidia really has basically abandoned consumers." - jdprgm

Model Capabilities and Comparisons

Users compare GPT-4.5 to other models, particularly Claude 3.7 and GPT-4o, in terms of coding ability, reasoning, and general usefulness. Many find Claude 3.7 to be superior in several aspects.

  • Coding Performance: Some coding benchmarks indicate Claude 3.7 outperforms GPT-4.5, leading users to prefer it for coding tasks.
    • "A bit better at coding than ChatGPT 4o but not better than o3-mini...BTW Anthropic Claude 3.7 is better than o3-mini at coding" - bhouston
    • "personal anecdote - claude code is the best llm devx i've had." - jasonjmcghee
    • "It seems clearly worse than Claude Sonnet 3.7, yet costs 30x as much?" - jasonjmcghee
  • Reasoning and General Intelligence: Opinions diverge on whether GPT-4.5 is truly "smarter." Some believe its strength lies in general knowledge rather than reasoning.
    • "Compared to OpenAI o1 and OpenAI o3‑mini, GPT‑4.5 is a more general-purpose, innately smarter model" - granzymes (quoting OpenAI)
    • "GPT-4.5 is a more general-purpose, innately smarter model. We believe reasoning will be a core capability of future models, and that the two approaches to scaling—pre-training and reasoning—will complement each other." - eightysixfour
  • Hallucinations: There is some discussion on whether GPT-4.5 hallucinates less than other models.
    • "Early testing doesn't show that it hallucinates less, but we expect that putting that sentence nearby will lead you to draw a connection there yourself". - jodrellblank

OpenAI's Strategy and Future Direction

Several commenters speculate on OpenAI's strategic direction, particularly regarding the balance between scaling existing models and developing reasoning-based AI.

  • Scaling vs. Reasoning: Some believe OpenAI is doubling down on scaling parameters rather than focusing on improving reasoning capabilities, while others see a potential ensemble approach with multiple specialized models.
    • "GPT-4.5 is insanely over price, it makes Anthropic look affordable!" - smcleod
    • "I feel like OpenAI is pursuing AGI when Anthropic/Claude is pursuing making AI awesome for practical things like coding." - wewewedxfgdf
    • "OpenAI seems to be betting that you'll need an ensemble of models with different capabilities, working as a single system, to jump beyond what the reasoning models today can do." - eightysixfour
  • Potential for Synthetic Data Generation: Some suggest GPT-4.5's high cost may be justified by its ability to generate high-quality synthetic data for training other models.
    • "At a glance, my first impression is this is something like Llama 3.1 405B, where the primary value may be realized in generating high quality synthetic data for training rather than direct use." - harlanlewis

Subjective Qualities: EQ, Tone, and "Friendliness"

Some users note OpenAI's emphasis on "EQ" (emotional quotient) and conversational tone in GPT-4.5, with mixed reactions.

  • Concerns About "EQ" and Personalization: Some are wary of the trend towards AI models adopting more "human-like" personalities, particularly regarding potential for manipulation or addiction.
    • "It is interesting that they are focusing a large part of this release on the model having a higher 'EQ' (Emotional Quotient)." - sebastiennight
    • "OpenAI doubling down on the American-style therapy-speak instead of focusing on usefulness. No thanks." - mvdtnz
  • Preference for More Succinct Responses: Some users appreciate the potential for a more concise and direct conversational style.
    • "GPT‑4.5 is more succinct and conversational" - kgeist (quoting OpenAI)

Business Model and Profitability Concerns

The long-term viability of OpenAI's business model is questioned, particularly in light of the high costs associated with large language models.

  • Profitability Doubts: Some express skepticism about OpenAI's path to profitability and sustainability, suggesting they are burning through capital and lack a strong competitive moat.
    • "AI as it stands in 2025 is an amazing technology, but it is not a product at all...OpenAI simply does not have a business model" - ur-whale

Alternatives and Competition

Several users mention alternative AI models and services, particularly Claude and Gemini, highlighting the growing competition in the AI space.

  • Claude and Gemini as Alternatives: Many users express satisfaction with Claude and Gemini, often citing superior performance or cost-effectiveness.
    • "My usage has come down to mostly Claude (until I run out of free tier quota) and then Gemini." - rakejake
    • "Claude 3.6 (new 3.5) and 3.7 non-reasoning are much better at pretty much everything, and much cheaper. What's Anthropic's secret sauce?" - synapsomorphy

State of Utopia

  • ChatGPT for State of Utopia Project: One user expresses some worry about the state of affairs because a simple rename function for his State of Utopia project failed repeatedly.

    • "It couldn't write a simple rename function for me yesterday, still buggy after seven attempts...I'm shocked and surprised ChatGPT failed to get a rename function to work, in 7 attempts." - logicallee

Illustration of Uncommon Views

A few users expressed opinions that stood out from the overall sentiment:

  • Optimism about Creative Writing Improvements: While many were underwhelmed, some saw potential in GPT-4.5 for creative writing.

    • "rethinking your comment 'was that all' I am listening to the stream now and had a thought. Most of the new models that have come out in the past few weeks have been great at coding and logical reasoning. But 4o has been better at creative writing. I am wondering if 4.5 is going to be even better at creative writing than 4o." - tmaly
  • The Value of exploring scaling laws: At least one commenter thought even an expensive flop had value.

    • "In many ways I'm not an OpenAI fan (but I need to recognize their many merits). At the same time, I believe people are missing what they tried to do with GPT 4.5: it was needed and important to explore the pre-training scaling law in that direction. A gift to science, however selfist it could be." - antirez

In summary, the overall sentiment towards GPT-4.5 leans towards disappointment, particularly in light of its high cost and perceived lack of substantial improvements over existing models. Many users express a preference for alternative AI services and question OpenAI's strategic direction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment