@ashwinb - Are we planning to support the completions endpoint long-term?
It seems the industry is moving away from it. E.g., OpenAI has marked their completions API as legacy (https://platform.openai.com/docs/api-reference/completions), and Anthropic has done the same: https://docs.anthropic.com/en/api/complete
I'm wondering if it would be a good idea for us to move away from the endpoint as well, while llama-stack is still in its early days.
Maintaining it long-term, then deprecating and removing it, could take a decent amount of bandwidth from us.
Thanks for all your contributions @aidando73 -- we want to keep supporting completions at least for now, because we believe having raw access to a model is just as important. Unlike other providers' models, Llamas are open source, and people play with them and iterate on them in a variety of ways. The manipulations we apply internally in the chat_completion endpoint may not always be what users intend. Sometimes they just want a carefully formatted prompt to hit the model directly.
As another example, I'm working on adding a Groq adapter at the moment, and they don't support completions (https://console.groq.com/docs/api-reference#chat); there's a non-zero chance they might never implement it. So we might need a workaround; otherwise we're stuck with NotImplementedError() (we can probably use a chat completion under the hood -- see the sketch below).
On that theme, I think it would be great if Groq could build a completions endpoint on their end too. But until that time, NotImplementedError() would have to do.
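For reference, here's a minimal sketch of that "chat completion under the hood" workaround. It assumes an OpenAI-style client such as Groq's Python SDK; the function name and the exact call shape are illustrative assumptions, not llama-stack's actual adapter interface.

```python
# Rough sketch: emulating a raw /completions call on top of a chat-only
# provider (e.g. Groq). The client is assumed to follow the OpenAI-style
# interface that Groq's Python SDK mirrors; names here are assumptions,
# not llama-stack's adapter API.

def completion_via_chat(client, model: str, prompt: str, **params) -> str:
    """Emulate a completions call by wrapping the prompt in a single
    user message. Caveat: the provider still applies its own chat
    template server-side, so this is an approximation, not true
    raw-model access."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        **params,
    )
    return response.choices[0].message.content
```

The caveat in the docstring is essentially @ashwinb's point above: since the provider still injects its own chat template, this shim can't deliver a carefully formatted prompt straight to the raw model the way a true completions endpoint would, so it's a stopgap rather than a substitute.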