Initialized litellm callbacks, Async Success Callbacks: [<litellm.proxy.hooks.parallel_request_limiter._PROXY_MaxParallelRequestsHandler object at 0x7f1a09f63760>, <litellm.proxy.hooks.tpm_rpm_limiter._PROXY_MaxTPMRPMLimiter object at 0x7f1a09f63790>, <litellm.proxy.hooks.max_budget_limiter._PROXY_MaxBudgetLimiter object at 0x7f1a09f63850>, <litellm.proxy.hooks.cache_control_check._PROXY_CacheControlCheck object at 0x7f1a09f63880>, <litellm._service_logger.ServiceLogging object at 0x7f1a05f21780>, <bound method ProxyLogging.response_taking_too_long_callback of <litellm.proxy.utils.ProxyLogging object at 0x7f1a09f63280>>]
self.optional_params: {}
litellm.cache: None
kwargs[caching]: False; litellm.cache: None
Final returned optional params: {'temperature': 0.0, 'top_p': 1, 'n': 1, 'max_tokens': 12000, 'presence_penalty': 0, 'frequency_penalty': 0, 'max_retries': 0, 'extra_body': {}}
self.optional_params: {'temperature': 0.0, 'top_p': 1, 'n': 1, 'max_tokens': 12000, 'presence_penalty': 0, 'frequency_penalty': 0, 'max_retries': 0, 'extra_body': {}}
RAW RESPONSE:
<coroutine object OpenAIChatCompletion.acompletion at 0x7f1a05c68890>
04:41:59 - LiteLLM:INFO: utils.py:1133 -
POST Request Sent from LiteLLM:
curl -X POST \
https://openrouter.ai/api/v1/ \
-H 'Authorization: Bearer sk-or-v1-cb5d631007ed554a0e649850d1fe81191da263f4c681********************' \
-d '{'model': 'meta-llama/llama-3-70b-instruct:nitro', 'messages': [{'role': 'user', 'content': "Baleen is a scalable multi-hop reasoning system that improves accuracy and robustness in open-domain questions. It uses a condensed retrieval architecture, where facts from each hop are summarized into a short context for subsequent hops, and a focused late interaction passage retriever (FLIPR) to tackle query complexity. This approach allows Baleen to achieve 96.3% answer recall in the top-20 retrieved passages on the HotPotQA benchmark and outperform the official baseline on the HoVer task by over 30 points in retrieval accuracy.\n\n\n\nProcess and generate Markdown\nConvert the provided text into a markdown format using <details> and <summary> tags to create a hierarchical structure. Organize the text by paragraphs or thematic sections, each as a root-level collapsible block. Within each block, sentence by sentence, each sentence or related groups of sentences nested in collapsible sections. Limit nesting to a maximum of two or three levels to avoid recursive runaway, but not just one. Use paragraphs as clues to start and stop a nesting. Each <summary> tag should contain a keyword or short two word phrase in the spirit of the content of the <details>. <details> contains the sentence exactly word for word. Do each sentence exactly. This method creates a manageable hierarchical document, with each root details having at least one or more children, with the depth of nesting clearly controlled to prevent infinite recursion.\n\n\nStart with a ```markdown and then wrap the whole thing in <detail><summary>{subject of paper}</summary>...... text and child <detail's. Remember every sentence is wrapped with a <d><s>keyword</s>content</d> pattern, and inside any sentence highlight keyword\n\nPlease note process the entire text in full details do not skip any sentence or paragraph."}], 'temperature': 0.0, 'top_p': 1, 'n': 1, 'max_tokens': 12000, 'presence_penalty': 0, 'frequency_penalty': 0, 'extra_body': {'transforms': []}, 'extra_headers': {'HTTP-Referer': 'https://litellm.ai', 'X-Title': 'liteLLM'}}'
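For reproduction, the request above corresponds roughly to the following `litellm.acompletion` call (a minimal sketch; the prompt is abbreviated and the OpenRouter key is assumed to be set via the `OPENROUTER_API_KEY` environment variable):

```python
import asyncio
import litellm

async def main():
    # Mirrors the parameters shown in the POST above; the full Baleen/markdown
    # prompt from the log is abbreviated here.
    response = await litellm.acompletion(
        model="openrouter/meta-llama/llama-3-70b-instruct:nitro",
        messages=[{"role": "user", "content": "Baleen is a scalable multi-hop reasoning system ..."}],
        temperature=0.0,
        top_p=1,
        n=1,
        max_tokens=12000,  # exceeds the endpoint's 8192-token window, as the response below shows
        presence_penalty=0,
        frequency_penalty=0,
        extra_body={"transforms": []},  # OpenRouter-specific extra: prompt transforms disabled
    )
    print(response)

asyncio.run(main())
```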
RAW RESPONSE:
{"id": null, "choices": null, "created": null, "model": null, "object": null, "system_fingerprint": null, "usage": null, "error": {"message": "This endpoint's maximum context length is 8192 tokens. However, you requested about 12490 tokens (490 in the input, 12000 in the output). Please reduce the length of either one, or use the \"middle-out\" transform to compress your prompt automatically.", "code": 400}}
Logging Details: logger_fn - None | callable(logger_fn) - False
Logging Details LiteLLM-Failure Call
get cache: cache key: 04-41:cooldown_models; local_only: False
get cache: cache result: None
set cache: key: 04-41:cooldown_models; value: ['1873cd978ca45b20666fb36820f14d2de4ea9895370eb897a28d864f835c118f']
InMemoryCache: set_cache
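The router then quarantines the failing deployment: it reads the per-minute `cooldown_models` cache key (`04-41:...`), finds nothing, and writes back a list containing the deployment's hash. The sketch below illustrates the pattern visible in these log lines; it is not LiteLLM's actual implementation, and the helper names are hypothetical:

```python
from datetime import datetime, timezone

# In-memory stand-in for the cache seen in the log.
cooldown_cache: dict[str, list[str]] = {}

def _cooldown_key() -> str:
    # Per-minute bucket, matching the "HH-MM:cooldown_models" key format above.
    return f"{datetime.now(timezone.utc).strftime('%H-%M')}:cooldown_models"

def add_to_cooldown(deployment_hash: str) -> None:
    models = cooldown_cache.get(_cooldown_key(), [])  # "get cache" -> None on first failure
    if deployment_hash not in models:
        models.append(deployment_hash)
    cooldown_cache[_cooldown_key()] = models          # "set cache" with the updated list

def is_cooled_down(deployment_hash: str) -> bool:
    # Routing can skip deployments found in the current minute's bucket.
    return deployment_hash in cooldown_cache.get(_cooldown_key(), [])
```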
Inside Max Parallel Request Failure Hook
user_api_key: asdf
get cache: cache key: asdf::2024-04-24-04-41::request_count; local_only: False
get cache: cache result: {'current_requests': 0, 'current_tpm': 0, 'current_rpm': 0}
updated_value in failure call: {'current_requests': 0, 'current_tpm': 0, 'current_rpm': 0}
set cache: key: asdf::2024-04-24-04-41::request_count; value: {'current_requests': 0, 'current_tpm': 0, 'current_rpm': 0}
InMemoryCache: set_cache
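In parallel, the max-parallel-requests hook updates the per-key, per-minute usage bucket (`{api_key}::{YYYY-MM-DD-HH-MM}::request_count`). The field names below come from the log; treating the failure hook as a decrement-and-clamp on `current_requests` is an assumption, consistent with the counters staying at zero here:

```python
from datetime import datetime, timezone

# In-memory stand-in for the per-key usage cache seen in the log.
usage_cache: dict[str, dict[str, int]] = {}

def failure_hook(user_api_key: str) -> None:
    minute = datetime.now(timezone.utc).strftime("%Y-%m-%d-%H-%M")
    key = f"{user_api_key}::{minute}::request_count"
    bucket = usage_cache.get(key) or {"current_requests": 0, "current_tpm": 0, "current_rpm": 0}
    # Assumption: a failed call frees its parallel-request slot; clamp at zero.
    bucket["current_requests"] = max(0, bucket["current_requests"] - 1)
    usage_cache[key] = bucket
```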
04:41:59 - LiteLLM Router:INFO: router.py:537 - litellm.acompletion(model=openrouter/meta-llama/llama-3-70b-instruct:nitro) Exception Invalid response object Traceback (most recent call last):
  File "/home/fullstack/.local/lib/python3.10/site-packages/litellm/utils.py", line 6957, in convert_to_model_response_object
    for idx, choice in enumerate(response_object["choices"]):
TypeError: 'NoneType' object is not iterable

Traceback (most recent call last):
  File "/home/fullstack/.local/lib/python3.10/site-packages/litellm/utils.py", line 6957, in convert_to_model_response_object
    for idx, choice in enumerate(response_object["choices"]):
TypeError: 'NoneType' object is not iterable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/fullstack/.local/lib/python3.10/site-packages/litellm/main.py", line 318, in acompletion
    response = await init_response
  File "/home/fullstack/.local/lib/python3.10/site-packages/litellm/llms/openai.py", line 477, in acompletion
    raise e
  File "/home/fullstack/.local/lib/python3.10/site-packages/litellm/llms/openai.py", line 472, in acompletion
    return convert_to_model_response_object(
  File "/home/fullstack/.local/lib/python3.10/site-packages/litellm/utils.py", line 7077, in convert_to_model_response_object
    raise Exception(f"Invalid response object {traceback.format_exc()}")
Exception: Invalid response object Traceback (most recent call last):
  File "/home/fullstack/.local/lib/python3.10/site-packages/litellm/utils.py", line 6957, in convert_to_model_response_object
    for idx, choice in enumerate(response_object["choices"]):
TypeError: 'NoneType' object is not iterable
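The tracebacks point at the root cause: the provider returned an error payload with `"choices": null`, and `convert_to_model_response_object` iterates `response_object["choices"]` before checking for an error, so the upstream 400 surfaces as a `TypeError` wrapped in "Invalid response object". A minimal sketch of the kind of guard that would report the 400 cleanly (a hypothetical helper, not LiteLLM's code):

```python
def convert_response(response_object: dict) -> list:
    # Surface a provider-side error (like the 400 above) before touching choices.
    if response_object.get("error") is not None:
        err = response_object["error"]
        raise RuntimeError(f"Provider error {err.get('code')}: {err.get('message')}")
    # Guard against a null/missing choices field so iteration never hits None.
    if not response_object.get("choices"):
        raise RuntimeError("Provider returned no choices")
    return list(response_object["choices"])
```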