@idvorkin
Created January 2, 2025 01:35

Changes to idvorkin/tony_tesla From [2024-04-20] To [2025-01-02]

  • Model: gpt-4o-2024-11-20
  • Duration: 25 seconds
  • Date: 2025-01-01 17:35:23


Summary

  • Establishment of Coding Standards and Conventions:

    • Comprehensive coding, testing, and debugging conventions introduced in CONVENTIONS.md to streamline development practices, including the use of specific libraries, TUI development guidelines, and testing strategies.
  • Expansion of Tony Assistant Capabilities:

    • New commands and enhanced features for Tony Assistant added in tony.py, including debugging, configuration export, call parsing, and detailed cost breakdowns.
    • A new FastAPI-based server implemented in tony_server.py to replace Modal, supporting assistant requests, journal management, and transit arrivals.
  • New Blog Functionality:

  • Transit System Integration:

  • Journal Management Enhancements:

    • storage.py introduces CLI tools for managing journals stored in Azure Cosmos DB, including reading, appending, and clearing entries.
    • journal_cleanup.convo.md outlines a standardized format for journal entries to improve consistency.
  • Testing and Automation Enhancements:

  • Project Configuration and Setup:

  • System Prompt and Assistant Configuration Updates:

  • Documentation and Miscellaneous Updates:

    • System interactions visualized in README.md with a Mermaid sequence diagram.
    • Transit-related tools and instructions documented in bus_tools.md.
    • .gitignore file updated to exclude unnecessary files.
  • Large Data Additions:


CONVENTIONS.md

CONVENTIONS.md: +293, -0, ~293

TL;DR: Establish comprehensive coding, testing, and debugging conventions for the project.

  • Define coding conventions to standardize the use of libraries:

    • Use Typer for CLI apps, Pydantic for validation, Rich for pretty printing, Loguru for logging, and ic for debugging.
    • Prefer built-in typing syntax such as str | None over Optional[str].
    • Standardize return types using Pydantic models like FunctionReturn for clarity.
  • Enhance code style by encouraging simple, consistent practices:

    • Prefer early returns over deeply nested code.
    • Use descriptive variable names instead of relying on comments.
  • Streamline testing processes:

    • Recommend pytest-xdist for parallelized test runs.
    • Provide guidelines for debugging and running specific tests.
  • Introduce detailed conventions for creating TUI (Text User Interface) applications:

    • Utilize the Textual library for TUI development.
    • Implement standard key bindings, error handling, and Help modals.
    • Follow a consistent app structure using App classes for layout and actions.
    • Standardize DataTable usage with key handling, row selection, and debugging methods.
  • Provide debugging best practices using the ic library:

    • Debug variables, expressions, and state changes effectively.
    • Combine ic with print statements for context in test debugging.
    • Ensure async debugging is clearly marked.
  • Outline testing strategies for TUI applications:

    • Use pytest-asyncio for async tests.
    • Implement fixtures and test widget behaviors using app.run_test() context managers.
    • Test modal screens, DataTable operations, and keyboard navigation thoroughly.
    • Include debugging and error handling mechanisms within tests.

tony.py

tony.py: +149, -55, ~204

TL;DR: Enhance functionality for Tony assistant by adding new commands, improving call parsing, and extending cost breakdown details.

  • Add new commands to extend Tony's functionality:

    • debug_loader: Debug and load Tony assistant configuration with additional state context.
    • last_transcript: Retrieve and display the transcript of the most recent call.
    • dump_last_call: Dump the complete raw JSON of the most recent call for debugging purposes.
    • export_vapi_tony_config: Export the current Tony assistant configuration from VAPI.
    • calls: Extend call listing to include detailed cost breakdown when requested.
    • local_parse_config: Parse and debug local Tony assistant configuration files.
  • Improve calls function by adding a detailed cost breakdown option (--costs) to display cost components per call and total cost for all calls.

  • Enhance parse_call function:

    • Add new fields (id, Cost, CostBreakdown) to the Call model to incorporate cost details.
    • Handle nested cost data structures and calculate total cost.
    • Improve timestamp parsing with fallback mechanisms and timezone conversion for robustness.
  • Update app initialization to include no_args_is_help=True for better user guidance.

  • Replace local_debug with local_parse_config for better clarity and functionality in parsing local configurations.

  • Refactor file structure and imports:

    • Replace langchain_core.pydantic_v1 with pydantic for managing models.
    • Add typing.Annotated for type-annotated command options.

These changes collectively improve the usability, debugging, and analytics capabilities of the Tony assistant application.
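
The parse_call changes described above (nested cost data, timestamp fallbacks, timezone conversion) could look roughly like the sketch below. It uses only the standard library rather than Pydantic. The field types, the two timestamp formats, the choice of UTC as the target timezone, and the helper names `parse_timestamp` and `total_cost` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Call:
    # Field names follow the summary (id, Cost, CostBreakdown); types are assumed.
    id: str
    Cost: float = 0.0
    CostBreakdown: dict = field(default_factory=dict)


def parse_timestamp(raw: str) -> datetime:
    """Try fractional-second ISO 8601 first, then fall back to whole seconds."""
    for fmt in ("%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%dT%H:%M:%S%z"):
        try:
            # Normalize to UTC (an assumed target timezone).
            return datetime.strptime(raw, fmt).astimezone(timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp: {raw!r}")


def total_cost(breakdown: dict) -> float:
    """Sum every numeric leaf of a possibly nested cost breakdown."""
    total = 0.0
    for value in breakdown.values():
        if isinstance(value, dict):
            total += total_cost(value)
        elif isinstance(value, (int, float)):
            total += value
    return total
```

A call record could then be built as `Call(id="abc", Cost=total_cost(raw_costs), CostBreakdown=raw_costs)`, where `raw_costs` stands for whatever nested dictionary the API returns.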


tony_server.py

tony_server.py: +224, -76, ~300

TL;DR: Introduced a FastAPI-based server with multiple endpoints, replacing the previous Modal-based implementation. Added functionality for journal management, search, and transit arrivals. Enhanced security and modularity.

  • Replace Modal-based assistant implementation with a FastAPI-based server to allow more dynamic HTTP endpoints and enhanced modularity.
  • Add /assistant endpoint to process assistant-related requests with updated context, including PST time and journal content.
    • Incorporate journal content and timezone-aware time into system prompt for richer assistant context.
  • Introduce /search endpoint to handle search logic via the Perplexity API, ensuring robust and secure token-based authorization.
  • Add /library-arrivals endpoint to fetch and return transit arrival data using the OneBusAway API.
  • Implement journal management endpoints:
    • /journal-read: Fetch and return journal content from Azure Cosmos DB.
    • /journal-append: Append new entries to the journal and update the database.
  • Enhance security by introducing raise_if_not_authorized to validate API keys in headers.
  • Add warm-up functionality for endpoints to reduce latency during initial requests.
  • Modularize common functionality such as parse_tool_call and make_vapi_response for reusability.
  • Update dependencies to include libraries like FastAPI, Azure Cosmos SDK, and OneBusAway SDK.
  • Update Modal App configuration with additional secrets and dependencies for the new functionality.
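
To make the authorization and response helpers above more concrete, here is a framework-free sketch. The real functions live in a FastAPI app and their signatures are not shown in this summary, so everything here is an assumption: the `x-api-key` header name, the `NotAuthorized` exception (a stand-in for an HTTP 401), the function parameters, and the `results` / `toolCallId` response shape.

```python
import hmac


class NotAuthorized(Exception):
    """Stand-in for the HTTP 401 error a FastAPI handler would raise."""


def raise_if_not_authorized(headers: dict[str, str], expected_key: str) -> None:
    # "x-api-key" is a hypothetical header name.
    supplied = headers.get("x-api-key", "")
    # compare_digest avoids leaking key contents through timing differences.
    if not hmac.compare_digest(supplied.encode(), expected_key.encode()):
        raise NotAuthorized("invalid or missing API key")


def make_vapi_response(tool_call_id: str, result: str) -> dict:
    # Assumed response shape: a list of results keyed by the tool call id.
    return {"results": [{"toolCallId": tool_call_id, "result": result}]}
```

Keeping both helpers as plain functions is what lets the integration tests call them directly, without going through HTTP.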

pyproject.toml

pyproject.toml: +82, -0, ~82

TL;DR: Introduce a pyproject.toml configuration file to define the build system, project metadata, dependencies, testing setup, and coverage reporting for the tony_tesla project.

  • Define build system with setuptools and wheel for packaging and setuptools.build_meta as the backend.
  • Specify project metadata including:
    • Project name (tony_tesla), version (0.1.0), description, authors, license (MIT), and readme file.
  • List required dependencies for the project, including libraries like icecream, rich, pydantic, langchain, and fastapi.
  • Add optional development dependencies for testing and development tools, such as pytest, pytest-asyncio, and textual-dev.
  • Configure setuptools with Python modules including tony, storage, bus, blog_server, and tony_server.
  • Define project command-line scripts (storage, tony, bus) linking to their respective modules.
  • Set up pytest configurations:
    • Define test paths (tests/unit, tests/integration, tests/e2e), test file patterns, and test function naming conventions.
    • Add options like verbosity, traceback shortening, and a 5-minute timeout for tests.
    • Ignore specific warning types (DeprecationWarning, UserWarning) and enable strict asyncio testing.
  • Configure coverage reporting:
    • Include source files for coverage but omit files like tests/* and setup.py.
    • Exclude specific lines from coverage evaluation (e.g., pragma: no cover, if __name__ == "__main__":, etc.).

modal_readonly/tony_system_prompt.md

modal_readonly/tony_system_prompt.md: +77, -19, ~96

TL;DR: Enhanced the system prompt with more detailed instructions for interactions, added content for memorization, emotional states, and journal usage, and updated specific word mappings and card explanations.

  • Add detailed practices for the "sublime states" (Loving-kindness, Compassion, Altruistic Joy, Equanimity) to aid Igor in emotional development and practical application.
  • Expand the mnemonic system with updated word mappings ("Bee" to "Paw", "Roll" to "Rail") for better memorization associations.
  • Improve the explanation of the Tamriz memdeck to clarify card position and suit relationships, aiding in learning efficiency.
  • Introduce structured guidelines for Tony's responses, emphasizing speed, emotive delivery, and concise communication.
  • Add instructions for handling affirmations, workout recording, and blog URL responses to streamline interactions.
  • Provide a detailed framework for journaling, including categorization of entries (e.g., GRATEFUL, TODO) and handling failed tool calls.
  • Introduce reminders about Igor's preferences for Heath brothers’ books and their acronyms, ensuring relevant learning points are included in conversations.
  • Clarify workout tracking with specific follow-up questions and guidelines for recording details in the journal effectively.

blog_server.py

blog_server.py: +251, -0, ~251

TL;DR: Introduced a new FastAPI-based blog server to handle blog-related operations like fetching, searching, and retrieving random blog posts.

  • Implemented endpoints to interact with blog data:

    • /random_blog: Fetches a random blog post's content, title, and metadata.
    • /blog_info: Retrieves metadata about all blog posts, including URLs, titles, and descriptions.
    • /read_blog_post: Fetches the content of a blog post by its markdown path or URL.
    • /random_blog_url: Returns only the URL of a randomly selected blog post.
    • /blog_search: Searches blog posts using the Algolia service, excluding specific collections.
  • Added BlogReader class for handling blog metadata:

    • Fetches back-links.json from a GitHub repository.
    • Maps URLs to markdown paths and provides utility functions for URL-related operations.
  • Integrated GitHub content retrieval through read_blog_post() function:

    • Fetches raw markdown content from the repository based on a given path.
  • Secured endpoints:

    • Added authorization checks using raise_if_not_authorized.
  • Powered search functionality with Algolia:

    • Configured Algolia integration to query blog metadata with filters.
  • Integrated Modal for deployment:

    • Configured Modal's fastapi_app decorator for seamless deployment of the FastAPI app.

bus.py

bus.py: +247, -0, ~247

TL;DR: Introduce a new script to interact with King County Metro transit data, providing functionalities to fetch, process, and display bus routes, stops, and upcoming arrivals.

  • Add functionality to load and cache transit data, including routes, stops, and trip updates, using Pydantic models for structured data representation.

    • get_routes(), get_stops(), and get_trip_updates() parse respective files into structured models and cache the results for efficiency.
  • Implement a command-line interface (CLI) using Typer to provide multiple commands:

    • stops_for_route: Display stops and arrival times for a given route number.
    • library: Retrieve and display the next bus arrivals at a specific library stop.
    • get_latest_data: Download the latest transit data files asynchronously from King County Metro.
  • Provide logging and debugging support via Loguru and Icecream to handle and debug application behavior effectively.

  • Introduce asynchronous data fetching with HTTPX to improve the efficiency of downloading large transit data files.

  • Set up structured logging and error handling with Loguru's @logger.catch() to capture application errors.

This script facilitates the processing and visualization of real-time transit data for King County Metro buses.
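
The load-and-cache pattern described for get_routes() can be sketched as follows. The script itself reportedly uses Pydantic models. This version substitutes a standard-library dataclass and an inline sample so it runs without dependencies. The sample rows and the `Route` field selection are made up; only the column names follow the GTFS routes.txt layout.

```python
import csv
import io
from dataclasses import dataclass
from functools import lru_cache

# Tiny inline sample in GTFS routes.txt layout; the row values are made up.
ROUTES_TXT = """route_id,agency_id,route_short_name,route_long_name
r-1,1,48,Example Long Name
r-2,1,8,Another Example
"""


@dataclass(frozen=True)
class Route:
    route_id: str
    short_name: str


@lru_cache(maxsize=1)
def get_routes() -> tuple[Route, ...]:
    # Parsed once; later calls return the same cached tuple.
    reader = csv.DictReader(io.StringIO(ROUTES_TXT))
    return tuple(Route(row["route_id"], row["route_short_name"]) for row in reader)
```

Returning an immutable tuple means callers cannot accidentally mutate the cached result.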


tests/e2e/test_blog_server.py

tests/e2e/test_blog_server.py: +244, -0, ~244

TL;DR: Add end-to-end tests for blog server API to validate endpoints and ensure proper functionality.

  • Introduce make_request with retry logic to handle HTTP requests and enhance reliability.
    • Includes retry policies and detailed logging with icecream for debugging.
  • Add fixtures auth_headers and base_params to centralize setup for authentication and request payloads.
  • Add end-to-end test test_random_blog_e2e to verify "random_blog" endpoint functionality.
    • Ensures response structure includes valid blog content, title, markdown path, and URL.
  • Add test_blog_info_e2e to validate "blog_info" endpoint.
    • Confirms response contains a list of blog posts with required fields and a minimum count of 100.
  • Add test_read_blog_post_e2e to test "read_blog_post" endpoint for individual blog retrieval.
    • Verifies response consistency when called with direct paths and URL paths.
  • Add test_random_blog_url_e2e to check "random_blog_url_only" endpoint.
    • Confirms response has a valid blog title and URL.
  • Add test_blog_search_e2e for blog search functionality.
    • Validates search results structure, URL formatting, and content relevance to the search term "meditation".
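
The retry logic attributed to make_request is not shown in this summary. One way such a helper could be structured is sketched below, with the HTTP call abstracted into a callable so the example stays dependency-free. The name `with_retries`, the attempt count, and the linear backoff are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T], attempts: int = 3, delay_s: float = 0.0) -> T:
    """Run call() up to `attempts` times (at least 1), re-raising the last error."""
    last_error: Exception | None = None
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as error:
            last_error = error
            if attempt < attempts:
                # Linear backoff: wait a little longer after each failure.
                time.sleep(delay_s * attempt)
    assert last_error is not None
    raise last_error
```

An end-to-end test would wrap its request in a lambda, for example `with_retries(lambda: send(url), delay_s=1.0)`, where `send` is whatever function performs the HTTP call.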

tests/integration/test_blog_handler.py

tests/integration/test_blog_handler.py: +170, -0, ~170

TL;DR: Add integration tests for blog-related endpoints to ensure correct functionality of the blog handling system.

  • Validate the /random_blog endpoint to ensure it returns a non-empty list of blog entries with appropriate fields (content, title, url, markdown_path).
  • Test the /blog_info endpoint to confirm it retrieves a list of blog posts with correct structure and metadata, and ensures the presence of at least 100 posts.
  • Ensure the /read_blog_post endpoint correctly reads blog content:
    • Supports retrieval by both markdown_path and URL path.
    • Verifies the returned content and path match expectations.
  • Confirm the /random_blog_url endpoint returns a valid blog post URL and title, validating the URL format starts with https://idvork.in.
  • Test the /blog_search endpoint:
    • Validate integration with Algolia search for specific queries (e.g., "meditation").
    • Ensure the filtered results exclude a specific collection (ig66) while still containing relevant data.
    • Confirm the structure of search results includes fields like url, title, content, and collection.

storage.py

storage.py: +177, -0, ~177

TL;DR: Introduce a new script to manage Azure Cosmos DB storage operations, including file listing and journal management, with CLI commands.

  • Add CLI functionality using Typer to enable storage and journal management operations.

    • Commands include all_files, read_journal, replace_journal, append_journal, list_journal, and clear_journal.
  • Implement Azure Cosmos DB integration for CRUD operations:

    • Use CosmosClient to interact with a database for storing and retrieving data.
    • Manage journals in a specific container (journal_container).
  • Introduce JournalItemModel using Pydantic for structured journal data validation.

  • Provide utilities for journal management:

    • read_journal: Fetch and print the current journal content.
    • replace_journal: Replace journal content with new content from a file, with backup creation.
    • append_journal: Append a timestamped entry to the journal.
    • list_journal: List all journal entries.
    • clear_journal: Clear all content from the journal.
  • Enhance diagnostics and debugging using loguru for logging and icecream for structured console outputs.
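
As an illustration of the append_journal behavior, the sketch below builds the timestamped text in memory and leaves out the Cosmos DB read and write entirely. The function name `append_entry`, the `[YYYY-MM-DD HH:MM]` prefix, and the newline separator are assumptions. The summary only says that entries are timestamped.

```python
from datetime import datetime


def append_entry(journal: str, entry: str, now: datetime) -> str:
    """Return the journal text with a timestamped entry added at the end."""
    # The "[YYYY-MM-DD HH:MM]" prefix format is an assumption.
    stamped = f"[{now:%Y-%m-%d %H:%M}] {entry}"
    return f"{journal}\n{stamped}" if journal else stamped
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside, keeps the function deterministic for tests.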


modal_readonly/tony_assistant_spec.json

modal_readonly/tony_assistant_spec.json: +178, -22, ~200

TL;DR: Expanded and upgraded the assistant configuration for enhanced functionality, including new tools, updated models, and improved voice capabilities.

  • Added multiple tools to extend assistant functionality:

    • Functions for bus arrivals, journal interactions, internet searches, and blog-related tasks (retrieval, search, random selection).
    • These tools include specific API configurations and placeholder secrets.
  • Updated the voice model to "eleven_flash_v2" for improved voice capabilities.

  • Upgraded the AI model to "gpt-4o-2024-11-20" with additional features:

    • Enabled emotion recognition.
    • Updated the tools definition to include detailed configurations and async capabilities.
  • Enabled the endCallFunction feature, allowing more dynamic call handling.

  • Retained and slightly reformatted existing features like transcription, end-call phrases, and server messaging for consistency.

  • Simplified JSON formatting for readability.


tests/e2e/test_tony_server.py

tests/e2e/test_tony_server.py: +106, -0, ~106

TL;DR: Add end-to-end tests for the Tony Server API endpoints to verify their functionality with real HTTP requests.

  • Add make_request function to handle HTTP requests with error handling and JSON decoding for robust testing.
  • Introduce auth_headers and base_params fixtures for reusable authentication headers and parameter structures.
  • Implement test_journal_read_e2e to validate the /journal-read endpoint response structure and content.
  • Add test_search_e2e to test the /search endpoint using sample tool call parameters and validate the results.
  • Include test_assistant_e2e to verify the /assistant endpoint functionality, particularly the presence and type of the assistant key in the response.
  • Use the icecream library for detailed logging and debugging of requests and responses.

justfile

justfile: +86, -0, ~86

TL;DR: Introduce a justfile for task automation to simplify development, testing, and deployment workflows.

  • Automate common development tasks such as installing dependencies, running tests, and deploying servers.
    • Includes commands for installing the project (install and global-install).
    • Supports running development servers (run-dev-server and run-dev-blog-server) and deploying production servers (deploy and deploy-blog).
  • Streamline testing processes with categorized test commands (test-unit, test-integration, test-e2e, etc.) for quick and organized testing.
  • Add commands to test various API endpoints for functionality verification (test-assistant, test-read, test-append, etc.).
  • Facilitate test coverage reporting with the test-coverage command.
  • Enable bulk deployment of all services with the deploy-all command.

tests/integration/test_tony_server.py

tests/integration/test_tony_server.py: +104, -0, ~104

TL;DR: Add integration tests for tony_server and blog_server functionalities to ensure proper behavior of APIs and utility functions.

  • Validate the /search endpoint of tony_server to ensure correct handling of tool calls and response structure.
  • Test the parse_tool_call function directly to confirm it parses tool call data correctly.
  • Test the make_vapi_response function to verify it generates accurate responses for tool calls.
  • Add a test for the blog_search endpoint of blog_server to ensure it processes search queries and returns expected structured results.
  • Introduce fixtures (auth_headers and base_params) to standardize test setup, improving code reusability and reducing redundancy.

modal_readonly/transit/routes.txt

modal_readonly/transit/routes.txt: +151, -0, ~151

TL;DR: Add a new file containing transit route information for integration or data analysis purposes.

  • Introduce a new dataset for transit routes with essential details, including route IDs, agency information, short and long route names, descriptions, types, URLs, and optional color codes.
    • The dataset appears to focus on transit routes in the Seattle area, featuring King County Metro, Sound Transit, and other local transportation services.
  • Enable easier access and reference for transit route details, likely for use in transit-related applications, modeling, or analysis.
  • Provide a standardized format for consolidating route information, ensuring consistency and ease of use.

tests/unit/test_blog_server.py

tests/unit/test_blog_server.py: +100, -0, ~100

TL;DR: Add unit tests for BlogReader, read_blog_post function, and UrlInfo model to verify functionality and edge cases.

  • Ensure read_blog_post correctly fetches and returns content from a markdown file.
  • Verify BlogReader.get_url_info retrieves and maps URL information into UrlInfo objects.
  • Test initialization, default values, and optional fields for the UrlInfo model.
  • Validate BlogReader.url_to_markdown_path correctly converts URLs to associated markdown paths, including handling edge cases like non-existent or empty URLs.
  • Add a test to confirm read_blog_post works with full markdown paths.

journal_cleanup.convo.md

journal_cleanup.convo.md: +36, -0, ~36

TL;DR: Add a new markdown file outlining a conversational requirement for formatting journal entries consistently.

  • Introduce a structured conversation format for enhancing journal processing logic.
    • Standardize TODO items to [TODO] format.
    • Normalize gratitude-related lines to [GRATEFUL] format.
    • Add date-tracking for completed TODO items with [TODO completed=YYYY-MM-DD] syntax.
    • Format wake-up time entries with [WAKEUP: HH:MM].
    • Consolidate duplicate dates in entries to a single date.
    • Remove "ignore" entries entirely from the journal.
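
Three of the formatting rules above (the [TODO] and [GRATEFUL] tags and dropping "ignore" entries) can be sketched as a small normalizer. The raw input shapes, such as `todo: ...` or `Grateful - ...`, are guesses at what unformatted journal lines look like. The function name `clean_journal` is hypothetical.

```python
import re


def clean_journal(lines: list[str]) -> list[str]:
    """Normalize TODO and gratitude prefixes and drop "ignore" entries."""
    cleaned = []
    for line in lines:
        text = line.strip()
        if text.lower().startswith("ignore"):
            continue  # "ignore" entries are removed entirely
        # Rewrite loose prefixes such as "todo:" or "Grateful -" to tags.
        text = re.sub(r"^todo\s*[:\-]\s*", "[TODO] ", text, flags=re.IGNORECASE)
        text = re.sub(
            r"^(?:grateful|gratitude)\s*[:\-]\s*", "[GRATEFUL] ", text, flags=re.IGNORECASE
        )
        cleaned.append(text)
    return cleaned
```

Lines that already carry a tag start with "[" and so pass through unchanged.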

bus_tools.md

bus_tools.md: +29, -0, ~29

TL;DR: Introduce a new file to provide tools and instructions for adding bus support for Tony, currently limited to Seattle Metro data.

  • Enable real-time bus support by linking to King County Metro's developer resources.
  • Provide example data for the route of interest (Route 48) from the routes.txt file for context.
  • Include a command to filter trip updates for Route 48 using jq, simplifying data extraction.
  • Add instructions to retrieve and deduplicate stop IDs for Route 48, aiding in route-specific stop identification.
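
bus_tools.md does this filtering and deduplication with jq. A Python equivalent is sketched below for comparison. It assumes the trip updates are available as JSON with GTFS-realtime style field names (`entity`, `trip_update`, `trip`, `route_id`, `stop_time_update`, `stop_id`). The route identifier in the usage example is a placeholder, not the real ID for Route 48.

```python
def stop_ids_for_route(feed: dict, route_id: str) -> list[str]:
    """Collect unique stop IDs for one route, keeping first-seen order."""
    seen: dict[str, None] = {}
    for entity in feed.get("entity", []):
        update = entity.get("trip_update", {})
        if update.get("trip", {}).get("route_id") != route_id:
            continue
        for stop_time in update.get("stop_time_update", []):
            # A dict preserves insertion order, so it doubles as an ordered set.
            seen.setdefault(stop_time["stop_id"], None)
    return list(seen)
```

For example, `stop_ids_for_route(feed, "route-48-placeholder")` would return each stop ID once, even when several trips on the route serve the same stop.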

.pre-commit-config.yaml

.pre-commit-config.yaml: +20, -0, ~20

TL;DR: Add pre-commit configuration file to enforce code quality and formatting standards automatically.

  • Introduce Ruff linter and formatter for Python, Pyi, and Jupyter files to ensure consistent linting and formatting.
    • Include automatic fixing capability with --fix argument for Ruff linter.
  • Add Dasel validation hook for structured data validation.
  • Add Prettier for consistent formatting of various file types.

README.md

README.md: +42, -3, ~42

TL;DR: Add a sequence diagram illustrating key interactions within the Tony assistant system.

  • Improve documentation by visualizing system interactions with a Mermaid sequence diagram.
    • Show user (Igor) interactions with the VAPI client and backend services.
    • Highlight key workflows: initialization, weather inquiries, and journal gratitude recording.
    • Clarify dependencies between components such as Azure Cosmos DB, Perplexity AI, and Modal storage.

.gitignore

.gitignore: +4, -0, ~4

TL;DR: Add a .gitignore file to exclude unnecessary files and directories from version control.

  • Prevent tracking of temporary files created by the aider tool by ignoring .aider*.
  • Exclude Python bytecode cache files by ignoring __pycache__/.
  • Avoid tracking metadata files for Python packages by ignoring *.egg-info/.
  • Prevent sensitive environment configuration files from being tracked by ignoring .env.

modal_readonly/transit/trips.txt

modal_readonly/transit/trips.txt: +5,064,506, -0, ~5,064,506

TL;DR: Added a new massive data file containing transit trip information.

  • Introduce a new dataset for transit trips to support transit-related functionality or analyses.
    • Likely includes detailed trip schedules or data for transit systems.
    • May be used for analytics, modeling, planning, or real-time transit applications.

modal_readonly/transit/stops.txt

modal_readonly/transit/stops.txt: +617.0KB, -0, ~large file

TL;DR: Added a new file containing transit stops data likely for use in a transportation or mapping feature.

  • Provide a comprehensive dataset of transit stops for potential integration into a system or application.
  • Likely intended to support functionalities such as route planning, location services, or transit analytics.

modal_readonly/transit/48_stops.txt

modal_readonly/transit/48_stops.txt: +57, -0, ~0

TL;DR: Add a new file to list transit stop identifiers for route 48.

  • Introduce a static list of stop identifiers for transit route 48.
    • Likely used for lookup or reference purposes in the transit system.
    • Supports the functionality of identifying or processing stops associated with route 48.

tests/integration/__init__.py

tests/integration/__init__.py: +1, -0, ~1

TL;DR: Add an __init__.py file to define the integration directory as a Python package for integration tests.

  • Ensure the integration directory is recognized as a Python package by adding an __init__.py file.
  • The file contains a docstring to indicate its purpose as the integration tests package.

tests/e2e/__init__.py

tests/e2e/__init__.py: +1, -0, ~1

TL;DR: Add an __init__.py file to define the tests/e2e directory as a Python package.

  • Enable the tests/e2e directory to be recognized as a Python package for organizing end-to-end tests. This is necessary for importing and running tests within the package.

tests/unit/__init__.py

tests/unit/__init__.py: +1, -0, ~1

TL;DR: Create an initialization file for the tests/unit package, making it a valid Python package.

  • Add an __init__.py file to mark the tests/unit directory as a Python package. This is necessary for proper test discovery and execution.