@simonw

simonw/audio.md Secret

Created October 28, 2024 03:32

GitHub Gist Audio Extractor

human (Oct 27, 2024, 08:27 PM)

Build an artifact - no React, mobile friendly (16px input font sizes) with an input box and button which accepts a URL like this one:

https://gist.github.com/simonw/67c2a85d5731774a0de2d9ee474b47f0

Then it pulls out the gist ID and constructs this URL:

https://api.github.com/gists/67c2a85d5731774a0de2d9ee474b47f0

It uses fetch() to get the JSON for that which looks like this:

{
  "files": {
    "output-base64.json": {

      "content": "{\n  \"id\": \"chatcmpl..."
    }
  }
}

It fetches the 'content' of the first file in that files object and treats that as JSON, decoding it

It will decode to something like this:

{
  "id": "chatcmpl-ANAb3BE24KdnlhwcMjsBqBDYCjGJx",
  "object": "chat.completion",
  "created": 1730085641,
  "model": "gpt-4o-audio-preview-2024-10-01",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "refusal": null,
        "audio": {
          "id": "audio_671f030b191c8190840a4055ea7449d2",
          "data": "UklGRgYBAwBXQVZFZ...",
          "transcript": "I love pelicans! They're so majestic and ..."

It gets the data and the transcript keys from that audio

The data is base64 encoded .wav data. It decodes that and puts it in a <audio ...> player interface that can be played

It shows the transcript below that

It also includes a "Download" button which downloads that .wav data

assistant (Oct 27, 2024, 08:27 PM)

I'll help create an HTML artifact that implements this audio player interface.

create Gist Audio Player

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Gist Audio Player</title>
    <style>
        body {
            font-family: system-ui, -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
            max-width: 800px;
            margin: 20px auto;
            padding: 0 20px;
            line-height: 1.6;
        }
        .input-group {
            display: flex;
            gap: 8px;
            margin-bottom: 20px;
        }
        input {
            flex: 1;
            padding: 8px 12px;
            font-size: 16px;
            border: 1px solid #ccc;
            border-radius: 4px;
        }
        button {
            padding: 8px 16px;
            font-size: 16px;
            background: #0066cc;
            color: white;
            border: none;
            border-radius: 4px;
            cursor: pointer;
        }
        button:disabled {
            background: #cccccc;
        }
        button:hover:not(:disabled) {
            background: #0055aa;
        }
        .error {
            color: #cc0000;
            margin: 10px 0;
        }
        .player-container {
            margin: 20px 0;
        }
        audio {
            width: 100%;
            margin: 10px 0;
        }
        .transcript {
            background: #f5f5f5;
            padding: 15px;
            border-radius: 4px;
            margin: 10px 0;
        }
        .loading {
            color: #666;
            font-style: italic;
        }
    </style>
</head>
<body>
    <div class="input-group">
        <input type="url" id="gistUrl" placeholder="Enter Gist URL" aria-label="Gist URL">
        <button id="fetchBtn">Fetch</button>
    </div>
    <div id="error" class="error" style="display: none;"></div>
    <div id="playerContainer" class="player-container" style="display: none;">
        <audio id="audioPlayer" controls></audio>
        <button id="downloadBtn">Download Audio</button>
        <div id="transcript" class="transcript"></div>
    </div>

    <script>
        const gistInput = document.getElementById('gistUrl');
        const fetchBtn = document.getElementById('fetchBtn');
        const errorDiv = document.getElementById('error');
        const playerContainer = document.getElementById('playerContainer');
        const audioPlayer = document.getElementById('audioPlayer');
        const downloadBtn = document.getElementById('downloadBtn');
        const transcriptDiv = document.getElementById('transcript');

        function showError(message) {
            errorDiv.textContent = message;
            errorDiv.style.display = 'block';
            playerContainer.style.display = 'none';
        }

        function clearError() {
            errorDiv.style.display = 'none';
        }

        async function processGistUrl(url) {
            try {
                // Extract gist ID from URL
                const gistId = url.split('/').pop();
                if (!gistId) throw new Error('Invalid Gist URL');

                // Fetch gist data
                const gistResponse = await fetch(`https://api.github.com/gists/${gistId}`);
                if (!gistResponse.ok) throw new Error('Failed to fetch Gist');
                
                const gistData = await gistResponse.json();
                const files = gistData.files;
                if (!files || Object.keys(files).length === 0) {
                    throw new Error('No files found in Gist');
                }

                // Get content from first file
                const firstFile = Object.values(files)[0];
                const content = JSON.parse(firstFile.content);

                // Extract audio data and transcript
                const audioData = content.choices[0].message.audio.data;
                const transcript = content.choices[0].message.audio.transcript;

                // Create audio blob and URL
                const binaryData = atob(audioData);
                const arrayBuffer = new ArrayBuffer(binaryData.length);
                const uint8Array = new Uint8Array(arrayBuffer);
                for (let i = 0; i < binaryData.length; i++) {
                    uint8Array[i] = binaryData.charCodeAt(i);
                }
                const blob = new Blob([uint8Array], { type: 'audio/wav' });
                const audioUrl = URL.createObjectURL(blob);

                // Update UI
                audioPlayer.src = audioUrl;
                transcriptDiv.textContent = transcript;
                playerContainer.style.display = 'block';
                clearError();

                // Set up download button
                downloadBtn.onclick = () => {
                    const a = document.createElement('a');
                    a.href = audioUrl;
                    a.download = 'audio.wav';
                    document.body.appendChild(a);
                    a.click();
                    document.body.removeChild(a);
                };
            } catch (error) {
                showError(error.message || 'An error occurred');
            }
        }

        fetchBtn.addEventListener('click', () => {
            const url = gistInput.value.trim();
            if (!url) {
                showError('Please enter a Gist URL');
                return;
            }
            processGistUrl(url);
        });

        gistInput.addEventListener('keypress', (e) => {
            if (e.key === 'Enter') {
                fetchBtn.click();
            }
        });
    </script>
</body>
</html>

I've created a mobile-friendly HTML artifact that implements all the requested functionality. Here's what it includes:

  1. A responsive input field (16px font size) and fetch button
  2. Error handling and display
  3. Audio player with controls
  4. Download button for the WAV file
  5. Transcript display below the player
  6. Mobile-friendly styling with proper spacing and touch targets
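
The WAV data behind the player and the download button arrives base64-encoded, so it may help to see that decoding step in isolation. The following is a minimal sketch, not part of the artifact: `base64ToBytes` mirrors the `atob` loop used above, while `looksLikeWav` is a hypothetical helper that checks for the `RIFF`/`WAVE` marker bytes. The input is just the first 16 characters of the example `data` value from the prompt.

```javascript
// Decode a base64 string into raw bytes (same approach as the atob loop in the artifact).
function base64ToBytes(base64) {
    const binary = atob(base64);
    const bytes = new Uint8Array(binary.length);
    for (let i = 0; i < binary.length; i++) {
        bytes[i] = binary.charCodeAt(i);
    }
    return bytes;
}

// Hypothetical helper: a WAV file starts with "RIFF" at offset 0 and "WAVE" at offset 8.
function looksLikeWav(bytes) {
    if (bytes.length < 12) return false;
    const tag = (start) => String.fromCharCode(...bytes.subarray(start, start + 4));
    return tag(0) === 'RIFF' && tag(8) === 'WAVE';
}

// First 16 base64 characters of the example audio data: 12 header bytes.
const header = base64ToBytes('UklGRgYBAwBXQVZF');
console.log(header.length, looksLikeWav(header)); // 12 true
```

A check like this could run before the Blob is created, so that a gist holding some other base64 payload produces an error message instead of a silent player.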

Key features:

  • Input accepts Gist URLs and extracts the ID
  • Fetches and parses the Gist JSON
  • Decodes base64 audio data into a playable format
  • Shows transcript from the JSON
  • Allows downloading the audio file
  • Works on both desktop and mobile devices
  • Handles errors gracefully
  • Supports hitting Enter to submit
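
The "fetches and parses the Gist JSON" step involves two levels of JSON: the API response itself, and the `content` string of the first file, which has to be parsed again. Here is a small sketch of that step as a standalone function. The name `extractAudio` and the sample object are made up for illustration; the field names follow the response shape shown in the prompt.

```javascript
// Pull the audio payload out of an already-parsed gist API response.
// Assumes the first file's "content" holds the JSON text of a chat completion.
function extractAudio(gistData) {
    const files = Object.values(gistData?.files ?? {});
    if (files.length === 0) throw new Error('No files found in Gist');

    // The file content is itself a JSON string, so it needs a second parse.
    const completion = JSON.parse(files[0].content);
    const audio = completion?.choices?.[0]?.message?.audio;
    if (!audio?.data) throw new Error('No audio data in completion');

    return { data: audio.data, transcript: audio.transcript ?? '' };
}

// Made-up sample shaped like the response described in the prompt.
const sample = {
    files: {
        'output-base64.json': {
            content: JSON.stringify({
                choices: [
                    { message: { audio: { data: 'UklGRgYBAwBXQVZF', transcript: 'I love pelicans!' } } }
                ]
            })
        }
    }
};

console.log(extractAudio(sample).transcript); // I love pelicans!
```

Keeping this logic in a pure function, separate from `fetch()` and the DOM updates, makes it possible to test against saved responses.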

You can test it by pasting a Gist URL like the example you provided. Would you like me to explain any part of the implementation in more detail?

human (Oct 27, 2024, 08:30 PM)

Make it so that when the form is submitted the URL updates to have ?GISTID in it (using the HTML history API) - and so that if ?GISTID is present when the page is loaded, it fills in the form and fetches the data automatically

Add a note that says that the gists should have the JSON response from the OpenAI gpt-4o-audio-preview model

assistant (Oct 27, 2024, 08:30 PM)

I'll update the artifact to handle URL parameters and add a note about the expected format.

rewrite Gist Audio Player

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Gist Audio Player</title>
    <style>
        body {
            font-family: system-ui, -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
            max-width: 800px;
            margin: 20px auto;
            padding: 0 20px;
            line-height: 1.6;
        }
        .info {
            background: #e8f4ff;
            padding: 15px;
            border-radius: 4px;
            margin: 20px 0;
            border-left: 4px solid #0066cc;
        }
        .input-group {
            display: flex;
            gap: 8px;
            margin-bottom: 20px;
        }
        input {
            flex: 1;
            padding: 8px 12px;
            font-size: 16px;
            border: 1px solid #ccc;
            border-radius: 4px;
        }
        button {
            padding: 8px 16px;
            font-size: 16px;
            background: #0066cc;
            color: white;
            border: none;
            border-radius: 4px;
            cursor: pointer;
        }
        button:disabled {
            background: #cccccc;
        }
        button:hover:not(:disabled) {
            background: #0055aa;
        }
        .error {
            color: #cc0000;
            margin: 10px 0;
        }
        .player-container {
            margin: 20px 0;
        }
        audio {
            width: 100%;
            margin: 10px 0;
        }
        .transcript {
            background: #f5f5f5;
            padding: 15px;
            border-radius: 4px;
            margin: 10px 0;
        }
        .loading {
            color: #666;
            font-style: italic;
        }
    </style>
</head>
<body>
    <div class="info">
        Note: This player expects GitHub Gists containing JSON responses from the OpenAI GPT-4o audio preview model 
        (<code>gpt-4o-audio-preview</code>). The JSON should include an audio response with base64-encoded WAV data.
    </div>

    <div class="input-group">
        <input type="url" id="gistUrl" placeholder="Enter Gist URL" aria-label="Gist URL">
        <button id="fetchBtn">Fetch</button>
    </div>
    <div id="error" class="error" style="display: none;"></div>
    <div id="playerContainer" class="player-container" style="display: none;">
        <audio id="audioPlayer" controls></audio>
        <button id="downloadBtn">Download Audio</button>
        <div id="transcript" class="transcript"></div>
    </div>

    <script>
        const gistInput = document.getElementById('gistUrl');
        const fetchBtn = document.getElementById('fetchBtn');
        const errorDiv = document.getElementById('error');
        const playerContainer = document.getElementById('playerContainer');
        const audioPlayer = document.getElementById('audioPlayer');
        const downloadBtn = document.getElementById('downloadBtn');
        const transcriptDiv = document.getElementById('transcript');

        function showError(message) {
            errorDiv.textContent = message;
            errorDiv.style.display = 'block';
            playerContainer.style.display = 'none';
        }

        function clearError() {
            errorDiv.style.display = 'none';
        }

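        function extractGistId(url) {
            if (!url) return null;
            // Handle both full URLs and just IDs; ignore any query string,
            // fragment, or trailing slash so the last path segment is the ID
            const path = url.split(/[?#]/)[0].replace(/\/+$/, '');
            const parts = path.split('/');
            return parts[parts.length - 1] || null;
        }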

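        function updateURL(gistId) {
            const newUrl = new URL(window.location);
            // Skip when the URL already points at this gist (page load, back/forward)
            // so navigating history does not push duplicate entries
            if (newUrl.searchParams.get('gist') === gistId) return;
            newUrl.searchParams.set('gist', gistId);
            history.pushState({}, '', newUrl);
        }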

        async function processGistUrl(url) {
            try {
                fetchBtn.disabled = true;
                const gistId = extractGistId(url);
                if (!gistId) throw new Error('Invalid Gist URL');

                // Update URL with gist ID
                updateURL(gistId);

                // Update input field if it doesn't match
                if (gistInput.value !== url) {
                    gistInput.value = url;
                }

                // Fetch gist data
                const gistResponse = await fetch(`https://api.github.com/gists/${gistId}`);
                if (!gistResponse.ok) throw new Error('Failed to fetch Gist');
                
                const gistData = await gistResponse.json();
                const files = gistData.files;
                if (!files || Object.keys(files).length === 0) {
                    throw new Error('No files found in Gist');
                }

                // Get content from first file
                const firstFile = Object.values(files)[0];
                const content = JSON.parse(firstFile.content);

                // Verify this is an audio response
                if (!content?.choices?.[0]?.message?.audio?.data) {
                    throw new Error('This Gist does not contain a valid GPT-4o audio response');
                }

                // Extract audio data and transcript
                const audioData = content.choices[0].message.audio.data;
                const transcript = content.choices[0].message.audio.transcript;

                // Create audio blob and URL
                const binaryData = atob(audioData);
                const arrayBuffer = new ArrayBuffer(binaryData.length);
                const uint8Array = new Uint8Array(arrayBuffer);
                for (let i = 0; i < binaryData.length; i++) {
                    uint8Array[i] = binaryData.charCodeAt(i);
                }
                const blob = new Blob([uint8Array], { type: 'audio/wav' });
                const audioUrl = URL.createObjectURL(blob);

                // Update UI
                audioPlayer.src = audioUrl;
                transcriptDiv.textContent = transcript;
                playerContainer.style.display = 'block';
                clearError();

                // Set up download button
                downloadBtn.onclick = () => {
                    const a = document.createElement('a');
                    a.href = audioUrl;
                    a.download = 'audio.wav';
                    document.body.appendChild(a);
                    a.click();
                    document.body.removeChild(a);
                };
            } catch (error) {
                showError(error.message || 'An error occurred');
            } finally {
                fetchBtn.disabled = false;
            }
        }

        // Handle form submission
        fetchBtn.addEventListener('click', () => {
            const url = gistInput.value.trim();
            if (!url) {
                showError('Please enter a Gist URL');
                return;
            }
            processGistUrl(url);
        });

        gistInput.addEventListener('keypress', (e) => {
            if (e.key === 'Enter') {
                fetchBtn.click();
            }
        });

        // Check for gist ID in URL on page load
        window.addEventListener('load', () => {
            const params = new URLSearchParams(window.location.search);
            const gistId = params.get('gist');
            if (gistId) {
                // If it's just an ID, construct the full URL
                const fullUrl = `https://gist.github.com/${gistId}`;
                gistInput.value = fullUrl;
                processGistUrl(fullUrl);
            }
        });

        // Handle browser back/forward
        window.addEventListener('popstate', () => {
            const params = new URLSearchParams(window.location.search);
            const gistId = params.get('gist');
            if (gistId) {
                const fullUrl = `https://gist.github.com/${gistId}`;
                processGistUrl(fullUrl);
            } else {
                // Clear everything if there's no gist ID
                gistInput.value = '';
                playerContainer.style.display = 'none';
                clearError();
            }
        });
    </script>
</body>
</html>

I've made several updates to the artifact:

  1. Added URL handling:

    • When a gist is fetched, the URL updates to include ?gist=GISTID
    • On page load, it checks for a gist ID in the URL and automatically fetches it
    • Browser back/forward navigation now works properly
  2. Added an info box at the top explaining the expected format of the gists

  3. Improved error handling:

    • Validates that the JSON contains choices[0].message.audio.data before decoding
    • Shows a specific error message when the Gist is not an audio response
  4. Added a loading state:

    • The Fetch button is disabled while a request is in flight, which prevents double-submissions
  5. Improved URL handling:

    • Can handle both full Gist URLs and just IDs
    • Maintains URL state when navigating

You can now:

  1. Share URLs with gist IDs (e.g., ?gist=67c2a85d5731774a0de2d9ee474b47f0)
  2. Use browser back/forward to navigate between different gists
  3. See the expected format more clearly
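
To illustrate the shareable URLs from point 1, here is a rough sketch of writing and reading the `?gist=` parameter with the standard `URL` API. The helper names `buildShareUrl` and `readGistId` are hypothetical, and `https://example.com/player.html` is only a placeholder for wherever the page is hosted.

```javascript
// Build a shareable page URL that carries the gist ID in the ?gist= parameter.
function buildShareUrl(pageUrl, gistId) {
    const url = new URL(pageUrl);
    url.searchParams.set('gist', gistId);
    return url.toString();
}

// Read the gist ID back out; returns null when the parameter is absent.
function readGistId(pageUrl) {
    return new URL(pageUrl).searchParams.get('gist');
}

// Placeholder page address, used here only for illustration.
const shared = buildShareUrl(
    'https://example.com/player.html',
    '67c2a85d5731774a0de2d9ee474b47f0'
);
console.log(shared);
console.log(readGistId(shared));
```

In the page itself the same two operations run against `window.location`, with `history.pushState` recording the new address.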

Would you like me to explain any part of the implementation in more detail?
