@simonw
Created October 15, 2024 01:14

Fetch Content in Multiple Formats

human (Oct 14, 2024, 06:09 PM)

Pasted https://github.com/simonw/tools/blob/206d998e614bddd154e77f33260f37048d72228f/jina-reader.html

Add a select box next to the URL box, before the submit button, with the following options:

  • Markdown
  • HTML
  • Text
  • LLM Markdown

The default should be the first one. This will change how the data is fetched. For Markdown:

fetch('https://r.jina.ai/https://example.com', {
  method: 'GET',
  headers: {
    "X-Return-Format": "markdown"
  },
})

For HTML the format is "html"; for Text it's "text".

For LLM Markdown the format header isn't sent at all; it just does a normal GET fetch.

paste.txt

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Jina Reader</title>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/marked/4.0.2/marked.min.js"></script>
    <style>
        body {
            font-family: Helvetica, sans-serif;
            max-width: 800px;
            margin: 0 auto;
            padding: 20px;
            box-sizing: border-box;
        }
        h1 {
            font-size: 24px;
            margin-bottom: 20px;
        }
        #url-input, #submit-btn, #markdown-raw, #copy-btn {
            font-size: 16px;
            padding: 10px;
            margin-bottom: 10px;
            box-sizing: border-box;
        }
        #url-input {
            width: 70%;
        }
        #submit-btn, #copy-btn {
            background-color: #4CAF50;
            color: white;
            border: none;
            cursor: pointer;
        }
        #submit-btn:hover, #copy-btn:hover {
            background-color: #45a049;
        }
        #markdown-raw {
            width: 100%;
            height: 200px;
            margin-top: 20px;
            resize: vertical;
        }
        #markdown-rendered {
            margin-top: 20px;
            border: 1px solid #ccc;
            padding: 10px;
            overflow-wrap: break-word;
        }
        #loading {
            display: none;
            margin-top: 20px;
        }
        #result {
            display: none;
        }
        iframe {
            width: 100%;
            height: 500px;
            box-sizing: border-box;
            border: 1px solid #ccc;
        }
        @media (min-width: 768px) {
            #url-input {
                width: 70%;
                display: inline-block;
            }
            #submit-btn {
                display: inline-block;
                margin-left: 2%;
            }
            #copy-btn {
                width: auto;
            }
        }
    </style>
</head>
<body>
    <h1>Jina Reader</h1>
    <p>An interface for the <a href="https://jina.ai/reader/">Jina Reader API</a>.</p>
    <form id="url-form">
        <input type="text" id="url-input" placeholder="Enter URL" required>
        <button type="submit" id="submit-btn">Submit</button>
    </form>
    <div id="loading">Loading...</div>
    <div id="result">
        <textarea id="markdown-raw" readonly></textarea>
        <button id="copy-btn">Copy to clipboard</button>
        <iframe id="markdown-rendered" sandbox></iframe>
    </div>

    <script>
        const urlForm = document.getElementById('url-form');
        const urlInput = document.getElementById('url-input');
        const loading = document.getElementById('loading');
        const result = document.getElementById('result');
        const markdownRaw = document.getElementById('markdown-raw');
        const copyBtn = document.getElementById('copy-btn');
        const markdownRendered = document.getElementById('markdown-rendered');

        urlForm.addEventListener('submit', async (e) => {
            e.preventDefault();
            const url = urlInput.value;
            if (!url) return;

            loading.style.display = 'block';
            result.style.display = 'none';

            try {
                const response = await fetch(`https://r.jina.ai/${url}`);
                const markdown = await response.text();
                markdownRaw.value = markdown;
                const htmlContent = `
                    <html>
                        <head>
                            <style>
                                body { font-family: Helvetica, sans-serif; line-height: 1.6; color: #333; }
                                img { max-width: 100%; height: auto; }
                            </style>
                        </head>
                        <body>
                            ${marked.parse(markdown)}
                        </body>
                    </html>
                `;
                markdownRendered.srcdoc = htmlContent;
                result.style.display = 'block';
            } catch (error) {
                console.error('Error fetching markdown:', error);
                markdownRaw.value = 'Error fetching markdown. Please try again.';
                markdownRendered.srcdoc = '<p>Error fetching markdown. Please try again.</p>';
                result.style.display = 'block';
            } finally {
                loading.style.display = 'none';
            }
        });

        copyBtn.addEventListener('click', () => {
            markdownRaw.select();
            document.execCommand('copy');
            
            const originalText = copyBtn.textContent;
            copyBtn.textContent = 'Copied';
            
            setTimeout(() => {
                copyBtn.textContent = originalText;
            }, 1500);
        });
    </script>
</body>
</html>


assistant (Oct 14, 2024, 06:09 PM)

Certainly! I'll modify the HTML to add a select box with the options you specified. I'll also update the JavaScript to change the fetch request based on the selected option. Here's the updated code:

Updated Jina Reader HTML

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Jina Reader</title>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/marked/4.0.2/marked.min.js"></script>
    <style>
        body {
            font-family: Helvetica, sans-serif;
            max-width: 800px;
            margin: 0 auto;
            padding: 20px;
            box-sizing: border-box;
        }
        h1 {
            font-size: 24px;
            margin-bottom: 20px;
        }
        #url-input, #format-select, #submit-btn, #markdown-raw, #copy-btn {
            font-size: 16px;
            padding: 10px;
            margin-bottom: 10px;
            box-sizing: border-box;
        }
        #url-input {
            width: 50%;
        }
        #format-select {
            width: 20%;
        }
        #submit-btn, #copy-btn {
            background-color: #4CAF50;
            color: white;
            border: none;
            cursor: pointer;
        }
        #submit-btn:hover, #copy-btn:hover {
            background-color: #45a049;
        }
        #markdown-raw {
            width: 100%;
            height: 200px;
            margin-top: 20px;
            resize: vertical;
        }
        #markdown-rendered {
            margin-top: 20px;
            border: 1px solid #ccc;
            padding: 10px;
            overflow-wrap: break-word;
        }
        #loading {
            display: none;
            margin-top: 20px;
        }
        #result {
            display: none;
        }
        iframe {
            width: 100%;
            height: 500px;
            box-sizing: border-box;
            border: 1px solid #ccc;
        }
        @media (min-width: 768px) {
            #url-input {
                width: 50%;
                display: inline-block;
            }
            #format-select {
                width: 20%;
                display: inline-block;
                margin-left: 2%;
            }
            #submit-btn {
                display: inline-block;
                margin-left: 2%;
            }
            #copy-btn {
                width: auto;
            }
        }
    </style>
</head>
<body>
    <h1>Jina Reader</h1>
    <p>An interface for the <a href="https://jina.ai/reader/">Jina Reader API</a>.</p>
    <form id="url-form">
        <input type="text" id="url-input" placeholder="Enter URL" required>
        <select id="format-select">
            <option value="markdown">Markdown</option>
            <option value="html">HTML</option>
            <option value="text">Text</option>
            <option value="llm_markdown">LLM Markdown</option>
        </select>
        <button type="submit" id="submit-btn">Submit</button>
    </form>
    <div id="loading">Loading...</div>
    <div id="result">
        <textarea id="markdown-raw" readonly></textarea>
        <button id="copy-btn">Copy to clipboard</button>
        <iframe id="markdown-rendered" sandbox></iframe>
    </div>

    <script>
        const urlForm = document.getElementById('url-form');
        const urlInput = document.getElementById('url-input');
        const formatSelect = document.getElementById('format-select');
        const loading = document.getElementById('loading');
        const result = document.getElementById('result');
        const markdownRaw = document.getElementById('markdown-raw');
        const copyBtn = document.getElementById('copy-btn');
        const markdownRendered = document.getElementById('markdown-rendered');

        urlForm.addEventListener('submit', async (e) => {
            e.preventDefault();
            const url = urlInput.value;
            const format = formatSelect.value;
            if (!url) return;

            loading.style.display = 'block';
            result.style.display = 'none';

            try {
                let fetchOptions = {
                    method: 'GET'
                };

                if (format !== 'llm_markdown') {
                    fetchOptions.headers = {
                        "X-Return-Format": format
                    };
                }

                const response = await fetch(`https://r.jina.ai/${url}`, fetchOptions);
                const content = await response.text();
                markdownRaw.value = content;

                let htmlContent;
                if (format === 'html') {
                    htmlContent = content;
                } else {
                    htmlContent = `
                        <html>
                            <head>
                                <style>
                                    body { font-family: Helvetica, sans-serif; line-height: 1.6; color: #333; }
                                    img { max-width: 100%; height: auto; }
                                </style>
                            </head>
                            <body>
                                ${format === 'text' ? `<pre>${content}</pre>` : marked.parse(content)}
                            </body>
                        </html>
                    `;
                }
                markdownRendered.srcdoc = htmlContent;
                result.style.display = 'block';
            } catch (error) {
                console.error('Error fetching content:', error);
                markdownRaw.value = 'Error fetching content. Please try again.';
                markdownRendered.srcdoc = '<p>Error fetching content. Please try again.</p>';
                result.style.display = 'block';
            } finally {
                loading.style.display = 'none';
            }
        });

        copyBtn.addEventListener('click', () => {
            markdownRaw.select();
            document.execCommand('copy');
            
            const originalText = copyBtn.textContent;
            copyBtn.textContent = 'Copied';
            
            setTimeout(() => {
                copyBtn.textContent = originalText;
            }, 1500);
        });
    </script>
</body>
</html>

I've made the following changes to the HTML file:

  1. Added a select box with the options you specified:

    <select id="format-select">
        <option value="markdown">Markdown</option>
        <option value="html">HTML</option>
        <option value="text">Text</option>
        <option value="llm_markdown">LLM Markdown</option>
    </select>
  2. Updated the CSS to accommodate the new select box and adjust the layout:

    #url-input {
        width: 50%;
    }
    #format-select {
        width: 20%;
    }
  3. Modified the JavaScript to handle the different format options:

    const format = formatSelect.value;
    
    let fetchOptions = {
        method: 'GET'
    };
    
    if (format !== 'llm_markdown') {
        fetchOptions.headers = {
            "X-Return-Format": format
        };
    }
    
    const response = await fetch(`https://r.jina.ai/${url}`, fetchOptions);
  4. Updated the content rendering logic to handle different formats:

    let htmlContent;
    if (format === 'html') {
        htmlContent = content;
    } else {
        htmlContent = `
            <html>
                <head>
                    <style>
                        body { font-family: Helvetica, sans-serif; line-height: 1.6; color: #333; }
                        img { max-width: 100%; height: auto; }
                    </style>
                </head>
                <body>
                    ${format === 'text' ? `<pre>${content}</pre>` : marked.parse(content)}
                </body>
            </html>
        `;
    }
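One caveat with the `text` branch in step 4: the response is interpolated into `<pre>` without escaping, so any `<` or `&` in the fetched text would be parsed as markup inside the iframe rather than shown literally. A minimal sketch of one way to handle that, assuming hypothetical helpers named `escapeHtml` and `renderTextAsPre` (neither is part of the code above):

```javascript
// Hypothetical helper (not in the original page): escape the characters
// that HTML treats specially so plain text displays literally inside <pre>.
function escapeHtml(text) {
    return text
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Hypothetical wrapper showing how the text branch could use the helper.
function renderTextAsPre(content) {
    return `<pre>${escapeHtml(content)}</pre>`;
}
```

With this in place, the `format === 'text'` case could call `renderTextAsPre(content)` instead of building the `<pre>` string inline.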

These changes allow users to select the desired format for the content retrieval and display the results accordingly. The default option is set to "Markdown" as requested.
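The mapping from the selected format to the request options could also be pulled out into a small pure function, which makes it easier to check in isolation. This is only a sketch: `buildFetchOptions` and `FORMAT_HEADER_VALUES` are hypothetical names, and the header values are the ones given in the request above.

```javascript
// Hypothetical lookup table: select box value -> X-Return-Format value.
// A null entry means the header is not sent at all (the "LLM Markdown" case).
const FORMAT_HEADER_VALUES = {
    markdown: 'markdown',
    html: 'html',
    text: 'text',
    llm_markdown: null
};

// Hypothetical helper: build the options object passed to fetch().
function buildFetchOptions(format) {
    if (!Object.prototype.hasOwnProperty.call(FORMAT_HEADER_VALUES, format)) {
        throw new Error(`Unknown format: ${format}`);
    }
    const options = { method: 'GET' };
    const headerValue = FORMAT_HEADER_VALUES[format];
    if (headerValue !== null) {
        options.headers = { 'X-Return-Format': headerValue };
    }
    return options;
}
```

In the submit handler, the result of `buildFetchOptions(format)` would take the place of the inline `fetchOptions` object. Rejecting unknown values is a design choice of this sketch, not something the original code does.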
