The Universal Server

LLM Wiki Generator

Overview

The LLM Wiki Generator is a dynamic web server built with Python's standard library and a local language model (LLM) served by Ollama. It generates Wikipedia-like web pages on the fly: each URL path is treated as an article title, and the llama3:8b model writes the page content, so every visit to the server produces fresh, AI-generated content.
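
As a rough sketch of that mapping (the real logic lives in server.py further down; the helper name here is illustrative only):

```python
# Illustrative only: how a request path turns into the prompt sent to the model.
# server.py below builds this prompt inline inside generate_webpage_from_request_path.
def prompt_for_path(request_path: str) -> str:
    return f'Generate HTML content for article "{request_path}"'

print(prompt_for_path('/Python'))
# -> Generate HTML content for article "/Python"
```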

Features

  • Dynamic Content Generation: Generates HTML content dynamically based on the request path.
  • AI-Powered: Utilizes the llama3:8b language model from Ollama for content creation.
  • Real-time Debugging: Includes a debug mode to stream AI responses directly to the console.
  • Simple and Portable: Built using Python's http.server for easy setup and portability.

Installation

Prerequisites

  • Python 3.x
  • Ollama installed on your system
  • The ollama Python package

Steps

  1. Clone the Repository

    git clone https://github.com/your-username/llm-wiki-generator.git
    cd llm-wiki-generator
  2. Install Ollama and the Required Model

    Follow the instructions on Ollama's website to install Ollama and the llama3:8b model.

  3. Install the Ollama Python Package

    pip install ollama

    To confirm everything is wired up, see the quick check sketched just below these steps.
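
Before moving on, you can optionally confirm that the Ollama service and the model are reachable from Python. A minimal sketch, assuming `ollama serve` is running and the llama3:8b model has already been pulled:

```python
import ollama

# Quick smoke test: one tiny chat request against the local Ollama service.
reply = ollama.chat(
    model='llama3:8b',
    messages=[{'role': 'user', 'content': 'Reply with the single word: ready'}],
)
print(reply['message']['content'])
```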

Usage

  1. Run the Server

    Start the server by executing the following command:

    python server.py
  2. Access the Web Pages

    Open your browser and navigate to http://localhost:9000/. Try different URLs like /Python, /AI, /help, or /login to see dynamically generated content. A short script for fetching a page without a browser is sketched right after these steps.

  3. Debug Mode

    To see the AI response streamed to the console in real time, leave the DEBUG flag set to True in server.py (it is enabled by default); set it to False to make a single blocking request instead.
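
As mentioned in step 2, you can also fetch a generated page from a script instead of a browser. A minimal sketch using only the standard library, assuming the server is already running on port 9000:

```python
import urllib.request

# Fetch one generated article from the running server.
# Generation can take a while, so allow a generous timeout.
with urllib.request.urlopen('http://localhost:9000/Python', timeout=300) as resp:
    html = resp.read().decode()

print(html[:500])  # preview the beginning of the generated page
```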

Code Overview

  • server.py: The main script that sets up the HTTP server and interacts with the LLM to generate content.
  • extract_html_block_from_chat_response(response): A utility function that extracts the HTML content from the LLM's response (see the small example after this list).
  • generate_webpage_from_request_path(request_path): The core function that sends prompts to the LLM and retrieves the generated HTML content.
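
To illustrate the extraction step, here is a small self-contained example that applies the same regular expression server.py uses to a typical model reply (the sample reply text is made up):

```python
import re

# Models usually wrap the page in a fenced code block; the regex keeps only the HTML inside it.
sample_reply = (
    'Sure, here is the article:\n'
    '```\n'
    '<h1>Python</h1>\n'
    '<p>Python is a programming language.</p>\n'
    '```\n'
    'Let me know if you want any changes.'
)

match = re.search(r'```\n(.*)\n```', sample_reply, re.DOTALL)
print(match.group(1) if match else sample_reply)
# -> <h1>Python</h1>
#    <p>Python is a programming language.</p>
```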

Contributing

Contributions are welcome! Please comment below and suggest changes.

Acknowledgments

  • Ollama for providing the language model.
  • Python's http.server for the simple yet effective HTTP server capabilities.

<!-- template.html: the page shell that server.py fills in with the request path (title) and the generated body -->
<!DOCTYPE html>
<html>
  <head>
    <title>{self.path}</title>
    <script src="https://cdn.tailwindcss.com"></script>
    <style type="text/tailwindcss">
      p {
        @apply text-lg font-sans;
      }
      h1 {
        @apply text-3xl font-bold;
      }
      a {
        @apply text-blue-500;
      }
      a:hover {
        @apply text-blue-700;
      }
      a:visited {
        @apply text-purple-500;
      }
      ul {
        @apply list-disc list-inside;
      }
      li {
        @apply mb-2;
      }
      body {
        @apply bg-gray-100 max-w-7xl mx-auto px-4 py-8;
      }
    </style>
  </head>
  <body>
    {inner_html}
  </body>
</html>

# server.py
# We will use the HTTP server included in Python to serve the generated HTML content,
# in order to make the code more portable and easier to run.
import http.server
import re

# For interacting with a language model, we will use ollama,
# which is simple to use. You need to install
# ollama, pull a model, and then install the Python library.
import ollama

# I am using llama3:8b as the model. Depending on your machine, you might
# want a smaller or larger model. You can find the list of models
# available at https://ollama.com/models. I found that
# llama3:8b works fine for this.
model = 'llama3:8b'

# A DEBUG flag to control whether we want to stream the response
# so we can preview it on the console. This is useful for debugging.
DEBUG = True


# When an LLM response is received, it will likely include
# the HTML content in a code block. This function extracts
# the HTML content from the response. Parsing outputs
# is almost always required when working with LLMs.
def extract_html_block_from_chat_response(response):
    html_block = re.search(r'```\n(.*)\n```', response, re.DOTALL)
    if html_block:
        return html_block.group(1)
    else:
        # Try to find the start of the HTML block, slurp the rest!
        html_block = re.search(r'```\n(.*)', response, re.DOTALL)
        if html_block:
            return html_block.group(1)
        else:
            return response


# We ask the language model to generate HTML content for a given
# request path. Since our example is a Wikipedia-like website,
# the prompt instructs the model to generate HTML content for
# an article with the title of the request path. We also ask it
# to add links so we can browse around :)
def generate_webpage_from_request_path(request_path):
    system_prompt = '''
You generate HTML content based on the request path.
The application is a Wikipedia-like website where the request path is the title of the article.
Remember to add links to other articles in the content.
Do not add script tags or favicon links.
Generate only the body of the page.
'''
    if not DEBUG:
        # The ollama chat API can be called in blocking mode (stream=False).
        # When we interact with a chat model, we typically pass
        # a list of messages that is equivalent to what we see
        # in a chat interface.
        # Each message has a role (system, user, assistant) and content.
        # We can enrich the prompt by adding more messages; in this example,
        # our prompting is simple.
        response = ollama.chat(
            model=model,
            messages=[
                dict(role='system', content=system_prompt),
                dict(role='user', content=f'Generate HTML content for article "{request_path}"')
            ],
            stream=False,
            options=ollama.Options(
                num_predict=1000,
            )
        )
        html = response['message']['content']
    else:
        # Stream mode can provide better UX when the response is large.
        stream = ollama.chat(
            model=model,
            messages=[
                dict(role='system', content=system_prompt),
                dict(role='user', content=f'Generate HTML content for article "{request_path}"')
            ],
            stream=True,
            options=ollama.Options(
                num_predict=1000,
            )
        )
        html = ''
        for chunk in stream:
            html += chunk['message']['content']
            # Show to console in DEBUG mode.
            print(chunk['message']['content'], end='')
        print('(Done)')
    # Don't forget your output parsing! :)
    return extract_html_block_from_chat_response(html)


# Server configuration
HOST = 'localhost'
PORT = 9000


# Server class:
# respond to any GET request with a generated page for the requested path.
class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()
        inner_html = generate_webpage_from_request_path(self.path)
        with open('template.html', 'r') as f:
            html = f.read()
        # Fill in the template placeholders (plain string replacement, not a real f-string).
        html = html.replace('{inner_html}', inner_html)
        html = html.replace('{self.path}', self.path)
        self.wfile.write(html.encode())


# Server initialization
httpd = http.server.HTTPServer((HOST, PORT), Handler)

# Server start
print(f'Starting server at http://{HOST}:{PORT}')
try:
    httpd.serve_forever()
except KeyboardInterrupt:
    print('\nServer stopped')

# Just hop on to http://localhost:9000/ in your browser to see the generated content.
# Try different article URLs, and see how the model generates content for them.
# Also try to hit different routes like /help, /login, etc.
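
If you want to tune generation beyond num_predict, ollama.Options accepts further sampling parameters. A small sketch; the values are examples, not recommendations:

```python
import ollama

# Example only: extra sampling options alongside the num_predict cap used in server.py.
options = ollama.Options(
    num_predict=1000,  # cap on generated tokens, as in server.py
    temperature=0.7,   # lower values give more predictable articles
    num_ctx=4096,      # context window made available to the model
)

response = ollama.chat(
    model='llama3:8b',
    messages=[{'role': 'user', 'content': 'Generate HTML content for article "/Python"'}],
    options=options,
)
print(response['message']['content'][:200])
```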