@flavorjones
Created August 29, 2020 16:16
nokogiri support 2020-08-29
#! /usr/bin/env ruby
require 'nokogiri'
input = <<EOF
<title> The journey </title>
</head>
<body>
<h1> The <index class = "estimate"> Trip </index> </h1>
<p> Our scavenger hunt consists of several stages, starting with the <index class = "treasure"> crossing </index> & mdash; how do we actually get to those islands. The nice thing about an intellectual quest is that we can be in all kinds of places at the same time. If we ever want to revisit a previous episode, all we have to do is click there, and even though we haven't finished a particular stage yet, we can already look ahead to the next. We can also keep in touch with fellow travelers who are in completely different places in the world of thought.
<p> However, this can also easily confuse you. The solution of one problem leads to requirements and limitations that are imposed on subsequent answers, and those who have not found that first solution sometimes do not understand why certain later answers are not possible. With all our wandering around it is therefore important not to lose sight of <a href="../So/Place.htm"> the line of the trip </a>.
EOF
output_template = <<EOF
<! - saved from url = (0028) https://library.biep.org ->
<html lang = nl>
<head>
<meta HTTP-EQUIV = "Content-Type" CONTENT = "text / html; charset = windows-1252">
<script language = JavaScript src = "../../../ Sheet.js"> </script>
<link href = "../../../ Blad.css" rel = "stylesheet" type = "text / css">
</head>
<body>
</body>
</html>
EOF
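# Parse the partial input as a fragment rather than as a complete document.
fragment = Nokogiri::HTML::DocumentFragment.parse(input)
# Parse the template as a full document; it supplies <head> and an empty <body>.
output = Nokogiri::HTML::Document.parse(output_template)
# Detach <title> from the fragment and move it into the template's <head>.
title = fragment.at_css("title")
title.remove
output.at_css("head").add_child(title)
# Append what remains of the fragment to the template's <body>.
output.at_css("body").add_child(fragment)
puts output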
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html lang="nl">
# >> <head>
# >> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
# >> <script language="JavaScript" src="../../../%20Sheet.js"> </script>
# >> <link href="../../../%20Blad.css" rel="stylesheet" type="text / css">
# >> <title> The journey </title>
# >> </head>
# >> <body>
# >>
# >>
# >>
# >> <h1> The <index class="estimate"> Trip </index> </h1>
# >> <p> Our scavenger hunt consists of several stages, starting with the <index class="treasure"> crossing </index> &amp; mdash; how do we actually get to those islands. The nice thing about an intellectual quest is that we can be in all kinds of places at the same time. If we ever want to revisit a previous episode, all we have to do is click there, and even though we haven't finished a particular stage yet, we can already look ahead to the next. We can also keep in touch with fellow travelers who are in completely different places in the world of thought.
# >> </p>
# >> <p> However, this can also easily confuse you. The solution of one problem leads to requirements and limitations that are imposed on subsequent answers, and those who have not found that first solution sometimes do not understand why certain later answers are not possible. With all our wandering around it is therefore important not to lose sight of <a href="../So/Place.htm"> the line of the trip </a>.
# >> </p>
# >> </body>
# >> </html>