@steven-tey
Last active July 29, 2024 20:57
Get Title from URL
// Note: this gist is a part of this OSS project that I'm currently working on: https://github.com/steven-tey/dub
export default async function getTitleFromUrl(url: string) {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), 2000); // timeout if it takes longer than 2 seconds
  const title = await fetch(url, { signal: controller.signal })
    .then((res) => {
      clearTimeout(timeoutId);
      return res.text();
    })
    .then((body: string) => {
      const match = body.match(/<title>([^<]*)<\/title>/); // regular expression to parse contents of the <title> tag
      if (!match || typeof match[1] !== "string") return "No title found"; // if no title found, return "No title found"
      return match[1];
    })
    .catch((err) => {
      console.log(err);
      return "No title found"; // if there's an error, return "No title found"
    });
  return title;
}
@Marcisbee

Can you give more detail? Why doesn't this work with CORS?

Sure. This code can run in both Node and the browser. In the browser, fetch enforces CORS, so a cross-origin response is blocked unless the target website explicitly allows it.
So running, for example, await getTitleFromUrl('https://www.kickstarter.com/projects/81monkeys/world-of-anterra') in the browser results in an error, and the function returns "No title found".
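
To see which kind of failure the catch block is swallowing, here is a minimal sketch, assuming a hypothetical helper named classifyFetchError that is not part of the gist. It relies on two facts: fetch rejects with a TypeError on network-level failures (which in the browser includes CORS blocks), and an aborted request rejects with an error whose name is "AbortError". The browser does not reveal whether a TypeError was specifically a CORS block, so the label below is deliberately broad.

```typescript
// Hypothetical helper (not in the gist): labels the error that fetch rejected with.
type FetchFailure = "timeout" | "network-or-cors" | "other";

function classifyFetchError(err: unknown): FetchFailure {
  // controller.abort() makes fetch reject with an error named "AbortError".
  const name = (err as { name?: unknown } | null)?.name;
  if (name === "AbortError") return "timeout";
  // Network-level failures, including CORS blocks in the browser, surface as TypeError.
  if (err instanceof TypeError) return "network-or-cors";
  return "other";
}
```

Inside the gist's .catch((err) => ...) handler, logging classifyFetchError(err) alongside err would show whether the 2-second timeout fired or the request itself was rejected.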

@joshdance

@Marcisbee thanks, that makes sense.

@steven-tey
Author

@Marcisbee @joshdance Oh yeah, good point. I should've mentioned that in the demo I'm calling a Next.js Edge API Route that runs the getTitleFromUrl function on the server, where the browser's CORS rules don't apply. Here's the full code for that:

// /api/utils/title-from-url.ts

import type { NextRequest } from "next/server";
import { getTitleFromUrl } from "@/lib/utils";

export const config = {
  runtime: "experimental-edge",
};

export default async function handler(req: NextRequest) {
  if (req.method === "GET") {
    const url = req.nextUrl.searchParams.get("url");
    if (!url) {
      return new Response("Missing url", { status: 400 });
    }
    const title = await getTitleFromUrl(url);
    return new Response(JSON.stringify(title), { status: 200 });
  } else {
    return new Response(`Method ${req.method} Not Allowed`, { status: 405 });
  }
}

And you'll basically call the /api/utils/title-from-url endpoint on the client side with SWR/fetch.
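
A minimal sketch of that client-side call, assuming hypothetical helper names (buildTitleEndpoint, fetchTitle) and a fetch-like function passed in as a parameter. Only the endpoint path and the url query parameter come from the route above. Because the route responds with JSON.stringify(title), the body parses back to a plain string.

```typescript
// Minimal shape of what we need from fetch, so the sketch stays self-contained.
type FetchLike = (
  input: string
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Hypothetical helper: builds the request path, percent-encoding the target URL
// so its own "?" and "&" characters don't leak into our query string.
function buildTitleEndpoint(targetUrl: string): string {
  return `/api/utils/title-from-url?url=${encodeURIComponent(targetUrl)}`;
}

// Hypothetical helper: calls the route and returns the title string.
async function fetchTitle(targetUrl: string, fetchImpl: FetchLike): Promise<string> {
  const res = await fetchImpl(buildTitleEndpoint(targetUrl));
  if (!res.ok) {
    // The route returns 400 for a missing url and 405 for non-GET methods.
    throw new Error(`Title lookup failed with status ${res.status}`);
  }
  return (await res.json()) as string;
}
```

In the browser you would call fetchTitle(someUrl, fetch). With SWR, the string from buildTitleEndpoint could serve as the key, paired with whatever fetcher you already use.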

Hope that clarifies things!

@pablomikel

Super cool gist! I ran into an issue when a site's <title> tag has attributes on it (common for sites that use react-helmet), so I'd propose this fix to get around that problem:

.then((body: string) => {
  const match = body.match(/<title([^<]*)>([^<]*)<\/title>/) // regular expression to parse contents of the <title> tag
  if (!match || typeof match[match.length - 1] !== "string")
    return "No title found" // if no title found, return "No title found"
  return match[match.length - 1]
})
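
For anyone who wants to try the idea outside the promise chain, here is a minimal sketch of a standalone variant, assuming a hypothetical function name extractTitle. It is a variation on the regex above rather than the same pattern: it skips attributes without capturing them, matches the tag case-insensitively, and trims surrounding whitespace. Like the original, it is a regex heuristic and not a full HTML parser.

```typescript
// Hypothetical standalone variant of the title-matching step.
function extractTitle(html: string): string {
  // <title, optionally followed by whitespace and attributes, then the text up to the next "<".
  const match = html.match(/<title(?:\s[^>]*)?>([^<]*)<\/title>/i);
  if (!match) return "No title found";
  return match[1].trim();
}
```

Requiring whitespace before the attributes keeps the pattern from matching other tags whose names merely start with "title".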
