@anatolebeuzon
Last active January 2, 2020 19:22
Generating and printing a crossword of any size

I recently needed to create a giant crossword (4 sheets of A0 paper, so about 4 m^2 of crossword). Here is a quick how-to.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+               +               +               +               +
+               +               +               +               +
+               +               +               +               +
+               +               +               +               +
+      A0       +      A0       +      A0       +      A0       +
+               +               +               +               +
+               +               +               +               +
+               +               +               +               +
+               +               +               +               +
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Getting a crossword dataset

We need a list of words and their corresponding clues.

Crosswords are generally protected by copyright, and very few datasets are publicly available. However, a great dataset of historic New York Times crosswords is available here: https://github.com/cwang912/nyt-crossword.

Parse the dataset

The NYT dataset is in TSV format.

We need to:

  • Put it in a syntax that genxword (our crossword generator tool) will recognize
  • Remove some words which, oddly enough, don't have a corresponding clue

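parse.go (listed at the end of this page) does this. Run it with go run parse.go from the directory that contains the dataset file, which it expects to be named clues.txt. If you run into parsing issues, you might need to convert bare \r line endings to \r\n.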

We'll name the output clues_parsed.txt.
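
If the dataset does use bare \r line endings, they can be normalized in a small pre-processing step before parsing. The following is a minimal sketch, not part of the original parse.go: normalizeNewlines is a hypothetical helper name, and it normalizes to \n rather than \r\n, since Go's encoding/csv reader accepts both as record terminators.

```go
package main

import (
	"bytes"
	"fmt"
)

// normalizeNewlines is a hypothetical helper (not in the original parse.go).
// It turns "\r\n" and bare "\r" line endings into "\n".
func normalizeNewlines(data []byte) []byte {
	// Collapse Windows-style endings first so their "\r" is not converted twice.
	data = bytes.ReplaceAll(data, []byte("\r\n"), []byte("\n"))
	// Any "\r" left at this point is a bare carriage return.
	return bytes.ReplaceAll(data, []byte("\r"), []byte("\n"))
}

func main() {
	// Made-up sample input: two tab-separated records with mixed line endings.
	sample := []byte("clue one\tWORD\rclue two\tOTHER\r\n")
	fmt.Printf("%q\n", normalizeNewlines(sample))
}
```

In practice you would read clues.txt with os.ReadFile, pass the bytes through this function, and hand the result to the CSV reader via bytes.NewReader.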

Generate the crossword puzzle

genxword is a great tool for this. We ask for SVG output (the s option) so that other tools can handle sizing and print layout; -n sets the number of words genxword uses from the list.

genxword -n 2000 /path/to/clues_parsed.txt s

genxword will interactively ask for the grid size.

A0 paper is 1189 x 841 mm. Say that we want each crossword box to be about 20 x 20 mm. That means:

  • floor(1189 / 20) = 59 rows
  • floor(4 * 841 / 20) = 168 columns

Use slightly smaller values to leave room for print margins.
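
The grid arithmetic above can be sketched in a few lines of Go, which makes it easy to try other box sizes or margins. The function name gridCells and the 10 mm margin in the second call are illustrative assumptions, not values from the original workflow.

```go
package main

import "fmt"

// gridCells returns how many whole boxes of side boxMM fit along lengthMM
// after reserving marginMM at each end. All values are in millimetres.
func gridCells(lengthMM, boxMM, marginMM int) int {
	usable := lengthMM - 2*marginMM
	if usable <= 0 || boxMM <= 0 {
		return 0
	}
	return usable / boxMM // integer division floors for positive values
}

func main() {
	const (
		a0LongMM  = 1189 // long side of an A0 sheet
		a0ShortMM = 841  // short side of an A0 sheet
		sheets    = 4    // sheets laid side by side along their short sides
		boxMM     = 20   // target size of one crossword box
	)

	// No margin: reproduces the figures in the text.
	fmt.Println(gridCells(a0LongMM, boxMM, 0), gridCells(sheets*a0ShortMM, boxMM, 0)) // 59 168

	// Hypothetical 10 mm margin on every outer edge.
	fmt.Println(gridCells(a0LongMM, boxMM, 10), gridCells(sheets*a0ShortMM, boxMM, 10)) // 58 167
}
```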

Generate a PDF from this SVG

Use cairosvg to convert the SVG generated in the previous step into a PDF.

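Say we want a 300 dpi PDF. The -W and -H options of cairosvg take sizes in pixels. The values below correspond to a drawing area of about 3200 x 1150 mm at 300 dpi, slightly less than the full 3364 x 1189 mm covered by four A0 sheets (1189 x 841 mm each):

  • width = 37795 pixels (3200 / 25.4 * 300)
  • height = 13583 pixels (1150 / 25.4 * 300)
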
cairosvg -f pdf -d 300 -W 37795 -H 13583 -s 1 -o cairo_output.pdf genxword_output.svg
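
To recompute the -W and -H values for a different drawing area or resolution, the millimetre-to-pixel conversion can be sketched as below. The helper name mmToPixels is hypothetical, and the 3200 x 1150 mm area is inferred from the numbers in the command rather than stated explicitly in the original.

```go
package main

import (
	"fmt"
	"math"
)

const mmPerInch = 25.4

// mmToPixels converts a length in millimetres to a whole number of pixels
// at the given resolution (dots per inch), rounding to the nearest pixel.
func mmToPixels(mm, dpi float64) int {
	return int(math.Round(mm / mmPerInch * dpi))
}

func main() {
	// Drawing area implied by the cairosvg command above (an inference).
	fmt.Println(mmToPixels(3200, 300), mmToPixels(1150, 300)) // 37795 13583

	// Full extent of four A0 sheets side by side, for comparison.
	fmt.Println(mmToPixels(4*841, 300), mmToPixels(1189, 300)) // 39732 14043
}
```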

Divide this PDF into a multi-page PDF

This giant crossword will actually be printed on multiple A0 sheets. pdfposter can divide it up into multiple pages.

The poster size is four A0 sheets side by side (-p 4x1a0) and the media size, i.e. the size of each output page, is A0 (-m a0):

pdfposter -m a0 -p 4x1a0 cairo_output.pdf pdfposter_output.pdf

Add margins [optional]

You may want to add print margins to the document. Depending on the A0 printer you'll use, you might not need this.

Use pdfjam, replacing 0.3cm with whatever margin your printer needs (the negative --trim values add space around each page instead of cropping it):

pdfjam --paper a0paper --trim "-0.3cm -0.3cm -0.3cm -0.3cm" pdfposter_output.pdf -o pdfjam_output.pdf

TADA!

You can now print your giant crossword.

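// parse.go converts the tab-separated NYT clue dataset (clues.txt) into
// space-separated lines for genxword, skipping rows with a missing field.
package main

import (
	"bufio"
	"encoding/csv"
	"errors"
	"fmt"
	"io"
	"os"
)

const file = "clues.txt"

func main() {
	in, err := os.Open(file)
	if err != nil {
		panic(err)
	}
	defer in.Close()

	r := csv.NewReader(in)
	r.Comma = '\t'
	r.LazyQuotes = true // tolerate stray double quotes inside fields

	out, err := os.Create("clues_parsed.txt")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	w := bufio.NewWriter(out)

	// currentLine counts records; record 0 is the header row.
	for currentLine := 0; ; currentLine++ {
		line, err := r.Read()
		if err == io.EOF {
			break
		}
		if errors.Is(err, csv.ErrFieldCount) {
			fmt.Println(line)
			panic(fmt.Errorf("csv.ErrFieldCount: expected %d, got %d", r.FieldsPerRecord, len(line)))
		}
		if err != nil {
			panic(err)
		}
		if currentLine == 0 {
			continue // skip the header row
		}
		if line[0] == "" || line[1] == "" {
			fmt.Printf("Ignored line %d with missing text\n", currentLine)
			continue // do not write incomplete entries to the output
		}
		// Write column 1 followed by column 0, separated by a space.
		if _, err := fmt.Fprintf(w, "%s %s\n", line[1], line[0]); err != nil {
			panic(err)
		}
	}

	// Flush buffered output before the deferred Close runs.
	if err := w.Flush(); err != nil {
		panic(err)
	}
}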