anonymous /gist:a17f0e1dd4f63404c744 secret

val postIDTags = postsXML.flatMap { line =>
  // Matches Id="..." ... Tags="..." in line
  val idTagRegex = "Id=\"(\\d+)\".+Tags=\"([^\"]+)\"".r
  // Finds tags like <TAG> (escaped as &lt;TAG&gt;) in the Tags value from above
  val tagRegex = "&lt;([^&]+)&gt;".r
  // Yields 0 or 1 matches:
  idTagRegex.findFirstMatchIn(line) match {
    // No match -- line has no Id and Tags attributes
    case None => None
    // Match, and can extract ID and tags from m
    case Some(m) => {
      // Group 1 is the numeric Id, group 2 is the raw Tags value
      val postID = m.group(1).toInt
      val tagsString = m.group(2)
      // Pick out just TAG matching group
      val tags = tagRegex.findAllMatchIn(tagsString).map(_.group(1)).toList
      // Keep only questions with at least 4 tags, and map to (post,tag) tuples
      if (tags.size >= 4) tags.map((postID,_)) else None
    }
  }
  // Because of flatMap, individual lists will concatenate
  // into one collection of tuples
}
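
To try the extraction logic without a Spark context, the body of the flatMap can be pulled into a plain function and run over an ordinary Seq. This is only a sketch: the object name PostTagsSketch, the helper extractPostTags, and the sample input lines are made up for illustration. It returns Nil rather than None for the empty cases so that the function has a single List return type.

```scala
object PostTagsSketch {
  // Same two patterns as in the snippet above
  private val idTagRegex = "Id=\"(\\d+)\".+Tags=\"([^\"]+)\"".r
  private val tagRegex = "&lt;([^&]+)&gt;".r

  // Hypothetical helper: the flatMap body as a standalone function
  def extractPostTags(line: String): List[(Int, String)] =
    idTagRegex.findFirstMatchIn(line) match {
      // Line has no Id/Tags attributes: contribute nothing
      case None => Nil
      case Some(m) =>
        val postID = m.group(1).toInt
        val tags = tagRegex.findAllMatchIn(m.group(2)).map(_.group(1)).toList
        // Keep only posts with at least 4 tags
        if (tags.size >= 4) tags.map((postID, _)) else Nil
    }

  def main(args: Array[String]): Unit = {
    // Made-up input lines, shaped like rows with Id and Tags attributes
    val lines = Seq(
      "<row Id=\"1\" Score=\"5\" Tags=\"&lt;scala&gt;&lt;regex&gt;&lt;spark&gt;&lt;rdd&gt;\" />",
      "<row Id=\"2\" Score=\"3\" Tags=\"&lt;java&gt;&lt;xml&gt;\" />",
      "<posts>"
    )
    // flatMap concatenates the per-line lists into one collection of tuples
    lines.flatMap(extractPostTags).foreach(println)
  }
}
```

Only the first sample line has four tags, so it is the only one that contributes (post, tag) pairs. The second has too few tags and the third does not match the Id/Tags pattern at all.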