Skip to content

anonymous /gist:a17f0e1dd4f63404c744 secret

Embed URL


Subversion checkout URL

You can clone with
Download ZIP
val postIDTags = postsXML.flatMap { line =>
// Matches Id="..." ... Tags="..." in line
val idTagRegex = "Id=\"(\\d+)\".+Tags=\"([^\"]+)\"".r
// // Finds tags like <TAG> value from above
val tagRegex = "&lt;([^&]+)&gt;".r
// Yields 0 or 1 matches:
idTagRegex.findFirstMatchIn(line) match {
// No match -- not a line
case None => None
// Match, and can extract ID and tags from m
case Some(m) => {
val postID =
val tagsString =
// Pick out just TAG matching group
val tags = tagRegex.findAllMatchIn(tagsString).map(
// Keep only question with at least 4 tags, and map to (post,tag) tuples
if (tags.size >= 4),_)) else None
// Because of flatMap, individual lists will concatenate
// into one collection of tuples
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.