anonymous /gist:a17f0e1dd4f63404c744 secret
// Matches Id="..." ... Tags="..." in a line; compiled once, outside the closure
val idTagRegex = "Id=\"(\\d+)\".+Tags=\"([^\"]+)\"".r
// Finds each XML-escaped tag of the form &lt;TAG&gt; in the Tags value captured above
val tagRegex = "&lt;([^&]+)&gt;".r

val postIDTags = postsXML.flatMap { line =>
  // Yields 0 or 1 matches:
  idTagRegex.findFirstMatchIn(line) match {
    // No match -- not a post row with both Id and Tags
    case None => Nil
    // Match, and can extract ID and tags from m
    case Some(m) =>
      val postID = m.group(1).toInt
      val tagsString = m.group(2)
      // Pick out just the TAG matching group
      val tags = tagRegex.findAllMatchIn(tagsString).map(_.group(1)).toList
      // Keep only questions with at least 4 tags, and map to (post, tag) tuples
      if (tags.size >= 4) tags.map((postID, _)) else Nil
  }
  // Because of flatMap, the individual lists concatenate
  // into one collection of tuples
}
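
// A minimal local sketch of the same extraction, run on a plain Scala Seq so it
// can be tried in the REPL without the real input. The sample lines and the names
// sampleLines, idTagPattern, tagPattern and localPostIDTags are hypothetical
// illustrations, not part of the original data; the row layout is an assumption.
val sampleLines = Seq(
  "<row Id=\"1\" PostTypeId=\"1\" Tags=\"&lt;scala&gt;&lt;regex&gt;&lt;spark&gt;&lt;xml&gt;\" />",
  "<row Id=\"2\" PostTypeId=\"1\" Tags=\"&lt;java&gt;\" />", // only 1 tag: dropped
  "<posts>"                                                  // no Id/Tags: dropped
)
val idTagPattern = "Id=\"(\\d+)\".+Tags=\"([^\"]+)\"".r
val tagPattern = "&lt;([^&]+)&gt;".r
val localPostIDTags = sampleLines.flatMap { line =>
  // Option -> List, so a non-matching line contributes nothing
  idTagPattern.findFirstMatchIn(line).toList.flatMap { m =>
    val postID = m.group(1).toInt
    val tags = tagPattern.findAllMatchIn(m.group(2)).map(_.group(1)).toList
    if (tags.size >= 4) tags.map((postID, _)) else Nil
  }
}
// Only post 1 has at least 4 tags, so it is the only post that yields (post, tag) pairs.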