@jhy
Created July 26, 2010 22:59
// Example solution for removing comments from HTML.
// re: http://groups.google.com/group/jsoup/browse_thread/thread/419b5ac4be88b086
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Node;

public class RemoveComments {
    public static void main(String... args) {
        String h = "<div><!-- no --><p>Hello<!-- gone --></div>";
        Document doc = Jsoup.parse(h);
        removeComments(doc);
        print(doc.html());
    }

    // Recursively removes every comment node at or below the given node.
    private static void removeComments(Node node) {
        // We remove child nodes while iterating, so we cannot use a normal foreach over
        // the children: that would fail with a concurrent modification error.
        int i = 0;
        while (i < node.childNodes().size()) {
            Node child = node.childNode(i);
            if (child.nodeName().equals("#comment"))
                child.remove(); // the next sibling shifts into index i, so do not advance
            else {
                removeComments(child);
                i++;
            }
        }
    }

    private static void print(String msg) {
        System.out.println(msg);
    }
}
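
The index-based loop in removeComments can be tried out without jsoup. The following is a minimal sketch, assuming a plain java.util.ArrayList of strings stands in for a node's child list. The class and method names (IndexedRemoval, removeByIndex, foreachRemovalFails) are hypothetical and not part of the gist or of jsoup. It shows that the index only advances when nothing was removed, and that the foreach alternative typically fails.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical stand-in: strings play the role of node names.
class IndexedRemoval {

    // Same loop shape as removeComments: advance i only when nothing was removed.
    static List<String> removeByIndex(List<String> items) {
        List<String> copy = new ArrayList<>(items);
        int i = 0;
        while (i < copy.size()) {
            if (copy.get(i).equals("#comment"))
                copy.remove(i); // later elements shift left, so index i now holds the next item
            else
                i++;
        }
        return copy;
    }

    // Removing inside a foreach: ArrayList's fail-fast iterator usually detects the change
    // on the next step and throws ConcurrentModificationException (best-effort behaviour).
    static boolean foreachRemovalFails(List<String> items) {
        List<String> copy = new ArrayList<>(items);
        try {
            for (String s : copy) {
                if (s.equals("#comment"))
                    copy.remove(s);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("div", "#comment", "#comment", "p", "#comment");
        System.out.println(removeByIndex(names));
        System.out.println(foreachRemovalFails(names));
    }
}
```

Consecutive comments are the case that breaks a loop which always increments the index, since the second comment would slide into the slot just visited and be skipped.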
@akmishra30

Nice work. You saved me a lot of time.

Thanks dear.
