Last active
July 18, 2018 10:32
Html Parser using Jsoup
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.helper.W3CDom;
import org.jsoup.nodes.Document;

public class HtmlParser {
    public static void main(String[] args) throws IOException {
        convertXML();
    }

    public static void convertXML() throws IOException {
        String getUrl = "https://economictimes.indiatimes.com/motilal-oswal-long-term-equity-fund--direct-plan/mfportfolio/schemeid-29162.cms";
        // Fetch the page and parse it into a jsoup document.
        Document document = Jsoup.connect(getUrl).get();
        // Use XML syntax when serializing the jsoup document.
        document.outputSettings().syntax(Document.OutputSettings.Syntax.xml);
        System.out.println(document.html());
        // Convert the jsoup tree into a W3C DOM document.
        W3CDom w3cDom = new W3CDom();
        org.w3c.dom.Document w3cDoc = w3cDom.fromJsoup(document);
    }
}
Hi,
This prints the HTML content, but afterwards w3cDoc shows the value [#document: null].
Could you give me some guidance on whether converting HTML to well-formed XML is possible through the API?
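
A likely explanation, offered as a sketch rather than a confirmed diagnosis: `[#document: null]` looks like the default `toString()` of a W3C DOM document node (node name `#document`, node value `null`), so the converted document may not be empty at all. To see its content as XML, it needs to be serialized, for example with the JDK's `javax.xml.transform` API. In the example below, the class name `DomToXml` and the helpers `toXmlString` and `sampleDocument` are hypothetical; `sampleDocument` is only a stand-in for the `w3cDoc` returned by `W3CDom.fromJsoup(document)` in the gist, so that the snippet runs without jsoup.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomToXml {

    // Serializes a W3C DOM document to an XML string (hypothetical helper).
    public static String toXmlString(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Force XML output; some transformers switch to HTML output
        // when the root element is named "html".
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Stand-in for the document produced by W3CDom.fromJsoup(document);
    // built by hand here so the example has no jsoup dependency.
    public static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element html = doc.createElement("html");
        Element title = doc.createElement("title");
        title.setTextContent("Hello");
        html.appendChild(title);
        doc.appendChild(html);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = sampleDocument();
        // Printing the node directly shows only its default string form.
        System.out.println(doc);
        // Serializing it shows the actual XML content.
        System.out.println(toXmlString(doc));
    }
}
```

In the gist's code, the same helper could be called as `DomToXml.toXmlString(w3cDoc)` right after the `fromJsoup` call. Whether the result is what you need for that particular page is something to verify against the real output.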