Last active
March 4, 2018 16:35
Generating a Spark Dataset<Row> from a JSON string
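import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.util.ArrayList;
import java.util.List;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;
// Assumes objectMapper (Jackson ObjectMapper), input (a JSON array string) and spark (SparkSession) are in scope.
ArrayNode rootNode = (ArrayNode) objectMapper.readTree(input);
ObjectNode rootObject = (ObjectNode) rootNode.get(0); // only the first array element is used
ArrayList<StructField> stfList = new ArrayList<>();
ArrayList<String> values = new ArrayList<>();
rootObject.fields().forEachRemaining(e -> {
    stfList.add(new StructField(e.getKey(), DataTypes.StringType, false, Metadata.empty()));
    values.add(e.getValue().asText());
});
StructType st = new StructType(stfList.toArray(new StructField[0]));
// RowFactory is the public API for building rows; GenericRow is internal to Catalyst.
Row row = RowFactory.create(values.toArray());
List<Row> rows = new ArrayList<>();
rows.add(row);
Dataset<Row> df = spark.createDataFrame(rows, st);
df.show();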
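The snippet above builds a single row from `rootNode.get(0)`. If every element of the array should become a row, one possible approach is to take the column names from the first object and look each value up by name, so that rows stay aligned with the schema even when later objects list their fields in a different order. The following is a minimal plain-Java sketch of that alignment step only. `RowAligner`, `columns` and `alignRows` are hypothetical names that are not part of the gist, and `Map<String, String>` stands in for a parsed `ObjectNode`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the gist: maps stand in for parsed ObjectNodes.
public class RowAligner {

    // Column names come from the first record, in its insertion order.
    public static List<String> columns(List<Map<String, String>> records) {
        if (records.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(records.get(0).keySet());
    }

    // One value list per record, ordered by column name rather than by
    // each record's own field order. Missing fields become null.
    public static List<List<String>> alignRows(List<Map<String, String>> records) {
        List<String> cols = columns(records);
        List<List<String>> rows = new ArrayList<>();
        for (Map<String, String> record : records) {
            List<String> row = new ArrayList<>();
            for (String col : cols) {
                row.add(record.get(col));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> first = new LinkedHashMap<>();
        first.put("id", "1");
        first.put("name", "a");
        Map<String, String> second = new LinkedHashMap<>();
        second.put("name", "b"); // different field order than the first record
        second.put("id", "2");

        List<Map<String, String>> records = Arrays.asList(first, second);
        System.out.println(columns(records));   // prints [id, name]
        System.out.println(alignRows(records)); // prints [[1, a], [2, b]]
    }
}
```

Each aligned list could then be turned into a `Row` with `RowFactory.create(row.toArray())`. Because a missing field yields `null` in this sketch, the corresponding `StructField` would need `nullable` set to `true` rather than `false`.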