@josep2
Last active August 14, 2017 00:11
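import java.sql.{Connection, DriverManager, PreparedStatement}
import org.apache.spark.sql.Row

// Rows sent to MySQL per executeBatch() call. Placeholder value: tune it for your workload.
val batchSize = 1000

// Start with a DataFrame and coalesce it down to the number of machines I have
dataframe.coalesce(4).rdd.foreachPartition { (batch: Iterator[Row]) =>
  // Per partition, create one JDBC connection and one prepared statement
  val dbc: Connection = DriverManager.getConnection("JDBCURL")
  val st: PreparedStatement = dbc.prepareStatement("YOUR PREPARED STATEMENT")
  try {
    // Within the partition, split the rows into batches to write to MySQL
    batch.grouped(batchSize).foreach { session =>
      // Add each row of the batch to the statement
      session.foreach { x =>
        // JDBC parameters are 1-based; Row columns are 0-based, so getDouble(1) reads the second column
        st.setDouble(1, x.getDouble(1))
        st.addBatch()
      }
      // Execute the batch command
      st.executeBatch()
    }
  } finally {
    // Close the statement and the connection, even if a batch fails
    st.close()
    dbc.close()
  }
}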