Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?

(My)SQL Batch Multi-Line Insert Data Export/Extracter for DataGrip

This is a better version of the multi-line insert script for DataGrip, than what is currently floating around.

Compatibility

This is likely compatible with most, if not all, other SQL dialects. However, I am only choosing to directly support & test MySQL.

If you want to make this compatible with other SQL dialects go ahead and fork this and add support yourself and mention it in the comments. I will NOT add support for other dialects. (It's a gist, not a repo that I want to continue to support forever)

Install (DataGrip)

  1. Go to the "Data Extracters" drop down. (in results bar)
  2. Select "Go to Scripts Directory"
  3. Create a new file called "SQL Batch Multi-Line Insert.sql.groovy" (The double extension is important for syntax highlighting)
  4. Paste & save the below script
  5. (optional) Adjust the configuration variables at the top to your liking

Features

  • Batch inserts (defaults to 1000 rows at a time)
  • Minify
  • Custom indentation
  • etc. (see the configuration variables)
/**
* MySQL Multi-Line Insert
* =======================
* See: https://gist.github.com/ProjectCleverWeb/d2362b082af1d7054ebfd464f202ec1b
*
* Notes:
* - I used a output buffer variable since in most languages variables are
* significantly faster than function calls and in this case the syntax for
* using a variable was both shorter and easier to read.
* - Table & column names are quoted to help improve compatibility, but in most
* cases it safe to remove them.
* - Having keywords set to "do not change" for the letter-case will assume
* lowercase keywords. (DataGrip determines this)
*
* @author Nicholas Summers
* @license MIT
*
* Available context bindings:
* COLUMNS List<DataColumn>
* ROWS Iterable<DataRow>
* OUT { append() }
* FORMATTER { format(row, col); formatValue(Object, col) }
* plus ALL_COLUMNS, TABLE, DIALECT
*
* where:
* DataRow { rowNumber(); first(); last(); data(): List<Object>; value(column): Object }
* DataColumn { columnNumber(), name() }
*/
BATCH_SIZE = 1000
DEFAULT_TABLE_NAME = "TABLE_NAME"
MINIFY = false
COLUMNS_ON_ONE_LINE = true
SHOW_CONFIG = true
INDENT = "\t"
VALUE_QUOTE = "\'"
COLUMN_QUOTE = "`"
TABLE_QUOTE = "`"
REAL_EOL = System.getProperty("line.separator") // End of line delimiter
EOL = REAL_EOL
OB = "" // Output Buffer Variable
RI = 0 // Row Iterator Variable
BI = 0 // Batch Iterator Variable
VALUE_DELIMITER = ","
COLUMN_DELIMITER = ","
STATEMENT_DELIMITER = ";"
COLUMN_INDENT = INDENT
ROW_INDENT = ""
KEYWORDS_LOWERCASE = com.intellij.database.util.DbSqlUtil.areKeywordsLowerCase(PROJECT)
KW_INSERT_INTO = KEYWORDS_LOWERCASE ? "insert into " : "INSERT INTO "
KW_VALUES = KEYWORDS_LOWERCASE ? "values" : "VALUES"
KW_NULL = KEYWORDS_LOWERCASE ? "null" : "NULL"
KW_TRUE = KEYWORDS_LOWERCASE ? "true" : "TRUE"
KW_FALSE = KEYWORDS_LOWERCASE ? "false" : "FALSE"
LAST_INDEX = COLUMNS.size() - 1
// Main loop function
def mmli() {
ROWS.each { row ->
mmliPrintRow(COLUMNS, row)
}
OUT.append(
STATEMENT_DELIMITER + REAL_EOL + REAL_EOL +
"-- Inserts Created: " + RI + REAL_EOL +
"-- Batches Created: " + BI + REAL_EOL
)
}
// Generate table name, with parent if it is available
TABLE_NAME = ""
if (TABLE != null) {
TABLE_PARENT = TABLE.getParent().getName()
if (TABLE_PARENT != null && TABLE_PARENT != "") {
TABLE_NAME += TABLE_QUOTE + TABLE_PARENT + TABLE_QUOTE + "."
}
TABLE_NAME += TABLE_QUOTE + TABLE.getName() + TABLE_QUOTE
} else {
TABLE_NAME += TABLE_QUOTE + DEFAULT_TABLE_NAME + TABLE_QUOTE
}
// Show config in comments
if (SHOW_CONFIG) {
OB += "-- Batch Size: " + BATCH_SIZE + EOL
// OB += "-- Table: " + TABLE_NAME + EOL
OB += "-- Minify: " + (MINIFY ? "TRUE" : "FALSE") + EOL
OB += "-- Indent: '" + INDENT + "'" + EOL
OB += EOL
}
// Make extra whitespace empty strings to minify
if (MINIFY) {
EOL = ""
COLUMN_INDENT = ""
ROW_INDENT = ""
}
// Handles batching & printing the buffer.
def mmliPrintRow(columns, row) {
if (RI % BATCH_SIZE == 0) {
// Handle a beginning row
BI++ // Count batches
if (RI != 0) {
// Separate each new statement
OB += STATEMENT_DELIMITER + REAL_EOL + REAL_EOL
}
mmliHandleBeginStatement(columns)
} else {
// Handle a continuation row
OB += VALUE_DELIMITER + EOL
}
// Handle a row's column data
mmliHandleRow(columns, row)
// Print, clear buffer, & increment row count
OUT.append(OB)
OB = ''
RI++
}
// Print the beginning of an insert
def mmliHandleBeginStatement(columns) {
OB += "INSERT INTO " + TABLE_NAME + " (" + EOL + COLUMN_INDENT
// Loop through each column and print its name
columns.eachWithIndex { column, index ->
OB += COLUMN_QUOTE + column.name() + COLUMN_QUOTE
if (index != LAST_INDEX) {
OB += COLUMN_DELIMITER
if (!COLUMNS_ON_ONE_LINE) {
OB += EOL + COLUMN_INDENT
}
}
}
OB += EOL + ")" + EOL + "VALUES" + EOL
}
// Print the values to insert
def mmliHandleRow(columns, row) {
OB += ROW_INDENT + "(" // open new row
// Loop through each column and print its value
columns.eachWithIndex { column, index ->
def stringValue = FORMATTER.format(row, column)
def rawValue = row.value(column)
// Handle MySQL backslashes
if (DIALECT.getDbms().isMysql()) {
stringValue = stringValue.replace("\\", "\\\\")
}
stringValue = stringValue.replace(VALUE_QUOTE, VALUE_QUOTE + VALUE_QUOTE)
// Determine value type so it can be printed correctly
if (rawValue instanceof Boolean) {
OB += rawValue ? KW_TRUE : KW_FALSE
} else if (rawValue == null) {
OB += KW_NULL
} else if (rawValue.toString().isNumber()) {
OB += stringValue
} else {
OB += VALUE_QUOTE + stringValue + VALUE_QUOTE
}
// Delimit each value
if (index != LAST_INDEX) {
OB += VALUE_DELIMITER
}
}
OB += ")" // close new row
}
// Run
mmli()
@Sherpard
Copy link

Sherpard commented Nov 2, 2020

It has issues with boolean types, but other than that, it works wonderfully, Thank you!

@ProjectCleverWeb
Copy link
Author

ProjectCleverWeb commented Nov 2, 2020

@Sherpard - Can you describe what you mean when you say it has issues with boolean types?

As far as I am aware, MySQL only stores booleans as single-digit integers.

@Sherpard
Copy link

Sherpard commented Nov 4, 2020

@ProjectClever That's true, but the extractor takes a 'true' or 'false' with quotes, instead of a quoteless true/false (accepted) or a 0/1.

@ProjectCleverWeb
Copy link
Author

ProjectCleverWeb commented Nov 4, 2020

@Shepard - I cannot replicate the issue you are describing. Would you be able to generate a sample DDL I could test against?

Also, please make sure that the error in question isn't caused by a column that has a type of ENUM('TRUE','FALSE'). Enum's store their values as integers but always return strings when SELECTed.

@Sherpard
Copy link

Sherpard commented Nov 5, 2020

Sure, here you got
https://gist.github.com/Sherpard/114e762a9c57ae30638917c5eb583e39

data type being used is "bit", sorry for the confusion with boolean...

@ProjectCleverWeb
Copy link
Author

ProjectCleverWeb commented Nov 5, 2020

@Shepard - Thanks! That helped me find & fix the issue. Honestly, I wasn't even aware BIT was a column type. The script has been updated with that fix as well as a few other improvements.

@tamert
Copy link

tamert commented Oct 8, 2021

Youniverse mann.. thank you so much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment