@poluektik
Last active March 12, 2020 11:40
object SparkHadoopWriter extends Logging {
… …
/**
  * The basic workflow of this command is:
  * 1. Driver-side setup: prepare the data source and Hadoop configuration for the write job to
  *   be issued.
  * 2. Issue a write job consisting of one or more executor-side tasks, each of which writes all
  *   rows within an RDD partition.
  * 3. If no exception is thrown in a task, commit that task; otherwise abort that task. If any
  *   exception is thrown during task commitment, also abort that task.
  * 4. If all tasks are committed, commit the job; otherwise abort the job. If any exception is
  *   thrown during job commitment, also abort the job.
  */
def write[K, V: ClassTag]...
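
The four steps in the comment above can be sketched as a small, Spark-free model. This is only an illustration of the commit/abort control flow, not the body of `write`: `CommitProtocol`, `RecordingProtocol`, `WriteFlowSketch`, `runTask`, and `runJob` are hypothetical names introduced here, partitions are modelled as plain `Seq`s run sequentially on one JVM, and a failed job is reported as `false` instead of being rethrown.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.NonFatal

// Hypothetical stand-in for a commit protocol; NOT Spark's or Hadoop's actual API.
trait CommitProtocol {
  def setupJob(): Unit
  def commitTask(partitionId: Int): Unit
  def abortTask(partitionId: Int): Unit
  def commitJob(): Unit
  def abortJob(): Unit
}

// Hypothetical protocol that only records which callbacks fired, in order.
class RecordingProtocol extends CommitProtocol {
  val events: ArrayBuffer[String] = ArrayBuffer.empty[String]
  def setupJob(): Unit = events += "setupJob"
  def commitTask(partitionId: Int): Unit = events += s"commitTask($partitionId)"
  def abortTask(partitionId: Int): Unit = events += s"abortTask($partitionId)"
  def commitJob(): Unit = events += "commitJob"
  def abortJob(): Unit = events += "abortJob"
}

object WriteFlowSketch {
  // Steps 2 and 3: write every row of one partition, then commit the task.
  // A failure while writing OR while committing aborts the task and is rethrown.
  def runTask[T](id: Int, rows: Seq[T], writeRow: T => Unit, p: CommitProtocol): Unit =
    try {
      rows.foreach(writeRow)
      p.commitTask(id)
    } catch {
      case NonFatal(e) =>
        p.abortTask(id)
        throw e
    }

  // Steps 1 and 4: set up the job, run one task per partition, then commit the job.
  // A failed task or a failed job commit aborts the job.
  // Simplification: returns false on abort rather than rethrowing.
  def runJob[T](partitions: Seq[Seq[T]], writeRow: T => Unit, p: CommitProtocol): Boolean = {
    p.setupJob()
    try {
      partitions.zipWithIndex.foreach { case (rows, id) => runTask(id, rows, writeRow, p) }
      p.commitJob()
      true
    } catch {
      case NonFatal(_) =>
        p.abortJob()
        false
    }
  }
}
```

Calling `WriteFlowSketch.runJob` with a `RecordingProtocol` and a `writeRow` function that throws for some row shows the ordering: tasks that finished earlier stay committed, the failing task is aborted, and then the whole job is aborted.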