Here, I'm trying to explain how various file systems (HDFS, S3, EMRFS) interact with Hadoop. Understanding this helps with some of the tricky problems that arise during development, e.g. authentication and performance issues. Hadoop file systems nowadays support a variety of applications; here I'll focus on EMRFS and Spark.
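To make the role of the URI scheme concrete, here is a minimal sketch of how a scheme such as hdfs or s3 can be mapped to a file system implementation. It is a simplified, standard-library-only model, not Hadoop's actual code: the `SchemeLookup` object, `implFor`, and `exampleConf` are hypothetical names, and the class names in the map are assumptions about a typical EMR setup. The `fs.<scheme>.impl` key pattern is the one Hadoop's configuration uses; the real lookup also has other fallbacks that are omitted here.

```scala
import java.net.URI

// Simplified model of a scheme-based file system lookup.
// This is an illustration only, not Hadoop's implementation.
object SchemeLookup {
  // Illustrative configuration. The key pattern fs.<scheme>.impl follows Hadoop;
  // the values are assumptions about what a typical EMR cluster might configure.
  val exampleConf: Map[String, String] = Map(
    "fs.hdfs.impl" -> "org.apache.hadoop.hdfs.DistributedFileSystem",
    "fs.s3.impl"   -> "com.amazon.ws.emr.hadoop.fs.EmrFileSystem",
    "fs.s3a.impl"  -> "org.apache.hadoop.fs.s3a.S3AFileSystem"
  )

  // Returns the configured implementation class name for the URI's scheme,
  // or None when the URI has no scheme or the scheme is not configured.
  def implFor(uri: String, conf: Map[String, String] = exampleConf): Option[String] =
    Option(new URI(uri).getScheme).flatMap(scheme => conf.get(s"fs.${scheme}.impl"))
}
```

Under the example configuration, `SchemeLookup.implFor("s3://mybucket/objectname")` resolves to the EMRFS class name, while a scheme-less path such as `/tmp/file` resolves to `None`. The point is that the same path-handling code can end up talking to very different storage back ends depending only on the scheme and the cluster configuration.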
Given a URI (e.g. s3://mybucket/objectname), Spark interacts with the Hadoop file system API through the DataSource.write function.
val caseInsensitiveOptions = new CaseInsensitiveMap(options)