Michał Siatkowski atais

## brew-perms.sh
#!/bin/sh
# Configure Homebrew permissions to allow multiple users on macOS.
# Any user in the admin group will be able to manage the Homebrew and cask installation on the machine.

# allow admins to manage homebrew's local install directory
chgrp -R admin /usr/local
chmod -R g+w /usr/local

# allow admins to manage homebrew's local cache of formulae and source files
chgrp -R admin /Library/Caches/Homebrew

## Logstash.xml
<filetype binary="false" description="Logstash Config" name="Logstash Config">
  <highlighting>
    <options>
      <option name="LINE_COMMENT" value="#" />
      <option name="COMMENT_START" value="" />
      <option name="COMMENT_END" value="" />
      <option name="HEX_PREFIX" value="" />
      <option name="NUM_POSTFIXES" value="" />
      <option name="HAS_BRACES" value="true" />
      <option name="HAS_BRACKETS" value="true" />

## spark_tips_and_tricks.md

      
dusenberrymw / spark_tips_and_tricks.md
Last active February 8, 2023 05:11
Tips and tricks for Apache Spark.

Spark Tips & Tricks

Misc. Tips & Tricks


If values are integers in [0, 255], Parquet will automatically compress them to 1-byte unsigned integers, decreasing the size of the saved DataFrame by a factor of 8.
Partition DataFrames into evenly distributed partitions of ~128MB each (an empirical finding). Always err on the higher side w.r.t. the number of partitions.
Pay particular attention to the number of partitions when using flatMap, especially if the following operation will result in high memory usage. The flatMap op usually results in a DataFrame with a [much] larger number of rows, yet the number of partitions remains the same. Thus, if a subsequent op causes a large expansion of memory usage (e.g. converting a DataFrame of indices to a DataFrame of large Vectors), the memory usage per partition may become too high. In this case, it is beneficial to repartition the output of flatMap to a number of partitions that will safely allow for appropriate partition memory sizes, based upon the
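
The partition-sizing advice above comes down to simple arithmetic, sketched here in plain Python. This is a rough illustration, not part of the original tips: the helper names `partitions_for` and `partitions_after_expansion` are hypothetical, and the byte estimates are assumed inputs that you would have to measure or approximate for your own data.

```python
# Target partition size: ~128MB, the empirical figure quoted in the tips.
TARGET_PARTITION_BYTES = 128 * 1024 * 1024


def partitions_for(total_bytes, target_bytes=TARGET_PARTITION_BYTES):
    """Return a partition count that keeps each partition at or below target_bytes.

    Rounds up, following the advice to err on the higher side.
    """
    if total_bytes <= 0:
        return 1
    # Integer ceiling division avoids floating-point rounding surprises.
    return max(1, (total_bytes + target_bytes - 1) // target_bytes)


def partitions_after_expansion(rows_out, bytes_per_row_after,
                               target_bytes=TARGET_PARTITION_BYTES):
    """Estimate partitions needed once a later op expands each row.

    rows_out: estimated row count after flatMap (an assumption you supply).
    bytes_per_row_after: estimated row size after the expanding op,
        e.g. after indices have been converted to large vectors.
    """
    return partitions_for(rows_out * bytes_per_row_after, target_bytes)


if __name__ == "__main__":
    # A 10 GiB DataFrame at ~128MB per partition.
    print(partitions_for(10 * 1024**3))  # 80
    # One million rows that each grow to an 8 KiB vector downstream.
    print(partitions_after_expansion(1_000_000, 8192))  # 62
```

The resulting count would then be passed to Spark's `repartition` on the flatMap output before the memory-heavy step runs. Sizing for the expanded rows rather than the flatMap output itself is the point of the tip, since the partition count does not change on its own.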


## learning_scala.org

      
d1egoaz / learning_scala.org
Last active November 5, 2023 21:39

If you want to learn Scala, I’d suggest:

  Install IntelliJ + Scala Plugin
  Don’t do the Coursera courses yet.
  Don’t do the “red book” Functional Programming in Scala yet.
  Do: http://underscore.io/books/
      Essential Scala
      Essential Play
      Essential Slick
  Do Scala for the Impatient: https://www.amazon.com/Scala-Impatient-Cay-S-Horstmann/dp/0321774094
  The Neophyte’s Guide to Scala: http://danielwestheide.com/scala/neophytes.html