- If values are integers in [0, 255], Parquet will automatically compress to use 1 byte unsigned integers, thus decreasing the size of saved DataFrame by a factor of 8.
- Partition DataFrames to have evenly-distributed, ~128MB partition sizes (empirical finding). Always err on the higher side w.r.t. number of partitions.
- Pay particular attention to the number of partitions when using
flatMap
, especially if the following operation will result in high memory usage. TheflatMap
op usually results in a DataFrame with a [much] larger number of rows, yet the number of partitions will remain the same. Thus, if a subsequent op causes a large expansion of memory usage (i.e. converting a DataFrame of indices to a DataFrame of large Vectors), the memory usage per partition may become too high. In this case, it is beneficial to repartition the output offlatMap
to a number of partitions that will safely allow for appropriate partition memory sizes, based upon the
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
NTFS Write Support On OS X Mountain Lion | |
Posted on April 21, 2013 | |
Source: https://prateekvjoshi.com/2013/04/21/ntfs-write-support-on-os-x-mountain-lion/ | |
If you have noticed, Mac OS X doesn’t support writing onto NTFS disks. But not to worry, you don’t have to install any third party drivers to enable this. Mountain Lion 10.8.3 already has native write support for the NTFS. OSX Mountain Lion does have built-in support for NTFS, and it can read and write. However, Apple does not enable it by default. | |
Here is what you should do: | |
Uninstall other 3rd-party NTFS software, like Paragon, Tuxera or NTFS-3G. | |
Edit /etc/fstab (you can do this with “sudo vi /etc/fstab”) | |
Add the following line: |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#sudo apt-get remove maven2 | |
sudo add-apt-repository "deb http://ppa.launchpad.net/natecarlson/maven3/ubuntu precise main" | |
sudo apt-get update | |
sudo apt-get install maven3 | |
#If you encounter this: | |
#The program 'mvn' can be found in the following packages: | |
# * maven | |
# * maven2 |
Sometimes you want to have a subdirectory on the master
branch be the root directory of a repository’s gh-pages
branch. This is useful for things like sites developed with Yeoman, or if you have a Jekyll site contained in the master
branch alongside the rest of your code.
For the sake of this example, let’s pretend the subfolder containing your site is named dist
.
Remove the dist
directory from the project’s .gitignore
file (it’s ignored by default by Yeoman).