"""
This script demonstrates an optimized pipeline.
This is not the full code, merely a snippet.
1. Gets the absolute list of filenames.
2. Builds a dataset from the list of filenames using from_tensor_slices().
3. Sharding is done ahead of time.
4. The dataset is shuffled during training.
5. The dataset is then interleaved in parallel: several files at a time (the number set by cycle_length) are read and processed concurrently, each as a TFRecord dataset.
6. The dataset is then prefetched. buffer_size sets how many records are prefetched, which is usually the mini-batch size of the job.
"""
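The six steps above could be sketched with the tf.data API roughly as follows. This is a sketch under assumptions, not the original script: the function names `shard_filenames` and `build_dataset`, all default values, and the choice of `Dataset.interleave` with `num_parallel_calls` for the parallel interleave step are illustrative. Slicing the filename list in plain Python is only one reading of "sharding is done ahead of time", and `tf.data.AUTOTUNE` assumes TensorFlow 2.4 or later.

```python
def shard_filenames(filenames, num_shards, shard_index):
    """Step 3 (one interpretation): pick this worker's files ahead of time."""
    if not 0 <= shard_index < num_shards:
        raise ValueError("shard_index must be in [0, num_shards)")
    # Sort first so every worker slices the same ordering.
    return sorted(filenames)[shard_index::num_shards]


def build_dataset(filenames, num_shards=1, shard_index=0, training=True,
                  cycle_length=4, batch_size=32):
    """Hypothetical builder following steps 2-6 of the docstring above."""
    # Imported lazily so shard_filenames() works without TensorFlow installed.
    import tensorflow as tf

    files = shard_filenames(filenames, num_shards, shard_index)
    if not files:
        raise ValueError("no files assigned to this shard")
    # Step 2: a dataset whose elements are filename strings.
    dataset = tf.data.Dataset.from_tensor_slices(files)
    # Step 4: shuffle the file order only while training.
    if training:
        dataset = dataset.shuffle(buffer_size=len(files))
    # Step 5: read up to cycle_length files concurrently as TFRecord datasets.
    dataset = dataset.interleave(
        tf.data.TFRecordDataset,
        cycle_length=cycle_length,
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    # Step 6: prefetch roughly one mini-batch worth of records.
    return dataset.prefetch(buffer_size=batch_size)
```

Shuffling filenames rather than individual records keeps the shuffle buffer small; whether record-level shuffling is also needed depends on how the data is laid out across files.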
"""
This snippet demonstrates a non-optimized tf.data pipeline.
This is not the full code, merely a snippet.
1. Gets the absolute list of filenames.
2. Builds a dataset from the list of filenames using TFRecordDataset().
3. Creates a new dataset that loads and preprocesses the images.
4. Shards the dataset.
5. Shuffles the dataset when training.
6. Repeats the dataset.
"""
Hadoop Commands
# test code
cat testfile | ./mapper.py | sort | ./reducer.py
# run a job
hs mapper.py reducer.py input_folder output_folder
# view the results
hadoop fs -cat output_folder/part-00000 | less
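mapper.py and reducer.py are not shown here. As a hedged illustration of what the local test pipeline (`cat | mapper | sort | reducer`) does, the sketch below simulates it in a single Python process with a hypothetical word-count mapper and reducer. The real scripts may compute something entirely different; only the map, sort, reduce flow is the point.

```python
from itertools import groupby


def mapper(lines):
    """Hypothetical mapper: emit one tab-separated (word, 1) pair per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_pairs):
    """Hypothetical reducer: sum the counts for each run of identical keys."""
    keyed = (pair.split("\t") for pair in sorted_pairs)
    for word, group in groupby(keyed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_local(lines):
    """Mimic `cat testfile | ./mapper.py | sort | ./reducer.py` in-process."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    for row in run_local(["the quick fox", "The lazy dog"]):
        print(row)
```

The reducer relies on its input being sorted by key, which is why the shell pipeline places `sort` between the two scripts.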
╔═════════════╦══════════════════╦═══════╗
║ CompanyName ║ CategoricalValue ║ Price ║
╠═════════════╬══════════════════╬═══════╣
║ VW          ║ 1                ║ 20000 ║
║ Acura       ║ 2                ║ 10011 ║
║ Honda       ║ 3                ║ 50000 ║
║ Honda       ║ 3                ║ 10000 ║
╚═════════════╩══════════════════╩═══════╝
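The CategoricalValue column above assigns one integer label to each distinct company name. A minimal sketch of how such codes could be derived in plain Python follows; `label_encode` is a hypothetical helper name, and numbering labels from 1 in order of first appearance is an assumption chosen to match the table.

```python
def label_encode(values):
    """Map each distinct value to an integer code, starting at 1,
    in order of first appearance (an assumption matching the table)."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes) + 1
        encoded.append(codes[value])
    return encoded, codes


# The CompanyName column from the table above.
companies = ["VW", "Acura", "Honda", "Honda"]
encoded, codes = label_encode(companies)
```

Repeated names (the two Honda rows) receive the same code, as in the table.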
def search(numbers, target, first, last):
    # Recursive binary search over the sorted slice numbers[first..last].
    # Returns the index of target, or -1 if it is not present.
    mid = (first + last) // 2
    if first > last:
        return -1
    elif target == numbers[mid]:
        return mid
    elif target < numbers[mid]:
        return search(numbers, target, first, mid - 1)
    else:
        return search(numbers, target, mid + 1, last)
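The recursive search above assumes `numbers` is sorted in ascending order and is called with `first=0` and `last=len(numbers) - 1`. As a sketch of an alternative, not part of the original snippet, the same logic can be written as a loop so the caller does not pass the bounds and no recursion is involved; `search_iterative` is a hypothetical name.

```python
def search_iterative(numbers, target):
    """Binary search over an ascending sorted list; returns an index or -1."""
    first, last = 0, len(numbers) - 1
    while first <= last:
        mid = (first + last) // 2
        if target == numbers[mid]:
            return mid
        if target < numbers[mid]:
            last = mid - 1   # continue in the left half
        else:
            first = mid + 1  # continue in the right half
    return -1


if __name__ == "__main__":
    print(search_iterative([1, 3, 5, 7, 9, 11], 7))
```

If the list contains duplicates, either version may return the index of any matching element, not necessarily the first.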