Sameh Sharaf sameh-sharaf

  • Bangkok, Thailand

Databricks Delta Lake - A Friendly Intro

This article introduces Databricks Delta Lake, a storage layer that brings reliability and improved performance to data lakes using Apache Spark.

First, we'll go through the dry parts, which explain what Apache Spark and data lakes are and what issues data lakes face. Then we'll look at Delta Lake and how it solves these issues, with a practical, easy-to-apply tutorial.

Introduction to Apache Spark

If you don't know what Spark is: Apache Spark is a unified analytics engine for large-scale data processing, covering both big data and machine learning workloads. It was originally developed at UC Berkeley in 2009. Apache Spark is 100% open source and hosted at the vendor-independent Apache Software Foundation.

sameh-sharaf / docker.md
Last active December 18, 2019 04:43
Your Humble Docker Guide

This guide gathers the basic commands every Docker user will frequently need. Personally, I keep coming back to it whenever my Docker knowledge gets rusty. I hope it helps you too!

List downloaded or created images

docker images

Delete image

sameh-sharaf / dice_game.rb
Created September 12, 2016 09:40
This is a game where 4 players have 6 dice each. They all throw their dice every round. Dice showing 6 are removed. Dice showing 1 are added to the dice of the player on the right.
##
#
# This is a game where 4 players have 6 dice each. They all throw their dice every round.
# Dice showing 6 are removed. Dice showing 1 are added to the dice of the player on the right.
#
##
class DiceGame
private
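The preview above is cut off, so here is a minimal sketch of a single round as the description states it. It is not the gist's `DiceGame` class: `apply_round` and `random_round` are hypothetical helper names, and two details are assumptions, namely that all players roll at the same time and that the last player's right-hand neighbour is the first player.

```ruby
# Hypothetical helper, not part of the original DiceGame class.
# rolls: one array of face values per player, in seating order.
# Returns how many dice each player holds after the round.
def apply_round(rolls)
  counts = Array.new(rolls.size, 0)
  rolls.each_with_index do |faces, i|
    # Assumption: seating is circular, so the last player passes to the first.
    right = (i + 1) % rolls.size
    faces.each do |face|
      if face == 6
        # A 6 is removed from the game, so nobody counts it.
      elsif face == 1
        counts[right] += 1 # a 1 moves to the player on the right
      else
        counts[i] += 1     # any other face stays with its owner
      end
    end
  end
  counts
end

# Rolls the given number of dice for every player, then applies the rules.
def random_round(dice_counts, rng = Random.new)
  rolls = dice_counts.map { |n| Array.new(n) { rng.rand(1..6) } }
  apply_round(rolls)
end
```

Calling `random_round([6, 6, 6, 6])` plays one round from the starting position. Keeping the rule logic in `apply_round` separate from the random rolls makes the rules easy to check with fixed inputs.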
sameh-sharaf / poll.php
Created September 12, 2016 09:38
A poll class written in PHP to manage poll voting and display the results on a page.
<?php
class Vote
{
private $username = "root";
private $password = "";
private $database = "";
private $barWidth = 200;
private $barHeight = 10;
private $showtablewidth = "280";
private $showAddForm = 400;
sameh-sharaf / stream-s3.js
Created September 12, 2016 09:32
This code is meant to run on AWS Lambda. It streams DynamoDB records and writes them to S3 so they can later be loaded into AWS Redshift.
var sprintf = require('sprintf').sprintf;
/* Tables whose INSERT commands will be written to a CSV file.
'articles' and 'range_test' are sample tables used to test the streamer.
*/
var tables = {
'articles': ['id', 'title', 'body', 'author', 'publish_date'],
'range_test': ['hash_id', 'range_id', 'item', 'quantity'],
};
sameh-sharaf / reports-form.tag
Created September 12, 2016 09:28
This page shows a form built with the Semantic UI library. The form is used to upload report SQL scripts to the system so that the report content can be shown on the reports/view page. The form uses the CodeMirror library, which provides a code-aware text area. It is used through <text-codemirror>, a tag created in a separate file (text-codemirror.tag).
<!--
This page shows a form built with the Semantic UI library.
The form is used to upload report SQL scripts to the system so that the
report content can be shown on the reports/view page.
The form uses the CodeMirror library, which provides a code-aware text area.
It is used through <text-codemirror>, a tag created in a separate file (text-codemirror.tag).
-->
<reports-form>
sameh-sharaf / stream-tracker.go
Created September 12, 2016 09:21
This code is part of an API that tracks sessions and links them to their respective personal profiles. It connects to AWS DynamoDB, which stores the account-session links in an existing table.
package main
import (
"fmt"
"net/http"
"os"
"strings"
"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/service/dynamodb"