This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# Purpose: Generate coffee shop sales data
# Author: Gary A. Stafford and GitHub Copilot
# Date: 2023-04-12
# Usage: python3 coffee_shop_data_gen.py 100
# Command-line argument(s): rec_count (number of records to generate as an integer)
# Write a program that creates synthetic sales data for a coffee shop.
# The program should accept a command line argument that specifies the number of records to generate.
# The program should write the sales data to a file called 'coffee_shop_sales_data.csv'.
# The program should contain the following functions:
"transaction_id","date","time","product_id","product","calories","price","type","quantity","amount","payment_type"
"47a157f84e727fe3335db1519ee736a6","06/27/2022","19:59:42",22,"Quiche",456,4.99,"Food",2,9.98,"Debit"
"1bf01013e699ca0f804650ea50826c82","11/20/2022","06:21:14",22,"Quiche",456,4.99,"Food",3,14.97,"Cash"
"84f41c15749090d1e79bf9a48a58d6c3","08/18/2022","11:50:22",14,"Chai Tea",200,3.5,"Drink",2,7.0,"Apple Pay"
"ef1845b8438bf3b5b99d2f4891a48f03","11/13/2022","17:20:51",12,"Lemonade",120,3.0,"Drink",2,6.0,"Debit"
"9863de11be3099d6361392584e30e624","06/03/2022","18:27:03",18,"Muffin",426,3.99,"Food",2,7.98,"Gift card"
"f50ed8878250bc06f66b97f5cd2f6df7","02/21/2022","17:02:18",7,"Hot Chocolate",300,3.5,"Drink",2,7.0,"Credit"
"1903169473f41a0275ee702f2c6b1dd6","05/24/2022","14:58:25",10,"Smoothie",200,4.0,"Drink",3,12.0,"Venmo"
"164a9519fd3db952e721e9f55dc1be74","01/07/2022","14:19:35",14,"Chai Tea",200,3.5,"Drink",2,7.0,"Debit"
"dc85a202143de48ad4646190cdc0bf5c","01/28/2022","08:52:38",20,"Wrap",388,5 |
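As a rough illustration of how rows shaped like the sample above could be produced, here is a minimal sketch. It is not the author's generator: the function names `make_sales_record` and `write_sales_file`, the three-item catalog, the 2022 date range, the 1 to 3 quantity range, and the MD5-based ID are all assumptions made for the example. The column names and the quoting style (strings quoted, numbers bare) are taken from the sample.

```python
import csv
import hashlib
import random
from datetime import datetime, timedelta

# Hypothetical catalog; values copied from a few of the sample rows above.
PRODUCTS = [
    (22, "Quiche", 456, 4.99, "Food"),
    (14, "Chai Tea", 200, 3.50, "Drink"),
    (12, "Lemonade", 120, 3.00, "Drink"),
]
# Payment types that appear in the sample rows.
PAYMENT_TYPES = ["Cash", "Credit", "Debit", "Gift card", "Apple Pay", "Venmo"]
# Column names taken from the sample header row.
HEADER = ["transaction_id", "date", "time", "product_id", "product", "calories",
          "price", "type", "quantity", "amount", "payment_type"]


def make_sales_record(rng):
    """Build one sales row as a list, in HEADER order."""
    product_id, product, calories, price, kind = rng.choice(PRODUCTS)
    quantity = rng.randint(1, 3)  # assumed range
    # Assumed: a random moment somewhere in calendar year 2022.
    moment = datetime(2022, 1, 1) + timedelta(seconds=rng.randrange(365 * 24 * 3600))
    # Assumed: a 32-character hex ID derived from a random number via MD5.
    transaction_id = hashlib.md5(str(rng.random()).encode()).hexdigest()
    return [transaction_id, moment.strftime("%m/%d/%Y"), moment.strftime("%H:%M:%S"),
            product_id, product, calories, price, kind, quantity,
            round(price * quantity, 2), rng.choice(PAYMENT_TYPES)]


def write_sales_file(path, rec_count, seed=None):
    """Write a header row plus rec_count generated rows to path."""
    rng = random.Random(seed)
    with open(path, "w", newline="") as f:
        # QUOTE_NONNUMERIC quotes strings and leaves numbers unquoted.
        writer = csv.writer(f, quoting=csv.QUOTE_NONNUMERIC)
        writer.writerow(HEADER)
        for _ in range(rec_count):
            writer.writerow(make_sales_record(rng))
```

Calling `write_sales_file("coffee_shop_sales_data.csv", 100)` would write 100 rows; passing a `seed` makes the output repeatable, which helps when testing.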
# Purpose: Generate demographic data
# Author: Gary A. Stafford and GitHub Copilot
# Date: 2023-04-14
# Usage: python3 demographic_data_gen.py 100
# Command-line argument(s): rec_count (number of records to generate as an integer)
# Write an application that creates a file containing demographic data.
# The application should accept a command line argument that specifies the number of records to generate.
# The application should write the demographic data to a file called 'demographic_data.csv'.
# The application should contain the following functions:
id | address | city | state | zip | country | property_type | assessed_value
---|---|---|---|---|---|---|---
1 | 1008 Walk Burg | Houston | TX | 77002 | United States | Multi-family | 1122321
2 | 7088 Second Square | Oklahoma City | OK | 73102 | United States | Single-family | 261940
3 | 1425 Ridge Terrace | Indianapolis | IN | 46204 | United States | Single-family | 1030391
4 | 982 Way Lane | New York | NY | 10007 | United States | Multi-family | 95499
5 | 9404 Port Court | Columbus | OH | 43215 | United States | Single-family | 922404
6 | 7135 Crossing Trail | Virginia Beach | VA | 23451 | United States | Single-family | 272910
7 | 9481 Harbor Brook | New York | NY | 10007 | United States | Multi-family | 232795
8 | 8585 Manor Branch | Raleigh | NC | 27601 | United States | Single-family | 701217
9 | 7703 Bluff Boulevard | Las Vegas | NV | 89101 | United States | Single-family | 530581
user_id | first_name | last_name | dob | gender | martital_status | race | religion
---|---|---|---|---|---|---|---
1 | Thomas | Powell | 1967-06-10 | Male | Married | Black | Christian
2 | Ward | Williams | 1973-07-22 | Male | Single | Asian | Christian
3 | Martha | Watson | 1975-02-28 | Feamle | Single | Hispanic | Agnostic
4 | Brenda | Bailey | 1979-07-07 | Feamle | Married | Black | Christian
5 | Parker | Johnson | 1955-07-14 | Male | Married | White | Christian
6 | Rebecca | Wilson | 1972-05-27 | Feamle | Married | White | Christian
7 | Doris | Allen | 1956-07-09 | Feamle | Married | Multiracial | Christian
8 | Rebecca | Sanchez | 1965-09-16 | Feamle | Single | White | Christian
9 | Mary | Johnson | 1971-04-04 | Feamle | Single | White | Christian
# Purpose: Generate coffee shop sales data
# Author: Gary A. Stafford and GitHub Copilot
# Date: 2023-04-12
# Usage: python3 coffee_shop_data_gen_final.py 100
# Command-line argument(s): rec_count (number of records to generate as an integer)
import csv
import random
from datetime import datetime, timedelta
import argparse
# Purpose: Test coffee shop sales data generator
# Author: Gary A. Stafford and GitHub Copilot
# Date: 2023-04-13
# Usage: pytest coffee_shop_data_gen_tests.py -v
# write a python class that inherits from unittest.TestCase
# write a unit test for the get_product function
# write a unit test for the get_sales_record function
# write a unit test for the write_sales_records function
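The first two prompts above might lead to something like the following sketch. The real `get_product` is not shown in this preview, so the version here is a hypothetical stand-in that returns a random dictionary from a small catalog; its return shape and the `PRODUCTS` list are assumptions made only so the test class has something to exercise.

```python
import random
import unittest

# Hypothetical stand-in catalog; the real generator's data is not shown here.
PRODUCTS = [
    {"product_id": 22, "product": "Quiche", "price": 4.99, "type": "Food"},
    {"product_id": 14, "product": "Chai Tea", "price": 3.50, "type": "Drink"},
]


def get_product():
    """Hypothetical stand-in: return one random product from the catalog."""
    return random.choice(PRODUCTS)


class TestGetProduct(unittest.TestCase):
    def test_returns_known_product(self):
        # The returned product should be one of the catalog entries.
        self.assertIn(get_product(), PRODUCTS)

    def test_has_expected_fields(self):
        # Each product should carry the expected keys and a positive price.
        product = get_product()
        self.assertEqual(set(product), {"product_id", "product", "price", "type"})
        self.assertGreater(product["price"], 0)
```

Because the class inherits from `unittest.TestCase`, pytest discovers it without extra configuration, which fits the `pytest ... -v` usage line in the header.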
"""
Purpose: Creates an Amazon DynamoDB table, adds an item to the table,
    gets that item from the table, and finally deletes the table
Author(s): Gary A. Stafford and GitHub Copilot
Created: 2023-03-26
Usage: python3 github_copilot_test.py table_name
    pytest github_copilot_test.py -v
"""
import boto3
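The four steps named in the docstring (create, put, get, delete) could be sketched roughly as below. This is not the author's code: the helper name `run_table_round_trip`, the `id` key attribute, the sample item, and on-demand billing are assumptions chosen to keep the example short. The helper takes the DynamoDB resource as a parameter so it can be exercised with a fake object in tests, and `boto3` is imported only inside `main`.

```python
def run_table_round_trip(dynamodb, table_name):
    """Create a table, put one item, read it back, then delete the table."""
    table = dynamodb.create_table(
        TableName=table_name,
        # Assumed schema: a single string partition key named "id".
        KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
        AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
        BillingMode="PAY_PER_REQUEST",  # assumed; avoids provisioned throughput settings
    )
    table.wait_until_exists()  # block until the table is ready for use
    table.put_item(Item={"id": "1", "note": "hello"})
    item = table.get_item(Key={"id": "1"}).get("Item")
    table.delete()
    return item


def main(table_name):
    # Imported here so the helper above can be tested without AWS access.
    import boto3

    print(run_table_round_trip(boto3.resource("dynamodb"), table_name))
```

Running `main("my_table")` would need valid AWS credentials and a configured region; the table name is only a placeholder.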
#!/bin/sh
# size of swapfile in megabytes
swapsize=512
# does the swap file already exist?
grep -q "swapfile" /etc/fstab
# if not then create it
if [ $? -ne 0 ]; then
DATA_LAKE_BUCKET="<your_data_lake_s3_bucket>"
TARGET_TABLE="tickit.ecomm.sale"
spark-submit \
--name ${TARGET_TABLE} \
--jars /usr/lib/spark/jars/spark-avro.jar,/usr/lib/hudi/hudi-utilities-bundle.jar \
--conf spark.sql.catalogImplementation=hive \
--conf spark.yarn.submit.waitAppCompletion=false \
--class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls /usr/lib/hudi/hudi-utilities-bundle.jar` \
--props file://${PWD}/${TARGET_TABLE}.properties \