@sandrabosk
Last active September 30, 2020 14:25

Data files cleanup

Overall

  • all .docx files should be removed or moved elsewhere, since they have been replaced with .md files - Sandra
  • how will we distribute to students the files they need for in-class activities? (to be discussed with everyone on Monday, 28/09)

Unit 1

lesson 1.1

  • reference slides: do we need all the slides prefixed with [old]? SOFI

lesson 1.2

  • where do we use file1.xlsx, file2.xlsx, file3.xlsx, file4.xlsx? MANSH (I fixed this one - Sandra)
  • where do we use validation_table.txt? MANSH
  • where do we point the instructor to tasks_lesson_1.2.md (previously tasks.docx)? MANSH (I fixed this one - Sandra)
  • do we need: vlookup_table.csv? MANSH
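
One possible way to answer the "where do we use ...?" questions in this list is to search the lesson files for the file name. The sketch below is only an illustration: the function name `find_references`, the choice of `.md` and `.ipynb` as the file types to scan, and the root folder are assumptions, not part of the repository. Notebooks are scanned as raw text, so line numbers for `.ipynb` files refer to lines of the underlying JSON.

```python
from pathlib import Path


def find_references(root, filename, suffixes=(".md", ".ipynb")):
    """Return (path, line_number) pairs where `filename` appears as plain text.

    Hypothetical helper: scans every file under `root` whose extension is in
    `suffixes` and reports each line that mentions `filename`.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for lineno, line in enumerate(text.splitlines(), start=1):
                if filename in line:
                    hits.append((path, lineno))
    return hits
```

For example, `find_references(".", "validation_table.txt")` run from the repository root would list every lesson or notebook line that mentions that file. An empty result only suggests the file is unused; a path assembled dynamically in code would not be found this way.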

lesson 1.3

  • which files are used in the lesson? there are two copies of file1.csv (one inside files_for_labs and one outside that folder) - do we need both? MANSH
  • do we need: df_final_web_data_pt_1.csv? MANSH
  • do we need: merger_clean_ver2.csv? MANSH

lesson 1.4

  • do we need: merger_clean_ver2.csv? MANSH
  • duplicate in ipynb? MANSH

lesson 1.5

  • do we need: excel_regression_data1_copy.xls? MANSH
  • do we need: regression_data1.csv? (regression_data1.csv is actually used in lesson 1.6, and lesson 1.5 has a file named regression_data___.csv) MANSH
  • duplicate in ipynb? MANSH

lesson 1.6

  • are the slides the same - do we need both? SOFI

lesson 1.7

  • is the slightly odd file name regression_data___.csv a blocker? I didn’t change it, since that would require renaming it in the Jupyter notebook as well MANSH
  • duplicate in ipynb? MANSH

lesson 1.8

  • do we need: raw_data_regression.csv? MANSH

lesson 1.9

  • do we need: mysql_dump.sql? MANSH

  • what is the data_analysis_notes.rtfd folder, and what are the files inside it? where do we use it? MANSH

  • inside the labs folder there is a data folder with a marketing_customer_analysis.csv file. do we need this file? is it the same file that is already inside other folders? do we need multiple copies of the same file? MANSH
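
To check whether files such as marketing_customer_analysis.csv really are identical copies, one option is to group files by a hash of their contents. This is a minimal sketch under assumptions: the function name `find_duplicates`, the default `*.csv` pattern, and scanning from the current folder are illustrative choices, not something the repository defines.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root, pattern="*.csv"):
    """Group files under `root` matching `pattern` by the SHA-256 of their bytes.

    Hypothetical helper: returns a list of groups, where each group is a list
    of paths whose contents are byte-for-byte identical.
    """
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    # Keep only the hashes shared by more than one file
    return [paths for paths in by_hash.values() if len(paths) > 1]


if __name__ == "__main__":
    # Assumes the script is run from the repository root
    for group in find_duplicates("."):
        print("identical content:")
        for p in group:
            print("   ", p)
```

Files with the same name but different contents (for example, two different versions of file1.csv) will not be grouped, which is useful in itself: it shows which copies cannot simply be deleted.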

Unit 2

  • shall we explain to instructors what additional_practice_sql_select_aggregations.md is? MANSH SOFI

Unit 3

lesson 3.7

  • where do we use 3.7.4_multi_class_classification_models_2.pptx? the lesson says to refer to the intro to SQL slides. MANSH [comment added to the lesson]

Unit 4

lesson 4.1

  • where do we use: Unit4_case_study.docx? MANSH [comment added to the lesson]

lesson 4.5

  • do we need both unit4.csv and lesson_4.05_data.csv? it is confusing that the notebook uses unit4.csv while the code example in lesson.md uses the other file. MANSH [comment added to the lesson]

lesson 4.7

  • all files have the .csv extension, but the example seems to need .txt? MANSH

lesson 4.9

  • do we use: finalMergedFile.csv? MANSH
  • do we use: 4.9.4_pm_and_agile.pptx? MANSH SOFI