Last active
August 28, 2019 00:45
# LibreRouter Testing and Coverage Mapping
Pros: Very useful, great portfolio piece, front-end/DB/REST practice. Light C++/shell development. Distributed development.
Cons: Hardware dependency. Very hard to accurately test a large coverage map without deploying code to a target system.
Data: From libremap.net, collected in real time on target hardware, or simulated with QEMU.
Goal: Front-end application for displaying real-time service data fed from target hardware.
# Real-time Chat Bot
Pros: Fun hack, great portfolio piece, TensorFlow RT practice.
Cons: Expensive, no immediate value to the market or to an open source project.
Data:
Potentially: https://www.kaggle.com/rtatman/ubuntu-dialogue-corpus
1 million Ubuntu support chats, each between two people.
Goal: Ask a box in my room Ubuntu questions and have it respond. Develop tooling for training its responses at a user level.
# Facial Recognition and Combatting Sampling Bias
Pros: Very interesting topic. Datasets readily available. Potentially informative and novel research.
Cons: Doesn't fit into my current portfolio or long-term career goals (embedded data science). Not much tech practice (unless the model uses TensorFlow).
Data:
Potentially: https://www.nist.gov/srd/nist-special-database-18
National Institute of Standards and Technology (NIST) mugshot dataset.
Potentially: https://lionbridge.ai/datasets/5-million-faces-top-15-free-image-datasets-for-facial-recognition/
Potentially: http://robotics.csie.ncku.edu.tw/Databases/FaceDetect_PoseEstimate.htm
Other: Faces in the Wild
# Investigate causal relationships between technology and wealth
Pros: Finance, projection, and analytics practice.
Cons: Dataset will be unreliable, could get an uninteresting result, not much tech practice.
Data:
Probably: Census data at the tract or county level
Country Data: Did not allow for rigorous financial analysis due to currency conversion and purchasing power parity questions
US Census State Data: Resolution wasn't high enough for good controlled studies (MANOVA, for example)
Goal: Investigate with high confidence the covariance relationship between wealth and internet access rates
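A minimal sketch of what the covariance step in this goal could look like, assuming one income figure and one internet access rate per tract or county. The function names and the sample values are hypothetical placeholders for illustration, not real census figures.

```python
from math import sqrt


def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)


def pearson_correlation(xs, ys):
    """Covariance rescaled to [-1, 1] so the units (dollars, rates) drop out."""
    # Assumes neither sequence is constant; a zero variance would divide by zero.
    return sample_covariance(xs, ys) / sqrt(
        sample_covariance(xs, xs) * sample_covariance(ys, ys)
    )


# Hypothetical placeholder values, NOT real census data.
median_income = [41_000, 52_000, 38_500, 67_000, 59_000]
internet_access_rate = [0.71, 0.80, 0.65, 0.91, 0.88]

print(sample_covariance(median_income, internet_access_rate))
print(pearson_correlation(median_income, internet_access_rate))
```

Covariance or correlation on its own only shows that the two variables move together. Supporting a causal claim would still need the controlled comparisons noted above.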