Hey friends!
Welcome to my Google Summer of Code Final Evaluation Gist! 🔆
I have been working with Activeloopai under The Python Software Foundation on their product Hub. My project: Auto generation of Schema (hub auto).
In brief, my task was to work on a method to automatically ingest datasets into the Hub platform. After 10 weeks of an epic coding journey, my project is complete.
I have worked on 2 major APIs:
- ingest(): Automatically ingest datasets to Hub platform API
- ingest_kaggle(): Download and ingest datasets to Hub platform API
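To give a feel for what "automatic ingestion" involves, here is a minimal sketch of one possible first step: turning an image-classification folder into (file, label) samples. It is not Hub's implementation and does not call the Hub API. The function name `infer_classification_samples` and the "one subfolder per class" layout are assumptions made purely for illustration.

```python
from pathlib import Path
from typing import List, Tuple


def infer_classification_samples(root: str) -> Tuple[List[str], List[Tuple[str, int]]]:
    """Return (class_names, [(relative_file_path, label_index), ...]).

    Assumption: every immediate subdirectory of `root` is one class,
    and every file below it is a sample of that class.
    """
    root_path = Path(root)
    # Class names come from the folder names, sorted so labels are stable.
    class_names = sorted(p.name for p in root_path.iterdir() if p.is_dir())
    label_of = {name: index for index, name in enumerate(class_names)}

    samples = []
    for name in class_names:
        for file_path in sorted((root_path / name).rglob("*")):
            if file_path.is_file():
                relative = file_path.relative_to(root_path).as_posix()
                samples.append((relative, label_of[name]))
    return class_names, samples
```

Calling `infer_classification_samples("my_dataset")` on a folder with `cats/` and `dogs/` subfolders would give the class names plus one (path, label) pair per file, which is roughly the kind of structure an ingestion step needs before writing anything to a dataset.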
Additionally, I have worked on some features that enhance the Hub auto experience:
- Auto Compression: Automatically figure out the compression type of a dataset before ingestion
- Ingestion Summary: Prints a tree-like ingestion summary after ingestion
(All the features + unit tests written by me were reviewed by my mentor and the team before they were merged)
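The two features above can be illustrated with a small, self-contained sketch. This is not the code that was merged into Hub: the helper names `most_common_extension` and `summary_tree` are hypothetical, and guessing the compression from the most frequent file extension is an assumption about how such detection could work.

```python
from collections import Counter
from pathlib import Path
from typing import Optional


def most_common_extension(root: str) -> Optional[str]:
    """Guess a dataset's compression from its most frequent file extension."""
    counts = Counter(
        p.suffix.lower().lstrip(".")
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix
    )
    if not counts:
        return None
    # Highest count wins; ties are broken alphabetically so the result is stable.
    return min(counts, key=lambda ext: (-counts[ext], ext))


def summary_tree(root: str) -> str:
    """Build a tree-like summary of directories with their direct file counts."""
    root_path = Path(root)
    lines = [f"{root_path.name}/"]

    def walk(directory: Path, prefix: str) -> None:
        subdirs = sorted(p for p in directory.iterdir() if p.is_dir())
        for index, sub in enumerate(subdirs):
            last = index == len(subdirs) - 1
            n_files = sum(1 for f in sub.iterdir() if f.is_file())
            connector = "└── " if last else "├── "
            lines.append(f"{prefix}{connector}{sub.name}/ (files: {n_files})")
            walk(sub, prefix + ("    " if last else "│   "))

    walk(root_path, "")
    return "\n".join(lines)
```

For a folder `data/` holding two `.jpg` files under `cats/` and one `.png` under `dogs/`, `most_common_extension` returns `"jpg"` and `print(summary_tree("data"))` lists both class folders with their file counts.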
Here I have listed all the contributions I have made in the past 10 weeks of working with @activeloopai and @ThePSF.
List of PRs [Merged]: 🚀
- storage benchmark tests for cache read/write
- Integrate hub auto + kaggle
- kaggle argument fix
- add --kaggle functionality
- Auto compression
- Ingestion summary
In Review [as of 21st August, 2021]: 👨🏻💻
Thought I'd mention some of my PRs that didn't make it to the main branch: 🧐
- Removing pickle
- Depickle and convert index_map to list
- depickle, add index_map, fix storage benchmarks
- depickle, add index_map, fix storage benchmarks
- Feature/2.0/auto detect compression
I have also been consistent in documenting my journey on The Python Software Foundation's blogging platform. You can read all my blogs here
I have been tweeting pretty consistently too, reach out to me on Twitter
I would like to extend my gratitude to my mentor Dyllan McCreary for showing me the light and being an amazing mentor,
and last but not least, thanks to the Google Summer of Code team for an experience of a lifetime!