@hgamit
Last active August 4, 2020 04:24
HG_Resume

Himanshu Gamit

2171 Grand Ave, Saint Paul, MN, USA
hmnshu@gmail.com, +1(651)-214-4908


Education

University of St. Thomas St. Paul, MN.
Jan. 2018 - Expected Dec. 2020
Master of Science, Data Science; GPA: 3.92

Statistics; Machine Learning; Artificial Intelligence; Big Data; Analytics; Visualization; Storytelling.

University of St. Thomas St. Paul, MN.
Jan. 2019 – May 2020
Graduate Certificate in Big Data and Artificial Intelligence; GPA: 3.92

Database Management Systems and Design; Data Analytics and Visualization; Big Data Engineering; Big Data Management; Statistics; Machine Learning; Artificial Intelligence; Big Data; Analytics; Visualization; Storytelling.

Sardar Vallabhbhai National Institute of Technology, Surat, India.
Aug. 2007 - Jul. 2011
Bachelor of Technology in Computer Science; GPA: 3.3

Coding Foundations; Data Structures, Computer Architecture; Comparison of Learning Algorithms; Computational Theory; Operating Systems; Data mining; Databases; Programming Languages; Engineering Entrepreneurship; Calculus III.

Summer Fellow at IISc Bangalore: Implemented a wireless security environment with the Java GloMoSim library and redesigned the security structure to optimize a WPA-compatible network. (Java)

Experience

Cytilife Inc., St. Paul, MN, USA
Data Scientist, Oct 2019 – Present

  • Analytical Pipeline (Amazon Web Services Stack): Built robust analytical infrastructure connecting structured and unstructured data sources (RDS, clickstream, and logs) to AWS Redshift and AWS EMR for ease of use by analysts working with Pandas, AWS Athena, and QuickSight. (Amazon Kinesis Data Firehose, AWS S3, AWS Lambda, Glue, DMS, Athena)
  • Cloud Computing: Built the AWS infrastructure for real-time data analysis, visualizing and forecasting resource utilization (100K to millions of rows) from processed data on AWS S3. This helps the administration draw business insights from campus resource usage in IoT and application data, determine demand for space and equipment, and make strategic budgeting and resource decisions. (Python, SQL, Visualization)
  • Data Warehousing: Performed grain evaluation, identifying the granularity of each table and business process to drill down to the most important dimensions and measures. Brainstormed various scenarios and questions to improve the intelligent system. (SQL, MySQL, Visualization)
  • ML Insights and Anomaly Detection: Python, QuickSight Visualization, Tableau.
  • Computer Vision: Building a real-time people-counting application dashboard for the university administration. (OpenCV, Python, AWS)
  • Usage Forecasting: Designed an end-to-end forecasting app with a machine learning pipeline and hosted the model behind a REST API to serve predictions. Deployed the machine learning model as a REST API using AWS services such as ECR, SageMaker, and Lambda for mobile app development. (PyTorch, Python, MySQL, Django, AWS)
  • Coordinated continuously with the QA, production support, and deployment teams.
  • Documented all created tables to ensure all transactions are drafted properly.

University of St. Thomas, St. Paul, MN, USA
Research and Development Assistant, May 2018 - Present

ML/AI Projects

  • Research Assistant, Machine Learning: Unsupervised Contextual Clustering of Abstracts. Used NSF abstract data (300K to millions of rows) covering the last 34 years to produce document context through a Gensim Doc2Vec model, which suggests similar abstracts for a given abstract. (Deep Learning, ML, NSF, NLP, Python, Visualization, SAS) The submitted paper was selected for SAS Global Forum 2020, and my team ranked 1st in the nationwide SAS competition. Working on BERT/GPT models to perform transfer learning and document embeddings.
  • Paper Published - Unsupervised Contextual Clustering of Abstracts, Youtube
  • AWS: Implemented the machine learning model as a REST API using Docker (with prerequisite libraries) and AWS services such as ECR, SageMaker, and Lambda for the AWS Marketplace Competition. AWSMktPlaceApp
  • Computer Vision: Semantic Image Segmentation, Fall 2019: Developed a computer vision project for the Artificial Intelligence class, meant to build familiarity with DL and ML methods by applying them to a semantic image segmentation problem. (Deep Learning, PyTorch, Python, OpenCV, GPU Programming) #WorkDeck
  • Prices Forecasting: MinneMUDAC/Fastcon 2019, runner-up in competition predictions. Used ideas from statistics, ML, deep learning, and the commodity market to produce a multivariate LSTM prediction model. (Deep Learning, Python, Keras)
  • Classification: AIM Consulting Hackathon, won the competition. Designed and developed a machine learning classifier on a highly imbalanced dataset. (Scikit Learn, RandomForest, XGBoost, Python, SNS, Pandas, Machine Learning)
  • Classification: ML project on Kickstarter data to predict whether a crowdfunded project would be successful, cancelled, or unsuccessful. (Scikit Learn, RandomForest, XGBoost, LightGBM, Python, SNS, Pandas, Machine Learning, etc.)
  • Kaggle.com and Hackerrank.com (Some of my Recent Projects Listed Below) – Since 2015:
    • Jigsaw Unintended Bias in Toxicity Classification (NLP): Created a model that classifies text comments by toxicity while accounting for gender bias, using BERT and LSTM - Top 8% (225/3165 competitors) (NLP, Python, PyTorch, Deep Learning)
    • Built Image Classifier: Fastai, PyTorch, Machine Learning, Deep Learning, OpenCV - Top 11% (298/2943 competitors) (Deep Learning)

Big Data Projects

  • Big Data Architecture: Hands-on experience with Apache Hadoop ecosystem components such as MapReduce, HDFS, HBase, Hive, Sqoop, Pig, Oozie, Zookeeper, Flume, and Spark, along with Python and EC2 cloud computing on AWS. Built an architecture from scratch to store a Twitter dataset in Hadoop and then analyze it using Spark/PySpark. (Hadoop, Spark, NiFi, Oozie)
  • Used Sqoop to import data from RDBMS systems such as MySQL and Oracle and load it into HDFS. Developed MapReduce programs to clean and aggregate the data.
  • Developed workflows in Oozie to manage and schedule jobs on the Hadoop cluster, triggering daily, weekly, and monthly batch cycles. Extensive knowledge of working with NiFi.

Software Development Projects

  • Software Development: UST MicroGrid Monitoring and Controller – Software Development Project Work. Identifying requirements and developing MicroGrid Devices Controlling System. Angular CLI, ReactJS, Java Spring, MySQL, Linux, Ruby, InfluxDB. #ResearchWork
  • Responsible for understanding the scope of project and requirement gathering.
  • Used Tomcat web server for development purpose.
  • Involved in creation of Test Cases for Unit Testing.
  • Developed the application in Eclipse and used Maven to build and deploy it.
  • Used Log4j to log debug, warning, and info messages to the server console.
  • Django Web Application: Created a department-wide, web-based software request system for the Graduate Program in Software. (Python, Django, MySQL)
  • Audio Tagging Application: Implemented Audio Tagging application using various Natural Language Processing techniques and Django Web Framework. (BERT, Google Speech to Text, Google Content Classifier, Python, Django, MySQL, GPUs).

Amphora Inc., Hyderabad, India
Build Analyst, Feb 2012 - May 2017

  • Worked as a backend automation engineer and applications developer.
  • Development responsibilities consisted of gathering requirements, estimating change requests, designing and developing code, scripting database stored procedures and functions, and participating in client meetings. Primary technologies involved Java, .NET, Web Services, SQL, Jira, Git, DevOps, and CI/CD. Worked with oil trading clients such as Mercuria Energy, Noble Group, Arcadia, and Eni.
  • Remote Application Services Management Tool: Improved multi-server application service control by building a new application for the software product that allows easy handling of multiple remote server environments, including DB script automation. (C#, MS SQL)
  • Notifications: Developed a service for sending email, push, and in-app notifications. Involved in features such as delivery time optimization, tracking, queuing, and A/B testing. Built an internal app to run batch processing for software delivery. (Java, MS SQL)
  • Trade Aggregator: Simplified the bulk data processing and injection service from global exchanges to CTRM, which provides preprocessed data for application users. (Unix/Linux, Java, MS SQL)
  • Workflows: Outlined and improved Apache Ant workflow to create and manage build and testing pipelines leveraging automation to expedite development productivity. (JavaScript)
  • Technology Migration: Led the team and implemented the GitHub/SCRUM migration process for the development team, which helped developers smoothly transition the code repository from TFS to Git. (Git, Subversion, TFS)

Technology Stack

  • ML/AI: OpenCV, Machine Learning, Deep Learning (Computer Vision, LSTM, Transformers, BERT/GPT, Language Models, RNN, Natural Language Processing, Audio/Video), Python (scikit-learn, NumPy, SciPy, Pandas, TensorFlow, Fastai, PyTorch, Keras, Seaborn, Plotly, Django)
  • Visualization: Tableau, AWS QuickSight, Matplotlib, Plotly, Seaborn
  • Big Data: Hadoop, Big Data, HDFS, MapReduce, Hive, Sqoop, Pig, HBase, Flume, Zookeeper, Oozie, Impala, Kafka, Spark, Nifi, Druid, Parquet, ORC, Avro
  • Databases: MS SQL Server, MySQL, Oracle, HBase, Neo4j, PostgreSQL
  • Languages: R, SQL, PL/SQL, HTML, Java, J2EE, JSP, Servlets, Hibernate, JDBC, UNIX Shell Scripting, Python, Java Spring, JavaScript (jQuery, Angular, Chart, React, Bootstrap), HTML5, CSS3, SAS.
  • Tools: Eclipse, NetBeans, IntelliJ, Maven, Anthill, MySQL Workbench
  • DevOps and Version Control: GitHub, Git, SVN, Jenkins, Make, Ant, TFS.
  • Operating Systems: Windows Server 2008/2012, UNIX, LINUX
  • AWS Services: CloudFormation, S3, Kinesis, Redshift, Tableau, EMR (Spark/Hadoop), SageMaker, Athena, QuickSight, AWS Big Data, Database Migration Service, Glue, Lambda, EC2, ELB, RDS, CloudWatch, SNS, SQS, EBS.
  • Other Tools: Azure, GPUs, Google Cloud, Docker, REST API, Nginx