Role | Analytical Skills | Business Acumen | Data Storytelling | Soft Skills | Software Skills |
---|---|---|---|---|---|
Data Analyst | High | Medium to High | High | Medium to High | Medium |
Data Engineer | Medium | Low | Low | Medium | High |
Data Scientist | High | High | High | High | Medium |
ML Engineer | Medium to High | Medium | Low | High | High |

Instance-based Learning | Model-based Learning |
---|---|
Requires data preprocessing | Requires data preprocessing |
No explicit training required; discovers patterns when it receives new data points | Builds a mathematical model from the data to discover hidden patterns |
No model to store | Stores the trained model for future predictions |
Original data must be kept for predictions | Training data can be discarded after model training |
k-Nearest Neighbors (kNN), Locally Weighted Regression | Linear Regression, Logistic Regression, Decision Trees, Neural Networks |
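
The contrast in the table above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the data values and the function names `knn_predict` and `fit_line` are assumptions made up for the example. The kNN function needs the stored data at prediction time, while the line fit keeps only two parameters.

```python
# Toy 1-D data (hypothetical values, for illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]


def knn_predict(x_new, xs, ys, k=2):
    """Instance-based: no training step; the stored data is searched per query."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x_new))[:k]
    return sum(y for _, y in nearest) / k


def fit_line(xs, ys):
    """Model-based: summarize the data as a slope and an intercept (least squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Instance-based prediction: xs and ys must still be available here.
knn_estimate = knn_predict(3.4, xs, ys)

# Model-based prediction: after fitting, only slope and intercept are needed.
slope, intercept = fit_line(xs, ys)
line_estimate = slope * 3.4 + intercept
```

After `fit_line` returns, `xs` and `ys` could be deleted and `line_estimate` could still be computed; `knn_predict` has no such option.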

Customer ID | Spending Behavior (Spend, Discount Sensitivity) | Shopping Frequency | Brand Preference | Cluster (Segment) |
---|---|---|---|---|
C101 | Low spend, high discount sensitivity | Rare | None | Budget Shoppers |
C205 | High spend, low discount sensitivity | Frequent | Yes | Brand Loyal |
C309 | Medium spend, medium discount sensitivity | Frequent | No | Frequent Buyers |

Customer ID | Avg. Monthly Spend | Shopping Frequency | Preferred Brands | Discount Sensitivity | Cluster (Segment) |
---|---|---|---|---|---|
C101 | Low | Rare | None | High | Budget Shoppers |
C205 | High | Frequent | Yes | Low | Brand Loyal |
C309 | Medium | Frequent | No | Medium | Frequent Buyers |

Customer ID | Avg. Monthly Spend | Shopping Frequency | Preferred Brands | Cluster (Segment) |
---|---|---|---|---|
C101 | Low | Rare | No | Budget Shoppers |
C205 | High | Frequent | Yes | Brand Loyal Customers |
C309 | Medium | Frequent | No | Frequent Buyers |
C412 | Low | Moderate | No | Budget Shoppers |
C523 | High | Rare | Yes | Brand Loyal Customers |
C634 | Medium | Frequent | Yes | Frequent Buyers |
C745 | Low | Frequent | No | Budget Shoppers |
C856 | High | Frequent | Yes | Brand Loyal Customers |
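
To show how segments like these could be derived without labels, here is a small k-means sketch over the eight customers in the table above. The numeric encoding, the choice of three clusters, and the starting centroids are all assumptions made for illustration. The groups it finds depend on those choices and need not match the segment names in the table.

```python
# Rows from the table, encoded as numbers. The encoding is an assumption:
# spend Low/Medium/High -> 0/1/2, frequency Rare/Moderate/Frequent -> 0/1/2,
# preferred brands No/Yes -> 0/1.
customers = {
    "C101": (0, 0, 0),
    "C205": (2, 2, 1),
    "C309": (1, 2, 0),
    "C412": (0, 1, 0),
    "C523": (2, 0, 1),
    "C634": (1, 2, 1),
    "C745": (0, 2, 0),
    "C856": (2, 2, 1),
}


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centroids, max_iter=100):
    """Plain k-means: alternate assignment and centroid update until stable."""
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment


ids = list(customers)
points = [customers[i] for i in ids]
# Starting centroids: the first three customers (an arbitrary, deterministic choice).
labels = k_means(points, centroids=list(points[:3]))
segments = dict(zip(ids, labels))
```

The cluster numbers in `segments` are arbitrary identifiers. Names such as "Budget Shoppers" are assigned afterwards by inspecting what the members of each cluster have in common.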

Email ID | Contains "Free" | Sender Reputation | Has Attachments | Predicted Class |
---|---|---|---|---|
1 | Yes | Low | No | Spam |
2 | No | High | No | Not Spam |
3 | Yes | Medium | Yes | Spam |
4 | No | High | Yes | Not Spam |
5 | Yes | Low | Yes | Spam |
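
As a rough illustration of classification on the rows above, the sketch below learns a single-feature rule: for each feature, it maps every observed value to the most common class and then scores how often that rule agrees with the table. The field names and helper functions are assumptions for the example, and a one-feature rule is only a teaching device, not a realistic spam filter.

```python
from collections import Counter, defaultdict

# The five emails from the table; the dictionary keys are hypothetical names.
emails = [
    {"contains_free": "Yes", "sender_reputation": "Low", "has_attachments": "No", "label": "Spam"},
    {"contains_free": "No", "sender_reputation": "High", "has_attachments": "No", "label": "Not Spam"},
    {"contains_free": "Yes", "sender_reputation": "Medium", "has_attachments": "Yes", "label": "Spam"},
    {"contains_free": "No", "sender_reputation": "High", "has_attachments": "Yes", "label": "Not Spam"},
    {"contains_free": "Yes", "sender_reputation": "Low", "has_attachments": "Yes", "label": "Spam"},
]
features = ["contains_free", "sender_reputation", "has_attachments"]


def fit_one_rule(rows, feature):
    """Map each value of `feature` to the most common label among matching rows."""
    by_value = defaultdict(Counter)
    for row in rows:
        by_value[row[feature]][row["label"]] += 1
    return {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}


def accuracy(rows, feature, rule):
    """Fraction of rows whose label matches the rule's prediction."""
    hits = sum(rule[row[feature]] == row["label"] for row in rows)
    return hits / len(rows)


rules = {f: fit_one_rule(emails, f) for f in features}
scores = {f: accuracy(emails, f, rules[f]) for f in features}
best_feature = max(features, key=scores.get)  # ties go to the earlier feature
```

On these five rows, both `contains_free` and `sender_reputation` separate the classes perfectly, while `has_attachments` does not. With so few examples, that says little about how either rule would behave on new emails.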

Plant | Sunlight (hours/day) | Water (liters/day) | Growth (cm/week) |
---|---|---|---|
1 | 4 | 1 | 5 |
2 | 6 | 1.5 | 8 |
3 | 5 | 1.2 | 6 |
4 | 7 | 2 | 10 |
5 | 3 | 0.8 | 4 |

Data Type | Examples |
---|---|
Numeric (Continuous) | Height, temperature, stock price |
Numeric (Discrete) | Number of children, count of clicks |
Categorical | Country names, product category, car brand |
Ordinal | T-shirt sizes (S, M, L, XL), survey ratings (1–5 stars) |
Binary | Gender (M/F), customer churn (yes/no) |
Text | Emails, product reviews, news articles |
Image | Photographs, X-rays, handwritten digits |
Audio | Speech, music, environmental sounds |
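
The structured types in the table above are usually converted to numbers before modeling, and the conversion depends on the type. The sketch below shows one common choice for each. The record, the category list, and the helper names are hypothetical, and text, image, and audio data need their own feature-extraction steps that are not shown.

```python
SIZE_ORDER = ["S", "M", "L", "XL"]  # ordinal: the order carries meaning


def encode_ordinal(value, order):
    """Ordinal -> integer rank, so that S < M < L < XL is preserved."""
    return order.index(value)


def encode_one_hot(value, categories):
    """Categorical -> one indicator per category; no order is implied."""
    return [1 if value == c else 0 for c in categories]


def encode_binary(value, positive="yes"):
    """Binary -> 0 or 1."""
    return 1 if value == positive else 0


# A hypothetical record mixing several of the data types from the table.
record = {"height_cm": 172.5, "clicks": 14, "category": "books", "size": "L", "churn": "yes"}
categories = ["books", "electronics", "toys"]

features = (
    [record["height_cm"], record["clicks"]]  # numeric values pass through unchanged
    + encode_one_hot(record["category"], categories)
    + [encode_ordinal(record["size"], SIZE_ORDER), encode_binary(record["churn"])]
)
```

Using integer ranks for a purely categorical field such as country would imply an order that does not exist, which is why that type gets the one-hot treatment instead.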