- Supervised Learning:
  - Goal: To train a model that predicts outputs from labeled training data.
  - Algorithms:
    - Linear Regression: Used for predicting continuous outcomes.
    - Logistic Regression: Used for binary classification problems.
    - Decision Trees: Splits data into simple decision rules to make predictions.
    - Support Vector Machines (SVMs): Finds the best decision boundary to separate classes.
    - k-Nearest Neighbors (k-NN): Predicts based on similarity to nearby data points.
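Of the algorithms above, k-NN is simple enough to sketch in a few lines. The following is a minimal pure-Python illustration (the function name and toy dataset are my own, not from any library): classify a query point by majority vote among its k nearest training points.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbors.

    `train` is a list of (point, label) pairs; points are equal-length tuples.
    """
    # Sort training points by Euclidean distance to the query, keep the k closest.
    neighbors = sorted(train, key=lambda pl: math.dist(pl[0], query))[:k]
    # Majority vote over the neighbors' labels.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy dataset: two well-separated clusters labeled "a" and "b".
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_predict(train, (0.5, 0.5)))  # near the "a" cluster -> "a"
```

Note that k-NN does no training at all; the whole dataset is the model, which is why prediction cost grows with dataset size.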
- Unsupervised Learning:
  - Goal: To find structure or patterns in unlabeled data.
  - Algorithms:
    - Clustering: Groups similar data points into clusters.
    - Principal Component Analysis (PCA): Reduces data dimensionality while preserving key features.
    - Anomaly Detection: Identifies unusual data points that deviate significantly from the norm.
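As a concrete instance of anomaly detection, here is a minimal z-score sketch (one of the simplest approaches, assuming roughly normal data; the function name is my own): flag any value more than a chosen number of standard deviations from the mean.

```python
import statistics

def zscore_anomalies(values, threshold=3.0):
    """Return the values whose |z-score| exceeds `threshold`."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs(v - mean) / std > threshold]

# Seven readings near 10.0 plus one obvious outlier.
data = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.3, 50.0]
print(zscore_anomalies(data, threshold=2.0))  # [50.0]
```

Because the outlier itself inflates the mean and standard deviation, the threshold here is lower than the usual 3.0; robust variants use the median and MAD instead.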
- Reinforcement Learning:
  - Goal: To train an agent to make decisions in an environment to maximize rewards.
  - Algorithms:
    - Q-Learning: Updates value estimates from observed rewards, assuming the best action is taken next (off-policy).
    - SARSA (State-Action-Reward-State-Action): On-policy counterpart of Q-Learning; updates values using the action the current policy actually takes next.
    - Deep Q-Networks (DQN): Uses neural networks for value estimation in complex environments.
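The core of tabular Q-Learning is a one-line update rule. A minimal sketch on a hypothetical two-state problem (state and action names are invented for illustration):

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-Learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    """
    best_next = max(q[next_state].values())  # off-policy: assume greedy next action
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]

# Two-state toy problem; all Q-values start at zero.
q = {"s0": {"left": 0.0, "right": 0.0},
     "s1": {"left": 0.0, "right": 0.0}}
q_update(q, "s0", "right", reward=1.0, next_state="s1")
print(q["s0"]["right"])  # 0.5 * (1.0 + 0.9 * 0.0 - 0.0) = 0.5
```

SARSA would replace the `max` over next-state actions with the Q-value of the action the policy actually selects in `next_state`.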
- Deep Learning:
  - Goal: To create neural networks that learn from large amounts of data.
  - Architectures:
    - Convolutional Neural Networks (CNNs): Effective for image and speech recognition.
    - Recurrent Neural Networks (RNNs): Used for sequential data like text and time series.
    - Transformers: Attention-based architectures that now dominate language processing and machine translation.
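All of these architectures stack the same basic building block: a layer that applies a learned linear map followed by a nonlinearity. A minimal pure-Python sketch of one fully connected layer with ReLU activation (weights and inputs are made-up numbers for illustration):

```python
def dense_relu(x, weights, bias):
    """Forward pass of one fully connected layer with ReLU activation.

    x: input vector; weights[i][j]: weight from input i to output unit j.
    """
    out = []
    for j in range(len(bias)):
        # Weighted sum of inputs plus bias for output unit j.
        z = sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
        out.append(max(0.0, z))  # ReLU: clamp negatives to zero
    return out

x = [1.0, 2.0]
weights = [[0.5, -1.0],
           [1.0, 1.0]]
bias = [0.0, 1.0]
print(dense_relu(x, weights, bias))  # [2.5, 2.0]
```

Real frameworks express the same computation as a matrix multiply and train the weights by backpropagation; this sketch shows only the forward pass.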
- Evaluation Metrics:
  - Accuracy: Measures the percentage of correct predictions.
  - Precision: Measures the proportion of true positives among all predicted positives.
  - Recall: Measures the proportion of true positives among all actual positives.
  - F1-score: Combines precision and recall into a single metric (their harmonic mean).
  - Mean Squared Error (MSE): Measures the average squared difference between predicted and actual values.
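These definitions translate directly into code. A small sketch computing precision, recall, F1, and MSE from scratch (function names are my own; libraries like scikit-learn provide equivalents):

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

def mse(y_true, y_pred):
    """Mean squared error for regression outputs."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
print(precision_recall_f1(y_true, y_pred))  # tp=2, fp=1, fn=1 -> (2/3, 2/3, 2/3)
print(mse([2.0, 3.0], [2.5, 2.5]))          # (0.25 + 0.25) / 2 = 0.25
```

Precision and recall pull in opposite directions (predicting positive more often raises recall but usually lowers precision), which is why F1 is reported as a single balanced summary.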
- Bias and Variance:
  - Bias: The systematic error introduced by a model's simplifying assumptions.
  - Variance: The error introduced by a model's sensitivity to fluctuations in the training data.
  - Bias-Variance Tradeoff: Balancing bias and variance to optimize model performance.
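The tradeoff can be made concrete by imagining the same model retrained on several different samples of the data and evaluated at one test point: bias is the gap between the average prediction and the truth, variance is the spread of the predictions. A small sketch (the predictions are hypothetical numbers):

```python
import statistics

def bias_variance(predictions, truth):
    """Decompose the error of repeated model fits at a single test point.

    `predictions` holds one prediction per retrained model.
    """
    mean_pred = statistics.mean(predictions)
    bias = mean_pred - truth                      # systematic offset
    variance = statistics.pvariance(predictions)  # spread across retrainings
    return bias, variance

# Three hypothetical models trained on different samples of the data.
bias, variance = bias_variance([2.0, 2.5, 3.0], truth=2.0)
print(bias, variance)  # 0.5, ~0.1667
```

A more flexible model typically shrinks the bias term while inflating the variance term; the tradeoff is picking the flexibility that minimizes their sum.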
- Overfitting and Underfitting:
  - Overfitting: When a model performs well on training data but poorly on unseen data.
  - Underfitting: When a model fails to capture the underlying patterns in the data.
- Regularization:
  - Techniques to reduce overfitting by penalizing model complexity.
  - L1 Regularization (Lasso): Penalizes the sum of absolute coefficients.
  - L2 Regularization (Ridge): Penalizes the sum of squared coefficients.
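The two penalty terms are simple to write down; during training they are added (scaled by a strength `lam`) to the model's loss. A minimal sketch with invented helper names and an example weight vector:

```python
def l1_penalty(weights, lam=1.0):
    """L1 (Lasso) penalty: lam times the sum of absolute coefficients."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam=1.0):
    """L2 (Ridge) penalty: lam times the sum of squared coefficients."""
    return lam * sum(w * w for w in weights)

w = [1.0, -2.0, 3.0]
print(l1_penalty(w), l2_penalty(w))  # 6.0 14.0
```

In practice the difference matters: the L1 penalty's constant gradient pushes small coefficients all the way to zero (sparse models), while the L2 penalty only shrinks them toward zero.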
- Feature Engineering:
  - The process of transforming raw data into features that are more suitable for machine learning models.
  - Techniques:
    - Feature Scaling: Normalizing features to a consistent range.
    - Feature Selection: Selecting the most informative features.
    - Feature Extraction: Creating new features from the original features.
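Feature scaling is the easiest of these to show concretely. A minimal min-max sketch (one common form of scaling; the function name is my own), which rescales a feature to the [0, 1] range:

```python
def min_max_scale(values):
    """Rescale a feature to the [0, 1] range via min-max normalization."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

print(min_max_scale([1.0, 2.0, 3.0]))  # [0.0, 0.5, 1.0]
```

Scaling matters most for distance-based methods like k-NN and SVMs, where an unscaled feature with a large range would otherwise dominate the distance computation.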
- Model Selection and Validation:
  - Techniques to select the best model and avoid overfitting:
    - Cross-Validation: Evaluates a model on multiple subsets of the data.
    - Train-Validation-Test Split: Divides the data into separate sets for training, validation, and testing.
    - Hyperparameter Tuning: Optimizing the model's hyperparameters to improve performance.
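The mechanics of k-fold cross-validation are just index bookkeeping: each fold takes a turn as the held-out test set while the rest trains the model. A minimal sketch (my own helper, assuming for simplicity that the sample count divides evenly by k):

```python
def kfold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Assumes n_samples is divisible by k for simplicity.
    """
    indices = list(range(n_samples))
    fold_size = n_samples // k
    for fold in range(k):
        start, stop = fold * fold_size, (fold + 1) * fold_size
        test = indices[start:stop]                 # this fold is held out
        train = indices[:start] + indices[stop:]   # everything else trains
        yield train, test

folds = list(kfold_indices(6, 3))
print(folds[0])  # ([2, 3, 4, 5], [0, 1])
```

Averaging the metric over all k folds gives a more stable performance estimate than a single train-test split, at the cost of training the model k times.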
- Note: Unsupervised learning is also a good choice for problems where you want to find patterns and structure in data that are not obvious to humans.