general
- how to collect data
- What type of ML problem is this
- what is the type of data
- what will be the output
- missing values and outliers
- labeled or unlabeled data
- what kind of preprocessing and transformation is needed
- potential biases in the model
- what is the size of training and testing data
- possibility of data leakage
- chances of overfitting and underfitting
- How to evaluate the model
- what is the simplest way to make a model
- how it is to be deployed in the real world
- how to monitor and update the model in production
- What ensemble methods can I consider
- How to explain the model predictions
- model serving and scalability
- how to update the model with new data
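For the train/test size and data leakage items above, here is a minimal sketch of one common precaution: split first, then learn preprocessing statistics from the training split only. The dataset, the helper names (`train_test_split`, `fit_scaler`, `apply_scaler`) and the 80/20 ratio are assumptions made for illustration, not part of the original checklist.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(values):
    """Learn the mean and standard deviation from the given values."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # avoid division by zero
    return mean, std


def apply_scaler(values, mean, std):
    """Standardize values with statistics learned elsewhere."""
    return [(v - mean) / std for v in values]


# Hypothetical single-feature dataset, for illustration only.
data = [float(x) for x in range(100)]

train, test = train_test_split(data)
mean, std = fit_scaler(train)                 # fitted on the training split only
train_scaled = apply_scaler(train, mean, std)
test_scaled = apply_scaler(test, mean, std)   # test reuses the training statistics
```

Fitting the scaler on the full dataset before splitting would let test-set information influence training, which is one simple form of leakage.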
regression
- if linear relation
- if polynomial relation
- if there is some multicollinearity
- if feature selection is required
- for count data
- if there are a lot of non-linear relations
- outliers
- missing values
- if feature importance is crucial
- if homoscedasticity is violated
- if the features are high-dimensional
- if real-time prediction is required
- if regularization is needed
- If the data distribution is skewed
- Categorical data included
- interpretability is important
- if training time is a constraint
- Data has temporal dependencies
- Interaction effects
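For the "linear relation" and "regularization" items above, a small sketch of a one-feature least-squares fit with an optional L2 (ridge) penalty on the slope. The function name `fit_line` and the parameter `l2` are hypothetical; with `l2=0.0` this is ordinary least squares, and the intercept is left unpenalized.

```python
def fit_line(xs, ys, l2=0.0):
    """Fit y = slope * x + intercept, with an optional L2 penalty on the slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums: covariance-like and variance-like terms.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + l2)          # l2 > 0 shrinks the slope toward zero
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]            # exactly y = 2x + 1

ols_slope, ols_intercept = fit_line(xs, ys)
ridge_slope, ridge_intercept = fit_line(xs, ys, l2=5.0)
```

Comparing `ols_slope` with `ridge_slope` shows the shrinkage effect: the penalty trades some fit on the training data for a smaller coefficient.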
classification
- Missing values
- imbalance
- if feature importance is crucial
- categorical features
- if interpretability is important
- binary
- if we want a simple and interpretable model
- non-linear decision boundaries
- If it is a credit risk analysis
- data is high-dimensional
- local patterns matter more than global
- large number of features and independence assumption
- large dataset and complex relations
- multiclass
- multilabel
- anomaly detection
- outliers
- if data is textual
- if overfitting is a concern
- if training time is a constraint
- if data has hierarchical structure
- if data distribution is unknown and changing
- if real-time prediction is required
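For the "imbalance" item above, one option is to weight classes inversely to their frequency. The sketch below uses the common "balanced" heuristic, `n_samples / (n_classes * class_count)`; the function name `balanced_class_weights` and the 90/10 label split are assumptions for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class that is inversely proportional to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical imbalanced labels: 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
```

These weights can then scale each sample's contribution to the loss, so that errors on the rare class cost more than errors on the common one.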
nlp
- EDA
- token and text length distribution
- vocabulary size
- most frequent words
- rare words
- stop words and punctuation
- N-grams analysis
- Spelling and typos
- Text annotation quality
- entity co-occurrence
- error analysis
- data augmentation opportunities
- If dimensionality reduction is needed
- If dealing with multilingual text
- If handling noisy or informal text
- text segmentation
- chat bot
- Named entity recognition (NER)
- spam detection
- POS tagging
- translation
- Text summarisation
- Speech recognition
- Question answering
- Sentiment analysis
- In the case of training
- using pretrained models
- document similarity
- Text generation
- Coreference resolution
- Dependency parsing
- Semantic role labeling
- if real-time processing is required
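Several of the EDA items above (text length distribution, vocabulary size, most frequent words, rare words, n-grams) can be sketched with simple counting. This is a rough illustration that assumes whitespace tokenization and lowercasing; the function name `text_stats`, its parameters and the sample documents are hypothetical.

```python
from collections import Counter


def text_stats(docs, n=2, rare_threshold=1):
    """Compute basic corpus statistics using whitespace tokenization."""
    tokenized = [doc.lower().split() for doc in docs]
    word_counts = Counter(word for tokens in tokenized for word in tokens)
    ngram_counts = Counter(
        tuple(tokens[i:i + n])
        for tokens in tokenized
        for i in range(len(tokens) - n + 1)
    )
    return {
        "lengths": [len(tokens) for tokens in tokenized],   # tokens per document
        "vocab_size": len(word_counts),
        "word_counts": word_counts,
        "ngram_counts": ngram_counts,
        # Words seen at most rare_threshold times in the corpus.
        "rare_words": sorted(w for w, c in word_counts.items() if c <= rare_threshold),
    }


docs = ["the cat sat", "the cat ran away", "a dog sat"]
stats = text_stats(docs)
top_words = stats["word_counts"].most_common(3)
top_bigrams = stats["ngram_counts"].most_common(1)
```

A real corpus would usually need a proper tokenizer and handling for punctuation and stop words, which this sketch deliberately leaves out.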
computer vision
- if dealing with small datasets
- if real-time processing is required
- If there are many small objects in image
- If needing high accuracy and precision
- If handling noisy or varied environments
- if real time object tracking is required
- Data augmentation opportunities
- Image classification
- Object detection
- Image segmentation
- Instance segmentation
- Image generation or style transfer
- Image captioning
- Image super-resolution
- Medical image analysis
- extract info from docs
- Object tracking
- Anomaly detection in images
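For the "small datasets" and "data augmentation" items above, a toy sketch of one augmentation, a horizontal flip. It assumes an image is represented as a nested list of pixel rows, which is a simplification for illustration; the function names are hypothetical.

```python
def horizontal_flip(image):
    """Mirror an image left to right; image is a list of pixel rows."""
    return [row[::-1] for row in image]


def augment(image):
    """Return the original image plus its horizontally flipped copy."""
    return [image, horizontal_flip(image)]


# Hypothetical 2x3 grayscale image.
image = [
    [1, 2, 3],
    [4, 5, 6],
]
augmented = augment(image)
```

A flip only preserves the label when orientation does not carry meaning, so it may be unsuitable for tasks such as reading text in documents.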
recommendation system
- If the data contains a lot of missing values
- Addressing cold start problem
- handling large scale data
- If real time recommendation is important
- If dealing with multicriteria recommendations
- If incorporating social network information
- collaborative filtering
- content-based filtering
- Matrix factorisation
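For the "collaborative filtering" and "missing values" items above, a minimal user-based sketch: ratings are stored sparsely (a missing rating is simply absent), and a prediction is a similarity-weighted average of other users' ratings. The user names, ratings and function names are made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict(ratings, user, item):
    """Predict a rating as the similarity-weighted mean of neighbours' ratings."""
    numerator = 0.0
    denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        similarity = cosine(ratings[user], other_ratings)
        numerator += similarity * other_ratings[item]
        denominator += abs(similarity)
    return numerator / denominator if denominator else None


# Hypothetical ratings; "ann" has not rated item "c".
ratings = {
    "ann": {"a": 5, "b": 3},
    "bob": {"a": 5, "b": 3, "c": 4},
    "cat": {"a": 1, "b": 5, "c": 2},
}
predicted = predict(ratings, "ann", "c")
```

When no other user has rated the item, `predict` returns `None`, which is the cold start situation listed above and needs a different fallback.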
time series
- Anomalies and outliers
- If data shows non-stationarity
- Handling missing values
- Dealing with structural breaks
- If forecasting future values
- time series classification
- seasonal adjustments
- If working with long time series
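For the "non-stationary" and "seasonal adjustments" items above, differencing is one widely used transformation. The sketch below assumes a plain list of evenly spaced observations; the function name `difference` and the sample series are illustrative only.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Hypothetical series with a linear trend: first differences are constant.
trend = [1, 3, 5, 7]
detrended = difference(trend)

# Hypothetical series with a period of 3: a lag-3 difference removes the cycle.
seasonal = [1, 2, 3, 4, 5, 6]
deseasonalized = difference(seasonal, lag=3)
```

Each differencing step shortens the series by `lag` observations, and forecasts made on the differenced scale have to be summed back to the original scale.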
Created
May 9, 2024 17:02
These are the types of problems in Machine Learning