| Component | Role in Vision Pipeline | Why It's Useful |
|---|---|---|
| CNN Layers | Early stages for local feature extraction | Capture edges, textures, and low-level patterns |
| Transformer Layers | Later stages for modeling long-range dependencies | Learn relationships across distant parts of the image |
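
To make the division of labor in the table concrete, here is a minimal sketch in plain Python of a convolution stage feeding a self-attention stage. It is an illustration under stated assumptions, not a production architecture: the function names (`conv2d_valid`, `self_attention`, `hybrid_forward`), the two hand-written 2x2 edge kernels, and the use of identity projections in place of learned query/key/value weights are all hypothetical choices made for brevity.

```python
import math

def conv2d_valid(image, kernel):
    """CNN stage: slide a small kernel over the image (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Transformer stage: single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections); a real layer would learn these projections.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Every token scores every other token, regardless of distance.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens))
                    for c in range(d)])
    return out

def hybrid_forward(image, kernels):
    """Run local feature extraction first, then global attention."""
    # One feature map per kernel; all kernels must share the same shape.
    feature_maps = [conv2d_valid(image, k) for k in kernels]
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    # Each spatial position becomes a token; its channels are the kernel responses.
    tokens = [[fm[i][j] for fm in feature_maps]
              for i in range(h) for j in range(w)]
    return self_attention(tokens)

# Hypothetical hand-written kernels standing in for learned CNN filters.
VERTICAL_EDGE = [[-1, 1], [-1, 1]]
HORIZONTAL_EDGE = [[-1, -1], [1, 1]]

if __name__ == "__main__":
    # 4x4 image with a vertical edge between columns 1 and 2.
    image = [[0, 0, 1, 1] for _ in range(4)]
    out = hybrid_forward(image, [VERTICAL_EDGE, HORIZONTAL_EDGE])
    print(len(out), "tokens of dimension", len(out[0]))
```

The convolution only ever looks at a 2x2 neighborhood, while the attention loop compares every token with every other token, which is the local-versus-global split the table describes. In a trained model both the kernels and the attention projections would be learned rather than fixed.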