1. Multi-task learning trains a single model on multiple related tasks simultaneously, sharing parameters so that tasks can pool limited per-task data and amortize computation, which addresses data and compute bottlenecks.
2. Common multi-task learning architectures partition the network into shared and task-specific components, most often a shared trunk feeding task-specific heads, sometimes with cross-talk connections between the per-task branches (see the sketch after this list).
3. Optimization techniques for multi-task learning include balancing the individual task losses, regularization, task scheduling, and gradient modulation (a loss-balancing sketch follows the architecture example below).
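
To make the shared-trunk pattern in point 2 concrete, here is a minimal sketch in PyTorch. The class and attribute names (`SharedTrunkModel`, `trunk`, `heads`) and the layer sizes are illustrative assumptions, not taken from any specific paper; the pattern itself, hard parameter sharing with one output head per task, is the standard one.

```python
import torch
import torch.nn as nn

class SharedTrunkModel(nn.Module):
    """Hard parameter sharing: one shared trunk, one head per task.

    Names and sizes here are illustrative, not from the original text.
    """

    def __init__(self, in_dim, trunk_dim, task_out_dims):
        super().__init__()
        # Shared trunk: its parameters receive gradients from every task.
        self.trunk = nn.Sequential(
            nn.Linear(in_dim, trunk_dim),
            nn.ReLU(),
            nn.Linear(trunk_dim, trunk_dim),
            nn.ReLU(),
        )
        # Task-specific heads: each task gets its own output layer.
        self.heads = nn.ModuleDict({
            name: nn.Linear(trunk_dim, out_dim)
            for name, out_dim in task_out_dims.items()
        })

    def forward(self, x):
        shared = self.trunk(x)
        # One prediction per task, all computed from the shared representation.
        return {name: head(shared) for name, head in self.heads.items()}

model = SharedTrunkModel(in_dim=32, trunk_dim=64,
                         task_out_dims={"classify": 10, "regress": 1})
outputs = model(torch.randn(8, 32))  # dict: task name -> (8, out_dim) tensor
```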
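
And for the loss balancing mentioned in point 3, a common baseline is a weighted sum of per-task losses with a single backward pass. This continues the sketch above (it reuses `model`); the task names, weights, and loss functions are illustrative assumptions. In practice the weights are tuned or learned, for example via uncertainty weighting or GradNorm-style gradient modulation.

```python
# Hypothetical static task weights; in practice these are tuned or learned.
task_weights = {"classify": 1.0, "regress": 0.5}
loss_fns = {"classify": nn.CrossEntropyLoss(), "regress": nn.MSELoss()}

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def training_step(x, targets):
    """One joint update: sum the weighted per-task losses, backprop once."""
    optimizer.zero_grad()
    outputs = model(x)
    # The shared trunk receives the weighted sum of all task gradients;
    # the weights control how hard each task pulls on the shared parameters.
    total_loss = sum(
        task_weights[name] * loss_fns[name](outputs[name], targets[name])
        for name in outputs
    )
    total_loss.backward()
    optimizer.step()
    return total_loss.item()

x = torch.randn(8, 32)
targets = {"classify": torch.randint(0, 10, (8,)),
           "regress": torch.randn(8, 1)}
print(training_step(x, targets))
```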