Fault Tolerance is not the same as Resiliency. The two terms are sometimes used interchangeably, but they mean different things.
- Fault Tolerance is the ability of a system to survive (tolerate) a fault when it occurs, e.g., a server crash or a network partition
- There may be a temporary drop in overall performance, but system features are not affected
- Mechanisms such as checkpoint/restore and Replicated State Machines are commonly used to achieve this
- Such systems usually have the ability to self-detect faults and fail over automatically
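
The fault detection and failover described above can be sketched roughly as follows. This is a minimal, in-process illustration only, not a production design: the names `Replica`, `ReplicaDown`, and `FailoverGroup` are hypothetical, a crash is simulated with a `healthy` flag, and a real system would detect faults with heartbeats or timeouts over the network.

```python
class ReplicaDown(Exception):
    """Hypothetical fault signal raised by a replica that cannot serve requests."""


class Replica:
    """Hypothetical server replica; `healthy` stands in for a real liveness check."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ReplicaDown(self.name)
        return f"{self.name} handled {request}"


class FailoverGroup:
    """Routes requests to the current primary and fails over when it is down."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.primary = 0  # index of the replica currently serving traffic

    def handle(self, request):
        # Try the primary first, then each remaining replica in order.
        for offset in range(len(self.replicas)):
            index = (self.primary + offset) % len(self.replicas)
            try:
                result = self.replicas[index].handle(request)
            except ReplicaDown:
                continue  # fault detected: try the next replica
            self.primary = index  # fail over: promote the replica that answered
            return result
        raise RuntimeError("all replicas are down")


group = FailoverGroup([Replica("a"), Replica("b")])
first = group.handle("req-1")       # served by the primary, "a"
group.replicas[0].healthy = False   # simulate a crash of the primary
second = group.handle("req-2")      # still served, now by "b"
```

The caller gets an answer for both requests, so the feature is unaffected. The only cost of the fault is the extra failed attempt on the second request, which corresponds to the temporary drop in performance.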