One way to solve this problem is to run multiple servers, one of which is the production server. The other is usually the failover backup, but when it’s time to upgrade, it can be taken offline for the work. When it’s ready, you use a load balancer to send all requests to the newly upgraded server. The switchover is effectively instantaneous: one millisecond requests are being handled by the old server, and the next millisecond they’re going to the new one.
It’s kind of like teleporting a new car into place to avoid a pit stop in a race. Terrible analogy, but there you go.
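If it helps to see the switchover idea in code, here is a toy sketch in Python. It is not how any particular load balancer is implemented; `LoadBalancer`, `switch_to`, `old_server` and `new_server` are all made-up names for illustration, and the "servers" are just functions. The point is only that the balancer holds a single pointer to the active backend, and changing that pointer is what moves every subsequent request to the new server.

```python
import threading


class LoadBalancer:
    """Toy router (hypothetical) that forwards every request to one active backend."""

    def __init__(self, active):
        self._active = active
        self._lock = threading.Lock()

    def switch_to(self, backend):
        # Swap the active backend in a single step and return the old one,
        # so it can be taken offline for upgrade work.
        with self._lock:
            previous, self._active = self._active, backend
        return previous

    def handle(self, request):
        # Read the current backend under the lock, then call it outside the lock
        # so a slow request does not block the switchover.
        with self._lock:
            backend = self._active
        return backend(request)


def old_server(request):
    # Stand-in for the current production server.
    return f"v1 handled {request}"


def new_server(request):
    # Stand-in for the freshly upgraded server.
    return f"v2 handled {request}"


balancer = LoadBalancer(active=old_server)
before = balancer.handle("GET /")   # served by the old server
balancer.switch_to(new_server)      # the switchover
after = balancer.handle("GET /")    # served by the new server
```

Requests that were already in flight when the switch happens still finish on the old server in this sketch; only requests that arrive afterwards go to the new one.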