I work on dubbed translation software. Right now, our translation process is theoretically realtime (without lip syncing): it takes about a minute of processing per minute of audio. There's some added setup time per request, which is mostly a scaling issue, and the website's UI isn't actually set up to do realtime translation yet (instead it just takes in a video, translates the whole thing, and spits it out).
But realtime dubbed translation is absolutely possible with current tech; there are just some mostly non-ML-related barriers. u/perrochon is right about the legal issues. Realistically, what you'll more likely see soon is YouTubers adding multilingual audio tracks to their videos themselves as YouTube begins to roll out that feature more widely. If you know any YouTubers who might be interested, send them our contact :)