I can't imagine true "real time" translation ever happening; there will always be a delay of some sort.
(To clarify: when I think "real time translation", I'm thinking, like, an earpiece you wear that translates what somebody is saying as they're saying it, giving the impression that they are speaking your language)
Sentence structure varies so much between languages that this type of translation simply isn't feasible: the translator often can't know how to render a word until it has heard the words that follow it. For example, type a moderately complex sentence into a translator such as DeepL and watch how the translated sentence transforms wildly with nearly every new word.
Take the sentence "he has been taking care of his grandmother for twenty dollars an hour".
"he has" -> "tiene", as in "he possesses"
"he has been" -> "ha sido", as in "he has been [a doctor]"
"he has been taking" -> "ha estado tomando", as in "he has been taking [medicine]"
etc etc.
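
To put a rough number on that, here's a small Python sketch that checks how many words of each translation survive once the next word arrives. It doesn't call any translator: the three outputs are just the ones listed above, hard-coded, and the function names are made up for illustration.

```python
# Prefix translations copied from the examples above; nothing is translated here.
PREFIX_TRANSLATIONS = [
    ("he has", "tiene"),
    ("he has been", "ha sido"),
    ("he has been taking", "ha estado tomando"),
]


def stable_word_count(previous: str, current: str) -> int:
    """Count the leading words shared by two successive translations."""
    count = 0
    for old, new in zip(previous.split(), current.split()):
        if old != new:
            break
        count += 1
    return count


def revisions(pairs):
    """For each step, report how many leading words of the previous output are kept."""
    results = []
    previous = ""
    for source, target in pairs:
        kept = stable_word_count(previous, target)
        results.append((source, target, kept))
        previous = target
    return results


for source, target, kept in revisions(PREFIX_TRANSLATIONS):
    print(f"{source!r} -> {target!r} (kept {kept} word(s) from the previous output)")
```

An earpiece that had already spoken "tiene" or "ha sido" would have to take it back, which is why it has to wait instead.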