AI makes it possible
KIT researcher sends voice message from the Titanic
Around 3,800 meters below the surface of the sea, you can watch the Titanic disintegrate: how it rusts and how underwater creatures eat away at it. A computer scientist from Karlsruhe has now used a mission to the wreck for something completely different.
During an expedition to the sunken Titanic, computer scientist Alex Waibel tested a speech technology with a video function from a submarine. He used sonar to send texts to the surface, where they were converted into spoken language and video using artificial intelligence (AI). The researcher from the Karlsruhe Institute of Technology (KIT) told the German Press Agency (dpa) that the team was able to get through some of the dialogs. "We were able to see that it really works."
This is how it works
KIT researcher Alex Waibel (right) in the submarine with the founder of OceanGate, Stockton Rush. The company carries out submarine missions to the Titanic.
© Alex Waibel/KIT/dpa
The tested technique works as follows: Before the dive, Waibel and participating colleagues recorded videos and voice samples of themselves. When text messages reach the computer system, the AI converts them so that the video looks and sounds as if the person is speaking, including lip movements.
What sounds like a PR gimmick by tech-savvy scientists, especially in the context of the Titanic expedition, has a serious background: "There are enough places in the world where the bandwidth is so poor that only text transmission is possible," said Waibel. The new technology could one day make video communication possible in such places.
However, the mission also revealed pitfalls: one of two sonar devices failed, said Waibel. As a result, only part of the dialog could be transmitted from the submarine. The dive also gave him new ideas: submarine crews, for example, rely heavily on abbreviations to compress texts. Another goal is to reduce the size of the technology so that it fits in a pocket. All in all, Waibel was satisfied: "We've made a good start."
Incidentally, one of the biggest challenges in converting the texts into videos has nothing to do with language, the scientist revealed: "If the person doesn't say anything, it's surprisingly difficult." Then the lips in the videos hardly move at all.
Waibel was part of a larger mission with biologists and archaeologists, among others. Such expeditions to the Titanic take place regularly.
The researcher has been working on AI and machine learning in speech and communication technology for more than 30 years. Among other things, he developed what KIT describes as the world's first automatic simultaneous translation service at a university. The "Lecture Translator" automatically records the speaker's lecture and simultaneously translates the speech into English, which is then displayed as subtitles. Students without any knowledge of German can follow the lecture via laptop, smartphone or tablet.















