One of the most common types of content shared on Facebook and other social media platforms is the photograph. While most social media users can look at an image and see what it represents, it’s not so easy for those who are blind or visually impaired. Facebook says screen readers can describe the contents of these images using a synthetic voice.
Thanks to modern machine learning techniques, text-to-speech engines have made massive strides over the last few years. It used to be incredibly easy to tell that it was a computer reading a text and not a human being. Amazon’s AWS cloud computing arm today launched a number of new neural text-to-speech models, as well as a new newscaster style that is meant to mimic the way… you guessed it… newscasters sound. “Speech quality is certainly important, but more can be done to make a synthetic voice sound even more realistic and engaging,” the company notes in today’s announcement. “For sure, human ears can tell the difference between a newscast, a sportscast, a university class and so on; indeed, most humans adopt the right style of speech for the right context, and this certainly helps in getting their message across.” The new newscaster style is now available in two U.S. voices (Joanna and Matthew), and Amazon is already working with USA Today and Canada’s The Globe and Mail, among a number of other companies, to help them voice their texts.
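For readers curious how a developer would request the newscaster style, here is a minimal sketch. It assumes the boto3 SDK and the way Amazon Polly exposes speaking styles through SSML’s `<amazon:domain>` tag with the neural engine; actually synthesizing audio would additionally require AWS credentials, so the sketch only builds the request parameters.

```python
def newscaster_request(text: str, voice: str = "Joanna") -> dict:
    """Build SynthesizeSpeech parameters for Polly's newscaster style.

    The newscaster style is selected via SSML's <amazon:domain name="news">
    wrapper and requires the neural engine; "Joanna" and "Matthew" are the
    two U.S. voices that support it.
    """
    ssml = f'<speak><amazon:domain name="news">{text}</amazon:domain></speak>'
    return {
        "Engine": "neural",      # newscaster style needs the neural engine
        "VoiceId": voice,
        "OutputFormat": "mp3",
        "TextType": "ssml",      # tell Polly the Text field is SSML
        "Text": ssml,
    }

# With credentials configured, the call would look like:
#   import boto3
#   polly = boto3.client("polly")
#   audio = polly.synthesize_speech(**newscaster_request("Top story tonight."))
```

Keeping the parameter-building separate from the network call makes the style selection easy to inspect and test without touching AWS.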
Siri was the first to catapult voice technology into the mainstream when it launched with the iPhone 4S back in 2011. At the time, having a voice assistant felt frankly futuristic – because of this it came riding an uneasy wave of Terminator / GLaDOS references – but the reality was somewhat different. Yes, Siri was perfectly functional, but the scope of what could actually be done with it was rather limited – not just because it was beta technology, but because it was restricted to the phone. The company has struggled ever since to make Siri’s code more efficient. But while Siri got everyone used to speaking to a synthetic voice – and there are rumblings that it's about to get a whole lot better – right now it's Google and, in particular, Amazon that are making most of the running in this space.
AI systems are capable of generating realistic-sounding synthetic voice recordings of any individual for whom there is a sufficiently large voice training dataset. The same is increasingly true for video. As of this writing, “deep fake” forged audio and video looks and sounds noticeably wrong even to untrained individuals. However, at the pace these technologies are making progress, they are likely less than five years away from being able to fool the untrained ear and eye. The manipulation of video, images, and sound isn’t exactly new – nearly a decade ago we watched as Jeff Bridges graced the screen of “Tron Legacy” appearing exactly as he did 35 years ago when he starred in the original. It requires no video editing skills and minimal knowledge of AI to use — most DeepFakes apps are built with Google’s open-source AI platform TensorFlow.
Google CEO Sundar Pichai milked the woos from a clappy, home-turf developer crowd at its I/O conference in Mountain View this week with a demo of an in-the-works voice assistant feature that will enable the AI to make telephone calls on behalf of its human owner. The Duplex demos were pre-recorded, rather than live phone calls, but Pichai described the calls as “real” — suggesting Google representatives had not in fact called the businesses ahead of time to warn them its robots might be calling in. King reckons it would need to state up front that it’s a robot and/or use an appropriately synthetic voice so it’s immediately clear to anyone picking up the phone that the caller is not human. Experiments have shown that many people interact with conversational AI software just as they would with another person, but there is also evidence that some people do the exact opposite — and become a lot ruder. We’ve seen, for example, how microtargeted advertising platforms have been hijacked at scale by would-be election fiddlers. Pichai said the first — and still, as he put it, experimental — use of Duplex will be to supplement Google’s search services by filling in information about businesses’ opening times during periods when hours might inconveniently vary, such as public holidays.
Thanks to its hyper-realistic synthetic voice — right down to the human-like “uh-huhs” and pauses — the person on the other end of the line probably won’t even realize they’re talking to a bot. HAL, the AI who controls the operations of the spacecraft Discovery One, remains arguably the world’s most referenced chatbot, despite only existing in the realms of science fiction. Illustrated through a series of slickly produced video vignettes, Knowledge Navigator was a proposed chatbot-based system in the form of an onscreen butler software agent. Despite possessing its own personality, it was primarily intended to be used for information retrieval and executing commands. However, this tantalizing glimpse forward has since inspired many people outside the Cupertino company who are interested in building productivity-focused chatbots.
Some apps are almost magical; just a few years ago, they would have felt like science fiction. What the app does is turn written text into speech. You can hold up the phone in front of a regular page in a paper book, and after a few seconds you will hear the text read by a synthetic voice. Hearing the text while reading it makes it easier to understand. But of course it is not only students who may have problems with reading, and the app can just as easily be used by people with impaired vision – or by any of us who sometimes need a little help to concentrate. Scanpen has become possible thanks to two different technologies.
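The point-your-camera-and-listen flow described above boils down to a two-stage pipeline: recognize the text in the photographed page, then synthesize it as speech. Here is a minimal sketch of that composition; the `ocr` and `tts` stages are hypothetical injected functions (the article does not name the actual engines Scanpen uses), which also makes the pipeline testable with stubs.

```python
from typing import Callable, Tuple

def read_page_aloud(
    image_bytes: bytes,
    ocr: Callable[[bytes], str],
    tts: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Run OCR on a photographed page, then synthesize the result.

    Stage 1 (ocr): optical character recognition turns the page image
    into a string of text. Stage 2 (tts): text-to-speech renders that
    string as audio. Returning both lets a UI highlight the text while
    the synthetic voice reads it.
    """
    text = ocr(image_bytes)   # stage 1: image -> recognized text
    audio = tts(text)         # stage 2: text -> synthesized speech
    return text, audio
```

In a real app the two callables would wrap an OCR engine and a speech synthesizer; injecting them keeps the pipeline logic independent of either vendor.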
Swedspot wants to simplify the lives of car owners with the help of an IoT solution for connecting vehicles. Now, the company has received its first big customer: Telenor. Swedspot started in 2013, in the wake of Saab's bankruptcy; the task was to adapt Android to the driving environment of the Saab cars. One option would of course be to use Android as it is. The synthetic voice that reads out incoming SMS messages may, for example, have to wait a few seconds until the driving allows it.