From digit recognition to intentional speech disfluency and beyond.
While the research and invention of devices capable of recognizing voice dates back to 1877, the first big breakthrough probably came in 1952, in the form of Audrey, Bell Labs’ Speech Recognition System. While Audrey could only understand the digits 0-9, “she” was still the first known instance of a machine exhibiting the ability to understand speech (and with a whopping 97% accuracy at that, woohoo!).
It took only another decade for IBM to create a machine that could understand words other than digits: Shoebox, in 1962 (10 of the 16 words Shoebox could understand were numbers, but still). The wonder of Shoebox was that it could not only understand speech input, but could also process that speech to do arithmetic. An operator could say numbers and commands like “Three plus two plus one minus four, total,” and Shoebox would print out the correct answer: 2.
IBM’s Shoebox
(Source)
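To make the Shoebox example concrete, here is a minimal Python sketch that evaluates a spoken-style command like the one quoted above. It only illustrates the arithmetic, not how Shoebox worked internally; the function name `evaluate_command` and the word table are made up for this post.

```python
# Illustrative only: evaluates a Shoebox-style spoken command.
# The names below are hypothetical, not taken from IBM's machine.
DIGIT_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
    "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9,
}


def evaluate_command(command: str) -> int:
    """Add and subtract spoken digits until the word 'total' is heard."""
    tokens = command.lower().replace(",", " ").split()
    total = 0
    sign = 1  # the next number is added unless "minus" came before it
    for token in tokens:
        if token == "plus":
            sign = 1
        elif token == "minus":
            sign = -1
        elif token == "total":
            return total
        elif token in DIGIT_WORDS:
            total += sign * DIGIT_WORDS[token]
            sign = 1
        elif token.isdigit():
            # Accept numerals too, e.g. "1" instead of "one".
            total += sign * int(token)
            sign = 1
        else:
            raise ValueError(f"unrecognized word: {token!r}")
    raise ValueError("command must end with 'total'")


print(evaluate_command("Three plus two plus one minus four, total"))
```

Running it prints 2, the same answer Shoebox gave its operator.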
And so began an amazing line of innovation that eventually brought us to the voice assistants of today like Siri, Cortana, and their peers. And of course, the Google Assistant.
Google and the History of Speech in Tech
Google’s progress with voice search over the past few years has been nothing short of phenomenal. And while most people think that Google’s entry into the voice search market was made with Google Assistant in 2016, that was not quite the case. To understand where Google started, we need to go back as far as 2007, to the invention of GOOG-411.
Actually, why don’t we go a little further back while we’re on this topic (since this post is practically a history lesson at this point anyway)? Communication, as we all know, works both ways: it isn’t only about listening, it’s also about speaking. To understand when machines were taught how to “speak”, we’ll need to take a short detour to 1769.
Wolfgang von Kempelen’s speaking machine was the first known instance of a machine that could talk to us, an invention that cost von Kempelen about 20 years of his life. It was not until about 170 years later, in 1939, that Bell Labs unveiled the Voder, bringing electronic speech synthesizers into existence.
A replica of Kempelen’s speaking machine (Source)
With speech recognition also gaining popularity and devices like Shoebox spurring further research, science took great strides in this field over the next 50 years. Voice recognition entered everyday homes in 1987 with Julie, an interactive talking doll that could understand speech commands and talk back to you. Following that, the next two decades saw a flurry of inventions that allowed people to dictate words to their computers, or command those computers to do things for them.
It was right at the end of those two decades that Google entered the market with GOOG-411, seemingly well behind the other tech giants. Google was still mostly just a search engine back then, and GOOG-411 brought the power of online search directly to telephones.
For those of you who are not familiar with it, GOOG-411 was a telephone service launched in 2007 that allowed users to talk into their phones to run local searches. A user could dial the toll-free telephone number 800-466-4411 (800-GOOG-411, hence the name), and say their city name, state name, and a business category. Once the user did this, Google would return search results containing up to eight businesses that the user could pick from.
Now, this might seem tame even by 2007 standards, but what happened next was completely unexpected.
The GOOG-411 service, as users would soon discover, was apparently never intended to be a long-term business. It was just a way for Google to build a phoneme database that could help the mammoth brand build robust speech-to-text software (what, whaaaat!).
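If you are wondering how 800-GOOG-411 turns into 800-466-4411, it is just the standard telephone keypad lettering. The short Python sketch below shows the mapping; the helper name `vanity_to_digits` is made up for illustration.

```python
# Standard telephone keypad lettering (2=ABC, 3=DEF, ..., 9=WXYZ).
KEYPAD = {
    "ABC": "2", "DEF": "3", "GHI": "4", "JKL": "5",
    "MNO": "6", "PQRS": "7", "TUV": "8", "WXYZ": "9",
}
LETTER_TO_DIGIT = {
    letter: digit for letters, digit in KEYPAD.items() for letter in letters
}


def vanity_to_digits(number: str) -> str:
    """Convert a vanity number to plain digits, dropping punctuation."""
    return "".join(
        LETTER_TO_DIGIT.get(ch.upper(), ch) for ch in number if ch.isalnum()
    )


print(vanity_to_digits("800-GOOG-411"))
```

G maps to 4 and O maps to 6, so GOOG becomes 4664 and the full number comes out as 8004664411.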
Just a year after the iPhone’s introduction in 2007, Google used this database to bring speech-to-text to phones by creating the Voice Search app for iPhone. This was one of the first instances of speech-to-text making it to commercial devices, and definitely the first instance of “voice search” as we know it. Google has been making incremental improvements to its voice search engine ever since. And since GOOG-411 had served its purpose, Google shut it down in 2010.
However, it took only one more year for the next big innovation in the smartphone market to take the world by storm. 2011 saw the introduction of Siri on iPhones, opening up a completely new market that would soon become crowded.
The World of Voice Assistants
The introduction of Siri marked the first time you could not only talk to a smartphone, but also have a conversation with it. Siri could help you get things done just by talking to it, and it would also answer your questions using its “voice”.
The revelation caused quite a stir in the smartphone market, and Siri had virtually no competitors for quite some time after that. Google’s Android introduced Google Now in 2012, and while it greatly improved the user experience of Android smartphones, it was not quite the same without the ability to actually converse with your smartphone. Siri dominated the voice assistant market, and Apple eventually introduced the feature on its computers as well.
It was 2014 before other big players started to move into the voice assistant field: Microsoft unveiled Cortana that April, and Amazon introduced Alexa that November. The year 2015 saw Siri, Cortana, and Alexa growing, with Google seemingly nowhere in sight.
Finally, in 2016, Google announced Google Assistant. Much like Siri, Google Assistant was a built-in smartphone feature, and it was powerful, owing to its native integration with Android and the fact that Android phones already held a considerable share of the smartphone market. And the biggest selling point of them all was that Google Assistant allowed you to have a conversation with your phone, just like Siri.
Though the Google Assistant was a little late to the party, it was extremely successful. In 2017, the Google Assistant was installed on over 400 million devices. Shortly after, it gained a reputation among several YouTubers and technology experts as the most advanced voice-controlled assistant.
In no time, Google was also battling Amazon’s Alexa and Echo line of products for the top spot in the smart speaker market with Google Home. Since then, Apple has moved into the smart speaker segment too, with the introduction of the Apple HomePod.
Google and Voice Search, Today
Today, over 1 billion devices provide voice-assisted access to consumers. While the most-used voice assistant on smartphones is Siri and the most popular voice-assisted smart speaker is Alexa, the Google Assistant is a close second in most categories.
But that’s not what sets Google apart in today’s world. Despite Google’s considerable hold over the voice search and voice assistant market, it is what happened at the latest Google I/O event that made the world gasp.
On May 8, 2018, Google’s CEO Sundar Pichai showed the world a sneak peek of Google’s latest development with its voice assistant, Google Duplex. The demo featured a recording of a real phone call that the Google Assistant made to a small business, and it didn’t stop there. Google Assistant went on to converse with a business representative to set up a salon appointment on behalf of its user, with absolutely zero manual intervention. What’s more, the Google Assistant intentionally added speech disfluencies like “umm” and “ah”, along with human-like intonations, making it sound almost indistinguishable from a human. Take a look for yourself if you haven’t watched it before.
Google Assistant will be able to make actual phone calls for you.
Posted by Circuit Breaker on Tuesday, May 8, 2018
If that isn’t one of the most impressive things we’ve seen tech do! (Also, great fodder for tech conspiracists to dole out the next “AI is taking over” story.)
And that’s not the only major advancement that Google has given the world. Google also presented Dialogflow, a platform that allows business owners to create text and voice-based conversational interfaces for their businesses, free of cost. It was also revealed that Google was working with third-party contributors and startups to make Google Assistant even more capable.
Google seems to be creating waves in the tech industry, and for good reason. But what does the future hold for the company and its progress with voice search?
The Future of Voice Search and Google
Google already has a steady hold on the smartphone, smart speaker, and voice-assisted computer/phablet markets. However, the one industry where voice search is used abundantly and Google isn’t yet a real player is the wearable tech segment.
Rumors of a Google smartwatch have been flying around for quite some time, but we can now (almost) definitely say that Google will soon produce wearable tech. How? It was recently announced that Google will acquire Fossil’s smartwatch tech for $40M. So, first things first, we know that’s going to happen soon.
Another vertical that Google might explore deeper is the automobile industry. Android Auto is already on the market, but the rise of automobile tech and connected car ecosystems will leave Google wanting to gain “auto”nomy (geddit?) in this segment, and do something much bigger. In fact, Google has talked about doing this since as early as 2014, but it seems to be taking them a while to present something solid on this front. However, considering how big a role voice search plays in controlling in-car systems, we’re predicting that it won’t be long before Google comes up with more tech for cars. Besides, one of the most compelling reasons to believe that Google will soon capture this market is its development of the self-driving car company Waymo.
Google will undoubtedly work on owning a bigger piece of the voice search pie with every passing day, and with their tech, it’s hardly going to be difficult. At this point, it’s just a matter of making sure they stay competitive and aggressive with their innovation. Businesses need to make sure that they capitalize on the functionality that Google provides on the voice search front, especially considering that comScore projects that 50% of all searches made in 2020 will be via voice. And as always, make sure that you maximize your chances of getting more visits to your business by optimizing it for voice search.
What do you think Google’s going to do next? Let us know in the comments below.