Until recently, voice recognition software and voice-activated technology were perceived as substandard in comparison to human interaction. For a voice-activated function to work properly, users often had to state their task or intention several times, and even then they frequently failed to get the desired outcome.
Thanks to increasingly high-spec smartphones and other ‘smart’ devices, the capabilities and reliability of the technology have improved considerably, to the extent that voice-enabled computers are surely not far off.
What is voice recognition software?
Already used to help people with a variety of disabilities to communicate, voice recognition is an alternative to physically typing on a keyboard or touchpad. You simply talk to the computer or other device and your words appear on the screen. You can also program a device, where the functionality allows, to complete a task of your choice, such as ‘open blinds’ or ‘turn on the TV’.
Which devices are currently available?
Whilst fully voice-activated computers are not yet commonplace, devices are already on the market that are controlled by the user’s voice. Apple’s Siri is an ‘intelligent assistant’ that is built into all iPhones from the 4S onwards, together with newer iPad and iPod Touch models. Through Siri, the user can issue voice commands which operate the device and its apps.
Amazon Echo is a hands-free speaker which you control with your voice. It connects to Amazon’s Alexa voice service, through which you can issue commands such as ‘play music’, ‘what is the weather like?’ and ‘listen to the news’. You can also place orders on the Amazon site directly through Echo and Alexa.
Improved accuracy
The ‘word error rate’ measures the proportion of words a voice recognition program gets wrong. The lower the rate, the more accurate the software, and vice versa. Software giants such as Google, IBM and Microsoft all have specialist in-house engineers who work to improve the word error rate. Google’s word error rate is currently at 4.9%, meaning their software gets approximately one word in every 20 wrong. This is a relatively low error rate, which Google are improving all the time.
The aim is for voice recognition software to work perfectly even when a user is in a noisy area. At present, background noise can increase the word error rate, as the device struggles to differentiate between the user’s command and the other voices and sounds around it. As accuracy improves and errors decrease, the perception within the industry is that the desire to use the technology should increase accordingly.
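To make the figure concrete, here is a minimal sketch of how a word error rate can be calculated. It assumes the common definition (the number of word substitutions, deletions and insertions needed to turn the software's transcript into the correct one, divided by the number of words actually spoken). The function name and the sample phrases are illustrative only, and this is not a description of how Google or any other vendor measures its own figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    `reference` is what the user actually said; `hypothesis` is what the
    software transcribed. Both names are illustrative assumptions.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # a spoken word was dropped
                curr[j - 1] + 1,     # an extra word was inserted
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of four spoken words.
print(word_error_rate("turn on the tv", "turn on the pc"))  # 0.25
```

By the same calculation, one wrong word in a 20-word sentence gives a rate of 0.05, or 5%, which is why a 4.9% rate works out at roughly one error in every 20 words.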
Anticipating problems
The ultimate target in voice recognition software development is to reduce errors, giving the user a better outcome and greater ease of use. Companies are therefore trying to tackle known problems, such as missing context, synonyms and background noise, before they affect the user.
Microsoft are addressing these shortcomings through their Custom Speech Service, which enables a user to ‘train’ their voice recognition device to work efficiently, even when exposed to conditions that may interfere with its performance.
Voice recognition in retail
Voice recognition technology has ample potential within the retail sector. We all know how frustrating it is to be looking for our size on the rail, only to find it isn’t there. Rather than having to look for a member of staff, ask them for what you need then wait for them to check the stock room before coming back to you, wouldn’t it be so much easier to perform the whole process through voice recognition?
The software would be integrated with the stockroom hardware and the shelf POS, allowing the employee to make a request, such as enquiring whether a specific size, colour or style is available. The voice-activated system would then perform the check and confirm the result through an earpiece to the employee, who could relay it to the customer. In some instances, customers may even be able to perform their own checks, removing the need for interaction with a member of staff entirely.
Voice-activated functionality would be highly beneficial for customers, as it would allow them to resolve any queries about availability and pricing quickly and easily. Shop staff would also benefit, as they would be able to spend more time helping, advising and serving customers, rather than having to leave the shop floor regularly to search for stock. Customer conversion could also improve, as browsers are converted into purchasers through easier access to their desired product.
Voice recognition in law
The potential for voice recognition software is considerable within the police force, as it would reduce the time required to take statements and log paperwork, whilst potentially reducing errors within evidence logging.
In a trial with the Canadian police force, police officers reported that voice recognition technology reduced the time it took to log their reports by 85%, a considerable saving. Given these statistics, it’s surely only a matter of time before our own police force trials the functionality.
Analysts believe that voice recognition software is the future. Given the potential we have seen so far and the developments in the making, how long will it be until we are speaking to our computers, rather than typing on them?