The Global Automatic Speech Recognition (ASR) Market is estimated to be worth US$ 6.26 Billion in 2019 and is poised to grow at a healthy CAGR of 19.18% to clock a market size of US$ 17.94 Billion in 2025.
The Global ASR Market is driven by surging demand for speech-based biometrics, increasing adoption of speech recognition in the consumer verticals, and technological advancements to launch voice-based product purchases.
- Definition / Scope
- Market Overview
- Market Risks
- Top Market Opportunities
- Market Drivers
- Market Restraints
- Industry Challenges
- Technology Trends
- Pricing Trends
- Regulatory Trends
- Other Key Market Trends
- Market Size and Forecast
- Market Outlook
- Technology Roadmap
- Distribution Chain Analysis
- Competitive Landscape
- Competitive Factors
- Key Market Players
- Strategic Conclusion
- Further Reading
Definition / Scope
Automatic Speech Recognition (ASR) is a technology designed to recognize and process the human voice. It helps distinguish an individual’s speech or voice and allows end-users to authenticate the identity of the person communicating within the system. It is software-based and relies on computer-programmed techniques.
This software-based technology also enables human-like conversation through an Artificial Intelligence (AI) or computer interface. There are two types of automated speech recognition software: directed dialogue conversations and natural language conversations, also referred to as natural language processing (NLP).
Directed Dialogue conversations are the simplified versions of ASR that are mostly used in the workplace, and consist of machine interfaces that verbally prompt end-users to respond using characters, words or digits from a limited list of choices, and in turn provide responses based on the defined request.
Automated telephone banking and other customer service interfaces are only a few examples that operate on directed dialogue ASR software.
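As a rough illustration of the directed dialogue idea described above, the following minimal sketch maps a recognized word to a canned response and re-prompts when the word is not on the limited list. The menu wording, the `MENU` structure, and the `respond` function are hypothetical and assume the speech has already been transcribed to text; they are not taken from any real banking system.

```python
# Hypothetical directed-dialogue menu: the prompt and the allowed
# responses are illustrative assumptions, not a real product's script.
MENU = {
    "prompt": "Say 'balance', 'transfer', or 'agent'.",
    "choices": {
        "balance": "Reading your account balance.",
        "transfer": "Starting a funds transfer.",
        "agent": "Connecting you to an agent.",
    },
}


def respond(transcript: str, menu: dict = MENU) -> str:
    """Map a transcribed utterance to a canned response, or re-prompt."""
    word = transcript.strip().lower()
    if word in menu["choices"]:
        return menu["choices"][word]
    # Anything outside the limited list of choices triggers a re-prompt.
    return "Sorry, I did not understand. " + menu["prompt"]


if __name__ == "__main__":
    print(respond("Balance"))
    print(respond("what is the weather"))
```

The restricted vocabulary is what keeps this kind of system simple: the recognizer only has to tell a handful of words apart, at the cost of rejecting anything open-ended.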
Natural Language Conversations / Natural Language Processing (NLP) are more sophisticated versions of ASR. NLP attempts to mimic real conversation by letting users speak in an open-ended format instead of restricting them to a limited menu of words. The iPhone’s Siri interface is a cutting-edge example of these systems.
Based on End-User, the global market is segmented into healthcare, IT and Telecommunications, automotive, BFSI, government, legal, retail, travel and hospitality, and others.
The healthcare industry is expected to hold the highest share of revenue in the global market by end-user. Automatic speech recognition technologies are actively implemented in modern healthcare systems.
Developments in speech recognition systems, from data creation to information collection and writing, have proven helpful to many doctors. Speech recognition facilitates the collection of data from electronic health record systems.
This procedure helps doctors to interact with the system by saying a few words.
By deployment, the global automatic speech recognition industry is bifurcated into the cloud and on-premises.
The cloud-based Automatic Speech Recognition segment is expected to dominate the global market during the forecast period. Cloud-based ASR software is an advanced tool that helps businesses grow their customer base and offer readily available assistance to customers.
Cloud-based software offers multilingual audio segmentation, language identification, speech recognition, and speech digitization technologies to convert raw audio and audiovisual data into the required format.
Cloud-based applications from Google, Microsoft’s Cortana, IBM’s Watson and Apple’s Siri are among the examples of Automatic Speech Recognition software systems based on highly advanced artificial intelligence and machine learning algorithms.
Cloud-based Automatic Speech Recognition software also offers real-time dictation and speech sample recognition.
Geographically, the global Automatic Speech Recognition market has been analyzed across five major regions: North America, Europe, Asia Pacific, the Middle East & Africa, and Latin America.
North America generated revenue of US$ 2.31 Billion in the year 2019 and is expected to augment the global Automatic Speech recognition market during the forecast period. Companies in the US have predicted that in the coming years one-third of the U.S. population will be using a voice assistant monthly.
Increasing adoption of advanced technologies such as AI and IoT, and the large-scale deployment of smart devices, are expected to propel the US market.
The presence of major vendors such as AT&T and Raytheon BBN Technologies, together with technologically advanced economies such as the U.S. and Canada, is expected to propel North America’s market growth in speech and voice recognition.
Changing lifestyles and increasing interest in new technologies are promoting the market for Automatic Speech recognition in countries such as the United States and Canada.
During the forecast period, the Asia Pacific market is expected to experience significant growth. The presence of several automakers would boost demand for speech and voice recognition systems.
Rapid growth in the use of mobile and cloud technology, supported by densely populated economies such as China and India, has resulted in unprecedented advances in computing technology.
The Global Automatic Speech Recognition market happens to be highly fragmented at present. Several small and medium-sized vendors are attempting to strengthen their business presence by offering affordable aftermarket solutions.
As a result, the market has begun to experience intense competition between OE fitted systems and aftermarket systems providers.
One of the salient and understudied features of the ASR market is that users can move between platforms with relatively low friction or be present on several applications at the same time.
For instance, multiple platforms provide their services through the same ASR, such as Amazon Alexa or Google Assistant, which leaves the platforms undifferentiated and hence affects their business.
This feature of the market has created an intense competition among platforms, thereby affecting the profit margins of the enterprises.
Top Market Opportunities
Emerging trend of Artificial Intelligence (AI)
Artificial Intelligence (AI) is rapidly becoming the next technical breakthrough that will fundamentally change society. The technology allows connected devices to behave smartly, thus offering unparalleled user experiences. As a result, AI-powered devices are expected to continue to dominate innovation in the market for voice-enabled applications in the coming years.
Technological developments in the field of artificial intelligence are creating hyper-personalized customer experiences, thus driving the growth of voice-activated devices.
AI-based technology is widely accepted, especially in the education, BFSI, hospitality, automotive, e-commerce, healthcare, tourism, and enterprise industries around the world, offering lucrative opportunities for the AI-enabled Automatic Speech recognition technology industry.
The demand for automated speech recognition is expected to also lead to the growth of speech analytics. Speech Analytics, popularly known as audio mining, is used to draw logical inferences from the recorded words. Analysis of voice-based important business-related materials is expected to give rise to better strategic and operational decision making.
Surging demand for speech-based biometrics
The growing demand for speech-based biometrics for identification purposes is recognized as a primary driver for this market. Owing to the growing incidence of fraud linked to text passwords, the industry is experiencing tremendous demand for speech recognition applications.
Regular letter- or digit-based passwords are easy to memorize but also easy to crack, which makes them a security threat. Biometric passwords, which substitute a person’s voice for symbols, are difficult to reproduce. A person’s voice acts as the password, which allows identification and authentication.
Owing to the security benefits mentioned above, ASR apps support the authentication and enrollment of clients while enhancing customer service. This approach also eliminates keyboard use, as text would no longer be used for passwords. Thus, these applications are expected to drive consumer demand for automatic speech recognition apps, as they improve the performance, response time and accuracy of security systems.
Increasing adoption of speech recognition in the consumer verticals
During the forecast period, increased demand for intelligent virtual assistant (IVA) smart speakers with voice capabilities is expected to be the prominent driver of the Automatic Speech Recognition market in the consumer vertical. In the last two years, IVA smart speakers such as Amazon Echo, Google Home, and Apple HomePod have seen strong growth in North America and Europe.
In addition, there is expected to be substantial growth in Automatic Speech recognition in the personal robotics market, such as robotic pets, cleaning robots, and robot companions. All these factors together are driving speech recognition market growth.
Technological advancements to launch voice-based product purchases
As technological progress extends the application areas of speech- and voice-enabled technologies, the acceptance and adoption of connected devices for voice-based transactions also offers promising growth aspects for voice-recognition tools.
These technologies offer voice commands as an alternative to the keyboard and mouse for searching, ordering and purchasing items online.
The increasing use of voice assistants, particularly for the online purchase of products such as grocery, apparel, homecare, and electronic appliances, as well as for ordering a meal, playing music, shopping, and navigation, among others, also helps the overall growth of the market for speech recognition.
Leading retailers such as Walmart, Home Depot and Target have collaborated with Google, a leading technology company to offer voice-based shopping experience through Google Express, a shopping app, to their customers.
According to voicebot.ai’s 2018 Voice Shopping Consumer Adoption Study, voice commerce is poised to become the 3rd significant online shopping platform to overtake web and mobile.
Consequently, the increasingly growing demand for connected devices, particularly for voice-enabled transactions, drives the demand for Automatic Speech recognition technologies.
Lack of Accuracy in harsh environments
Speech recognition occupies a prominent place in communication between humans and machines. Various factors influence the accuracy of a speech recognition system, such as environmental conditions, recording devices, prosodic variations, speech variations, and others.
Therefore, it is very important to record speech with a proper recording system in a quiet environment to reduce the error rate. Background noise may have a major effect on the accuracy of recognition.
Inaccuracy in ASR systems is one of the main obstacles the voice-based biometrics industry faces. Reduced level of precision due to surrounding noise is a major downside for highly sensitive voice recognition applications. The hassle of ASR systems being highly sensitive poses a serious challenge in accepting such sensitive applications.
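The report does not say how the error rate is measured. One widely used metric for ASR accuracy is word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the recognized text into the reference text, divided by the number of reference words. The sketch below is a minimal illustration under that assumption; the function name and the sample sentences are made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative transcripts: one substituted word and one dropped word.
    clean = "transfer fifty dollars to savings"
    noisy = "transfer fifteen dollars savings"
    print(word_error_rate(clean, noisy))
```

Comparing WER on the same utterances recorded in quiet and noisy conditions is one simple way to quantify the effect of background noise described above.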
Time and lack of efficiency
It is usually assumed that computerizing a process would speed it up. Unfortunately, when it comes to voice recognition systems this isn’t always the case. Using a voice app in certain cases takes longer than going with a standard text-based version.
This is mostly because of the complex human voice patterns that VUIs are still learning to adapt to. Hence, users often need to adjust by slowing down or being more precise than normal in their pronunciation.
Privacy and data security
For a voice assistant to be able to learn, data inputs are needed. These can be generated by paid research or studies, which is a rather limiting approach, especially when contrasted with the near-endless amount of data produced by the regular use of voice systems.
But the use of this data must be subject to well-placed scrutiny, as the idea of having all of their voice inputs recorded does not sit well with many people.
Most notably for large corporations that manage these data sets while looking to make a profit, keeping user data secure can quickly become a conflict of interests. A major challenge for voice recognition therefore lies in making data input accessible for AI while still acknowledging the need for data privacy and security.
Lack of supporting I.T infrastructure
Lack of supporting I.T infrastructure is hampering overall system growth as establishment of automatic speech recognition systems requires highly efficient I.T systems and maintenance.
The increasing cost of computers with high processing speed and the complexity of processing, along with the lack of proper training to recognize individual accents, are curbing market growth.
Adoption of artificial intelligence and Internet of Things
The adoption of artificial intelligence and of the Internet of Things drives the global market for speech recognition. The speech recognition based on artificial intelligence is widely used in the automotive and healthcare sectors.
On September 26, 2017, Fluent.ai Inc. released their voice recognition system based on artificial intelligence, which increases reliability and accuracy in recognition. This solution is essentially a broad-based voice assistant which includes minimizing concerns about privacy for the Internet of Things.
Increasing adoption of biometric security in the banking and finance sector is boosting demand for speech recognition technology which enhances customer experience in the automated phone system. It allows the user to interact more safely with the technologies.
The Price of Directed Dialogue ASR Systems depends on the type of application, size of the device, the end-use vertical and the volume of transactions. It is implemented in the form of a service such as IVR (Interactive Voice Response) which is priced on the basis of scale of application and time consumed.
The price of Natural Language Conversations / Natural Language Processing (NLP) systems depends on the hardware used and the device type.
The prices of voice assistants such as Google Home and Amazon Alexa are shown in the table below:
|Device|Google Home|Amazon Alexa|
|Price|US$ 129|US$ 99.99|
|Digital Assistant|Google Assistant|Amazon Alexa|
|Connectivity|Wi-Fi, Bluetooth|Wi-Fi, Bluetooth|
Many countries are beginning to regulate ASR functions and how businesses portray their ASRs to website visitors, users, and customers.
Regulation in USA
The Bot bill, which came into effect on July 1st, 2019, bans automated accounts from pretending to be real people in order to “incentivize a purchase or sale of goods or services in a commercial transaction or to influence a vote in an election.”
According to the bill, automated accounts will still be able to interact with users, but they will have to disclose that they are not, in fact, humans.
Regulation in Europe
Article 22 of the GDPR legislation that went into force in May 2018 dictates that it is illegal for bot builders to design and develop bots to serve as the primary source for a consumer approval decision process. For example, an ASR cannot approve a consumer for a loan.
The overarching objective of these new regulations is to increase the level of transparency between clients and users. This involves ensuring that end-users know they’re talking to a bot and not a real human being.
While developers, marketers, and bot builders may feel these new regulations are hampering business, these regulations are quite an opportunity to mature ASRs, which can enhance the overall experience of customers and users.
Despite recent legislation, ASRs are and will continue to be a great way to generate leads and establish connections with customers.
Other Key Market Trends
Artificial Intelligence (AI), Virtual Reality (VR), and Augmented Reality (AR) Solutions are expected to contribute significantly when reacting to the COVID-19 pandemic and resolving continuously evolving challenges. Owing to the outbreak of the epidemic, the existing situation will encourage pharmaceutical vendors and healthcare establishments to boost their R&D investments in AI, acting as a key technology to enable various initiatives.
The insurance sector is expected to face cost-efficiency pressures. Using AI can help to minimize operational costs while also increasing customer loyalty during the renewal process, claims and other services.
Market Size and Forecast
The Market Size of the Global Automatic Speech Recognition Market is valued at US$ 6.26 Billion in 2019 and is poised to grow at 19.18% in the forecast period (2019 – 2025) to reach US$ 17.94 Billion in 2025.
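The headline figures can be cross-checked with the standard compound annual growth rate relation, end = start × (1 + CAGR)^years. The sketch below assumes six compounding periods (2019 to 2025); the function names are illustrative and not part of the report.

```python
def project(start_value: float, rate: float, years: int) -> float:
    """Compound a starting value at a constant annual growth rate."""
    return start_value * (1 + rate) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    years = 2025 - 2019  # assumption: six compounding periods
    # Figures quoted in the report, in US$ billion.
    print(f"Projected 2025 size: {project(6.26, 0.1918, years):.2f}")
    print(f"Implied CAGR: {implied_cagr(6.26, 17.94, years):.2%}")
```

Under these assumptions the stated start value, end value and growth rate agree with each other to within rounding.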
Market Size based on End-Use
The Healthcare segment dominates the End-Use accounting for 23% of the market share and US$ 1.44 Billion in market size in 2019 and is projected to grow at a CAGR of 19.3% to reach US$ 4.15 Billion in 2025.
The IT & Telecommunications segment constitutes 20% of the market share, is estimated to be worth US$ 1.25 Billion in 2019, and is poised to grow at a CAGR of 18.9% to reach US$ 3.53 Billion in 2025.
BFSI segment constitutes 13% of the market share and US$ 813.8 Million in market size in 2019 and is expected to grow at a CAGR of 19.1% to reach US$ 2.32 Billion in 2025.
Automotive segment accounts for 15% of the market share and is poised to reach market size of US$ 2.67 Billion in 2025 from US$ 939 Million in 2019 growing at a CAGR of 19% in the forecast period (2019 – 2025).
Government segment accounts for 7% of the market share and market size of US$ 438.2 Million in 2019 and is expected to reach US$ 1.23 Billion in 2025 growing at 18.7% in the forecast period.
Retail segment constitutes 13% of the market share and US$ 813.8 Million in market size in 2019 and is expected to grow at a CAGR of 19.2% to reach US$ 2.33 Billion in 2025.
Travel and Hospitality segment accounts for 8% of the market share and is poised to reach market size of US$ 1.34 Billion in 2025 from US$ 500.8 Million in 2019 growing at a CAGR of 18.6% in the forecast period (2019 – 2025).
The others segment is witnessing significant growth rate of 19.2% and it accounted for market size of US$ 62.6 Million in 2019 and is expected to reach US$ 179.6 Million in 2025.
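The end-use figures above follow from applying each segment's share to the 2019 total and compounding at the segment's own CAGR. The sketch below reproduces that arithmetic as a consistency check. The shares and growth rates are the ones quoted in the text, except the 1% share for "Others", which is not stated and is derived here from US$ 62.6 Million out of US$ 6.26 Billion; the names `SEGMENTS` and `segment_sizes` are illustrative.

```python
TOTAL_2019 = 6.26  # US$ billion, from the report
YEARS = 6          # assumption: 2019 to 2025 is six compounding periods

# segment -> (share of 2019 market, CAGR); "Others" share is derived.
SEGMENTS = {
    "Healthcare": (0.23, 0.193),
    "IT & Telecommunications": (0.20, 0.189),
    "Automotive": (0.15, 0.190),
    "BFSI": (0.13, 0.191),
    "Retail": (0.13, 0.192),
    "Travel and Hospitality": (0.08, 0.186),
    "Government": (0.07, 0.187),
    "Others": (0.01, 0.192),
}


def segment_sizes(total: float = TOTAL_2019, years: int = YEARS) -> dict:
    """Return {segment: (2019 size, projected 2025 size)} in US$ billion."""
    sizes = {}
    for name, (share, cagr) in SEGMENTS.items():
        base = total * share
        sizes[name] = (base, base * (1 + cagr) ** years)
    return sizes


if __name__ == "__main__":
    total_share = sum(share for share, _ in SEGMENTS.values())
    print(f"Shares sum to {total_share:.0%}")
    for name, (base, future) in segment_sizes().items():
        print(f"{name}: {base:.2f} -> {future:.2f}")
```

Small differences from the published values are expected, since the report rounds both the shares and the growth rates.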
Market Size based on Deployment
The Cloud-based Deployment leads the market constituting 70% of the market share and generating about US$ 4.38 Billion in 2019 and is poised to grow at 19.2% to reach US$ 12.56 Billion in 2025.
On-Premise Deployment accounts for 30% of the market share and is expected to witness growth rate of 19% to reach market size of US$ 5.34 Billion in 2025 from US$ 1.88 Billion in 2019.
Market Size based on Region
North America dominates the Global ASR Market accounting for 37% of the market share and US$ 2.31 Billion in market size in 2019 and is projected to grow at a CAGR of 18.8% to reach US$ 6.49 Billion in 2025.
Europe constitutes 21% of the market share and US$ 1.31 Billion in market size in 2019 and is expected to witness growth of CAGR of 19% to reach US$ 3.72 Billion in 2025.
Asia Pacific is the fastest growing region and accounts for 25% of the market share. It is poised to reach a market size of US$ 4.52 Billion in 2025 from US$ 1.56 Billion in 2019, growing at a CAGR of 19.4% in the forecast period (2019 – 2025).
South America constitutes 10% of the market share, is estimated to be worth US$ 626 Million in 2019, and is poised to grow at a CAGR of 19.1% to reach US$ 1.79 Billion in 2025.
Middle-East & Africa accounts for 7% of the market share and market size of US$ 438 Million in 2019 and is expected to reach US$ 1.23 Billion in 2025 growing at 18.8% in the forecast period.
The Global Automatic Speech Recognition Market is estimated to be worth US$ 6.26 Billion in 2019 and is projected to grow at 19.18% to reach US$ 17.94 Billion in 2025.
The key factors driving the growth of the Global ASR Market include surging demand for speech-based biometrics, increasing adoption of speech recognition in the consumer verticals, and technological advancements to launch voice-based product purchases.
Deployment of speech recognition technologies in customer and retail verticals is expected to increase overall demand. Changing habits in many countries, including the United States, Germany and the United Kingdom, foster the deployment of voice and speech recognition software.
Growing adoption of smart electronics in India, China, Japan, and Brazil is also poised to fuel consumer vertical demand for voice and speech recognition software. The use of deep learning algorithms in voice and speech solutions to improve search results is expected to benefit the market.
Cloud-based solutions help companies minimize CAPEX and OPEX, helping them to reach a substantially high productivity level at reduced expense. Thus, cloud’s high market share is attributed to the cloud infrastructure’s ability to offer self-service apps at minimal expense.
Owing to the large number of voice biometric systems deployed to ensure a high level of security along with the proliferation of speech technologies in consumer electronics and enterprises, the Americas dominates the global automatic speech recognition market.
This region is at the forefront of the introduction of biometric systems to improve safety and security measures. The strong US economy is the main driving force in the development of the Americas’ market for speech and voice recognition.
Artificial intelligence (AI) is becoming more human with cognitive conversational abilities. What commenced as online ASRs has turned into essential AI-powered assistants for businesses, households, and users worldwide.
Conversational AI and ASR development services are set to become more agile, with an expected market size of USD 15.7 billion by 2025. From domain-specific solutions to intelligent retail ASRs, some of the most exciting Conversational AI trends for the year 2020 include:
Self-learning Conversational AI
Conversational AI is trained using rich data from consumers to increase the self-learning capabilities of a program. Self-learning ASRs will allow businesses to train models with customer data, product information, interactions on social media, and other useful data troves.
Conversational AI is expected to update its algorithms in 2020 for a deeper understanding of customers’ purpose and meaning. This lets users fulfill very specific requests, such as a restaurant table reservation, a haircut appointment, or a car service.
The Automatic Speech Recognition market is moving towards fragmentation as market leaders rely on product innovation and product development to gain an edge, while a growing number of local players creates intense market rivalry. Nuance Communications Inc., Auraya Systems Pty Ltd., Microsoft Corporation, Apple Inc., Alphabet Inc., and others are the key players.
Key Market Developments include
In June 2019, Nuance Communications announced the adoption of Nuance AI-powered Dragon Medical by Canada’s Northern Health to revolutionise care delivery across northern British Columbia. The Nuance AI-powered Dragon product will act as Northern Health’s wide-ranging, front-end speech recognition solution. To standardise clinical records, this will include quicker and more reliable reporting inside the Electronic Health Record (EHR).
In March 2019, Nuance Communications, Inc. announced that its Nuance Security Suite focused on biometrics saved companies more than US$ 1 billion in total fraud costs.
Major voice recognition market vendors are increasing their focus on research & development inventions, which has resulted in product introduction with enhanced accuracy and high integration capability.
Enterprises are also focusing on inorganic growth strategies such as Mergers and Acquisitions, agreements, partnerships, and collaborations to expand their market presence.
In August 2019, Amazon announced that Alexa customers in India could choose from over 30,000 skills offered by top brands and developers. Amazon helps leading brands and developers to build skills and voice services with Alexa for voice-based facilities. Amazon announced Alexa Voxcon in New Delhi.
Key Market Players
Nuance (US), Microsoft (US), Alphabet (US), IBM (US), Amazon (US), Sensory (US), Cantab Research (UK), iFlytek (China), Baidu (China), and Raytheon BBN Technologies (US) are a few major players in the Automatic Speech Recognition market.
Nuance Communications (US) is focused on agreements, partnerships, and collaborations to retain its global market position.
The company has a vast base of customers with long term contracts. Nuance Communications has invested heavily in R&D to expand its product portfolio and create a world-class portfolio of intellectual property, technology, software, and solutions through internal innovations and acquisitions.
It is expected to explore opportunities to expand its properties, geographic reach, distribution network, and customer base through other market and technology acquisitions. In the Health Sector,
Microsoft Corporation (US) is a technology company. The Company develops, licenses, and supports a range of software products, services and devices. The Company’s segments include Productivity and Business Processes, Intelligent Cloud and More Personal Computing. It also designs, manufactures, and sells devices, including personal computers (PCs), tablets, gaming and entertainment consoles, phones, other intelligent devices, and related accessories, that integrate with its cloud-based offerings.
Alphabet Inc. (US) is a holding company. The Company’s businesses include Google Inc. (Google) and Other Bets, such as Access, Calico, CapitalG, GV, Nest, Verily, Waymo and X.
The Company’s segments include Google and Other Bets. The Google segment includes its Internet products, such as Search, Ads, Commerce, Maps, YouTube, Google Cloud, Android, Chrome and Google Play, as well as its hardware initiatives. It offers Google Assistant, which allows users to type or talk with Google; Google Maps, which helps users navigate to a store, and Google Photos, which helps users store and organize all of their photos.
Amazon.com, Inc. (US) offers a range of products and services through its Websites. Its AWS products include analytics, Amazon Athena, Amazon CloudSearch, Amazon EMR, Amazon Elasticsearch Service, Amazon Kinesis, Amazon Managed Streaming for Apache Kafka, Amazon Redshift, Amazon QuickSight, AWS Data Pipeline, AWS Glue and AWS Lake Formation.
AWS solutions include machine learning, analytics and data lakes, Internet of Things, serverless computing, containers, enterprise applications, and storage.
Sensory, Inc. (US) is a Santa Clara-based company that develops and makes speech technologies on both hardware (Integrated Circuit – IC or “chip”) and software platforms for consumer products, offering IC and software-only solutions for speech recognition, speech synthesis, speaker verification, and music synthesis.
Sensory’s products are used in consumer electronics applications including mobile, automotive, Bluetooth devices, toys, and various home electronics.
Cantab Research Limited (UK) develops application software. The Company serves customers in the United Kingdom.
iFlytek (China) is a partially state-owned Chinese information technology company established in 1999. It creates voice recognition software and 10+ voice-based internet/mobile products covering the education, communication, music, and intelligent toys industries. State-owned enterprise China Mobile is the company’s largest shareholder.
Baidu (China) is a Chinese language Internet search provider. The Company offers a Chinese language search platform on its Baidu.com Website that enables users to find information online, including Webpages, news, images, documents and multimedia files, through links provided on its Website. Its transaction services include Baidu Nuomi, Baidu Takeout Delivery, Baidu Maps, Baidu Connect, Baidu Wallet and others. iQiyi is an online video platform with a content library that includes licensed movies, television series, cartoons, variety shows and other programs.
Raytheon BBN Technologies Corp (US) provides technology research and development activities. The Company offers a range of development services, products, and licensable intellectual property in the fields of information security, speech and language processing, networking, distributed systems, and sensing and control systems. Raytheon BBN Technologies serves customers worldwide.
The US$18-billion Automatic Speech Recognition industry has been forecast to grow at a rate of 19.2 percent from 2019 to 2025.
The technology has also found strong use in transcription applications in other industries, including among smaller and lesser-known companies. Medical professionals now use speech-to-text transcription systems such as Dolbey to create electronic medical records for patients.
Among the five leading technology companies offering speech and voice recognition capabilities (Google, Amazon, Microsoft, Apple and Facebook), similar capabilities include scheduling, reminders, managing playlists, connecting with retailers, managing emails, making food orders, and online searches.
The Global Automatic Speech Recognition Market is expected to witness significant growth driven by factors such as surging demand for speech-based biometrics, increasing adoption of speech recognition in the consumer verticals, and technological advancements to launch voice-based product purchases.
- ASR – Automatic Speech Recognition
- NLP – Natural Language Processing
- BFSI – Banking, financial services and insurance
- IVA – Intelligent Virtual Assistant
- VUI – Voice User Interface
- IVR – Interactive Voice Response
- GDPR – General Data Protection Regulation
- VR – Virtual Reality
- AR – Augmented Reality
- R&D – Research & Development