Deepgram, a company developing voice-recognition tech for the enterprise, today raised $47 million in new funding led by Madrona Venture Group with participation from Citi Ventures and Alkeon. An extension of Deepgram’s Series B that kicked off in February 2021, led by Tiger Global, it brings the startup’s total raised to $86 million, which CEO Scott Stephenson says is being put toward R&D in areas like emotion detection, intent recognition, summarization, topic detection, translation and redaction.
“We’re pleased that Deepgram achieved its highest-ever pre- and post-money valuation, even despite the challenging market conditions,” Stephenson told TechCrunch in an email interview. (Unfortunately, he wouldn’t reveal what exactly the valuation was.) “We believe that Deepgram is in a strong position to thrive in this tougher macroeconomic environment. Deepgram’s speech AI is the core enabling technology behind many of our customers’ applications, and the demand for speech understanding grows as companies seek greater efficiency.”
Launched in 2015, Deepgram focuses on building custom voice-recognition solutions for customers such as Spotify, Auth0 and even NASA. The company’s data scientists source, create, label and evaluate speech data to produce speech-recognition models that can understand brands and jargon, capture an array of languages and accents, and adapt to challenging audio environments. For example, for NASA, Deepgram built a model to transcribe communications between Mission Control and the International Space Station.
“Audio data is one of the world’s largest untapped data sources. [But] it’s difficult to use in its audio format because audio is an unstructured data type, and, therefore, can’t be mined for insights without further processing,” Stephenson said. “Deepgram takes unstructured audio data and structures it as text and metadata at high speeds and low costs designed for enterprise scale … [W]ith Deepgram, [companies] can send all their customer audio — hundreds of thousands or millions of hours — to be transcribed and analyzed.”
Where does the audio data to train Deepgram’s models come from? Stephenson was a bit coy there, although he didn’t deny that Deepgram uses customer data to improve its systems. He was quick to point out that the company complies with GDPR and lets users request that their data be deleted at any time.
“Deepgram’s models are primarily trained on data collected or generated by our data curation experts, alongside some anonymized data submitted by our users,” Stephenson said. “Training models on real-world data is a cornerstone of our product’s quality; it’s what allows machine learning systems like ours to produce human-like results. That said, we allow our users to opt out of having their anonymized data used for training if they so choose.”
Through Deepgram’s API, companies can build the platform into their tech stacks to enable voice-based automations and customer experiences. For organizations in heavily regulated sectors, like healthcare and government, Deepgram offers an on-premises deployment option that allows customers to manage and process data locally. (Worth noting: In-Q-Tel, the CIA’s strategic investment arm, has backed Deepgram in the past.)
Deepgram — a Y Combinator graduate founded by Stephenson and Noah Shutty, a University of Michigan physics graduate — competes with a number of vendors in a speech-recognition market that could be worth $48.8 billion by 2030, according to one (optimistic?) source. Tech giants like Nuance, Cisco, Google, Microsoft and Amazon offer real-time voice transcription and captioning services, as do startups like Otter, Speechmatics, Voicera and Verbit.
The tech has hurdles to overcome. According to a 2022 report by Speechmatics, 29% of execs have observed AI bias in voice technologies — specifically imbalances in the types of voices that are understood by speech recognition. But the demand is evidently strong enough to prop up the range of vendors out there; Stephenson claims that Deepgram’s gross margins are “in line with top-performing software businesses.”
That’s in contrast to the consumer voice-recognition market, which has taken a turn for the worse of late. Amazon’s Alexa division is reportedly on pace to lose $10 billion this year. And Google is rumored to be eyeing cuts to Google Assistant development in favor of more profitable projects.
Stephenson says that in recent months, Deepgram’s focus has been on on-the-fly language translation, sentiment analysis and split transcripts of multiway conversations. The company’s also scaling, now reaching over 300 customers and more than 15,000 users.
On the hunt for new business, Deepgram recently launched the Deepgram Startup Program, which offers $10 million in free speech-recognition credits on Deepgram’s platform to startups in the education and corporate sectors. Participating firms don’t need to pay any sort of fee and can use the credits in conjunction with existing grant, seed, incubator and accelerator benefits.
“Deepgram’s business continues to grow rapidly. As a foundational AI infrastructure company, we haven’t seen a reduction in demand for Deepgram,” Stephenson said. “In fact, we’ve watched businesses look for ways to cut costs and delegate repetitive, menial tasks to AIs — giving humans more time to pursue interesting, consequential work. Examples of this include reducing large cloud compute costs by switching big cloud transcription to Deepgram’s transcription product, or in new use cases like drive-thru ordering and triaging the first round of customer service responses.”
Deepgram currently has 146 employees distributed across offices in Ann Arbor and San Francisco. When asked about hiring plans for the rest of the year, Stephenson declined to answer — no doubt cognizant of the unpredictability of the current global economy and the optics of committing to a firm number.