Think about a global company where each department has its own unique jargon, and employees speak different languages. To thrive, this company would need a way to keep valuable information from getting lost in translation.
Similarly, businesses looking to take advantage of AI’s powerful computing resources need support in dealing with various file types: PDFs, Google Docs, Slack messages, scanned files. AI systems struggle to process these different formats, so Brian S. Raymond launched Unstructured in 2022 to serve as an efficient AI translator.
After working for another AI company from 2018 to 2022, Raymond saw the challenges firsthand. He recalls spending months to years working with the world’s largest companies, manually converting raw data from formats like PowerPoints and PDFs into a format suitable for algorithms.
“We would hard code these pipelines, and if a document layout changes a little bit … everything would break,” says Raymond, a former CIA intelligence officer, adding that “algorithms were beginning to be really powerful, but there was nothing to help on the data side of the equation.”
Two months after the company launched, Unstructured released an open-source platform that transforms complex unstructured data formats into AI-friendly JSON files. This prototype “caught on like wildfire,” Raymond says, and has been downloaded more than 8 million times in the past 12 months. But a JSON file is limited.
“If you’re just doing a proof of concept, that’s fine, you can do it one time and that works really well using our open source,” Raymond says. “But if you’re an organization like a large investment bank that produces maybe a quarter million new files every single day … then feeding them down to the language model, you can’t be doing that manually.”
Transforming massive volumes of data from a major company requires a more customized solution. Last year, Unstructured shifted its focus from offering only an open-source solution to providing a commercial one. This new solution can be deployed across an entire enterprise, allowing businesses to use their daily data with advanced AI systems.
Raymond is based in Loomis, but the company of 45 people is fully remote. For funding, the team met with about 50 investors and connected with Bain Capital Ventures, which led its $5 million seed round. In the past 22 months, the startup has raised a total of $65 million.
Ryan Lewis, a partner at SRI Ventures, joined Unstructured’s advisory board in 2022 for two reasons: the people and the core concept. He has known Raymond for almost a decade and believes the idea behind Unstructured offers a vital solution to a pressing problem faced by organizations both big and small: making sense of corporate data.
“It is, arguably, one of the most critical bottlenecks for an organization looking to adopt AI technology into its workflow,” Lewis says. “I’ve seen this both as a current investor and formerly as an operator.”
Previously, at one of Amazon Web Services’ AI businesses, Lewis saw firsthand that data preparation and curation for use in an AI model were among the most time-consuming aspects of almost every project. This is why, when he heard Raymond’s pitch, “it clicked within five seconds,” he says. The staggering number of downloads of the open-source solution speaks to its real value, Lewis adds.
“I’ve worked in software my entire career,” he says. “To see this happen so fast, it’s a testament that they’re attacking a problem and resolving it.”
Given the way the world is moving, Raymond says, organizations have an imperative to adopt AI models to help drive productivity. Much of that value comes when companies connect their data to those models. Most are in experimentation mode right now, he adds, working to move proofs of concept into production.
“We’re a critical piece of that puzzle,” Raymond says. “If you want to connect the data your humans are producing with those (foundation or large language) models, the first step is to get all of your data into a format that models can understand, and we’re the industry standard for that.”