AI · VC-backed · Series B
San Francisco, USA · Founded 2022

Unstructured

Parse and prepare any document for LLMs

About Unstructured

Unstructured provides the data preprocessing pipeline for LLM applications. It extracts clean text from PDFs, Word documents, HTML, images, and tables, handling the messy document parsing that blocks most RAG deployments. Thousands of enterprises use it to feed clean data into their AI pipelines.

Company facts

Category: AI
Funding stage: Series B
Total raised: $65M
Founded: 2022
Team size: 51–200
HQ: San Francisco
Country: USA
Website: unstructured.io

Founders of Unstructured

  • Brian Raymond

Investors backing Unstructured

  • NEA
  • Menlo Ventures

Frequently asked questions

What does Unstructured do?

Unstructured provides the data preprocessing pipeline for LLM applications. It extracts clean text from PDFs, Word documents, HTML, images, and tables, handling the messy document parsing that blocks most RAG deployments. Thousands of enterprises use it to feed clean data into their AI pipelines.

Where is Unstructured based?

Unstructured is based in San Francisco, USA.

When was Unstructured founded?

Unstructured was founded in 2022.

How much has Unstructured raised?

Unstructured has raised $65M in total and is at the Series B stage.

Who founded Unstructured?

Unstructured was founded by Brian Raymond.

Who has invested in Unstructured?

Unstructured is backed by NEA and Menlo Ventures.
