DEAR CHROMA, WHAT THE FUCK WAS THAT?
I have a legal research system. 105,904 documents. Colorado statutes, case law, the works. It was beautiful. Until I tried to open the database from a different machine.
chromadb.errors.InvalidCollectionException
That’s it. No warning. No "this was written by a different version." Just an opaque error that means your data is now a brick.
The Toy Infrastructure Problem
We are living in an era where we've mistaken "cool demos" for "stable infrastructure." ChromaDB is a perfect example. Between versions 0.4 and 1.5, the HNSW binary format changed silently. There is no migration path. There is no "run this script to upgrade."
The official industry answer? "Just re-index it."
For roughly 100,000 documents, "just re-indexing" is 14 hours of compute. In the real database world (PostgreSQL, SQLite), this would be considered a catastrophic failure. In the vector DB world, it's just a Tuesday.
The Verdict
Vector DBs are currently toys. If a system doesn't have schema migrations and binary stability, it's not a database; it's a cache that thinks it's special.
We eventually survived this by building a decoupled embedding cache: a layer that keeps the embeddings outside the vector DB and treats the vector DB as the ephemeral consumer it actually is.
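For the curious, here is roughly what the idea of a decoupled embedding cache can look like. Treat it as a minimal sketch under my own assumptions, not the production system: the class name EmbeddingCache, the SQLite file name, the table layout, and the embed_fn callback are all hypothetical names made up for illustration. The point is only that the expensive part (the embeddings) lives in a boring, stable store keyed by a hash of the model name and the text, so a dead index costs you a rebuild, not a re-embed.

```python
import hashlib
import sqlite3
import struct
from typing import Callable, List, Optional, Sequence


class EmbeddingCache:
    """Durable embedding store keyed by content hash (hypothetical helper)."""

    def __init__(self, path: str = "embeddings.sqlite3") -> None:
        # SQLite is an assumption here; any store with a stable on-disk
        # format would serve the same purpose.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS embeddings ("
            " key TEXT PRIMARY KEY,"
            " dim INTEGER NOT NULL,"
            " vector BLOB NOT NULL)"
        )

    @staticmethod
    def key_for(model: str, text: str) -> str:
        # The model name is part of the key, so switching embedding models
        # never returns stale vectors.
        payload = f"{model}\x00{text}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get(self, model: str, text: str) -> Optional[List[float]]:
        row = self.conn.execute(
            "SELECT dim, vector FROM embeddings WHERE key = ?",
            (self.key_for(model, text),),
        ).fetchone()
        if row is None:
            return None
        dim, blob = row
        return list(struct.unpack(f"<{dim}d", blob))

    def put(self, model: str, text: str, vector: Sequence[float]) -> None:
        # Stored as little-endian float64 so the round trip is exact;
        # float32 would halve the size at the cost of precision.
        blob = struct.pack(f"<{len(vector)}d", *vector)
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO embeddings (key, dim, vector)"
                " VALUES (?, ?, ?)",
                (self.key_for(model, text), len(vector), blob),
            )

    def get_or_compute(
        self,
        model: str,
        text: str,
        embed_fn: Callable[[str], Sequence[float]],
    ) -> List[float]:
        # Only call the (slow, expensive) embedder on a cache miss.
        cached = self.get(model, text)
        if cached is not None:
            return cached
        vector = [float(x) for x in embed_fn(text)]
        self.put(model, text, vector)
        return vector
```

When the index breaks, you loop over your documents, call get_or_compute for each one, and hand the precomputed vectors to whatever vector DB you are using this week (with Chroma, that would presumably be the embeddings argument of collection.add). Rebuilding the index still takes time, but the embedding step is skipped for everything already in the cache.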
If you're actually trying to solve this version hell, see my Technical Guide: Surviving ChromaDB Versioning.
But don't let the guide fool you: we shouldn't have to build a database to protect ourselves from our database.