Vector embeddings are numerical representations of words and ideas in the form of a vector (a list of numbers), and they are core to how large language models work. While most people use large language models directly through ChatGPT, you may have heard of semantic search, vector search, or embeddings, all of which refer to using these vectors directly.
A common use case today is using a mathematical function called cosine similarity, which compares two embeddings, to get “the most relevant results”. This is used for simple recommendation systems, but also in what’s often referred to as the “retrieval” method: for example, pulling the most relevant context from a large PDF to answer a user’s question.
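To make the idea concrete, here is a minimal sketch of cosine similarity and a "most relevant results" lookup. The three-dimensional vectors and document names are made up for illustration; real embeddings (ada-002, for instance) have many more dimensions and would come from an embedding model rather than being typed by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_relevant(query, documents, top_k=2):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real embeddings (hypothetical values).
documents = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.2],
    "company history": [0.0, 0.2, 0.9],
}
# Pretend this is the embedding of "how do I get my money back?"
query = [0.8, 0.3, 0.1]

results = most_relevant(query, documents)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

In a retrieval setup, the documents would be chunks of the PDF, and the top-scoring chunks would be handed to the language model as context.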
For tasks like classification, using embeddings can be more efficient than using a large language model. I recently ran a test comparing classification via embeddings (ada-002) and large language models (gpt-3.5) and found that embeddings were 2x faster and 20x cheaper, with similar output. You can also cluster embeddings without specifying the “buckets” (e.g. genres) in advance. Here’s an example where I clustered the ~200 YC companies of this batch; it’s an interesting way to look at data.
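One simple way to classify with embeddings is to embed each label, embed the item, and pick the label whose embedding is closest by cosine similarity. The sketch below assumes that approach (it is not necessarily the setup used in the test described above), and the vectors are toy values made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(item_vec, label_vecs):
    """Return the label whose vector is most similar to item_vec."""
    return max(label_vecs, key=lambda label: cosine_similarity(item_vec, label_vecs[label]))


# Hypothetical label embeddings (in practice, embed the label names or descriptions).
label_vecs = {
    "sports": [1.0, 0.1, 0.0],
    "finance": [0.0, 1.0, 0.2],
    "cooking": [0.1, 0.0, 1.0],
}

# Hypothetical item embeddings.
items = {
    "Late goal wins the cup final": [0.9, 0.2, 0.1],
    "Central bank raises rates": [0.1, 0.8, 0.1],
    "Slow-roasted tomato sauce": [0.0, 0.1, 0.9],
}

predictions = {text: classify(vec, label_vecs) for text, vec in items.items()}
for text, label in predictions.items():
    print(f"{text} -> {label}")
```

The only model calls needed here are the embedding calls for the labels (once) and for each item; the comparison itself is plain arithmetic.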
The cluster names above are a bit weird because they come from a new tool called wikivec2text, a fine-tuned model that turns embedding data (ada-002) back into text. The output still tends to be a bit odd, but it is useful for experimenting with semantic math.
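For the clustering step itself, the post does not say which algorithm was used. As one common choice, here is a minimal k-means sketch in plain Python; the two-dimensional vectors and company descriptions are hypothetical, and the naive "first k points" initialization is only there to keep the example deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def kmeans(vectors, k, iterations=10):
    """Return a cluster index for each vector (naive init: first k points)."""
    centroids = [list(v) for v in vectors[:k]]
    assignments = []
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        assignments = [
            min(range(k), key=lambda i: squared_distance(v, centroids[i]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for i in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == i]
            if members:
                centroids[i] = mean_vector(members)
    return assignments


# Hypothetical company embeddings (toy 2-D values, not real data).
companies = {
    "payments API": [1.0, 0.1],
    "drug discovery": [0.1, 1.0],
    "invoice automation": [0.9, 0.2],
    "lab robotics": [0.2, 0.9],
}

labels = kmeans(list(companies.values()), k=2)
for name, label in zip(companies, labels):
    print(f"cluster {label}: {name}")
```

Note that k-means only needs the number of clusters, not their names, which is why a separate step is needed to turn each cluster into a readable label.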
Semantic math is the concept of applying math directly to embeddings. A famous example is taking the embedding for “king”, subtracting “man”, and adding “woman” to get a result close to “queen”. Recently, I took this a step further by experimenting with sentiment morphing, temporal shift, character development, data enrichment and more. Check out the experiments here.
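The king/queen example can be sketched with hand-made vectors. The four dimensions below (roughly "royalty", "male", "female", "fruit") and every value are invented so the arithmetic works out cleanly; with real embeddings the result only lands near "queen" rather than exactly on it. Excluding the input words from the search is a common convention, assumed here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy "embeddings": [royalty, male, female, fruit].
vocab = {
    "king": [1.0, 1.0, 0.0, 0.0],
    "man": [0.0, 1.0, 0.0, 0.0],
    "woman": [0.0, 0.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0, 0.0],
    "prince": [0.8, 1.0, 0.0, 0.0],
    "apple": [0.0, 0.0, 0.0, 1.0],
}

# king - man + woman, computed component by component.
target = [k - m + w for k, m, w in zip(vocab["king"], vocab["man"], vocab["woman"])]

# Search the vocabulary for the closest word, skipping the inputs.
inputs = {"king", "man", "woman"}
candidates = {word: vec for word, vec in vocab.items() if word not in inputs}
nearest = max(candidates, key=lambda word: cosine_similarity(target, candidates[word]))
print(nearest)
```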