How We're Building AI Search Engines Using LLM Embeddings
A lot of people have asked us how they can leverage Large Language Models (LLMs) in their business applications.
One example we continue to see is search.
LLM-based search uses the model's native language comprehension to find matching content. With a traditional search function, the word you search for is the word you find: a search for "dog" yields results that contain "dog." An LLM's language comprehension means you can now match similar words and ideas, so a search for "dog" may also surface "man's best friend," not just the literal word "dog." This makes LLMs an excellent tool for search.
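The idea behind "search by meaning" is that each piece of text is turned into an embedding vector, and texts with similar meanings end up with similar vectors. Here is a minimal sketch of that comparison step; the 3-dimensional vectors are made up for illustration (real embedding models, like the sentence-transformers models used in the demo, produce vectors with hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Made-up toy embeddings; in a real system these come from an embedding model.
embeddings = {
    "dog":               [0.90, 0.80, 0.10],
    "man's best friend": [0.85, 0.75, 0.20],
    "spreadsheet":       [0.10, 0.20, 0.90],
}

# Rank every stored text against the query "dog" by similarity.
query = embeddings["dog"]
for text, vec in embeddings.items():
    print(text, round(cosine_similarity(query, vec), 2))
```

Even though "man's best friend" shares no words with "dog," its vector points in nearly the same direction, so it scores far higher than "spreadsheet." Ranking documents by this score is the core of embedding-based search.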
Our CTO, William Huster, put together an explainer video about how we built a prototype application that enables searching for job descriptions using an unstructured, English-language description of a job seeker.
We hope this example sparks ideas for how you could use LLMs for AI search or other functions within your business. If you're looking for a technical team to help integrate AI into your business, we would love to hear from you. Contact us here.
Without further ado, William has some 'splainin' to do 👇🏾
For the tech savvy among us, the code for this demo can be found here:
Jump to a specific topic:
00:00 Intro - Why Build an LLM-based Search Engine?
01:00 Demo of Searching Job Descriptions
01:46 What is an Embedding?
03:06 Search by Meaning, not Content
03:52 Search with Unstructured Data
05:10 How Search with Embeddings Works
06:01 Set Up Database, Data Models, and Data
08:33 Generating Embeddings for JDs
11:04 How the Search Code Works
12:05 Creative Ways to Use Search Results
12:37 Outro - Other Use Case Examples
13:40 Outro - Final Words
Technologies used in this demo:
Django
PostgreSQL + pgvector
Python sentence-transformers library
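To give a sense of how these pieces fit together, here is a hedged SQL sketch of what a pgvector-backed table and nearest-neighbor query might look like. The table and column names are assumptions for illustration, not the demo's actual schema:

```sql
-- Enable the pgvector extension in PostgreSQL.
CREATE EXTENSION IF NOT EXISTS vector;

-- Hypothetical table; the vector dimension must match the embedding model.
CREATE TABLE job_description (
    id        bigserial PRIMARY KEY,
    body      text NOT NULL,
    embedding vector(384)
);

-- Find the 5 job descriptions closest to a query embedding.
-- <-> is pgvector's Euclidean (L2) distance operator.
SELECT id, body
FROM job_description
ORDER BY embedding <-> '[0.12, -0.08, ...]'::vector
LIMIT 5;
```

In the demo's architecture, the application encodes the search text with sentence-transformers, then hands the resulting vector to a query like this so PostgreSQL does the similarity ranking.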
Links and Resources:
https://www.sbert.net/ - Sentence Transformers package for Python
Enjoy this article? Sign up for more CTO Insights delivered right to your inbox.