Personal Knowledge Search

How to Create an AI Chatbot Using a Personal Knowledge Base in Just 20 Minutes
In the age of information, having instant access to knowledge is invaluable. What if you could build an AI chatbot that uses your personal knowledge base to provide accurate answers? The good news is, it's easier than you think. In this tutorial, we'll walk you through setting up an AI chatbot using Vercel and OpenAI in just 20 minutes.
Setting Up the Site
1. Get Started with Vercel
Begin by signing up on Vercel. This platform allows for seamless deployment of web projects.
2. Kickstart with a Template
Create a new project using the Next.js OpenAI Doc Search Starter template available on Vercel.
3. Integrate Supabase
Complete the integration process with Supabase, a powerful backend-as-a-service platform.
4. Pull the GitHub Repo
Clone the repository to your local machine to start the customization process.
Add Your API Keys
1. Set Up Environment Variables
Make a copy of the .env.example file and rename it to .env.
2. Input OpenAI Key
Fill in your OpenAI API key, which you can obtain from the OpenAI portal.
3. Configure Supabase Keys
Fetch your unique Supabase keys from your Supabase dashboard and input them into the .env file.
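Before moving on, it can help to confirm that every key is actually filled in. The sketch below is a small, hypothetical helper (it is not part of the template) that reports which keys are missing or blank. The key names are assumptions; use whatever names appear in your own .env.example.

```typescript
// Assumed key names; replace them with the ones listed in your .env.example.
const REQUIRED_KEYS: string[] = [
  "OPENAI_KEY",
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
  "SUPABASE_SERVICE_ROLE_KEY",
];

// Returns the required keys that are absent or blank in the given environment.
function missingEnvKeys(
  env: Record<string, string | undefined>,
  required: string[] = REQUIRED_KEYS,
): string[] {
  return required.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}

// Example with an inline object; in a Node script you could pass process.env instead.
const missingKeys = missingEnvKeys({
  OPENAI_KEY: "sk-example",
  NEXT_PUBLIC_SUPABASE_URL: "   ", // blank values count as missing
});
console.log("Missing keys:", missingKeys);
```

Taking the environment as a parameter keeps the helper easy to try out without touching your real keys.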
Modify the Repo to Suit Your Data
1. Update Embeddings Generation Code
Locate the generate embeddings code and, on line 330, change the type definition to: type Singular<T> = T extends any[] ? any : T.
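If the conditional type looks cryptic, the short sketch below shows what it resolves to. The alias names FromArray and FromScalar and the sample values are made up for illustration; only the Singular definition comes from the step above.

```typescript
// The type from the step above: array types collapse to `any`,
// every other type passes through unchanged.
type Singular<T> = T extends any[] ? any : T;

type FromArray = Singular<string[]>; // resolves to any
type FromScalar = Singular<number>; // resolves to number

// Because FromArray is `any`, the compiler accepts any value here.
const looseValue: FromArray = ["x"];

// FromScalar is still `number`, so only numbers are accepted.
const strictValue: FromScalar = 42;

console.log(looseValue, strictValue);
```

The trade-off is that `any` switches off type checking for values that come from array types, so mistakes there will only surface at runtime.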
2. Implement pdf2md
- Install the pdf2md tool using the command: pnpm i pdf2md.
- Add a script in package.json: "pdf2md": "npx @opendocsg/pdf2md --inputFolderPath=./pdfs --outputFolderPath=./pages/docs --recursive".
3. Store Knowledge Files
Create a folder named ./pdfs at the root level (the name must match the --inputFolderPath value in the pdf2md script) and add all your knowledge PDFs to it.
4. Convert PDFs to Markdown
Run the command: pnpm run pdf2md. Your data will now be converted and stored in the ./pages/docs directory as markdown files.
Creating Embeddings from .md Files
1. Generate Embeddings
Execute the command: pnpm run embeddings to generate vector embeddings from the markdown files.
2. Verify Embeddings Upload
Visit your Supabase dashboard and ensure that the embeddings have been successfully uploaded to your database.
Enhance Vector Search
1. Adjust Vector Search Criteria
The application sends a vector embedding of our text to Supabase's vector database to query similar texts. Modify the criteria on line 85 as shown in the outline for optimal results.
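To build intuition for what the search criteria control, here is a minimal in-memory sketch of similarity matching. It is not the template's code: the real query runs inside Supabase, and the names matchThreshold and matchCount, along with the choice of cosine similarity, are assumptions made for illustration.

```typescript
interface Section {
  content: string;
  embedding: number[];
}

// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keeps sections scoring at least `matchThreshold`, best first, capped at `matchCount`.
function matchSections(
  query: number[],
  sections: Section[],
  matchThreshold: number,
  matchCount: number,
): Section[] {
  return sections
    .map((section) => ({ section, score: cosineSimilarity(query, section.embedding) }))
    .filter(({ score }) => score >= matchThreshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, matchCount)
    .map(({ section }) => section);
}

// Toy two-dimensional embeddings; real embeddings have many more dimensions.
const sampleSections: Section[] = [
  { content: "unrelated", embedding: [0, 1] },
  { content: "close", embedding: [0.9, 0.1] },
  { content: "exact", embedding: [1, 0] },
];
const matched = matchSections([1, 0], sampleSections, 0.5, 2);
console.log(matched.map((s) => s.content));
```

Lowering the threshold returns more, but looser, matches; raising the count passes more context to the model at the cost of a longer prompt.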
2. Generalize Content Summarization
Update the pages/api/vector-search function for broader content summarization. Follow the changes mentioned in the outline to adjust the GPT-3.5 prompt and usage.
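As a rough illustration of what a more general prompt might look like, the sketch below assembles one from the matched sections. The function name buildPrompt, the wording of the instructions, and the character budget are all hypothetical; they stand in for the outline's changes rather than reproduce them. A character count is used here as a simple substitute for token counting.

```typescript
// Builds a general-purpose prompt from matched sections, stopping before the
// accumulated context would exceed `maxContextChars` characters.
function buildPrompt(
  question: string,
  sections: string[],
  maxContextChars: number = 6000,
): string {
  let context = "";
  for (const section of sections) {
    const next = section.trim() + "\n---\n";
    if (context.length + next.length > maxContextChars) break;
    context += next;
  }

  return [
    "You are a helpful assistant. Answer the question using only the context sections below.",
    "If the answer is not in the context, reply: Sorry, I don't know.",
    "",
    "Context sections:",
    context,
    "Question:",
    question.trim(),
  ].join("\n");
}

// The tiny budget here is only to show that sections beyond the limit are dropped.
const samplePrompt = buildPrompt("What is X?", ["aaa", "bbb"], 10);
console.log(samplePrompt);
```

Because sections are added in order, passing them sorted by similarity means the most relevant ones survive when the budget runs out.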
Deploying on Vercel
1. Commit and Push Changes
After making all the necessary changes, commit your code and push it to GitHub.
2. Deploy Your Chatbot
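Head over to Vercel, which normally redeploys a Git-connected project on each push. Once the deployment finishes, open your project's domain and type a question into the search box. Voilà! Your AI chatbot is now live and ready to answer queries.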
Optimizations for the Future
To further enhance the performance and efficiency of your chatbot:
- Consider breaking up the .md files into smaller chunks before embedding.
- Clean the markdown files by removing redundant or irrelevant information.
- Stay tuned for more updates and optimization techniques.
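The chunking idea from the list above can be sketched as follows. This is one simple approach under stated assumptions, not the template's behavior: the function name chunkMarkdown and the default limit of 1000 characters are hypothetical, and splitting happens only at blank lines.

```typescript
// Groups paragraphs into chunks of up to `maxChars` characters, breaking only
// at blank lines so paragraphs stay intact. A single paragraph longer than the
// limit is kept whole as its own oversized chunk rather than cut mid-sentence.
function chunkMarkdown(markdown: string, maxChars: number = 1000): string[] {
  const paragraphs = markdown
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";
  for (const paragraph of paragraphs) {
    const candidate = current === "" ? paragraph : current + "\n\n" + paragraph;
    if (current === "" || candidate.length <= maxChars) {
      current = candidate;
    } else {
      chunks.push(current);
      current = paragraph;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}

// Extra blank lines are collapsed, which also removes a little noise.
const sampleChunks = chunkMarkdown("# Title\n\nalpha\n\n\n\nbeta gamma", 14);
console.log(sampleChunks);
```

Smaller chunks tend to make each embedding more focused, but chunks that are too small can lose the surrounding context a good answer needs, so the limit is worth experimenting with.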
Conclusion
Creating a personalized AI chatbot has never been this straightforward. With the right tools and a bit of coding, you can have a chatbot ready to serve your knowledge in no time. Dive in and give it a try!
