How I Vibe Coded an AI Assistant for My Blog in One Weekend


Liz Saunders of Type A Circle recently conducted an office hours session on vibe coding. She walked through the concepts clearly, showed what was possible, and by the end I was inspired to vibe code an AI assistant for myself. I also had a practical reason: I wanted a way for readers to get answers from this blog without having to read every post to find them. While I have spent 25 years working with technologies to support my work in machine learning and AI, I have no front-end web development experience. I started mid-day Saturday, and by Sunday evening I had a working AI assistant embedded in this blog, running on about five dollars of cloud services using open source tools. I want to tell you what was involved, because the parts that were challenging were not what I expected.

There is a common assumption that vibe coding with AI requires a technical background. It does not require nearly as much as you would think. What it requires is being able to describe what you want clearly and keep going when something breaks. While I went into this weekend knowing the concepts behind what I was building, the code was written by a large language model, or LLM, one prompt at a time.

The assistant is live right now, at the bottom right corner of this page. Ask it something about machine learning and it will answer by referencing the specific content on this blog, drawing not on general LLM knowledge but on the posts you have been reading. A general AI assistant will answer from everything it has ever been trained on. That is useful, but it is not the same as getting an answer based on a specific corpus you are working through. When you ask this assistant something, it answers from the same material you have been reading, in the same voice, building on the same foundations. That continuity is what makes it a learning tool rather than just a search engine.

How the Assistant Works

The technique is called RAG, which stands for Retrieval Augmented Generation. Before getting into the mechanics, it helps to understand the problem it solves. A standard LLM like ChatGPT has broad knowledge, but it does not know what is on this blog. If you ask it about my posts, it will either guess or tell you it does not have access to my content. RAG solves this by giving the model access to a specific set of documents before it answers. The model does not memorize those documents. It retrieves the relevant ones on demand and reads them before responding.

Before the assistant can search anything, every post on this blog gets split into smaller pieces called chunks. A full post on supervised learning might cover five different ideas. If you ask about one of them, retrieving the entire post would bury the relevant part in everything else. Splitting posts into chunks means the assistant can find the specific section that matches your question rather than pulling in content that has nothing to do with it.
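To make the idea concrete, here is a minimal sketch of chunking. The paragraph-based splitting and the word budget here are my own illustrative assumptions, not necessarily the exact rules my ingestion script uses:

```python
def chunk_post(text, max_words=200):
    """Split a post into chunks of roughly max_words, breaking on paragraphs."""
    chunks, current, count = [], [], 0
    for para in text.split("\n\n"):
        words = len(para.split())
        # Start a new chunk when adding this paragraph would blow the budget.
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Because the splits land on paragraph boundaries, each chunk stays a readable unit of one or two ideas rather than a sentence cut off mid-thought.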

Each chunk then gets converted into a vector, also known as a vector embedding. A vector, in this context, is a set of coordinates that describes the meaning of a piece of text mathematically. The useful property of vectors is that similar meanings end up with similar coordinates. If you ask “what is supervised learning” and one of my posts talks about “training a model on labeled data,” those two phrases will have coordinates close to each other even though they share almost no words. An embedding model handles this conversion: its only job is to turn text into vectors, consistently, so that similar meanings cluster near each other in mathematical space. Each chunk gets its own vector, stored in a vector database. When you ask a question, the assistant converts it into a vector and finds the chunks whose vectors are closest. Those are the ones it reads before responding.
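“Closest” is usually measured with cosine similarity. The three-dimensional vectors below are made up for illustration (real embedding models produce hundreds or thousands of dimensions), but the retrieval step really is this simple:

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 for identical directions, near 0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend vectors for three chunks; real ones come from an embedding model.
chunks = {
    "training a model on labeled data": [0.9, 0.1, 0.2],
    "how to bake sourdough bread":      [0.1, 0.9, 0.3],
    "evaluating model accuracy":        [0.8, 0.2, 0.4],
}

question_vector = [0.85, 0.15, 0.25]  # stand-in for the embedded question

# Retrieval: find the chunk whose vector is closest to the question's.
best = max(chunks, key=lambda text: cosine(question_vector, chunks[text]))
print(best)  # → training a model on labeled data
```

The sourdough chunk scores low against this question even though cosine similarity never looks at the words themselves; the embedding model has already encoded the meaning into the coordinates.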

What I Used to Build It

I did not go into this weekend knowing which tools to use. I described what I was trying to build to Claude and it recommended all four. That is part of what vibe coding means in practice: you do not need to know the landscape of available tools, you just need to be able to describe the problem clearly. I already had a GitHub account. The other tools, described below, took under five minutes each to set up, with Claude walking me through every step.

Pinecone stores the vectors. It is a service built specifically for searching by meaning rather than by exact match, and it holds the vector for every chunk of this blog. Vercel hosts the query function, the piece of code that runs when someone asks a question: it receives the question, runs the vector search, calls Claude, and returns an answer. It runs only when called, with no server to manage. GitHub holds the code and connects to Vercel automatically, so when I push a change, Vercel picks it up and redeploys. OpenAI provides the embedding model, which converts text into vectors. I used Claude as my coding assistant throughout, but the embedding model is a separate tool with a separate job: it does not generate text, it only converts it into vectors, and OpenAI’s is widely used for this purpose and fit the budget.
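To show the shape of that query function, here is a sketch with stand-in functions where the real OpenAI, Pinecone, and Claude calls would go. The function names, the prompt wording, and the stand-ins are all illustrative assumptions, not my actual deployed code:

```python
def answer_question(question, embed, search, generate, top_k=3):
    """The query function's shape: embed the question, retrieve chunks, generate."""
    question_vector = embed(question)        # OpenAI embedding call in the real version
    chunks = search(question_vector, top_k)  # Pinecone query in the real version
    prompt = (
        "Answer using only these blog excerpts:\n\n"
        + "\n---\n".join(chunks)
        + "\n\nQuestion: " + question
    )
    return generate(prompt)                  # Claude call in the real version

# Stand-ins so the sketch runs without any API keys; each mimics only the shape.
fake_embed = lambda text: [float(len(text))]
fake_search = lambda vector, k: ["Supervised learning trains a model on labeled data."]
fake_generate = lambda prompt: prompt  # echoes the prompt the LLM would receive

print(answer_question("What is supervised learning?",
                      fake_embed, fake_search, fake_generate))
```

The useful thing to notice is that the LLM never sees the whole blog, only the few retrieved excerpts stitched into the prompt, which is what keeps the answers grounded in the posts.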

The chat widget itself is about 80 lines of code pasted into my WordPress footer. There was no plugin or framework required. It calls the Vercel function, gets a response, and displays it.

One limitation worth knowing about: Vercel’s free tier only retains logs for 30 minutes, which meant I had no way to see what people were asking beyond that threshold. Rather than upgrading to a paid tier, Claude walked me through persisting every question to a Google Sheet instead. I can now see every question the assistant has been asked, which turns out to be useful for understanding what my readers want to know. The most common questions so far are about how to get started with machine learning and how to evaluate whether a model is actually working. Both of those are now on my list to write about.

What Took a Few Hours

Three separate configuration problems ate most of the time. My hosting provider’s security firewall blocked my script from fetching posts through the WordPress API; the fix was to export the posts as an XML file directly from WordPress and read that instead. JavaScript has two module systems, CommonJS and ES modules, whose import syntaxes are not compatible with each other, and mixing them silently breaks things; once I made everything consistent, the error disappeared. Finally, Python had two versions installed on my machine and they were conflicting, so I uninstalled one.

There were also moments when Claude and I went in circles, with each fix introducing a new error. The way out was to stop, restate what I was actually trying to accomplish, and ask for a different approach rather than another patch. That skill, knowing when to reframe rather than push through, turned out to be more useful than any technical knowledge I brought in.

Every hour of confusion I had this weekend came from configuration problems, not gaps in ML knowledge or web development. I know virtually nothing about networking or front-end web development, and it did not matter. When something broke, I described the error to Claude and it told me what was happening and how to fix it. The concepts behind what I was building were never the obstacle.

One note: LLMs hallucinate. If a set of instructions does not make sense or keeps leading you in circles, push back and ask the LLM to try a different approach. That happened more than once this weekend, and reframing the problem was always more useful than continuing to follow instructions that were not working.

The Full Walkthrough

If you want the complete step-by-step, every command, every error, every fix, the technical guide is linked below. It covers everything from setting up the accounts to pasting the widget into WordPress, with no experience assumed, because I did not have any. Or you can just vibe code it yourself using your favorite LLM!

Technical Guide to Creating an AI Assistant for WordPress

The assistant is on every page of this site. Ask it something! I built an AI assistant for an AI blog using AI. I find that satisfying in a way that’s hard to explain and probably doesn’t need to be.

