MVP Demo
I’ve spent the last few weeks building something that started as a personal challenge – to really understand how AI products work by building one myself. It’s part passion project, part crash course in delivery, engineering, and design, and every step has been about learning by doing.
I decided to focus on small businesses – the people who often see AI as something distant or overly complex, even though they stand to benefit the most. My aim was to create something practical: an assistant that could manage bookings, answer customer questions, gauge sentiment, and forecast demand in a way that feels clear and approachable.
A lot of AI tools in this space are powerful but inaccessible – too fragmented, too technical, or too expensive to adopt. I wanted to prove that with the right delivery structure and mindset, it’s possible to build something trustworthy, explainable, and genuinely useful without a large team or enterprise budget.
How I Set It Up
I approached this the same way I would any structured delivery programme. I mapped out a 24-sprint roadmap, wrote Architecture Decision Records to capture the rationale behind key choices, and set up continuous integration so every change was linted, formatted, and tested before merging.
The tech stack is modern but pragmatic: FastAPI for the backend, Postgres for data, React and Tailwind for the frontend, and AI components that are powerful enough to add real value yet simple enough to explain clearly to anyone.
Phase 1 was about building foundations – a working product loop before any polish. That loop is: book an appointment, answer a question, capture feedback, forecast demand. Everything else builds on that cycle.
Bookings – the Backbone
I started with bookings because they anchor everything else. I defined a clean data model in Postgres for customer details, start and end times, status, and notes, then built REST endpoints with FastAPI. Each request carries a unique ID through the logs and metrics, so I can trace a single user action from browser to database and back again. I also set up privacy guardrails – no personal data in logs – and captured the rationale in a privacy ADR.
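In code, the shape is roughly this – a minimal sketch where the field names, endpoint path, and middleware are illustrative assumptions rather than the real implementation:

```python
import logging
import uuid
from datetime import datetime

from fastapi import FastAPI, Request
from pydantic import BaseModel

app = FastAPI()
logger = logging.getLogger("bookings")


class BookingIn(BaseModel):
    # Illustrative fields; the real schema lives in Postgres behind the ORM models.
    customer_name: str
    customer_email: str
    starts_at: datetime
    ends_at: datetime
    notes: str | None = None


@app.middleware("http")
async def add_request_id(request: Request, call_next):
    # Attach a unique ID so a single user action can be traced end to end.
    request_id = request.headers.get("X-Request-ID", str(uuid.uuid4()))
    request.state.request_id = request_id
    response = await call_next(request)
    response.headers["X-Request-ID"] = request_id
    return response


@app.post("/bookings", status_code=201)
async def create_booking(booking: BookingIn, request: Request):
    # Log the request ID only – no personal data, per the privacy ADR.
    logger.info("booking_created", extra={"request_id": request.state.request_id})
    # ... validate, persist to Postgres, return the stored record ...
    return {"status": "confirmed"}
```

Echoing the ID back in a response header also means a user-reported issue can be matched to the exact log lines behind it.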
The result is a simple interface that behaves like a professional system. You submit a booking, it’s validated, stored, and confirmed. If something fails, you get a clear error, and I get structured logs to debug quickly.
Not everything went smoothly. Alembic migrations broke a few times – a reminder that schema history matters. I rebuilt the revision chain, added tests that apply migrations from zero, and moved on stronger.
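The safety net is small – a sketch along these lines, assuming a standard Alembic layout with alembic.ini at the project root; the TEST_DATABASE_URL variable is an assumption standing in for a throwaway test database:

```python
import os

from alembic import command
from alembic.config import Config


def test_migrations_apply_from_an_empty_database():
    config = Config("alembic.ini")
    config.set_main_option(
        "sqlalchemy.url",
        os.environ.get("TEST_DATABASE_URL", "postgresql://localhost/bookings_test"),
    )

    # Walk the full revision chain up, then back down, so a broken link fails fast.
    command.upgrade(config, "head")
    command.downgrade(config, "base")
```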
Teaching the Assistant to Answer Carefully
Next came the FAQ assistant. I kept it deliberately simple and honest. A small curated knowledge set is stored and turned into embeddings; when you ask a question, the system finds the closest match and returns an answer with a confidence score.
If confidence is high, it answers cleanly. If it’s middling, the UI signals caution. If it’s low, the assistant doesn’t bluff – it escalates. Eventually those escalations will route to Slack or a ticketing system, but for now the important part is the behaviour. It knows its limits.
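Stripped down, the decision logic is just a similarity score and two thresholds. This is a sketch, not the real service – the knowledge-base shape and the cut-off values are assumptions:

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def answer_question(question_vec: np.ndarray, kb: list[dict]) -> dict:
    """kb entries look like {"question_vec": np.ndarray, "answer": str}."""
    scored = [
        (cosine_similarity(question_vec, entry["question_vec"]), entry["answer"])
        for entry in kb
    ]
    confidence, answer = max(scored, key=lambda pair: pair[0])

    if confidence >= 0.80:  # high confidence: answer cleanly
        return {"answer": answer, "confidence": confidence, "action": "answer"}
    if confidence >= 0.60:  # middling: answer, but signal caution in the UI
        return {"answer": answer, "confidence": confidence, "action": "caution"}
    # low confidence: don't bluff – hand off to a human
    return {"answer": None, "confidence": confidence, "action": "escalate"}
```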
In most AI demos, the headline is speed or fluency. In real use, the question is trust. People need to know when to rely on an answer and when to involve a human. Confidence scoring and escalation give that clarity without complexity.
A few dependencies tripped me up here – broken test clients after library updates, real emails firing during tests until I added mocks. The kind of tiny problems that quietly teach you how systems actually behave.
Reading the Room with Sentiment
Next, I added a sentiment pipeline. When feedback arrives, the service classifies it as positive, neutral, or negative, stores the result with a timestamp, and displays summary counts and a simple chart. It’s not flashy, but it turns individual comments into visible trends.
You can even post new feedback from the terminal and watch the dashboard update – a good end-to-end check that frontend, backend, and database are properly in sync.
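The flow behind that check looks roughly like this – the endpoint path and storage are illustrative, and the classify() stub stands in for the real sentiment model:

```python
from datetime import datetime, timezone

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
RESULTS: list[dict] = []  # stand-in for the Postgres table


class FeedbackIn(BaseModel):
    text: str


def classify(text: str) -> str:
    # Placeholder for the real model: returns "positive", "neutral", or "negative".
    lowered = text.lower()
    if any(word in lowered for word in ("great", "love", "excellent")):
        return "positive"
    if any(word in lowered for word in ("bad", "slow", "terrible")):
        return "negative"
    return "neutral"


@app.post("/feedback")
async def submit_feedback(feedback: FeedbackIn):
    record = {
        "label": classify(feedback.text),
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    RESULTS.append(record)
    return record
```

Posting to that endpoint from the terminal and watching the summary counts move is the end-to-end check described above.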
Two principles guided this piece: consistency and governance. Consistency meant indexing the table for fast queries. Governance meant checking every stored record and log for personal data, then capturing those findings in the privacy ADR. If this ever runs in production, that audit trail will matter.
Looking Ahead with Forecasting
With the present covered, I wanted a view of the near future. I integrated Prophet as a baseline forecasting model and benchmarked it against ARIMA and a lighter ML variant. The goal wasn’t precision for its own sake – it was clarity. A forecast a business owner can understand and trust.
Each run records the model version, parameters, and error metrics for traceability. Sometimes the model labels a trend as flat even when the line drifts upward, because the movement is small relative to the variance in the data. That's intentional. It keeps the narrative honest rather than overstating movement.
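The run itself is thin – a sketch assuming daily booking counts as the input series, with illustrative parameters and run-metadata shape:

```python
from importlib.metadata import version

import pandas as pd
from prophet import Prophet


def run_forecast(daily_counts: pd.DataFrame, horizon_days: int = 14) -> dict:
    """daily_counts uses Prophet's expected columns: ds (date) and y (count)."""
    params = {"weekly_seasonality": True, "daily_seasonality": False}
    model = Prophet(**params)

    # Hold back the last week so the run reports an honest error metric.
    train, holdout = daily_counts.iloc[:-7], daily_counts.iloc[-7:]
    model.fit(train)

    future = model.make_future_dataframe(periods=horizon_days)
    forecast = model.predict(future)

    predicted = forecast.set_index("ds").loc[holdout["ds"], "yhat"]
    mae = float(abs(predicted.values - holdout["y"].values).mean())

    # Everything needed to reproduce or audit the run is recorded alongside it.
    return {
        "model": "prophet",
        "model_version": version("prophet"),
        "params": params,
        "mae": mae,
        "forecast": forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]],
    }
```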
A few async tests failed during this phase thanks to event-loop quirks, but refactoring session handling fixed the issue. Nothing glamorous – just solid plumbing that makes systems dependable.
Making It Observable
By the end of Phase 1, I wanted evidence that the system performed under pressure. I added structured JSON logs and Prometheus-style metrics for latency, error rate, and request counts. Then I used Locust to simulate load at 100 requests per second, targeting a p95 latency under 300 ms.
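The load profile is just a small Locust file along these lines – endpoint paths and payloads are assumptions standing in for the real API, and the p95 target is read off Locust's summary stats and the Prometheus metrics rather than asserted in code:

```python
from locust import HttpUser, between, task


class SmallBusinessUser(HttpUser):
    wait_time = between(0.5, 1.5)  # think time between actions

    @task(3)
    def ask_question(self):
        self.client.post("/faq/ask", json={"question": "What are your opening hours?"})

    @task(2)
    def send_feedback(self):
        self.client.post("/feedback", json={"text": "Booking was quick and easy."})

    @task(1)
    def create_booking(self):
        self.client.post(
            "/bookings",
            json={
                "customer_name": "Test User",
                "customer_email": "test@example.com",
                "starts_at": "2025-01-01T10:00:00",
                "ends_at": "2025-01-01T10:30:00",
            },
        )
```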
Each test run is stored with its chart and metadata so performance can be tracked over time, not just as a one-off number.
I also built accessibility checks into the frontend – testing focus order, labels, and colour contrast. Those details don’t show up in a screenshot, but they’re what separate a demo from a product.
What I Learned
A few things stood out along the way:
- Governance early saves time later. Writing ADRs as I worked slowed me down initially but saved far more time when debugging or explaining design choices.
- Discipline beats speed. Continuous integration caught issues the moment they appeared. It’s not glamorous, but it keeps momentum sustainable.
- Explainability builds trust. Confidence scores, escalation rules, and model metadata give stakeholders clear visibility into how the system behaves.
- Small wins compound. Each sprint felt modest in isolation, but together they formed a credible, auditable product loop.
What You See in the MVP
The demo ties everything together: you can make a booking, ask a question, send feedback, and run a forecast. Each step is traceable and observable, with sensible safeguards around privacy and data handling.
It’s a working product loop with privacy-aware logging, confidence-aware responses, a live sentiment signal, and a reproducible forecast – all under test.
What Comes Next
Phase 2 is about opening it up responsibly: a self-serve route where others can try it without my laptop in the loop, transparent consent and privacy notices, and production-grade metrics pipelines.
Beyond that, the roadmap moves toward conversational bookings, Slack escalations, Google Calendar sync, OAuth, a richer analytics layer, and enterprise-style governance. The direction stays the same – useful, explainable, observable.
Closing Thoughts
This project has been as much about delivery discipline as it has been about code. I wanted to prove to myself that I could take an idea from concept to product with the same care and structure expected in enterprise teams. It’s been early mornings, late nights, and a lot of learning along the way.
If you watch the demo, I’d love your feedback. And if you’re building something similar, let’s compare notes. The aim isn’t perfection – it’s progress towards AI products that people can understand, trust, and actually use.