IMDb Benchmark
90.15%
TF-IDF baseline test accuracy and macro F1 on IMDb with a reproducible saved inference artifact.
NLP · April 2026
Rebuilt an earlier IMDb sentiment project into a customer-feedback intelligence system with benchmarking, transfer checks, and a public full-batch triage dashboard on Hugging Face Spaces.
Reproducible benchmark: built a clean IMDb pipeline with deterministic sampling, TF-IDF + Logistic Regression as the active saved baseline, and a RoBERTa fine-tuning path for later higher-capacity runs.
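The baseline described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `train_baseline` function, its hyperparameters, and the artifact path are all assumptions; the key idea is that a fixed seed makes the split deterministic and the fitted pipeline is saved as a single reusable inference artifact.

```python
import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

def train_baseline(texts, labels, seed=42):
    """Train a TF-IDF + Logistic Regression sentiment baseline.

    A fixed `seed` keeps the train/test split (and the solver)
    deterministic, so the reported score is reproducible run to run.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.2, random_state=seed, stratify=labels
    )
    pipe = Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
        ("clf", LogisticRegression(max_iter=1000, random_state=seed)),
    ])
    pipe.fit(X_train, y_train)
    return pipe, pipe.score(X_test, y_test)

# Persist the whole fitted pipeline as one inference artifact
# (path is illustrative):
# pipe, acc = train_baseline(imdb_texts, imdb_labels)
# joblib.dump(pipe, "models/imdb_tfidf_logreg.joblib")
```

Saving the full pipeline (vectorizer plus classifier) rather than the classifier alone is what lets the dashboard and transfer checks reuse it with raw text input.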
Transfer checks: evaluated the IMDb-trained model on Amazon polarity reviews and on a fixed local 200-example customer-feedback evaluation set to see how far the benchmark generalizes without retraining.
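A transfer check like this reduces to scoring the frozen pipeline on an out-of-domain set. The sketch below assumes a scikit-learn-style pipeline; `transfer_check` and the dataset variable names in the comments are hypothetical.

```python
from sklearn.metrics import accuracy_score, f1_score

def transfer_check(pipeline, texts, labels):
    """Score an already-trained sentiment pipeline on an out-of-domain
    evaluation set without any retraining (zero-shot transfer)."""
    preds = pipeline.predict(texts)
    return {
        "accuracy": accuracy_score(labels, preds),
        "macro_f1": f1_score(labels, preds, average="macro"),
    }

# Illustrative usage: run the IMDb-trained pipeline against Amazon
# polarity reviews and a fixed local customer-feedback set:
# print(transfer_check(imdb_pipe, amazon_texts, amazon_labels))
# print(transfer_check(imdb_pipe, feedback_texts, feedback_labels))
```

Keeping the evaluation set fixed (the 200 local examples) is what makes drops in these numbers attributable to domain shift rather than sampling noise.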
Dashboard product surface: added a Gradio interface that accepts pasted text or uploads, preserves metadata like `channel` and `product`, scores the whole batch, exports the filtered results as CSV, and is now deployed publicly on Hugging Face Spaces.
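The batch-scoring and export step behind that interface can be sketched as below. This is a simplified stand-in for the Gradio app's internals, assuming a pipeline with `predict_proba` and a pandas table; the column names (`text`, `p_positive`, `sentiment`) are illustrative, not the app's actual schema.

```python
import io

import pandas as pd

def score_batch(pipeline, df, text_col="text", threshold=0.5):
    """Score every row of an uploaded feedback table while keeping
    metadata columns such as `channel` and `product` intact."""
    out = df.copy()
    proba = pipeline.predict_proba(out[text_col])
    out["p_positive"] = proba[:, 1]
    out["sentiment"] = (out["p_positive"] >= threshold).map(
        {True: "positive", False: "negative"}
    )
    return out

def export_filtered_csv(scored, sentiment=None):
    """Serialize (optionally filtered) scored rows as CSV text,
    mirroring the dashboard's download action."""
    if sentiment is not None:
        scored = scored[scored["sentiment"] == sentiment]
    buf = io.StringIO()
    scored.to_csv(buf, index=False)
    return buf.getvalue()
```

Copying the input frame and only appending columns is the simple way to guarantee metadata like `channel` and `product` survives the round trip from upload to exported CSV.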
Triage and summarization: layered confidence, uncertainty, manual-review gating, priority scoring, and exploratory theme clustering on top of raw sentiment predictions, so the tool works as an analyst triage surface rather than a bare model-inspection demo.
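One way the confidence, review-gating, and priority signals could be derived from a single positive-class probability is sketched below. The exact formulas and thresholds here are assumptions for illustration, not the project's actual scoring rules.

```python
def triage(p_positive, review_band=0.15):
    """Turn a positive-class probability into analyst-facing triage signals."""
    # Confidence: 0 at the decision boundary (0.5), 1 at the extremes.
    confidence = abs(p_positive - 0.5) * 2
    # Predictions inside the uncertainty band are gated to manual review.
    needs_review = abs(p_positive - 0.5) < review_band
    # Confident negatives get the highest priority, so likely
    # complaints surface first in the analyst queue.
    priority = (1.0 - p_positive) * confidence
    return {
        "confidence": confidence,
        "needs_review": needs_review,
        "priority": priority,
    }
```

Under this scheme a review at p_positive = 0.05 outranks one at 0.4: both lean negative, but only the first is negative with high confidence.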
Amazon Transfer
Zero-shot accuracy on Amazon polarity, showing the IMDb-trained model transfers partially but not perfectly to product reviews.
Product Surface
Full-batch customer-feedback dashboard is deployed publicly on Hugging Face Spaces rather than staying as a local-only Gradio app.
Project evolution: this started from an older IMDb sentiment and mixture-of-experts (MoE) direction, but I intentionally rebuilt it into a cleaner, product-facing story centered on reusable inference and customer-feedback analysis.
What the current results mean: the IMDb benchmark is strong enough to yield a useful starting sentiment model, while the Amazon and local customer-feedback checks make the cross-domain limits explicit instead of hiding them.
Why it matters: the project now connects benchmarking, transfer evaluation, and a usable public dashboard surface in one coherent workflow rather than stopping at model training.