How We Planned the Architecture & Tech Stack for Our ETL Platform (MVP → Next-Level)

Discover the architectural decisions behind our ETL platform, including MVP tech stack choices, real-time updates, and a clear path to scale.


Sivamanikandan K

Solution Architect


At Railsfactory, we are building a modern ETL (Extract–Transform–Load) platform that connects with external services such as Microsoft ERP, Zoho ERP, and many third-party APIs.

Our system fetches relevant data (e.g., Sales Report tables), maps schemas, processes jobs through an ETL engine, and finally makes the output available to reporting tools like Power BI or Tableau.

This blog explains how we architected the MVP to be simple, fast, and powerful, and how the architecture can evolve after the MVP with a clear upgrade path.

I’ve written this so that even beginners can clearly understand how to choose the right architecture and tech stack.

MVP Architecture – Simple, Practical, and Real-Time

The goal of our MVP was to build a working ETL system quickly without over-engineering. At the same time, I wanted it to feel modern, so we included real-time job progress using Server-Sent Events (SSE) inside the MVP itself.

Here’s how the full MVP architecture looks.

1. Frontend: Next.js

Why I chose Next.js:

  • React ecosystem + excellent developer experience

  • Server-side rendering useful for dashboards

  • Easy to integrate with REST API + SSE

  • Supports API routes for proxying calls

  • Fast to build an MVP UI

Compared to Alternatives:

[Infographic: Next.js compared with alternative frontend frameworks]

2. Engine Backend: FastAPI + PostgreSQL

The ETL engine is developed using FastAPI, with Postgres as the database.

FastAPI Responsibilities:

  • Handle external integrations (Zoho, Microsoft ERP, etc.)

  • Map data structures dynamically

  • Store connection details

  • Push jobs to queue table

  • Stream job progress to UI (SSE)

  • Push final updates via webhooks

Why I chose FastAPI:

  • Async & extremely fast

  • Easy schema validation with Pydantic (see the example after this list)

  • Perfect for hitting 3rd-party APIs

  • Clean design & easy to maintain
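
To make the Pydantic point concrete, here is a minimal sketch of how an incoming connector payload might be validated. The ConnectorConfig model and its fields are illustrative, not our production schema:

```python
# Minimal sketch: validating an incoming connector payload with Pydantic.
# The ConnectorConfig model and its fields are illustrative only.
from fastapi import FastAPI
from pydantic import BaseModel, Field, HttpUrl

app = FastAPI()

class ConnectorConfig(BaseModel):
    name: str = Field(min_length=1)
    base_url: HttpUrl                    # e.g. a Zoho or Microsoft ERP endpoint
    api_key: str
    poll_interval_seconds: int = Field(default=300, ge=30)

@app.post("/connectors")
async def create_connector(cfg: ConnectorConfig):
    # By the time we get here, FastAPI has parsed and validated the body;
    # a malformed payload is rejected with a 422 before this handler runs.
    return {"status": "created", "connector": cfg.name}
```

A bad request never reaches the handler body: FastAPI returns a 422 with a field-level error report automatically.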

Compared to Alternatives:

[Infographic: FastAPI compared with alternative backend frameworks]

3. Queue System (MVP): PostgreSQL Table

The MVP must stay simple.

So instead of heavy queue systems like Kafka/RabbitMQ, I used a Postgres message queue table.

Why?

  • Zero extra infra

  • Easy to debug

  • Perfect for low to medium load during MVP

  • Predictable behavior

Table Example:

message_queue

  • id

  • job_type

  • payload

  • status

  • progress

  • created_at

A worker service polls this table every few seconds and processes jobs.
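
Here is a minimal sketch of such a worker, assuming the message_queue columns listed above; the connection string and the process_job handler are placeholders. The FOR UPDATE SKIP LOCKED clause is what lets several workers poll the same table without claiming the same row:

```python
# Minimal polling-worker sketch against the message_queue table above.
# The DSN and process_job are placeholders; error handling is kept bare.
import time

import psycopg2

DSN = "dbname=etl user=etl"  # placeholder connection string

def process_job(job_type, payload):
    ...  # run the actual extract/transform/load step here

def run_worker(poll_seconds=5):
    conn = psycopg2.connect(DSN)
    conn.autocommit = True
    while True:
        with conn.cursor() as cur:
            # Claim one pending job. FOR UPDATE SKIP LOCKED lets several
            # workers poll concurrently without grabbing the same row.
            cur.execute("""
                UPDATE message_queue
                   SET status = 'processing'
                 WHERE id = (SELECT id FROM message_queue
                              WHERE status = 'pending'
                              ORDER BY created_at
                              FOR UPDATE SKIP LOCKED
                              LIMIT 1)
                RETURNING id, job_type, payload
            """)
            row = cur.fetchone()
        if row is None:
            time.sleep(poll_seconds)   # nothing pending; poll again later
            continue
        job_id, job_type, payload = row
        try:
            process_job(job_type, payload)
            new_status = "done"
        except Exception:
            new_status = "failed"
        with conn.cursor() as cur:
            cur.execute("UPDATE message_queue SET status = %s WHERE id = %s",
                        (new_status, job_id))

if __name__ == "__main__":
    run_worker()
```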

4. Real-Time Updates: Server-Sent Events (SSE)

This is the most important technical choice I made in the MVP.

Instead of polling every few seconds (slow + inefficient), I implemented SSE (Server-Sent Events) directly in FastAPI and Next.js.

Why SSE in MVP?

  • Lightweight

  • Event-based

  • Easy to implement (1–2 days only)

  • Perfect for one-way updates (Engine → UI)

  • Automatic reconnection on browser side

  • No need for WebSockets or Redis Pub/Sub

Use Cases:

  • Job progress %

  • ETL logs

  • Step-by-step pipeline updates

  • Success / Failure notifications

For ETL, this feels modern, smooth, and accurate.

Why not WebSockets?

WebSockets are bidirectional but heavier, and we only need server → client updates. GraphQL subscriptions add schema and resolver overhead, which is overkill for simple job progress.
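
For illustration, here is a minimal FastAPI SSE endpoint using the built-in StreamingResponse. The get_job_progress helper is a stand-in for a real lookup against the queue table:

```python
# Minimal SSE sketch with FastAPI's built-in StreamingResponse.
# get_job_progress is a placeholder for a real read of message_queue.
import asyncio
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def get_job_progress(job_id: int) -> dict:
    # Placeholder: in practice, SELECT status, progress FROM message_queue.
    return {"job_id": job_id, "status": "done", "progress": 100}

@app.get("/jobs/{job_id}/events")
async def job_events(job_id: int):
    async def event_stream():
        while True:
            state = await get_job_progress(job_id)
            # SSE wire format: each event is "data: <payload>\n\n".
            yield f"data: {json.dumps(state)}\n\n"
            if state["status"] in ("done", "failed"):
                break                  # close the stream when the job ends
            await asyncio.sleep(1)
    return StreamingResponse(event_stream(), media_type="text/event-stream")
```

On the Next.js side, new EventSource("/jobs/123/events") plus an onmessage handler is the whole integration, and the browser reconnects on its own if the stream drops.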

5. Webhook Flow for Final Job Completion

Even though SSE streams progress, the engine still calls a webhook upon completion.

Why use both?

  • SSE = live progress

  • Webhook = final confirmation

This gives strong reliability even if the browser disconnects.
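
Here is a minimal sketch of that completion call with httpx, assuming the subscriber's callback URL is stored alongside the connection details; the payload shape is illustrative:

```python
# Minimal completion-webhook sketch. The payload shape is illustrative,
# and retries/backoff are omitted for brevity.
import httpx

async def notify_completion(callback_url: str, job_id: int, status: str) -> None:
    payload = {"job_id": job_id, "status": status}
    async with httpx.AsyncClient(timeout=10) as client:
        response = await client.post(callback_url, json=payload)
        response.raise_for_status()  # surface delivery failures to retry logic
```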

MVP Architecture Diagram (With SSE)

[Diagram: MVP architecture with SSE]

Note: Postgres → (triggers via LISTEN/NOTIFY) → FastAPI → (SSE stream) → Next.js
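
A sketch of the first leg of that chain, assuming a Postgres trigger that calls pg_notify('job_events', ...) on queue-table updates (the channel name is an assumption). asyncpg forwards each notification into an asyncio queue that the SSE generator can drain instead of polling:

```python
# Sketch of the Postgres -> FastAPI leg. A trigger on message_queue calls
# pg_notify('job_events', ...); asyncpg forwards each notification into an
# asyncio.Queue that the SSE generator drains instead of polling.
import asyncio

import asyncpg

DSN = "postgresql://etl@localhost/etl"   # placeholder connection string
events: asyncio.Queue = asyncio.Queue()

def on_notify(conn, pid, channel, payload):
    # asyncpg invokes this callback for every NOTIFY on the channel.
    events.put_nowait(payload)

async def listen_for_job_events():
    conn = await asyncpg.connect(DSN)
    await conn.add_listener("job_events", on_notify)
    while True:
        await asyncio.sleep(3600)        # keep the connection alive
```

With this in place, the SSE generator from the earlier sketch swaps its sleep-and-poll loop for await events.get().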

Summary of MVP Tech Stack

[Infographic: summary of the MVP tech stack]

The MVP is fast to deliver, stable, and gives users a delightful real-time experience.

Post-MVP – Future Architecture (Tech Stack Upgrades Only)

Once the product gets traction, the architecture can easily scale without rewriting the MVP.

Below are the tech stack upgrades planned after MVP.

1. Queue System Upgrade

Replace the Postgres queue with one of these recommended options:

  • Redis Streams → simple, fast, great for medium scale (see the sketch after this list)

  • RabbitMQ → stable queue system

  • Kafka → heavy load, real-time streaming, enterprise-grade
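
As one example of this upgrade, the claim-and-process loop from the MVP worker maps almost one-to-one onto Redis Streams with consumer groups. A sketch with redis-py, where the stream, group, and consumer names are assumptions:

```python
# Sketch: the same claim-and-process loop on Redis Streams with redis-py.
# Stream, group, and consumer names are illustrative.
import redis

r = redis.Redis()
STREAM, GROUP, CONSUMER = "etl:jobs", "workers", "worker-1"

try:
    r.xgroup_create(STREAM, GROUP, id="0", mkstream=True)
except redis.ResponseError:
    pass  # consumer group already exists

def enqueue(job_type: str, payload: str) -> None:
    r.xadd(STREAM, {"job_type": job_type, "payload": payload})

def worker_loop() -> None:
    while True:
        # Block for up to 5s waiting for a new, unclaimed entry.
        entries = r.xreadgroup(GROUP, CONSUMER, {STREAM: ">"},
                               count=1, block=5000)
        for _stream, messages in entries:
            for msg_id, fields in messages:
                ...  # process fields, then acknowledge:
                r.xack(STREAM, GROUP, msg_id)
```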

2. Real-Time Event System

Move from SSE-only to:

  • SSE + Redis Pub/Sub

or

  • Kafka → FastAPI → SSE

The UI layer still uses SSE, so nothing changes in the frontend.
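
Here is a sketch of what that swap could look like with redis-py's asyncio client: the SSE generator subscribes to a per-job channel instead of listening to Postgres, and the EventSource code in the browser is untouched (the channel naming is an assumption):

```python
# Sketch: swapping LISTEN/NOTIFY for Redis Pub/Sub behind the same SSE
# endpoint, using redis-py's asyncio client. Channel naming is illustrative.
import redis.asyncio as redis

async def job_event_stream(job_id: int):
    r = redis.Redis()
    pubsub = r.pubsub()
    await pubsub.subscribe(f"job:{job_id}")
    async for message in pubsub.listen():
        if message["type"] == "message":
            # Forward each published update in SSE wire format.
            yield f"data: {message['data'].decode()}\n\n"
```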

3. Microservices Structure

Split FastAPI engine into:

  • Connector Service

  • Transform Service

  • Scheduler Worker

  • Notification Service

  • Metadata Service

  • File Processing Service

4. Observability

Add modern monitoring stack:

  • Grafana

  • Prometheus

  • Loki

  • ELK Stack

5. Security / Identity

Use:

  • OAuth2 (for connectors)

  • Key Management

  • RBAC for multi-tenant access

Why This Architecture Works

The architecture I designed gives us:

Speed during MVP

No heavy systems. Only essential components.

Real-time experience using SSE

Feels like a premium data product even at MVP stage.

Smooth upgrade path

Every MVP component can evolve without rewrites.

Scalable for enterprise

Kafka, Redis, microservices, observability — all ready for future adoption.

Adding MCP for Chat-Based Application Support

As the ETL product grows, usability becomes a major factor.

End users such as data analysts, business teams, and Power BI/Tableau users often don’t want to manually configure connectors, map tables, or trigger pipelines.

To solve this, the future architecture will include MCP (Model Context Protocol) to offer chat-driven interaction inside the application.

MCP enables LLMs (like ChatGPT or custom on-prem models) to interact directly with:

  • our API

  • our ETL engine

  • our connectors

  • our metadata store

  • user configuration

  • job monitoring and logs

This gives users a conversational UI layer on top of the product.

Why MCP?

MCP provides:

A standard protocol for LLMs to talk to backend systems

Instead of custom integrations, MCP defines how tools, resources, and APIs are exposed to the AI agent.

Secure access to internal operations

Example:

“Create a Zoho Books connection for account X”

“Run the Sales ETL pipeline for last month”

“Show me the progress of Job #12345”

No need to build a manual UI for every function

The AI assistant will guide the user step by step.

Works with multiple LLMs

OpenAI, Anthropic, Azure, self-hosted models.

Great for onboarding & support

Users can simply ask what they want instead of navigating menus.

How MCP Fits Into the System

Below is the high-level integration plan:

[Diagram: high-level MCP integration plan]

The MCP server will expose the following capabilities (a minimal sketch follows the list):

  • list connectors

  • trigger ETL job

  • map tables

  • list fields

  • get logs

  • retrieve progress (SSE)

  • check job history

  • generate schema suggestions
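
In the REST + JSON Schema spirit described in the note at the end of this section, here is a minimal sketch of how one of these capabilities could be published as a structured tool. Tool names, routes, and parameters are all illustrative, and Pydantic v2's model_json_schema() is assumed:

```python
# Sketch: publishing an ETL action as a structured "tool" over REST +
# JSON Schema. Tool names, routes, and parameters are illustrative;
# Pydantic v2's model_json_schema() is assumed.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class TriggerJobParams(BaseModel):
    connector: str   # e.g. "zoho_books"
    pipeline: str    # e.g. "sales_report"

TOOLS = [{
    "name": "trigger_etl_job",
    "description": "Start an ETL pipeline for a configured connector.",
    "parameters": TriggerJobParams.model_json_schema(),
}]

@app.get("/tools")
async def list_tools():
    # The LLM agent reads this manifest to learn what it may call.
    return TOOLS

@app.post("/tools/trigger_etl_job")
async def trigger_etl_job(params: TriggerJobParams):
    ...  # e.g. insert a row into message_queue and return the job id
    return {"status": "queued"}
```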

MCP Architecture Components

1. Chat UI (Next.js)

A dedicated AI section:

  • inside the app

  • or as a floating assistant

  • or as a dedicated /assistant route

2. MCP Client (Browser)

Uses OpenAI’s MCP client libraries to communicate with the backend.

3. MCP Server (Backend Service)

A new service that:

  • exposes ETL engine functions as “tools”

  • reads/writes from Postgres

  • can call FastAPI routes

  • can fetch live updates (SSE)

  • provides structured responses to the LLM

4. LLM Integration

Supports:

  • OpenAI GPT-5 / GPT-4o

  • Local LLM (Ollama)

  • Enterprise secured LLM

5. Permissions & Security Layer

Ensures:

  • role-based access

  • scoped API access

  • audit logs for AI-triggered actions

What Users Will Be Able to Do via Chat

Example Commands:

  • “Connect my Zoho Books account”

  • “Show me yesterday’s failed pipelines”

  • “Run the transformation pipeline for Microsoft ERP Sales data”

  • “Explain why the last job failed”

  • “Map the Customer table to the Analytics schema automatically”

This reduces:

  • UI complexity

  • onboarding effort

  • support load

  • learning curve

Why Add MCP After MVP?

  • MVP should be simple

  • Users should adopt the product first

  • ETL features must be stable

  • Chat assistant requires proper tools & stable APIs

  • MCP works best when backend operations are well-defined

So:

MVP = UI + Engine + Queue + SSE

After MVP = Add MCP, AI assistant, auto-suggestions

How MCP Helps Product Adoption

[Infographic: how MCP helps product adoption]

Note: We use MCP-inspired patterns, exposing backend capabilities as structured ‘tools’ (name, params, description) consumable by LLMs via standard APIs. This aligns with emerging protocols like MCP, but is implementable today using REST + JSON Schema.

Conclusion

This ETL platform was architected with a clear principle: build for today’s constraints without compromising tomorrow’s scale. The initial implementation is intentionally simple, but the overall system design is not short-sighted.

Each layer of the architecture has a defined responsibility and a clear evolution path. Queueing, eventing, service boundaries, observability, and AI-driven interaction are treated as progressive enhancements, not afterthoughts. This allows the platform to grow in capability and throughput without destabilizing the core system or forcing large rewrites.

The real value of this approach is architectural clarity. The system remains understandable, debuggable, and operable at every stage of growth. As requirements shift from basic ETL workflows to enterprise-scale data pipelines and AI-assisted operations, the architecture absorbs that change rather than fighting it.

This is the difference between shipping something that works and building a platform that lasts.

If you’re looking for a partner who can design and build MVPs that evolve into long-term platforms, reach out to the Railsfactory team.

Written by Sivamanikandan K

Sivamanikandan is a Solution Architect with extensive experience designing and delivering scalable, high-performance applications across modern web and mobile technologies. With hands-on expertise in Rails, Node.js, Next.js, React.js, Angular, React Native, and Python, he has successfully led and executed 20+ projects and contributed to multiple enterprise-grade products.
