Build a self-hosted product analytics system in Replit using Express and PostgreSQL in 1-2 hours. You'll track user events, visualize funnels and retention, and query feature usage — a lightweight Mixpanel alternative. Replit Agent scaffolds the event ingestion pipeline, daily aggregation job, and React dashboard in one prompt.
What you're building
A self-hosted product analytics system lets you track exactly what users do inside your app — which features they use, where they drop off in a signup funnel, and how many come back after day 1. Hosted tools like Mixpanel start around $20/month and require shipping your user data to a third party. A Replit-hosted system costs nothing beyond Replit's deployment fee, and your data stays in your own PostgreSQL database.
Replit Agent generates the full backend in one prompt: an events table that stores every user interaction with arbitrary JSON properties, a funnels table for defining conversion steps, and aggregation tables for fast dashboard queries. The event ingest endpoint must return a response immediately — insert the event and return 200 without waiting for aggregation. A Scheduled Deployment runs hourly to roll raw events into daily_aggregates using an upsert, keeping dashboard queries millisecond-fast regardless of total event volume.
The most technically interesting part is funnel calculation. A simple COUNT of each step's events miscounts because it doesn't verify sequential order — users who did step 3 without ever doing step 1 still get counted. The correct approach is a PostgreSQL CTE chain that, for each user, finds the first occurrence of step 1, then checks whether step 2 happened after it within the time window, then step 3 after step 2, and so on. The whole query executes inside PostgreSQL rather than filtering rows in JavaScript, so it benefits from the query planner's optimizations.
Final result
A product analytics backend with event ingestion, funnel analysis, cohort retention, and a React dashboard — all running on your Replit PostgreSQL database with your data staying private.
Tech stack
- Node.js + Express (API server)
- PostgreSQL with Drizzle ORM (event storage and aggregates)
- React with recharts (dashboard)
- Replit Autoscale + Scheduled Deployments (hosting and the hourly aggregation job)
Prerequisites
- A Replit account (Free tier is sufficient)
- Basic understanding of what user events and funnels mean for your product
- The app you want to track (so you know which events to instrument — e.g., 'signup', 'feature_used', 'purchase')
- No external API keys needed for the core build
Build steps
Scaffold the project with Replit Agent
Create a new Replit App and paste this prompt. Agent builds the complete analytics backend with all five tables, routes, and a React dashboard structure.
```
// Build a product analytics system with Express and PostgreSQL using Drizzle ORM.
//
// Tables:
// 1. events: id serial primary key, user_id text not null, event_name text not null,
//    properties jsonb (arbitrary key-value pairs), session_id text,
//    timestamp timestamp default now()
// 2. event_definitions: id serial, name text unique not null, category text,
//    description text, is_active boolean default true
// 3. funnels: id serial, name text not null, steps jsonb not null (ordered array of event_name strings),
//    owner_id text not null, time_window_hours integer default 72, created_at timestamp default now()
// 4. segments: id serial, name text not null,
//    filter_config jsonb not null (e.g. {event_name: 'signup', property: 'plan', value: 'pro'}),
//    owner_id text not null
// 5. daily_aggregates: id serial, event_name text not null, date date not null,
//    count integer not null, unique_users integer not null, unique(event_name, date)
//
// Routes:
// POST /api/events (ingest — API key auth via x-api-key header from process.env.ANALYTICS_API_KEY)
// POST /api/events/batch (bulk ingest up to 100 events)
// GET /api/analytics/overview (top events by count over ?days=30 date range)
// GET /api/analytics/funnel/:id (calculate funnel conversion)
// GET /api/analytics/retention (cohort retention grid ?weeks=8)
// GET /api/analytics/events/:name/timeseries (count over time for a specific event)
//
// Admin routes (Replit Auth):
// POST /api/funnels, GET /api/funnels
//
// Bind to 0.0.0.0:3000.
```

Pro tip: Add ANALYTICS_API_KEY to Replit Secrets (lock icon) before running. Set it to any long random string — this is what your tracked app will send as the x-api-key header to authenticate event ingestion.
Expected result: Running Express server with all five tables created. GET / returns the React dashboard shell.
Build the event ingest routes with fast response
The ingest endpoint is called on every user interaction in your app, so it must be fast. Insert the event and return 200 immediately — never do aggregation in the request path.
```javascript
const express = require('express');
const { db } = require('../db');
const { events } = require('../schema');
const { withDbRetry } = require('../lib/retryDb');

const router = express.Router();

// API key middleware for event ingestion
function requireApiKey(req, res, next) {
  const key = req.headers['x-api-key'];
  if (!key || key !== process.env.ANALYTICS_API_KEY) {
    return res.status(401).json({ error: 'Invalid API key' });
  }
  next();
}

// Single event ingestion
router.post('/api/events', requireApiKey, express.json(), async (req, res) => {
  const { userId, eventName, properties, sessionId } = req.body;
  if (!userId || !eventName) {
    return res.status(400).json({ error: 'userId and eventName are required' });
  }
  // Return 200 immediately — don't await the insert if you want sub-10ms responses.
  // For correctness we await here, but keep the handler free of any other work.
  await withDbRetry(() =>
    db.insert(events).values({
      userId: String(userId),
      eventName: String(eventName),
      properties: properties || null,
      sessionId: sessionId ? String(sessionId) : null,
    })
  );
  return res.json({ recorded: true });
});

// Batch ingestion — up to 100 events
router.post('/api/events/batch', requireApiKey, express.json(), async (req, res) => {
  const { events: batch } = req.body;
  if (!Array.isArray(batch) || batch.length === 0) {
    return res.status(400).json({ error: 'events array required' });
  }
  if (batch.length > 100) {
    return res.status(400).json({ error: 'Max 100 events per batch' });
  }
  const rows = batch.map(e => ({
    userId: String(e.userId),
    eventName: String(e.eventName),
    properties: e.properties || null,
    sessionId: e.sessionId ? String(e.sessionId) : null,
  }));
  await withDbRetry(() => db.insert(events).values(rows));
  return res.json({ recorded: rows.length });
});

module.exports = router;
```

Pro tip: In your tracked app, send events via fetch('/api/events', ...) with the ANALYTICS_API_KEY. Keep the client-side call fire-and-forget using .catch(() => {}) so analytics tracking never breaks the user experience if the analytics server is slow.
Expected result: POST /api/events with {userId: 'u1', eventName: 'signup', properties: {plan: 'pro'}} returns {recorded: true} in under 100ms.
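On the client side, the fire-and-forget pattern from the pro tip can be sketched as a tiny helper. The URL and the way the key reaches the client are assumptions — `ANALYTICS_URL` is a hypothetical placeholder, and ideally the key stays server-side with events sent server-to-server:

```javascript
// Minimal client-side tracking helper (sketch, not the tutorial's exact code).
// ANALYTICS_URL is a hypothetical placeholder for your deployed ingest endpoint.
const ANALYTICS_URL = 'https://your-analytics.replit.app/api/events';

// Build the request separately so the payload shape is easy to verify
function buildTrackRequest(apiKey, userId, eventName, properties = null, sessionId = null) {
  return {
    method: 'POST',
    headers: { 'content-type': 'application/json', 'x-api-key': apiKey },
    body: JSON.stringify({ userId, eventName, properties, sessionId }),
  };
}

// Fire-and-forget: a slow or down analytics server must never break the app
function track(apiKey, userId, eventName, properties) {
  fetch(ANALYTICS_URL, buildTrackRequest(apiKey, userId, eventName, properties))
    .catch(() => {}); // swallow errors — analytics is best-effort
}
```

Calling `track(key, 'u1', 'signup', { plan: 'pro' })` returns immediately; the request resolves (or fails silently) in the background.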
Build the funnel analysis query
Funnel analysis is the core analytics feature. This PostgreSQL CTE correctly calculates sequential conversion — a user counts for step 2 only if they did step 1 before it within the time window.
```javascript
const express = require('express');
// Assumes ../db also exports the underlying node-postgres Pool for raw
// parameterized queries (the dynamic CTE below can't go through the ORM)
const { db, pool } = require('../db');
const { funnels } = require('../schema');
const { eq } = require('drizzle-orm');

const router = express.Router();

router.get('/api/analytics/funnel/:id', async (req, res) => {
  const [funnel] = await db.select().from(funnels)
    .where(eq(funnels.id, parseInt(req.params.id))).limit(1);
  if (!funnel) return res.status(404).json({ error: 'Funnel not found' });

  const steps = funnel.steps; // e.g. ['page_view', 'signup_start', 'signup_complete', 'purchase']
  const windowHours = parseInt(funnel.timeWindowHours, 10) || 72;

  if (steps.length < 2) return res.status(400).json({ error: 'Funnel needs at least 2 steps' });

  // Build a CTE chain that tracks each user through the funnel sequentially:
  // for each step N, find users who did step N after step N-1 within the window.
  // Event names are bound as query parameters; windowHours is an integer from
  // our own funnels table, so interpolating it is safe.
  const cteLines = steps.map((step, i) => {
    if (i === 0) {
      return `step_0 AS (
        SELECT user_id, MIN(timestamp) AS step_time
        FROM events WHERE event_name = $1
        GROUP BY user_id
      )`;
    }
    return `step_${i} AS (
      SELECT e.user_id, MIN(e.timestamp) AS step_time
      FROM events e
      JOIN step_${i - 1} prev ON e.user_id = prev.user_id
      WHERE e.event_name = $${i + 1}
        AND e.timestamp > prev.step_time
        AND e.timestamp <= prev.step_time + interval '${windowHours} hours'
      GROUP BY e.user_id
    )`;
  });

  const countSelects = steps.map((_, i) =>
    `(SELECT COUNT(DISTINCT user_id) FROM step_${i}) AS step_${i}_count`
  ).join(',\n  ');

  const query = `WITH ${cteLines.join(',\n')}\nSELECT ${countSelects}`;

  const result = await pool.query(query, steps);
  const row = result.rows[0];

  // node-postgres returns COUNT(...) as a string — parse before doing math
  const counts = steps.map((step, i) => {
    const users = parseInt(row[`step_${i}_count`], 10);
    const prev = i === 0 ? null : parseInt(row[`step_${i - 1}_count`], 10);
    return {
      step: i + 1,
      event: step,
      users,
      conversionFromPrev: i === 0 ? 100 : prev > 0 ? Math.round((users / prev) * 100) : 0,
    };
  });

  return res.json({ funnelId: funnel.id, name: funnel.name, steps: counts });
});

module.exports = router;
```

Pro tip: Create funnel definitions via POST /api/funnels before testing. A simple first funnel: steps: ['page_view', 'signup_start', 'signup_complete'] with timeWindowHours: 48. Send test events for a few userIds to verify the conversion counts.
Expected result: GET /api/analytics/funnel/1 returns an array of steps with user counts and conversion percentages. Step 1 shows total users who started; step 2 shows who completed.
Add the daily aggregation Scheduled Deployment
Raw events accumulate fast. A Scheduled Deployment runs every hour to roll event counts into the daily_aggregates table, keeping dashboard overview queries fast without scanning millions of raw rows.
```javascript
// server/jobs/aggregateEvents.js — run as a Scheduled Deployment every hour
const { db } = require('../db');
const { sql } = require('drizzle-orm');

async function runAggregation() {
  console.log('[aggregate] Starting daily aggregation job...');

  // Upsert daily counts for events from the last 48 hours
  // (overlap ensures we catch any late-arriving events from yesterday)
  await db.execute(sql`
    INSERT INTO daily_aggregates (event_name, date, count, unique_users)
    SELECT
      event_name,
      DATE(timestamp) AS date,
      COUNT(*) AS count,
      COUNT(DISTINCT user_id) AS unique_users
    FROM events
    WHERE timestamp >= NOW() - INTERVAL '48 hours'
    GROUP BY event_name, DATE(timestamp)
    ON CONFLICT (event_name, date)
    DO UPDATE SET
      count = EXCLUDED.count,
      unique_users = EXCLUDED.unique_users
  `);

  // Clean up raw events older than 90 days to stay within the PostgreSQL 10GB limit
  const deleted = await db.execute(sql`
    DELETE FROM events WHERE timestamp < NOW() - INTERVAL '90 days'
  `);

  console.log(`[aggregate] Done. Deleted ${deleted.rowCount} old events.`);
}

runAggregation()
  .then(() => process.exit(0))
  .catch(err => { console.error('[aggregate] Error:', err); process.exit(1); });
```

Pro tip: In Replit's Publish pane, create a Scheduled Deployment pointing to this file as the entrypoint, set to run every hour. This keeps the main Autoscale app light while the heavy aggregation runs separately.
Expected result: After the job runs, the daily_aggregates table shows counts per event name per day. The overview dashboard query hits daily_aggregates instead of scanning all events — response time drops from 2 seconds to under 50ms.
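For reference, the reduction the overview endpoint performs over daily_aggregates can be sketched as a pure function. In the real route this is a single GROUP BY query against the table; the function name and row shape here are illustrative assumptions:

```javascript
// Sketch of what GET /api/analytics/overview computes from daily_aggregates
// rows: sum counts per event within the date window, return the top N.
// Row shape { eventName, date, count } mirrors the daily_aggregates columns.
function topEvents(aggregateRows, sinceDate, limit = 10) {
  const totals = new Map();
  for (const row of aggregateRows) {
    if (row.date < sinceDate) continue; // outside the ?days window (ISO dates compare lexically)
    totals.set(row.eventName, (totals.get(row.eventName) || 0) + row.count);
  }
  return [...totals.entries()]
    .map(([eventName, count]) => ({ eventName, count }))
    .sort((a, b) => b.count - a.count)
    .slice(0, limit);
}
```

Because the job has already collapsed millions of raw events into one row per event per day, this reduction touches only a few hundred rows regardless of traffic.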
Build the React analytics dashboard
Ask Agent to create the frontend with the four core analytics views: top events, funnel visualization, retention matrix, and event detail timeseries.
```
// Ask Agent to build the React analytics dashboard with this prompt:
// Build a React analytics dashboard with four views:
//
// 1. Overview Page:
//    - Date range selector (last 7/30/90 days)
//    - Fetch GET /api/analytics/overview?days=30
//    - Bar chart using recharts BarChart showing top 10 events by count
//    - Each bar shows event name and total count
//    - Click on a bar navigates to that event's detail page
//
// 2. Funnel Page:
//    - Dropdown to select from GET /api/funnels list
//    - Fetch GET /api/analytics/funnel/:id on selection
//    - Horizontal funnel visualization: series of bars narrowing left to right
//    - Each bar shows step name, user count, and conversion % from previous step
//    - Color gradient from green (high conversion) to red (drop-off)
//
// 3. Retention Page:
//    - Fetch GET /api/analytics/retention?weeks=8
//    - Cohort grid: rows = cohort week, columns = weeks since first event
//    - Cell value = % of cohort users who returned that week
//    - Color cells from white (0%) to deep blue (100%)
//
// 4. Event Detail Page (/events/:name):
//    - Line chart showing daily count for this event over the past 30 days
//    - Properties breakdown: top 10 values for each property key found in this event's jsonb
//
// Use React Router for navigation. Add a nav bar with links to all four views.
```

Expected result: The dashboard loads with a top events bar chart. Selecting a funnel shows conversion steps. The retention grid shows cohort data as a colored matrix.
Complete code
```javascript
const express = require('express');
const { db } = require('../db');
const { events } = require('../schema');
const { withDbRetry } = require('../lib/retryDb');

const router = express.Router();

function requireApiKey(req, res, next) {
  if (req.headers['x-api-key'] !== process.env.ANALYTICS_API_KEY) {
    return res.status(401).json({ error: 'Invalid API key' });
  }
  next();
}

function validateEvent(e) {
  return e && typeof e.userId === 'string' && typeof e.eventName === 'string';
}

router.post('/api/events', requireApiKey, express.json(), async (req, res) => {
  if (!validateEvent(req.body)) {
    return res.status(400).json({ error: 'userId (string) and eventName (string) are required' });
  }
  try {
    await withDbRetry(() =>
      db.insert(events).values({
        userId: req.body.userId,
        eventName: req.body.eventName,
        properties: req.body.properties || null,
        sessionId: req.body.sessionId || null,
      })
    );
    return res.json({ recorded: true });
  } catch (err) {
    console.error('[events] insert error:', err.message);
    return res.status(500).json({ error: 'Failed to record event' });
  }
});

router.post('/api/events/batch', requireApiKey, express.json(), async (req, res) => {
  const { events: batch } = req.body;
  if (!Array.isArray(batch) || batch.length === 0) {
    return res.status(400).json({ error: 'events array is required' });
  }
  if (batch.length > 100) {
    return res.status(413).json({ error: 'Max 100 events per batch' });
  }
  // .some avoids the .find pitfall where a literal undefined entry is indistinguishable
  // from "nothing invalid found"
  if (batch.some(e => !validateEvent(e))) {
    return res.status(400).json({ error: 'Each event needs userId and eventName' });
  }

  const rows = batch.map(e => ({
    userId: String(e.userId),
    eventName: String(e.eventName),
    properties: e.properties || null,
    sessionId: e.sessionId ? String(e.sessionId) : null,
  }));
  try {
    await withDbRetry(() => db.insert(events).values(rows));
    return res.json({ recorded: rows.length });
  } catch (err) {
    console.error('[events] batch error:', err.message);
    return res.status(500).json({ error: 'Failed to record batch' });
  }
});

module.exports = router;
```

Customization ideas
Session replay markers
Add a session_id to every event, then build a GET /api/sessions/:id/timeline endpoint that returns all events for a session in chronological order — a lightweight session timeline for debugging user journeys.
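The shaping such a timeline endpoint could return is sketched below as a pure function — sort the session's events chronologically and attach the gap from the previous event, which makes stalls in a user journey easy to spot. The function name and output shape are assumptions, not the tutorial's exact API:

```javascript
// Sketch of a session timeline: events in chronological order, each with
// the milliseconds elapsed since the previous event in the session.
function buildTimeline(sessionEvents) {
  const sorted = [...sessionEvents].sort(
    (a, b) => new Date(a.timestamp) - new Date(b.timestamp)
  );
  return sorted.map((e, i) => ({
    eventName: e.eventName,
    timestamp: e.timestamp,
    msSincePrev: i === 0 ? 0 :
      new Date(e.timestamp) - new Date(sorted[i - 1].timestamp),
  }));
}
```

The route itself would just query events by session_id (an index on session_id helps) and pass the rows through this shaping.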
Property-based segments
Use the segments table to define user groups by event properties (e.g., users who did 'signup' with properties.plan = 'pro'). Add a GET /api/segments/:id/users endpoint that queries matching users for targeted analysis.
Alerting on event drops
Add a Scheduled Deployment that compares yesterday's event counts in daily_aggregates against the 7-day average. If any event drops more than 50%, send an email via SendGrid — catching instrumentation regressions before they go unnoticed.
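The drop check that job would run can be sketched as a pure function over daily_aggregates counts. The input shape (a map of event name to the last eight daily counts) is an assumption for illustration:

```javascript
// Sketch: flag any event whose count yesterday fell more than 50% below
// its trailing 7-day average. history maps eventName -> [8 daily counts],
// oldest first, with yesterday as the last entry.
function findDroppedEvents(history) {
  const dropped = [];
  for (const [eventName, counts] of Object.entries(history)) {
    const yesterday = counts[counts.length - 1];
    const prior = counts.slice(0, -1); // the 7 days before yesterday
    if (prior.length === 0) continue;
    const avg = prior.reduce((a, b) => a + b, 0) / prior.length;
    if (avg > 0 && yesterday < avg * 0.5) {
      dropped.push({ eventName, yesterday, sevenDayAvg: avg });
    }
  }
  return dropped;
}
```

The Scheduled Deployment would fetch the counts from daily_aggregates, call this, and email the `dropped` list via SendGrid when it is non-empty.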
Public shareable dashboards
Add a share_token column to funnels and dashboards. A GET /public/funnel/:token endpoint returns funnel data without auth — so you can share analytics results with stakeholders who don't have a Replit account.
Common pitfalls
Pitfall: Running aggregation queries against raw events on every dashboard load
How to avoid: Use the daily_aggregates table for overview queries. Only query raw events for drill-down pages that need properties-level detail, and add an index on (event_name, timestamp).
Pitfall: Counting funnel steps independently instead of sequentially
How to avoid: Use the PostgreSQL CTE approach from step 3. Each step's CTE JOINs the previous step's results to ensure sequential ordering within the time window.
Pitfall: Exposing the event ingest endpoint without authentication
How to avoid: Always require the x-api-key header on ingest endpoints. Store ANALYTICS_API_KEY in Replit Secrets, never hardcode it in the tracked app's client-side code — send events server-to-server if possible.
Pitfall: Not adding a data retention policy for raw events
How to avoid: The aggregation job deletes raw events older than 90 days. Keep daily_aggregates forever — they're tiny (one row per event name per day) and contain the historical trend data you actually need.
Best practices
- Return 200 from event ingest immediately after insert — never do aggregation, notification, or other work in the ingest request path.
- Use ANALYTICS_API_KEY in Replit Secrets (never hardcoded) to authenticate the ingest endpoint from your tracked app.
- Store the aggregation job in server/jobs/ and deploy it as a separate Scheduled Deployment — keep it completely separate from the main Express app.
- Index the events table on (event_name, timestamp) and (user_id, timestamp) for fast funnel and retention queries.
- Use Drizzle Studio (Database tab in Replit sidebar) to inspect raw events during development — faster than writing SELECT queries every time.
- Set an ANALYTICS_API_KEY that's at least 32 characters long — short keys are brute-forceable and your analytics data has competitive value.
- For the retention query, use PostgreSQL DATE_TRUNC to group users by their first-seen week — this creates consistent cohort boundaries.
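The cohort bucketing behind that last tip can be sketched in JavaScript. The real retention endpoint does this in SQL with DATE_TRUNC and groups users into per-week cohorts; this simplified single-cohort version (an assumption for clarity) shows the core idea of bucketing activity by weeks since first seen:

```javascript
// Sketch of retention bucketing: find each user's first event, then record
// which week offsets (0 = first week, 1 = next week, ...) they were active in.
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

function retentionGrid(events, weeks = 8) {
  // First-seen timestamp per user
  const firstSeen = new Map();
  for (const e of events) {
    const t = new Date(e.timestamp).getTime();
    if (!firstSeen.has(e.userId) || t < firstSeen.get(e.userId)) {
      firstSeen.set(e.userId, t);
    }
  }
  // Week offsets in which each user was active
  const offsets = new Map();
  for (const e of events) {
    const t = new Date(e.timestamp).getTime();
    const offset = Math.floor((t - firstSeen.get(e.userId)) / WEEK_MS);
    if (!offsets.has(e.userId)) offsets.set(e.userId, new Set());
    if (offset < weeks) offsets.get(e.userId).add(offset);
  }
  // % of users active in each week offset (single cohort for brevity;
  // the SQL version additionally groups by each user's cohort week)
  return Array.from({ length: weeks }, (_, w) => {
    let active = 0;
    for (const set of offsets.values()) if (set.has(w)) active++;
    return Math.round((active / firstSeen.size) * 100);
  });
}
```

Week 0 is always 100% by construction (every user is active in their own first week), which is a handy sanity check on the SQL version too.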
AI prompts to try
Copy these prompts to build this project faster.
I'm building a self-hosted product analytics system with Express and PostgreSQL. I have an events table (user_id, event_name, properties jsonb, timestamp) and a funnels table (steps jsonb — ordered array of event names, time_window_hours integer). Help me write a PostgreSQL function calculate_funnel(funnel_id integer) that uses a chain of CTEs to find users who completed each step sequentially within the time window, returning an array of {step, event_name, user_count, conversion_from_previous} objects.
Add an A/B testing layer to the product analytics app. Create an experiments table (id serial, name text unique, variants jsonb array of {name, weight} objects, start_date timestamp, end_date timestamp, status text enum draft/running/concluded) and an experiment_assignments table (user_id text, experiment_id integer, variant text, assigned_at timestamp, unique on user_id+experiment_id). Build GET /api/experiments/:id/assign?userId= that deterministically assigns a variant using a hash of (userId + experimentId) mod totalWeight — same user always gets same variant. Add GET /api/experiments/:id/results that calculates conversion rate per variant by joining assignments against events for a specified goal event. Show results in a React page with variant comparison cards and statistical significance indicator.
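The deterministic assignment that prompt describes can be sketched as follows. The specific hash (FNV-1a) is an assumption — any stable string hash works; the property that matters is that the same userId and experimentId always map to the same variant, with no assignment table lookup needed at decision time:

```javascript
// FNV-1a 32-bit hash — a stable, fast string hash (choice is an assumption)
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // 32-bit multiply, keep unsigned
  }
  return h;
}

// Map the hash into the weighted variant ranges.
// variants: [{ name, weight }] — weights need not sum to 100.
function assignVariant(userId, experimentId, variants) {
  const totalWeight = variants.reduce((s, v) => s + v.weight, 0);
  let bucket = fnv1a(`${userId}:${experimentId}`) % totalWeight;
  for (const v of variants) {
    if (bucket < v.weight) return v.name;
    bucket -= v.weight;
  }
  return variants[variants.length - 1].name; // defensive; loop always returns
}
```

Persisting the result to experiment_assignments is still worth doing so results queries can join a fixed record, but the hash guarantees re-assignment is always consistent.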
Frequently asked questions
How is this different from just using Google Analytics?
Google Analytics tracks page views and basic goals — it can't track custom application events like 'draft_saved' or 'report_exported' inside your app. This system tracks any event with any properties you define, stores them in your own database, and lets you write custom queries for analysis Google Analytics can't do.
Can I track events from a mobile app or multiple websites?
Yes. The ingest endpoint is a standard HTTP API. Any app that can make an HTTP POST request with the x-api-key header can send events. You might use different event_name prefixes per platform (e.g., 'ios_signup', 'web_signup') to distinguish sources in your queries.
What Replit plan do I need?
Free tier is sufficient for the core build. The main Express app deploys on Autoscale, and the aggregation job runs as a Scheduled Deployment. You only need a paid plan if you want an always-on Reserved VM deployment.
How many events can I store before hitting the 10GB PostgreSQL limit?
Each raw event row is approximately 200-500 bytes with properties. At 500 bytes per event, 10GB holds about 20 million events. The aggregation job deletes events older than 90 days, so active apps should stay well within the limit.
Why does the funnel calculation use a CTE instead of simple COUNTs?
A simple COUNT of each step event overcounts users who did step 3 without doing step 1. The CTE approach ensures each step only counts users who completed all previous steps in the correct order within the configured time window — which is what makes funnel data actionable.
Should I deploy on Autoscale or Reserved VM?
Autoscale works well for the main analytics API — event ingest traffic is bursty and the app handles cold starts gracefully. The aggregation job runs as a separate Scheduled Deployment so it's not affected by Autoscale cold starts.
Can RapidDev help build a custom analytics system for my app?
Yes. RapidDev has built analytics backends for 600+ apps and can extend this system with custom event schemas, advanced funnel analysis, real-time dashboards, and integrations with your existing data stack. Free consultation available.
Can I use this analytics system alongside Mixpanel or Amplitude?
Absolutely. Send events to both systems simultaneously from your app — your own system for custom analysis and full data ownership, the third-party tool for its polished UI features. The API is compatible with any event-tracking pattern.
Talk to an Expert
Our team has built 600+ apps. Get personalized help with your project.
Book a free consultation