Build a custom report builder in Replit using Express and PostgreSQL in 1-2 hours. Users define reports by picking columns, adding filters, and choosing aggregates — the engine safely translates their selections into SQL using Drizzle's parameterized query builder. Reports export to CSV or PDF, and a Scheduled Deployment emails them on a schedule.
What you're building
A custom report builder solves a problem every growing business faces: your team wants to answer specific questions from your data, but they can't write SQL. A reporting tool gives non-technical users a way to define exactly the data slice they need — which columns to show, which filters to apply, and how to aggregate — without touching a database query.
Replit Agent builds the complete backend with four tables: data_sources (registered tables in your PostgreSQL), reports (saved report definitions as JSON), report_schedules (automated email delivery), and report_exports (generated file history). The execution engine is the critical piece — it reads a report's JSON config and builds a safe SQL query using Drizzle's parameterized query builder. Never string-interpolating user input into SQL is the most important security rule here.
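To make the JSON config concrete, here is the kind of report definition the engine consumes — a hypothetical "Revenue by month" report. The field names (`order_month`, `total`, `status`) are illustrative assumptions, not part of the generated schema:

```javascript
// Hypothetical saved report config, matching the shapes stored in
// reports.columns, reports.filters, reports.group_by, and reports.sort_by.
// Field names here are assumptions for illustration only.
const revenueByMonth = {
  name: 'Revenue by month',
  columns: [
    { field: 'order_month', label: 'Month', aggregate: null },
    { field: 'total', label: 'Revenue', aggregate: 'sum' },
  ],
  filters: [
    { field: 'status', operator: 'eq', value: 'paid' },
  ],
  groupBy: ['order_month'],
  sortBy: { field: 'order_month', direction: 'asc' },
};

// The executor in step 3 would render this as roughly:
//   SELECT order_month AS "Month", SUM(total) AS "Revenue"
//   FROM orders WHERE status = $1
//   GROUP BY order_month ORDER BY order_month ASC LIMIT 1000
```

Only the filter value (`'paid'`) travels as a query parameter; every identifier is sanitized before it reaches the SQL string.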
The schema introspection feature makes the tool self-documenting: when a user picks a data source, the API queries PostgreSQL's information_schema.columns to return the actual column names and data types for that table. This means the column picker always shows accurate, up-to-date options without you maintaining a separate schema definition file. A Scheduled Deployment runs daily to check which reports have email delivery configured and are due to run — generating the report, attaching a CSV, and sending via SendGrid.
Final result
A custom report builder with schema introspection, SQL-safe query execution, CSV/PDF export, and scheduled email delivery — running on your Replit PostgreSQL database with no external BI tool costs.
Tech stack
- Express + Node.js backend with Drizzle ORM
- Replit PostgreSQL (built-in database)
- React frontend for the report wizard
- Replit Auth for authentication
- SendGrid for scheduled email delivery
- pdfkit for PDF export
Prerequisites
- A Replit account (Free tier is sufficient)
- A SendGrid account (free tier) for scheduled report email delivery
- SENDGRID_API_KEY added to Replit Secrets and Deployment Secrets
- Some data in your PostgreSQL tables to report on — the tool works on your existing tables
Build steps
Scaffold the project with Replit Agent
Create a new Replit App and paste this prompt. Agent builds the complete reporting backend with all four tables and the React wizard frontend.
```
// Build a custom reporting tool with Express and PostgreSQL using Drizzle ORM.
// Use Replit Auth for authentication.
//
// Tables:
// 1. data_sources: id serial primary key, user_id text not null, name text not null,
//    type text default 'postgres' (enum: postgres/csv/api),
//    connection_config jsonb (stores table name for postgres type),
//    created_at timestamp default now()
// 2. reports: id serial primary key, user_id text not null, name text not null,
//    data_source_id integer references data_sources not null,
//    columns jsonb not null (array of {field, label, aggregate: null/sum/count/avg/min/max}),
//    filters jsonb (array of {field, operator: eq/neq/gt/lt/gte/lte/like/in, value}),
//    group_by text[] (PostgreSQL text array of field names),
//    sort_by jsonb ({field, direction: asc/desc}),
//    created_at timestamp default now(), updated_at timestamp default now()
// 3. report_schedules: id serial primary key,
//    report_id integer references reports not null,
//    frequency text not null (enum: daily/weekly/monthly),
//    email_to text not null, last_run_at timestamp, next_run_at timestamp
// 4. report_exports: id serial primary key,
//    report_id integer references reports not null,
//    format text not null (enum: csv/pdf/json),
//    file_url text, created_at timestamp default now()
//
// Routes:
// GET/POST /api/data-sources
// GET /api/data-sources/:id/schema (introspect columns using information_schema)
// GET/POST /api/reports, GET /api/reports/:id
// POST /api/reports/:id/run (execute report, return JSON results)
// POST /api/reports/:id/export (generate CSV or PDF)
// POST /api/report-schedules
//
// Bind to 0.0.0.0:3000. Use Replit Auth.
```

Pro tip: After scaffolding, register your first data source via POST /api/data-sources with {name: 'Orders', type: 'postgres', connection_config: {table: 'orders'}}. Then call GET /api/data-sources/1/schema to see all columns auto-detected from information_schema.
Expected result: Running Express app with four tables. GET /api/data-sources returns empty array. Replit Auth login gate on all routes.
Build the schema introspection route
This route queries PostgreSQL's information_schema to return the columns and data types for any registered table — powering the dynamic column picker in the report wizard.
```
const express = require('express');
const { db } = require('../db');
const { dataSources } = require('../schema');
const { eq, and } = require('drizzle-orm');

const router = express.Router();

router.get('/api/data-sources/:id/schema', async (req, res) => {
  if (!req.user) return res.status(401).json({ error: 'Login required' });

  const [source] = await db.select().from(dataSources)
    .where(and(
      eq(dataSources.id, parseInt(req.params.id)),
      eq(dataSources.userId, req.user.id)
    )).limit(1);

  if (!source) return res.status(404).json({ error: 'Data source not found' });

  const tableName = source.connectionConfig?.table;
  if (!tableName) return res.status(400).json({ error: 'No table configured for this source' });

  // Validate table name is alphanumeric to prevent SQL injection
  if (!/^[a-zA-Z0-9_]+$/.test(tableName)) {
    return res.status(400).json({ error: 'Invalid table name' });
  }

  const columns = await db.execute({
    sql: `
      SELECT column_name, data_type, is_nullable, column_default
      FROM information_schema.columns
      WHERE table_schema = 'public'
        AND table_name = $1
      ORDER BY ordinal_position
    `,
    params: [tableName],
  });

  // Map PostgreSQL types to friendlier labels
  const typeMap = {
    'integer': 'number', 'bigint': 'number', 'numeric': 'number',
    'text': 'string', 'character varying': 'string', 'varchar': 'string',
    'timestamp with time zone': 'date', 'timestamp without time zone': 'date',
    'date': 'date', 'boolean': 'boolean', 'jsonb': 'json', 'json': 'json',
    'ARRAY': 'array',
  };

  const schema = columns.rows.map(col => ({
    field: col.column_name,
    type: typeMap[col.data_type] || col.data_type,
    nullable: col.is_nullable === 'YES',
    // Suggest aggregates based on type
    supportedAggregates: typeMap[col.data_type] === 'number'
      ? ['count', 'sum', 'avg', 'min', 'max']
      : ['count'],
  }));

  return res.json({ sourceId: source.id, tableName, columns: schema });
});

module.exports = router;
```

Pro tip: The table name validation regex (alphanumeric + underscore only) is essential. In this route the table name is safely passed as a parameter, but the query executor in step 3 interpolates it into the FROM clause — and identifiers can't be parameterized in PostgreSQL, so a malicious table name stored without this check could inject SQL there.
Expected result: GET /api/data-sources/1/schema returns an array of columns with types. A table with 8 columns returns 8 schema entries, each with field name, type, and supported aggregates.
Build the safe SQL query executor
The report execution engine reads the report's JSON config and builds a parameterized SQL query. Filter values travel through the params array; field names, aggregates, and the table name are sanitized or whitelisted before they touch the SQL string. Never string-interpolate user values — always use parameterized queries.
```
const express = require('express');
const { db } = require('../db');
const { reports, dataSources } = require('../schema');
const { eq, and } = require('drizzle-orm');

const router = express.Router();

// Map operator names to SQL operators ('in' needs IN-list handling; it
// falls back to '=' here)
const OPERATORS = {
  eq: '=', neq: '!=', gt: '>', lt: '<', gte: '>=', lte: '<=', like: 'LIKE',
};

// Whitelist aggregates so user-supplied JSON can't inject arbitrary SQL
const AGGREGATES = new Set(['count', 'sum', 'avg', 'min', 'max']);

router.post('/api/reports/:id/run', express.json(), async (req, res) => {
  if (!req.user) return res.status(401).json({ error: 'Login required' });

  const [report] = await db.select().from(reports)
    .where(and(eq(reports.id, parseInt(req.params.id)), eq(reports.userId, req.user.id)))
    .limit(1);
  if (!report) return res.status(404).json({ error: 'Report not found' });

  const [source] = await db.select().from(dataSources)
    .where(eq(dataSources.id, report.dataSourceId)).limit(1);
  const tableName = source?.connectionConfig?.table;

  if (!tableName || !/^[a-zA-Z0-9_]+$/.test(tableName)) {
    return res.status(400).json({ error: 'Invalid data source table' });
  }

  // Build SELECT clause; strip quotes from labels so they can't break out
  // of the quoted alias
  const selectParts = report.columns.map(col => {
    const safeField = col.field.replace(/[^a-zA-Z0-9_]/g, '');
    const safeLabel = String(col.label || safeField).replace(/"/g, '');
    if (col.aggregate && AGGREGATES.has(col.aggregate)) {
      return `${col.aggregate.toUpperCase()}(${safeField}) AS "${safeLabel}"`;
    }
    return `${safeField} AS "${safeLabel}"`;
  });

  // Build WHERE clause with parameterized values
  const params = [];
  const whereParts = (report.filters || []).map(filter => {
    const safeField = filter.field.replace(/[^a-zA-Z0-9_]/g, '');
    const op = OPERATORS[filter.operator] || '=';
    params.push(filter.value);
    return `${safeField} ${op} $${params.length}`;
  });

  // Build GROUP BY
  const groupByFields = (report.groupBy || []).map(f => f.replace(/[^a-zA-Z0-9_]/g, ''));

  // Build ORDER BY
  let orderBy = '';
  if (report.sortBy?.field) {
    const safeSort = report.sortBy.field.replace(/[^a-zA-Z0-9_]/g, '');
    const dir = report.sortBy.direction === 'desc' ? 'DESC' : 'ASC';
    orderBy = `ORDER BY ${safeSort} ${dir}`;
  }

  const query = [
    `SELECT ${selectParts.join(', ')}`,
    `FROM ${tableName}`,
    whereParts.length ? `WHERE ${whereParts.join(' AND ')}` : '',
    groupByFields.length ? `GROUP BY ${groupByFields.join(', ')}` : '',
    orderBy,
    'LIMIT 1000',
  ].filter(Boolean).join('\n');

  const result = await db.execute({ sql: query, params });
  return res.json({ rows: result.rows, count: result.rows.length });
});

module.exports = router;
```

Pro tip: The LIMIT 1000 at the end prevents memory exhaustion on large tables. For exports, the CSV streaming route can remove this limit and stream rows in chunks of 500 to avoid loading the entire result into memory.
Expected result: POST /api/reports/1/run returns {rows: [{column1: val, column2: val}], count: 47} based on the report's saved configuration.
Add CSV export and scheduled report delivery
The export route generates a CSV from the report results. A Scheduled Deployment checks for due report schedules and emails the CSV attachment via SendGrid.
Ask Agent to add CSV export and scheduled delivery with this prompt:

```
// Add to server/routes/reports.js:
//
// POST /api/reports/:id/export:
// 1. Require auth, load report config
// 2. Run the same SQL query as /run (extract to a shared executeReport(reportId) function)
// 3. If format='csv':
//    - Generate CSV string: first row = column labels, remaining rows = data values
//    - Use Array.join() to build CSV, escape values with quotes if they contain commas
//    - Set Content-Type: text/csv, Content-Disposition: attachment; filename=report.csv
//    - Send the CSV string directly (no file storage needed for on-demand exports)
// 4. If format='pdf':
//    - Use pdfkit npm package
//    - Create a PDF doc with the report name as title, run date, and data as a simple table
//    - Send as application/pdf response
// 5. Insert into report_exports table with format and created_at
//
// Create server/jobs/sendScheduledReports.js:
// 1. Query report_schedules where next_run_at <= now()
// 2. For each due schedule, call executeReport(schedule.report_id) to get rows
// 3. Generate CSV attachment
// 4. Send email via SendGrid (SENDGRID_API_KEY from process.env) with CSV attachment
// 5. Update last_run_at = now(), calculate next_run_at based on frequency
//    (daily: + 1 day, weekly: + 7 days, monthly: + 1 month)
// 6. Deploy as Scheduled Deployment running every hour
```

Pro tip: For the CSV export, handle commas and newlines in cell values by wrapping all values in double quotes and escaping internal double quotes as '""'. This prevents broken CSV files when report data contains punctuation.
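The escaping rule from the pro tip fits in a small helper. This is a minimal sketch of the quoting logic the prompt describes, not the Agent-generated code — the function names are assumptions:

```javascript
// Quote every cell and double any embedded double quotes (RFC 4180 style),
// so commas, newlines, and quotes inside values can't break the CSV.
function escapeCsvValue(value) {
  const s = value === null || value === undefined ? '' : String(value);
  return `"${s.replace(/"/g, '""')}"`;
}

// Build a CSV string: first row = column labels, then one row per record.
function toCsv(labels, rows) {
  const header = labels.map(escapeCsvValue).join(',');
  const body = rows.map(row =>
    labels.map(label => escapeCsvValue(row[label])).join(',')
  );
  return [header, ...body].join('\n');
}
```

For example, `toCsv(['Name'], [{ Name: 'Acme, Inc.' }])` produces `"Name"` on the first line and `"Acme, Inc."` on the second, which spreadsheet apps parse back into a single cell.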
Expected result: POST /api/reports/1/export with {format:'csv'} triggers a CSV file download. The Scheduled Deployment sends the report CSV via email at the configured frequency.
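The next_run_at bump from step 5 of the prompt can be sketched as a pure helper. The function name is an assumption; the Agent may structure this differently:

```javascript
// Compute the next scheduled run from the frequency enum.
// 'monthly' uses setMonth so the day-of-month is preserved; note that
// JavaScript rolls overflowing dates forward (e.g. Jan 31 + 1 month
// lands in early March), which may be acceptable for report delivery.
function computeNextRunAt(frequency, from = new Date()) {
  const next = new Date(from.getTime());
  if (frequency === 'daily') next.setDate(next.getDate() + 1);
  else if (frequency === 'weekly') next.setDate(next.getDate() + 7);
  else if (frequency === 'monthly') next.setMonth(next.getMonth() + 1);
  else throw new Error(`Unknown frequency: ${frequency}`);
  return next;
}
```

After each successful send, the job would write `last_run_at = now()` and `next_run_at = computeNextRunAt(schedule.frequency)` back to report_schedules.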
Build the three-step React report wizard
Ask Agent to create the React frontend with the wizard for building reports and a saved reports list.
```
// Build a React reporting tool frontend with three pages:
//
// 1. Reports List page (/):
//    - List of user's saved reports with name, data source, and created date
//    - Each row has: Run (executes and shows results), Export CSV, Export PDF, Schedule buttons
//    - 'Create Report' button navigates to the wizard
//
// 2. Report Builder Wizard (/reports/new and /reports/:id/edit):
//    Step 1 — Select Data Source and Columns:
//    - Dropdown of GET /api/data-sources sources
//    - On selection, fetch GET /api/data-sources/:id/schema for column list
//    - Checkbox list of columns, each with a label input and aggregate selector dropdown
//      (none/count/sum/avg/min/max — only show numeric aggregates for number-type columns)
//
//    Step 2 — Add Filters:
//    - 'Add Filter' button adds a row: field selector, operator selector, value input
//    - Operators: equals, not equals, greater than, less than, contains, is empty
//    - A 'Sort by' row with field selector and ASC/DESC toggle
//
//    Step 3 — Preview and Save:
//    - 'Preview' button calls POST /api/reports/:id/run and shows results in a data table
//    - Results table with sortable column headers
//    - Row count badge
//    - 'Save Report' saves the config, 'Export CSV' triggers download
//
// 3. Schedule Dialog:
//    - Opens when clicking 'Schedule' on a saved report
//    - Frequency selector (daily/weekly/monthly)
//    - Email input
//    - Calls POST /api/report-schedules on save
```

Expected result: The wizard shows column checkboxes populated from schema introspection. Adding filters shows the correct operators. Preview renders a data table with real query results.
Complete code
```
const express = require('express');
const { db } = require('../db');
const { dataSources } = require('../schema');
const { eq, and } = require('drizzle-orm');
const { withDbRetry } = require('../lib/retryDb');

const router = express.Router();

const TYPE_MAP = {
  integer: 'number', bigint: 'number', numeric: 'number', real: 'number',
  text: 'string', 'character varying': 'string', varchar: 'string', char: 'string',
  'timestamp with time zone': 'date', 'timestamp without time zone': 'date', date: 'date',
  boolean: 'boolean', jsonb: 'json', json: 'json', 'ARRAY': 'array',
};

router.get('/api/data-sources', async (req, res) => {
  if (!req.user) return res.status(401).json({ error: 'Login required' });
  const rows = await db.select().from(dataSources)
    .where(eq(dataSources.userId, req.user.id))
    .orderBy(dataSources.name);
  return res.json({ sources: rows });
});

router.post('/api/data-sources', express.json(), async (req, res) => {
  if (!req.user) return res.status(401).json({ error: 'Login required' });
  const { name, type, connectionConfig } = req.body;
  if (!name || !type) return res.status(400).json({ error: 'name and type are required' });
  if (type === 'postgres' && !/^[a-zA-Z0-9_]+$/.test(connectionConfig?.table)) {
    return res.status(400).json({ error: 'Invalid table name — alphanumeric and underscores only' });
  }
  const row = await withDbRetry(() =>
    db.insert(dataSources).values({
      userId: req.user.id, name, type,
      connectionConfig: connectionConfig || null,
    }).returning()
  );
  return res.status(201).json(row[0]);
});

router.get('/api/data-sources/:id/schema', async (req, res) => {
  if (!req.user) return res.status(401).json({ error: 'Login required' });
  const [source] = await db.select().from(dataSources)
    .where(and(eq(dataSources.id, parseInt(req.params.id)), eq(dataSources.userId, req.user.id)))
    .limit(1);
  if (!source) return res.status(404).json({ error: 'Data source not found' });
  const tableName = source.connectionConfig?.table;
  if (!tableName || !/^[a-zA-Z0-9_]+$/.test(tableName)) {
    return res.status(400).json({ error: 'Invalid table name in data source config' });
  }
  const result = await db.execute({
    sql: `SELECT column_name, data_type, is_nullable FROM information_schema.columns
          WHERE table_schema = 'public' AND table_name = $1 ORDER BY ordinal_position`,
    params: [tableName],
  });
  const columns = result.rows.map(col => ({
    field: col.column_name,
    type: TYPE_MAP[col.data_type] || col.data_type,
    nullable: col.is_nullable === 'YES',
    supportedAggregates: TYPE_MAP[col.data_type] === 'number'
      ? ['none', 'count', 'sum', 'avg', 'min', 'max']
      : ['none', 'count'],
  }));
  return res.json({ sourceId: source.id, tableName, columns });
});

module.exports = router;
```

Customization ideas
Report sharing via public link
Add a share_token column to reports. A GET /public/reports/:token/run endpoint executes the report without auth, returning JSON data — so stakeholders can embed report data in external dashboards without a Replit account.
Chart visualization layer
After the results table, add a 'Create Chart' button that opens a chart configurator. Users pick a chart type (bar, line, pie) and map report columns to axes. Render using recharts in the preview panel.
CSV data source support
Allow users to upload a CSV file as a data source. Parse the CSV with csv-parse, store rows in a dynamic_data table with dataset_id, and treat the column headers as the schema for the report builder.
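The header-to-schema step can be sketched in plain JavaScript. The real build would likely use the csv-parse package for robust parsing (quoted fields, embedded commas); this minimal version only illustrates the type-inference idea, and the function name is an assumption:

```javascript
// Infer a report-builder schema from a CSV header row plus one sample
// data row. A naive split(',') illustration only -- real CSV files need
// quoted-field handling (e.g. via the csv-parse package).
function inferCsvSchema(headerLine, sampleLine) {
  const fields = headerLine.split(',').map(h => h.trim());
  const samples = sampleLine.split(',').map(v => v.trim());
  return fields.map((field, i) => {
    const value = samples[i];
    const type = value !== '' && !Number.isNaN(Number(value)) ? 'number' : 'string';
    return {
      field,
      type,
      supportedAggregates: type === 'number'
        ? ['count', 'sum', 'avg', 'min', 'max']
        : ['count'],
    };
  });
}
```

The output matches the shape the schema introspection route returns for PostgreSQL tables, so the report wizard's column picker can consume CSV sources unchanged.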
Report versioning
Add a report_versions table that stores snapshots of the report config on each save. Users can view version history and restore a previous config if they accidentally break a useful report.
Common pitfalls
Pitfall: String-interpolating user-selected field names directly into SQL
How to avoid: Always sanitize field names with a regex that only allows alphanumeric characters and underscores: field.replace(/[^a-zA-Z0-9_]/g, ''). Never use raw user input as table or column names without this validation.
Pitfall: Running reports without a LIMIT clause
How to avoid: Always add LIMIT 1000 (or configurable max) to the generated query for preview. For exports, stream rows in batches of 500 rather than loading the full result set at once.
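The batching idea can be sketched independently of the database driver by injecting a fetch function. This is a sketch under stated assumptions, not the export route itself — in the real route, fetchBatch would wrap db.execute with a LIMIT/OFFSET query:

```javascript
// Stream report rows in fixed-size batches via LIMIT/OFFSET pagination.
// fetchBatch(limit, offset) is an injected async function returning an
// array of rows; at most one batch is held in memory at a time.
async function* streamReportRows(fetchBatch, batchSize = 500) {
  let offset = 0;
  while (true) {
    const rows = await fetchBatch(batchSize, offset);
    if (rows.length === 0) return;
    yield rows;
    if (rows.length < batchSize) return; // short batch = last page
    offset += batchSize;
  }
}
```

The export route would `for await (const batch of streamReportRows(...))` and write each batch to the response. Note that OFFSET pagination rescans skipped rows on each query; for very large tables, keyset pagination (WHERE id > lastSeenId) is the more scalable variant.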
Pitfall: Using the same data source for the reporting tool and the application it reports on
How to avoid: Reporting queries are SELECT-only, so they won't lock rows — but heavy aggregates can still saturate CPU and exhaust the connection pool. Limit concurrent report queries to 3 using a simple in-memory counter middleware to keep the application responsive.
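The in-memory counter can be sketched as Express middleware. A minimal version with assumed names — and note the limit is per process, so each Autoscale instance counts separately:

```javascript
// Cap concurrent report executions with an in-memory counter.
// The counter decrements exactly once per request, whether the response
// finishes normally or the client disconnects ('close').
function makeConcurrencyLimiter(max = 3) {
  let active = 0;
  return function limitReports(req, res, next) {
    if (active >= max) {
      return res.status(429).json({ error: 'Too many reports running, retry shortly' });
    }
    active += 1;
    let released = false;
    const release = () => {
      if (!released) { released = true; active -= 1; }
    };
    res.on('finish', release);
    res.on('close', release); // fires after 'finish' too; guard prevents double-decrement
    next();
  };
}

// Usage (hypothetical handler name):
// app.post('/api/reports/:id/run', makeConcurrencyLimiter(3), runReportHandler);
```

Returning 429 rather than queueing keeps the behavior predictable; the frontend can retry with a short backoff.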
Pitfall: Forgetting to add SENDGRID_API_KEY to Deployment Secrets
How to avoid: Add SENDGRID_API_KEY to both Workspace Secrets (lock icon) and Deployment Secrets (Publish pane → Secrets). Redeploy after adding.
Best practices
- Validate all user-provided field names and table names against a strict alphanumeric regex before using them in any SQL context — never trust user input for identifiers.
- Always add LIMIT to generated queries. Let users configure a higher limit (up to 10,000) only for exports, and stream those in chunks.
- Use PostgreSQL information_schema for column discovery — it's always accurate and eliminates maintenance of a separate schema definition file.
- Store report configs as JSONB rather than separate relational columns — report definitions change shape often, and JSONB handles this without schema migrations.
- Deploy on Autoscale for the main report API. Create a separate Scheduled Deployment for the email delivery job — run every hour to check for due schedules.
- Add SENDGRID_API_KEY to Replit Secrets via the lock icon in the sidebar, and remember to add it again in Deployment Secrets in the Publish pane — they are completely separate.
- Use Drizzle Studio (Database tab in Replit sidebar) to inspect report_schedules and verify next_run_at is being calculated correctly after each delivery.
AI prompts to try
Copy these prompts to build this project faster.
I'm building a custom report builder with Express and PostgreSQL. I have a reports table where each report has columns as JSONB (array of {field, label, aggregate}), filters as JSONB (array of {field, operator, value}), group_by as text[] array, and sort_by as JSONB ({field, direction}). Help me write a JavaScript function buildReportQuery(report, tableName) that translates this config into a safe parameterized SQL string and params array using Drizzle ORM's sql template tag. Field names must be sanitized to alphanumeric+underscore only before being used in the query.
Add a report templates library to the reporting tool. Create a report_templates table (id serial, name text, description text, category text, data_source_type text, config jsonb — contains the full columns/filters/groupBy/sortBy structure, is_public boolean default true). Pre-populate with 5 templates: 'Monthly Revenue Summary', 'Top Customers by Order Count', 'Weekly Signups by Source', 'Product Sales by Category', 'Refund Rate by Month'. Build a GET /api/templates route that returns public templates, a POST /api/templates/:id/fork route that creates a copy of the template as a user's saved report (they can then customize it), and a React Templates page showing template cards with name, description, and a 'Use Template' button.
Frequently asked questions
Can I report on tables from a different database, not Replit's built-in PostgreSQL?
The current build uses Replit's built-in PostgreSQL and the information_schema for introspection. To connect to an external database, store the external connection string in Replit Secrets and create a second Drizzle client in server/externalDb.js. Note that external databases require whitelisting 0.0.0.0/0 since Replit has dynamic outbound IPs.
Is the report execution safe from SQL injection?
Yes, if you follow the pattern in step 3: sanitize all field and table names with regex (only alphanumeric + underscore), use parameterized queries for all filter values via the params array, and never string-interpolate user input directly into SQL strings. The OPERATORS whitelist ensures only valid comparison operators are used.
What Replit plan do I need?
Free tier is sufficient. The main app runs on Autoscale, and the Scheduled Deployment for email delivery also runs on Free tier. SendGrid has a free tier of 100 emails/day — enough for most scheduled report needs.
Can non-technical users add data to the reporting tool without SQL?
That's the whole point — the three-step wizard lets users pick columns, add filter conditions, and choose aggregates through dropdowns and checkboxes. They never write SQL. The introspection API shows only the columns that actually exist in the table.
How do I handle reports that join multiple tables?
The current build supports single-table reports. For joins, extend the report schema with a joins JSONB field (array of {table, join_type, on_field}). The query builder would need to generate JOIN clauses for each entry. This is the main complexity increase for a v2 build.
Should I deploy on Autoscale or Reserved VM?
Autoscale works for the main report API. Reports are run on-demand and users tolerate a short cold-start wait. The scheduled report delivery job runs as a separate Scheduled Deployment, completely independent of the main app's deployment target.
Can RapidDev help build a custom reporting tool for my business?
Yes. RapidDev builds custom reporting systems with multi-table joins, role-based access to specific data sources, white-labeled report exports, and integration with your existing BI stack. 600+ apps built, free consultation available.
What happens if a report returns too many rows and crashes the server?
The LIMIT 1000 in the query execution prevents loading too many rows for preview. For CSV exports, stream rows in batches of 500 using a cursor-based approach rather than loading the full result set. Add a 30-second query timeout on the database connection to prevent runaway queries.
Talk to an Expert
Our team has built 600+ apps. Get personalized help with your project.
Book a free consultation