Knowledge Hub

Modern ELT with dbt, BigQuery, and Python ingestion

How we cut data platform costs by ~60% by pairing dbt with BigQuery, using lightweight Python + make.com to ingest Oracle NetSuite—then delivering faster Tableau dashboards.


A practical blueprint: dbt + BigQuery + Python/make.com from NetSuite

Many teams inherit expensive ELT stacks dominated by per-connector fees and heavy ops. We rebuilt the stack for a multi-entity Oracle NetSuite customer on Google Cloud with dbt, BigQuery, and a thin Python / make.com ingestion layer. Result: ~60% lower run rate, simpler change management, and faster Tableau dashboards.

Why this stack

  • Serverless warehouse: BigQuery separates storage from compute, autoscales slots, and offers capacity (flat-rate) pricing for predictable spend.
  • Modelling you control: dbt makes transformations versioned, tested, and reviewable—no black-box connectors.
  • Low-cost ingestion: Python on Cloud Run plus make.com scenarios handle Oracle NetSuite APIs without per-connector premiums.
  • Tableau-ready outputs: Curated marts and certified extracts keep dashboards fast and trusted.

Reference architecture

  1. Ingestion: Cloud Run jobs (Python) pull NetSuite via RESTlet/SuiteTalk; make.com orchestrates delta pulls, retries, and alerts. Files land in Cloud Storage, then load to BigQuery raw tables.
  2. Transform: dbt models layered as staging → core → marts, with freshness + schema tests. Metrics are defined once for Tableau/AI reuse.
  3. Serve: Tableau connects to marts via live connections or extracts; row-level security is driven from dbt dimension tables.
  4. Ops & observability: Cloud Scheduler triggers runs; Cloud Logging + dbt artifacts feed a lightweight health dashboard.
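The retry-and-alert behavior in step 1 can be sketched in a few lines of Python. This is an illustrative sketch, not our production code: make.com provides retries natively, and a Cloud Run job can implement the same policy itself. The function name and parameters are ours, not from any library.

```python
import time

def with_retries(fn, *, attempts=4, base_delay=1.0,
                 retryable=(TimeoutError,), sleep=time.sleep):
    """Retry a callable on transient errors with exponential backoff.

    Sketch of the retry policy the ingestion layer needs: back off
    1s, 2s, 4s, ... on retryable errors, then surface the final
    failure so alerting (PagerDuty/Slack) can fire.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: let the error reach alerting
            sleep(base_delay * (2 ** attempt))
```

In a Cloud Run job you would wrap each NetSuite API call with a policy like this, treating timeouts and HTTP 429/5xx responses as retryable and everything else as fatal.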

NetSuite ingestion playbook

  • Authentication: Token-based auth per environment; secrets in Secret Manager.
  • Delta strategy: System notes and last-modified timestamps drive incremental pulls; Cloud Storage keeps 30-day raw archives.
  • Schema drift: Python normalizes field changes; dbt tests catch missing/extra columns before they hit marts.
  • Error handling: make.com retries transient API errors; fatal errors open PagerDuty/Slack alerts with request IDs.
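The delta-strategy and schema-drift bullets above can be sketched as two small pure functions. The column names, the `TO_TIMESTAMP` predicate, and the expected schema below are illustrative assumptions; adapt them to the RESTlet/SuiteTalk fields and query dialect you actually call.

```python
from datetime import datetime

# Illustrative target schema for one raw table; in practice this comes
# from the dbt source definition, not a hard-coded list.
EXPECTED_COLUMNS = ["id", "tranid", "lastmodifieddate", "amount"]

def build_delta_filter(watermark_iso: str) -> str:
    """Build a WHERE-clause fragment for an incremental pull.

    Validates the watermark before it is interpolated, so a corrupted
    state file fails loudly instead of silently pulling everything.
    """
    datetime.fromisoformat(watermark_iso)  # raises on malformed input
    return f"lastmodifieddate > TO_TIMESTAMP('{watermark_iso}')"

def normalize_record(record: dict) -> dict:
    """Coerce an API record to the expected schema.

    Missing columns are filled with None and unexpected extras are
    dropped, so raw loads never break on upstream schema drift; dbt
    tests then flag the drift before it reaches the marts.
    """
    return {col: record.get(col) for col in EXPECTED_COLUMNS}
```

The watermark itself is stored per source table (for example in a small BigQuery state table) and advanced only after a load commits, which keeps pulls idempotent on retry.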

Cost controls that delivered ~60% savings

  • Ingestion cost: Replaced per-connector billing with Cloud Run + make.com scenario minutes; NetSuite API calls stayed within plan.
  • Compute: dbt runs on scheduled slots; most models are incremental to avoid full scans.
  • Storage: Tiered Cloud Storage classes for raw archives, plus a shortened time-travel window in BigQuery to trim retention costs.
  • Tableau extracts: Only business-critical dashboards use extracts; others run live with aggregate tables to minimize compute.
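Most of the compute savings come from incremental models scanning fewer bytes. A back-of-envelope helper makes that concrete; the per-TiB price is deliberately a parameter rather than a hard-coded number, since on-demand pricing varies by edition and region (check current GCP pricing).

```python
def on_demand_query_cost(bytes_processed: int, price_per_tib: float) -> float:
    """Estimate on-demand BigQuery query cost from bytes scanned.

    bytes_processed is what the query dry run or job stats report;
    price_per_tib is the caller's current on-demand rate.
    """
    TIB = 2 ** 40
    return bytes_processed / TIB * price_per_tib
```

Running this for a full-refresh model versus its incremental variant (full table scan versus one day's partition) is a quick way to show stakeholders where the ~60% figure comes from.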

Operations & change management

  • Git + CI: Pull requests run dbt test and a slim data diff on staging.
  • Runbooks: Standard playbooks for failed loads, schema additions, and backfills.
  • Data contracts: dbt tests and source freshness enforce SLAs with upstream app teams.
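The "slim data diff" in the CI bullet can be as simple as a keyed comparison between production and staging query results. This is a minimal stand-in sketch (the function and field names are ours); dedicated tools do the same thing at scale with checksums and sampling.

```python
def slim_data_diff(prod_rows, staging_rows, key="id"):
    """Compare two row sets by primary key.

    Returns the keys that a pull request would add, remove, or change,
    so reviewers see the data impact of a model change, not just the
    SQL diff.
    """
    prod = {r[key]: r for r in prod_rows}
    stage = {r[key]: r for r in staging_rows}
    return {
        "added": sorted(stage.keys() - prod.keys()),
        "removed": sorted(prod.keys() - stage.keys()),
        "changed": sorted(k for k in prod.keys() & stage.keys()
                          if prod[k] != stage[k]),
    }
```

In CI, the PR comment then shows counts per bucket; an unexpectedly large "removed" bucket blocks the merge.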

Tableau delivery patterns

  • Certified data sources: Only marts with passing tests are published to Tableau Server/Cloud.
  • Performance: Extracts are filtered and aggregated; live connections hit summary tables with clustering/partitioning.
  • RLS: User entitlements stored in a dbt-managed dimension; applied in Tableau via data source filters.
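The RLS pattern above boils down to a semi-join between facts and a user-entitlement dimension. In our setup Tableau applies this via data source filters against the dbt-managed dimension; the sketch below shows the equivalent logic in plain Python, with illustrative field names.

```python
def apply_rls(rows, entitlements, user):
    """Filter fact rows to the entities a user may see.

    entitlements mirrors a dbt-managed dimension of
    (user, entity_id) pairs; rows are fact records carrying an
    entity_id column. Tableau enforces the same join at query time.
    """
    allowed = {e["entity_id"] for e in entitlements if e["user"] == user}
    return [r for r in rows if r["entity_id"] in allowed]
```

Because the entitlement table lives in dbt, it is versioned and tested like any other model, and a bad entitlement change is caught in CI rather than in a dashboard.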

If you’re not on Google Cloud

Prefer AWS? Swap BigQuery for Redshift or Snowflake; run Python ingestion in Lambda/Fargate; orchestrate with EventBridge + Step Functions. The principles remain: light ingestion, versioned transformations, cost-aware compute, and certified BI outputs.

Key takeaways

  • Own your transformations with dbt; don’t outsource core logic to connectors.
  • Keep ingestion thin and observable; API-first beats heavy agents.
  • Design marts for the tools your users live in—here, Tableau—and enforce trust with tests + RLS.
  • Cost wins come from architectural choices, not just discounts.

Need a similar rebuild? We’ve used this pattern to drop platform costs by more than 60% while speeding dashboard delivery. Let’s map it to your stack.