Unlocking True ROAS: Combining Server-Side GA4, Google Ads & Facebook Ads in BigQuery for Custom Attribution

You've invested heavily in building a robust server-side Google Analytics 4 (GA4) pipeline, leveraging Google Tag Manager (GTM) Server Container on Cloud Run for advanced data collection, transformations, and granular consent management. This architecture provides unparalleled control and data quality, forming the backbone of your modern analytics strategy. You're effectively capturing online interactions, enriching events, and perhaps even unifying your raw data with GA4 exports in BigQuery.

However, a persistent and critical challenge for data-driven marketing teams is achieving a truly unified view of marketing performance and accurately calculating Return On Ad Spend (ROAS) across disparate advertising platforms. Relying solely on the attribution models and reporting interfaces provided by Google Ads, Facebook Ads, or even GA4 often leads to:

  • Fragmented Data: Cost data lives in ad platforms, conversion data in analytics, and blending them manually is time-consuming and error-prone.
  • Inconsistent Attribution: Each platform has its own default attribution model (e.g., last-click for Google Ads, 1-day view/7-day click for Facebook), leading to discrepancies and making it impossible to compare apples to apples.
  • Limited Customization: You can't apply custom, business-specific attribution models (e.g., a blended model that prioritizes certain touchpoints) directly within platform UIs.
  • Elusive True ROAS: Without a consolidated view of costs and revenue attributed under a consistent model, calculating accurate, blended ROAS across all channels is a guessing game.

The problem, then, is the need for a centralized, extensible platform to combine server-side GA4 conversion data with advertising cost data, enabling custom attribution modeling and precise ROAS calculations that reflect your business's unique understanding of customer value.

The Challenge: Beyond Platform Silos

Imagine a customer journey where a user sees a Facebook ad, clicks a Google Search ad, browses your site, and then converts.

  • Facebook attributes the conversion to its ad.
  • Google Ads attributes it to its ad.
  • GA4 (depending on its default model) attributes it to the last non-direct channel.

Each platform claims credit, and none provide the holistic picture needed for optimal budget allocation. You need to pull all the pieces together into your own data warehouse, BigQuery, where you have the flexibility to define your own rules.

Our Solution: A BigQuery-Centric Marketing Performance Hub

Our solution leverages Google Cloud's powerful data capabilities to build a unified marketing performance data model directly in BigQuery. This hub will ingest GA4 conversion data (from your existing BigQuery export), pull in cost and impression data from Google Ads and Facebook Ads, and then enable you to apply custom attribution logic and calculate blended ROAS.

This approach ensures:

  1. Unified Data Source: A single, comprehensive dataset for all your marketing performance analysis.
  2. Custom Attribution: Apply any attribution model you define in SQL, tailored to your business needs.
  3. Accurate ROAS: Calculate true Return On Ad Spend by blending costs and revenues consistently.
  4. Data Ownership: You own and control your core marketing performance data.
  5. Flexibility: Easily integrate new ad platforms or refine attribution models as your business evolves.
  6. Actionable Insights: Drive smarter budget allocation and campaign optimization decisions.

Architecture: Blending GA4 Conversions with Ad Costs in BigQuery

The core idea is to bring all relevant marketing and analytics data into BigQuery and use powerful SQL to stitch them together and apply custom logic.

graph TD
    subgraph "Data Sources"
        A[Client-Side Events] -- HTTP request --> B(GTM Server Container on Cloud Run);
        B -- GA4 MP dispatch with gclid / fbclid --> C[Google Analytics 4];
        C -- Daily / streaming export --> D["BigQuery GA4 Export (events_*)"];

        E[Google Ads Account] -- Daily sync --> F(BigQuery Data Transfer Service);
        F --> G[BigQuery: Google Ads Cost Data];

        H[Facebook Ads Account] --> I(Facebook Marketing API);
        S(Cloud Scheduler) -- Daily trigger --> J(Cloud Run: Facebook Ads ETL);
        J -- API call --> I;
        J --> K[BigQuery: Facebook Ads Cost Data];
    end

    subgraph "BigQuery Marketing Performance Hub"
        D -- SQL join on gclid / fbclid / transaction_id --> L(BigQuery: Unified Conversions and Costs);
        G -- SQL join --> L;
        K -- SQL join --> L;
    end

    subgraph "Analysis and Activation"
        L -- Custom attribution logic in SQL --> M[BigQuery: Attributed Conversions];
        M -- ROAS calculation in SQL --> N[BigQuery: ROAS Metrics];
        N --> O[Looker Studio Dashboards];
        N --> P[Custom BI Tools / Data Science];
    end

Key Flow:

  1. GA4 Data Ingestion: Your existing server-side GA4 pipeline sends conversion events to GA4. Crucially, your GTM Server Container ensures gclid (for Google Ads clicks) and fbclid (for Facebook Ads clicks) are captured and sent as event_parameters to GA4. GA4 then exports this data to BigQuery.
  2. Google Ads Cost Data: BigQuery Data Transfer Service (BQ DTS) automatically imports daily cost, impression, and click data from your Google Ads accounts into BigQuery.
  3. Facebook Ads Cost Data: A scheduled Cloud Run service periodically calls the Facebook Marketing API to fetch campaign-level cost and impression data, then loads it into BigQuery.
  4. Unified Data Model: SQL queries in BigQuery join these datasets on common identifiers (gclid, fbclid, transaction_id, event_timestamp) to create a comprehensive table of conversions and their associated costs.
  5. Custom Attribution: Apply SQL-based attribution models (e.g., first-click, last-click, linear, custom weighted) to distribute credit for conversions across various marketing touchpoints.
  6. ROAS Calculation: Calculate ROAS metrics based on your attributed revenue and actual ad costs.
  7. Reporting: Visualize the results in Looker Studio or other BI tools for actionable insights.

Core Components Deep Dive & Implementation Steps

1. Server-Side GA4: Capturing gclid and fbclid

For accurate attribution, your server-side GA4 pipeline must capture gclid (Google Click ID) and fbclid (Facebook Click ID) from the URL query parameters. These are the crucial join keys.

If your GTM Server Container uses the built-in GA4 Client, it often captures these automatically. However, it's good practice to ensure they are explicitly mapped as event_parameters. For instance, the page_location parameter typically contains these. If not, a custom variable can extract them from the URL:

GTM SC Custom Variable to extract gclid or fbclid:

  • Variable Type: URL
  • Component Type: Query
  • Query Key: gclid (create {{URL Query - gclid}})
  • Query Key: fbclid (create {{URL Query - fbclid}})

Then, in your GA4 Event Tags (especially for purchase, generate_lead):

  • Add an Event Parameter: gclid, value {{URL Query - gclid}}
  • Add an Event Parameter: fbclid, value {{URL Query - fbclid}}

GA4 will then export these as event_params in BigQuery, and you can access them via UNNEST(event_params).
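Before building anything downstream, it's worth auditing how consistently these IDs are being captured. A quick check against the export (dataset and property IDs are placeholders):

-- Share of recent purchases carrying a click ID. Low coverage here will
-- silently break the ad-click joins later in the pipeline.
SELECT
  COUNTIF((SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'gclid') IS NOT NULL) AS purchases_with_gclid,
  COUNTIF((SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'fbclid') IS NOT NULL) AS purchases_with_fbclid,
  COUNT(*) AS total_purchases
FROM `your_gcp_project.analytics_YOUR_GA4_PROPERTY_ID.events_*`
WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))
                        AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
  AND event_name = 'purchase';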

2. Ingesting Google Ads Cost Data: BigQuery Data Transfer Service (BQ DTS)

BQ DTS provides a native, automated way to import your Google Ads data into BigQuery.

Setup Steps:

  1. Enable BQ Data Transfer Service API: In your GCP Project, search for "BigQuery Data Transfer API" and enable it.
  2. Create a New Transfer: Navigate to BigQuery -> Data transfers in the GCP Console.
  3. Click CREATE TRANSFER.
  4. Source: Select Google Ads.
  5. Transfer config name: Google Ads Daily Sync
  6. Schedule: Daily (or your preferred frequency).
  7. Destination dataset: Choose an existing dataset or create a new one (e.g., marketing_raw).
  8. Data source specific parameters:
    • Customer ID: Enter your Google Ads Customer ID(s).
    • Preferences: Configure which tables to transfer where your connector version allows it (e.g., campaign, ad group, keyword, and click performance tables). The click performance table is the crucial one, as it records a gclid per click.
  9. Service Account: Ensure the service account associated with the transfer has Viewer access to your Google Ads account.
  10. Save.

BQ DTS will now automatically create and populate date-partitioned tables in your marketing_raw dataset (for example, campaign, campaign-stats, and click-performance tables; exact names vary by connector version), containing campaign metadata, costs, and gclid values.
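To confirm the click-level data is usable as a join key, you can resolve captured gclid values back to their campaigns. A hedged sketch; the table and column names below (ads_ClickStats_XXXXXX, click_view_gclid, segments_date) follow the current Google Ads connector's naming and must be verified against your own dataset:

-- Hedged sketch: map each captured gclid to its Google Ads campaign.
-- Verify table/column names against your transfer (legacy connectors used
-- p_ClickPerformance_* tables with different column names).
SELECT
  clicks.click_view_gclid AS gclid,
  clicks.campaign_id,
  clicks.segments_date AS click_date
FROM `your_gcp_project.marketing_raw.ads_ClickStats_XXXXXX` AS clicks
WHERE clicks.click_view_gclid IS NOT NULL
LIMIT 100;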

3. Ingesting Facebook Ads Cost Data: Cloud Run ETL

Facebook Ads data doesn't have a direct BQ DTS integration. We'll build a lightweight Python Cloud Run service to fetch this data via the Facebook Marketing API and stream it to BigQuery.

a. BigQuery Table for Facebook Ads Data:

Create a table to store your Facebook Ads campaign performance data.

CREATE TABLE `your_gcp_project.marketing_raw.facebook_ads_performance` (
    report_date DATE NOT NULL,
    account_id STRING,
    account_name STRING,
    campaign_id STRING,
    campaign_name STRING,
    adset_id STRING,
    adset_name STRING,
    ad_id STRING,
    ad_name STRING,
    spend NUMERIC,
    impressions INTEGER,
    clicks INTEGER,
    unique_clicks INTEGER,
    reach INTEGER,
    -- fbclid (Facebook Click ID) is usually retrieved from URL/landing page,
    -- not directly available in aggregate performance reports from API.
    -- For linking, you'd rely on matching by campaign/date for cost, and fbclid in GA4 for conversions.
    gcp_insert_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP()
)
PARTITION BY report_date
CLUSTER BY campaign_id, ad_id
OPTIONS (
    description = 'Daily performance data from Facebook Ads Marketing API.'
);

b. Python Facebook Ads ETL Service (Cloud Run):

This Flask app will use the facebook_business SDK to fetch data and insert it into BigQuery.

facebook-ads-etl/main.py:

import os
import datetime
import logging

from flask import Flask, request, jsonify
from google.cloud import bigquery, secretmanager
from facebook_business.api import FacebookAdsApi
from facebook_business.adobjects.adaccount import AdAccount
from facebook_business.adobjects.adsinsights import AdsInsights

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# --- BigQuery Configuration ---
BIGQUERY_PROJECT_ID = os.environ.get('GCP_PROJECT_ID')
BIGQUERY_DATASET_ID = os.environ.get('BIGQUERY_DATASET_ID', 'marketing_raw')
BIGQUERY_TABLE_ID = os.environ.get('BIGQUERY_TABLE_ID', 'facebook_ads_performance')
TABLE_FULL_ID = f"{BIGQUERY_PROJECT_ID}.{BIGQUERY_DATASET_ID}.{BIGQUERY_TABLE_ID}"

bigquery_client = bigquery.Client(project=BIGQUERY_PROJECT_ID)
secret_manager_client = secretmanager.SecretManagerServiceClient()

# --- Facebook Ads API Configuration (use Secret Manager for tokens!) ---
# These environment variables hold the Secret Manager resource names
FB_APP_ID_SECRET_NAME = os.environ.get('FB_APP_ID_SECRET_NAME', 'FB_APP_ID')
FB_APP_SECRET_SECRET_NAME = os.environ.get('FB_APP_SECRET_SECRET_NAME', 'FB_APP_SECRET')
FB_ACCESS_TOKEN_SECRET_NAME = os.environ.get('FB_ACCESS_TOKEN_SECRET_NAME', 'FB_ACCESS_TOKEN')
FB_AD_ACCOUNT_ID = os.environ.get('FB_AD_ACCOUNT_ID', 'act_XXXXXXXXXXXXXXX') # Your Ad Account ID

def get_secret_value(secret_name):
    try:
        secret_path = secret_manager_client.secret_version_path(
            BIGQUERY_PROJECT_ID, secret_name, 'latest'
        )
        response = secret_manager_client.access_secret_version(request={"name": secret_path})
        return response.payload.data.decode('UTF-8')
    except Exception as e:
        logger.error(f"Error accessing secret {secret_name}: {e}")
        return None

def init_facebook_api():
    app_id = get_secret_value(FB_APP_ID_SECRET_NAME)
    app_secret = get_secret_value(FB_APP_SECRET_SECRET_NAME)
    access_token = get_secret_value(FB_ACCESS_TOKEN_SECRET_NAME)

    if not all([app_id, app_secret, access_token, FB_AD_ACCOUNT_ID]):
        logger.error("Missing Facebook Ads API credentials. Cannot initialize API.")
        return None
    
    FacebookAdsApi.init(app_id=app_id, app_secret=app_secret, access_token=access_token)
    logger.info("Facebook Ads API initialized.")
    return True

@app.route('/fetch-facebook-ads-data', methods=['POST'])
def fetch_facebook_ads_data():
    if not init_facebook_api():
        return jsonify({'status': 'failed', 'message': 'Failed to initialize Facebook Ads API'}), 500
    
    try:
        # Determine report date. For scheduled jobs, usually fetch yesterday's data.
        # Can be overridden by request payload if needed for backfill.
        report_date_str = request.get_json().get('report_date') if request.is_json else None
        
        if report_date_str:
            report_date = datetime.datetime.fromisoformat(report_date_str).date()
        else:
            report_date = datetime.date.today() - datetime.timedelta(days=1) # Default: yesterday
        
        logger.info(f"Fetching Facebook Ads data for date: {report_date.isoformat()}")

        # Fetch insights at the AD level
        ad_account = AdAccount(FB_AD_ACCOUNT_ID)
        params = {
            'time_range': {
                'since': report_date.isoformat(),
                'until': report_date.isoformat()
            },
            'level': 'ad', # Fetch at Ad level for granularity
            'limit': 1000 # Max limit per API call
        }
        fields = [
            AdsInsights.Field.account_id,
            AdsInsights.Field.account_name,
            AdsInsights.Field.campaign_id,
            AdsInsights.Field.campaign_name,
            AdsInsights.Field.adset_id,
            AdsInsights.Field.adset_name,
            AdsInsights.Field.ad_id,
            AdsInsights.Field.ad_name,
            AdsInsights.Field.spend,
            AdsInsights.Field.impressions,
            AdsInsights.Field.clicks,
            AdsInsights.Field.unique_clicks,
            AdsInsights.Field.reach,
            AdsInsights.Field.date_start # This will be the report_date
        ]

        insights = ad_account.get_insights(fields=fields, params=params)
        
        rows_to_insert = []
        for insight in insights:
            row = {
                'report_date': insight['date_start'],
                'account_id': insight.get('account_id'),
                'account_name': insight.get('account_name'),
                'campaign_id': insight.get('campaign_id'),
                'campaign_name': insight.get('campaign_name'),
                'adset_id': insight.get('adset_id'),
                'adset_name': insight.get('adset_name'),
                'ad_id': insight.get('ad_id'),
                'ad_name': insight.get('ad_name'),
                'spend': float(insight.get('spend', 0.0)),
                'impressions': int(insight.get('impressions', 0)),
                'clicks': int(insight.get('clicks', 0)),
                'unique_clicks': int(insight.get('unique_clicks', 0)),
                'reach': int(insight.get('reach', 0)),
            }
            rows_to_insert.append(row)
        
        if rows_to_insert:
            errors = bigquery_client.insert_rows_json(TABLE_FULL_ID, rows_to_insert)
            if errors:
                logger.error(f"BigQuery insert errors for Facebook Ads data: {errors}")
                return jsonify({'status': 'failed', 'message': 'Partial success, some rows failed to insert', 'errors': errors}), 500
            else:
                logger.info(f"Successfully ingested {len(rows_to_insert)} rows of Facebook Ads data for {report_date.isoformat()}.")
                return jsonify({'status': 'success', 'rows_ingested': len(rows_to_insert)}), 200
        else:
            logger.info(f"No Facebook Ads data found for {report_date.isoformat()}.")
            return jsonify({'status': 'success', 'rows_ingested': 0}), 200

    except Exception as e:
        logger.error(f"Error fetching/ingesting Facebook Ads data: {e}", exc_info=True)
        return jsonify({'status': 'failed', 'message': str(e)}), 500

if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))

facebook-ads-etl/requirements.txt:

Flask
gunicorn # production WSGI server, recommended when deploying from source via buildpacks
google-cloud-bigquery
google-cloud-secret-manager
facebook-business
requests # facebook-business depends on requests

c. Deploy the Facebook Ads ETL Service to Cloud Run:

# First, create secrets in Secret Manager for Facebook API credentials
echo "YOUR_FB_APP_ID" | gcloud secrets create FB_APP_ID --data-file=- --project YOUR_GCP_PROJECT_ID
echo "YOUR_FB_APP_SECRET" | gcloud secrets create FB_APP_SECRET --data-file=- --project YOUR_GCP_PROJECT_ID
echo "YOUR_FB_ACCESS_TOKEN" | gcloud secrets create FB_ACCESS_TOKEN --data-file=- --project YOUR_GCP_PROJECT_ID

# Grant the Cloud Run runtime service account access to these secrets.
# By default Cloud Run uses the Compute Engine default service account
# (PROJECT_NUMBER-compute@developer.gserviceaccount.com); a dedicated account is better practice.
gcloud secrets add-iam-policy-binding FB_APP_ID --role="roles/secretmanager.secretAccessor" --member="serviceAccount:YOUR_SERVICE_ACCOUNT@YOUR_GCP_PROJECT_ID.iam.gserviceaccount.com" --project YOUR_GCP_PROJECT_ID
gcloud secrets add-iam-policy-binding FB_APP_SECRET --role="roles/secretmanager.secretAccessor" --member="serviceAccount:YOUR_SERVICE_ACCOUNT@YOUR_GCP_PROJECT_ID.iam.gserviceaccount.com" --project YOUR_GCP_PROJECT_ID
gcloud secrets add-iam-policy-binding FB_ACCESS_TOKEN --role="roles/secretmanager.secretAccessor" --member="serviceAccount:YOUR_SERVICE_ACCOUNT@YOUR_GCP_PROJECT_ID.iam.gserviceaccount.com" --project YOUR_GCP_PROJECT_ID

gcloud run deploy facebook-ads-etl-service \
    --source ./facebook-ads-etl \
    --platform managed \
    --region YOUR_GCP_REGION \
    --no-allow-unauthenticated \
    --set-env-vars "GCP_PROJECT_ID=YOUR_GCP_PROJECT_ID,BIGQUERY_DATASET_ID=marketing_raw,BIGQUERY_TABLE_ID=facebook_ads_performance,FB_APP_ID_SECRET_NAME=FB_APP_ID,FB_APP_SECRET_SECRET_NAME=FB_APP_SECRET,FB_ACCESS_TOKEN_SECRET_NAME=FB_ACCESS_TOKEN,FB_AD_ACCOUNT_ID=act_YOUR_FACEBOOK_AD_ACCOUNT_ID" \
    --memory 512Mi \
    --cpu 1 \
    --timeout 300s # Allow ample time for API calls and BQ inserts

# Grant the Cloud Run service account 'roles/bigquery.dataEditor' on your marketing_raw dataset
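One way to satisfy that last step is a project-scoped grant; it is broader than strictly necessary (dataset-level access controls are tighter), but it keeps the sketch simple:

# Project-scoped grant; for least privilege, grant BigQuery Data Editor on the
# marketing_raw dataset itself instead.
gcloud projects add-iam-policy-binding YOUR_GCP_PROJECT_ID \
    --member="serviceAccount:YOUR_SERVICE_ACCOUNT@YOUR_GCP_PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/bigquery.dataEditor"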

d. Schedule with Cloud Scheduler:

gcloud scheduler jobs create http facebook-ads-daily-etl \
    --location YOUR_GCP_REGION \
    --schedule "0 2 * * *" \
    --uri "https://facebook-ads-etl-service-YOUR_SERVICE_HASH-YOUR_GCP_REGION.a.run.app/fetch-facebook-ads-data" \
    --http-method POST \
    --headers "Content-Type=application/json" \
    --oidc-service-account-email YOUR_CLOUD_RUN_SERVICE_ACCOUNT_EMAIL \
    --oidc-token-audience "https://facebook-ads-etl-service-YOUR_SERVICE_HASH-YOUR_GCP_REGION.a.run.app" \
    --message-body "{}" \
    --project YOUR_GCP_PROJECT_ID

Important: Replace YOUR_CLOUD_RUN_SERVICE_ACCOUNT_EMAIL with the actual runtime service account email of your facebook-ads-etl-service (e.g., YOUR_SERVICE_ACCOUNT@YOUR_GCP_PROJECT_ID.iam.gserviceaccount.com). This ensures secure, authenticated invocation. The schedule 0 2 * * * means daily at 2:00 AM UTC.
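Because the handler accepts an optional report_date, backfilling a specific day is a single authenticated request. A sketch, assuming your account holds roles/run.invoker on the service and you substitute the real service URL:

# Backfill one day of Facebook Ads data (service URL is a placeholder).
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  -H "Content-Type: application/json" \
  -d '{"report_date": "2025-03-01"}' \
  "https://facebook-ads-etl-service-YOUR_SERVICE_HASH-YOUR_GCP_REGION.a.run.app/fetch-facebook-ads-data"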

4. Unifying Data in BigQuery: unified_marketing_performance

Now, let's create a view that combines GA4 conversions with Google Ads and Facebook Ads data.

-- Replace with your actual project IDs and GA4 property ID
CREATE OR REPLACE VIEW `your_gcp_project.marketing_analytics.unified_marketing_performance` AS
WITH GA4_Conversions AS (
    SELECT
        DATE(TIMESTAMP_MICROS(event_timestamp)) AS event_date,
        user_pseudo_id,
        (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'gclid') AS gclid,
        (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'fbclid') AS fbclid,
        (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'transaction_id') AS transaction_id,
        (SELECT value.double_value FROM UNNEST(event_params) WHERE key = 'value') AS revenue,
        event_name,
        (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'campaign') AS ga4_campaign,
        (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'source') AS ga4_source,
        (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'medium') AS ga4_medium,
        -- Add other relevant GA4 event parameters or user properties
        FROM `your_gcp_project.analytics_YOUR_GA4_PROPERTY_ID.events_*`
        WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)) AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
        AND event_name IN ('purchase', 'generate_lead') -- Focus on conversion events
        QUALIFY transaction_id IS NULL
            OR ROW_NUMBER() OVER(PARTITION BY transaction_id ORDER BY event_timestamp DESC) = 1 -- Deduplicate purchases; keep leads that lack a transaction_id
),
GoogleAds_Costs AS (
    SELECT
        ad.date AS report_date, -- assumed DATE-typed; adjust if your transfer stores it differently
        ad.campaign_id AS google_campaign_id,
        ad.campaign_name AS google_campaign_name,
        SUM(ad.cost) AS google_ad_cost
        FROM `your_gcp_project.marketing_raw.p_Ad_XXXXXX` AS ad -- Replace with your actual BQ DTS table; names vary by connector version
        WHERE ad.date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY) AND CURRENT_DATE()
        GROUP BY 1,2,3
),
FacebookAds_Costs AS (
    SELECT
        report_date,
        campaign_id AS facebook_campaign_id,
        campaign_name AS facebook_campaign_name,
        'facebook' AS facebook_source,
        'cpc' AS facebook_medium, -- Or other relevant medium
        SUM(spend) AS facebook_ad_cost,
        FROM `your_gcp_project.marketing_raw.facebook_ads_performance`
        WHERE report_date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY) AND CURRENT_DATE()
        GROUP BY 1,2,3,4,5
)
SELECT
    gac.event_date,
    gac.user_pseudo_id,
    gac.gclid,
    gac.fbclid,
    gac.transaction_id,
    gac.revenue,
    gac.event_name,
    gac.ga4_campaign,
    gac.ga4_source,
    gac.ga4_medium,
    gca.google_campaign_name,
    fba.facebook_campaign_name,
    gca.google_ad_cost,
    fba.facebook_ad_cost
FROM
    GA4_Conversions AS gac
LEFT JOIN
    GoogleAds_Costs AS gca
    ON gac.event_date = gca.report_date
    AND gac.ga4_campaign = gca.google_campaign_name
    -- Campaign/date-level match. For click-level precision, join gac.gclid
    -- against the BQ DTS click-performance table, which carries a gclid per click.
LEFT JOIN
    FacebookAds_Costs AS fba
    ON gac.event_date = fba.report_date
    AND gac.ga4_campaign = fba.facebook_campaign_name
    -- This only works if your utm_campaign values mirror Facebook campaign names.
    -- The Insights API does not expose fbclid, so finer stitching requires a
    -- separate user-journey table keyed on the fbclid captured at landing.
WHERE
    gac.event_date IS NOT NULL
;

Important Considerations for Joining:

  • gclid and fbclid: These are the primary keys for connecting conversions to specific ad clicks. Ensure your GA4 events always capture these.
  • _TABLE_SUFFIX: When querying GA4 export tables (events_*), always filter by _TABLE_SUFFIX to limit the data scanned and reduce costs.
  • Time Zones: Ensure all timestamps are standardized (e.g., UTC) before joining to avoid discrepancies, especially around midnight. The GA4 export event_timestamp is in UTC microseconds; your Google Ads and Facebook Ads data should ideally also be normalized to UTC (see the snippet after this list).
  • Deduplication: The QUALIFY ROW_NUMBER() OVER(PARTITION BY transaction_id ORDER BY event_timestamp DESC) = 1 is crucial for purchase events in GA4 to handle potential duplicates (from retries or parallel client/server-side firing).
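To make the time-zone point concrete, here is one way to derive both a UTC and a reporting-local date from the export; the zone name is a placeholder for whatever single zone you standardize on:

-- event_timestamp is UTC microseconds; DATE() with an explicit zone argument
-- yields a local date. Pick one zone and use it for every date-based join.
SELECT
  DATE(TIMESTAMP_MICROS(event_timestamp), 'UTC') AS event_date_utc,
  DATE(TIMESTAMP_MICROS(event_timestamp), 'Europe/Berlin') AS event_date_local
FROM `your_gcp_project.analytics_YOUR_GA4_PROPERTY_ID.events_*`
WHERE _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
LIMIT 10;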

5. Implementing Custom Attribution Models

With unified_marketing_performance, you can now apply various attribution models. This typically involves window functions to analyze user touchpoints leading up to a conversion.

First, let's create a hypothetical user_journey_events table (or view) that captures all relevant touchpoints (page views, clicks, conversions) for a user, enriched with marketing channel information. This might involve complex joins with your raw event data lake for richer referrer information, or using GA4's traffic_source fields.

-- This is a simplified example. A real user journey model is more complex.
CREATE OR REPLACE VIEW `your_gcp_project.marketing_analytics.user_journey_events` AS
SELECT
    user_pseudo_id,
    TIMESTAMP_MICROS(event_timestamp) AS event_time,
    event_name,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'gclid') AS gclid,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'fbclid') AS fbclid,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'source') AS source,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'medium') AS medium,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'campaign') AS campaign,
    (SELECT value.double_value FROM UNNEST(event_params) WHERE key = 'value') AS revenue,
    (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'transaction_id') AS transaction_id,
    ROW_NUMBER() OVER(PARTITION BY user_pseudo_id ORDER BY event_timestamp) AS touchpoint_sequence_num
FROM
    `your_gcp_project.analytics_YOUR_GA4_PROPERTY_ID.events_*`
WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)) AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
  -- Consider only relevant events that constitute a touchpoint or conversion
  AND event_name IN ('page_view', 'session_start', 'purchase', 'generate_lead')
;
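Before layering models on top, it helps to eyeball one journey end to end. A quick inspection query (substitute any real user_pseudo_id from your data):

-- Inspect a single user's ordered touchpoints to sanity-check the sequencing.
SELECT
  touchpoint_sequence_num,
  event_time,
  event_name,
  source,
  medium,
  campaign,
  transaction_id
FROM `your_gcp_project.marketing_analytics.user_journey_events`
WHERE user_pseudo_id = 'SOME_USER_PSEUDO_ID' -- placeholder
ORDER BY touchpoint_sequence_num;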

-- Now, using `user_journey_events` to model attribution for 'purchase' events.

-- Example 1: Last-Click Attribution
CREATE OR REPLACE VIEW `your_gcp_project.marketing_analytics.last_click_attributed_conversions` AS
WITH Journey AS (
    SELECT
        *,
        -- Carry the most recent non-null channel forward to every event, so each
        -- purchase row sees the last marketing touch at or before it. Filtering
        -- to purchases before windowing would discard all prior touchpoints.
        LAST_VALUE(source IGNORE NULLS) OVER w AS last_click_source,
        LAST_VALUE(medium IGNORE NULLS) OVER w AS last_click_medium,
        LAST_VALUE(campaign IGNORE NULLS) OVER w AS last_click_campaign
    FROM `your_gcp_project.marketing_analytics.user_journey_events`
    WINDOW w AS (PARTITION BY user_pseudo_id ORDER BY event_time ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
)
SELECT
    DATE(event_time) AS event_date, -- needed downstream for date-based ROAS joins
    transaction_id,
    revenue,
    last_click_source,
    last_click_medium,
    last_click_campaign
FROM Journey
WHERE event_name = 'purchase' AND transaction_id IS NOT NULL
QUALIFY ROW_NUMBER() OVER(PARTITION BY transaction_id ORDER BY event_time DESC) = 1 -- Ensure distinct purchases
;

-- Example 2: First-Click Attribution
CREATE OR REPLACE VIEW `your_gcp_project.marketing_analytics.first_click_attributed_conversions` AS
WITH Journey AS (
    SELECT
        *,
        -- First non-null channel in the user's history up to each event.
        FIRST_VALUE(source IGNORE NULLS) OVER w AS first_click_source,
        FIRST_VALUE(medium IGNORE NULLS) OVER w AS first_click_medium,
        FIRST_VALUE(campaign IGNORE NULLS) OVER w AS first_click_campaign
    FROM `your_gcp_project.marketing_analytics.user_journey_events`
    WINDOW w AS (PARTITION BY user_pseudo_id ORDER BY event_time ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
)
SELECT
    DATE(event_time) AS event_date,
    transaction_id,
    revenue,
    first_click_source,
    first_click_medium,
    first_click_campaign
FROM Journey
WHERE event_name = 'purchase' AND transaction_id IS NOT NULL
QUALIFY ROW_NUMBER() OVER(PARTITION BY transaction_id ORDER BY event_time DESC) = 1
;

-- Example 3: Linear Attribution (simplified, distribute revenue evenly across all touchpoints for a transaction)
-- This is often done at a session level or user level, so the example focuses on distributing revenue
-- for a single purchase event across all touchpoints that happened before it within a reasonable window.
CREATE OR REPLACE VIEW `your_gcp_project.marketing_analytics.linear_attributed_conversions` AS
WITH Purchases AS (
    SELECT user_pseudo_id, event_time, transaction_id, revenue
    FROM `your_gcp_project.marketing_analytics.user_journey_events`
    WHERE event_name = 'purchase' AND transaction_id IS NOT NULL
    QUALIFY ROW_NUMBER() OVER(PARTITION BY transaction_id ORDER BY event_time DESC) = 1 -- One row per purchase
),
TouchpointsBeforePurchase AS (
    SELECT
        uj.user_pseudo_id,
        uj.event_time AS touchpoint_time,
        uj.source,
        uj.medium,
        uj.campaign,
        p.transaction_id,
        p.revenue,
        COUNT(*) OVER(PARTITION BY p.transaction_id) AS total_touchpoints_for_conversion
    FROM
        `your_gcp_project.marketing_analytics.user_journey_events` AS uj
    INNER JOIN
        Purchases AS p
        ON uj.user_pseudo_id = p.user_pseudo_id
        AND uj.event_time <= p.event_time -- Touchpoint happened before or at conversion
        AND uj.event_time >= TIMESTAMP_SUB(p.event_time, INTERVAL 30 DAY) -- Within a 30-day lookback window
)
SELECT
    transaction_id,
    source,
    medium,
    campaign,
    SUM(revenue / total_touchpoints_for_conversion) AS attributed_revenue
FROM
    TouchpointsBeforePurchase
GROUP BY
    transaction_id, source, medium, campaign
;

Note: Real-world attribution modeling can be significantly more complex, involving:

  • Markov Chains or Shapley values for data-driven attribution.
  • Sessionization logic to define specific touchpoint windows.
  • Handling of direct traffic or unidentifiable sources.
  • Dealing with (not set) values.

These SQL examples provide a starting point; as one more illustration, a position-based variant is sketched below.
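This sketch implements a position-based (40/20/40) split: 40% of revenue to the first touch, 40% to the last, and the remaining 20% shared across the middle. The view name is hypothetical and the weights are an assumption to adapt:

-- Position-based (40/20/40) attribution sketch, built the same way as the
-- linear example: journey touchpoints joined to deduplicated purchases.
CREATE OR REPLACE VIEW `your_gcp_project.marketing_analytics.position_based_attributed_conversions` AS
WITH Purchases AS (
    SELECT user_pseudo_id, event_time, transaction_id, revenue
    FROM `your_gcp_project.marketing_analytics.user_journey_events`
    WHERE event_name = 'purchase' AND transaction_id IS NOT NULL
    QUALIFY ROW_NUMBER() OVER(PARTITION BY transaction_id ORDER BY event_time DESC) = 1
),
RankedTouchpoints AS (
    SELECT
        uj.source, uj.medium, uj.campaign,
        p.transaction_id, p.revenue,
        ROW_NUMBER() OVER(PARTITION BY p.transaction_id ORDER BY uj.event_time) AS pos,
        COUNT(*) OVER(PARTITION BY p.transaction_id) AS n
    FROM `your_gcp_project.marketing_analytics.user_journey_events` AS uj
    INNER JOIN Purchases AS p
        ON uj.user_pseudo_id = p.user_pseudo_id
        AND uj.event_time <= p.event_time
        AND uj.event_time >= TIMESTAMP_SUB(p.event_time, INTERVAL 30 DAY)
)
SELECT
    transaction_id, source, medium, campaign,
    SUM(CASE
        WHEN n = 1 THEN revenue               -- single touch takes all credit
        WHEN n = 2 THEN revenue * 0.5         -- no middle touches: split evenly
        WHEN pos = 1 OR pos = n THEN revenue * 0.4
        ELSE revenue * 0.2 / (n - 2)          -- middle touches share 20%
    END) AS attributed_revenue
FROM RankedTouchpoints
GROUP BY transaction_id, source, medium, campaign;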

6. Calculating Blended ROAS

Now, combine your attributed conversions with the ad cost data.

CREATE OR REPLACE VIEW `your_gcp_project.marketing_analytics.blended_roas_by_channel` AS
WITH AttributedRevenue AS (
    -- Choose your preferred attribution model here. Let's use Last-Click for simplicity
    -- In a real scenario, you might blend multiple models or use a data-driven one.
    SELECT
        event_date,
        last_click_source AS attributed_source,
        last_click_medium AS attributed_medium,
        last_click_campaign AS attributed_campaign,
        SUM(revenue) AS total_attributed_revenue
    FROM
        `your_gcp_project.marketing_analytics.last_click_attributed_conversions`
    GROUP BY 1,2,3,4
),
CombinedCosts AS (
    -- Pull costs straight from the raw cost tables. Summing costs out of the
    -- conversion-level unified view would repeat each campaign-day's cost once
    -- per conversion row and overstate spend.
    SELECT
        ad.date AS event_date,
        'google' AS source,
        'cpc' AS medium,
        ad.campaign_name AS campaign,
        SUM(ad.cost) AS total_cost
    FROM `your_gcp_project.marketing_raw.p_Ad_XXXXXX` AS ad -- Your actual BQ DTS table
    GROUP BY 1,2,3,4
    UNION ALL
    SELECT
        report_date AS event_date,
        'facebook' AS source,
        'cpc' AS medium,
        campaign_name AS campaign,
        SUM(spend) AS total_cost
    FROM `your_gcp_project.marketing_raw.facebook_ads_performance`
    GROUP BY 1,2,3,4
)
SELECT
    COALESCE(ar.event_date, cc.event_date) AS report_date,
    COALESCE(ar.attributed_source, cc.source) AS source,
    COALESCE(ar.attributed_medium, cc.medium) AS medium,
    COALESCE(ar.attributed_campaign, cc.campaign) AS campaign,
    SUM(COALESCE(ar.total_attributed_revenue, 0)) AS total_revenue,
    SUM(COALESCE(cc.total_cost, 0)) AS total_cost,
    CASE
        WHEN SUM(COALESCE(cc.total_cost, 0)) > 0 THEN SUM(COALESCE(ar.total_attributed_revenue, 0)) / SUM(COALESCE(cc.total_cost, 0))
        ELSE 0
    END AS roas
FROM
    AttributedRevenue AS ar
FULL OUTER JOIN -- Use FULL OUTER JOIN to ensure all costs and revenues are included
    CombinedCosts AS cc
    ON ar.event_date = cc.event_date
    AND ar.attributed_source = cc.source
    AND ar.attributed_medium = cc.medium
    AND ar.attributed_campaign = cc.campaign
GROUP BY 1,2,3,4
ORDER BY report_date DESC, total_revenue DESC
;
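With the view in place, everyday questions become one-liners, for example blended ROAS by source over the last 28 days:

-- Blended ROAS per source, last 28 days.
SELECT
  source,
  SUM(total_revenue) AS revenue,
  SUM(total_cost) AS cost,
  SAFE_DIVIDE(SUM(total_revenue), SUM(total_cost)) AS roas
FROM `your_gcp_project.marketing_analytics.blended_roas_by_channel`
WHERE report_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 28 DAY)
GROUP BY source
ORDER BY roas DESC;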

7. Visualization: Looker Studio (formerly Google Data Studio)

Once your blended_roas_by_channel view is ready, connect it to Looker Studio:

  1. In Looker Studio, create a new report.
  2. Add a data source, select BigQuery.
  3. Choose your GCP project, dataset (marketing_analytics), and then select the blended_roas_by_channel view.
  4. You can now build interactive dashboards showing:
    • ROAS by Source/Medium/Campaign over time.
    • Total Revenue vs. Total Cost by channel.
    • Attribution model comparisons (if you create views for multiple models).

Benefits of This Approach

  • Holistic Marketing View: Gain a single source of truth for all your online marketing performance, integrating data that traditionally lives in silos.
  • Accurate ROAS: Move beyond platform-biased reporting to calculate true Return On Ad Spend based on your custom attribution models.
  • Data-Driven Decisions: Allocate marketing budget more effectively by understanding the true impact of each channel across the customer journey.
  • Custom Attribution Flexibility: Experiment with different attribution models in SQL to find what best reflects your business's understanding of value.
  • Scalability & Resilience: Leverage BigQuery's power to handle large datasets, and Cloud Run for reliable, scheduled data ingestion.
  • Data Ownership: You maintain full control over your valuable marketing data and the logic applied to it.
  • Agility: Easily adjust attribution windows, add new ad platforms, or refine models without complex vendor integrations.

Important Considerations

  • Data Latency: While GA4 provides near real-time data for some operations, its BigQuery export can have a daily delay. Ad platform data also syncs daily. Your ROAS reports will reflect data up to yesterday.
  • Cost Management: BigQuery costs scale with data scanned and stored. Partition and cluster your tables effectively. Optimize your SQL queries to avoid full table scans. Monitor BQ DTS and Cloud Run costs.
  • PII & Data Governance: Ensure PII handling in GA4 events is consistent (hashed user_ids, no raw PII). Ad platform data can also contain sensitive information. Apply BigQuery RLS, CLS, and Data Masking for secure access.
  • Attribution Model Choice: The "best" attribution model is highly business-dependent. Start simple (last-click, linear) and iterate. Data-driven models (e.g., using BigQuery ML) are the most advanced but require significant data history; a minimal sketch follows this list.
  • Cross-Device Stitching: This solution primarily relies on GA4's user_pseudo_id and user_id for stitching. For advanced cross-device tracking, ensure your user_id implementation is robust and consistent.
  • gclid and fbclid Capture: Double-check your server-side GTM setup to confirm these identifiers are reliably captured and passed as event parameters to GA4 conversions. Without them, linking to ad clicks is impossible.
  • Offline Conversions: For a truly comprehensive ROAS, consider incorporating offline conversions into your unified_marketing_performance model.
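On the attribution-model point above: a data-driven approach usually starts from a conversion-propensity model over touchpoint features. A minimal BigQuery ML sketch; the training table and its converted label are hypothetical and would come from your own feature engineering:

-- Hypothetical training table: one row per touchpoint, labeled 1 if the
-- journey containing it eventually converted, else 0.
CREATE OR REPLACE MODEL `your_gcp_project.marketing_analytics.conversion_propensity`
OPTIONS (
    model_type = 'logistic_reg',
    input_label_cols = ['converted']
) AS
SELECT
    source,
    medium,
    campaign,
    touchpoint_sequence_num,
    converted
FROM `your_gcp_project.marketing_analytics.touchpoint_training_data`;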

Conclusion

Achieving a holistic, accurate view of your marketing performance and calculating true ROAS is fundamental for optimizing ad spend and driving business growth. By building a BigQuery-centric data hub that combines server-side GA4 conversion data with Google Ads and Facebook Ads cost data, you empower your organization with unparalleled analytical capabilities. This robust, custom solution transcends the limitations of platform-specific reporting, enabling precise attribution modeling and actionable insights that will transform your marketing strategy. Embrace this server-side data engineering approach to unlock the full potential of your marketing investments.