User-provided data matching (UPDM) joins first-party data that you've collected about a user—such as information from your websites, apps, or physical stores—with that same user's signed-in activity across Google ads data, including Google owned & operated (O&O) data. This includes data bought through Google Marketing Platform (GMP) products, for example, YouTube bought using Display & Video 360. GMP inventory that is not Google owned & operated isn't supported.
To be eligible for user-provided data matching, the ad event must be linked to a signed-in user in Google ad data.
This document describes the user-provided data matching feature, and provides guidance on setup and use.
Private Cloud Match overview
Gaining valuable advertising insights often requires stitching together data from multiple sources. Building your own solution to this data pipeline problem requires a significant investment of time and engineering effort. Private Cloud Match in Ads Data Hub streamlines this process by providing an Ads Data Hub query template for creating a match table in BigQuery, which you can then use in your Ads Data Hub queries to match your ads data with your first-party data. Enriching your queries with first-party data can deliver richer customer experiences, and it is more resistant to industry-wide ad-tracking changes.
Because user-provided data matching is only available for signed-in users on Google owned & operated inventory, it isn't affected by the deprecation of third-party cookies. Because it's more resistant to industry changes than third-party data, it can provide richer insights, which can lead to higher customer engagement.
Process summary
- Setup of data ingestion and matching
  - Make sure that your first-party data is located in BigQuery and that your service account has read access to it. See Set up data ingestion.
- First-party data ingestion and matching
  - You format and upload your first-party data to your BigQuery dataset.
  - You initiate a data-matching request by creating a Private Cloud Match analysis query and setting a schedule.
  - Google joins data between your project and Google-owned data containing Google's user ID and hashed user-provided data to build and update match tables. See Ingest and match first-party data.
- Ongoing queries in Ads Data Hub, based on matched data
  - You run queries against the match tables in the same way you run regular queries in Ads Data Hub. See Query matched data.
Learn about privacy requirements
Collecting customer data
When using user-provided data matching, you must upload first-party data. This could be information you collected from your websites, apps, physical stores, or any information that a customer shared with you directly.
You must:
- Ensure that your privacy policy discloses that you share customer data with third parties to perform services on your behalf, and that you obtain consent for such sharing where legally required
- Only use Google's approved API or interface to upload customer data
- Comply with all applicable laws and regulations, including any self-regulatory or industry codes that may apply
First-party consent acknowledgement
To ensure you are able to use your first-party data in Ads Data Hub, you must confirm that you have obtained proper consent to share data from EEA end users with Google per the EU user consent policy and Ads Data Hub policy. This requirement applies to each Ads Data Hub account, and must be updated every time you upload new first-party data. Any one user can make this acknowledgement on behalf of the entire account.
Note that the same Google service query rules that apply to analysis queries also apply to UPDM queries. For example, you can't run cross-service queries on users in the EEA when you create a match table.
To learn how to acknowledge consent in Ads Data Hub, see Consent requirements for the European Economic Area.
Data size
To protect end-user privacy, user-provided data matching enforces these requirements regarding the size of your data:
- You must upload at least 1,000 records in your user list.
- Your list must not exceed the maximum number of records. To learn about the maximum data limit, reach out to your Google representative.
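If you want to confirm the minimum-size requirement before uploading, a simple count in BigQuery is enough. This is a minimal sketch; the table name is a placeholder for wherever your first-party user list lives.

-- Checks that the user list meets the 1,000-record minimum.
-- The table name is a placeholder; substitute your own.
SELECT
  COUNT(*) AS record_count,
  COUNT(*) >= 1000 AS meets_minimum
FROM
  `your_project.your_dataset.your_first_party_table`;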
Set up data ingestion
Before you start, make sure that:
- Your first-party data is in BigQuery. If you have a VPC-SC perimeter, the first-party data must be located within it.
- Your Ads Data Hub service account has read access to the first-party data.
- Your first-party data is formatted and hashed correctly. See the next section for details.
Beyond that, Private Cloud Match has no additional onboarding. If you can run an analysis query, you can run a Private Cloud Match query.
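One way to grant that read access is with BigQuery's GRANT statement, run in the project that holds your first-party data. This is a sketch only; the dataset name and service account address below are placeholders, so use the Ads Data Hub service account shown in your account settings.

-- Grants the Ads Data Hub service account read access to the dataset that
-- contains your first-party data. Both names below are placeholders.
GRANT `roles/bigquery.dataViewer`
ON SCHEMA `your_project.your_dataset`
TO "serviceAccount:your-adh-service-account@example.iam.gserviceaccount.com";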
Ingest and match first-party data
Format data for input
Your data must adhere to these formatting requirements to be correctly matched:
- Where indicated in the following input field descriptions, you must hash values using SHA256 before uploading.
- Input fields must be formatted as strings. For example, if you're using BigQuery's SHA256 hash function with the Base64 encoding function (TO_BASE64), use the following transformation: TO_BASE64(SHA256(user_data)).
- UPDM supports Base64 encoding. You must align the encoding of your first-party data with the decoding used in your Ads Data Hub query. If you change your first-party data encoding, you must update your Ads Data Hub query to decode from the same base. The following examples use Base64 encoding.
User ID
- Plain text
- Hashing: None
Email
- Remove leading and trailing whitespaces
- Lowercase all characters
- Include a domain name for all email addresses, such as gmail.com or hotmail.co.jp
- Remove accents—for example, change è, é, ê, or ë to e
- Remove all periods (.) that precede the domain name in gmail.com and googlemail.com email addresses
- Hashing: Base64 encoded SHA256
Valid: TO_BASE64(SHA256("[email protected]"))
Invalid: TO_BASE64(SHA256(" Jéfferson.Lô[email protected] "))
Phone
- Strip whitespace
- Format in E.164 format - US example: +14155552671, UK example: +442071838750
- Remove all special characters except the "+" before the country code
- Hashing: Base64 encoded SHA256
Valid: TO_BASE64(SHA256("+18005550101"))
Invalid: TO_BASE64(SHA256("(800) 555-0101"))
First name
- Strip whitespace
- Lowercase all characters
- Remove all prefixes—for example, Mrs., Mr., Ms., Dr.
- Don't remove accents—for example, è, é, ê, or ë
- Hashing: Base64 encoded SHA256
Valid: TO_BASE64(SHA256("daní"))
Invalid: TO_BASE64(SHA256("Mrs. Daní"))
Last name
- Strip whitespace
- Lowercase all characters
- Remove all suffixes—for example, Jr., Sr., 2nd, 3rd, II, III, PHD, MD
- Don't remove accents—for example, è, é, ê, or ë
- Hashing: Base64 encoded SHA256
Valid: TO_BASE64(SHA256("délacruz"))
Invalid: TO_BASE64(SHA256("dé la Cruz, Jr."))
Country
- Include the country code even if all of your customer data is from the same country
- Don't hash country data
- Use ISO 3166-1 alpha-2 country codes
- Hashing: None
Valid: US
Invalid: United States of America or USA
Zip code
- Don't hash zip code data
- Both US and international zip and postal codes are allowed
- For US:
- 5-digit codes are allowed—for example, 94043
- 5 digits followed by a 4-digit extension are also allowed—for example, 94043-1351 or 940431351
- For all other countries:
- No formatting needed (No need to lowercase, or remove spaces and special characters)
- Leave out postal code extensions
- Hashing: None
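Much of the normalization above can be applied in BigQuery before hashing. The following is a minimal sketch rather than an official template: the helper functions (RemoveAccents, StripGmailDots, NormalizePhone) and the sample inputs are assumptions for illustration. It trims and lowercases values, strips accents, removes periods before the domain for gmail.com and googlemail.com addresses, and reduces phone numbers that already include a country code to digits plus a leading "+". Validate the output against the rules above before uploading.

-- Normalization sketch with assumed helper functions; not an official template.
CREATE TEMP FUNCTION RemoveAccents(s STRING) AS (
  -- Decomposes accented characters, then drops the combining marks.
  REGEXP_REPLACE(NORMALIZE(s, NFD), r'\pM', '')
);

CREATE TEMP FUNCTION StripGmailDots(email STRING) AS (
  -- Removes periods before the domain for gmail.com and googlemail.com addresses.
  IF(
    REGEXP_CONTAINS(email, r'@(gmail|googlemail)\.com$'),
    CONCAT(REPLACE(REGEXP_EXTRACT(email, r'^(.*)@'), '.', ''),
           REGEXP_EXTRACT(email, r'(@.*)$')),
    email)
);

CREATE TEMP FUNCTION NormalizePhone(phone STRING) AS (
  -- Keeps digits and prepends "+"; assumes the input already includes a country code.
  CONCAT('+', REGEXP_REPLACE(TRIM(phone), r'[^0-9]', ''))
);

SELECT
  TO_BASE64(SHA256(StripGmailDots(RemoveAccents(LOWER(TRIM('  Jöhn.Dôe@gmail.com  ')))))) AS hashed_email,
  TO_BASE64(SHA256(NormalizePhone('+1 (800) 555-0101'))) AS hashed_phone;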
Hash validation and data encoding
You can use the following hash validation scripts to ensure that your data is correctly formatted.
JavaScript
/**
* @fileoverview Provides the hashing algorithm, as well as some valid hashes of
* sample data for testing.
*/
async function hash(token) {
  // Removes leading or trailing spaces and converts all characters to lowercase.
  const formattedToken = token.trim().toLowerCase();
  // Hashes the formatted string using the SHA-256 hashing algorithm.
  const hashBuffer = await crypto.subtle.digest(
      'SHA-256', (new TextEncoder()).encode(formattedToken));
  // Converts the hash buffer to a base64-encoded string and returns it.
  const base64Str = btoa(String.fromCharCode(...new Uint8Array(hashBuffer)));
  return base64Str;
}

function main() {
  // Expected hash for [email protected]:
  // h5JGBrQTGorO7q6IaFMfu5cSqqB6XTp1aybOD11spnQ=
  hash('[email protected]').then(result => console.log(result));

  // Expected hash for +18005551212:
  // YdkRG+0+bZz8G8O1yzWkAmh8TxVGvuBhor1ET73WTEQ=
  hash('+18005551212').then(result => console.log(result));

  // Expected hash for John: ltljLzY1ZMwwMlIUCc8iqFLyAy7sCZ7VlnwNAAzsYHo=
  hash('John').then(result => console.log(result));

  // Expected hash for Doe: eZ75KhGvkY4/t0HfQpNPO1aO0tk6wd908bjUGieTKm8=
  hash('Doe').then(result => console.log(result));
}

main();
Python
"""Provides the hashing algorithm, as well as some valid hashes of sample data for testing.
Supports: Python 2, Python 3
Sample hashes:
- Email '[email protected]': h5JGBrQTGorO7q6IaFMfu5cSqqB6XTp1aybOD11spnQ=
- Phone '+18005551212': YdkRG+0+bZz8G8O1yzWkAmh8TxVGvuBhor1ET73WTEQ=
- First name 'John': ltljLzY1ZMwwMlIUCc8iqFLyAy7sCZ7VlnwNAAzsYHo=
- Last name 'Doe': eZ75KhGvkY4/t0HfQpNPO1aO0tk6wd908bjUGieTKm8=
"""
import base64
import hashlib
def hash(token):
  # Generates a base64-encoded SHA-256 hash of a normalized input string.
  return base64.b64encode(
      hashlib.sha256(
          token.strip().lower().encode('utf-8')).digest()).decode('utf-8')


def print_hash(token, expected=None):
  # Computes and displays the hash of a token, with optional validation.
  hashed = hash(token)
  if expected is not None and hashed != expected:
    print(
        'ERROR: Incorrect hash for token "{}". Expected "{}", got "{}"'.format(
            token, expected, hashed))
    return
  print('Hash: "{}"\t(Token: {})'.format(hashed, token))


def main():
  # Tests the hash function with sample tokens and expected results.
  print_hash(
      '[email protected]', expected='h5JGBrQTGorO7q6IaFMfu5cSqqB6XTp1aybOD11spnQ=')
  print_hash(
      '+18005551212', expected='YdkRG+0+bZz8G8O1yzWkAmh8TxVGvuBhor1ET73WTEQ=')
  print_hash('John', expected='ltljLzY1ZMwwMlIUCc8iqFLyAy7sCZ7VlnwNAAzsYHo=')
  print_hash('Doe', expected='eZ75KhGvkY4/t0HfQpNPO1aO0tk6wd908bjUGieTKm8=')


if __name__ == '__main__':
  main()
Go
/*
Provides the hashing algorithm, as well as some valid hashes of sample data for testing.
Sample hashes:
- Email '[email protected]': h5JGBrQTGorO7q6IaFMfu5cSqqB6XTp1aybOD11spnQ=
- Phone '+18005551212': YdkRG+0+bZz8G8O1yzWkAmh8TxVGvuBhor1ET73WTEQ=
- First name 'John': ltljLzY1ZMwwMlIUCc8iqFLyAy7sCZ7VlnwNAAzsYHo=
- Last name 'Doe': eZ75KhGvkY4/t0HfQpNPO1aO0tk6wd908bjUGieTKm8=
*/
package main
import (
    "crypto/sha256"
    "encoding/base64"
    "fmt"
    "strings"
)

// Hash hashes an email, phone, first name, or last name into the correct format.
func Hash(token string) string {
    formatted := strings.TrimSpace(strings.ToLower(token))
    hashed := sha256.Sum256([]byte(formatted))
    encoded := base64.StdEncoding.EncodeToString(hashed[:])
    return encoded
}

// PrintHash prints the hash for a token.
func PrintHash(token string) {
    fmt.Printf("Hash: \"%s\"\t(Token: %s)\n", Hash(token), token)
}

func main() {
    PrintHash("[email protected]")
    PrintHash("+18005551212")
    PrintHash("John")
    PrintHash("Doe")
}
Java
package updm.hashing;
import static java.nio.charset.StandardCharsets.UTF_8;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
/**
* Example of the hashing algorithm.
*
* <p>Sample hashes:
*
* <ul>
* <li>Email '[email protected]': h5JGBrQTGorO7q6IaFMfu5cSqqB6XTp1aybOD11spnQ=
* <li>Phone '+18005551212': YdkRG+0+bZz8G8O1yzWkAmh8TxVGvuBhor1ET73WTEQ=
* <li>First name 'John': ltljLzY1ZMwwMlIUCc8iqFLyAy7sCZ7VlnwNAAzsYHo=
* <li>Last name 'Doe': eZ75KhGvkY4/t0HfQpNPO1aO0tk6wd908bjUGieTKm8=
* </ul>
*/
public final class HashExample {

  private HashExample() {}

  public static String hash(String token) {
    // Normalizes and hashes the input token using SHA-256 and Base64 encoding.
    String formattedToken = token.toLowerCase().strip();
    byte[] hash;
    try {
      hash = MessageDigest.getInstance("SHA-256").digest(formattedToken.getBytes(UTF_8));
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-256 not supported", e);
    }
    return Base64.getEncoder().encodeToString(hash);
  }

  public static void printHash(String token) {
    // Calculates and prints the hash for the given token.
    System.out.printf("Hash: \"%s\"\t(Token: %s)\n", hash(token), token);
  }

  public static void main(String[] args) {
    // Executes hash calculations and prints results for sample tokens.
    printHash("[email protected]");
    printHash("+18005551212");
    printHash("John");
    printHash("Doe");
  }
}
SQL
/*
Provides the hashing algorithm, as well as some valid hashes of sample data for testing.
The following code uses Google Standard SQL and can be run on BigQuery to produce a correctly hashed and encoded table from unhashed input data.
Sample hashes:
- Email '[email protected]': h5JGBrQTGorO7q6IaFMfu5cSqqB6XTp1aybOD11spnQ=
- Phone '+18005551212': YdkRG+0+bZz8G8O1yzWkAmh8TxVGvuBhor1ET73WTEQ=
- First name 'John': ltljLzY1ZMwwMlIUCc8iqFLyAy7sCZ7VlnwNAAzsYHo=
- Last name 'Doe': eZ75KhGvkY4/t0HfQpNPO1aO0tk6wd908bjUGieTKm8=
The unhashed input table schema is assumed to be:
- Column name: UserID, Type: String
- Column name: Email, Type: String
- Column name: Phone, Type: String
- Column name: FirstName, Type: String
- Column name: LastName, Type: String
- Column name: PostalCode, Type: String
- Column name: CountryCode, Type: String
*/
-- Creates a new table with Base64-encoded SHA-256 hashes of specified columns.
CREATE TABLE `your_project_name.your_dataset_name.output_hashed_table_name`
AS
SELECT
UserID,
TO_BASE64(SHA256(LOWER(Email))) AS Email,
TO_BASE64(SHA256(Phone)) AS Phone,
TO_BASE64(SHA256(LOWER(FirstName))) AS FirstName,
TO_BASE64(SHA256(LOWER(LastName))) AS LastName,
PostalCode,
CountryCode,
FROM
`your_project_name.your_dataset_name.input_unhashed_table_name`;
Join keys
Some combinations of user-provided data are stronger than others. Following is a list of different user-provided data combinations, ranked by relative strength. If you use an address, you must include: First name, Last name, Country, and Zip code.
- Email, Phone, Address (strongest)
- Phone, Address
- Email, Address
- Email, Phone
- Address
- Phone
- Email (weakest)
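To see which of these combinations your first-party table can actually supply, it can help to count how many rows populate each identifier before you build the match table. A minimal sketch, using the same placeholder table and column names as the hashing example above:

-- Counts populated identifiers to show which join-key combinations are feasible.
-- Table and column names are placeholders from the earlier hashing example.
SELECT
  COUNT(*) AS total_rows,
  COUNTIF(Email IS NOT NULL) AS with_email,
  COUNTIF(Phone IS NOT NULL) AS with_phone,
  COUNTIF(FirstName IS NOT NULL
    AND LastName IS NOT NULL
    AND CountryCode IS NOT NULL
    AND PostalCode IS NOT NULL) AS with_address
FROM
  `your_project_name.your_dataset_name.input_unhashed_table_name`;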
Create a match table
Click Reports > Create report > Private cloud match table generation > Use template. Optional: You can select Private cloud match table generation with hashing if your data is not already hashed.
-- Create a new match table using your first party data with this template.
/* Parameters:
   Manually remove all the parameters tagged with the @ prefix and replace them
   with column names from your first party table:
     * @user_id
     * @email
     * @phone
     * @first_name
     * @last_name
     * @country_code
     * @postal_code
   And your BigQuery table information:
     * @my_project: Your BigQuery project where the first party table is.
     * @my_dataset: Your dataset where the first party table is.
     * @my_first_party_table: Your first party table.
*/
CREATE OR REPLACE TABLE adh.updm_match_table AS (
  SELECT
    CAST(@user_id AS BYTES) AS user_id,
    @email AS email,
    @phone AS phone,
    @first_name AS first_name,
    @last_name AS last_name,
    @country_code AS country,
    @postal_code AS zip_code
  FROM
    `@my_project.@my_dataset.@my_first_party_table`
);
Replace the parameter names with your column names to provide proper aliasing.
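For example, if your first-party table were `my_project.my_dataset.crm_contacts` with columns crm_id, email_sha256, phone_sha256, fn_sha256, ln_sha256, country, and postal (all hypothetical names), the filled-in template might look like the following sketch.

-- Hypothetical filled-in version of the template; substitute your own project,
-- dataset, table, and column names.
CREATE OR REPLACE TABLE adh.updm_match_table AS (
  SELECT
    CAST(crm_id AS BYTES) AS user_id,  -- use CAST(CAST(crm_id AS STRING) AS BYTES) if crm_id is numeric
    email_sha256 AS email,
    phone_sha256 AS phone,
    fn_sha256 AS first_name,
    ln_sha256 AS last_name,
    country,
    postal AS zip_code
  FROM
    `my_project.my_dataset.crm_contacts`
);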
Toggle the privacy noise setting to "Use diff checks".
Click Set schedule to set a frequency for how often you want your match table refreshed. Each run will overwrite the current match table.
Query matched data
Query the match tables
When your match tables contain enough data to satisfy privacy checks, you're ready to run queries against the tables.
The original table for first-party data (1PD) is represented by my_data. This includes both Personally Identifiable Information (PII) and non-PII data. Using the original table can improve your reports with more insights, as it represents all the 1PD data in scope, when compared to a match table.

Each table in the Ads Data Hub schema containing a user_id field is accompanied by a match table. For example, for the adh.google_ads_impressions table, Ads Data Hub also generates a match table called adh.google_ads_impressions_updm containing your user IDs.

Separate match tables are created for policy-isolated tables. For example, for the adh.google_ads_impressions_policy_isolated_youtube table, Ads Data Hub also generates a match table called adh.google_ads_impressions_policy_isolated_youtube_updm containing your user IDs.

These tables contain a subset of the users available in the original tables, where there is a match on the user_id. For example, if the original table contains data for User A and User B, but only User A is matched, then User B won't be in the match table.

The match tables contain an additional column called customer_data_user_id, which stores the user identifier as BYTES.

It's important to consider the field's type when writing your queries. SQL comparison operators expect that the literals you're comparing are of the same type. Depending on how the user_id is stored in your table of first-party data, you may need to encode the values in the table before matching the data. You need to cast your join key into BYTES for successful matches:

JOIN ON
  adh.google_ads_impressions_updm.customer_data_user_id = CAST(my_data.user_id AS BYTES)
Additionally, string comparisons in SQL are sensitive to capitalization, so you may need to encode strings on both sides of your comparison to ensure that they can be accurately compared.
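For example, if your first-party user_id is stored as an INT64, casting it directly to BYTES fails, so convert it to STRING first; and if the IDs you uploaded were lowercased strings, normalize case on your side of the join as well. These fragments are sketches under those assumptions:

-- If my_data.user_id is numeric, convert it to STRING before casting to BYTES.
JOIN ON
  adh.google_ads_impressions_updm.customer_data_user_id =
    CAST(CAST(my_data.user_id AS STRING) AS BYTES)

-- If the IDs you uploaded were lowercased strings, lowercase your side too.
JOIN ON
  adh.google_ads_impressions_updm.customer_data_user_id =
    CAST(LOWER(my_data.user_id) AS BYTES)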
Sample queries
Count matched users
This query counts the number of matched users in your Google Ads impressions table.
/* Count matched users in Google Ads impressions table */
SELECT COUNT(DISTINCT user_id)
FROM adh.google_ads_impressions_updm
Calculate match rate
Not all users are eligible for matching. For example, signed-out users, children, and unconsented users are not matched through UPDM. You can use the is_updm_eligible field to calculate more accurate UPDM match rates. Note that the is_updm_eligible field became available on October 1, 2024; you cannot use it to calculate match rates before that date.
/* Calculate the UPDM match rate */
CREATE TEMP TABLE total_events OPTIONS(privacy_checked_export=TRUE) AS
SELECT
customer_id,
COUNT(*) AS n
FROM adh.google_ads_impressions
WHERE is_updm_eligible
GROUP BY 1;
CREATE TEMP TABLE matched_events OPTIONS(privacy_checked_export=TRUE) AS
SELECT
customer_id,
COUNT(*) AS n
FROM adh.google_ads_impressions_updm
GROUP BY 1;
SELECT
customer_id,
SAFE_DIVIDE(matched_events.n, total_events.n) AS match_rate
FROM total_events
LEFT JOIN matched_events
USING (customer_id)
Join first-party and Google Ads data
This query shows how to join first-party data with Google Ads data:
/* Join first-party data with Google Ads data. The customer_data_user_id field
contains your ID as BYTES. You need to cast your join key into BYTES for
successful matches. */
SELECT
inventory_type,
COUNT(*) AS impressions
FROM
adh.yt_reserve_impressions_updm AS google_data_imp
LEFT JOIN
`my_data`
ON
google_data_imp.customer_data_user_id = CAST(my_data.user_id AS BYTES)
GROUP BY
inventory_type
UPDM FAQs
For a list of FAQs related to UPDM, see UPDM FAQs.