Scratch Schema

The Personalization data warehouse supports the use of "scratch" tables for analysis, reporting, or investigation. In addition, scratch tables can be used as a data source by the Personalization ETL system. Using the ETL system and scratch tables, data scientists can load machine learning model scoring outputs from their data science workbench back into the Personalization production database to enrich user profile attributes. Customers can add and drop scratch tables in their data warehouse that follow the scratch schema specifications described in this article.

The Data Warehouse scratch tables temporarily store Data Science workbench model score outputs for loading back into Personalization user profile attributes. Personalization Data Warehouse supports the ingestion of data written to customer's scratch schema. Tables in the scratch schema can be treated as data sources for the Personalization ETL system.

Multi-channel operation is enabled when user identities can be matched with the Personalization identity resolution system. Catalog data best support multi-channel operation when identities for online and offline product and dimension identities match.

Table Name Format: schema_user_{token}_YYYYMMDD_HHMMSS

Example: retail_user_20200101_093000. Optionally, you can include a descriptive token in the name: retail_user_email06_20200101_090000.

You must supply some form of user id, so at least one of the following optional user or email address fields is required to process a row.

In order to be able to process updates, include a created_ts field, as shown in the following table. This way, the ETL can process rows created since the last run. Instead of updating attributes in a scratch table, use a DELETE and an INSERT so that the created_ts field is reset, or include the created_ts in your UPDATE.

Requirements and Schema

Field NameMinimum RequirementsExample ValuesData Type
user_idOptional. Must match the format of the configured default user ID type for this feed. Since a user can have multiple identities, user ids can also be constructed from other identifying information in the record (email_address, attributes, or other user_id: fields). If none of these methods result in a user ID, this record is not loaded.jdoe, john.doe@example.com, c2e384084c8ac233VARCHAR(120)
user_id:Optional. Column titles are the token user_id followed by a colon followed by the user ID type. Values must match the format of the specified user ID type. Otherwise, processed like user_id."user_id:named": "jdoe", user_id:marketo_id: "29237102"VARCHAR(120)
account_idOptional. If provided, a string value of the account ID to associate with the user. Must be lowercase. This field is only available with Personalization B2B functionality enabled. Contact your customer success representative for details.example industries, example.comVARCHAR(120)
display_nameOptional. Display name for user.Bob NewhartVARCHAR(1023)
email_addressOptional. Valid email address. If provided, used as a user identity of the type emailAddress.user@example.com, bob.smith@example.comVARCHAR(1023)
attribute:Optional. The custom attributes for this user. Column titles are the word attribute followed by a colon followed by the attribute name. Attributes are singled-valued and always interpreted as strings. Attributes can be used to construct user IDs."attribute:lastname": Gina, "attribute:firstname: TorresVARCHAR(1023)
created_tsRequired. Records when the row was created.TIMESTAMP WITHOUT TIME ZONE NOT NULL DEFAULT getdate()