Redacting PII data in Dialogflow CX with Google Cloud Data Loss Prevention (DLP)

November 11, 2022

Redacting Session Parameters, Webhook data, and Response Messages

For Session Parameters, Webhook data, and other data logged by Dialogflow CX, including Fulfillment Response Messages, the approach to redact such information relies on Cloud Data Loss Prevention (DLP) inspection templates.

Session Parameters are often used to personalize the conversation with user data from an upstream system. For example, an upstream contact center platform may fetch the user’s profile from a CRM, and pass in the first name, demographic data, and market segment information into Dialogflow CX. A conversation designer may then tailor the Flow design by changing Intent training phrases, Entity synonyms, and responses (e.g. different durations, volume, pitch, or rate of speech) to fit the user’s unique requirements.

Similarly, Webhook data is important in conversation design because it enables rich, dynamic responses to the user supported by backend systems. For example, let’s say a customer is moving to a new apartment, so your Dialogflow CX Virtual Agent asks the user to say their new street address. A Webhook would be used to validate the captured address against an external service like the Google Maps Places API, which may also autocomplete the city, state / province, zip / postal code, and country fields. Capturing the wrong address is risky, so the Virtual Agent says the full address back to the end user for confirmation.

In both examples above, PII data is stored as one or more Session Parameters and Webhook payloads. Additionally, the Response Messages played back to the user are logged. If we don’t take action to identify and redact this data, it will make its way into Google Cloud Logging (formerly Stackdriver) and any listeners subscribing to the log stream.

Below, we demonstrate how we can configure security settings in Dialogflow CX to use a Cloud Data Loss Prevention Inspection Template to redact sensitive information before it gets into downstream logging systems (i.e. redaction at source). This ensures sensitive information will be unavailable downstream while still allowing the information to be used in the design of the Virtual Agent.

Key Components

Data Loss Prevention (DLP) Inspection Templates

Our solution uses Google Cloud Data Loss Prevention (DLP), which is a service that can identify, mask, obfuscate, de-identify, transform, or tokenize sensitive information in text using NLP- and rules-based methods. To leverage DLP to redact all log data from Dialogflow CX at source, we create configurations (also known as Inspection Templates) that can identify and transform unstructured text information in a document. In our case, the documents are the log messages that contain the Session Parameters, Webhook data, Fulfillment Response Messages and any other interaction data. To identify PII, PCI, PHI, or CI, we can set the configuration to use a pre-trained machine learning model (i.e. built-in infoTypes) or a custom string search (i.e. word lists or regex).

Speech Synthesis Markup Language (SSML)

Our solution uses Speech Synthesis Markup Language (SSML). A brief explanation of SSML is included in the paragraph below:

When working with Text-to-Speech (TTS) systems, it is difficult to know how the system will say the final utterance to a user. This is where SSML is useful. SSML is a W3C standard that uses XML tags to describe, at various points, how the TTS system must say the phrase. You can change the pitch, pronunciation, speaking rate, and volume, among many other properties. For example, if you have a phone number written as “555-6666”, you would likely want it said as “five five five six six six six” instead of “five hundred and fifty five minus six thousand six hundred and sixty six”. You can give these precise instructions to the TTS system by adding the following SSML:

<say-as interpret-as="telephone">555-6666</say-as>
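The same tag can be generated programmatically rather than typed by hand. The following is a minimal sketch in Python; the helper names say_as and speak are hypothetical and are not part of any Google library. It only assembles the SSML string and escapes the text so that characters such as "&" do not break the XML.

```python
from xml.sax.saxutils import escape, quoteattr


def say_as(text: str, interpret_as: str) -> str:
    """Wrap text in an SSML say-as element (hypothetical helper)."""
    # quoteattr adds the surrounding quotes and escapes the attribute value.
    return f"<say-as interpret-as={quoteattr(interpret_as)}>{escape(text)}</say-as>"


def speak(*fragments: str) -> str:
    """Join already-built SSML fragments inside a single speak root element."""
    return "<speak>" + "".join(fragments) + "</speak>"


ssml = speak("Your number is ", say_as("555-6666", "telephone"), ".")
print(ssml)
# <speak>Your number is <say-as interpret-as="telephone">555-6666</say-as>.</speak>
```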

Contact Center AI (CCAI) Security Settings

CCAI Security Settings allow you to apply a DLP Inspection Template between Dialogflow CX and Google Cloud Logging. The DLP system can then find and redact the sensitive information before it is published to Cloud Logging (formerly Stackdriver).
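As a rough illustration of what such a setting contains, the sketch below builds a plain Python dictionary shaped like a Dialogflow CX SecuritySettings resource. The field names and enum values follow the public v3 API as we understand it and should be checked against the current reference before use. The project, location, and template identifiers are placeholders, and the function name is hypothetical. Nothing here calls a Google API.

```python
def build_security_settings(project: str, location: str, template_id: str) -> dict:
    """Return a dict shaped like a Dialogflow CX SecuritySettings resource.

    All identifiers passed in are placeholders chosen by the caller.
    """
    inspect_template = (
        f"projects/{project}/locations/{location}/inspectTemplates/{template_id}"
    )
    return {
        "display_name": "redact-marked-spans",  # hypothetical display name
        # Ask Dialogflow CX to redact using the DLP service.
        "redaction_strategy": "REDACT_WITH_SERVICE",
        # Redact before the data is written to disk storage (logs).
        "redaction_scope": "REDACT_DISK_STORAGE",
        "inspect_template": inspect_template,
    }


settings = build_security_settings("my-project", "global", "my-template")
print(settings["inspect_template"])
# projects/my-project/locations/global/inspectTemplates/my-template
```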

Solution

The required security settings can be applied in various ways, such as through the Google Cloud Console, using Google Cloud APIs, or using Terraform.
Below, we outline two approaches: 1) using the Google Cloud Console and 2) using Terraform.

Important Considerations

The first seemingly obvious but flawed solution is to use DLP or a similar system to redact sensitive information in the first downstream system that consumes the Dialogflow CX log messages. Perhaps there is a log sink flowing to a Cloud Storage bucket, BigQuery table, Pub/Sub topic, or other destination (e.g. Splunk) where such redaction will occur before any other consumers have access to the data. In practice, data in Cloud Logging is easily viewable and propagates to other monitoring applications, which increases the surface area for unintentional or intentional privacy breaches by both internal and external parties. As such, please consider this an anti-pattern.

Another important note is that the solution we select should still enable sensitive information, including PII data, to be usable in responses to the end user and should remain compatible with SSML.

Instructions – Google Cloud Console

Now that we understand the requirements and all the components involved, the first step is to return all Session Parameter and Webhook data that is to be redacted with the SSML mark tag shown below. This is configured at the webhook level.

<mark name="redact-start"/>123 Main Street<mark name="redact-end"/>

This SSML tag is selected because it is defined in the W3C SSML specification and does not affect speech output by TTS systems. This ensures the data can still be used in Response Messages by the Dialogflow CX Agent. Note that the “name” attribute can be anything, as long as it matches your own convention and the pattern used for redaction.
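A webhook could apply this tagging just before it returns data to Dialogflow CX. The sketch below is one possible shape, not the article's implementation: the function names and the validated_address parameter name are hypothetical, and the response keys (sessionInfo, fulfillmentResponse, outputAudioText) follow the Dialogflow CX webhook response JSON format as we understand it.

```python
import json
from xml.sax.saxutils import escape

REDACT_START = '<mark name="redact-start"/>'
REDACT_END = '<mark name="redact-end"/>'


def mark_for_redaction(value: str) -> str:
    """Wrap a sensitive value in the start/end mark tags."""
    # Escape the value so characters like "&" keep the SSML well-formed.
    return f"{REDACT_START}{escape(value)}{REDACT_END}"


def build_webhook_response(validated_address: str) -> dict:
    """Build a webhook response body that carries the tagged address."""
    tagged = mark_for_redaction(validated_address)
    return {
        # Session Parameter that later responses can reference.
        "sessionInfo": {"parameters": {"validated_address": tagged}},
        # Spoken confirmation; the mark tags produce no audio.
        "fulfillmentResponse": {
            "messages": [
                {
                    "outputAudioText": {
                        "ssml": f"<speak>I have your address as {tagged}. "
                        "Is that correct?</speak>"
                    }
                }
            ]
        },
    }


response = build_webhook_response("123 Main Street")
print(json.dumps(response, indent=2))
```

Keeping the tag strings in two constants makes it easier to keep the webhook and the DLP pattern in agreement if the naming convention changes.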

Next, define a string pattern in a DLP inspection template as a custom infoType that will search for these tags. The configuration uses the regular expression <mark name="redact-start"/>.*<mark name="redact-end"/> as its search pattern.
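To make the pattern concrete, the sketch below builds a dictionary shaped like a DLP inspection template with one regex-based custom infoType, then tries the pattern locally with Python's re module. The template display name, the infoType name MARKED_FOR_REDACTION, and the [REDACTED] placeholder are assumptions for illustration. The local substitution only approximates what DLP would match; it does not call the DLP service, and the text DLP actually writes in place of a match depends on your configuration.

```python
import re

# Pattern from the article (greedy) and a lazy variant for comparison.
SOURCE_PATTERN = r'<mark name="redact-start"/>.*<mark name="redact-end"/>'
LAZY_PATTERN = r'<mark name="redact-start"/>.*?<mark name="redact-end"/>'


def build_inspect_template(pattern: str) -> dict:
    """Return a dict shaped like a DLP InspectTemplate with one custom infoType."""
    return {
        "display_name": "dialogflow-cx-mark-redaction",  # hypothetical name
        "inspect_config": {
            "custom_info_types": [
                {
                    "info_type": {"name": "MARKED_FOR_REDACTION"},  # hypothetical
                    "regex": {"pattern": pattern},
                }
            ]
        },
    }


def redact_locally(log_line: str, pattern: str, placeholder: str = "[REDACTED]") -> str:
    """Approximate the redaction locally by replacing every match."""
    return re.sub(pattern, placeholder, log_line)


line = (
    'Address: <mark name="redact-start"/>123 Main Street<mark name="redact-end"/>, '
    'phone: <mark name="redact-start"/>555-6666<mark name="redact-end"/>'
)
print(redact_locally(line, SOURCE_PATTERN))  # Address: [REDACTED]
print(redact_locally(line, LAZY_PATTERN))    # Address: [REDACTED], phone: [REDACTED]
```

With the greedy pattern, a line containing two tagged values is matched from the first start tag to the last end tag, so the text between the two values is redacted as well. That errs on the side of over-redaction; the lazy variant keeps each tagged span separate if the untagged text in between needs to stay readable in the logs.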
