Monday, September 16, 2024

Implement tag-based entry management to your knowledge lake and Amazon Redshift knowledge sharing with AWS Lake Formation


Information-driven organizations deal with knowledge as an asset and use it throughout completely different strains of enterprise (LOBs) to drive well timed insights and higher enterprise choices. Many organizations have a distributed instruments and infrastructure throughout numerous enterprise models. This results in having knowledge throughout many cases of information warehouses and knowledge lakes utilizing a fashionable knowledge structure in separate AWS accounts.

Amazon Redshift knowledge sharing permits you to securely share reside, transactionally constant knowledge in a single Amazon Redshift knowledge warehouse with one other Redshift knowledge warehouse throughout the similar AWS account, throughout accounts, and throughout Areas, without having to repeat or transfer knowledge from one cluster to a different. Prospects need to have the ability to handle their permissions in a central place throughout all of their belongings. Beforehand, the administration of Redshift datashares was restricted to solely inside Amazon Redshift, which made it tough to handle your knowledge lake permissions and Amazon Redshift permissions in a single place. For instance, you needed to navigate to a person account to view and handle entry data for Amazon Redshift and the information lake on Amazon Easy Storage Service (Amazon S3). As a corporation grows, directors need a mechanism to successfully and centrally handle knowledge sharing throughout knowledge lakes and knowledge warehouses for governance and auditing, and to implement fine-grained entry management.

We not too long ago introduced the mixing of Amazon Redshift knowledge sharing with AWS Lake Formation. With this function, Amazon Redshift clients can now handle sharing, apply entry insurance policies centrally, and successfully scale the permission utilizing LF-Tags.

Lake Formation has been a preferred alternative for centrally governing knowledge lakes backed by Amazon S3. Now, with Lake Formation help for Amazon Redshift knowledge sharing, it opens up new design patterns and broadens governance and safety posture throughout knowledge warehouses. With this integration, you need to use Lake Formation to outline fine-grained entry management on tables and views being shared with Amazon Redshift knowledge sharing for federated AWS Identification and Entry Administration (IAM) customers and IAM roles. Lake Formation additionally offers tag-based entry management (TBAC), which can be utilized to simplify and scale governance of information catalog objects reminiscent of databases and tables.

On this submit, we talk about this new function and the best way to implement TBAC to your knowledge lake and Amazon Redshift knowledge sharing on Lake Formation.

Resolution overview

Lake Formation tag-based entry management (LF-TBAC) permits you to group related AWS Glue Information Catalog assets collectively and outline the grant or revoke permissions coverage through the use of an LF-Tag expression. LF-Tags are hierarchical in that when a database is tagged with an LF-Tag, all tables in that database inherit the tag, and when a LF-Tag is utilized to a desk, all of the columns inside that desk inherit the tag. Inherited tags then might be overridden if wanted. You then can create entry insurance policies inside Lake Formation utilizing LF-Tag expressions to grant principals entry to tagged assets utilizing an LF-Tag expression. See Managing LF-Tags for metadata entry management for extra particulars.

To display LF-TBAC with central knowledge entry governance functionality, we use the state of affairs the place two separate enterprise models personal explicit datasets and must share knowledge throughout groups.

We’ve got a buyer care crew who manages and owns the shopper data database together with buyer demographics knowledge. And have a advertising crew who owns a buyer leads dataset, which incorporates data on potential clients and get in touch with leads.

To have the ability to run efficient campaigns, the advertising crew wants entry to the shopper knowledge. On this submit, we display the method of sharing this knowledge that’s saved within the knowledge warehouse and giving the advertising crew entry. Moreover, there are personally identifiable data (PII) columns throughout the buyer dataset that ought to solely be accessed by a subset of energy customers on a need-to-know foundation. This fashion, knowledge analysts inside advertising can solely see non-PII columns to have the ability to run nameless buyer phase evaluation, however a bunch of energy customers can entry PII columns (for instance, buyer e-mail tackle) to have the ability to run campaigns or surveys for particular teams of shoppers.

The next diagram reveals the construction of the datasets that we work with on this submit and a tagging technique to supply fine-grained column-level entry.

Past our tagging technique on the information assets, the next desk offers an summary of how we must always grant permissions to our two personas through tags.

IAM Position Persona Useful resource Kind Permission LF-Tag expression
marketing-analyst An information analyst within the advertising crew DB describe (division:advertising OR division:buyer) AND classification:personal
. Desk choose (division:advertising OR division:buyer) AND classification:personal
. . . . .
marketing-poweruser A privileged person within the advertising crew DB describe (division:advertising OR division:buyer) AND classification: personal
. Desk (Column) choose (division:advertising OR division:buyer) AND (classification:personal OR classification:pii-sensitive)

The next diagram offers a high-level overview of the setup that we deploy on this submit.

The next is a high-level overview of the best way to use Lake Formation to regulate datashare permissions:

Producer Setup:

  1. Within the producers AWS account, the Amazon Redshift administrator that owns the shopper database creates a Redshift datashare on the producer cluster and grants utilization to the AWS Glue Information Catalog in the identical account.
  2. The producer cluster administrator authorizes the Lake Formation account to entry the datashare.
  3. In Lake Formation, the Lake Formation administrator discovers and registers the datashares. They have to uncover the AWS Glue ARNs they’ve entry to and affiliate the datashares with an AWS Glue Information Catalog ARN. In case you’re utilizing the AWS Command Line Interface (AWS CLI), you may uncover and settle for datashares with the Redshift CLI operations describe-data-shares and associate-data-share-consumer. To register a datashare, use the Lake Formation CLI operation register-resource.
  4. The Lake Formation administrator creates a federated database within the AWS Glue Information Catalog; assigns tags to the databases, tables, and columns; and configures Lake Formation permissions to regulate person entry to things throughout the datashare. For extra details about federated databases in AWS Glue, see Managing permissions for knowledge in an Amazon Redshift datashare.

Shopper Setup:

  1. On the patron aspect (advertising), the Amazon Redshift administrator discovers the AWS Glue database ARNs they’ve entry to, creates an exterior database within the Redshift client cluster utilizing an AWS Glue database ARN, and grants utilization to database customers authenticated with IAM credentials to start out querying the Redshift database.
  2. Database customers can use the views SVV_EXTERNAL_TABLES and SVV_EXTERNAL_COLUMNS to search out all of the tables or columns throughout the AWS Glue database that they’ve entry to; then they will question the AWS Glue database’s tables.

When the producer cluster administrator decides to not share the information with the patron cluster, the producer cluster administrator can revoke utilization, deauthorize, or delete the datashare from Amazon Redshift. The related permissions and objects in Lake Formation usually are not robotically deleted.

Stipulations:

To observe the steps on this submit, you should fulfill the next conditions:

Deploy surroundings together with producer and client Redshift clusters

To observe alongside the steps outlined on this submit, deploy following AWS CloudFormation stack that features mandatory assets to display the topic of this submit:

  1. Select Launch stack to deploy a CloudFormation template.
  2. Present an IAM function that you’ve got already configured as a Lake Formation administrator.
  3. Full the steps to deploy the template and go away all settings as default.
  4. Choose I acknowledge that AWS CloudFormation may create IAM assets, then select Submit.

This CloudFormation stack creates the next assets:

  • Producer Redshift cluster – Owned by the shopper care crew and has buyer and demographic knowledge on it.
  • Shopper Redshift cluster – Owned by the advertising crew and is used to investigate knowledge throughout knowledge warehouses and knowledge lakes.
  • S3 knowledge lake – Accommodates the online exercise and leads datasets.
  • Different mandatory assets to display the method of sharing knowledge – For instance, IAM roles, Lake Formation configuration, and extra. For a full record of assets created by the stack, look at the CloudFormation template.

After you deploy this CloudFormation template, assets created will incur price to your AWS account. On the finish of the method, just be sure you clear up assets to keep away from pointless fees.

After the CloudFormation stack is deployed efficiently (standing reveals as CREATE_COMPLETE), pay attention to the next objects on the Outputs tab:

  • Advertising and marketing analyst function ARN
  • Advertising and marketing energy person function ARN
  • URL for Amazon Redshift admin password saved in AWS Secrets and techniques Supervisor

Create a Redshift datashare and add related tables

On the AWS Administration Console, swap to the function that you just nominated as Lake Formation admin when deploying the CloudFormation template. Then go to Question Editor v2. If that is the primary time utilizing Question Editor V2 in your account, observe these steps to configure your AWS account.

Step one in Question Editor is to log in to the shopper Redshift cluster utilizing the database admin credentials to make your IAM admin function a DB admin on the database.

  1. Select the choices menu (three dots) subsequent to the lfunified-customer-dwh cluster and select Create connection.

  2. Choose Database person identify and password.
  3. Depart Database as dev.
  4. For Person identify, enter admin.
  5. For Password, full the next steps:
    1. Go to the console URL, which is the worth of the RedShiftClusterPassword CloudFormation output in earlier step. The URL is the Secrets and techniques Supervisor console for this password.
    2. Scroll right down to the Secret worth part and select Retrieve secret worth.
    3. Pay attention to the password to make use of later when connecting to the advertising Redshift cluster.
    4. Enter this worth for Password.
  6. Select Create connection.

Create a datashare utilizing a SQL command

Full the next steps to create a datashare within the knowledge producer cluster (buyer care) and share it with Lake Formation:

  1. On the Amazon Redshift console, within the navigation pane, select Editor, then Question editor V2.
  2. Select (right-click) the cluster identify and select Edit connection or Create connection.
  3. For Authentication, choose Short-term credentials utilizing your IAM identification.

Seek advice from Connecting to an Amazon Redshift database to study extra concerning the numerous authentication strategies.

  1. For Database, enter a database identify (for this submit, dev).
  2. Select Create connection to hook up with the database.
  3. Run the next SQL instructions to create the datashare and add the information objects to be shared:
    create datashare customer_ds;
    ALTER DATASHARE customer_ds ADD SCHEMA PUBLIC;
    ALTER DATASHARE customer_ds ADD TABLE buyer;

  4. Run the next SQL command to share the shopper datashare to the present account through the AWS Glue Information Catalog:
    GRANT USAGE ON DATASHARE customer_ds TO ACCOUNT '<aws-account-id>' through DATA CATALOG;

  5. Confirm the datashare was created and objects shared by working the next SQL command:
    DESC DATASHARE customer_ds;

Pay attention to the datashare producer cluster identify area and account ID, which can be used within the following step. You possibly can full the next actions on the console, however for simplicity, we use AWS CLI instructions.

  1. Go to CloudShell or your AWS CLI and run the next AWS CLI command to authorize the datashare to the Information Catalog in order that Lake Formation can handle them:
    aws redshift authorize-data-share 
    --data-share-arn 'arn:aws:redshift:<aws-region>:<aws-account-id>:datashare:<producer-cluster-namespace>/customer_ds' 
    --consumer-identifier DataCatalog/<aws-account-id>

The next is an instance output:

 {
    "DataShareArn": "arn:aws:redshift:us-east-2:<aws-account-id>:datashare:cd8d91b5-0c17-4567-a52a-59f1bdda71cd/customer_ds",
    "ProducerArn": "arn:aws:redshift:us-east-2:<aws-account-id>:namespace:cd8d91b5-0c17-4567-a52a-59f1bdda71cd",
    "AllowPubliclyAccessibleConsumers": false,
    "DataShareAssociations": [{
        "ConsumerIdentifier": "DataCatalog/<aws-account-id>XX",
        "Status": "AUTHORIZED",
        "CreatedDate": "2022-11-09T21:10:30.507000+00:00",
        "StatusChangeDate": "2022-11-09T21:10:50.932000+00:00"
    }]
}

Pay attention to your datashare ARN that you just used on this command to make use of within the subsequent steps.

Settle for the datashare within the Lake Formation catalog

To simply accept the datashare, full the next steps:

  1. Run the next AWS CLI command to simply accept and affiliate the Amazon Redshift datashare to the AWS Glue Information Catalog:
    aws redshift associate-data-share-consumer --data-share-arn 'arn:aws:redshift:<aws-region>:<aws-account-id>:datashare:<producer-cluster-namespace>/customer_ds' 
    --consumer-arn arn:aws:glue:<aws-region>:<aws-account-id>:catalog

The next is an instance output:

{
 "DataShareArn": "arn:aws:redshift:us-east-2:<aws-account-id>:datashare:cfd5fcbd-3492-42b5-9507-dad5d87f7427/customer_ds",
 "ProducerArn": "arn:aws:redshift:us-east-2:<aws-account-id>:namespace:cfd5fcbd-3492-42b5-9507-dad5d87f7427",
 "AllowPubliclyAccessibleConsumers": false,
 "DataShareAssociations": [
 {
 "ConsumerIdentifier": "arn:aws:glue:us-east-2:<aws-account-id>:catalog",
 "Status": "ACTIVE",
 "ConsumerRegion": "us-east-2",
 "CreatedDate": "2023-05-18T12:25:11.178000+00:00",
 "StatusChangeDate": "2023-05-18T12:25:11.178000+00:00"
 }
 ]
}

  1. Register the datashare in Lake Formation:
    aws lakeformation register-resource 
     --resource-arn arn:aws:redshift:<aws-region>:<producer-aws-account-id>:datashare:<producer-cluster-namespace>/customer_ds

  2. Create the AWS Glue database that factors to the accepted Redshift datashare:
    aws glue create-database --region <aws-region> --cli-input-json '{
        "CatalogId": "<aws-account-id>",
        "DatabaseInput": {
            "Identify": "customer_db_shared",
            "FederatedDatabase": {
                "Identifier": "arn:aws:redshift:<aws-region>:<producer-aws-account-id>:datashare:<producer-cluster-namespace>/customer_ds",
                "ConnectionName": "aws:redshift"
            }
        }
    }'

  3. To confirm, go to the Lake Formation console and test that the database customer_db_shared is created.

Now the information lake administrator can view and grant entry on each the database and tables to the information client crew (advertising) personas utilizing Lake Formation TBAC.

Assign Lake Formation tags to assets

Earlier than we grant applicable entry to the IAM principals of the information analyst and energy person throughout the advertising crew, we have now to assign LF-tags to tables and columns of the customer_db_shared database. We then grant these principals permission to applicable LF-tags.

To assign LF-tags, observe these steps:

  1. Assign the division and classification LF-tag to customer_db_shared (Redshift datashare) primarily based on the tagging technique desk within the resolution overview. You possibly can run the next actions on the console, however for this submit, we use the next AWS CLI command:
    aws lakeformation add-lf-tags-to-resource --cli-input-json '{
        "CatalogId": "<aws-account-id>",
        "Useful resource": {
        "Database": {
        "CatalogId": "<aws-account-id>",
        "Identify": "customer_db_shared"
        }
        },
        "LFTags": [
        {
        "CatalogId": "<aws-account-id>",
        "TagKey": "department",
        "TagValues": [
        "customer"]
        },
        {
        "CatalogId": "<aws-account-id>",
        "TagKey": "classification",
        "TagValues": [
        "private"]
        }
        ]
        }'

If the command is profitable, you need to get a response like the next:

  1. Assign the suitable division and classification LF-tag to marketing_db (on the S3 knowledge lake):
    aws lakeformation add-lf-tags-to-resource --cli-input-json '{
        "CatalogId": "<aws-account-id>",
        "Useful resource": {
        "Database": {
        "CatalogId": "<aws-account-id>",
        "Identify": "lfunified_marketing_dl_db"
        }
        },
        "LFTags": [
        {
        "CatalogId": "<aws-account-id>",
        "TagKey": "department",
        "TagValues": [
        "marketing"]
        },
        {
        "CatalogId": "<aws-account-id>",
        "TagKey": "classification",
        "TagValues": [
        "private"]
        }
        ]
        }'

Notice that though you solely assign the division and classification tag on the database degree, it will get inherited by the tables and columns inside that database.

  1. Assign the classification pii-sensitive LF-tag to PII columns of the buyer desk to override the inherited worth from the database degree:
    aws lakeformation add-lf-tags-to-resource --cli-input-json '{
        "CatalogId": "<aws-account-id>",
        "Useful resource": {
        "TableWithColumns": {
        "CatalogId": "<aws-account-id>",
        "DatabaseName": "customer_db_shared",
        "Identify": "public.buyer",
        "ColumnNames":["c_first_name","c_last_name","c_email_address"]
        }
        },
        "LFTags": [
        {
        "CatalogId": "<aws-account-id>",
        "TagKey": "classification",
        "TagValues": [
        "pii-sensitive"]
        }
        ]
        }'

Grant permission primarily based on LF-tag affiliation

Run the next two AWS CLI instructions to permit the advertising knowledge analyst entry to the shopper desk excluding the pii-sensitive (PII) columns. Substitute the worth for DataLakePrincipalIdentifier with the MarketingAnalystRoleARN that you just famous from the outputs of the CloudFormation stack:

aws lakeformation grant-permissions --cli-input-json '{
    "CatalogId": "<aws-account-id>",
    "Principal": {"DataLakePrincipalIdentifier" : "<MarketingAnalystRoleARN-from-CloudFormation-Outputs>"},
    "Useful resource": {
    "LFTagPolicy": {
    "CatalogId": "<aws-account-id>",
    "ResourceType": "DATABASE",
    "Expression": [{"TagKey": "department","TagValues": ["marketing","customer"]},{"TagKey": "classification","TagValues": ["private"]}]
    }
    },
    "Permissions": [
    "DESCRIBE"
    ],
    "PermissionsWithGrantOption": []
}'
aws lakeformation grant-permissions --cli-input-json '{
    "CatalogId": "<aws-account-id>",
    "Principal": {"DataLakePrincipalIdentifier" : "<MarketingAnalystRoleARN-from-CloudFormation-Outputs>"},
    "Useful resource": {
    "LFTagPolicy": {
    "CatalogId": "<aws-account-id>",
    "ResourceType": "TABLE",
    "Expression": [{"TagKey": "department","TagValues": ["marketing","customer"]},{"TagKey": "classification","TagValues": ["private"]}]
    }
    },
    "Permissions": [
    "SELECT"
    ],
    "PermissionsWithGrantOption": []
}'

We’ve got now granted advertising analysts entry to the shopper database and tables that aren’t pii-sensitive.

To permit advertising energy customers entry to desk columns with restricted LF-tag (PII columns), run the next AWS CLI command:

aws lakeformation grant-permissions --cli-input-json '{
    "CatalogId": "<aws-account-id>",
    "Principal": {"DataLakePrincipalIdentifier" : "<MarketingPowerUserRoleARN-from-CloudFormation-Outputs>"},
    "Useful resource": {
    "LFTagPolicy": {
    "CatalogId": "<aws-account-id>",
    "ResourceType": "DATABASE",
    "Expression": [{"TagKey": "department","TagValues": ["marketing","customer"]},{"TagKey": "classification","TagValues": ["private"]}]
    }
    },
    "Permissions": [
    "DESCRIBE"
    ],
    "PermissionsWithGrantOption": []
}'
aws lakeformation grant-permissions --cli-input-json '{
    "CatalogId": "<aws-account-id>",
    "Principal": {"DataLakePrincipalIdentifier" : "<MarketingPowerUserRoleARN-from-CloudFormation-Outputs>"},
    "Useful resource": {
    "LFTagPolicy": {
    "CatalogId": "<aws-account-id>",
    "ResourceType": "TABLE",
    "Expression": [{"TagKey": "department","TagValues": ["marketing","customer"]},{"TagKey": "classification","TagValues": ["private", "pii-sensitive"]}]
    }
    },
    "Permissions": [
    "SELECT"
    ],
    "PermissionsWithGrantOption": []
}'

We are able to mix the grants right into a single batch grant permissions name:

aws lakeformation batch-grant-permissions --region us-east-1 --cli-input-json '{
    "CatalogId": "<aws-account-id>",
 "Entries": [
 {  "Id": "1",
    "Principal": {"DataLakePrincipalIdentifier" : "arn:aws:iam:: <aws-account-id>:role/Blog-MarketingAnalystRole-1CYV6JSNN14E3"},
    "Resource": {
    "LFTagPolicy": {
    "CatalogId": "<aws-account-id>",
    "ResourceType": "DATABASE",
    "Expression": [{"TagKey": "department","TagValues": ["marketing","customer"]},{"TagKey": "classification","TagValues": ["private"]}]
    }
    },
    "Permissions": [
    "DESCRIBE"
    ],
    "PermissionsWithGrantOption": []
    },
    {  "Id": "2",
    "Principal": {"DataLakePrincipalIdentifier" : "arn:aws:iam:: <aws-account-id>:function/Weblog-MarketingAnalystRole-1CYV6JSNN14E3"},
    "Useful resource": {
    "LFTagPolicy": {
    "CatalogId": "<aws-account-id>",
    "ResourceType": "TABLE",
    "Expression": [{"TagKey": "department","TagValues": ["marketing","customer"]},{"TagKey": "classification","TagValues": ["private"]}]
    }
    },
    "Permissions": [
    "SELECT"
    ],
    "PermissionsWithGrantOption": []
    },
     {  "Id": "3",
    "Principal": {"DataLakePrincipalIdentifier" : "arn:aws:iam:: <aws-account-id>:function/Weblog-MarketingPoweruserRole-RKKM0TWQBP0W"},
    "Useful resource": {
    "LFTagPolicy": {
    "CatalogId": "<aws-account-id>",
    "ResourceType": "DATABASE",
    "Expression": [{"TagKey": "department","TagValues": ["marketing","customer"]},{"TagKey": "classification","TagValues": ["private", "pii-sensitive"]}]
    }
    },
    "Permissions": [
    "DESCRIBE"
    ],
    "PermissionsWithGrantOption": []
    },
    {  "Id": "4",
    "Principal": {"DataLakePrincipalIdentifier" : "arn:aws:iam:: <aws-account-id>:function/Weblog-MarketingPoweruserRole-RKKM0TWQBP0W"},
    "Useful resource": {
    "LFTagPolicy": {
    "CatalogId": "<aws-account-id>",
    "ResourceType": "TABLE",
    "Expression": [{"TagKey": "department","TagValues": ["marketing","customer"]},{"TagKey": "classification","TagValues": ["private", "pii-sensitive"]}]
    }
    },
    "Permissions": [
    "SELECT"
    ],
    "PermissionsWithGrantOption": []
    }
    ]
 }'

Validate the answer

On this part, we undergo the steps to check the state of affairs.

Eat the datashare within the client (advertising) knowledge warehouse

To allow the shoppers (advertising crew) to entry the shopper knowledge shared with them through the datashare, first we have now to configure Question Editor v2. This configuration is to make use of IAM credentials because the principal for the Lake Formation permissions. Full the next steps:

  1. Sign up to the console utilizing the admin function you nominated in working the CloudFormation template step.
  2. On the Amazon Redshift console, go to Question Editor v2.
  3. Select the gear icon within the navigation pane, then select Account settings.
  4. Beneath Connection settings, choose Authenticate with IAM credentials.
  5. Select Save.

Now let’s hook up with the advertising Redshift cluster and make the shopper database obtainable to the advertising crew.

  1. Select the choices menu (three dots) subsequent to the Serverless:lfunified-marketing-wg cluster and select Create connection.
  2. Choose Database person identify and password.
  3. Depart Database as dev.
  4. For Person identify, enter admin.
  5. For Password, enter the identical password you retrieved from Secrets and techniques Manger in an earlier step.
  6. Select Create connection.
  7. As soon as efficiently related, select the plus signal and select Editor to open a brand new Question Editor tab.
  8. Just be sure you specify the Serverless: lfunified-marketing-wg workgroup and dev database.
  9. To create the Redshift database from the shared catalog database, run the next SQL command on the brand new tab:
    CREATE DATABASE ext_customerdb_shared FROM ARN 'arn:aws:glue:<aws-region>:<aws-account-id>:database/customer_db_shared' WITH DATA CATALOG SCHEMA "customer_db_shared"

  10. Run the next SQL instructions to create and grant utilization on the Redshift database to the IAM roles for the ability customers and knowledge analyst. You will get the IAM function names from the CloudFormation stack outputs:
    CREATE USER IAMR:"lf-redshift-ds-MarketingAnalystRole-XXXXXXXXXXXX" password disable;
    GRANT USAGE ON DATABASE ext_customerdb_shared to IAMR:"lf-redshift-ds-MarketingAnalystRole-XXXXXXXXXXXX";
    
    CREATE USER IAMR:"lf-redshift-ds-MarketingPoweruserRole-YYYYYYYYYYYY" password disable;
    GRANT USAGE ON DATABASE ext_customerdb_shared to IAMR:"lf-redshift-ds-MarketingPoweruserRole-YYYYYYYYYYYY";

Create the information lake schema in AWS Glue and permit the advertising energy function to question the lead and internet exercise knowledge

Run the next SQL instructions to make the lead knowledge within the S3 knowledge lake obtainable to the advertising crew:

create exterior schema datalake from knowledge catalog
database 'lfunified_marketing_dl_db' 
iam_role 'SESSION'
catalog_id '<aws-account-id>';
GRANT USAGE ON SCHEMA datalake TO IAMR:"lf-redshift-ds-MarketingAnalystRole-XXXXXXXXXXXX";
GRANT USAGE ON SCHEMA datalake TO IAMR:"lf-redshift-ds-MarketingPoweruserRole-YYYYYYYYYYYY";

Question the shared dataset as a advertising analyst person

To validate that the advertising crew analysts (IAM function marketing-analyst-role) have entry to the shared database, carry out the next steps:

  1. Sign up to the console (for comfort, you need to use a distinct browser) and swap your function to lf-redshift-ds-MarketingAnalystRole-XXXXXXXXXXXX.
  2. On the Amazon Redshift console, go to Question Editor v2.
  3. To connect with the patron cluster, select the Serverless: lfunified-marketing-wg client knowledge warehouse within the navigation pane.
  4. When prompted, for Authentication, choose Federated person.
  5. For Database, enter the database identify (for this submit, dev).
  6. Select Save.
  7. When you’re related to the database, you may validate the present logged-in person with the next SQL command:
    choose current_user;

  8. To search out the federated databases created on the patron account, run the next SQL command:
    SHOW DATABASES FROM DATA CATALOG ACCOUNT '<aws-account-id>';

  9. To validate permissions for the advertising analyst function, run the next SQL command:
    choose * from ext_customerdb_shared.public.buyer restrict 10;

As you may see within the following screenshot, the advertising analyst is ready to efficiently entry the shopper knowledge however solely the non-PII attributes, which was our intention.

  1. Now let’s validate that the advertising analyst doesn’t have entry to the PII columns of the identical desk:
    choose c_customer_email from ext_customerdb_shared.public.buyer restrict 10;

Question the shared datasets as a advertising energy person

To validate that the advertising energy customers (IAM function lf-redshift-ds-MarketingPoweruserRole-YYYYYYYYYYYY) have entry to pii-sensetive columns within the shared database, carry out the next steps:

  1. Sign up to the console (for comfort, you need to use a distinct browser) and swap your function to lf-redshift-ds-MarketingPoweruserRole-YYYYYYYYYYYY.
  2. On the Amazon Redshift console, go to Question Editor v2.
  3. To connect with the patron cluster, select the Serverless: lfunified-marketing-wg client knowledge warehouse within the navigation pane.
  4. When prompted, for Authentication, choose Federated person.
  5. For Database, enter the database identify (for this submit, dev).
  6. Select Save.
  7. When you’re related to the database, you may validate the present logged-in person with the next SQL command:
    choose current_user;

  8. Now let’s validate that the advertising energy function has entry to the PII columns of the shopper desk:
    choose c_customer_id, c_first_name, c_last_name,c_customer_email from customershareddb.public.buyer restrict 10;

  9. Validate that the ability customers throughout the advertising crew can now run a question to mix knowledge throughout completely different datasets that they’ve entry to with a view to run efficient campaigns:
    SELECT
        emailaddress as emailAddress,  buyer.c_first_name as firstName, buyer.c_last_name as lastName, leadsource, contactnotes, usedpromo
    FROM
        "dev"."datalake"."lead" as lead
    JOIN ext_customerdb_shared.public.buyer as buyer
    ON lead.emailaddress = buyer.c_email_address
    WHERE lead.donotreachout="false"

Clear up

After you full the steps on this submit, to wash up assets, delete the CloudFormation stack:

  1. On the AWS CloudFormation console, choose the stack you deployed to start with of this submit.
  2. Select Delete and observe the prompts to delete the stack.

Conclusion

On this submit, we confirmed how you need to use Lake Formation tags and handle permissions to your knowledge lake and Amazon Redshift knowledge sharing utilizing Lake Formation. Utilizing Lake Formation LF-TBAC for knowledge governance helps you handle your knowledge lake and Amazon Redshift knowledge sharing permissions at scale. Additionally, it permits knowledge sharing throughout enterprise models with fine-grained entry management. Managing entry to your knowledge lake and Redshift datashares in a single place permits higher governance, serving to with knowledge safety and compliance.

You probably have questions or ideas, submit them within the feedback part.

For extra data on Lake Formation managed Amazon Redshift knowledge sharing and tag-based entry management, discuss with Centrally handle entry and permissions for Amazon Redshift knowledge sharing with AWS Lake Formation and Simply handle your knowledge lake at scale utilizing AWS Lake Formation Tag-based entry management.


In regards to the Authors

Praveen Kumar is an Analytics Resolution Architect at AWS with experience in designing, constructing, and implementing fashionable knowledge and analytics platforms utilizing cloud-native providers. His areas of pursuits are serverless know-how, fashionable cloud knowledge warehouses, streaming, and ML functions.

Srividya Parthasarathy is a Senior Large Information Architect on the AWS Lake Formation crew. She enjoys constructing knowledge mesh options and sharing them with the group.

Paul Villena is an Analytics Options Architect in AWS with experience in constructing fashionable knowledge and analytics options to drive enterprise worth. He works with clients to assist them harness the ability of the cloud. His areas of pursuits are infrastructure as code, serverless applied sciences, and coding in Python.

Mostafa Safipour is a Options Architect at AWS primarily based out of Sydney. He works with clients to comprehend enterprise outcomes utilizing know-how and AWS. Over the previous decade, he has helped many giant organizations within the ANZ area construct their knowledge, digital, and enterprise workloads on AWS.

Related Articles

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Stay Connected

0FansLike
3,912FollowersFollow
0SubscribersSubscribe
- Advertisement -spot_img

Latest Articles