Skip to content

CREATE CATALOG INTEGRATION

The CREATE CATALOG INTEGRATION command defines a named external catalog integration that can be used to access metadata in external catalog systems like AWS Glue, Iceberg REST catalogs, or object storage-based catalogs.

Syntax

CREATE CATALOG INTEGRATION [IF NOT EXISTS] integration_name
CATALOG_SOURCE = source_type
TABLE_FORMAT = format_type
[CATALOG_NAMESPACE = 'namespace']
ENABLED = [TRUE|FALSE]
[GLUE_AWS_ROLE_ARN = 'aws_role_arn']
[GLUE_CATALOG_ID = 'catalog_id']
[GLUE_REGION = 'region']
[REST_CONFIG.CATALOG_URI = 'catalog_uri']
[REST_CONFIG.CATALOG_API_TYPE = api_type]
[REST_CONFIG.WAREHOUSE = 'warehouse']
[REST_AUTHENTICATION.TYPE = auth_type]
[REST_AUTHENTICATION.OAUTH_CLIENT_ID = 'client_id']
[REST_AUTHENTICATION.OAUTH_CLIENT_SECRET = 'client_secret']
[REST_AUTHENTICATION.OAUTH_ALLOWED_SCOPES = ('scope1', 'scope2', ...)]
[REST_AUTHENTICATION.BASIC_USERNAME = 'username']
[REST_AUTHENTICATION.BASIC_PASSWORD = 'password']
[REST_AUTHENTICATION.SIGV4_IAM_ROLE = 'role_arn']
[REST_AUTHENTICATION.SIGV4_SIGNING_REGION = 'region']
[STORAGE_PROVIDER = provider_type]
[STORAGE_LOCATION = 'location']
[STORAGE_AWS_ROLE_ARN = 'aws_role_arn']
[STORAGE_AWS_EXTERNAL_ID = 'external_id']
[AZURE_TENANT_ID = 'tenant_id']
[AZURE_APPLICATION_ID = 'application_id']
[GCP_SERVICE_ACCOUNT = 'service_account_email']
[COMMENT = 'comment_string']
  • IF NOT EXISTS: (Optional) Prevents an error if an integration with the same name already exists.
  • integration_name: The name of the catalog integration to create.
  • CATALOG_SOURCE = source_type: The type of catalog source. Valid values are:
    • GLUE: AWS Glue Data Catalog
    • ICEBERG_REST: Iceberg REST catalog
    • OBJECT_STORAGE: Object storage-based catalog (e.g., S3, GCS, Azure)
  • TABLE_FORMAT = format_type: The table format used by the catalog. Currently, only ICEBERG is supported.
  • CATALOG_NAMESPACE = 'namespace': (Optional) The namespace to use within the catalog.
  • ENABLED = [TRUE|FALSE]: Whether the integration is enabled.
  • COMMENT = 'comment_string': (Optional) A comment describing the integration.

AWS Glue-Specific Parameters

  • GLUE_AWS_ROLE_ARN = 'aws_role_arn': (Required for Glue) The ARN of the AWS IAM role that ScramDB will assume to access Glue.
  • GLUE_CATALOG_ID = 'catalog_id': (Optional for Glue) The AWS account ID that owns the Glue catalog.
  • GLUE_REGION = 'region': (Optional for Glue) The AWS region where the Glue catalog is located.

Iceberg REST-Specific Parameters

  • REST_CONFIG.CATALOG_URI = 'catalog_uri': (Required for Iceberg REST) The URI of the REST catalog endpoint.
  • REST_CONFIG.CATALOG_API_TYPE = api_type: (Optional for Iceberg REST) The API type of the REST catalog. Valid values are PUBLIC or PRIVATE.
  • REST_CONFIG.WAREHOUSE = 'warehouse': (Optional for Iceberg REST) The warehouse location.
  • REST_AUTHENTICATION.TYPE = auth_type: (Required for Iceberg REST) The authentication type. Valid values are OAUTH, BASIC, or SIGV4.
  • REST_AUTHENTICATION.OAUTH_CLIENT_ID = 'client_id': (Required for OAuth) The OAuth client ID.
  • REST_AUTHENTICATION.OAUTH_CLIENT_SECRET = 'client_secret': (Required for OAuth) The OAuth client secret.
  • REST_AUTHENTICATION.OAUTH_ALLOWED_SCOPES = ('scope1', 'scope2', ...): (Optional for OAuth) A list of allowed OAuth scopes.
  • REST_AUTHENTICATION.BASIC_USERNAME = 'username': (Required for Basic auth) The username for basic authentication.
  • REST_AUTHENTICATION.BASIC_PASSWORD = 'password': (Required for Basic auth) The password for basic authentication.
  • REST_AUTHENTICATION.SIGV4_IAM_ROLE = 'role_arn': (Required for SigV4) The ARN of the AWS IAM role to use for SigV4 signing.
  • REST_AUTHENTICATION.SIGV4_SIGNING_REGION = 'region': (Required for SigV4) The AWS region to use for SigV4 signing.

Object Storage-Specific Parameters

  • STORAGE_PROVIDER = provider_type: (Required for Object Storage) The type of storage provider. Valid values are S3, GCS, or AZURE.
  • STORAGE_LOCATION = 'location': (Required for Object Storage) The location of the catalog in object storage (e.g., s3://bucket/catalog/).
  • STORAGE_AWS_ROLE_ARN = 'aws_role_arn': (Required for S3) The ARN of the AWS IAM role that ScramDB will assume to access S3.
  • STORAGE_AWS_EXTERNAL_ID = 'external_id': (Optional for S3) The external ID to use when assuming the AWS IAM role.
  • AZURE_TENANT_ID = 'tenant_id': (Required for Azure) The Azure tenant ID.
  • AZURE_APPLICATION_ID = 'application_id': (Required for Azure) The Azure application ID.
  • GCP_SERVICE_ACCOUNT = 'service_account_email': (Required for GCS) The email address of the GCP service account.

Source-Specific Examples

AWS Glue Catalog

CREATE CATALOG INTEGRATION glue_catalog_int
CATALOG_SOURCE = GLUE
TABLE_FORMAT = ICEBERG
CATALOG_NAMESPACE = 'my_namespace'
GLUE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my-role'
GLUE_CATALOG_ID = '123456789012'
GLUE_REGION = 'us-east-1'
ENABLED = TRUE
COMMENT = 'AWS Glue catalog integration'

Iceberg REST Catalog

CREATE CATALOG INTEGRATION rest_catalog_int
CATALOG_SOURCE = ICEBERG_REST
TABLE_FORMAT = ICEBERG
CATALOG_NAMESPACE = 'my_namespace'
REST_CONFIG.CATALOG_URI = 'https://rest-catalog-endpoint.example.com/iceberg'
REST_CONFIG.CATALOG_API_TYPE = PUBLIC
REST_AUTHENTICATION.TYPE = OAUTH
REST_AUTHENTICATION.OAUTH_CLIENT_ID = 'my_client_id'
REST_AUTHENTICATION.OAUTH_CLIENT_SECRET = 'my_client_secret'
REST_AUTHENTICATION.OAUTH_ALLOWED_SCOPES = ('PRINCIPAL_ROLE:ALL')
ENABLED = TRUE
COMMENT = 'Iceberg REST catalog integration'

Object Storage-Based Catalog (S3)

CREATE CATALOG INTEGRATION s3_catalog_int
CATALOG_SOURCE = OBJECT_STORAGE
TABLE_FORMAT = ICEBERG
CATALOG_NAMESPACE = 'my_namespace'
STORAGE_PROVIDER = S3
STORAGE_LOCATION = 's3://my-bucket/iceberg-catalog/'
STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my-s3-role'
ENABLED = TRUE
COMMENT = 'S3-based Iceberg catalog integration'

Usage

After creating a catalog integration, you can use it to query tables in the external catalog:

-- Set the current catalog to use the integration
USE CATALOG glue_catalog_int;
-- Query a table from the catalog
SELECT * FROM my_database.my_table;

Information Schema

You can view information about catalog integrations using the information_schema.catalog_integrations view:

SELECT * FROM information_schema.catalog_integrations;