CREATE CATALOG INTEGRATION
The CREATE CATALOG INTEGRATION
command defines a named external catalog integration that can be used to access metadata in external catalog systems like AWS Glue, Iceberg REST catalogs, or object storage-based catalogs.
Syntax
CREATE CATALOG INTEGRATION [IF NOT EXISTS] integration_name CATALOG_SOURCE = source_type TABLE_FORMAT = format_type [CATALOG_NAMESPACE = 'namespace'] ENABLED = [TRUE|FALSE] [GLUE_AWS_ROLE_ARN = 'aws_role_arn'] [GLUE_CATALOG_ID = 'catalog_id'] [GLUE_REGION = 'region'] [REST_CONFIG.CATALOG_URI = 'catalog_uri'] [REST_CONFIG.CATALOG_API_TYPE = api_type] [REST_CONFIG.WAREHOUSE = 'warehouse'] [REST_AUTHENTICATION.TYPE = auth_type] [REST_AUTHENTICATION.OAUTH_CLIENT_ID = 'client_id'] [REST_AUTHENTICATION.OAUTH_CLIENT_SECRET = 'client_secret'] [REST_AUTHENTICATION.OAUTH_ALLOWED_SCOPES = ('scope1', 'scope2', ...)] [REST_AUTHENTICATION.BASIC_USERNAME = 'username'] [REST_AUTHENTICATION.BASIC_PASSWORD = 'password'] [REST_AUTHENTICATION.SIGV4_IAM_ROLE = 'role_arn'] [REST_AUTHENTICATION.SIGV4_SIGNING_REGION = 'region'] [STORAGE_PROVIDER = provider_type] [STORAGE_LOCATION = 'location'] [STORAGE_AWS_ROLE_ARN = 'aws_role_arn'] [STORAGE_AWS_EXTERNAL_ID = 'external_id'] [AZURE_TENANT_ID = 'tenant_id'] [AZURE_APPLICATION_ID = 'application_id'] [GCP_SERVICE_ACCOUNT = 'service_account_email'] [COMMENT = 'comment_string']
IF NOT EXISTS
: (Optional) Prevents an error if an integration with the same name already exists.integration_name
: The name of the catalog integration to create.CATALOG_SOURCE = source_type
: The type of catalog source. Valid values are:GLUE
: AWS Glue Data CatalogICEBERG_REST
: Iceberg REST catalogOBJECT_STORAGE
: Object storage-based catalog (e.g., S3, GCS, Azure)
TABLE_FORMAT = format_type
: The table format used by the catalog. Currently, onlyICEBERG
is supported.CATALOG_NAMESPACE = 'namespace'
: (Optional) The namespace to use within the catalog.ENABLED = [TRUE|FALSE]
: Whether the integration is enabled.COMMENT = 'comment_string'
: (Optional) A comment describing the integration.
AWS Glue-Specific Parameters
GLUE_AWS_ROLE_ARN = 'aws_role_arn'
: (Required for Glue) The ARN of the AWS IAM role that ScramDB will assume to access Glue.GLUE_CATALOG_ID = 'catalog_id'
: (Optional for Glue) The AWS account ID that owns the Glue catalog.GLUE_REGION = 'region'
: (Optional for Glue) The AWS region where the Glue catalog is located.
Iceberg REST-Specific Parameters
REST_CONFIG.CATALOG_URI = 'catalog_uri'
: (Required for Iceberg REST) The URI of the REST catalog endpoint.REST_CONFIG.CATALOG_API_TYPE = api_type
: (Optional for Iceberg REST) The API type of the REST catalog. Valid values arePUBLIC
orPRIVATE
.REST_CONFIG.WAREHOUSE = 'warehouse'
: (Optional for Iceberg REST) The warehouse location.REST_AUTHENTICATION.TYPE = auth_type
: (Required for Iceberg REST) The authentication type. Valid values areOAUTH
,BASIC
, orSIGV4
.REST_AUTHENTICATION.OAUTH_CLIENT_ID = 'client_id'
: (Required for OAuth) The OAuth client ID.REST_AUTHENTICATION.OAUTH_CLIENT_SECRET = 'client_secret'
: (Required for OAuth) The OAuth client secret.REST_AUTHENTICATION.OAUTH_ALLOWED_SCOPES = ('scope1', 'scope2', ...)
: (Optional for OAuth) A list of allowed OAuth scopes.REST_AUTHENTICATION.BASIC_USERNAME = 'username'
: (Required for Basic auth) The username for basic authentication.REST_AUTHENTICATION.BASIC_PASSWORD = 'password'
: (Required for Basic auth) The password for basic authentication.REST_AUTHENTICATION.SIGV4_IAM_ROLE = 'role_arn'
: (Required for SigV4) The ARN of the AWS IAM role to use for SigV4 signing.REST_AUTHENTICATION.SIGV4_SIGNING_REGION = 'region'
: (Required for SigV4) The AWS region to use for SigV4 signing.
Object Storage-Specific Parameters
STORAGE_PROVIDER = provider_type
: (Required for Object Storage) The type of storage provider. Valid values areS3
,GCS
, orAZURE
.STORAGE_LOCATION = 'location'
: (Required for Object Storage) The location of the catalog in object storage (e.g.,s3://bucket/catalog/
).STORAGE_AWS_ROLE_ARN = 'aws_role_arn'
: (Required for S3) The ARN of the AWS IAM role that ScramDB will assume to access S3.STORAGE_AWS_EXTERNAL_ID = 'external_id'
: (Optional for S3) The external ID to use when assuming the AWS IAM role.AZURE_TENANT_ID = 'tenant_id'
: (Required for Azure) The Azure tenant ID.AZURE_APPLICATION_ID = 'application_id'
: (Required for Azure) The Azure application ID.GCP_SERVICE_ACCOUNT = 'service_account_email'
: (Required for GCS) The email address of the GCP service account.
Source-Specific Examples
AWS Glue Catalog
CREATE CATALOG INTEGRATION glue_catalog_int CATALOG_SOURCE = GLUE TABLE_FORMAT = ICEBERG CATALOG_NAMESPACE = 'my_namespace' GLUE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my-role' GLUE_CATALOG_ID = '123456789012' GLUE_REGION = 'us-east-1' ENABLED = TRUE COMMENT = 'AWS Glue catalog integration'
Iceberg REST Catalog
CREATE CATALOG INTEGRATION rest_catalog_int CATALOG_SOURCE = ICEBERG_REST TABLE_FORMAT = ICEBERG CATALOG_NAMESPACE = 'my_namespace' REST_CONFIG.CATALOG_URI = 'https://rest-catalog-endpoint.example.com/iceberg' REST_CONFIG.CATALOG_API_TYPE = PUBLIC REST_AUTHENTICATION.TYPE = OAUTH REST_AUTHENTICATION.OAUTH_CLIENT_ID = 'my_client_id' REST_AUTHENTICATION.OAUTH_CLIENT_SECRET = 'my_client_secret' REST_AUTHENTICATION.OAUTH_ALLOWED_SCOPES = ('PRINCIPAL_ROLE:ALL') ENABLED = TRUE COMMENT = 'Iceberg REST catalog integration'
Object Storage-Based Catalog (S3)
CREATE CATALOG INTEGRATION s3_catalog_int CATALOG_SOURCE = OBJECT_STORAGE TABLE_FORMAT = ICEBERG CATALOG_NAMESPACE = 'my_namespace' STORAGE_PROVIDER = S3 STORAGE_LOCATION = 's3://my-bucket/iceberg-catalog/' STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my-s3-role' ENABLED = TRUE COMMENT = 'S3-based Iceberg catalog integration'
Usage
After creating a catalog integration, you can use it to query tables in the external catalog:
-- Set the current catalog to use the integrationUSE CATALOG glue_catalog_int;
-- Query a table from the catalogSELECT * FROM my_database.my_table;
Information Schema
You can view information about catalog integrations using the information_schema.catalog_integrations
view:
SELECT * FROM information_schema.catalog_integrations;