Skip to content

INFORMATION SCHEMA Extensions

In addition to the standard information_schema.tables and information_schema.columns views, ScramDB provides additional views in the information_schema schema to query metadata about storage and catalog integrations.

Storage Integrations View

The information_schema.storage_integrations view provides information about all storage integrations defined in the system.

Columns

  • name: The name of the storage integration.
  • provider: The storage provider type (e.g., S3, GCS, AZURE).
  • enabled: Whether the integration is enabled (YES or NO).
  • storage_aws_role_arn: (For S3) The ARN of the AWS IAM role.
  • storage_aws_external_id: (For S3) The external ID for the AWS IAM role.
  • gcp_service_account: (For GCS) The GCP service account email.
  • azure_tenant_id: (For Azure) The Azure tenant ID.
  • azure_application_id: (For Azure) The Azure application ID.
  • storage_allowed_locations: The list of allowed storage locations.
  • catalog_type: The type of catalog associated with the storage integration.
  • catalog_namespace: The namespace used with the catalog.
  • comment: The comment provided when creating the integration.

Example Queries

-- List all storage integrations
SELECT * FROM information_schema.storage_integrations;
-- List only enabled S3 integrations
SELECT * FROM information_schema.storage_integrations
WHERE provider = 'S3' AND enabled = 'YES';
-- Find integrations with a specific allowed location
SELECT * FROM information_schema.storage_integrations
WHERE storage_allowed_locations LIKE '%my-bucket%';

Catalog Integrations View

The information_schema.catalog_integrations view provides information about all catalog integrations defined in the system.

Columns

  • name: The name of the catalog integration.
  • catalog_source: The catalog source type (e.g., GLUE, ICEBERG_REST, OBJECT_STORAGE).
  • catalog_namespace: The namespace used within the catalog.
  • table_format: The table format used by the catalog (e.g., ICEBERG).
  • enabled: Whether the integration is enabled (TRUE or FALSE).
  • comment: The comment provided when creating the integration.

Glue-Specific Columns

  • glue_aws_role_arn: The ARN of the AWS IAM role for Glue access.
  • glue_catalog_id: The AWS account ID that owns the Glue catalog.
  • glue_region: The AWS region where the Glue catalog is located.

REST-Specific Columns

  • rest_catalog_uri: The URI of the REST catalog endpoint.
  • rest_catalog_api_type: The API type of the REST catalog.
  • rest_warehouse: The warehouse location.
  • rest_auth_type: The authentication type for the REST catalog.

Object Storage-Specific Columns

  • storage_provider: The type of storage provider.
  • storage_location: The location of the catalog in object storage.
  • storage_aws_role_arn: (For S3) The ARN of the AWS IAM role.
  • storage_aws_external_id: (For S3) The external ID for the AWS IAM role.
  • azure_tenant_id: (For Azure) The Azure tenant ID.
  • azure_application_id: (For Azure) The Azure application ID.
  • gcp_service_account: (For GCS) The GCP service account email.

Example Queries

-- List all catalog integrations
SELECT * FROM information_schema.catalog_integrations;
-- List only enabled Glue catalog integrations
SELECT * FROM information_schema.catalog_integrations
WHERE catalog_source = 'GLUE' AND enabled = TRUE;
-- Find catalog integrations for a specific namespace
SELECT * FROM information_schema.catalog_integrations
WHERE catalog_namespace = 'my_namespace';