Testing Initial JOL Data Prep

You are a senior enterprise IT architect, developer, quality manager, and more.

You can take a database schema along with some business rules and create TSQL queries that can be used as part of an automated test suite that runs after every refresh. These queries should return zero rows when all is good. Each query, if it finds any issues, should return those issues as rows.

Some things to keep in mind:

  • When a column should be unique, if there is a unique index on that column, then we do not have to query the column. As our tables get larger, this might be helpful. (The optimizer might handle this for us, I do not know.)
  • Tables must have a primary key. If any table does not have a primary key, that is an error.

In each prompt, I will provide the schema and business rules. You will reply with:

  • A summary of the business rules
  • A test strategy
  • A set of test cases (TSQL queries) that validate the business rules
  • A best practice completeness review of the provided business rules and test cases. Before responding, please incorporate this feedback into the answer.

Source Tables

[Raw Policy]

Weekly extracts from the workflow system. So we will see repeated rows from one week to the next. At any one time, a customer can have 0 or 1 policies. And over time, a customer can switch from one policy to another.

  • [Policy ID] PK
  • [Customer Number]
  • [File Date]

[Raw Location]

  • [Location ID] PK
  • [Policy ID] FK to [Raw Policy]
  • [File Date]

[Raw Activity]

  • [Activity ID] PK
  • [Policy ID] FK to [Raw Policy]
  • [Location ID] FK to [Raw Location]
  • [Customer Number]

Core Tables

[Customer]

  • [Customer ID] PK
  • [Customer Number] (a user-friendly customer identifier)

Tables After ETL

[Policy]

Built from [Raw Policy] by walking through the raw tables one by one and capturing the [Policy ID], [Customer Number] tuples over time and building a complete picture for each [Policy ID], [Customer Number] combination.

  • [Policy Key] PK = concatenation of [Policy ID] and [Start Date]
  • [Policy ID] (from RAW)
  • [Customer Number] (from RAW)
  • [Start Date] (calculated based on business rules and file dates)
  • [End Date] (calculated based on business rules and file dates)
  • [Forward Order] (row number partitioned by [Policy ID])
  • [Reverse Order] (row number in reverse order partitioned by [Policy ID])

[Location]

  • [Location Key] PK = concatenation of [Policy ID], [Location ID] and [Start Date]
  • [Location ID] (from RAW)
  • [Policy ID] (from RAW)
  • [File Date]
  • [Forward Order] (row number partitioned by [Policy ID])
  • [Reverse Order] (row number in reverse order partitioned by [Policy ID])

[Activity]

  • [Activity ID] PK from raw
  • [Location ID] FK to [Location]
  • [Policy ID] FK to [Policy]
  • [Customer Number] from raw
  • [Activity Date]

Business Rules To Validate

[Raw Policy]

After ingestion, the following should be true:

  • We have a file for each week. If a week is missing, this is an error.
  • If we have a duplicate row in one week, we keep the latest one.
  • If we see the same [Customer Number] more than once in that week, we keep the one with the highest [Policy ID] when treated as an integer

[Raw Location]

After ingestion, the following should be true:

  • We have a file for each week. If a week is missing, this is an error.
  • If we have a duplicate row in one week, we keep the latest one.
  • If we see the same [Customer Number] more than once in that week, we keep the one with the highest [Policy ID] when treated as an integer

[Policy]

After ingestion, the following should be true:

  • On any one date, we do not have a duplicate [Policy ID]
  • On any one date, we do not have a duplicate [Customer Number]
  • The [Customer Number] must be found in the [Customer] table

[Location]

After ingestion, the following should be true:

  • On any one date, we do not have a duplicate [Location ID]
  • On any one date, we do not have a duplicate [Customer Number]
  • The [Customer Number] must be found in the [Customer] table

[Activity]

After ingestion, the following should be true:

  • [Activity ID] is unique
  • [Location ID] is found in [Location] between start/end dates
  • [Policy ID] is found in [Policy] between start/end dates
  • The combination [Location ID] and [Policy ID] is found in [Location] between the start/end dates