SQL Server test databases

As mentioned in my previous post, I often need to run the tooling I’m working on against databases to test it. Part of this testing is to ensure that the tooling can handle schemas with odd naming. Over the years working on database tooling, I’ve collated a script full of odd cases that can easily trip up tooling. It’s now become a standard script that I run on my local SQL Server testing instance when I first set it up.

CREATE DATABASE [💩]
GO

USE [💩]
GO

CREATE SCHEMA [💰]
GO

CREATE TABLE [dbo].[🍺]
(
    [🐈] INT NULL
)
GO

CREATE TABLE [💰].[💾]
(
    [📌] INT NULL
)
GO

------------------------------------------------------------------------------------
CREATE DATABASE [.]
GO

USE [.]
GO

CREATE SCHEMA [.]
GO

CREATE SCHEMA [..]
GO

CREATE TABLE [.].[.]
(
    [.] [int] NULL,
    [..] [int] NULL,
    [...] [int] NULL
)
GO

CREATE TABLE [.].[..]
(
    [.] [int] NULL,
    [..] [int] NULL,
    [...] [int] NULL
)
GO

CREATE TABLE [..].[.]
(
    [.] [int] NULL,
    [..] [int] NULL,
    [...] [int] NULL
)
GO

CREATE TABLE [..].[..]
(
    [.] [int] NULL,
    [..] [int] NULL,
    [...] [int] NULL
)
GO

------------------------------------------------------------------------------------
CREATE DATABASE [')) group by groupcir; select 1;GO--%']
GO

------------------------------------------------------------------------------------
CREATE DATABASE [_Burma ဗမာ (မြန်မာ) my-mm]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [_Georgia _ქართული (საქართველო) ka-ge]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [_Saudi Arabia العربية (المملكة العربية السعودية) ar-sa]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [_Tamil தமிழ் (இலங்கை) ta-lk]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [<script>alert("hi there");</script>]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [a-database-with-a-very-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-name]
GO

USE [a-database-with-a-very-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-name]
GO

CREATE SCHEMA [a-schema-with-a-very-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-name]
GO

CREATE TABLE [a-schema-with-a-very-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-name].[a-table-with-a-very-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-name]
(
    [a-column-with-a-very-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-name] INT NULL
)
GO

CREATE TABLE [another-table-with-a-very-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-name]
(
    [another-column-with-a-very-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-name] INT NULL
)
GO

CREATE VIEW [a-view-with-a-very-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-name]
AS
    SELECT 1 AS V
GO

CREATE PROCEDURE [a-procedure-with-a-very-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-name] @a_parameter_with_a_very_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_name INT 
AS
    SELECT 1
GO

CREATE FUNCTION [a-function-with-a-very-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-long-name](@another_parameter_with_a_very_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_name INT)
RETURNS INT AS
BEGIN
    RETURN 1
END
GO

------------------------------------------------------------------------------------
CREATE DATABASE [Empty]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [name with spaces]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [name'with'single'quotes]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [name.with.dots]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [name/with/forward/slashes]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [name[with[opening[brackets]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [name\with\backslashes]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [name]]with]]closing]]brackets]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [name-with-dashes]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [name]
GO

------------------------------------------------------------------------------------
CREATE DATABASE [Offline]
GO

ALTER DATABASE [Offline] SET OFFLINE
GO

------------------------------------------------------------------------------------
CREATE DATABASE [ReadOnly]
GO

ALTER DATABASE [ReadOnly] SET READ_ONLY WITH NO_WAIT
GO

Reference databases

As part of my day-to-day work, I often need to run the tooling I’m working on against databases to test it. The databases need to have a variaty of different schemas and data in them to do a representative test. A big problem with this is getting hold of an array of different databases in the first place. This problem is compounded further by the fact that the tooling I work on supports five different database platforms, so there is a need for different databases for each platform.

A useful source of ready-made databases is the reference databases that are provided by the vendor or community around each database. The ones I’ve found so far are listed below, with a bit of detail about what each one represents.

SQL Server

Name Description Install
AdventureWorks A fictional bicycle manufacturer “Adventure Works Cycles”, containing data including manufacturing, sales, purchasing, product management, contact management, and human resources. Available in Online transactional (OLTP), Data warehousing (DW), and Lightweight (LT) formats. backups
scripts
Chinook A fictional digital media store, containing data including artists, albums, tracks, invoices, and customers. script
Contoso “Contoso Corporation”, a fictional multinational business (more info here), containing data including manufacturing, sales, and products. scripts
Northwind “Northwind Traders”, a fictional importer/exporter of speciality foods from around the world, containing sales data including customers, orders, inventory, purchasing, suppliers, shipping, employees, and accounting. scripts
WideWorldImporters “Wide World Importers”, a fictional wholesale novelty goods importer and distributor operating from the San Francisco bay area (more info here), containing data including purchasing, sales and stock. Available in Online transactional (OLTP) and Data warehousing (DW) formats. backups

Stack Exchange / StackOverflow

The databases behind the many sites in the Stack Exchange network can be downloaded here. The database for each Stack Exchange site can be downloaded, including that for StackOverflow. However, the StackOverflow database is quite big (~65GB), so it has been split up into several files. That makes it non-trivial to get it into a SQL Server database.

Brent Ozar has simplified this process by providing downloads of the database files in SQL Server 2016 format.

PostgreSQL

Name Description Install
AdventureWorks A port of the SQL Server version of AdventureWorks, a fictional bicycle manufacturer “Adventure Works Cycles”, containing data including manufacturing, sales, purchasing, product management, contact management, and human resources. script
Chinook A port of the SQL Server version of Chinook, a fictional digital media store, containing data including artists, albums, tracks, invoices, and customers. script
Pagila A database of a fictional DVD rental store, containing data including films, actors, customers, staff, and payments. scripts

Bluebox

Ryan Booz has created Bluebox, a database based on Pagila that aims to make it more full-featured.

NYC Census

If using PostGIS (an extension for managing spatial data), the introduction to PostGIS tutorial contains a sample database of census data for New York City.

Other Databases

The PostgreSQL website has a long list of other sample databases, including IMDB, the UK land registry of sales, and OpenStreetMap.

MySQL / MariaDB

Name Description Install
Bureau of Transportation Statistics A database of US commercial airline flight data including airlines, airports, and flights. scripts
Employees A database containing generated data including employees, departments, employees, and salaries. scripts
Sakila A fictional DVD rental store, containing data including films, actors, customers, staff, and payments. scripts
World A database containing information about the countries and cities of the world, containing data including countries, cities, and languages spoken. script

Other Databases

The MySQL website has a list of other sample databases, including some with large data sets.

Oracle

Oracle themselves provide some interlinked sample schemas:

  • Customer Orders (CO)
  • Human Resources (HR)
  • Online Catalog (OC)
  • Order Entry (OE)
  • Product Media (PM)
  • Sales History (SH)

These are documented on the Oracle site, and can be obtained from the Oracle samples GitHub repository.

Name Description Install
OT A global fictitious company that sells computer hardware including storage, motherboard, RAM, video card, and CPU. scripts
O7 A schema for a fictitious bank, containing customers, accounts, products, branches, departments, and employees. scripts (see section 2)

Gerenal data sources

There are many sites that provide data suitable for using as a reference database. While many of these sites provide the data, few of them provide it in a format that can be loaded straight into the relevant database platform. They can however be imported fairly trivially.

Name Description
Google Dataset Search A search engine for datasets from across the internet.
Kaggle Datasets for data science and machine learning. Interesting datasets include NBA Basketball and Spotify.
UC Irvine Machine Learning Repository Well-documented and clean data sources, mainly geared towards machine learning.
Awesome Public Data Sets A list of high quality, topic-centric public data sources.
Data Is Plural A weekly newsletter and archive of real-world datasets.
Hugging Face A huge set of data, mainly targeted at machine learning.

DDD South West 14

I’ve just got back from another DDD South West. It’s the last time at the Engine Shed, the venue it’s been at for several years. It’s a shame, as the venue is perfect for an event like this, and the food is excellent.

This year I was a speaker, doing a session on sketchnoting. It’s the first time I’ve given this talk, and I got the whole audience sketching along with me during the session. Several people also took my challenge to sketchnote a session during the rest of the day. Hopefully they will continue to do so at other events.

2026: The Year Of The Software Factory

by Rich Woollcott

2026: The Year Of The Software Factory by Rich Woollcott

Using Claude To Access Data And Make Smarter Decisions

by Elizabeth Hanson

Using Claude To Access Data And Make Smarter Decisions by Elizabeth Hanson

Playing With AI Locally

by Paul Michaels

Playing With AI Locally by Paul Michaels

From Developer To Architect, The Career Path Nobody Explained

by John Kilmister

From DeveloperTo Architect, The Career Path Nobody Explained by John Kilmister

Do We Really Need A New Playbook For AI?

by Stuart Caborn

Do We Really Need A New Playbook For AI? by Stuart Caborn

Why Don’t People Seem To Be Able To Diagnose Problems These Days?

by Richard Fennell

Why Don't People Seem To Be Able To Diagnose Problems These Days? by Richard Fennell


Extracting a DACPAC

SQL Server has the concept of data-tier applications. This is a way of defining the structure and data of an entire database within a single file, enabling it to be moved between machines. There are two types of files that can be used:

  • BACPAC - This contains the database schema and the data from a database.
  • DACPAC - This contains the just the database schema by default, but data from some tables can be included.

Data-tier applications have benefits over backup files in that they are more portable. Backups are tied to the version of SQL Server that they are created with, meaning that only that version or later can restore them.

I use DACPACs a lot to get database schemas between different servers. Deploying a DACPAC will sometimes fail due to it containing logins, users, etc… from the database it was created from that don’t exist on the server it is being deployed to. When this happens, SQL Server Management Studio simply fails with an error with no way around the issue.

It is possible to deploy however, but it takes a bit of command line trickery. To do this, first install the SqlPackage dotnet tool:

 dotnet tool install -g microsoft.sqlpackage

Once installed, you can run it from the command line, passing the option to exclude certain object types. Something like:

sqlpackage
  /Action:Publish
  /TargetServerName:.\SQL2025
  /TargetDatabaseName:Northwind
  /Properties:ExcludeObjectTypes="Users;RoleMembership;Logins;ServerRoles;ServerRoleMembership;Permissions"
  /SourceFile:northwind.dacpac

It’s the /Properties:ExcludeObjectTypes=... parameter that does the magic. In this case, the security-related objects causing the deploy to fail will be excluded from the deployed database schema.

More details can be found in the docs for the SqlPackage publish action.


nor(DEV):con 2026

I’ve just returned from nor(DEV):con, an event I previously went to a few years ago.

Designing Services That Stand The Test Of Time

by Paul Grenyer

Designing Services That Stand the Test Of Time by Paul Grenyer

Stop Testing! (My Patience)

by Ryan Healey

Stop Testing! (My Patience) by Ryan Healey

Modular Monoliths And Other Facepalms

by Kevlin Henney

Modular Monoliths And Other Facepalms by Kevlin Henney

Debugging Your Imposter Syndrome

by Vickie Allen-Collier

Debugging Your Imposter Syndrome by Vickie Allen-Collier

Learn How To Learn

by Frances Buontempo

Learn How To Learn by Frances Buontempo

Translating Business Metrics Into Meaningful User Outcomes

by Elisabeth Dubus

Translating Business Metrics Into Meaningful User Outcomes by Elisabeth Dubus

How We Support Deploying Multi-Cloud Apps At Stack Overflow

by Chris O’Dell

How We Support Deploying Multi-Cloud Apps At Stack Overflow by Chris O'Dell