SQL Clone 5

Why do I need SQL Clone to copy databases?

SQL Clone accelerates the delivery of database changes by providing a lightweight way of duplicating databases that is easily automated and allows a controlled self-service approach to database provisioning. It can incorporate data masking to protect personal and sensitive data.

Any small database can be copied by restoring it from a backup, attaching the data and log files, or by building it and then importing data. However, these methods become less and less practical both as the database size, and the number of copies required, increases.

Network admins will be obliged to increase the available storage, and to support a growing number of SQL Server instances. The physical process of copying the data will take several hours, every time. Also, unless the data can be effectively anonymized beforehand, the backup/restore method is risky, because the copies cannot be restricted, terminated, or audited appropriately. This means that data protection and privacy concerns will restrict the free use of at least part of the business data to the development activities that need it.

The inevitable compromise is that teams will be forced to work with a single, shared database, often refreshed infrequently, and so containing 'stale' copies of the data. Their tests will often be performed only with small sets of artificial test data. This is fine for some types of database testing, such as unit testing, but not for others, such as user-acceptance tests or performance tests, because it doesn't reflect the characteristics of the 'real' data.

Developers also lose the freedom to make changes and run tests that, on a shared server, would affect other developers or even overwrite their work. If a test is destructive, and a shared database needs to be reverted to a previous state, you'd generally restore a backup to that point in time or detach the current version and re-attach to the original. Both approaches take everyone offline, and the ensuing delay affects everybody's work.

These compromises often limit the number and quality of database tests that can be performed during the development cycle, especially for integration, scalability, or performance testing, which make more demands on resources, and so on other team members who are responsible for the admin work. If we curtail testing, or perform it later in the process, when problems are harder to fix, we reduce the quality of releases as well as their frequency.

SQL Clone was designed to minimize the problems associated with duplicating databases for development work, and so remove the need for these compromises. With SQL Clone, the database is only stored once, as an image on a network share, and the data can, if necessary, be masked prior to making it available to users as a source of clones. Each user only stores on their local machine any differences caused by modifying their clone directly, on their local database instance. This means that the cloning process is very quick, because it doesn't involve copying around large data files. In effect, it allows the team to set up several clones of a large database, with the same data, very quickly, using a single network file image to hold the actual database data. As the databases get larger, the savings become very significant.

Each developer can then make changes to his or her own individual clone, migrating it to a different version or with different data, without affecting anyone else.

What problems does it solve for the Operations team?

Database provisioning, especially when supporting DevOps process such as Continuous Integration, creates a logistical and administrative burden for the Operations team. The problem is not just the administrative burden created by the frequent copying of large databases, and the time, disk and network resources this requires, but also in data governance.

For the operations team, SQL Clone provides a PowerShell-scriptable system that allows for a canteen system for databases that doesn't overload the network or disk resources and allows data governance and oversight:

  • Protects sensitive data and code. only one full copy (image) of the source database is created, and any anonymization of sensitive data is done before the image is published
  • Fine-grained access control: use SQL Clone's access control system to assign privileges to team members, allowing granular control over the SQL Server instances, images and clones to which each group of users has access.
  • Repeatable, auditable, and visible database provisioning: The whole process of creating an image, and then deploying multiple clones from this image, can be automated using built-in PowerShell integration. SQL Clone administrators can control make database copies (images) available for development work, securely, control where clones van be deployed, and can always see what clones exist on which servers, and who created them.

Once SQL Clone is installed, administrators can assign permissions to team members to enable them to run the system themselves, picking the times when they create, refresh, or delete databases. It thereby allows teams to self-manage the spinning up and removal of databases responsibly, and without either blocking the development and release processes, or introducing an unrealistic administrative burden.

What problems does SQL Clone solve for the Development team?

With SQL Clone, developers and testers can spin up clones with the data they require, when they require it, and only for as long as they require it. A 'real' database becomes just another standard resource that can be selected, used, reset, and removed as required, and without the usual overheads that tend to slow these tasks down.

The real 'magic' behind working with clones is that the team can all work on the same database, with the same data, but suddenly each developer can make changes to the data or even the structure of their copy of the database, without affecting anyone else. They can then instantly roll back these changes, so that, as far as the database is concerned, they never happened.

For development and testing work, this has the following advantages:

Isolated development database instances

SQL Clone makes it viable for to work on their own 'sandbox' databases, complete with data that corresponds in scale and appearance to the real data. They then have the freedom to make changes and run tests that, on a shared server, would affect other developers or even overwrite their work.

Faster, more efficient database testing

At certain points in a database deployment, there are tasks that take time. Database testing is the typical example of this. The tests can cause disruption when run on a shared database server, so tend to be performed only towards the end of a cycle. If tests are done one after another, they dominate the time that it takes to deploy the required changes, especially if you need to test routines that have large repercussions for the data, because of the time taken to reset the database back to its starting state would after each test run.

Due to its light footprint (See How SQL Clone works), SQL Clone makes possible the rapid duplication of the database in the version under test, for short periods, and so frees up a team of developers to create multiple local database copies for experimentation, testing, bug fixing and so on. It acts like a 'database service' that gives every developer or team their own, ephemeral view of an identical database, very quickly, that then reflects just the changes made by that person, or team.

Developers can spin up a clone, amend the clone to reflect local branch changes, run the test, save their work, then revert those changes instantly, ready for another round of tests. Unit testing, basic build and integration testing can be more comprehensive, within existing time constraints, because database test setup will be fast and easy, and tests can be run in parallel across multiple identical clones. As such, DevOps practices such as Continuous Database Integration suddenly become practical, even for very large databases.

See: How SQL Clone Improves Database DevOps for more details.

What problems does it solve for the business?

SQL Clone speeds up the process of database build and 'tear down', minimizes the demand on time and network resources, and make the process secure, controlled, and auditable.

This gives an organization the freedom to make full use of its existing data when engaged in development, testing, and training. Pseudonymized data can be exported safely to the BI team for reporting and analysis, as well as to the development and test teams.

By allowing the Operations team to set up an integration environment that reflects more closely the production environment, and the development teams to incorporate realistic data sets into their testing cycles, bugs and performance problems are caught much sooner, and don't creep into the production applications. All of this reduces expensive maintenance work.

By reducing the overhead of the database provisioning process, there is more time for the important development tasks of optimizing the support, monitoring, compliance, security and performance characteristics of the functionality they plan to deliver to production. It also means that the development work is deployed earlier and more often, so it is more visible to other teams within IT. The net result is far fewer unpleasant surprises.

Didn't find what you were looking for?