Moving Your Databases to the Cloud

Are you thinking about moving your databases to the cloud or making the transition to database as a service (DBaaS)?

With cloud computing vendors offering more services at lower prices, the barriers to spinning up cloud resources are diminishing. But there are few black-and-white questions in technology and even fewer in business, which is why smart companies look at all the shades of gray in an innovation like the cloud-based database before they commit on a large scale.

In this ebook, we'll examine the what, why, when, where and how of database cloud computing. This overview of the current cloud landscape includes answers to the most commonly asked questions and a number of important but frequently overlooked points.

Database administrators (DBAs) and managers will gain a better view of the path to the cloud, whether they are preparing to migrate two dozen web pages or two dozen years of transaction history.

Next, we'll walk you through some fantastic technologies that can help you actually get where you want to go.

What is the cloud?

First, a definition from the U.S. National Institute of Standards and Technology (NIST):

Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.

When prominent computer scientists take several years and 15 drafts to arrive at a one-sentence definition, you can rely on it as an accurate yardstick. You can use it to measure not only your needs and the best ways to fulfill them but also the products and services your organization is considering for its move to the cloud.

The closer your organization adheres to this definition, the more likely it will realize the benefits of the cloud, such as lower costs, faster deployment, lower power consumption and better service for internal and external customers.

NIST has also set out a number of essential characteristics, deployment models and service models covered in the following pages that apply to true cloud computing. The NIST perspective sets expectations and describes the best ways organizations can profitably use the cloud.

Although cloud computing continues to evolve, the definition of its characteristics and benefits is stable.

Essential characteristics of the cloud

Typical cloud resources and capabilities include storage, processing, memory, network bandwidth and virtual machines. True cloud computing applies all of the following characteristics to those resources:

  • Broad network access — If you can't get to it, you can't use it. Cloud resources must be available in most reasonable places and users must be able to access them with standard, connected computing devices.
  • Rapid elasticity — One of the most appealing features of the cloud is the ability to use — and pay for — only the resources needed for only as long as needed. The ability to scale cloud resources up and down on short notice, or even automatically, is a far cry from the traditional, hardware-constrained IT model.
  • Resource pooling — Because cloud providers pool resources, users share them with other users. The exact physical resources matter little, varying as users' needs scale up and down. Pooling and sharing are preferable to buying entire servers and using only a small fraction of their capacity.
  • On-demand — Users can provision resources now, without the need to wait weeks or months for a vendor to ship hardware. Storage, computing power, memory and other resources are available when and as you need them.
  • Self-service — Some providers make it as easy to use cloud resources as entering a credit card number. Self-service can eliminate the need to fill out paperwork, then wait for IT to rack and stack new equipment.
  • Measured — Cloud resource usage is continually metered so that both provider and user can track consumption. That allows users to pay as they go for only the resources they use.

With vendors and offerings that cover all of those bases, you will be more likely to realize the long-term promise of cloud computing. The absence of any one of those characteristics indicates a partial solution.

Cloud computing means not having to pay for a large, resource-intensive server that you're not using in its entirety.

Cloud deployment models

Some organizations, departments and functions are more amenable to sharing cloud resources than others are. Different tolerances for resource sharing have led to several different ways of deploying to the cloud:

  • Private — The cloud infrastructure's hardware, software and network are dedicated to a single organization and isolated for security. The infrastructure may be on premises or anywhere in the world.
  • Community — When groups have similar security needs and criteria, they can use a community cloud to share resources exclusively with one another. They may manage it themselves or have a third party manage it. Intelligence agencies were among the first to deploy community clouds, sharing both information and the resources for using it.
  • Public — The physical resources belong to the cloud provider, who enables wide-open sharing among all users. Security is enforced, but users rarely know which resources they're sharing or where the resources reside.
  • Hybrid — The cloud infrastructure consists of multiple private, community or public clouds connected by software that allows data to move among them. Since the hybrid model is commonly used by organizations in the process of migrating to the cloud, the model usually includes at least some on-premises hardware.

None of these models defines where the infrastructure is located. It can be anywhere in the world.

Cloud service models

Because not all organizations want to do the same thing in the cloud, not all organizations want the same type and level of service from it. As resource sharing varies by deployment model, cloud providers offer computing capability in multiple service models:

  • Software as a service (SaaS) — The provider makes its software applications available on cloud infrastructure, and users access them from ordinary client devices, such as computers or smartphones, often through a browser. Users log in to the provider's applications from anywhere. Examples include Salesforce.com and Office 365.
  • Platform as a service (PaaS) — The provider makes available on its cloud infrastructure an environment that supports certain tools and programming languages, then users install and run applications based on those tools and languages. Users control whatever applications they deploy. PaaS is a common service model for running databases in the cloud, sometimes called DBaaS. Examples include Amazon Relational Database Service (RDS), Microsoft SQL Azure and Oracle Cloud.
  • Infrastructure as a service (IaaS) — The provider makes available on its cloud infrastructure basic computing resources like memory, processing, storage and network bandwidth. Users effectively subscribe to the resources as remote hardware, deploying any software or systems they would deploy locally. Examples include Amazon Elastic Compute Cloud (EC2), Microsoft Azure and Amazon Simple Storage Service (S3).

In all cloud service models, the provider owns and maintains the infrastructure.

PaaS is the most common service model for organizations migrating databases such as Oracle to the cloud.

Prominent cloud vendors and service models

  • Amazon — SaaS: Amazon Web Services (AWS); PaaS: Amazon Web Services (AWS); IaaS: Elastic Compute Cloud (EC2), Simple Storage Service (S3)
  • Google — SaaS: Google Apps (Gmail, etc.); PaaS: App Engine; IaaS: Cloud Storage
  • IBM — SaaS: IBM Cloud; PaaS: Cloud Application Delivery; IaaS: Smart Cloud Enterprise, Smart Cloud Object Storage
  • Microsoft — SaaS: Office 365; PaaS: private cloud, Azure; IaaS: private cloud, Azure
  • Oracle — SaaS: Oracle Cloud; PaaS: Oracle Cloud; IaaS: Oracle Cloud
  • Rackspace — SaaS: Email; PaaS: Cloud Sites; IaaS: Cloud Servers, Cloud Files
  • Salesforce — SaaS: Salesforce.com, HP Software as a Service; PaaS: Force.com, cloud application delivery; IaaS: Services Cloud

Additional vendors include Dell and HP, as well as Joyent, which provisions IoT sensors.

Why move your databases to the cloud?

It's true that a lot of IT is moving to the cloud. While DBAs may find security in following a dominant trend, the notions of "management by magazine" and "everybody else is doing it" are rarely the best rationales for making such a move.

Lower maintenance costs — especially for databases — are often the first priority in adopting cloud computing. It's very appealing to permanently do away with a big chunk of capital expenses for hardware and software. It's even more appealing to shed the operating expenses of installing, maintaining, updating, patching and retiring databases, along with the administrative overhead that goes with them. But it's important to keep track of the expenses that take their place. Besides licenses and subscriptions, companies often encounter indirect costs such as steep charges for network usage during migration, loss of customer information during outages and loss of revenue from unexpected application downtime during migration.

Reliability and redundancy are important considerations for cloud adoption. With tens or hundreds of data centers worldwide, most cloud providers tout high reliability, and their customers value that. Providers hire large numbers of administrators to run their data centers and ensure that there is no single point of failure.

Flexibility is the fringe benefit of working in the cloud, where the ability to scale up and back down quickly keeps needs and resources closely matched. When managers want to stand up a development environment with a database, they can create it immediately, get their developers working on it in no time and scale it up and down. Online retailers take ample advantage of this flexibility, especially around the holidays. Many handle normal transaction volume on premises, then supplement with cloud resources during spikes.

Security in the cloud is often stronger than security on premises, although it's rarely the primary motivation for moving to the cloud. Entrusting data to other people seems like the opposite of security, but cloud providers have armies of people tracking security bulletins and sometimes performing white-hat penetration testing on their own servers. Few companies have the resources or technical depth for that.

When should I move to the cloud?

According to one prominent forecast, all enterprise data, along with development and testing work, will likely be in the cloud by 2025. Use that time horizon as a benchmark for staying in step with peers in database administration. Just be mindful of the five Ps: Proper Planning Prevents Poor Performance. Enjoying the full benefits of cloud computing is the reward for a deliberate, committed approach.

Except for brand-new companies that need computing resources for the first time, the move to the cloud is an ongoing journey and not a destination. Most organizations will spend the foreseeable future gradually moving on-premises systems to the cloud. That's the argument for starting small, moving the development environment to the cloud or moving discrete applications with few hooks into other systems.

Moving one app or part of the business (e.g., helpdesk to ServiceNow or Zendesk) reduces risk. A function like payroll, on the other hand, is a candidate for later migration; no company doing payroll in house wants to cut its cloud teeth on a business function with that many internal processes.

Of course, organizations running SaaS applications like Salesforce and Office 365 have already moved to the cloud, at least partially. Even virtual desktop infrastructure (VDI) reduces the dependence on on-premises infrastructure by allowing users to log on to a client and work entirely on a preconfigured, virtual desktop without local storage or installed software.

What should I move to the cloud?

In principle, moving every one of your 1,000-plus databases to the cloud is a fine idea, if you realize all the promised benefits without any business downside. As we mentioned, however, unexpected costs arise.

Suppose you're moving from Oracle processor-based licensing on premises, where you're accustomed to a 50-percent Oracle Core Factor. With eight physical cores in your on-premises server, that factor means you've been paying to license only four cores. If you move to an Amazon Relational Database Service (RDS) environment with the equivalent of those eight cores, Oracle has said that the Core Factor doesn't apply in the cloud, so you'll have to pay for licensing on all eight cores instead of the four you were accustomed to on premises.
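
Here is that arithmetic spelled out as a quick sanity check. The 0.5 core factor and the rule that it doesn't apply in the cloud are taken from the scenario above; verify both against Oracle's current licensing documents before budgeting.

```python
# Back-of-the-envelope Oracle licensing math for the scenario above.
# The 0.5 core factor and the "no core factor in the cloud" rule come from the
# text; confirm against Oracle's current licensing policy before relying on this.
physical_cores = 8

on_prem_licenses = physical_cores * 0.5   # 8 cores x 0.5 core factor = 4 licenses
cloud_licenses   = physical_cores * 1.0   # core factor not applied in the cloud = 8

print(f"On premises: {on_prem_licenses:.0f} processor licenses")
print(f"In AWS RDS:  {cloud_licenses:.0f} processor licenses")
```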

Development and quality assurance are good candidates for the cloud, which scales up and down easily with individual projects. It's a big advantage to quickly spin up multiple instances for writing and testing apps, as long as you remember to remove them when they're no longer needed.

Cloud infrastructure costs money whenever it's running, even when it's not in use. For instance, if you have a cloud database for development and your developers use it only a few months out of the year, the meter is still running. The absolute costs are low, but they add up over time.
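
To make that concrete, here is a hypothetical calculation. The hourly rate is invented purely for illustration; substitute your provider's actual pricing.

```python
# Hypothetical cost of leaving a development database instance running all year
# versus stopping it when idle. The $0.50/hour rate is an assumption, not a quote.
hourly_rate    = 0.50
hours_per_year = 24 * 365
hours_used     = 24 * 30 * 3   # developers actually need it for about three months

print(f"Left running all year: ${hourly_rate * hours_per_year:,.0f}")
print(f"Stopped when idle:     ${hourly_rate * hours_used:,.0f}")
```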

Top web vendors like Amazon have built entire businesses on their own eCommerce cloud, then in turn productized it. They offer specific tools for companies that want to move their eCommerce to the cloud and scale it up and down.

How do I move to the cloud?

It's important to start small.

Identify low-impact tables and schemas such as development or QA databases and start with those. Before switching an entire on-premises database to a database in the cloud, identify use cases like data integration, disaster recovery and offloaded reporting that require data availability but do not interfere with application uptime.

To make things even easier, some companies start from zero, with no historical data. They take the same software, such as Oracle E-Business Suite, set it up in the cloud with all their customizations and flexfields, and start anew at the beginning of a new fiscal year or quarter. The on-premises version persists in case of queries on historical data. Not migrating the old database makes it easy to move into the cloud.

"Big bang" is an approach for moving to the cloud in one fell swoop; for example, over a weekend. It involves some interruption and risk, but moving the system while it is not being used can work for applications with small databases and regular downtime, such as brick-and-mortar companies with regular business hours. DBAs back up the database and applications, restore them into the cloud and start users on the new system effective the next business day. But if the database measures in the terabytes, it will take much longer than a weekend to restore to the cloud.

A database replication product is a safer, less-drastic way to ease migration to the cloud.

Planning your cloud migration

Moving databases to the cloud and making the transition to DBaaS can be like having one foot on the boat and the other on the dock.

Cloud migration of databases should not affect the applications running on those databases before, during or after migration. Users should be able to execute tasks like reporting, querying and analysis throughout the migration process. DBAs should be able to roll back a production database in the event of a problem during migration, with no impact on user activity.

As we've described, you have four options when moving to a cloud-based database:

  • Start with tables and schemas that support applications that are not critical to the business (low-risk, but slow).
  • Develop brand-new applications using databases in the cloud without migrating old databases at all (clean, but hard on history).
  • Take the big-bang approach by backing up and restoring before users are affected (comprehensive, but requires protracted downtime).
  • Replicate data from a source database on premises to a target database in the cloud (smart and innovative).

So how to take the smart and innovative route? Having some help from smart technologies is a great place to start. The SharePlex® database replication solution tracks and preserves database updates to keep your source and target instances synchronized. Let's look at how DBAs and system administrators can implement SharePlex in their own migration projects.

What is SharePlex?

SharePlex is an enterprise-grade database replication solution based on redo logs. It is designed for operations such as database migrations, high availability, disaster recovery, change history tracking and metadata management at less than half the cost of native replication products.

SharePlex supports database migration from on-premises and cloud databases by enabling users to continue accessing up-to-the-second data without affecting database uptime.

SharePlex scans the redo log for transactional changes to the source database, then applies those changes to the target database in near-real time. That approach keeps the latest data from the source database available on the target, with latency typically as low as a few seconds, instead of hours or days.

Using SharePlex also greatly reduces the risk of downtime associated with database migration. On a migration project for a multiterabyte CRM database, for example, sysadmins would ordinarily have to stop all user input, export the existing database and import it to the new database in the cloud, which could take several days to complete. With SharePlex, once the source and target databases were in sync and database replication was started, users could go from working on the old database to working on the new one in less than an hour. (See "Example: Migration using a physical copy" later in this ebook.)

Finally, because SharePlex maintains synchronization between databases, it's invaluable for offloading analysis work from production to reporting databases. If you run an on-premises database or DBaaS for decision support or analytics, you can use SharePlex to keep either the entire database or a subset of its tables synchronized.

Configurations and use cases

SharePlex fits into most use cases where IT managers want to minimize the impact of database migration on data availability and user productivity.

Consider these common configurations for availability, scalability and integration:

Availability

  • High availability and disaster recovery — SharePlex works well alongside or in place of Data Guard. The systems in this scenario are usually identical, and the difference lies in whether the precipitating event is planned (high availability) or unplanned (disaster recovery).
  • Migrations, patches and upgrades — The goal is to maintain availability with failback when migrating, for example, from Oracle to Oracle, Oracle to SQL Server or Oracle to SAP ASE. (See "Reduced downtime for migrations, patches and upgrades" later in this ebook for more details on this use case.)
  • Active-Active or load balancing — SharePlex allows you to keep two or more production servers in use and in sync, with replication among all instances. If the same record is updated on more than one instance, SharePlex can use conflict resolution routines to determine which record wins; a simple example follows this list.
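
As a rough illustration of how a conflict resolution routine might pick a winner, here is a hypothetical "latest change wins" rule sketched in Python. It is not SharePlex's implementation; real deployments can apply other rules, such as trusting a designated primary instance.

```python
# Hypothetical "latest change wins" conflict resolution for an active-active setup.
# Illustrative only; not SharePlex code.
from datetime import datetime, timedelta

def resolve(local_change, incoming_change):
    """Return the conflicting change that should be kept (newest timestamp wins)."""
    return max(local_change, incoming_change, key=lambda c: c["updated_at"])

now = datetime.now()
local  = {"id": 42, "amount": 100, "updated_at": now - timedelta(seconds=5)}
remote = {"id": 42, "amount": 120, "updated_at": now}
print(resolve(local, remote))   # the newer remote change wins
```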

Scalability

  • Operational reporting and archiving — Queries for customer-facing transactions take precedence over queries for reporting. Instead of bogging down your online transaction processing (OLTP) database, replicate it to a separate, dedicated target database and offload your reporting to it. The target can be a subset of the source, with indexes optimized for reports.
  • Data distribution and distributed processing — To create a data warehouse or data mart, replicate data out to subset target databases.
  • Cascading — Similarly, you can start with a database, replicate it to another server, add value to it or enhance it, then move it onto final target databases. SharePlex works at each step in this process of creating a network of replication or replicating globally for distribution in local regions.

Integration

  • Data integration and data warehousing — Business intelligence, analytics and reporting benefit from near-real-time data integration between Oracle and other supported databases. SharePlex replication plays a role in that integration and in data warehousing, where the target is structured differently from the source.
  • Change history and metadata repository — Collect before- and after-images of the database changes you capture and add metadata indicating who made what kind of change when. Then put it all into a centralized repository in the cloud. That gives you auditing without the need to turn on Oracle's auditing features or manage log files and tables. (A sample record shape appears after this list.)
  • Centralized reporting and consolidation — Going in the opposite direction from distributed processing, you can replicate multiple separate databases to separate schemas in a single instance. You can also configure multiple servers to replicate to a single server hosting multiple databases.
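
To picture what such a change-history repository holds, here is a hypothetical record with before and after images plus metadata. The field names are illustrative assumptions, not a SharePlex schema.

```python
# Hypothetical change-history record: before image, after image and metadata
# about who changed what and when. Field names are assumptions for illustration.
change_record = {
    "table": "customers",
    "operation": "UPDATE",
    "before": {"id": 7, "credit_limit": 5000},
    "after":  {"id": 7, "credit_limit": 10000},
    "changed_by": "app_user_12",
    "changed_at": "2016-03-14T09:30:00Z",
}
print(change_record)
```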

SharePlex is designed to replicate remote and on-premises databases, as well as databases in the cloud.

Reduced downtime for migrations, patches and upgrades

As noted, SharePlex can reduce downtime when you're migrating a database, application, hardware appliance, operating system or storage device. Whether you're patching or upgrading a legacy platform, migrating to a new platform or moving to the cloud, SharePlex provides a low-impact way of modernizing your infrastructure affordably. It reduces downtime and risk in common IT activities such as the following:

  • Upgrading an Oracle database (e.g., Oracle 10/11 to Oracle 12)
  • Migrating from Oracle database Standard Edition to Enterprise Edition, or vice versa
  • Migrating from an Oracle or SQL Server database to a different relational database such as Microsoft SQL Server, SAP ASE or any other database supported by SharePlex
  • Upgrading or migrating from one operating system to another, such as from Unix to Linux or from Windows to Linux
  • Moving an on-premises or remote database to a supported private or public cloud
  • Upgrading storage

Unlike traditional utilities for database upgrade or platform migration, SharePlex helps you maintain business productivity by keeping the existing system's data accessible while DBAs set up the new system.

Users continue to analyze, move, store and process data in parallel with the migration.

Control of the migration schedule remains with your IT staff instead of being subject to the amount of time required to back up and restore the data. SharePlex allows IT staff the opportunity for unlimited, dry-run practice on real production data with minimal impact on the business and users. It can also eliminate surprises at go-live by flushing problems out early in the migration project.

SharePlex replication technology gives IT staff the leeway to perform migrations on their own schedule instead of worrying about keeping source and target synchronized.

In Oracle-to-Oracle database migrations, SharePlex offers the added benefit of failback by using replication from the target back to the source, which allows you to revert to the source system should something go wrong with the target system. Regardless of how much work users have done since the migration, DBAs can fail back to the original version with no loss of their data.

How SharePlex works

The suitability of SharePlex to all those configurations, applications and networks means that it is flexible enough for almost all business requirements.

[Figure 1: SharePlex architecture with replication to a cloud-based database. Gray elements are processes; blue elements are queue operations.]

Source

On the source system, the Capture process reads the Oracle redo logs (and, if necessary, the archive logs) usually from either the disk or operating system buffers, which minimizes physical I/O. For even greater efficiency, SharePlex looks strictly for changes and needs to capture only about 30 percent of what goes into the redo logs. SharePlex then makes a copy of the updates and sends them to the capture queue.

The Read process acts like a traffic cop, reading the contents of the capture queue, deciding where the data will go, preparing the data for transport across the network and placing the data in the export queue.

Lastly on the source side, the Export process sends the replicated data across the network to the target system, with options for compression or encryption. (This process is proprietary; it is not an Oracle export/import operation.)

Target

The target system can consist of remote or on-premises databases, as well as databases in the cloud. The Import process receives the data and writes it to the post queue. SharePlex is capable of maintaining multiple post queues, as shown by multiple arrows in Figure 1. The Post process takes the data, constructs a SQL statement and applies it to the target tables within the target Oracle or other supported database. Posting continues until replication is shut down intentionally. If a network or other failure occurs, data will remain in the queues and posting will resume when the failure is resolved.
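
For readers who think in code, the following is a deliberately simplified, hypothetical sketch of that capture, read, export, import and post flow, with in-memory queues standing in for SharePlex's persistent queues. It is illustrative only and omits routing rules, batching, compression, encryption and failure handling.

```python
# Simplified, hypothetical sketch of the capture -> read -> export -> import ->
# post pipeline described above. This is NOT SharePlex code; the record fields
# and routing are illustrative assumptions, and only inserts are modeled.
from queue import Queue

capture_queue, export_queue, post_queue = Queue(), Queue(), Queue()

def capture(redo_records):
    """Capture: keep only change records from a (mocked) redo log stream."""
    for rec in redo_records:
        if rec["type"] in ("insert", "update", "delete"):
            capture_queue.put(rec)

def read():
    """Read: decide where each change should go and stage it for transport."""
    while not capture_queue.empty():
        rec = capture_queue.get()
        rec["route"] = "cloud_target"          # routing decision (illustrative)
        export_queue.put(rec)

def export_and_import():
    """Export/Import: in reality this crosses the network, optionally compressed
    or encrypted; here we simply hand records to the target's post queue."""
    while not export_queue.empty():
        post_queue.put(export_queue.get())

def post(apply_sql):
    """Post: turn each change record into SQL and apply it to the target."""
    while not post_queue.empty():
        rec = post_queue.get()
        cols = ", ".join(rec["row"])
        vals = ", ".join(repr(v) for v in rec["row"].values())
        apply_sql(f"INSERT INTO {rec['table']} ({cols}) VALUES ({vals})")

# Example run with two mocked redo-log entries; the checkpoint is not a change,
# so capture skips it.
redo_log = [
    {"type": "insert", "table": "orders", "row": {"id": 1, "amount": 250}},
    {"type": "checkpoint"},
]
capture(redo_log)
read()
export_and_import()
post(print)   # print the generated SQL instead of executing it
```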

Optimistic view of commit

As another way of reducing latency, we've built an optimistic view of commit into SharePlex. We assume that if you've made a change to the database, it's much more likely that the change will be committed than rolled back.

That's why, even though the system hasn't yet committed the change, SharePlex takes the record from the redo or archive log, moves it to the target system right away and posts it. The change remains uncommitted by Oracle, so you don't see the data immediately, but it's there. If SharePlex sees the commit, it commits the change on the target; if it sees a rollback instead, SharePlex backs the change out.
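
A minimal sketch of that optimistic logic, assuming a simple target interface, might look like this. It is illustrative only, not SharePlex code.

```python
# Hypothetical sketch of the "optimistic view of commit": apply each change on the
# target as soon as it appears in the log, finalize it when the commit arrives,
# and back it out if a rollback arrives instead.
class FakeTarget:
    def apply(self, sql): print("apply :", sql)
    def commit(self):     print("commit")

pending = {}   # transaction id -> changes applied optimistically, not yet committed

def on_log_record(rec, target):
    if rec["op"] == "change":
        target.apply(rec["sql"])                       # post before the commit is seen
        pending.setdefault(rec["txn"], []).append(rec)
    elif rec["op"] == "commit":
        target.commit()                                # the common case: commit wins
        pending.pop(rec["txn"], None)
    elif rec["op"] == "rollback":
        for change in reversed(pending.pop(rec["txn"], [])):
            target.apply(change["undo"])               # undo the optimistic changes
        target.commit()

log = [
    {"op": "change", "txn": 1,
     "sql":  "UPDATE orders SET amount = 250 WHERE id = 1",
     "undo": "UPDATE orders SET amount = 100 WHERE id = 1"},
    {"op": "commit", "txn": 1},
]
target = FakeTarget()
for rec in log:
    on_log_record(rec, target)
```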

Compare and repair

Earlier, we pointed out that a big problem in moving a database is maintaining and protecting it as it goes from source to target. Keeping source and target synchronized goes a long way toward avoiding that problem.

Included in the licensing fee for SharePlex is a compare-and-repair feature. It compares a table on the source to the same table on the target and notifies you if they are out of sync. If so, the utility generates the SQL needed to repair them through resynchronization.
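
Conceptually, compare and repair boils down to diffing source and target rows by key and generating the SQL that brings the target back in line. The toy function below shows the idea; the real SharePlex utility does this at far larger scale.

```python
# Toy model of compare-and-repair: diff source and target rows by primary key and
# emit the SQL that would resynchronize the target. Illustrative only.
def compare_and_repair(source_rows, target_rows, table, key="id"):
    src = {r[key]: r for r in source_rows}
    tgt = {r[key]: r for r in target_rows}
    repairs = []
    for k, row in src.items():
        if k not in tgt:                                   # missing on the target
            cols = ", ".join(row)
            vals = ", ".join(repr(v) for v in row.values())
            repairs.append(f"INSERT INTO {table} ({cols}) VALUES ({vals});")
        elif tgt[k] != row:                                # out of sync on the target
            sets = ", ".join(f"{c} = {v!r}" for c, v in row.items() if c != key)
            repairs.append(f"UPDATE {table} SET {sets} WHERE {key} = {k!r};")
    for k in tgt.keys() - src.keys():                      # extra rows on the target
        repairs.append(f"DELETE FROM {table} WHERE {key} = {k!r};")
    return repairs

# Example: one row has drifted out of sync on the target.
print(compare_and_repair([{"id": 1, "status": "shipped"}],
                         [{"id": 1, "status": "pending"}], "orders"))
```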

In tens of thousands of installations, IT teams have used SharePlex in a wide variety of configurations, applications and networks.

SharePlex cloud support

SharePlex is compatible with many common DBaaS targets in Amazon Web Services (AWS) and Microsoft Azure, as shown here:

[Table: source and target support for Oracle, Microsoft SQL Server, PostgreSQL (Community Edition), EnterpriseDB Postgres and FUJITSU Enterprise Postgres on AWS RDS, AWS EC2, Azure Marketplace and Azure SQL Database]

Support for cloud-based databases (as of SharePlex version 8.6.6)

Note that with Oracle, you can deploy SharePlex on AWS Elastic Compute Cloud (EC2) as a source, a target or both. No special setup is needed.

Example: Migration using a physical copy

Based on the architecture shown in Figure 1, here is an example of migration from current production to a database in the cloud. It illustrates SharePlex database replication in combination with backup/restore.

From the current production environment, you start replication. SharePlex begins continuously capturing changes from the redo logs and exporting them to the post queue on the target.

Then you take a current backup, which is consistent to a particular system change number (SCN). You restore from backup to instantiate the cloud database. Once the restore operation is finished, the target database in the cloud is synchronized with the source database up to that SCN.

Finally, SharePlex runs a reconcile process that discards transactions prior to that SCN, and then it starts posting from the post queue.
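
A conceptual sketch of that reconcile step, assuming each queued change carries the SCN at which it was captured:

```python
# Illustrative reconcile step: changes at or below the backup's SCN are already in
# the restored target, so they are discarded; posting resumes with everything newer.
def reconcile(queued_changes, backup_scn):
    return [change for change in queued_changes if change["scn"] > backup_scn]

queued = [{"scn": 100, "sql": "..."}, {"scn": 205, "sql": "..."}, {"scn": 310, "sql": "..."}]
print(reconcile(queued, backup_scn=205))   # only the change at SCN 310 gets posted
```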

How long does that take? Everyone's mileage may vary, of course, but a 500 GB/hour site with an 8TB database could expect that backup and instantiation from the source to the target would take on the order of 60 hours. Without SharePlex, that would be unacceptable, but with SharePlex running continuous replication and reconciling to the given SCN, the impact to users and the business would be far lower. Users could begin working on the target database right away. Within another 30 to 35 hours, source and target would be fully synchronized.
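
One plausible way to read those figures, assuming backup, network transfer and restore run sequentially at roughly the same 500 GB/hour rate (an assumption for illustration, not a vendor benchmark):

```python
# Rough, assumption-laden estimate of the backup-and-instantiation window above.
db_size_gb   = 8 * 1024        # 8 TB database
rate_gb_hour = 500             # assumed throughput per pass
passes       = 3               # backup, network transfer, restore (assumed sequential)

hours_per_pass = db_size_gb / rate_gb_hour
print(f"~{hours_per_pass:.0f} hours per pass, ~{hours_per_pass * passes:.0f} hours "
      "plus overhead, in the neighborhood of the 60-hour figure cited above")
```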

Migration steps using data replication and backup/restore

Conclusion

First, we hope you now have a solid understanding of cloud computing and what it can do for you. The NIST definition from the beginning of this ebook is a useful yardstick for companies of all sizes as they evaluate cloud vendors.

  • If the offering isn't ubiquitous, it's not real cloud.
  • If the resources can't be rapidly provisioned, it's not real cloud.
  • If the service provider alone can spin them up and down, it's not real cloud.

The offering may work in the short run, but only a real cloud offering has resource pooling, broad network access, rapid elasticity and self-service on demand with measured usage for the long haul.

Second, of the multiple options for maintaining and protecting your databases during migration to the cloud, database replication offers the highest accessibility of current data with the lowest impact on user productivity.

Designed to meet a wide variety of IT needs in availability, scalability and integration, SharePlex reduces latency along the path from source database to target database in the cloud. It puts control over the migration schedule in the hands of IT managers and DBAs and allows them to fail back to the original version of the database with no loss of data.

Its architecture, compare-and-repair feature, cloud database support and optimistic view of commit make SharePlex a compelling choice for Oracle database management teams with no migration time to lose.

About the author

Clay Jackson is a database systems consultant for Quest, specializing in database performance management and replication tools. Prior to joining Quest, Jackson was the DBA manager at Darigold. He also spent over 10 years managing Oracle, SQL Server and DB2 databases and DBAs at Washington Mutual. While at WaMu, Jackson was the enterprise database compliance officer, with responsibility for database security and disaster recovery. He also worked at Microsoft and Starbucks, is a CISM and has a master's degree in software engineering from Seattle University.

About Quest

At Quest, our purpose is to solve complex problems with simple solutions. We accomplish this with a philosophy focused on great products, great service and an overall goal of being simple to do business with. Our vision is to deliver technology that eliminates the need to choose between efficiency and effectiveness, which means you and your organization can spend less time on IT administration and more time on business innovation.