Moving to a modern infrastructure can be a confusing and challenging task. So, how can your business best use and understand the cloud? How do you leverage the benefits of cloud-delivered technology while maintaining control over performance, security and cost?
To start, there are many cloud providers. The leading providers are Amazon’s AWS, Microsoft’s Azure and Oracle Cloud. Almost never mentioned is Google’s GCP, which I like to refer to as “The Windows Phone of cloud providers.” Ouch. It’s great, but it lacks a ton of features and has a very small percentage of the market.
As if choosing a provider weren’t a big enough task, you also have to figure out where your applications belong. To best understand the cloud, let’s look at how cloud providers define their services. The cloud can be summed up in three service models: SaaS, IaaS and PaaS. There’s also a hybrid approach, which Microsoft and Oracle offer. This allows you to keep some components on-premises and migrate others to the cloud.
SaaS (Software as a Service) is the model most familiar to users. Office 365, Salesforce and Oracle ERP Cloud are excellent examples of SaaS-based services used by millions of users worldwide. These solutions are fully deployed and managed in the provider’s data centers, with access provided via remote connection or web browser. Cost is based on user count or consumption, which makes spending predictable and easy to model. With the strength of their SaaS products, Microsoft and Oracle are best positioned for the next generation of technical architectures, which will require tight integration of all application and data processes.
IaaS (Infrastructure as a Service) is the closest thing to on-premises systems you’ll find. Clients create virtual machines and manage all aspects of the system themselves. You size each VM based on the compute and memory you believe it needs, and you also specify the type of storage, network performance and security. Since this service is the closest to existing on-premises installations, it’s often chosen as a first step into the cloud using the “lift-and-shift” migration method.
The “lift-and-shift” migration is simplistic: it accounts for one cloud machine for each on-premises machine (1:1) and generally doesn’t include any advanced optimizations or license consolidation.
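To make the 1:1 idea concrete, here is a minimal sketch of a lift-and-shift sizing pass. The `Server` and `VmSize` types, the `CATALOG` of sizes and the `lift_and_shift` function are all hypothetical names I made up for illustration; the sizes are not any provider’s real catalog. Each on-premises server simply gets the smallest VM that covers its CPU and memory, with no consolidation at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    cpus: int
    memory_gb: int


@dataclass(frozen=True)
class VmSize:
    name: str
    cpus: int
    memory_gb: int


# Hypothetical catalog for illustration only; not a real provider's VM sizes.
CATALOG = [
    VmSize("small", 2, 8),
    VmSize("medium", 4, 16),
    VmSize("large", 8, 32),
    VmSize("xlarge", 16, 64),
]


def lift_and_shift(servers, catalog=CATALOG):
    """Map every on-premises server to exactly one VM size (1:1)."""
    plan = {}
    for server in servers:
        # Keep only the sizes that cover both CPU and memory.
        fits = [
            size for size in catalog
            if size.cpus >= server.cpus and size.memory_gb >= server.memory_gb
        ]
        if not fits:
            raise ValueError(f"no VM size fits {server.name}")
        # Pick the smallest size that fits; no consolidation or tuning.
        plan[server.name] = min(fits, key=lambda s: (s.cpus, s.memory_gb)).name
    return plan


servers = [
    Server("app01", 4, 12),
    Server("db01", 8, 48),
    Server("web01", 2, 4),
]
plan = lift_and_shift(servers)
print(plan)
```

Notice that `db01` lands on the largest size purely because of its memory, which is exactly the kind of waste a lift-and-shift leaves on the table.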
PaaS (Platform as a Service) is a truly cloud-native offering. In database terms, Azure SQL Database and Azure SQL Managed Instance are both PaaS offerings from Microsoft. Oracle provides Autonomous Database Cloud and Autonomous Data Warehouse, while Amazon offers RDS and Aurora. There are other cloud-native services as well, such as Oracle Autonomous Analytics Cloud, Azure Cosmos DB and Amazon Redshift.
PaaS offerings can scale up and down on demand. With Azure SQL Database, for example, you can do this by enabling elastic pools. PaaS offerings also have more rapid release schedules, so you are always on the latest and greatest software. This lets your developers take advantage of new features much sooner than if you were running SQL Server on-premises. So far, all this sounds like a huge win; however, some PaaS services, such as Amazon’s Aurora, can make it much more difficult to migrate to a different cloud provider. This is because Aurora is optimized for the AWS platform and may not transfer easily to PostgreSQL or MySQL on a different provider. As for Azure, running in PaaS is great, but since you’re always on a build that is newer than the on-premises release, it’s a bit more difficult to move away from this offering as well. Don’t get me wrong: if you’re moving to the cloud, a PaaS offering is probably the best choice if your applications support it.
On a final note, when trying to understand the cloud and how to leverage cloud-delivered technology for your business, I’d like to compare cost. In the long run, I believe everything will migrate to the cloud, but it’s not a necessity today. If you’re a technologist, I strongly suggest you hit the books and start exploring if you haven’t already. If you’re a business owner, you will want to weigh the benefits and costs. The cloud truly brings many performance and coding benefits, but compare it to leasing a car: it’s a payment that will never go away for a product you’ll never own. On the flip side, you won’t be stuck with a 10-year-old car with 300,000 miles on it.
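The lease-versus-own comparison is easy to put into numbers. The sketch below is a rough model, assuming a flat monthly cloud fee on one side and an up-front purchase plus monthly upkeep on the other. The function names and every dollar figure are made up for illustration, and the model deliberately ignores hardware refreshes (the “10-year-old car” problem), so treat it as a starting point rather than a real TCO calculation.

```python
def cumulative_cloud_cost(monthly_fee, months):
    """Total spent on a subscription after the given number of months."""
    return monthly_fee * months


def cumulative_on_prem_cost(purchase_price, monthly_upkeep, months):
    """Up-front purchase plus ongoing upkeep (power, support, staff time)."""
    return purchase_price + monthly_upkeep * months


def break_even_month(monthly_fee, purchase_price, monthly_upkeep,
                     horizon_months=240):
    """First month where cumulative cloud spend exceeds on-premises spend.

    Returns None if that never happens within the horizon.
    """
    for month in range(1, horizon_months + 1):
        cloud = cumulative_cloud_cost(monthly_fee, month)
        on_prem = cumulative_on_prem_cost(purchase_price, monthly_upkeep, month)
        if cloud > on_prem:
            return month
    return None


# Hypothetical figures: $3,000/month in the cloud versus a $60,000 purchase
# with $1,000/month of upkeep.
month = break_even_month(monthly_fee=3000, purchase_price=60000,
                         monthly_upkeep=1000)
print(month)
```

With these made-up numbers the lease overtakes the purchase in month 31. If the monthly fee is no higher than the upkeep, the function returns `None` and the lease never costs more under this model.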
A new feature introduced in Oracle 11gR2 is the way the direct path read operation is done. A direct path read is where data is read directly from the data files into the PGA rather than into the buffer cache in the SGA. Direct path reads have been around for a while for parallel operations; however, serial direct path reads are new in Oracle 11gR2. The direct path read limits the effect of a table scan on the buffer cache by reading directly into PGA memory. It benefits both standard hardware and the Exadata Storage Cell, which performs table scans with amazing efficiency.
The direct path read is available only when the Oracle optimizer chooses a full table scan. At run time, Oracle then decides whether to use a direct path read based on internal criteria. Unlike previous releases of Oracle, where a serial table scan used the buffer cache, the direct path read does not use the buffer cache or, in the case of Exadata, the cell flash cache. This avoids the problem where a large table scan places a large amount of data in the buffer cache. Although Oracle handles buffered scans intelligently, they can still take up a lot of space in the buffer cache.
Direct path read operations are asynchronous, so the session does not necessarily wait for them to complete. It is only when the session stalls to make sure that all outstanding asynchronous reads/writes have completed that you will notice direct path read waits.
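A toy model helps show why the recorded waits can look small next to the real I/O time. This is my own simplification, not Oracle’s accounting: I assume each read is issued ahead of time and overlaps with the session’s processing of the previous batch, so the session only stalls for whatever part of the I/O is not hidden behind that processing. The function names and all timings are hypothetical.

```python
def observed_wait_ms(io_latency_ms, processing_ms):
    """Stall seen by the session for one batch under the overlap model.

    The read runs while the session processes earlier data, so only the
    part of the I/O that outlasts the processing shows up as a wait.
    """
    return max(0.0, io_latency_ms - processing_ms)


def total_observed_wait_ms(batches):
    """Sum of visible stalls over (io_latency_ms, processing_ms) pairs."""
    return sum(observed_wait_ms(io, cpu) for io, cpu in batches)


# Hypothetical timings: three batches with growing I/O latency and a
# constant 8 ms of processing per batch.
batches = [(5.0, 8.0), (12.0, 8.0), (20.0, 8.0)]

total_io_ms = sum(io for io, _ in batches)
visible_wait_ms = total_observed_wait_ms(batches)
print(total_io_ms, visible_wait_ms)
```

In this made-up run the storage spends 37 ms serving reads, but the session only records 16 ms of waiting, and the first batch shows no wait at all.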
This feature is designed to help avoid the problem where a large table scan ejects all other data from the buffer cache. The keep and recycle caches helped with this, but could be difficult to tune. The decision to perform a direct path read is based on the size of the buffer cache, the size of the table and various statistics.
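Oracle’s actual criteria are internal, so the following is only a toy heuristic of my own to show the shape of such a decision. The function name, the parameters and both thresholds are invented for illustration and are not Oracle’s values: the scan bypasses the cache when the table is large relative to the buffer cache and little of it is already cached.

```python
def chooses_direct_path_read(table_blocks, buffer_cache_blocks,
                             cached_fraction=0.0,
                             size_ratio_threshold=0.1,
                             cached_fraction_limit=0.5):
    """Toy decision rule; thresholds are illustrative, not Oracle's.

    table_blocks:        size of the table being scanned, in blocks
    buffer_cache_blocks: size of the buffer cache, in blocks
    cached_fraction:     share of the table already in the buffer cache
    """
    is_large = table_blocks > size_ratio_threshold * buffer_cache_blocks
    mostly_cached = cached_fraction >= cached_fraction_limit
    # Bypass the cache only for large tables that are not already cached.
    return is_large and not mostly_cached


# Hypothetical 100,000-block buffer cache.
print(chooses_direct_path_read(50_000, 100_000))                       # large table
print(chooses_direct_path_read(5_000, 100_000))                        # small table
print(chooses_direct_path_read(50_000, 100_000, cached_fraction=0.8))  # already cached
```

Under these assumed thresholds only the first call picks the direct path read; the small table and the mostly cached table both stay with buffered reads.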
In general, the direct path read can be helpful, but in some cases, where the same data is constantly being re-read for application purposes, the direct path read can generate a huge number of unnecessary reads, thus bogging down the I/O subsystem. Because of the nature of the direct path wait events, this effect of the direct path read operation will often show up as an I/O wait, not a direct path wait.
So, in some cases, a table that is frequently accessed via a table scan might cross the threshold for direct path reads even though they are inappropriate for the workload. In these applications, the same table scan is run over and over again without the benefit of the buffer cache, thus causing excessive I/O operations.
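Some back-of-the-envelope arithmetic shows how quickly this adds up. The sketch below assumes the best case for buffered reads, namely that the whole table fits in the buffer cache and stays there after the first scan; the function name and the table size and scan count are hypothetical.

```python
def physical_reads(table_blocks, scans, direct_path):
    """Blocks read from disk for repeated full scans of one table.

    Assumption: with buffered reads the table is fully cached after the
    first scan and never aged out. With direct path reads nothing is
    cached, so every scan goes back to disk.
    """
    if scans <= 0:
        return 0
    if direct_path:
        return table_blocks * scans
    return table_blocks


# Hypothetical workload: a 50,000-block table scanned 200 times.
direct = physical_reads(50_000, 200, direct_path=True)
buffered = physical_reads(50_000, 200, direct_path=False)
print(direct, buffered, direct // buffered)
```

For this made-up workload that is 10,000,000 physical block reads against 50,000, or 200 times the I/O for the same result.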
I have also seen this be a problem on Exadata, where a TABLE ACCESS STORAGE FULL is downgraded to cell single block reads because of a chained row or fragmentation issue. Since it is a table scan, the direct path read was chosen, but the storage cell was unable to perform a cell smart scan and downgraded to cell single block reads, resulting in unoptimized, uncached reads.