Maintenance is a fundamental aspect of keeping any environment running smoothly. Within Oracle Cloud Infrastructure (OCI), infrastructure maintenance ensures the optimal performance of virtual machine instances while minimizing disruptions for users. Let’s delve into the various strategies OCI employs for infrastructure maintenance.
Live Migration: Minimizing Disruptions
During infrastructure maintenance events, OCI employs live migration for supported virtual machine instances. This process moves instances from the physical VM host undergoing maintenance to a healthy host, with minimal disruption to the running instances. It’s important to note that live migration is supported only for instances running Linux operating systems and for specific virtual machine shapes.
Reboot Migration: Scheduled and Controlled
For instances that don’t support live migration, OCI schedules a reboot migration within a designated timeframe, typically 14 to 16 days. Notifications are sent to users informing them of the impending reboot. If the instance isn’t rebooted proactively before the due date, OCI automatically stops it, migrates it to a healthy host, and restarts it. While this process incurs a short downtime, users can control when it occurs by rebooting the instance themselves before the scheduled maintenance date. It’s crucial to distinguish operating system reboots from instance reboots initiated via the Console, CLI, or API: only the latter trigger the migration process.
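As a rough illustration of controlling the reboot yourself, the sketch below checks an instance's maintenance due date and requests an instance-level reboot once the deadline is close. It assumes the OCI Python SDK conventions, where the due date appears as `time_maintenance_reboot_due` on the instance and a reboot is requested with `instance_action(instance_id, "SOFTRESET")`. The function name, the three-day margin, the placeholder OCID, and the stand-in client are all hypothetical; the client is passed in so the logic can be tried without cloud credentials.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


def reboot_before_deadline(compute_client, instance, now, margin=timedelta(days=3)):
    """Request a reboot if the maintenance due date is within `margin`.

    Returns True when a reboot was requested, False otherwise.
    """
    due = instance.time_maintenance_reboot_due  # None when nothing is scheduled
    if due is None or due - now > margin:
        return False
    # An instance-level reboot (not a reboot from inside the OS) is what
    # triggers the migration to a healthy host.
    compute_client.instance_action(instance.id, "SOFTRESET")
    return True


class FakeComputeClient:
    """Hypothetical stand-in for a real compute client; records calls."""

    def __init__(self):
        self.calls = []

    def instance_action(self, instance_id, action):
        self.calls.append((instance_id, action))


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
instance = SimpleNamespace(
    id="ocid1.instance.oc1..example",  # placeholder OCID, not a real resource
    time_maintenance_reboot_due=datetime(2024, 1, 12, tzinfo=timezone.utc),
)
client = FakeComputeClient()
rebooted = reboot_before_deadline(client, instance, now)
```

In practice the margin would be chosen to match your own maintenance window, so the short downtime lands at a time that suits the workload.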
Manual Migration: Ensuring Continuity
Automated migration does not cover every case: instances without a scheduled maintenance date may require manual intervention. Such instances must be moved manually, which means terminating and recreating them. Users must remember to preserve the boot volume when terminating the instance to ensure data integrity.
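The termination step can be sketched as follows. This is a minimal example that assumes the OCI Python SDK convention of `terminate_instance(instance_id, preserve_boot_volume=True)`; the helper name, the placeholder OCID, and the stand-in client are hypothetical and only record the call rather than touching a real tenancy.

```python
def terminate_keeping_boot_volume(compute_client, instance_id):
    """Terminate an instance while asking for its boot volume to be kept."""
    # Passing the flag explicitly avoids relying on the client's default,
    # which would discard the boot volume along with the instance.
    compute_client.terminate_instance(instance_id, preserve_boot_volume=True)


class FakeComputeClient:
    """Hypothetical stand-in for a real compute client; records calls."""

    def __init__(self):
        self.terminated = []

    def terminate_instance(self, instance_id, preserve_boot_volume=False):
        self.terminated.append((instance_id, preserve_boot_volume))


client = FakeComputeClient()
terminate_keeping_boot_volume(client, "ocid1.instance.oc1..example")
```

The preserved boot volume can then serve as the source when the replacement instance is created.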
Automatic Recovery: Swift Response to Failures
When underlying infrastructure failures occur, OCI automatically attempts to recover virtual machine instances. Standard instances undergo reboot migration, which restores them on a healthy host. Dense I/O VMs, however, are rebooted on the same physical host whenever possible. If recovery on the same host isn’t viable, OCI notifies users to terminate the instance within 14 days. If the instance isn’t terminated in that time, OCI disables it and then deletes it within the following seven days. Throughout this process, boot volumes and remote-attached data volumes are preserved, safeguarding critical data.
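The Dense I/O timeline amounts to simple date arithmetic, sketched below. The example assumes both windows are counted in calendar days starting from the notification date, which the text above does not spell out; the function and constant names are illustrative only.

```python
from datetime import date, timedelta

TERMINATE_WINDOW = timedelta(days=14)  # time the user has to terminate the instance
DELETE_WINDOW = timedelta(days=7)      # further time before OCI deletes it


def dense_io_deadlines(notified_on):
    """Return (terminate_by, deleted_by) for a given notification date."""
    terminate_by = notified_on + TERMINATE_WINDOW
    deleted_by = terminate_by + DELETE_WINDOW
    return terminate_by, deleted_by


terminate_by, deleted_by = dense_io_deadlines(date(2024, 3, 1))
print(terminate_by, deleted_by)
```

For a notification on 1 March 2024, this gives a termination deadline of 15 March and deletion by 22 March, under the stated assumption about how the windows are counted.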
Conclusion
Oracle Cloud Infrastructure employs a comprehensive approach to infrastructure maintenance, ensuring continuous performance and minimal disruptions for users. Through strategies like live migration, reboot migration, and manual intervention when necessary, OCI prioritizes reliability and data integrity, even in the face of infrastructure failures. By understanding these maintenance procedures, users can confidently harness the power of OCI for their business needs.