As the pandemic is still playing a key role in our day-to-day lives, many companies are trying their best to adapt to the new situation. There are already many extensive reports focusing on the impact of Covid-19 on multiple industries.
As a common factor, many see public cloud services as a potential remedy for the current situation. Of course, there is a long-running and extensive debate between on-premise and cloud-based architecture, with many aspects to consider, such as security, cost-effectiveness, vendor lock-in, and scalability.
Those who can afford it may choose a hybrid solution to keep their options open and benefit from both worlds. However, this is extremely expensive, especially if you are starting from scratch.
The current situation has undeniably shifted the odds in favor of public cloud services for many users. They are dynamic and cost-effective, and their extensive managed-service offerings provide out-of-the-box solutions for companies of any size that do not want to invest heavily in this area.
Hence, the number of “Cloud Migration” projects across industries has skyrocketed. As migration is not straightforward and there are many things to consider, I hope to shed light on some aspects of it so you can make the best decision.
If you ask Google what cloud migration is, it will tell you that it
“… is the process of moving data, applications, or other business elements to a cloud computing environment. There are various types of cloud migrations an enterprise can perform. One common model is the transfer of data and applications from a local, on-premises data center to the public cloud.”
As you can guess from the above, there is more than just one way to achieve cloud-migration.
You can do a:
- Lift and shift (where you do the bare minimum to move your estate from on-prem to the cloud)
- Improve and Move (where you take what you currently have but try to optimize it by leveraging services and offerings you have not used before)
- Rip and Replace (where you re-architect your existing estate to be optimized for the cloud vendor, so that you can enjoy all the benefits it may provide without compromises)
When it comes to Public Cloud, as it’s a very competitive market, there are many BIG players you can choose from:
- Amazon AWS
- Microsoft Azure
- Google Cloud
- IBM Cloud
- Oracle Cloud
I’ve been asked a few times before :
“ … but cloud is cloud, right? What is the difference between Private and Public?”
This is good to understand and take into account when you are planning your migration. There are two main types of cloud environments: private and public.
Public cloud is delivered via the internet and shared across organizations. Private cloud is cloud computing that is dedicated solely to your organization. A hybrid cloud environment uses both public and private clouds.
Try not to confuse these two things:
- Hybrid cloud strategy: when you use more than one cloud provider in an active-active or active-passive manner (more commonly called multi-cloud)
- Hybrid cloud computing: when you use both on-prem and off-prem (cloud) computing in an active-active or active-passive manner
Based on my personal experience, when you get started, these are the stages you need to go through from 0 to 100:
During your first and second stages you need to make a lot of decisions, so I hope my personal preferences help you make the choice that suits you best:
Lift and Shift Or Cloud Optimised?
For a shallow cloud integration (sometimes called “lift-and-shift”), you move the on-premise application to the cloud, and make no—or limited—changes to the servers you instantiate in the cloud for the purpose of running the application. Any application changes are just enough to get it to run in the new environment. You don’t use cloud-unique services. This model is also known as lift-and-shift because the application is lifted “as is” and moved, or shifted, to the cloud intact.
Choosing this option will minimize the effort and involvement. However, you can be sure it is still not going to be easy.
Depending on the complexity of your existing estate, this might be a rather counter-productive choice (if it is of medium complexity or above).
For a really simple footprint, however, this is your perfect choice.
For a deep cloud integration, you may modify your application during the migration process to take advantage of key cloud capabilities. This might be nothing more advanced than using auto-scaling and dynamic load balancing, or it might be as sophisticated as utilizing serverless computing capabilities such as AWS Lambda for portions of the application. It might also involve using a cloud-specific data store such as Amazon S3 or DynamoDB.
If you feel fairly confident and have the right team at your disposal to execute this project, this is the recommended choice. It might be the best of both worlds: with a relatively small investment, you can gain a lot in the long run.
Rip and Replace (or RIP &R) is only recommended if you have an army of engineers (and the funding) available, and you are confident you can build everything up from scratch.
This would deliver the best results in the long term, and it is undeniably the best way to ensure you utilize the cloud provider to the full. However, it will most likely be a really expensive investment with zero or negative ROI in the short or medium run.
Single or Multi-Cloud Strategy?
Single cloud means a simpler design, and you can optimize heavily by using vendor-specific services. However, you risk vendor lock-in.
Multi-cloud means you will never be locked into one cloud provider and have a greater sense of freedom with high availability across providers. However, it greatly increases complexity and therefore, the cost of the migration as well.
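One common way to keep the multi-cloud door open is to write application code against a small provider-neutral interface and hide each vendor behind an adapter. The sketch below is only an illustration: `ObjectStore`, `InMemoryStore`, and `replicate` are hypothetical names, and the in-memory class stands in for what would be a real vendor adapter.

```python
from __future__ import annotations

from typing import Protocol


class ObjectStore(Protocol):
    """Hypothetical provider-neutral storage interface (not a real SDK)."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in for a vendor adapter; it simply keeps objects in a dict."""

    def __init__(self, provider: str) -> None:
        self.provider = provider
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def replicate(key: str, data: bytes, stores: list[ObjectStore]) -> None:
    """Write the same object to every configured provider (active-active)."""
    for store in stores:
        store.put(key, data)


# Two stand-in providers receiving the same write.
primary = InMemoryStore("provider-a")
secondary = InMemoryStore("provider-b")
replicate("report.csv", b"id,total\n1,42\n", [primary, secondary])
```

The extra adapter layer is exactly the added complexity mentioned above: every vendor-specific feature you want to use has to be squeezed through the shared interface.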
DevOps and IaC
During the migration, my personal recommendation would be to put special focus on DevOps best practices and Infrastructure as Code (IaC).
DevOps is a practice that unifies the building and running of systems with an emphasis on automation and monitoring at all stages.
If you want to be successful in your cloud migration, you need to practice this culture in everything you do. Because of this, you should consider having site reliability engineers (SREs) lead and complete the migration.
“SRE is what you get when you treat operations as if it’s a software problem” – Google definition
SREs are basically a mixture of software engineer, operations engineer, and infrastructure architect all at once, living by DevOps best practices and owning the product from top to bottom.
But why is this important – you may ask?
Because doing the migration properly puts the following in your arsenal:
1] CI/CD
In software engineering, CI/CD or CICD generally refers to the combined practices of continuous integration and either continuous delivery or continuous deployment. CI/CD bridges the gaps between development and operation activities and teams by enforcing automation in building, testing, and deployment of applications. (Wikipedia)
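To make the "automation in building, testing, and deployment" idea concrete, here is a minimal sketch of a pipeline runner. It is not how any particular CI product works; `run_pipeline` and the stage functions are hypothetical, and a real pipeline would invoke build tools and deployment APIs instead of returning hard-coded results.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a step that reports success (True) or failure (False).
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that passed.
    """
    passed: List[str] = []
    for name, step in stages:
        if not step():
            break  # a failed stage blocks everything after it
        passed.append(name)
    return passed


# Hypothetical stages: the test stage fails, so deploy never runs.
stages = [
    ("build", lambda: True),
    ("test", lambda: False),
    ("deploy", lambda: True),
]
completed = run_pipeline(stages)
```

The important property is the ordering guarantee: nothing reaches the deploy stage unless every earlier stage has succeeded.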
2] Immutable Infrastructure
Immutable infrastructure makes “configuration changes by completely replacing servers” (Morris, 2016:70). Rather than applying updates to a live service, with an immutable approach you ensure that a “deployed server will remain intact, with no changes made” (BMC, 2020). If an update is needed, the existing instance is retired and a new one takes its place.
Having a templated approach to infrastructure and its implementation “increases predictability as there is little variance between servers as tested and servers in production.” (Horowitz[Netflix], 2019)
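The replace-rather-than-modify idea can be sketched in a few lines. This is a toy model under stated assumptions: `Server` and `roll_out` are hypothetical names, and a frozen dataclass stands in for a machine built from a versioned image.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Server:
    """A deployed server; frozen, so it cannot be changed after creation."""

    name: str
    image_version: str


def roll_out(fleet: List[Server], new_version: str) -> List[Server]:
    """Return replacement servers built from the new image.

    The servers in the old fleet are left untouched and can be retired.
    """
    return [replace(server, image_version=new_version) for server in fleet]


old_fleet = [Server("web-1", "v1"), Server("web-2", "v1")]
new_fleet = roll_out(old_fleet, "v2")
```

Attempting `old_fleet[0].image_version = "v2"` raises an error, which mirrors the rule that a running server is never patched in place.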
3] Infrastructure as Code
Infrastructure as Code in a nutshell: “every server could be automatically rebuilt from scratch, and configuration tooling would run continuously” (Morris).
Google’s take is similar: Automate repeatable tasks like provisioning, configuration, and deployments for one machine or millions.
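At the heart of most IaC tooling is a comparison between what you declared and what actually exists. The sketch below shows that comparison in its simplest form. The `plan` function and the resource dictionaries are hypothetical and not taken from any real tool.

```python
from typing import Dict, List


def plan(desired: Dict[str, dict], actual: Dict[str, dict]) -> Dict[str, List[str]]:
    """Compare declared resources with existing ones and list the changes needed."""
    return {
        "create": sorted(desired.keys() - actual.keys()),
        "delete": sorted(actual.keys() - desired.keys()),
        "update": sorted(
            name
            for name in desired.keys() & actual.keys()
            if desired[name] != actual[name]
        ),
    }


# Declared in version control (hypothetical resources).
desired = {
    "web": {"size": "medium", "count": 3},
    "db": {"size": "large", "count": 1},
}
# What is currently running.
actual = {
    "web": {"size": "small", "count": 3},
    "cache": {"size": "small", "count": 1},
}
changes = plan(desired, actual)
```

Because the desired state lives in code, the same plan can be produced repeatedly for one machine or for millions, and an empty plan means the estate already matches the declaration.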
4] Monitoring as Code
Monitoring as Code, first mentioned at AWS Re:Invent in 2017, is about taking the IaC approach to monitoring and alerting. This can be seen in Google’s SRE handbook, where they recommend “treating your monitoring configuration as code” (Beyer et al., 2018:67). To put this into context, all monitoring and alerting could be automatically rebuilt from scratch, with configuration tooling running continuously. The same monitoring and alerting could then be applied to lower environments, and should we ever choose to change vendor, the scope of work would be easily known.
This is super important if you want to be able to release any changes quickly and securely (and if you do this right, every tiny modification to your estate counts as a release).
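As a rough illustration of monitoring as code, alert rules can be defined once as data and rendered for each environment. Everything here is an assumption for the sake of the example: the `AlertRule` class, the metric names, the thresholds, and the output shape are hypothetical and do not follow any specific vendor's format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AlertRule:
    """One alert definition, kept in version control next to the service."""

    name: str
    metric: str
    threshold: float


# Hypothetical rules; the metric names and thresholds are made up.
RULES = [
    AlertRule("HighErrorRate", "http_5xx_ratio", 0.05),
    AlertRule("HighLatency", "p99_latency_seconds", 1.5),
]


def render(rules: List[AlertRule], environment: str) -> List[Dict[str, object]]:
    """Render the same rule set for a given environment."""
    return [
        {
            "alert": f"{environment}-{rule.name}",
            "expr": f"{rule.metric} > {rule.threshold}",
            "labels": {"env": environment},
        }
        for rule in rules
    ]


staging_alerts = render(RULES, "staging")
production_alerts = render(RULES, "production")
```

Since both environments are rendered from the same `RULES` list, lower environments get identical coverage, and a vendor change only requires a new `render` function.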
What challenges might you face? A LOT…!
- Data management
- Resource Management
- Cultural element ( Ownership / Processes / Tooling / Ways of working )
- Cost prediction / Cost tracking
But why am I considering doing this …?!
- AWS claims an average of 31% infrastructure cost savings and on average 62% more efficient IT infrastructure management compared to on-premise.
- Lower Costs. You already know that subscription costs are a money saver, because your first monthly payment gets you instant, vastly improved infrastructure capabilities, which would have taken a hefty up-front investment in the past.
- Streamlined Performance.
- Innovative Agility.
In short, it’s not a simple task …
However, if you are already on a mission to make this a reality for your company (or for yourself), you should aim to do it right the first time, as it will be more costly to correct mistakes at a later stage.
If you have to decide on a greater investment at the beginning of your journey (choosing between Lift and Shift, Improve and Move, and Rip and Replace), go for the 2nd option! Or if you like a challenge, option 3 might be just for you!
Lift and shift (based on my own personal experience) is never as simple as it sounds, and in the long run it costs more to keep things alive and well.
If you are on this journey, best of luck and be proud, as you are literally paving your business’s future for many years to come!
Article written by David Jambor, Head of Systems Engineering at Vodafone
Beyer, B., et al. (2018) The Site Reliability Workbook. 2nd edition. Sebastopol, CA: O’Reilly. Available at: https://lrita.github.io/images/posts/com/the-site-reliability-workbook-next18.pdf
AWS Re:Invent 2017. Available at: https://www.youtube.com/watch?v=JLS6fdiiL_0