DevOps and Infra Challenges

Subramani Sundaram (Subu)
5 min read · Dec 5, 2022
Challenges in Infra and DevOps

Problem Statements:

1. How to use the Infra as Code (IaC) concept effectively?

2. How to make better use of CI/CD pipelines?

3. How to achieve governance in our project?

4. How to implement Release Management (RM) concepts in PROD?

5. How to deliver quality code to the customers?

6. How to reduce the time to market for our products?

7. How to implement DevSecOps concepts in our way of working?

8. How to implement SRE / NRE / GitOps in our process?

9. How to calculate metrics for our process and improve it further?

10. How to reuse our existing code and methods for better and more effective workforce utilization?

Solutions for Infra (CloudOps):

1. Implement the GitOps concept for Infra as Code so that no code can be checked in without peer review and testing. This prevents the wrong infra from being set up.

2. Follow the NRE (Network Reliability Engineering) process to build a zero-trust architecture in which all ports are monitored and IP addresses are allocated with proper care.

3. Follow the SRE (Site Reliability Engineering) concept: define proper SLIs and SLOs for all our services so that downtime does not lead to a breached SLA, and measure any downtime against the error budgets we have allocated.

4. Get a proper RPO and RTO from the business. In a DR situation, or any circumstance where we need to replicate the primary setup, we need to know the exact RPO and RTO. Only on that basis can we decide whether the setup should be active-active or active-passive.

5. Implement the Chaos Engineering concept: check the resilience and stability of our infra by deliberately inducing failures in the environment and observing how much it can withstand. Examples include stress tests, network failures, load balancer (LB) and high availability (HA) failover tests, and pen tests.

6. Implement an infra vulnerability scanner as part of the pipeline to check for flaws in the infra code that is written. This is part of the threat analysis on the architectural side.

7. Create modules for the most frequently used parts of the infra so that we can reuse them in new projects.

8. Use variables and parameters as much as possible. The main code should not be touched at all; every change should be made through the variable files only.

9. Implement cloud policies as code so that deployments to the wrong regions or with the wrong configurations are blocked.

10. Do not apply infra changes manually. They must go through the pipeline, with proper approvals, tickets, and an audit trail.
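
To make the error budget idea from point 3 concrete, here is a minimal Python sketch of the arithmetic. The class name ErrorBudget, the 99.9% SLO, and the 30-day window are assumptions chosen for illustration; they are not figures from any real service.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Hypothetical helper: turns an availability SLO into a downtime budget."""

    slo_target: float      # e.g. 0.999 for a 99.9% availability SLO
    window_minutes: float  # length of the SLO window in minutes

    @property
    def budget_minutes(self) -> float:
        """Total downtime the SLO tolerates over the window."""
        return (1.0 - self.slo_target) * self.window_minutes

    def remaining_minutes(self, downtime_minutes: float) -> float:
        """Budget left after the observed downtime (negative if overspent)."""
        return self.budget_minutes - downtime_minutes

    def is_breached(self, downtime_minutes: float) -> bool:
        """True when the observed downtime exceeds the budget."""
        return downtime_minutes > self.budget_minutes


# Assumed example: a 99.9% SLO measured over a 30-day window.
budget = ErrorBudget(slo_target=0.999, window_minutes=30 * 24 * 60)
print(f"Budget: {budget.budget_minutes:.1f} min")
print(f"Remaining after 30 min of downtime: {budget.remaining_minutes(30):.1f} min")
print(f"Breached after 50 min of downtime: {budget.is_breached(50)}")
```

With these assumed numbers the budget comes to roughly 43 minutes per window, so a 30-minute outage still fits while a 50-minute outage does not. The same arithmetic can help with the RPO/RTO discussion in point 4, since it shows how little downtime a strict target actually leaves.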

Solutions for CICD (DevOps):

1. Implement the PR pipeline concept for the dev code so that no code can be checked in without peer review and testing. This way only good-quality code reaches PROD.

2. Create CI templates in Azure DevOps so that we can reuse them for each new project by changing only the variables, without rewriting the pipeline.

3. Use task groups as much as possible so that a setting is changed in one place and the change is reflected in every pipeline that uses it, instead of being edited on each pipeline.

4. Create a template for the CD pipeline as well so that we can clone and reuse it for each project.

5. Create multiple stages in the CD pipeline so that the same pipeline serves many environments and we don't need a separate pipeline for each one.

6. Implement release gates as per the RM process so that governance and checks are in place before the code is deployed to PROD.

7. Implement the auto-redeploy option so that if the new code fails, the tool automatically redeploys the previous version without any customer impact.

8. Use the pipeline library to store secrets and keys by linking it to the key vault, so that a value is changed in one place instead of on each pipeline.

9. Create proper groups for each project so that permissions are granted according to the standard and not everyone gets the same access across projects.

10. Implement a DevSecOps pipeline for better governance and to follow strict guidelines in line with industry standards.
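
The auto-redeploy idea from point 7 can be sketched in a few lines of Python. This is only an illustration of the control flow, not an Azure DevOps API: the function deploy_with_rollback, the callables it receives, and the version numbers are all hypothetical, and in a real setup the deploy and health-check steps would be whatever your release tool provides.

```python
from typing import Callable


def deploy_with_rollback(
    deploy: Callable[[str], None],
    is_healthy: Callable[[], bool],
    new_version: str,
    previous_version: str,
) -> str:
    """Deploy new_version; redeploy previous_version if it fails.

    Returns the version that is live when the function finishes.
    """
    try:
        deploy(new_version)
        if is_healthy():
            return new_version
        print(f"Health check failed for {new_version}")
    except Exception as exc:  # a failed deployment also triggers the rollback
        print(f"Deploy of {new_version} failed: {exc}")

    print(f"Rolling back to {previous_version}")
    deploy(previous_version)
    return previous_version


# Stand-ins for real deployment and monitoring calls (assumptions).
live_versions = []


def fake_deploy(version: str) -> None:
    live_versions.append(version)


def fake_health_check() -> bool:
    # Pretend that version 2.0.0 is broken and everything else is fine.
    return live_versions[-1] != "2.0.0"


result = deploy_with_rollback(fake_deploy, fake_health_check, "2.0.0", "1.9.0")
print(f"Live version: {result}")
```

Passing the deploy and health-check steps in as functions keeps the rollback logic independent of any one tool, which fits the template and reuse theme of the points above.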

Solutions for Code (Dev):

1. Implement Git secrets scanning in the IDE so that developers are blocked when they try to check in code containing secrets, passwords, or credit card details.

2. Follow proper coding standards and enforce them with static code analysis and code vulnerability checks. Only code that passes these checks should be allowed to move to the next stages.

3. Implement code quality checks and make sure the code coverage is good as well; otherwise the CI should fail automatically until the issues are fixed.

4. Have the developers follow the Shift Left approach, so that bugs and vulnerabilities are caught in dev itself instead of being carried to PROD.

5. Implement the modules and shared libraries concept so that functions or modules can be reused in new projects.
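
As a rough picture of what the secrets scanning in point 1 does, here is a minimal Python sketch. It is not how any particular scanner is implemented: the three patterns, the SECRET_PATTERNS name, and the find_secrets function are assumptions for illustration, and real tools ship much larger and better tuned rule sets.

```python
import re

# Illustrative rules only (assumed for this sketch).
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "hard-coded password": re.compile(
        r"(?i)\b(?:password|passwd|pwd)\s*[:=]\s*['\"][^'\"]+['\"]"
    ),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a rule."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings


# Made-up file content; the key is the well-known documentation placeholder.
sample = (
    'db_host = "localhost"\n'
    'password = "hunter2"\n'
    'aws_key = "AKIAIOSFODNN7EXAMPLE"\n'
)

findings = find_secrets(sample)
for line_number, rule_name in findings:
    print(f"line {line_number}: {rule_name}")
```

A commit hook or IDE plugin built on this idea would refuse the check-in whenever the list of findings is not empty, which is the blocking behaviour described in point 1.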

Tools to use for the above:

1. Azure DevOps

2. git-secrets / Yelp detect-secrets

3. Veracode SCA

4. Veracode SAST / DAST

5. Terraform

6. Datadog

7. ServiceNow / Jira Service Management

Subramani Sundaram (Subu)

Azure MCT | Certified DevSecOps/SRE Practitioner | SAFe4 DevOps Practitioner | Azure 9x Certified | DevOps Institute Trainer | DevOps/Azure Cloud Architect