Automating Server Discovery in a Hybrid Cloud Migration — A Junior DevOps Journey

As part of our organization’s hybrid cloud adoption, we are gradually moving non-critical workloads from GCP to our on-premises environment. During this transition, one of the key requirements was to gather detailed information about applications running on each server — including the tech stack and runtime details.
What initially looked like a simple task quickly exposed a classic scalability problem.
🎯 The Challenge
We needed to collect application and tech stack details from multiple GCP virtual machines.
❌ The old (manual) approach
SSH into each server individually
Execute a shell script manually
Copy and consolidate outputs
Repeat for every VM
This approach was:
Time-consuming
Error-prone
Not scalable
Operationally inefficient
With a growing fleet, manual work was clearly not sustainable.
💡 The DevOps Solution
To eliminate repetitive manual effort, I automated the entire workflow using Ansible.
Key improvements implemented:
✅ Centralized execution from Ansible control node
✅ Passwordless SSH using GCP project metadata
✅ Fleet-wide script execution
✅ Parallel server processing
✅ Foundation for future configuration management
🔐 Step 1 — Enabled Secure, Scalable Access
Instead of configuring SSH access VM by VM, I followed the GCP best practice.
What I did:
Generated SSH key on the Ansible master
Added the public key to:
GCP → Compute Engine → Metadata → SSH Keys
🚀 Why this matters
By adding the key at the project metadata level:
The key propagates automatically to all VMs in the project
No need for per-VM key distribution
Future servers inherit access automatically
Enables true fleet management
This was a major efficiency win.
🧾 Step 2 — Built Scalable Ansible Inventory
I organized all servers under a common group in /etc/ansible/hosts, allowing centralized management.
This ensured:
Clean structure
Easy expansion
Environment-wide consistency
⚡ Step 3 — Automated Shell Script Execution
Previously, engineers had to log in to each machine and run the discovery script manually.
With Ansible, the same script can now be executed across the fleet in parallel from a single command.
Benefits achieved:
⏱️ Massive time savings
🔁 Repeatable execution
📉 Reduced human error
📈 Better operational visibility
🔧 Ready for future automation
📊 Real-World Outcome
During validation:
Running VMs responded successfully
Stopped VMs showed as unreachable (expected behavior)
SSH access worked seamlessly via metadata keys
Fleet connectivity was verified using Ansible ping
The environment is now ready for large-scale configuration management.
🧠 Key Learnings
🔹 Hybrid cloud migrations expose manual process gaps
🔹 Metadata-based SSH is the correct approach in GCP
🔹 Ansible dramatically reduces operational overhead
🔹 Inventory design matters for scalability
🔹 Parallel execution is critical in real environments
🔹 Automation should be built with future growth in mind
🚀 What’s Next
Next steps in this journey include:
Automating application inventory collection
Building reusable Ansible playbooks
Implementing dynamic inventory for GCP
Extending automation to on-prem servers
Standardizing configuration management
🏁 Final Thoughts
This exercise reinforced an important DevOps principle:
If you have to do it more than twice, automate it.
By replacing manual SSH work with Ansible-driven automation and project-level SSH key management, we now have a scalable foundation that supports both our current GCP environment and the upcoming hybrid cloud model.




