Principal Infrastructure Engineer
Neo Psychiko, Attiki, Greece | Engineering
Sthenos AI is the AI developer of EFA Group, building intelligent, mission-ready solutions for defense and aerospace. With deep expertise in Command-and-Control (C2), cyber defense, computer vision, and autonomous systems, we design and deploy secure, field-proven AI that enhances operational efficiency and situational awareness. As part of a leading European defense ecosystem, we bring scalable innovation where it matters most — in the theater of operations.
We are looking for a Principal Infrastructure Engineer responsible for designing, implementing and maintaining the distributed infrastructure backbone for a customer of Sthenos AI.
Responsibilities
- You architect and manage a highly available, distributed and virtualized data center infrastructure that serves as the foundation for a next-generation AI-based intelligence platform.
- You design and implement the network architecture, compute, storage and virtualization layers of an on-premises environment to ensure scalability, performance and resilience.
- You design and implement a Kubernetes-based platform for running distributed workloads, ensuring optimal integration with networking, storage and hardware resources.
- You plan, deploy and operate virtualized infrastructure environments including compute clusters, hypervisors and software-defined networking.
- You define and implement hardware configurations and infrastructure standards, including server architectures, GPUs, networking equipment and storage systems.
- You design high-performance networking architectures, including routing, switching, load balancing and secure interconnection between distributed infrastructure components.
- You operate a variety of databases and storage solutions and ensure that they run reliably and meet the performance standards set by the platform.
- You manage the infrastructure's different environments and continuously monitor their capacity so that you can take proactive measures and plan for future growth.
- You ensure the reliability, performance and security of the infrastructure by defining operational standards, monitoring systems and incident response processes.
- You automate infrastructure provisioning and lifecycle management using Infrastructure as Code and configuration management tools.
- You collaborate closely with the platform vendor, ML engineers and data scientists to ensure the infrastructure supports demanding AI workloads and distributed computing requirements.
- You continuously evaluate and introduce new infrastructure technologies, architectures and best practices to improve efficiency, scalability and resilience.
- You act as a technical authority and mentor for the infrastructure and platform engineering teams.
Your Skills
- Graduate degree in Computer Science, Informatics, Electrical Engineering or a related field.
- Hands-on engineering mindset with a strong passion for designing, building and operating data center infrastructure.
- 8+ years of experience designing and operating complex infrastructure systems, preferably in high-performance or distributed environments.
- Deep expertise in networking, including routing, switching, VLAN/VXLAN, firewalls, load balancing and software-defined networking.
- Strong experience with virtualization technologies such as VMware or similar hypervisor platforms.
- Extensive experience designing and operating Kubernetes clusters for large-scale distributed workloads.
- Strong knowledge of Linux operating systems and system internals.
- Solid understanding of server hardware architecture, including CPU, memory, storage and networking configurations.
- Experience with storage systems such as distributed storage, SAN/NAS, and software-defined storage.
- Strong experience with relational and NoSQL databases as well as modern object stores.
- Proven experience with automation and Infrastructure as Code, preferably using Terraform, Ansible or similar tools.
- Experience with observability, monitoring and performance tuning of infrastructure systems.
- Experience with DevOps best practices and the implementation of CI/CD pipelines.
- Strong understanding of high availability, fault tolerance and disaster recovery strategies.
Your Competencies
- You have good analytical skills and can break a complex problem down into its individual parts.
- You excel in collaborative environments, working effectively alongside highly skilled engineers and scientists, while confidently taking ownership and making bold, well‑reasoned decisions when technical challenges arise.
- You implement new ideas independently and show perseverance when it comes to defending agreed concepts.
- Your communication skills allow you to explain complicated technical topics clearly and simply to other audiences.
- You bring perseverance and constructive assertiveness.
We Offer
- Competitive remuneration package
- Continuous learning & development opportunities
- Participation in cultural and team-building activities
- Exposure to a growing environment with cutting-edge technologies
- Corporate wellness initiatives