Sr. Systems Engineer

Chantilly, VA · Information Technology
We are seeking an experienced Sr. Systems Engineer to support a small standalone system dedicated to high-performance computing (HPC) and artificial intelligence (AI) workloads. This role demands a blend of operational expertise and strategic technical vision, focusing on the management and optimization of our Partner's standalone HPC/AI system. The ideal candidate will manage the technical operation of the Partner's infrastructure; develop standardized procedures for hardware, network, and software management across the system; and oversee cluster management with tools such as NVIDIA BCM, including provisioning, optimization, and monitoring of clustered resources for HPC/AI workloads.

This position requires broad expertise in HPC/AI system administration, with a focus on:
  • Refining infrastructure management frameworks
  • Traditional infrastructure management (hardware, networking, directory services)
  • Modern HPC/AI support (Linux/Ubuntu, Proxmox, NVIDIA BCM, WEKA storage)
  • Designing scalable, secure, and highly available system architectures

Requirements
  • TS/SCI clearance with Full Scope Polygraph (FSP) on day one
  • Bachelor's degree in engineering, computer science, or related technical field, or equivalent experience
  • 7+ years' experience in systems engineering or related fields
  • Operating Systems & Infrastructure:
    • Expert-level Linux systems engineering
    • Windows client operating systems deployment/maintenance
    • Linux (Ubuntu) server operating systems deployment/maintenance
  • Hardware & Networking:
    • Server hardware
    • Network hardware, wiring, and switching configurations 
  • Virtualization & Containerization:
    • Virtualization (ideally Proxmox)
    • Containerization (ideally Docker/Podman with Ray or Kubernetes)
  • Management & Orchestration:
    • Directory services and PKI infrastructure deployment/maintenance
    • Configuration management (ideally Ansible, Puppet, Chef, or DSC)
    • Cluster orchestration (ideally NVIDIA Base Command Manager (BCM))
  • Development Support & Software Management:
    • Development support services (GitLab, Jenkins, Nexus)
    • Operating system software repository synchronization (APT, Snap, YUM)

Desired Skills
  • Supporting or developing on standalone networks
  • Supporting or developing HPC or AI workloads using hardware acceleration
  • Experience complying with enterprise data policies
  • Experience with system security policy and accreditation processes

About Us
For more than 20 years, NewGen Technologies has solved our clients’ toughest IT challenges with integrity, security, and outstanding service by delivering both technology and talent. We have helped secure borders, used artificial intelligence (AI) to fight terror, aided the identification of criminals, and helped prevent crime through the introduction of biometrics. Our team of Highly Cleared Specialists has hard-to-find skills and expertise across a wide spectrum of technologies, providing solutions that transform business processes and solve problems of national significance.
