Join NVIDIA as a Senior Engineer on the Infrastructure Specialists team. Help redefine deep learning and data analytics, and power data centers worldwide, using NVIDIA products. Collaborate on building the world's largest and fastest data centers and supercomputers. We are seeking a candidate who can lead the planning and deployment of AI data centers, focusing on infrastructure aspects like power/cooling systems, telemetry and control systems, and design and construction processes.
In this role, your main focus will be to support customers in the areas of data center planning, design, construction, and deployment. You will ensure the integrity of the NVIDIA platform infrastructure by meticulously planning, implementing, and validating all aspects of the data center's physical infrastructure. This includes architectural systems, power distribution, cooling systems, integration of telemetry and control systems, and all other physical infrastructure. Collaboration with product and engineering teams, customers, and the partner/provider ecosystem will be crucial to achieving successful deployments.
What you will be doing:
- NVIS Data Center deployment planning: Collaborate with product and engineering teams to understand NVIDIA’s reference architectures for data center infrastructure including power distribution, cooling systems, controls and monitoring, and network/cabling architecture. Support customers and partners in quickly implementing these architectures into advanced and reliable data center designs.
- Design and construction oversight: Review and appraise customers' and partners' infrastructure design plans, verifying their compliance with NVIDIA reference architecture, industry standards, and regulatory requirements. Deliver guidance, expertise and suggestions to optimize performance, scalability, and cost-effectiveness.
- Assess the operational efficiency, reliability, and readiness of data center infrastructure components before deploying AI/HPC clusters. Develop and implement comprehensive audit plans and conduct pre-deployment audits to identify potential issues, risks, and areas for improvement.
- Partner ecosystem: Develop and sustain a strong ecosystem of colocation providers, service providers and partners as needed, to ensure customers can deploy NVIDIA solutions rapidly and reliably.
- Be the key liaison for customers and partners on matters of data center infrastructure.
- Act as the NVIS mentor, providing guidance and support to ensure the team's success in their respective roles.
- Quality Assurance: Develop and implement quality assurance processes to ensure that deployments meet established specifications and performance benchmarks. Conduct detailed bring-up, testing, and commissioning to validate the functionality and reliability of infrastructure components.
- Continuous Improvement: Drive continuous improvement initiatives to improve data center infrastructure reliability, resilience, and sustainability. Find opportunities to streamline processes, automate repetitive tasks, and apply new technologies to optimize infrastructure operations.
- Collaboration and Communication: Collaborate and communicate across internal teams, external vendors, and customers to facilitate the flawless integration of data center infrastructure solutions. Serve as a domain authority and point of contact for infrastructure-related inquiries and critical issues.
What we need to see:
- Bachelor's degree or equivalent experience in Engineering, Computer Science, Information Technology, or a related field. Advanced degree or equivalent experience or relevant certifications are desirable.
- We need an expert professional with a background in the design and construction of enterprise and/or hyper-scale data centers. The ideal candidate will have at least 6 years of experience, preferably in sophisticated, high-density AI/HPC data centers.
- Proven experience in data center engineering, operations, or infrastructure management roles, focusing on large-scale data center deployments.
- Strong technical knowledge and experience in data center systems - power distribution, liquid cooling, rack/server chassis, and cabling.
- Proven technical and project leadership in fluid situations, and the ability to adapt to change.
- Excellent analytical, problem-solving, and decision-making skills, keen attention to detail, and a dedication to quality.
- Strong record of excellent partnership and of putting the mission's success first.
- Effective communication and interpersonal skills with the ability to interact authoritatively with diverse collaborators including customers and facilitate productive discussions.
- Coordination & Time Management – proficient at planning, scheduling, and coordinating tasks related to the job to accomplish objectives within or ahead of designated time frames.
- Willingness to travel (40%).
Ways to stand out from the crowd:
- Experience in data center deployment, operations process, safety, and security measures.
- Solid understanding of the whole data center infrastructure stack.
- Outstanding social skills.
NVIDIA is widely considered one of the world's most desirable employers in technology. We have some of the world's most forward-thinking and passionate people working for us. If you're creative and autonomous, we want to hear from you.
The base salary range is 148,000 USD - 276,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.