
Lead Developer

Intercontinental Exchange
United States, Georgia, Atlanta
5660 New Northside Drive Northwest
Dec 16, 2025
Overview

Job Purpose

Intercontinental Exchange, Inc. (ICE) is seeking a full-time AI Platform Technical Lead to join and lead the team responsible for architecting and managing the enterprise-wide platform for AI model training, deployment, and inference at scale. The candidate will serve as a Technical Lead within the AI Center of Excellence, playing a pivotal role in advancing the firm's strategic initiative to integrate Generative AI technologies responsibly and sustainably across the enterprise through robust training and inference infrastructure.

The ideal candidate will possess deep expertise in AI/ML training pipeline architecture, inference optimization, and production model serving platforms, leveraging the latest advancements in Generative AI, distributed computing, GPU clusters, model optimization techniques, and high-performance inference systems.

This position demands advanced technical proficiency in training orchestration, model deployment pipelines, inference scaling, and performance optimization, along with innovative problem-solving, strong leadership, and the ability to mentor and guide MLOps and platform engineering teams effectively. The role also requires strategic vision for AI training and inference infrastructure roadmaps, including compute resource management, model lifecycle optimization, and real-time serving architectures. Exceptional professionalism, proactive collaboration, and outstanding communication skills are essential.

The candidate will actively engage and influence diverse stakeholders across the organization to align training and inference platform capabilities with AI model requirements and business SLAs, ensuring efficient resource utilization, optimal model performance, and cost-effective scaling. Strong written and verbal communication skills are imperative: the candidate must articulate training efficiency metrics, inference latency optimizations, resource allocation strategies, and platform ROI clearly and persuasively to both technical teams and executive audiences, including presenting model performance benchmarks, infrastructure cost optimization, and platform scalability roadmaps to senior leadership.

Responsibilities

  • Architecting, implementing, and managing enterprise-wide AI inference and training platform infrastructure.
  • Driving innovation, operational excellence, and scalability within AI/ML model serving and training environments.
  • Leading technical strategy for AI platform development and optimization across the organization.

Knowledge and Experience

  • Advanced degree in Computer Science, Machine Learning, Data Engineering, or related field.
  • Strong programming skills in Python with deep knowledge of ML libraries (scikit-learn, TensorFlow, PyTorch, Transformers).
  • Proficiency in ML model deployment frameworks, inference engines, and real-time serving APIs.
  • Working knowledge of vector databases, model registries, and feature stores (e.g., Feast, Tecton).
  • Experience with distributed computing frameworks (Spark, Ray) and GPU programming (CUDA) is highly beneficial.
  • Experience with AI model monitoring, performance tracking, and observability tools (Prometheus, Grafana, MLflow).
  • Extensive experience in cloud ML platforms (AWS SageMaker, Azure ML, Google AI Platform).
  • Deep experience with Kubernetes for ML workloads, Helm charts, and container orchestration for training pipelines.
  • Experience leading AI platform development in cross-functional teams of data scientists and ML engineers.
  • Expertise in CI/CD pipelines specifically for ML model deployment and automated retraining workflows.
  • Excellence in explaining complex AI infrastructure solutions to technical teams and business stakeholders.
  • Experience in enterprise AI/ML environments, working with governance, compliance, and responsible AI practices.
  • Extensive experience and demonstrated leadership in designing and managing AI/ML training and inference platforms using cloud infrastructure (AWS, Azure, GCP).
  • Deep expertise in ML model serving frameworks (e.g., TensorFlow Serving, TorchServe, MLflow, Kubeflow).
  • Proficiency with GPU cluster management, distributed training, and model optimization techniques.
  • Strong experience with AI/ML orchestration platforms, particularly Kubernetes for ML workloads and container technologies including Docker.
  • Comprehensive knowledge of MLOps pipelines, model versioning, A/B testing frameworks, and continuous integration for ML models.
  • Experience with high-performance computing, inference optimization, and real-time model serving architectures.
  • Exceptional problem-solving skills in AI infrastructure challenges and strategic thinking for platform scalability.
  • Proven leadership abilities in guiding cross-functional AI/ML engineering teams and mentoring MLOps engineers.
  • Excellent written and verbal communication skills for technical and executive audiences.
  • Ability to effectively collaborate with data scientists, ML engineers, and business stakeholders to align AI platform capabilities with strategic objectives.


Intercontinental Exchange, Inc. is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to legally protected characteristics.
