SRE - DataPlatform

Veepee

Permanent contract (CDI) · France · IT / Digital
Published on
20/04/2026
Contract
Permanent (CDI)
Location
France
Team size
250-2000 employees
Compensation
Not disclosed
Occasional remote work · 3-5 years of experience · English

Benefits

Flexible working hours · Paid leave · Leadership training
Key responsibilities: Ensure the reliability and performance of the data platform services. · Contribute to the migration to VeepeeCloud. · Improve the services running on Kubernetes. · Collaborate to optimize resource usage. · Contribute to the definition and implementation of the disaster recovery plan.
Desired profile: 3-5 years of experience · Collaboration · Autonomy · Adaptability
Tools & skills: Kubernetes, Trino, Iceberg, S3, Kafka, Flink, Terraform, Prometheus, Grafana, GitOps

The role in detail

Being an SRE at VeepeeTech means being part of a cross-functional SRE community while being embedded in a product-oriented Data Platform team.

You will contribute to the reliability, scalability, and operability of critical data services by applying SRE and DevOps practices, while sharing knowledge across teams.

The Data Platform is currently evolving toward a modern lakehouse architecture deployed on VeepeeCloud (our on-prem platform), based on technologies such as Trino, Iceberg, and object storage, with strong ambitions around performance, cost efficiency, and platform ownership.

You will work in a distributed environment (France & Spain), within a team of 40–50 data professionals across engineering, analytics, data science, and governance.

You will play a key role in ensuring the reliability and scalability of this next-generation data platform, while supporting the transition from public cloud to hybrid/on-prem architectures.



🎯 TASKS 

Platform Reliability & Operations
  • Ensure reliability and performance of our data platform services (Trino, Iceberg, S3, Kafka, Flink)

  • Define and implement SRE best practices: SLIs/SLOs, error budgets, observability

  • Build and maintain monitoring, alerting, and incident response frameworks (Prometheus, Grafana, etc.)

Cloud Migration & Architecture
  • Contribute to the migration from the public cloud data warehouse to the VeepeeCloud lakehouse stack

  • Support coexistence between cloud and on-prem systems and ensure consistency and reliability

  • Help design resilient architectures for ingestion, transformation, and serving layers

Kubernetes & Infrastructure
  • Operate and improve services running on Kubernetes (GKE/EKS & on-prem clusters)

  • Automate infrastructure provisioning using Terraform, Atlantis, and/or Crossplane

  • Improve GitOps workflows for platform deployment and configuration

FinOps & Performance Optimization
  • Collaborate with teams to optimize compute/storage usage (Trino queries, BigQuery slots, etc.)

  • Build tools and dashboards to track cost, usage, and efficiency

  • Support the transition toward cost-efficient on-prem workloads

Developer Enablement
  • Improve self-service capabilities for data teams (e.g., provisioning Trino/Iceberg resources)

  • Help teams adopt best practices in reliability, observability, and deployment

  • Write clear technical documentation and runbooks

Resilience & DRP
  • Contribute to Disaster Recovery Plan (DRP) definition and implementation

  • Ensure multi-DC resilience (FR1 / NL1) and data replication strategies

  • Participate in incident management and postmortems


👉 MUST HAVE skills

  • Strong experience with Kubernetes in production environments

  • Experience with distributed data systems (or strong willingness to learn)

  • Solid understanding of SRE principles (monitoring, alerting, SLAs/SLOs)

  • Experience with Infrastructure as Code (Terraform or similar)

  • Familiarity with GitOps workflows

  • Experience with observability tools (Prometheus, Grafana, logging systems)

  • Comfortable working in cloud environments

  • Strong collaboration mindset and ability to work across teams

  • Fluent in English


👉 NICE TO HAVE skills