Site Reliability Engineer

Company:  IXOPAY
Location: Lehi
Closing Date: 03/11/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description

About IXOPAY, a TokenEx Company


Our mission at IXOPAY, a TokenEx Company, is to secure and optimize payments for global commerce. We’re building an integrated platform that optimizes payment transactions and protects payments data. For merchants who understand that payments are now a strategic function, IXOPAY, a TokenEx Company, is a complete payments optimization platform that delivers best-in-class tokenization and transaction routing. Unlike point solutions, IXOPAY, a TokenEx Company, delivers omnichannel tokenization, card lifecycle management, and smart routing via any payments service provider, giving merchants unprecedented control over their revenue and the competitive edge to thrive in global commerce.


We believe our people are our most valuable asset and that our culture is defined by our core values that align the organization with our mission and strategy.


Position Overview

As an SRE Network Engineer, you will be responsible for applying development skills and mindset to IT operations, with the goal of improving the reliability of IXOPAY’s systems through automation and continuous integration and delivery.


Position Responsibilities

  • Effectively manage troubleshooting and recovery of complex production incidents, ranging from low to critical impact.
  • Drive incident resolution through a systematic problem-solving approach, coupled with a strong sense of ownership and drive.
  • Actively participate in the team’s Agile stories (project work) to streamline and enhance the team’s day-to-day operations.
  • Create, manage, and utilize appropriate technical procedural documentation (run books).
  • Proactively monitor all applications and infrastructure behind IXOPAY’s external and internal customer-facing services, including availability, latency, performance, and capacity.
  • Influence resiliency and scalability in production environments in Azure and Amazon Web Services (AWS).
  • Assist with conducting Root Cause Analysis (RCA) on critical production outages, and develop and implement mitigation strategies.
  • Utilize production support expertise to influence and support new designs, architectures, standards, and methods, maintaining stability and availability for large-scale distributed systems.
  • Proactively identify and implement opportunities for automation of routine maintenance tasks, data gathering, and resolution of common issues.
  • Continuously seek to develop new skills and technical expertise, as well as proactively share knowledge with others.
  • Build software and systems to manage platform infrastructure and applications to improve reliability, quality, and time-to-market of our suite of software solutions.
  • Gather and analyze operating systems/applications metrics to assist in performance tuning and fault finding.
  • Participate in system design consulting, platform management, capacity planning, testing & release procedures.
  • Create sustainable systems and services through automation and uplifts.
  • Balance feature development speed and reliability with well-defined service level objectives.
  • Perform disaster recovery operations, monitor network performance, and troubleshoot, diagnose, and resolve hardware, software, and other network and system problems.



Position Qualifications

  • Bachelor’s Degree in Computer Science or relevant experience (degree preferred but not required)
  • In-depth understanding of web service protocols and REST API design and consumption
  • Experience with both container and serverless computing
  • Microsoft Azure/AWS developer/architecture certifications preferred
  • Skilled in Cloud/PaaS Environments (e.g., AWS, Azure), LAN, WAN, Network Security
  • Proficient, collaborative, & experienced in building reliable, scalable, enterprise systems
  • Ability to identify root-cause sources of instability in high-traffic, large-scale distributed systems
  • Linux administration, troubleshooting, and performance tuning experience
  • Understanding of observability principles (monitoring, logging, tracing, alerting), tools and practices that promote observability
  • Experience with continuous integration tools (e.g., GitLab, AWS CodeBuild, CodeDeploy, CodePipeline, Azure DevOps)
  • Troubleshooting skills that span systems, network, and code
  • Strong understanding of network infrastructure and network hardware
  • Ability to implement, administer, and troubleshoot network infrastructure devices, including firewalls and load balancers
  • Configuration management and orchestration (e.g., Terraform, CloudFormation, Ansible, Chef)


Location: This position will be hybrid in the Lehi, Utah area.
