Site Reliability Engineer

Company:  Aegistech
Location: New York
Closing Date: 20/10/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description

We are looking to hire an employee for a Cloud SRE – Windows Hybrid role located in NYC.


The role sits within the client’s Cloud and Platform group, a global team responsible for maintaining and supporting the infrastructure systems used across the client, a global investment bank. The team plays a critical role and works closely with global counterparts in maintaining the production infrastructure. The candidate should have strong technical, functional, and analytical skills, with solid experience in automation, supporting critical infrastructure, and troubleshooting Windows systems. This position will contribute to supporting and driving infrastructure implementations to completion and will serve as a subject matter expert to the user community across Infrastructure, Platform, and Software as a Service (IaaS/PaaS/SaaS). The team operates in a follow-the-sun support model.


The function provides a variety of services to our stakeholders including hardware specification advice, Operational Readiness of new solutions and implementation of new Windows servers.



THE DAY-TO-DAY RESPONSIBILITIES:

  • Support/manage MS Windows systems and implementation of Change requests.
  • The candidate will create scripts to increase the efficiency of daily support. This includes updating runbooks and support procedures.
  • Active collaboration with the Global Operations and Engineering teams to implement key projects within the Cloud environments.
  • Provide assistance and support to transformation programs for application and services looking to move to the cloud environment.
  • Identify ways to improve and automate SRE concerns: availability, latency, performance, efficiency, and capacity planning.
  • Troubleshoot system performance issues.
  • Handle trouble tickets, user requests, proactive maintenance.
  • Support weekend BCP / DR tests and weekend on call production support on a rotation basis.
  • Assist application teams in post-configuration of new servers deployed.
  • A good understanding of ITIL and Change Management policies is desired.
  • Coordination with Infrastructure teams and Business IT managers to deliver projects on schedule.
  • Work with the Engineering team on Operational Readiness and implement engineered solutions to improve efficiency and stability of the infrastructure.
  • Provide documentation for 1st line Operations team and maintain run books.
  • Investigate and determine root causes for major incidents with the help of vendors and internal infrastructure teams, providing a detailed RCA and plan for remediation.
  • Attend to escalations during Follow-The-Sun support hours.
  • Contribute towards BAU Projects.
  • Work with the Incident Management team to provide RCAs for incidents.


THE SKILLS YOU NEED TO GET THE ROLE:

  • Solid experience as a Windows Systems Administrator in a large-scale, global and distributed environment
  • Experience with cloud tools such as Ansible, Git, Kubernetes, and Terraform
  • Experience with virtual (VMware) and physical networking configuration is a plus
  • Ability to deploy and support MS Windows Clusters
  • Ability to create scripts using PowerShell, Python, VBScript, or JScript/JavaScript is a plus
  • Knowledge of Site Reliability Engineering components is a must
  • Experience working with VMware virtualization
  • Understanding of Active Directory and how enterprise class identity and access management (IAM) is extended from on-premises environment to public cloud is a plus
  • Ability to troubleshoot issues and provide resolution
  • Written and verbal communication skills are a must
  • Work independently as well as in a team
  • Previous experience with supporting a banking infrastructure is preferred
  • Prior experience in a global enterprise
  • Experience working with offshore IT teams