We have an opening for a Big Data (Hadoop) SRE. The location is St. Louis, and the role is hybrid, with one or two days in the office each week. Please find the JD below:
Sr System Reliability Engineer (Big Data)
Location: St. Louis, MO
Required Skills: ITSM, Production Support, Hadoop, Hive, Spark, Impala, Automation
The Role
• Plan, manage, and oversee all aspects of a production environment for big data platforms.
• Define strategies for application performance monitoring and optimization in the production environment.
• Respond to incidents, improve the platform based on feedback, and measure the reduction of incidents over time.
• Ensure that batch production scheduling and processes are accurate and timely.
• Create and execute queries against big data platforms and relational data tables to identify process issues or to perform mass updates (preferred).
• Handle ad hoc requests from users, such as data research, file manipulation/transfer, and investigation of process issues.
• Take a holistic approach to problem solving: during a production event, connect the dots across the various layers of the technology stack that make up the platform to minimize mean time to recovery.
• Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
• Analyze the platform's ITSM activities and provide a feedback loop to development teams on operational gaps and resiliency concerns.
• Support services before they go live through activities such as system design consulting, capacity planning, and launch reviews.
• Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead Mastercard in DevOps automation and best practices.
• Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
• Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
• Work with a global team spread across tech hubs in multiple geographies and time zones.
• Share knowledge and explain processes and procedures to others.
Requirements
• Experience with Linux.
• Experience with big data technologies (Hadoop, Spark, NiFi, Impala).
• Knowledge of ITSM/ITIL.
• 5+ years of experience running big data production systems.
• Experience with industry-standard CI/CD tools such as Git/Bitbucket, Jenkins, and Chef is a plus.
• Solid grasp of SQL or Oracle fundamentals.
• Experience with scripting, pipeline management, and software design.
• Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
• Ability to help debug and optimize code and automate routine tasks.
• Ability to support many different stakeholders; experience dealing with difficult situations and making decisions with a sense of urgency is needed.
• Appetite for change and pushing the boundaries of what can be done with automation.
• Experience working across development, operations, and product teams to prioritize needs and build relationships is a must.
• Experience designing and implementing an effective and efficient CI/CD flow that gets code from dev to prod with high quality and minimal manual effort is desired.
• Good handle on the change management and release management aspects of software.