Our mission is to serve with integrity as both an employer of choice for our associates and a strategic partner to our clients.
IRI’s vision is to become an industry-leading staffing services organization by maintaining ethical business practices, a passion for customer service, a commitment to quality, and our continued efforts to exceed expectations.
Job Description
Job Title: Big Data Platform Staff Engineer
Duration: 6 Months
Location: Santa Clara, CA
Job Description: As a core developer, you will be part of the team focused on the next generation of the client's Big Data Platform. Your primary responsibility is developing big data tools that serve various consumers (users or programs) by optimizing or automating the system. In particular, you will develop tools for handling big data metadata: extracting metadata, importing it into the client's big data platform, and synchronizing it across several metastores. You will also develop algorithms that analyze query workloads and generate recommendations (advisors) for optimizing various system components and structures, and you will test, analyze, and optimize the Big Data Platform's performance. The position requires strong problem-solving and analytical skills and excellent verbal and written communication skills.
Requirements:
BS or higher degree in Computer Science or related field.
5+ years of software development experience, including performance analysis.
Strong SQL knowledge (including JDBC): query processing, optimization and execution, query performance, EXPLAIN plans, and database tooling.
Spark/Spark SQL experience is a big plus.
Excellent coding skills in Scala/Java and scripting languages; UI skills are a big plus.
Hands-on experience with Big Data components/frameworks such as Hadoop, Spark, HBase, HDFS, Hive, and NoSQL databases.
Experience with other distributed-system components, such as Solr, Elasticsearch, Logstash, Kibana, Kafka, and REST APIs, is a plus.
Key Responsibilities:
Develop tools that extract/import/synchronize metadata.
Develop tools (advisors) that analyze query workloads, derive common workload characteristics, generate recommendations for optimizing system usage, and measure the resulting benefits.
Analyze performance and scalability of the system, typically across a variety of software configurations.
Test developed code, including automated testing.