Big Data Engineer (Closed)
SkillStorm is seeking a Big Data Engineer for our client in Addison, TX or Charlotte, NC. Candidates must be able to work on SkillStorm's W2; this is not a C2C position. EOE, including disability/vets.
- Our organization is modernizing its HR data & analytics technology services, including self-serve & API-driven access to Big Data. The Data Engineer-Data Lake will join the Bank's Global Human Resources Technology-Data, Analytics, & Reporting Development Team. We are looking for a well-rounded data engineer who "gets it" and sets the example for effective collaboration in a team environment. The candidate must demonstrate excellent communication skills and critical thinking to ensure all assumptions, constraints, and behaviors are well thought through. The individual ensures the systems design and requirements are aligned to achieve the desired business outcomes, and that team practices and coding/quality principles are aligned to achieve the desired technology outcomes. This individual will develop and deliver complex solutions in a Data Lake environment, including technical design, development, and testing.
- The desired individual will have demonstrated experience in standing up Big Data operational & analytics platforms and translating business requirements into scalable & sustainable technical solutions. Qualified candidates must be well-versed in data warehousing and ‘Big Data’ distributed processing and storage technologies as well as the Data Lake design pattern. In addition, the selected candidate should have knowledge of and experience with data management techniques such as Metadata Management, Data Quality (DQ) Management, Data Governance, Data Integration/Ingestion, Data Architecture, and Data Profiling.
- Developing and scripting in Python as well as SQL in Linux environments.
- Integrating data with Sqoop and ingesting files of multi-record types in various data formats such as Parquet, Avro, and JSON.
- Creating and maintaining optimal data pipeline architecture in Cloudera CDH or a similar platform, with application development skills in Hive, Sqoop, and PySpark
- Participate in status meetings to track progress, resolve issues, articulate & mitigate risks and escalate concerns in a timely manner.
- Understands and evangelizes great design, engineering, & organizational practices (unit testing, release procedures, coding design and documentation protocols) that meet/exceed the Bank's change management procedures
- Sets the bar for team communications and interactions; is an excellent teammate to peers, influencing them in a positive direction
- Uses versioning tools such as Git/Bitbucket
- Sets up jobs using AutoSys for automation
- 10+ years of overall IT experience with 5+ years on Big Data technologies such as Hadoop, Hive, Spark, PySpark, Sentry, HBase, Sqoop, Impala, Kafka.
- 5+ years of Data Warehousing experience, including manipulation/transformation of millions of rows with optimal methods
- 5+ years of developer experience building software in Python/Scala/Java, PL/SQL
- Experience with Jenkins Pipeline, Bitbucket, and Python unit test code development tools.
- Advanced-level analytical, debugging, and critical thinking skills.
- Knowledge of search engine technologies such as Apache Solr or Elasticsearch, and the ability to define schemas, create collections, ingest data into the search engine, and retrieve data using streaming APIs, is a plus
- Experience in Master Data Management (MDM) concepts, tools, and best practices
- Enterprise HR Data Domain experience
- Technology Development experience in Financial Services/Banking Industry
- Reference Data Management Experience
- Zaloni or similar Data Governance tool experience
- Advanced experience in Unix shell scripting and AutoSys
- Experience with traditional relational database management systems (e.g., SQL Server, MySQL, PostgreSQL, Oracle) and SQL skills, including developing stored procedures, functions, and triggers
- Working knowledge of machine learning, artificial intelligence, and other Big Data analytics technologies
- Experience in Agile Scrum, DevOps, SAFe, or Disciplined Agile ways of working & CI/CD processes