Designing and implementing fine-tuned, production-ready data/ML pipelines on the Hadoop platform.
Driving optimization, testing and tooling to improve quality.
Reviewing and approving high-level and detailed designs to ensure that the solution meets business needs and aligns with the data & analytics architecture principles and roadmap.
Understanding business requirements and solution design in order to develop and implement solutions that adhere to big data architectural guidelines and address those requirements.
Following proper SDLC practices (code review, sprint process).
Identifying, designing, and implementing internal process improvements: automating manual processes, optimizing data delivery, etc.
Building robust and scalable data infrastructure (both batch and real-time processing) to support the needs of internal and external users.
Understanding various data security standards and using data security tools to apply and adhere to the required data controls for user access on the Hadoop platform.
Supporting and contributing to development guidelines and standards for data ingestion.
Working with data scientists and the business analytics team to assist with data ingestion and data-related technical issues.
Designing and documenting the development & deployment flow.
Experience - Must have:
Scala: Minimum 2 years of experience
Spark: Minimum 2 years of experience
Hadoop: Minimum 2 years of experience (Security, Spark on YARN, Architectural
HBase: Minimum 2 years of experience
Hive: Minimum 2 years of experience
RDBMS (MySQL / PostgreSQL / MariaDB): Minimum 2 years of experience
CI/CD: Minimum 1 year of experience
Bachelor's degree in IT, Computer Science, Software Engineering, Business Analytics or equivalent with at-leas