Location: Remote
Duration: Long-term
Experience Required: Minimum 5 years
Working Hours: 1:30 PM – 10:30 PM IST
Dual Employment: Not allowed (must be terminated immediately if applicable)
Budget: 1.3 LPM (lakh per month)
The vendor will be migrating data from the Teradata database to Google BigQuery. Existing jobs built in BTEX and Informatica will be converted to GCP Dataflow jobs. The Data Engineer will be responsible for:
Validating and ensuring data accuracy post-migration.
Supporting new development initiatives within the Cerebro GCP Platform.
Design, develop, and maintain scalable ETL/ELT pipelines for structured and unstructured data.
Optimize data storage and retrieval using BigQuery and GCP Dataflow.
Work with cross-functional teams to define data architecture and model design.
Support data validation, quality assurance, and troubleshooting of pipeline issues.
Collaborate with the DevOps team to automate deployments and implement CI/CD pipelines.
Ensure data security and compliance in line with organizational and GCP best practices.
Strong proficiency in SQL, Python, PySpark, and Spark SQL for data manipulation and transformation.
Deep understanding of data modeling, normalization, and schema design.
Experience in data lake, data warehouse, and data mart design and management.
Knowledge of batch and streaming data processing architectures.
BigQuery: Data warehouse design, optimization, and cost management.
Dataproc: Running Spark/Hadoop jobs.
Cloud Data Fusion: Designing, developing, and deploying data pipelines.
Cloud Dataflow (Apache Beam): Scalable stream and batch data processing.
Cloud Storage: Data lake setup and object lifecycle management.
Cloud Composer (Airflow): Workflow orchestration.
Cloud IAM: Role-based access control and security best practices.
VPC, Firewall, and KMS: Network configuration and data encryption.
Terraform / Deployment Manager: Infrastructure as Code (IaC).
CI/CD Pipelines: Automation for build, testing, and deployment processes.
Implement data quality, lineage, and cataloging using tools such as Data Catalog.
Integrate with Looker, Data Studio, or other BI tools for visualization and reporting.
Support compliance, auditing, and data access transparency.
Strong analytical, problem-solving, and communication skills.
Ability to collaborate with cross-functional and international teams.
Experience working in Agile environments.
Understanding of cost optimization, monitoring, and SLOs/SLAs.
Employment Verification
Police Clearance Certificate (PCC) / Criminal Record Check
Must have excellent English communication skills (as teams work internationally).
The candidate must be an in-house bench resource.