GlobalLogic Company Profile

Lead Data Engineer Python IRC

Job Description

The project aims to develop AI Infrastructure (AII), which is built atop Amazon Web Services resources, with a particular focus on Redshift, SNS/SQS, CloudWatch, CloudFormation, and Apache Airflow, as well as processing applications running on EC2 and other compute resources in the cloud. Software is developed in Python; the AWS Redshift database implements the Postgres API.

The first area will be developing, deploying, and designing monitoring for a large number of new data feeds that will be ingested into the Redshift database from internal and external data streams. The technology involved will be SNS/SQS, an EC2 autoscaling cluster, AWS Redshift, and CloudWatch. The code will be written in Python, and CloudFormation will be used for deployment. Extensive unit and integration testing, as well as continuous deployment, will need to be improved or developed.

The second area will focus on developing Airflow/Python script-driven SQL and Spark workflows that will use data stored in Redshift to generate appropriate downstream BI. The project will involve extensive SQL query optimization, spinning up EMR clusters, exporting and importing data from Spark clusters, data cleaning, and validation.

The final area will focus on improving continuous testing and deployment of our tools, and improving security in the AWS account.

Requirements:

Must have:

  • Strong Python knowledge
  • Strong knowledge of SQL, including Analytical Functions
  • Good understanding of the differences between OLTP and OLAP workloads
  • Good understanding of distributed data processing concepts
  • Experience with distributed data processing, e.g. Apache Spark, GCP Dataflow, etc.
  • Good knowledge of Cloud technologies for any Cloud provider
  • Some experience with ETL (or ELT)
  • At least a high-level understanding of Machine Learning methodologies, types, and approaches

Nice to have:

  • Hands-on experience with AWS, namely SQS/SNS, Redshift, EC2, S3
  • Knowledge of Apache Airflow
  • Experience with NoSQL distributed databases, especially with cloud warehouses (Amazon Redshift, Google BigQuery, Azure Synapse Analytics)

Preferences:

  • NoSQL, AWS Redshift, CloudFormation, AWS CloudWatch, AWS SNS, AWS SQS, Docker, Apache Spark, Airflow

Job Responsibilities:

Leading an engineering group that is responsible for developing and deploying features for a platform that performs:

  • getting, cleaning, and saving data from connected data feeds
  • generating interim data aggregation flows that produce preliminarily prepared data
  • generating final data aggregates
  • generating analytics (reports, dashboards, etc.)
  • training ML models
  • improving data access interfaces and tools
  • managing data, i.e. retention, backfilling, segmentation, performance tuning, etc.

What We Offer

  • Exciting Projects: Come take your place at the forefront of digital transformation! With clients across all industries and sectors, we offer an opportunity to participate in creating market-defining products using the latest technologies.
  • Collaborative Environment: Expand your skills by collaborating with a diverse team of highly talented people in an open, laid-back environment, or even abroad in one of our global centers or client facilities!
  • Work-Life Balance: GlobalLogic prioritizes work-life balance, which is why we offer flexible opportunities and options.
  • Professional Development: Our dedicated Learning & Development team regularly organizes certification and technical / soft skill training to help you realize your professional goals.
  • Excellent Benefits: We provide our consultants with competitive compensation and benefits.
  • Fun Perks: We want you to love where you work, which is why we host sports classes as well as cultural, social, and team-building activities such as sports competitions and end-of-year corporate parties. Our vibrant offices also include dedicated GL Zones and rooftop decks where you can drink coffee or tea with your colleagues over a game of table football or darts!