Data Engineer (Azure)

About Trantor:

Trantor is a technology services company focused on outsourced product development and digital re-engineering. Leveraging our CaptiveCoE™ engagement model, we operate as a seamless extension of our client’s teams to provide rapid scalability with predictable budgets. Founded in 2012, Trantor has worked with customers across Tech, FinTech, Media & Cybersecurity industries. We have centers in the US, India, Canada, and Costa Rica. We are consistently rated as the #1 employer in the region with the ability to attract and retain technical talent. Our commitment to excellence and impactful results has translated to long-term relationships and value for our clients and solution partners.                                                                                                                                       


Job Description:

We are seeking a Data Engineer to design, implement, and optimize cloud-based data pipelines using Microsoft Azure services, including Azure Data Factory (ADF), Synapse Analytics, and Azure Data Lake Storage (ADLS).


Job Role & Responsibilities

  • Collaborate with our business analytics and data science teams, gathering requirements and delivering complete business intelligence solutions
  • Mentor junior software developers and build a strong team
  • Model data and metadata to support discovery, ad-hoc, and pre-built reporting
  • Design and implement data pipelines using Hadoop, Spark, and Azure services such as Blob Storage, SQL Database, Event Hubs, Data Factory, Synapse Analytics, and Databricks
  • Write programs and scripts: strong SQL skills and proficiency in Python or Scala are required; experience with PowerShell or Azure CLI for automation is a plus
  • Partner with security, privacy, and legal teams to deliver solutions that comply with security and privacy policies
  • Own the design, development, and maintenance of datasets our business analytics teams will use to drive key business decisions
  • Develop and promote standard methodologies in data engineering, including scalability, reusability, maintainability, and usability
  • Tune and ensure compute performance by optimizing queries, databases, files, tables, and processes
  • Ensure data and report service level agreements are met
  • Analyze and solve problems at their root, stepping back to understand the broader context
  • Own continuous engineering operational excellence of the datasets that drive key business decisions
  • Learn and understand a broad range of data resources and know when, how, and which to use and which not to use
  • Keep up to date with advances in big data technologies and run pilots to design the data architecture to scale with increased data volume using Azure
  • Continually improve ongoing reporting and analysis processes, automating or simplifying self-service support for datasets
  • Triage many possible courses of action in a high-ambiguity environment, making use of both quantitative analysis and business judgment

Qualifications We Seek in You

Minimum Skills Required

  • Bachelor’s degree in computer science or related technical field
  • 10+ years of experience in data architecture and business intelligence
  • 5+ years of experience in developing solutions in distributed technologies such as Hadoop, Spark
  • Experience in delivering end-to-end solutions using Azure services – Blob Storage, SQL Database, Event Hubs, Data Factory, Synapse Analytics, and HDInsight
  • Experience in programming using Python, Java, or Scala
  • Expert in data modeling, metadata management, and data quality
  • SQL performance tuning
  • Strong interpersonal and multitasking skills with the ability to balance competing priorities
  • Excellent verbal and written communication skills, with the ability to communicate effectively with both business and technical teams
  • An ability to work in a fast-paced ambiguous environment where continuous innovation is occurring
  • Experience with a business intelligence reporting tool

Preferred Skills

  • Experience with Databricks for advanced analytics and data processing
  • Understanding of well-architected data pipeline designs
  • Expertise in monitoring and fault-tolerant pipeline designs
  • Knowledge of cluster configuration for optimal performance
  • Ability to create cost-optimal solutions
  • Experience in exploratory data analysis (dashboarding, plotting) and applying machine learning technologies and algorithms is desirable
  • Good knowledge of standard machine learning techniques (such as regression, classification, anomaly detection, and forecasting) using standard machine learning libraries in Spark or Python is desirable
  • Prior experience with generative AI and related tools and techniques (such as large language models and prompt engineering) is desirable
  • Having a relevant Azure certification (architecture, data, or machine learning) is desirable

Data Engineer
Job Category: data engineer
Job Type: Full Time
Job Location: Noida
Shift Timing: UK Shift

Apply for this position
