Big Data Engineering Team Lead
Remote
Full Time

TRG Research and Development deals with Cyber Data Fusion and AI products for civilian protection. Our mission is to empower our customers to fight crime and terror through state-of-the-art technologies that provide accurate and precise intelligence.

As a Big Data Engineering Team Lead, you will be instrumental in accelerating and scaling our data pipelines and data lakes. Your primary focus will be on researching the best-suited solutions for these goals and then implementing, maintaining, and monitoring them.

Who you are:

  • Proficiency with Scala and Apache Spark
  • Proficiency with Hadoop ecosystem services such as MapReduce v2, HDFS, YARN, Hive, HBase
  • Experience with any data lake table format (e.g. Apache Hudi, Apache Iceberg, Delta Lake)
  • Experience with building stream-processing systems using solutions such as Apache Kafka and Apache Spark Streaming
  • Experience with orchestration tools (e.g. Apache Airflow)
  • Experience with integrating data from multiple heterogeneous sources and various formats (Parquet, CSV, XML, JSON, Avro)
  • Experience with SQL databases and NoSQL databases, such as Elasticsearch and MongoDB
  • Nice to have: hands-on experience with Kubernetes
  • Strong communication and teamwork skills

What you will do:

  • Establish, lead, manage, and mentor the big data team
  • Own the development of an in-house Data Lake for storing structured and unstructured data
  • Research, design, and develop appropriate algorithms for Big Data collection, processing, and analysis
  • Define how data will be streamed, stored, consumed, and integrated by different data systems
  • Identify relevant Big Data tools required to support new and existing product capabilities
  • Collaborate closely with the product team to define the requirements and milestones that relate to Big Data features
  • Work closely with the Data Scientists to provide feature-engineered datasets
  • Design, create, deploy, and manage data pipelines within the organization
  • Create data architecture documents, standards, and principles and maintain knowledge on the data models
  • Collaborate and coordinate with multiple teams/departments to identify the data domains and data gaps between current state systems and future goals
  • Communicate clearly and effectively the data entities and their relationships within a business model
  • Audit performance and advise on any necessary infrastructure changes
  • Develop key metrics for tests on the data and create data quality rules
  • Focus on scalability, availability, and data governance

We provide:

  • Friendly atmosphere in our Gemicle family
  • Opportunity to work on high-load projects with millions of users
  • Working in an international environment
  • Competitive salary
  • English classes
  • Gym discount
  • Flexible working hours and no overtime
  • 18 holidays + 2 days off
  • Paid sick leave
  • Cozy office in the centre of the city with fruit, cookies...
  • Regular team-building events and nice presents for holidays