Job opening: Data Engineer / Python

October 5, 2020

About Us 

IHS Markit is a global information company that provides expertise, data and solutions to 50,000+ customers, helping them make more informed decisions. Our research and development center in Minsk focuses on building three intelligent IHS Markit platforms for the engineering and manufacturing domains. These include scalable cognitive engines that help users – engineers, innovators and researchers – discover and leverage knowledge locked in corporate repositories as well as in industry sources.

The Minsk AI team is looking for new talent in data engineering, which is critical to the success of Data Science and Machine Learning/Deep Learning projects in the Natural Language Processing, Data Capturing, Content Understanding and Information Retrieval domains.

Your Role 

You will be responsible for the design and engineering aspects of innovative projects related to automatic structuring and understanding of unstructured content. You will build robust data extraction, processing and transformation pipelines, apply intelligent models efficiently, and handle all questions related to the data life cycle.

Your duties will include: 

  • Owning data engineering in projects involving Deep Learning, Natural Language Processing, Data Capturing and Information Retrieval across a broad range of areas
  • Working in a team with data scientists, ML engineers and developers to build intelligent capabilities into company products
  • Ensuring dataset quality and suitability for ML projects (automation of labeling, inspection and cleaning, normalization, augmentation) 
  • Discovering new data (finding and obtaining raw data necessary for experimental setups, e.g. finding and downloading available data from the internet with focused crawling)
  • Developing data processing and transformation pipelines (designing ETL systems for ML/DL projects, designing online learning loops, embedding active learning algorithms into the data annotation toolset, etc.)
  • Organizing data warehousing, storage and versioning (making sure ML experiments are repeatable and keeping track of data state)

About You 

You are a talented engineer who is addicted to complex and fuzzy challenges. Your required qualifications and experience include: 

  • Strong coding and software engineering skills 
  • Ability to make good design decisions related to data 
  • 2+ years of Python programming experience
  • Experience with textual data engineering (encoding, formats, tools)  
  • Developed skills in algorithms and data structures 
  • Experience with data processing automation, schedulers and pipeline tools (Airflow, Luigi, Oozie, NiFi, Beam, make, etc.) 
  • Advanced Linux experience (Bash, command-line tools)
  • English language (B1+) 

 The following will increase our interest: 

  • Experience with programming in C++ 
  • Skills and knowledge in math and statistics 
  • Experience with SQL, NoSQL and Graph databases  
  • Experience with big data tools (Hadoop, Hive or Spark, etc.) 
  • Experience with search systems (Elasticsearch, Lucene, Solr) 
  • MS or PhD degree related to computer science, data science or statistics 
  • Publications in a related domain
  • Experience in projects involving deep learning, natural language processing or information retrieval
  • Experience in Machine Learning 

 What we offer 

Open and Collaborative Environment: 

  • In-house product development based on science and technology
  • Personal growth and career development supported at the corporate level
  • Support for self-study and research
  • Development of our own deep learning architectures
  • Custom datasets from a team of professional annotators
  • Training on a powerful private GPU cloud
  • Research and application of state-of-the-art models
  • Development of our own unique AI-driven products that work out of the box and are loved by the world's top companies
  • Great colleagues and an open atmosphere in the workplace
  • Sharing of knowledge and discoveries inside and outside the team
  • Collaboration with a great team of ML professionals  
  • Participation in international workshops and conferences  
  • Continuous education with invited tutors and paid online programs  

Employee benefits:  

  • English language classes  
  • Employee stock option plans
  • Vacation time that increases with tenure
  • Extended medical insurance for employees and their families  
  • Personal accident coverage  
  • Employee assistance program  
  • Reimbursement of sports activities  
  • Corporate and social events 
