5+ years of software development experience, ideally at a product company.
3+ years of experience building and supporting scalable, fault-tolerant batch, streaming, real-time, and/or near-real-time data pipelines.
3+ years of experience with one or more data-flow programming frameworks such as Apache Beam, Flink, Argo, Airflow, or Prefect.
Strong data modeling experience with both relational and NoSQL databases.
Hands-on experience with data warehouses, preferably AWS Redshift.
Expert knowledge of SQL and Python.
Ability to work independently to deliver well-designed, high-quality, and testable code on time.
Capable of mentoring more junior developers.
English — upper-intermediate / advanced
Experience working with big data ecosystem tools such as:
— data ingestion: Singer, Airbyte, etc.
— messaging: Apache Kafka, AWS Kinesis
— stream processing: Kafka Streams, Streamz, Storm 2.0, etc.
— query engines and table formats: Apache Hive, Spark, Iceberg, Presto, AWS Athena, etc.
Understanding of and hands-on experience with Kubernetes
Open-source projects or contributions on GitHub
— Silicon Valley Experience;
— 3 weeks of paid vacation and 2 weeks of days off and sick leave;
— Hackers’ days;
— Corporate retreats;
— Paid lunches and parking;
— Coverage of professional learning: conferences, training courses, and other events;
— Compensation for sports activities;
— English Speaking Club with native speakers;
— Medical insurance;
— VGS stock options.
Design, implement and operate large-scale, high-volume, high-performance data structures for analytics and data science.
Implement real-time and batch data ingestion routines, applying best practices in data modeling and ETL/ELT processes and leveraging various big data technologies and tools.
Gather business and functional requirements and translate them into robust, scalable, operable solutions with a flexible and adaptable data architecture.
Collaborate with engineers to help adopt best practices in data system creation, data integrity, test design, analysis, validation, and documentation.
Collaborate with data analysts and scientists to create fast and efficient algorithms that exploit our rich data sets for optimization, statistical analysis, prediction, clustering and machine learning.
Help continually improve ongoing reporting and analysis processes, automating or simplifying self-service modeling and production support for customers.
Oversee junior team members' activities and help with their professional growth.