Data Engineer

About Inarix

Inarix provides AI services for agricultural settings. We offer powerful digital tools for cereal qualification, an efficient alternative to complex hardware solutions. We successfully launched our first product in 2020, generating sizable revenues from top-tier clients in France and now operate in 10+ countries across 3 continents with a team of 35.

Now we aim to broaden our innovative technology's reach, acquire new customers worldwide, and expand our product line.

About Inarix Product Offer

The Inarix mobile app, Pocket Lab, allows customers to photograph cereals to quickly access information about their quality. Our digital solution offers a radically unique value proposition: it can replace existing solutions, such as hardware or expert laboratory analysis, and greatly expand the scope of what can be measured. This provides new tools to help the agricultural sector address challenges like rapid quality assessment, supply chain optimization, and traceability.

Behind the scenes, we use advanced Deep Learning algorithms to estimate various criteria from images, such as variety, protein level, and the percentage of broken grains. We have what we believe to be the largest database of grain images in the world, which serves multiple purposes: exploration, monitoring, labeling, and training of machine learning models. It is made possible by a state-of-the-art in-house data platform providing services both for our customer-facing products and our R&D team. It ingests tens of thousands of data points and images every day, transforms them, enriches them, and makes them available to all stakeholders.

Job Description

As a data engineer, you will be in charge of the heart of Inarix: making our models run in a heartbeat and keeping data flowing everywhere.

You'll quickly become a key contributor to our data platform, which is used both by our customers and by our research and development team. This means constantly improving the current platform as well as expanding it with the latest technologies by benchmarking, prototyping, and shipping new bricks into production. The platform is currently built on a combination of Python, PostgreSQL, dbt, Dagster, Qdrant, and cloud services (AWS & GCP). You'll have the opportunity to expand and transform these services to support our ambitious growth plan.

You'll play a crucial role in the tools and processes that allow us to:

  • Collect large datasets continuously from various sources, filter, sort, process, store, and redirect data into our training pipelines, R&D experiments, and analytics solutions. Importantly, we expect to leverage AI-agent pipelines to ingest messy data locked in documents and images.

  • Support data access for our R&D team by contributing to our ETL processes (APIs, dbt, PostgreSQL) and our core Python library.

  • Improve and extend our data model, including the addition of new crop quality criteria and traceability data.

  • Expand our data monitoring and data-quality control using pipelines, models, dashboards, alerts, tracing products, etc.
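As a purely illustrative sketch of the kind of ingest-and-validate step described above (all names and fields here are hypothetical, not Inarix's actual data model), filtering a messy raw record might look like:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical record shape -- for illustration only.
@dataclass
class GrainSample:
    sample_id: str
    crop: str
    protein_pct: Optional[float]  # may be missing or unreliable in raw feeds

def clean_sample(raw: dict) -> Optional[GrainSample]:
    """Validate one raw data point; return None if it is unusable."""
    if not raw.get("sample_id") or not raw.get("crop"):
        return None  # reject records missing mandatory keys
    protein = raw.get("protein_pct")
    if protein is not None and not (0.0 <= protein <= 100.0):
        protein = None  # discard out-of-range measurements rather than trust them
    return GrainSample(raw["sample_id"], raw["crop"].lower(), protein)
```

In practice a step like this would live inside an orchestrated pipeline (e.g. a Dagster asset) rather than a standalone function, but the core pattern of validate, normalize, and enrich is the same.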

As you have now understood, this job is as challenging as it is rewarding. We don't expect you to know everything already, and as Inarix evolves, the position will too. You'll have the opportunity to learn a lot and to teach us a lot too.

You'll be managed by our Head of Software and have the opportunity to work with many teams within Inarix.

Working at Inarix

Inarix is a remote-first company: we work mostly in a distributed fashion. We provide the equipment and means for you to work efficiently from home, a co-working space or an internet-connected tree-house. We value this flexibility and the diversity it fosters.

Although we are mainly digitally (ultra) connected, we value and regularly organise in-person meetings: we have company-wide residential seminars quarterly. In addition, you should expect approximately one meeting per month for this position. Most meetings are held in Paris, where we have an office, so your location will affect your travel needs. Travel & accommodation costs to attend our meetings from most French Metropolitan cities will be entirely covered.

If you live outside France, you will need to work within 3 hours of the French time zone and either have French fiscal residency or work through a third-party company; a specific travel package will be negotiated along with your salary. We work in English and therefore strongly encourage non-French speakers to apply.

About the stack

We don't really need sentences here, do we?

Python, PostgreSQL, SQL, dbt, Dagster, count.co, Docker, Google Pub/Sub, Google Filestore, Google BigQuery, AWS S3, Google Cloud Storage, Kubernetes, ArgoCD, New Relic, Azure DevOps, Slack, pytest

Missions

  • Contribute to our data ingestion, transformation, and access layers

  • Contribute to data modeling for new projects & products

  • Contribute to expanding our data platform capability (data warehousing, lineage, monitoring)

  • Interact with our R&D team about the usage of the data platform and how to continuously improve it

Preferred Experience

Must Have

  • 2+ years of experience as a Data Engineer or similar position

  • Working experience with Python (Pandas, NumPy & common libraries)

  • Working experience with SQL

  • Working experience with any cloud platform (AWS, GCP, or Azure)

  • Git, Github, or alternatives

  • Docker, docker-compose

  • English (good written and spoken English)

  • Autonomous, proactive team player with good communication skills

Nice to Have

  • Experience with PostgreSQL, BigQuery

  • Experience with ETL pipelines (dbt, Dagster, Argo Workflows, Airflow, or any other)

  • Experience with NoSQL databases (Elasticsearch, MongoDB, or any other)

  • Worked on an ML-oriented data platform

Recruitment Process

The process consists of 4 interviews and a technical test.

Additional Information

  • Contract Type: Full-Time
  • Location: Paris
  • Full remote possible
  • Salary: between 55000€ and 75000€ / year