As a data engineer, you will be in charge of the heart of Inarix: making our models run in a heartbeat and keeping data flowing everywhere.
You'll quickly become a key contributor to our data platform, which is used both by our customers and by our research and development team. This means constantly improving the current platform as well as expanding it with the latest technologies: benchmarking, prototyping, and shipping new bricks to production. The platform is currently built on a combination of Python, PostgreSQL, dbt, Dagster, Qdrant, and cloud services (AWS & GCP). You'll have the opportunity to expand and transform these services to support our ambitious growth plan.
You'll play a crucial role in the tools and processes that allow us to:
Collect large datasets continuously from various sources, filter, sort, process, store, and redirect data into our training pipelines, R&D experiments, and analytics solutions. Importantly, we expect to leverage AI-agent pipelines to ingest messy data locked in documents and images.
Support data access for our R&D team by contributing to our ETL processes (APIs, dbt, PostgreSQL) and our core Python library.
Improve and extend our data model, including the addition of new crop quality criteria and traceability data.
Expand our data monitoring and data-quality control using pipelines, models, dashboards, alerts, tracing products, etc.
As you have now understood, this job is as challenging as it is rewarding. We don't expect you to know everything already, and as Inarix evolves, the position will too. You'll have the opportunity to learn a lot, and to teach us a lot too.
You'll be managed by our Head of Software and have the opportunity to work with many teams within Inarix.
Working at Inarix
Inarix is a remote-first company: we work mostly in a distributed fashion. We provide the equipment and means for you to work efficiently from home, a co-working space, or an internet-connected tree-house. We value this flexibility and the diversity it fosters.
Although we are mainly connected digitally, we value and regularly organise in-person meetings: we hold company-wide residential seminars quarterly. In addition, you should expect approximately one meeting per month for this position. Most meetings are held in Paris, where we have an office, so your location will affect your travel needs. Travel and accommodation costs to attend our meetings from most cities in Metropolitan France will be entirely covered.
If you live outside of France, we will require you to work within 3 hours of the French time zone; we also require you to have French fiscal residency or to work through a third-party company. A specific travel package will be negotiated along with your salary. We work in English and therefore strongly encourage non-French speakers to apply.
About the stack
We don't really need sentences here, do we?
Python, PostgreSQL, SQL, dbt, Dagster, count.co, Docker, Google Pub/Sub, Google Filestore, Google BigQuery, AWS S3, Google Cloud Storage, Kubernetes, ArgoCD, New Relic, Azure DevOps, Slack, pytest
Missions
Contribute to our data ingestion, transformation, and access layers
Contribute to data modeling for new projects & products
Contribute to expanding our data platform capabilities (data warehousing, lineage, monitoring)
Interact with our R&D team about the usage of the data platform and how to continuously improve it