Could a change we’ve shipped unintentionally break something else without us noticing? If a metric we track starts reporting incorrect values, how quickly can we spot it?
Within a dynamic working environment like Adevinta’s, it’s not uncommon for even the slightest tweak in code to ripple through our metrics, causing unexpected disruptions. Given that we meticulously monitor over 140 metrics, it’s naturally challenging to catch every fluctuation. Consequently, deviations, especially in pivotal metrics, might only come to light well after the initial disturbance, casting shadows on our reporting and experimentation efforts.
This was the case when we encountered a marked decline in the Leads metric on the iOS platform. This wasn’t just a minor dip; we’re talking about a staggering drop of approximately 100K leads daily. To put this into perspective, that’s a negative impact on daily revenue of about 41K euros.
Thanks to the alerting and dashboard system from Artemis, we were able to detect the drop quickly, and resolve it in less than a day.
Who are we?
We are a passionate group of analysts at Adevinta Benelux, focused on data quality. We started Artemis as a side project in Q3 2023. As the main source of quantitative insights in our cross-functional product teams, we are heavily relied on to identify and resolve data quality issues. Yet access to clean data, and to the insights that follow from it, is key to the success of the team as a whole, not just the analysts. So we started asking ourselves: How might we encourage ownership of data quality across the whole team? How can we identify issues and resolve them in a timely way? Driven by an Adevinta-wide push for improving data quality, we set out to create a tool that is easy to use, managed centrally, and prompts data-driven thinking.
Introducing Artemis
Artemis, at its core, is about turning the data we navigate daily into actionable insights, ensuring our marketplaces like Marktplaats and others within Adevinta remain at the pinnacle of operational excellence. Its vigilance follows a cyclical path – from observing deviations in our metrics to alerting our teams to instigating a swift investigation and steering towards a solution.
When there is an observed issue, we send customised alerts on Slack, making sure to tag the right team, indicating the platform that’s flagged and the metrics that are affected.
This helps teams to swiftly address the problem; pinpointing the core issue and releasing a fix. This ability to respond rapidly underscores the tool’s indispensable role in maintaining the smooth operation and resilience of our systems.
Artemis isn’t just another tool; it’s a paradigm shift in how we approach data health within our ecosystems.
The Heart of Artemis: Predictive Analysis and Proactive Health
You might be wondering: How exactly does Artemis work? How accurate are the alerts?
Predicting with Prophet
At the helm of Artemis’ predictive prowess is Prophet, the open-source software released by Meta’s Core Data Science team.
Prophet is a tool for forecasting time series data based on an additive model. Non-linear trends are fit with yearly, weekly and daily seasonality, as well as holiday effects. It works best with time series that have strong seasonal effects. Prophet is robust to missing data and shifts in the trend and typically handles outliers well.
The model enables Artemis to look into the future, predicting the upper and lower bounds of our key metrics, ensuring we’re always a step ahead.
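To make this concrete, here is a minimal sketch of how upper and lower bounds can be obtained from Prophet's public Python API. It is an illustration under assumptions, not the Artemis codebase: the function names `forecast_bounds` and `is_within_bounds`, the input shape and the default model settings are all hypothetical. The third-party `pandas` and `prophet` packages are imported inside the function, so they are only needed when a forecast is actually run.

```python
def forecast_bounds(history, horizon_days=40):
    """Fit Prophet on daily (date, value) pairs and return the forecast rows.

    `history` is a hypothetical list of (datetime.date, float) tuples.
    Requires the third-party `pandas` and `prophet` packages.
    """
    import pandas as pd
    from prophet import Prophet

    # Prophet expects two columns: `ds` (timestamp) and `y` (value).
    df = pd.DataFrame(history, columns=["ds", "y"])
    df["ds"] = pd.to_datetime(df["ds"])

    model = Prophet()  # default settings; any tuning is left out of this sketch
    model.fit(df)

    # Extend the history by `horizon_days` daily rows and predict over all of them.
    future = model.make_future_dataframe(periods=horizon_days)
    forecast = model.predict(future)

    # yhat_lower / yhat_upper are the edges of Prophet's uncertainty interval.
    columns = ["ds", "yhat", "yhat_lower", "yhat_upper"]
    return forecast[columns].tail(horizon_days)


def is_within_bounds(actual, lower, upper):
    """True when an observed value falls inside the predicted range."""
    return lower <= actual <= upper
```

An observed value can then be compared against the `yhat_lower` and `yhat_upper` of the matching date to decide whether it counts as a deviation.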
Optimising operations and model execution
The model operates within the Databricks scheduling ecosystem and its codebase is securely housed in a GitHub repository. This setup not only grants us oversight of versioning history but also harmonises the execution schedule within an AWS-compatible framework.
To streamline the model’s operations and maintain its autonomy, metric names and definitions are dynamically sourced from a JSON file curated by the Adevinta Benelux teams. This system simplifies the process of modifying the tool’s metric roster, eliminating the need for manual adjustments and enhancing our agility in responding to evolving data landscapes.
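As an illustration of the idea, a metric roster kept in JSON could be loaded and validated along these lines. The schema is an assumption made for this sketch: the field names (`name`, `platform`, `owner_team`, `definition`), the sample entries and the `load_metrics` helper are hypothetical and do not reflect the actual Artemis file.

```python
import json

# Hypothetical roster; field names and values are illustrative only.
METRICS_JSON = """
[
  {"name": "leads", "platform": "ios", "owner_team": "team-a",
   "definition": "count of lead events per day"},
  {"name": "ad_views", "platform": "android", "owner_team": "team-b",
   "definition": "count of ad page views per day"}
]
"""

REQUIRED_FIELDS = {"name", "platform", "owner_team", "definition"}


def load_metrics(raw):
    """Parse the metric roster and fail early on incomplete entries."""
    metrics = json.loads(raw)
    for entry in metrics:
        missing = REQUIRED_FIELDS - set(entry)
        if missing:
            name = entry.get("name", "?")
            raise ValueError(f"metric {name} is missing {sorted(missing)}")
    return metrics


metrics = load_metrics(METRICS_JSON)
for metric in metrics:
    print(metric["name"], metric["platform"], metric["owner_team"])
```

With a layout like this, adding or retiring a metric is a one-entry change to the JSON file, and the validation step catches incomplete entries before a scheduled run starts.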
For optimal efficiency and cost-effectiveness, the tool processes only the most recent day’s metric data, which is then stored in a partitioned table within S3. The model is retrained daily on an extensive historical dataset, excluding the latest 10 days to limit the impact of any recent anomalies, and forecasts the next 40 days.
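The date arithmetic behind this setup can be sketched as follows. One point is an assumption on our side: we read the 40-day forecast as starting the day after the training cut-off, so that it spans the 10 excluded days plus 30 days ahead. The function and constant names are hypothetical.

```python
from datetime import date, timedelta

EXCLUDE_RECENT_DAYS = 10  # most recent days left out of training
FORECAST_DAYS = 40        # forecast horizon


def training_and_forecast_windows(history_start, today):
    """Return ((train_start, train_end), (forecast_start, forecast_end)).

    All dates are inclusive. Assumes the forecast starts the day after
    the training cut-off (an interpretation, not a confirmed detail).
    """
    train_end = today - timedelta(days=EXCLUDE_RECENT_DAYS)
    forecast_start = train_end + timedelta(days=1)
    forecast_end = train_end + timedelta(days=FORECAST_DAYS)
    return (history_start, train_end), (forecast_start, forecast_end)


train, forecast = training_and_forecast_windows(date(2022, 1, 1), date(2024, 3, 11))
print("train:", train[0], "to", train[1])
print("forecast:", forecast[0], "to", forecast[1])
```

Under this reading, the most recent days are always compared against a forecast that never saw them, which is what keeps a fresh anomaly from being absorbed into the model's own baseline.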
To elevate the model’s performance, we’ve fine-tuned it with some manual adjustments:
1. Mitigating Outliers: Any data point that strays beyond 3 standard deviations is tempered: it is replaced by the average of the previous 28 days. This strategy significantly reduces the bias that temporary data issues (e.g. a sudden metric drop) would otherwise introduce into the model’s future performance.
2. Selective Historical Data Curation: In instances of substantial transitions (such as a major change on the ad page), we manually exclude the affected pre-transition historical data from the data the model is trained on. Though this method reduces our data history, it has demonstrably sharpened the model’s predictive accuracy.
3. Enhanced Predictive Buffer: Despite Prophet’s design for precision forecasting, our primary objective leans towards identifying metric disruptions. To this end, we’ve opted to increase the predicted boundaries by 10%. This adjustment aims to dial down the rate of false positives, even if it slightly compromises the model’s accuracy.
4. Relevant Alerts: When a metric is out of range for three consecutive days, an alert is posted to a dedicated Slack channel and directly tags the responsible team, ensuring a rapid and effective response.
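Three of these adjustments (outlier tempering, the 10% buffer and the three-day rule) are simple enough to sketch in plain Python. The sketch rests on assumptions we want to flag clearly: deviations are measured against the mean of the whole series, "increasing the boundaries by 10%" is read as scaling each bound outward for a non-negative metric, and all function names are hypothetical. It illustrates the rules as described, not the production implementation.

```python
from statistics import mean, stdev


def temper_outliers(values, window=28, n_sigma=3.0):
    """Replace points more than `n_sigma` standard deviations from the
    series mean with the average of the previous `window` (cleaned) points."""
    mu, sigma = mean(values), stdev(values)
    cleaned = list(values)
    for i, value in enumerate(values):
        if sigma > 0 and abs(value - mu) > n_sigma * sigma:
            previous = cleaned[max(0, i - window):i]
            if previous:  # the very first point has nothing to fall back on
                cleaned[i] = mean(previous)
    return cleaned


def widen_bounds(lower, upper, buffer=0.10):
    """Scale each bound outward by `buffer` (assumes a non-negative metric)."""
    return lower * (1 - buffer), upper * (1 + buffer)


def should_alert(actuals, bounds, days=3):
    """True when the last `days` observations all fall outside their bounds."""
    if len(actuals) < days:
        return False
    recent = list(zip(actuals, bounds))[-days:]
    return all(not (low <= value <= high) for value, (low, high) in recent)


# A flat series with a one-day collapse: the collapse is smoothed away.
series = [100.0] * 30 + [0.0] + [100.0] * 5
cleaned = temper_outliers(series)
print(cleaned[30])

# Widened bounds, then the three-day rule on the latest observations.
low, high = widen_bounds(100.0, 200.0)
print(should_alert([150.0, 50.0, 40.0, 30.0], [(low, high)] * 4))
```

Requiring several consecutive out-of-range days trades detection speed for fewer false alarms; a single bad day shows up on the dashboard but does not page a team.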
Navigating the Future with Artemis
Artemis embodies the principles of the data mesh architecture. Data mesh is a conceptual framework that focuses on four key pillars, which Artemis not only adheres to but brings to life within our operations.
1. Domain-Oriented Distributed Architecture: Artemis is built with a clear understanding of the distinct domains it serves. It decentralises data ownership and architecture, ensuring that data is treated as a product. This means that each domain within Adevinta Benelux can manage and govern its own data landscape.
2. Product Thinking (Data as a Product): With Artemis, data is not a byproduct of our operations; it is the product. This mindset shift is crucial, as it focuses on creating data products that deliver value to both internal stakeholders and external customers. Artemis supports data that is accessible, reliable and effectively used to drive decision-making processes across all levels of Adevinta.
3. Self-Service Infrastructure as Platforms: Artemis champions the principle of self-service by enabling teams to interact with data without the need for specialised intermediaries. By offering a platform that teams can use independently, Artemis facilitates a more dynamic and agile approach to data quality.
4. Federated Computational Governance: Artemis supports the principle of federated governance. While the autonomy of domains is respected, Artemis provides a governance model that ensures data quality across the entire organisation.
The journey of Artemis is a testament to the power of the data mesh approach. It sets the stage for a future where every strategic move is informed by high-quality data.
In Conclusion: A New Era for Adevinta
As we continue to expand the reach of Artemis, we are deeply grateful for the unwavering support provided by the product teams throughout this process. We look forward to continued collaboration and development.
Artemis marks the dawn of a new era in data analytics at Adevinta. It’s a testament to our commitment to innovation, our dedication to data health and our vision for a future where every piece of data empowers us to create better experiences for our users across the globe.