Data Anomalies: Why They’re Important & What They Tell Us

By Jan Flatley-Feldman | #EuclidData | 01 June, 2018

If you’re a marketer looking to serve your customers ads, it’s imperative to know which of your store visitors actually are customers, rather than, say, employees. That way you can ensure the best use of your marketing spend on ads and have confidence your ads are served to the right people at the right time. Some location-data technologies are more accurate than others in identifying store visitors, and therefore better at increasing ad effectiveness. For example, using low-precision location data (e.g. GPS data) can result in targeting your employees, neighbors, and passers-by. Euclid’s unique, Wi-Fi-based solution allows us to continuously identify a store’s customers (with their explicit permission) throughout their visit rather than at a single point in time. This allows us to differentiate true customers from passers-by and staff. How do we do this? By identifying data anomalies.

What is an anomaly?

By comparing your customers’ visit data with historical data from the many millions of customer visits in our network, we can examine visit characteristics, such as duration, frequency, and signal strength, to determine which types of sessions typically correspond to customer visits. Our experienced team of data scientists then uses this information (drawn from observing and understanding trillions of data points) to create our classification and prediction models. When a session differs substantially from the norm in terms of time of day, visit duration, or frequency, we recognize it as an anomaly. By classifying an atypical visit as an anomaly, we can remove (or filter) it from our dataset, ensuring the remaining visits provide a more accurate picture of your customers’ true behaviors.
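To make the idea of a session that "differs substantially from the norm" concrete, here is a minimal sketch in Python. It is not Euclid’s production model: the field names (`duration_min`, `visits_per_week`), the sample sessions, and the choice of a median-based "modified z-score" with a cutoff of 3.5 are all assumptions made for illustration.

```python
from statistics import median

def robust_z_scores(values):
    """Modified z-scores built from the median and the median absolute deviation (MAD)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half the values are identical; this feature cannot rank outliers here.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]

def split_anomalies(sessions, features=("duration_min", "visits_per_week"), threshold=3.5):
    """Separate typical sessions from those that are extreme on any one feature."""
    flagged = set()
    for feature in features:
        scores = robust_z_scores([s[feature] for s in sessions])
        for i, z in enumerate(scores):
            if abs(z) > threshold:
                flagged.add(i)
    typical = [s for i, s in enumerate(sessions) if i not in flagged]
    anomalies = [s for i, s in enumerate(sessions) if i in flagged]
    return typical, anomalies

# Hypothetical sessions: five short visits and one eight-hour, five-day-a-week presence.
sessions = [
    {"device": "a", "duration_min": 12, "visits_per_week": 1},
    {"device": "b", "duration_min": 18, "visits_per_week": 1},
    {"device": "c", "duration_min": 25, "visits_per_week": 2},
    {"device": "d", "duration_min": 9, "visits_per_week": 1},
    {"device": "e", "duration_min": 480, "visits_per_week": 5},
    {"device": "f", "duration_min": 15, "visits_per_week": 1},
]

typical, anomalies = split_anomalies(sessions)
print([s["device"] for s in anomalies])
```

The median and MAD are used here instead of the mean and standard deviation because a handful of very long sessions would otherwise inflate the "norm" they are being compared against. A real system would likely score many more features together rather than one at a time.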

What do anomalies show us?

This now-clean data delivers accurate and trustworthy results. For instance, we can reliably see how visit characteristics vary across dimensions such as geographic region, day of week, and time of day. Removing anomalies reveals which visits are tied to customers rather than to employees, or to devices such as printers and Wi-Fi access points that remain in the store far longer than a customer would. This allows us to flag non-customer profiles and create predictive metrics to define and forecast accurate shopper behavior patterns. Euclid simultaneously ingests transaction and marketing campaign data, allowing us to further filter the data and understand which types of visits and campaigns result in conversions. Because of our unique data-driven understanding of customer behavior, Euclid can also identify individuals who are frequently in the area and could potentially be interested in becoming customers.
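Flagging non-customer profiles can be pictured as a simple rule-based pass over each device’s history. The sketch below is illustrative only: the labels, the thresholds (20 hours a day for fixed equipment, 4 hours a day on at least 4 days for staff), and the record fields (`device`, `day`, `hours`) are hypothetical and are not taken from Euclid’s models.

```python
from collections import defaultdict

def label_profiles(sessions, fixed_hours_per_day=20.0, staff_hours_per_day=4.0, staff_days=4):
    """Label each device from its average daily dwell time and the number of days seen."""
    hours_by_device_day = defaultdict(lambda: defaultdict(float))
    for s in sessions:
        hours_by_device_day[s["device"]][s["day"]] += s["hours"]

    labels = {}
    for device, per_day in hours_by_device_day.items():
        days_seen = len(per_day)
        avg_hours = sum(per_day.values()) / days_seen
        if avg_hours >= fixed_hours_per_day:
            labels[device] = "fixed_device"      # e.g. a printer that never leaves
        elif avg_hours >= staff_hours_per_day and days_seen >= staff_days:
            labels[device] = "likely_staff"      # long shifts on most days
        else:
            labels[device] = "likely_customer"
    return labels

# Hypothetical one-week history for three devices.
history = (
    [{"device": "printer", "day": d, "hours": 24.0} for d in range(7)]
    + [{"device": "staff", "day": d, "hours": 8.0} for d in range(5)]
    + [{"device": "shopper", "day": 2, "hours": 0.5}]
)

labels = label_profiles(history)
print(labels)
```

Fixed cutoffs like these are easy to reason about but brittle; they stand in here for whatever learned classification actually separates staff and equipment from shoppers.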

Filtering the data through these various lenses and removing anomalies enhances the accuracy and potency of our clients’ data, and empowers their marketers to better track ROI and attribute store visits and purchases to specific campaigns and ads. Without identifying anomalies, we wouldn’t have an accurate understanding of store visitor data or of the best way to use it to foster brand-customer relationships.



Jan Flatley-Feldman

As a top data scientist here at Euclid, Jan applies his statistical background to analytics and machine learning, contributing to customer solutions and ensuring the dataset is reflected accurately.
