Methodology

Introduction

The Diamond Price List (DPL™), created for the diamond industry, is the result of extensive research involving a large and unique set of data available only to the DPL. The methodology is based on machine learning, artificial intelligence (AI), and data science algorithms, and produces an optimized price list designed to reflect the market as accurately and objectively as possible.

AI algorithms rely on large amounts of quality data to train models that can learn complex functions. The DPL combines such algorithms with access to an extensive set of rich data, which gives it an advantage in building accurate and reliable models.

Scheme Overview

[Figure: methodology scheme overview]

Data Exploration

First, we extensively explored our data on diamond asking prices and detected pricing trends between different cells in the same table and between cells at similar locations across tables. Next, we calculated a final market price representation using supply and demand data and market analysis insights. We then investigated the relationship between asking prices and this market price representation. In particular, we analyzed the discount percentage of a stone’s asking price from its final market price representation: its distribution, its consistency, and the factors affecting it.

Figure 1

Example of a heat map table showing the discount percentage of our platform’s average asking price relative to our market price representation. The darker the color, the lower the average asking price compared with the market price. This kind of analysis enables us to detect in-table and cross-table pricing trends between cells.
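As an illustration only, the per-cell discount analysis behind such a heat map could be sketched as follows. The cell keys (color and clarity), the function names, the sign convention (negative means the asking price is below the market price), and all numbers are assumptions made for this example, not DPL data or code.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Cell = Tuple[str, str]  # hypothetical cell key, e.g. (color, clarity)


def discount_pct(asking_price: float, market_price: float) -> float:
    """Percentage by which an asking price sits above (+) or below (-) the market price."""
    return (asking_price - market_price) / market_price * 100.0


def average_discount_by_cell(
    stones: Iterable[Tuple[Cell, float]],
    market_prices: Dict[Cell, float],
) -> Dict[Cell, float]:
    """Average the per-stone discount percentage within each table cell."""
    totals: Dict[Cell, float] = defaultdict(float)
    counts: Dict[Cell, int] = defaultdict(int)
    for cell, asking_price in stones:
        totals[cell] += discount_pct(asking_price, market_prices[cell])
        counts[cell] += 1
    return {cell: totals[cell] / counts[cell] for cell in totals}


# Made-up example values, not DPL data.
market_prices = {("D", "VS1"): 10000.0, ("E", "VS1"): 8000.0}
stones = [
    (("D", "VS1"), 9000.0),
    (("D", "VS1"), 9500.0),
    (("E", "VS1"), 7600.0),
]
heat_map = average_discount_by_cell(stones, market_prices)
```

Each value in `heat_map` corresponds to one colored cell of the table; plotting is left out to keep the sketch short.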


Preprocessing

Anomaly detection is a data preprocessing step that identifies data points, events, or observations that deviate from a dataset’s normal behavior. Anomalous data can indicate critical incidents, such as a technical glitch, or potential opportunities, such as a change in consumer behavior. Machine learning is increasingly being used to automate anomaly detection.

Through extensive data exploration and advanced anomaly detection methods, we identified the data sources that correlate most strongly with the final price list price, and the conditions under which they do so. The same methods also showed us which noisy data should be filtered out. These insights led us to implement data preprocessing pipelines that ensure only relevant and reliable data is used as input to our models.
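The text does not say which anomaly detection methods are used. As a minimal sketch of the general idea, the example below filters asking prices with a modified z-score based on the median absolute deviation (MAD), a common robust outlier test. The function name, the threshold of 3.5, and the sample prices are assumptions for illustration.

```python
from statistics import median
from typing import List, Sequence


def filter_outliers(prices: Sequence[float], threshold: float = 3.5) -> List[float]:
    """Keep prices whose modified z-score is within the threshold."""
    center = median(prices)
    mad = median(abs(p - center) for p in prices)
    if mad == 0:
        # All deviations collapse to zero; nothing can be ranked as an outlier.
        return list(prices)
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [p for p in prices if abs(0.6745 * (p - center) / mad) <= threshold]


# Made-up asking prices for one cell; 50000.0 plays the role of a data-entry glitch.
asking_prices = [5000.0, 5100.0, 4950.0, 5050.0, 4900.0, 50000.0, 5025.0]
clean_prices = filter_outliers(asking_prices)
```

A median-based score is used here because, unlike a mean and standard deviation, it is not itself distorted by the extreme values it is meant to catch.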

Model - Training

Our model uses state-of-the-art machine learning methods adapted specifically to our application. We modeled the task as a regression problem: for each table cell, the input consists of many variables, such as the average asking price, market analysis, supply and demand, and the values of its neighboring cells. The expected output is the price list price for that cell. During training, we use the market price representation we created earlier as the ground truth. In addition, restrictions and rules are applied to the model output to ensure consistency in the table and adherence to other domain conventions.
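To make the regression framing concrete, here is a deliberately simple sketch that fits an ordinary least squares model from per-cell features to the market price representation. This is a stand-in technique, not the DPL model, which is not specified in the text. The two features (average asking price and a demand-to-supply ratio), the function names, and the data are all hypothetical.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_model(
    features: Sequence[Sequence[float]], targets: Sequence[float]
) -> List[float]:
    """Least-squares weights (intercept first) via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights: Sequence[float], feature_row: Sequence[float]) -> float:
    """Predicted price list value for one cell."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature_row))


# Hypothetical per-cell features: (average asking price, demand-to-supply ratio).
features = [(1000.0, 1.0), (2000.0, 0.8), (3000.0, 1.2), (4000.0, 0.9), (5000.0, 1.1)]
# Synthetic ground truth standing in for the market price representation.
targets = [100.0 + 0.9 * asking + 50.0 * ratio for asking, ratio in features]

weights = fit_linear_model(features, targets)
estimate = predict(weights, (2500.0, 1.0))
```

Because the synthetic targets are exactly linear in the features, the fit recovers the generating coefficients up to rounding error; real market data would not be this clean.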

Figure 2

Sample reference prices vs. DPL average asking price. The straight line (white) represents the asking price multiplied by a uniform factor. The green dots are DPL price list prices and the red dots are the reference prices. DPL’s prices behave more linearly relative to the asking price and are therefore more consistent with market asking prices.


Post Processing

After the model produces its output, postprocessing steps apply restrictions and rules to make sure that price values are valid and consistent within each table and that the price list adheres to market conventions.

Result

The result of the process described above is an up-to-date price list that is based on machine learning algorithms and extensive, unique data, and that is also regulated as needed to meet market conventions. We believe this work will serve the entire diamond market and provide an objective standard for diamond prices.

Update Process


We see the DPL update process as a significant advantage over other methods. Price list updates rely only on real-world data and are therefore designed to reflect market trends objectively and accurately. The data used in the update process is diverse and includes, among other things, asking price changes, supply and demand, and market analysis.

As in the price list creation process described above, all relevant data is preprocessed and fed into a machine learning model that determines the percentage change for each cell in the price list. Postprocessing steps are applied here as well to make sure price values are valid and consistent.

Over the last year, we have continuously monitored data on our platform and conducted studies to further improve our models. Empirically, we consistently see that our price list responds to different market trends more accurately than competing methods.
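The update step described in this section, a per-cell percentage change followed by postprocessing, could look roughly like the sketch below. The consistency rule used here (prices in a row must not increase from left to right, assuming cells are ordered from the highest grade to the lowest) is a hypothetical example of a domain convention, and the numbers are made up.

```python
from typing import List, Sequence


def apply_update(prices: Sequence[float], change_pct: Sequence[float]) -> List[float]:
    """Apply a per-cell percentage change to one row of the price list."""
    return [round(p * (1.0 + c / 100.0), 2) for p, c in zip(prices, change_pct)]


def enforce_non_increasing(prices: Sequence[float]) -> List[float]:
    """Clamp each price so it never exceeds the price of the cell to its left."""
    result: List[float] = []
    for price in prices:
        result.append(price if not result else min(price, result[-1]))
    return result


# Made-up row of prices and model-suggested changes (in percent).
current_row = [10000.0, 9000.0, 8000.0, 7000.0]
suggested_changes = [-2.0, 0.0, 15.0, 1.0]

raw_update = apply_update(current_row, suggested_changes)
final_row = enforce_non_increasing(raw_update)
```

In this example the third cell's suggested increase would lift it above the second cell, so the postprocessing step caps it at the second cell's price.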


Powered by Lucy

All rights are reserved to DPL™ Inc.