Skip to content

Sample datasets for pricing analysis

I give practical examples of data analysis and share the code that performs these analyses. The analysis I share in these posts require a predefined data set. In this page, I share the sample data set (available for download for free) and explain this data set so that you can run your own analyses.

We can benefit from endless sources of data to run pricing analysis. In order to keep it simple, I chose the essential data sources that you can come across in any organization, regardless of the ERP system the organization uses. This way, we have a common denominator to be able to run these analyses anywhere.

List of data sources

The bare minimum

  • Invoice line data: The transactional data of a sales that is performed is recorded in invoice databases. By choosing invoice data over booking data (or quotes data), we make sure that we take into account only realized sales according to the accounting rules followed by the company.


  • Customer master data: If you like to have your analysis broken down by regions, customer groups, customer types, channels etc. you will need to have a mapping of those. Usually these are kept in customer master data. In smaller organizations, where customers are not grouped in ERP systems, this data might be kept by the FP&A team for reporting. I would definitely start with the FP&A team to trace this data.
  • Product master data: Same as above – if you like to break down your analysis into product groups, products families etc., you will need that mapping.


  • FX Rates per year: If you’re working on multi-currency invoice data sets, you will need a single reporting currency to aggregate your results.
  • Price lists: If you like to calculate deviations / discounts over price lists, you need this data
  • Cost data: if you like to calculate margins and run profitability analysis, you need at least standard cost data files.
  • Kit components: If certain SKUs are bundles of others and you need to analyze the prices, you need the kit-to-component mapping tables.

Sample Data sets for download

Table definitions

Column NameCriticalityData TypeDescription
Invoice dateBare minimumDateThe date the invoice was issued
Invoice NoEssentialInteger / floatIt helps share results with customer care team (i.e. on invoice xx there was this pricing, why?)
SKU CodeBare minimumDepending on your SKU coding, generally varcharThe unique product code
Customer codeBare minimumGenerally integerBill-to customer code (a customer might have multiple ship-to codes as well, make sure pick the one customer code that represents the entirety of a customer)
QuantityBare minimumGenerally integerNumber of units sold on the particular invoice line
RevenueBare minimumDecimal / DoubleAmount charged to customer in that invoice line

Customer Master

Column nameCriticalityData typeDescription
Customer No  Refers to the customer no in invoce data
Customer NameEssentialVarchar / TextHelps read the reports easier
ChannelEssentialVarchar / TextDirect, distribution, online, wholesaler, VAR, authorized dealer etc.
RegionEssentialVarchar / TextCountry, region etc.
Sales RepEssentialVarchar / TextIt makes your conclusions more actionable if you can break it down by sales rep

Product Master

Column nameCriticalityData typeDescription
SKU Code  Refers to the SKU code in invoice data table
SKU DescriptionEssentialVarchar / TextHelps read the reports easier, if the SKU codes are not as representative of the products
Product Group (Type)EssentialVarchar / TextIs this a computer, a printer, a mouse etc.
Product Family (Series)EssentialVarchar / TextIs this a Intel i3, i5 or i7

I will run the examples on these 3 files. I have a MSSQL database where I uploaded these datasets. I chose MSSQL (Microsoft SQL Express), because it is free, you can install it on almost any PC, there are vast resources online if you come across errors, and it can manipulate very large data sets. I analyze millions of lines on a standard desktop, which is not bad for industrial environment.

Leave a Reply

Your email address will not be published. Required fields are marked *