Data Warehouse in Medium e-Commerce Company
Need for First Company Data Warehouse
Running a company may be seen as repeatable process of making business decisions. Let’s consider medium size e-commerce company. Obviously, there are different departments and processes within. As we may expect, there is not single application to cover all of needs and expected usability for organization. In this article, I would like to cover how to connect data into single source, allowing conformed data elements that allows BI applications efficiently visualize organization data, and finally to make accurate decisions on top of it. That’s is the underlying story for setting up a mature Data Warehouse (DWH) in any company.
There are few reasons why companies start building DWH. When company is growing, it is starting to produce more and more data in new areas of interests. Main factor then is to separate daily operations from analytics. This will free core systems from performing large scale reporting. DWH then will run, right after business hours, main ETL process (Extract, Transform and Load) to fed up data into own storage. This will not affect performance of operational systems when employees are using it. Since data feeding process will end, data warehouse has full night to prepare data for analytics and reporting. This allows developers to slice and dice data with very deep dive into data. With proper storage of facts and dimensions, BI applications will be free to go with most complex reports without paying in processing time. Moreover, growing company also start to introduce another operational systems when new processes starts. Data warehouse should be ready to add another source into ETL process. Therefore, Architect role is to chose suitable model (Kimball vs Inmon, Star vs Snowflake, etc.) to allow balance between flexibility and scalability.
Although, DWH may be seen as back up for company operational data, I will not cover that in this article. Moreover, there is also problem when DWH encounter multiple data source and how to conform it into single point of truth. Let me skip those topics, I will cover them in another articles.
E-commerce company is focusing mostly on sales: retail (B2C) or B2B. Sales is core for e-commerce company, but there are multiple other business areas that add value, or even give competitive advantage. Let’s see sample structure of sales e-commerce company:
ERP vs Data Warehouse
Company could consider implementing holistic ERP system. In real world examples, ERP is mostly introduced when manufacturing process is part of the business. In other case when products are purchased from another organizations, ERP is replaced by CRM (as vendors in that case are similar to customers). So, CRM in our example is first data source system. Then, there is Logistic and Warehouse for products. We assume that some products are imported, and some of them is good to have in stock, as this will give us more flexibility and credibility as supplier for B2B. This process will be in another smaller system, tailor made for keeping low cost of rotation and stock. Another part is Accounting and Finance, where Accounting has own structures. This will be connected with CRM to generate invoices and send it to clients. Then, there is also Analytics, that build data warehouse for BI system, and deliver data to Finance, Marketing and Management (Top Managements and Operational Leaders as well). All parties are somewhere as clients or suppliers for Enterprise DWH.
Below, there is showed simple diagram of data flow with major break points.
Conclusion and next steps
Building DWH in our example company structure, will definitely encounter problem of joining data from different sources. We have three systems, but DWH should present dimensional model across all of them. This will start conformance process. This topic will be covered in another article. There is also data quality issue, that has been describe in article: Data Quality in Company. Although, DWH were described just as an idea, I will finish it here. You need to wait for more technical blog to come.
Author
Grzegorz Gruszka
Many companies see 'constant change' as baseline motto for process optimization, but people like me are those who translate that motto into action points. There is huge values in data, but my goal is to see data as an organizm, because understanding each small part does not replace big picture of organizm linkage to ecosystem. Data are part of our World. Let us start change it based on deep discoveries in data.