Dw allegro alain ozan.


Published in: Education

  1. Allegro’s DWH Implementation on Oracle Database Machine with OWB & OBIEE. Rafał Kudliński, BI Manager, Allegro Group
  2. Do you really know Allegro? Allegro Group operates market-leading e-commerce trading platforms; general, automotive and real estate classified sites; a price comparison site; and payment web services across Eastern Europe under various brands. Allegro Group operates 46 platforms in 13 countries.
  3. Do you really know Allegro? Allegro is the most successful e-commerce trading system in Poland and the largest non-eBay auction platform worldwide. More than 1000 employees: 250 in the IT department (150 in the development division and 100 in other IT divisions: IS, DWH, R&D, etc.). 14.5 million users: over 9500 new users every day and over 3.5 million new users every year. More than 500 million page views (peak): the number of page views has doubled within the last 3 years. More than 90 million listed items: the number of listed items has increased by 75 million over the past 3 years.
  4. Do you really know Allegro? Our figures are extraordinary in all areas. We use leading-edge technology wherever possible. The power consumption of our data centers exceeds that of 800 average households combined!
  5. Business requirements (Priority 1). We decided that measuring the performance of our internet services is most important for our business growth. It covers both the services and the users. We have to really understand what is happening on our websites and who our most valuable users are. What is our internet services performance? We need to play with data, simulate, and influence growth; users expect quality and we expect more sales; we need to measure the categories; we need to know whether our pricing model is optimal; we need to know when growth is slowing down; what influences the auction success rate in different categories?; we need to know our refunds and fees; we need to analyse the methods of payment; for auctions on the front page, we need to analyse the conversion rate and what drives it; we want to measure the effect of changes in the categories; we need to benchmark countries; which products are sold most frequently, new or used?; what is the source of our profit?; what is the result of a search, and what does the user do after that?; we need to know where users click and what they do on our site.
  6. Business requirements (Priority 2 and 3). Operational performance and marketing campaign effectiveness are also crucial to our business. Co-brand and affiliate programs, as valuable sources of traffic and new user registrations, have to be carefully monitored as well. What is our operational performance? We need to do our job faster, better and more efficiently; we need to know which projects to carry out and which are profitable; we need to measure activity. What is our marketing campaign performance? We need to measure the campaigns; we need to know the effect of each marketing action and the sources of traffic; we need to track the results of our spending. What is our co-brand performance? What are the sources of new registrations? What is our affiliate performance? We need to know the impact of the affiliation program. What is the performance of the products offered? We need to measure products.
  7. Agenda
  8. Projects in Numbers. The first project took 6 months to complete, with 8 people working on it. Support from external companies was necessary because a new technology and new software were being implemented. Project duration: 12 months; project team: 8 – 12 people; man-days spent: 800; active users: 120; implemented reports: 100; implemented KPIs: 160; biggest source system size: 7 TB; largest tables: 2.8 billion records.
  9. Data warehouse architecture. We load data from a real-time copy of the production system. Extraction and transformation processes load the data into the DWH production schema. Finally, aggregations are built to improve query performance. We use OWB as the ETL tool. [Diagram labels: Production Environment (2 × IBM P590 machines): Allegro Production Oracle DB, Oracle Data Guard, Logical Standby Oracle DB; Click Stream recording Environment (10 × DL360 machines): MySQL DB 1 .. DB 10; Data Warehouse Environment (Oracle Database Machine): Load, DWH Staging Area Oracle DB, ETL (OWB), DWH Production Oracle DB, DataMart Oracle DB.]
  10. Allegro DWH & BI system architecture. We use Oracle Business Intelligence Enterprise Edition (OBIEE) as the BI tool. OBIEE is connected to both the Target and DataMart schemas. We have almost 120 active users; 10 power users run ad-hoc queries. [Diagram labels: Interactive Dashboards; Ad-hoc Analysis; Delivers and Alerts; MS Office Plug-in; Oracle BI Server; DWH Production Target; DWH Production DataMart; Transaction Platforms; Other Systems; Data Warehouse Environment; Oracle Database Machine; Oracle 11g.]
  11. Allegro Performance KPIs. The BI Portal presents the most important reports and KPIs describing the performance of our major auction platforms in all countries where we operate. It shows information about open auctions, registered users, bids, sales and charges.
  12. Auction Category Analysis. Product managers can analyze a number of measures by drilling down in the product category tree. They can filter data by auction type or seller type.
  13. IT Department KPIs. The BI Portal also contains information about IT department performance. Managers can see current budget realization, SLA, traffic and the status of the most important current IT projects.
  14. Click Stream analysis. We deliver information about the number of clicks grouped by users, user locations, services, scripts and, most importantly (not available yet), product categories. Users can drill down to detailed information.
  15. Lesson 2 – ETL Processing – DB Machine does it all. The DB Machine is very efficient in all types of ETL processing. We parse, clean, merge and join almost 500 million records each day. A number of aggregation tables are calculated and refreshed.
  16. Lesson 3 – Data Architecture – Standard but Improved. As usual, to get excellent performance you have to think about partitioning, compression and parallel query execution. No indices are used, so full table scan performance needs to be considered. We use a standard star schema with a collection of fact tables; our largest fact tables have almost 3 billion records (billings, clicks); our largest dimension tables have more than 15 million records (users, locations); no indices: there is no need, and in some cases using them was even worse; Full Table Smart Scan works very efficiently; we heavily use partitions (days, months) and sub-partitions (attributes); compression saves space (avg. 30%) and improves performance; parallel query execution with /*+parallel(table,8,3)*/ works very well, with an average query execution time improvement of 10×.
  17. Lesson 4 – User Access – fast and reliable. Even with the DB Machine, aggregation tables are needed to achieve the required user interface performance. What you get is a scalable and fast aggregation and reporting environment. We create a number of aggregation tables to avoid joins between million-record tables; reports and dashboards are delivered within seconds (<5 s) even with >100 users working; OBIEE works very well with the DB Machine, especially for reporting and dashboarding; the OBIEE ad-hoc Answers application is very powerful, but some users still need to use SQL to get what they want (automatically generated queries are not tuned to use all DB Machine features).
  18. Lesson 5 – DB Administration – no complexity. The Exadata storage server brings additional value but not many additional tasks. It can be handled by a DBA without any special skills. Just a typical RAC environment; fully integrated with Grid Control, with the storage cells monitored by a dedicated plug-in; distributed command execution; easy storage layer administration: replace or create a disk or disk group with no more than 3 commands; comprehensive command shell on a storage cell; additional hardware and software components are needed for integration with the SAN backup environment; self-monitoring storage layer with email notifications.
  19. Lesson 7 – External support – helps and speeds up. Support from Oracle and experienced external consultants is necessary for a successful DW & BI implementation in a new environment. Business Needs: the Business Discovery (performed with the help of Oracle Consulting) was very valuable for prioritizing business requirements (Jamal El Faiz). ETL Process: experience in massive data processing from ISE (Igor Michaljow). OBIEE: expertise in building robust reports and dashboards from Oracle Consulting (Alessandro Sabelli, Małgorzata Baran, Marzena Krzanowska). DB Machine administration: some initial configuration was done by Oracle; support from the RAC PAC team (best practices, service requests); update patches are frequently released; an experienced internal DBA is crucial (Wojciech Semenowicz).
  20. Next Steps and Outlook. Right now we are working on loading Click Stream data into the DWH. We have more than 400 million page views every day. In the next few months, data from our payment and classified services will be loaded. [Timeline diagram: PAST, PRESENT, FUTURE]
  21. Recommendations. Think big, act small, and search for the right staff before you start; plan and manage the project carefully; have the right sponsor and support from the business side; Oracle Database Machine is definitely the right choice; at the beginning, support from Oracle Consulting is crucial; stand for Information Democracy in your company.
  22. Q&A
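
Lesson 3 (slide 16) relies on day and month partitions, full scans of only the relevant partitions, and parallel query execution instead of indices. The following is a minimal Python sketch of that idea, not Allegro's or Oracle's implementation: the class `DayPartitionedTable`, its methods and the sample rows are all hypothetical, and Python threads here only illustrate the structure of a parallel scan, not the speedup reported in the talk.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from datetime import date


class DayPartitionedTable:
    """Hypothetical in-memory table partitioned by day (illustration only)."""

    def __init__(self):
        self._partitions = defaultdict(list)  # day -> list of row dicts
        self.partitions_scanned = 0           # how many partitions the last scan touched

    def insert(self, day, row):
        self._partitions[day].append(row)

    def scan(self, start, end, predicate, degree=8):
        # Partition pruning: only partitions inside [start, end] are read at all.
        wanted_days = sorted(d for d in self._partitions if start <= d <= end)
        self.partitions_scanned = len(wanted_days)

        def scan_one(day):
            # Full scan of a single partition, no index involved.
            return [row for row in self._partitions[day] if predicate(row)]

        # Scan the surviving partitions with up to `degree` workers.
        # pool.map returns results in the order of wanted_days.
        with ThreadPoolExecutor(max_workers=degree) as pool:
            chunks = list(pool.map(scan_one, wanted_days))
        return [row for chunk in chunks for row in chunk]


if __name__ == "__main__":
    clicks = DayPartitionedTable()
    clicks.insert(date(2010, 3, 1), {"user": "a", "clicks": 3})
    clicks.insert(date(2010, 3, 2), {"user": "b", "clicks": 7})
    clicks.insert(date(2010, 3, 3), {"user": "c", "clicks": 9})
    busy = clicks.scan(date(2010, 3, 2), date(2010, 3, 3), lambda r: r["clicks"] > 5)
    print([r["user"] for r in busy], clicks.partitions_scanned)
```

The point of the sketch is that a date-bounded query never touches partitions outside its range, which is why full scans can stay fast on very large fact tables.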
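
Lesson 4 (slide 17) describes aggregation tables built so that reports do not have to join very large tables at query time. A minimal sketch of that pattern, using Python's built-in `sqlite3` in place of the Oracle Database Machine, might look as follows. All table names, column names and sample rows (`fact_sales`, `dim_category`, `agg_sales_by_day_category`) are assumptions made for illustration and do not come from the presentation.

```python
import sqlite3


def build_demo(conn):
    """Create a tiny star schema: one fact table and one dimension table."""
    cur = conn.cursor()
    cur.execute("CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("CREATE TABLE fact_sales (sale_day TEXT, category_id INTEGER, amount REAL)")
    cur.executemany(
        "INSERT INTO dim_category VALUES (?, ?)",
        [(1, "Books"), (2, "Electronics")],
    )
    cur.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [
            ("2010-03-01", 1, 10.0),
            ("2010-03-01", 1, 15.0),
            ("2010-03-01", 2, 200.0),
            ("2010-03-02", 2, 50.0),
        ],
    )
    conn.commit()


def refresh_aggregate(conn):
    """Rebuild the aggregation table; the join and GROUP BY run once, at load time."""
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS agg_sales_by_day_category")
    cur.execute(
        """
        CREATE TABLE agg_sales_by_day_category AS
        SELECT f.sale_day,
               c.name        AS category,
               COUNT(*)      AS sales_count,
               SUM(f.amount) AS total_amount
        FROM fact_sales f
        JOIN dim_category c ON c.category_id = f.category_id
        GROUP BY f.sale_day, c.name
        """
    )
    conn.commit()


def report(conn, day):
    """Dashboard-style query: reads only the small aggregation table, no join."""
    cur = conn.execute(
        "SELECT category, sales_count, total_amount "
        "FROM agg_sales_by_day_category WHERE sale_day = ? ORDER BY category",
        (day,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_demo(conn)
    refresh_aggregate(conn)
    print(report(conn, "2010-03-01"))
```

The trade-off is that the aggregate has to be refreshed after each load, which matches the ETL step on slide 15 where aggregation tables are calculated and refreshed.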