Oracle BI Big Strategy. How to make money fast without wearing yourself out. Paweł Płaszczak, Business Intelligence Trends, Warsaw, 20.6.2012. The data gurus: 10 years, 150 projects.
Myth: most people use <10% of their neurons. Fact: most firms use <1% of their data.
The offer: we help you discover information you already have.
1. Data warehousing and Business Intelligence: Information delivery
The data: replaced 1000s of jobs. Key asset: data. Most critical, most vulnerable: security, safety, downtime, response time. Access / WWW, Application, Hardware + OS. Services: Oracle Care, Audit, Optimization, Metrics, Monitoring, Migration.
Architecture diagram, built up step by step: HA (high availability); Disaster Recovery Center (DR); Archive and Backup; Dev and Test; multi-master; ETL, Staging, Warehouse, and four data marts; in-memory; consolidation; cloud.
Closing build of the same diagram, with three labels added: management, Business Intelligence, IT.
Case study: electronics manufacturer. Customer: Tier 1 electronics manufacturer shipping 120,000 units per day, with annual revenue in the billions of dollars. Challenge: data from the production floor must be available immediately, yet delays of minutes to hours were observed. Impact: a planned 30% production increase, enabling 8-digit growth. Picture by Robert Scoble [CC-BY-2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons.
Case study: oil & gas. Customer: the world's no. 4 oil & gas corporation, with operations in over 130 countries and 95,000 employees. Challenge: efficient processing of large seismic data. Where are the undiscovered natural resources? Impact: in oil & gas, seismic processing is needed in all stages of exploration; 10,000-node data centers are not uncommon; in exploration, efficient processing can save $100M per well; reservoir simulations support $1 billion decisions.
Case study: major insurance group
Naive Bayes classifier: prediction of classifier no. 1. Decision tree classifier: prediction probability of classifier no. 1. Conclusion: customer 10387 is highly likely to cause an insurance claim.
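The slide names a Naive Bayes classifier but does not show its features or training data. The following is only a minimal sketch of how such a classifier turns categorical attributes into a claim probability. The feature names, the toy rows, and the function names are hypothetical, and add-one smoothing is an assumption, not something the deck states.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (age_band, vehicle_type) -> did the customer file a claim?
TOY_ROWS = [
    ("young", "sports"), ("young", "sports"), ("young", "sedan"),
    ("senior", "sedan"), ("senior", "sedan"), ("senior", "sports"),
]
TOY_LABELS = ["claim", "claim", "claim", "no_claim", "no_claim", "no_claim"]


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict_proba(model, row):
    """Return P(label | row) for every label, with add-one (Laplace) smoothing."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        p = n / total  # class prior
        for i, value in enumerate(row):
            # Smoothed conditional probability of this value given the class.
            p *= (value_counts[(i, label)][value] + 1) / (n + len(vocab[i]))
        scores[label] = p
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


model = train_naive_bayes(TOY_ROWS, TOY_LABELS)
print(predict_proba(model, ("young", "sports")))
```

In a setup like the one on the slide, the class with the highest probability would be the prediction and the probability itself would be the confidence reported next to it.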
Case study: retail & logistics (Jysk)
Capacity analysis. [Chart: operation time as a function of loading metres; x-axis: number of loading metres (0 to 4,500,000); y-axis: time (0 to 50,000).] [Chart: operation time as a function of the number of stores; x-axis: number of stores (0 to 140); y-axis: time (0 to 50,000).]
Telco case study (1). [Charts: cost per operation; user experience; availability; incidents; capacity.]
Telco case study (2). [Chart: the aggregate risk timeline; y-axis: relative risk rank; series: Business Continuity, Safety, Security.] Legend: severe <120, high <90, medium <60, low <30, trivial <10. [Chart: risk by class (Security, Safety, Business Continuity, Performance); levels: severe, high, acceptable, measured system.] [Chart: RPO and RTO, current vs. required.] [Gauges, 0% to 100%: Transparency, Maturity, Compliance, Performance.]
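The legend's thresholds (trivial <10, low <30, medium <60, high <90, severe <120) amount to a simple banding rule. A minimal sketch of that rule follows. The function name is hypothetical, and treating scores of 120 or more as severe is an assumption, since the legend does not cover them.

```python
from bisect import bisect_right

# Exclusive upper bounds and labels, as read from the slide legend.
THRESHOLDS = [10, 30, 60, 90, 120]
LABELS = ["trivial", "low", "medium", "high", "severe"]


def risk_class(score: float) -> str:
    """Map a relative risk rank to its legend class."""
    if score < 0:
        raise ValueError("score must be non-negative")
    idx = bisect_right(THRESHOLDS, score)
    # Assumption: anything at or above the last bound is still "severe".
    return LABELS[min(idx, len(LABELS) - 1)]


for s in (5, 10, 45, 89.9, 100, 150):
    print(s, risk_class(s))
```

Keeping the bounds in one sorted list means a changed legend only requires editing `THRESHOLDS` and `LABELS`.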
Sales case study (1)
Sales case study (2)
Sales case study (3)
2. Big and Agile Business Intelligence: Information discovery
What is big data? Data that cannot be easily managed with commodity methods.
Inspirational projects. [Timeline: 2000, 2005, 2010; categories: non-structured data, scientific structured data, business structured data.] Names shown: Univ. Amsterdam, Eur. Commission, Leibniz Rechenzentrum, WD, Policja, Warta, Philips, Enea, Żabka, Grid Forum, Total, BAT, BP, CNN, MIMOS, USGS, Ricoh, BigMatters (2009).
A brief history of Big. Yesterday: large volumes, VLDB. Today: unstructured, non-relational. [Timeline: 2000, 2005, 2010; categories: non-structured data, scientific structured data, business structured data.] Data points: CERN: 1 PB/second (usable 25 PB/yr). Metagenomics: 1000s of PB expected. USGS: 2 PB. Astrophysics: 100 PB needed. 1.2 million PB of unstructured content worldwide. 90% of corporate data is unstructured. 2008: AT&T: 50 TB. 2010: Walmart: 2.5 PB. 2013: Ebay: 9 PB.
Big challenge: non-structured or semi-structured; inconsistent, non-relational; various sources; large; dynamic; owned by many.
The anatomy of Big & BI. Stages: text processing, cleansing, integration, warehousing, reports & monitoring, mining, visualisation. Tools shown: Morfologik, Endeca Text Enrichment, Oracle Integrator, Oracle Endeca Information Discovery Integrator, Oracle Warehouse Builder, Oracle Business Intelligence, Oracle BI Publisher, Oracle Endeca Server, Oracle Mining, Rapid Miner, Gephi, Oracle Endeca Studio.
Case study: public security. Picture by Elopde (own work) [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons.
Example 1: data from Allegro. Short duration? Price set too low? Many accounts? A group?
Example 2: loan parameters, borrower parameters.
Discovery: value search. Choice of search interface. Suggestions.
Text data: before stemming, after stemming.
Searching for patterns and atypical behaviour: Allegro, 31 million auction records. Examples found: (1) "I have a photo of a dog for sale and, completely free of charge, I am adding 50,000 in virtual cash at Casino.pl. I send the photo and the free gift within 5 minutes of the payment being booked. A reminder that the cash is the property of netinus Casino.pl." (2) "Officer's hewer of an RAD officer." (3) A pallet of consumer electronics with a market value of 10,000 zł at a starting price of 650 zł; all goods come from Germany. (4) A user offering several dozen auctions: a motorcycle, car rims and tyres, and other car parts sold as garage-sale items.
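One of the patterns above, a starting price far below market value, can be expressed as a simple ratio check. The sketch below is illustrative only: the `Auction` type, the `underpriced` function, and the 25% cut-off are hypothetical, and the deck does not say how the actual screening was done. The first sample row reuses the 650 zł versus 10,000 zł figures from the slide; the second row is made up.

```python
from dataclasses import dataclass


@dataclass
class Auction:
    title: str
    start_price: float   # starting price in zł
    market_value: float  # reference market value in zł


def underpriced(auctions, max_ratio=0.25):
    """Return auctions whose starting price is below max_ratio of market value."""
    return [
        a for a in auctions
        if a.market_value > 0 and a.start_price / a.market_value < max_ratio
    ]


SAMPLE = [
    Auction("pallet of consumer electronics", 650.0, 10_000.0),  # from the slide
    Auction("used bicycle", 900.0, 1_000.0),                     # hypothetical
]

for auction in underpriced(SAMPLE):
    print(auction.title)
```

A check like this only produces candidates for review; the other questions raised in the deck (short duration, many accounts, groups) would need their own signals.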
Decision trees. [Figure; values shown: 85%, 92%, 93%, 100%.]
Subgroup discovery
Gephi Visual analytics
Graphs
Distance: sum of edges. Edge colour: value. Node size: loans taken. Node colour: loans given.
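The node attributes in this encoding (loans taken for size, loans given for colour) can be derived from a plain list of loans before the graph is loaded into a visualisation tool. The sketch below assumes each loan is a (lender, borrower, amount) tuple and that "loans taken" and "loans given" are counts rather than summed amounts; the function name and the sample data are hypothetical.

```python
from collections import Counter


def node_attributes(loans):
    """Compute per-person counts of loans taken and loans given.

    loans: iterable of (lender, borrower, amount) tuples.
    """
    taken = Counter()
    given = Counter()
    for lender, borrower, _amount in loans:
        given[lender] += 1
        taken[borrower] += 1
    nodes = set(taken) | set(given)
    return {
        n: {"loans_taken": taken[n], "loans_given": given[n]}
        for n in nodes
    }


# Hypothetical sample: A lends to B and C, B lends to C.
SAMPLE_LOANS = [("A", "B", 100.0), ("A", "C", 50.0), ("B", "C", 20.0)]

for name, attrs in sorted(node_attributes(SAMPLE_LOANS).items()):
    print(name, attrs)
```

The amount is kept in the tuple because, per the slide, it drives the edge colour; it is not used for the node attributes here.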
Analysis of social connections
Big opportunity: information discovery, agile Business Intelligence. Marketing: sentiment analysis. Telco: cross-selling. Finance: insurance pricing, fraud. Web: paid content. Public: security, crime. Sales: lead generation. General: client behavior monitoring.
Beat your competitors: use 100% of your neurons, use 100% of your information.
Thank you. Paweł Płaszczak, pawel@gridwisetech.com, 508 378 895. We help you discover information you already have.