Observe-Analyze-Act Paradigm for Storage System Resource Arbitration

Podobne dokumenty
Metodyki projektowania i modelowania systemów Cyganek & Kasperek & Rajda 2013 Katedra Elektroniki AGH

Zarządzanie sieciami telekomunikacyjnymi

Proposal of thesis topic for mgr in. (MSE) programme in Telecommunications and Computer Science

Machine Learning for Data Science (CS4786) Lecture11. Random Projections & Canonical Correlation Analysis

Usługi IBM czyli nie taki diabeł straszny

Linear Classification and Logistic Regression. Pascal Fua IC-CVLab

Datacenter - Przykład projektu dla pewnego klienta.

Systemy wbudowane. Poziomy abstrakcji projektowania systemów HW/SW. Wykład 9: SystemC modelowanie na różnych poziomach abstrakcji

Cel szkolenia. Konspekt

DM-ML, DM-FL. Auxiliary Equipment and Accessories. Damper Drives. Dimensions. Descritpion

Dlaczego my? HARMONOGRAM SZKOLEŃ październik - grudzień ACTION Centrum Edukacyjne. Autoryzowane szkolenia. Promocje

BAZIE KWALIFIKACJI ZAGRANICZNYCH

archivist: Managing Data Analysis Results

Machine Learning for Data Science (CS4786) Lecture 11. Spectral Embedding + Clustering

HARMONOGRAM SZKOLEŃ. październik - grudzień 2019

Instrukcja obsługi User s manual

KATALOG SZKOLEŃ. Kod szkolenia Nazwa szkolenia Czas trwania. QC370 ALM Quality Center Scripting 11.x 2

General Certificate of Education Ordinary Level ADDITIONAL MATHEMATICS 4037/12

Towards Stability Analysis of Data Transport Mechanisms: a Fluid Model and an Application

Machine Learning for Data Science (CS4786) Lecture 24. Differential Privacy and Re-useable Holdout

TTIC 31210: Advanced Natural Language Processing. Kevin Gimpel Spring Lecture 9: Inference in Structured Prediction

ERASMUS + : Trail of extinct and active volcanoes, earthquakes through Europe. SURVEY TO STUDENTS.

Revenue Maximization. Sept. 25, 2018

Dlaczego my? HARMONOGRAM SZKOLEŃ lipiec - wrzesień ACTION Centrum Edukacyjne. Autoryzowane szkolenia. Promocje

Macierze All Flash. Czy to jest alternatywa dla macierzy klasy Enterprise? Krzysztof Jamiołkowski HP EG Storage Solutions Architect

Wpływ dyrektywy PSD II na korzystanie z instrumentów płatniczych. Warszawa, 15 stycznia 2015 r. Zbigniew Długosz

Marzena Kanclerz. Microsoft Channel Executive. Zachowanie ciągłości procesów biznesowych. z Windows Server 2012R2

Tychy, plan miasta: Skala 1: (Polish Edition)

Analysis of Movie Profitability STAT 469 IN CLASS ANALYSIS #2

PowerFlow Sundial: 7 $ 0 & Avanc Compatible 8 & - & & 9 & -. ,! " #$%& ' ()$%& * & +, - <.! + . / & = & ! / - 4.

Dlaczego my? HARMONOGRAM SZKOLEŃ kwiecień - czerwiec ACTION Centrum Edukacyjne. Autoryzowane szkolenia. Promocje RODO / GDPR

A DIFFERENT APPROACH WHERE YOU NEED TO NAVIGATE IN THE CURRENT STREAMS AND MOVEMENTS WHICH ARE EMBEDDED IN THE CULTURE AND THE SOCIETY

MoA-Net: Self-supervised Motion Segmentation. Pia Bideau, Rakesh R Menon, Erik Learned-Miller

Disaster Recovery z PlateSpin Forge. Jedno narzędzie do zabezpieczania serwerów fizycznych, wirtualnych i chmury!

Zakopane, plan miasta: Skala ok. 1: = City map (Polish Edition)

Dlaczego my? HARMONOGRAM SZKOLEŃ kwiecień - czerwiec ACTION Centrum Edukacyjne. Autoryzowane szkolenia. Promocje

How to share data from SQL database table to the OPC Server? Jak udostępnić dane z tabeli bazy SQL do serwera OPC? samouczek ANT.

PRZEWODNIK PO PRZEDMIOCIE. Negotiation techniques. Management. Stationary. II degree

Wdrożenie archiwum ELO w firmie z branży mediowej. Paweł Łesyk

THE ADMISSION APPLICATION TO PRIVATE PRIMARY SCHOOL. PART I. Personal information about a child and his/her parents (guardians) Child s name...


PROGRAM STAŻU. Nazwa podmiotu oferującego staż / Company name IBM Global Services Delivery Centre Sp z o.o.

Probabilistic Methods and Statistics. Computer Science 1 st degree (1st degree / 2nd degree) General (general / practical)

BEST S.A. CASE STUDY: DROGA DO IN MEMORY


OPBOX ver USB 2.0 Mini Ultrasonic Box with Integrated Pulser and Receiver

POLITYKA PRYWATNOŚCI / PRIVACY POLICY

HARMONOGRAM SZKOLEŃ styczeń - marzec 2017

Chmura zrzeszenia BPS jako centrum świadczenia usług biznesowych. Artur Powałka Microsoft Services

Internet of Things Devices

Realizacja systemów wbudowanych (embeded systems) w strukturach PSoC (Programmable System on Chip)

INSTRUKCJE JAK AKTYWOWAĆ SWOJE KONTO PAYLUTION

OpenPoland.net API Documentation

Szczypta historii Inteligentne rozmieszczanie. Pierwszy magnetyczny dysk twardy. Macierz RAID. Wirtualizacja. danych

Program szkolenia: Fundamenty testowania

Cracow University of Economics Poland

Budowa przełączników modularnych. Piotr Głaska Senior Product Manager Enterprise Networking Solutions

Auditorium classes. Lectures

Rozwiązania dla radiodyfuzji naziemnej Digital Audio Broadcasting Digital Multimedia Broadcasting

Hard-Margin Support Vector Machines

Integracja istniejącej infrastruktury do nowego systemu konwersja protokołów

Storage funkcje macierzy. Piotr Sitek

An Empowered Staff. Patty Sobelman

Zasady rejestracji i instrukcja zarządzania kontem użytkownika portalu

CENNIK SZKOLEO MICROSOFT OFFICE

Odnawialne źródła energii. Renewable Energy Resources. Energetics 1 st degree (1st degree / 2nd degree) General (general / practical)

FORMULARZ REKLAMACJI Complaint Form

SSW1.1, HFW Fry #20, Zeno #25 Benchmark: Qtr.1. Fry #65, Zeno #67. like

Effective Governance of Education at the Local Level

Verification in POMDPs

Podstawy automatyki. Energetics 1 st degree (1st degree / 2nd degree) General (general / practical) Full-time (full-time / part-time)

Helena Boguta, klasa 8W, rok szkolny 2018/2019

RADIO DISTURBANCE Zakłócenia radioelektryczne

What our clients think about us? A summary od survey results

Łukasz Reszka Wiceprezes Zarządu

Cracow University of Economics Poland. Overview. Sources of Real GDP per Capita Growth: Polish Regional-Macroeconomic Dimensions

Strona główna > Produkty > Systemy regulacji > System regulacji EASYLAB - LABCONTROL > Program konfiguracyjny > Typ EasyConnect.

Z-ZIP Równania Różniczkowe. Differential Equations

Standardowy nowy sait problemy zwiazane z tworzeniem nowego datacenter

Installation of EuroCert software for qualified electronic signature

Gradient Coding using the Stochastic Block Model

Easy Data Center w 30 min. Krzysztof Baczyński / Product Sales Specialist DC/V 14 maja 2015

Karpacz, plan miasta 1:10 000: Panorama Karkonoszy, mapa szlakow turystycznych (Polish Edition)

SNP SNP Business Partner Data Checker. Prezentacja produktu

PROGRAM STAŻU. IBM Global Services Delivery Centre Sp z o.o. Nazwa podmiotu oferującego staż / Company name. Muchoborska 8, Wroclaw

5 nines. - czyli jak osiągnąć większą dostępność dzięki VMware i Metro Storage Cluster

Why do I need a CSIRT?

Wojewodztwo Koszalinskie: Obiekty i walory krajoznawcze (Inwentaryzacja krajoznawcza Polski) (Polish Edition)

Advanced Internet Information Services Management (IIS 8)

RADIO DISTURBANCE Zakłócenia radioelektryczne

Jazz EB207S is a slim, compact and outstanding looking SATA to USB 2.0 HDD enclosure. The case is

deep learning for NLP (5 lectures)

FORMULARZ APLIKACYJNY CERTYFIKACJI STANDARDU GLOBALG.A.P. CHAIN OF CUSTODY GLOBALG.A.P. CHAIN OF CUSTODY APPLICATION FORM

Exalogic platforma do aplikacji Oracle i Middleware. Jakub Połeć Business Development Manager CE

Wojewodztwo Koszalinskie: Obiekty i walory krajoznawcze (Inwentaryzacja krajoznawcza Polski) (Polish Edition)

CEE 111/211 Agenda Feb 17

ELEMENTY DO TŁOCZNIKÓW STEMPLE I MATRYCE

No matter how much you have, it matters how much you need

Sargent Opens Sonairte Farmers' Market

Presented by. Dr. Morten Middelfart, CTO

Transkrypt:

Observe-Analyze-Act Paradigm for Storage System Resource Arbitration Li Yin 1 Email: yinli@eecs.berkeley.edu Joint work with: Sandeep Uttamchandani Guillermo Alvarez John Palmer Randy Katz 1 1:University of California, Berkeley : IBM Almaden Research Center

Outline Observe-analyze-act in storage system: CHAMELEON Motivation System model and architecture Design details Experimental results Observe-analyze-act in other scenarios Example: network applications Future challenges

Need for Run-time System Management E-mail Application Web-server Application Data Warehousing SLA Goals Storage Virtualization (Mapping Application-data to Storage Resources) Static resource allocation is not enough Incomplete information of the access characteristics: workload variations; change of goals Exception scenarios: hardware failures; load surges.

Approaches for Run-time Storage System Management Today: Administrator observe-analyze-act Automate the observe-analyze-act: Rule-based system Complexity Brittleness Pure feedback-based system Infeasible for real-world multi-parameter tuning Model-based approaches Challenges: How to represent system details as models? How to create/evolve models? How to use models for decision making?

System Model for Resource Arbitration Host 1 Host Host n QoS Decision Module Interconnection Fabric Switch 1 Interconnection Fabric Switch m Throttling Points controller controller controller Input: SLAs for workloads Current system status (performance) Output: Resource reallocation action (Throttling decisions)

Our Solution: CHAMELEON Observe Analyze Act Current States Managed System Incremental Throttling Step Size Feedback Component Models Workload Models Models Current States Piece-wise Linear Programming Retrigger Throttling Value Designerdefined Policies Action Models Reasoning Engine Throttling Executor Knowledge Base

Knowledge Base: Component Model Objective: Predict service time for a given load at a component (For example: storage controller). Service_time controller = L( request size, read write ratio, random sequential ratio, request rate) An example of component model FAStT9, 3 disks, RAID Request Size 1KB, Read/Write Ratio., Random Access Controller Model Average Responce Time (ms) 5 15 1 5 5 1 15 5 3 35 Requests/Second

Component Model (cont.) Quadratic Fit S = 3., r =.3 Linear Fit S = 3., r =.739 Non-saturated case: Linear Fit S =.59, r =.99

Knowledge Base: Workload Model Objective: Predict the load on component i as a function of the request rate j Component_load i,j = W i,j ( workload j request rate) Example: Workload with KB request size,. read/write ratio and. sequential access ratio

Knowledge Base: Action Model Objective: Predict the effect of corrective actions on workload requirements Example: Workload J request Rate = A j (Token Issue Rate for Workload J)

Analyze Module: Reasoning Engine Formulated as a constraint solving problem Part 1: Predict Action Behavior: For each candidate throttling decision, predict its performance result based on knowledge base Part : Constraint Solving: Use linear programming technique to scan all feasible solutions and choose the optimal one

Reasoning Engine: Predict Result Chain all models together to predict action result Input: Token issue rate for each workloads Output: Expected latency Workload Request Rate Action Model Token Issue Rate Component Load latency Workload Request Rate Workload Model Workload 1 load Workload n Component Model

Reasoning Engine: Constraint Solving Formulated using Linear Programming Formulation: Variable: Token issue rate for each workload Objective Function: Minimize number of workloads violating their SLA goals Workloads are as close to their SLA IO rate as possible Example: Constraints: Workloads should meet their SLA latency goals Latency(%) 1 FAILED MEET 1 EXCEED LUCKY IOps(%) Minimize p ai p bi [ SLA i T(current_throughput i, t i )] SLA i where p ai = Workload priority p bi = Quadrant priority

Act Module: Throttling Executor Hybrid of feedback and prediction Reasoning Engine Invoked Ability to switch to rule-based (policies) when confidence value is low Ability to re-trigger reasoning engine Execute Designer Policies Yes Confidence Value < Threshold No Execute Reasoning Engine Output with Step-wise Throttling Proportional to Confidence Value Continue Throttling Re-trigger Reasoning Engine Analyze System States

Experimental Results Test-bed configuration: IBM x-series server (.GHz -way with GB memory, redhat server.1 kernel) FAStT 9 controller drives (RAID) Gbps FibreChannel Link Tests consist of: Synthetic workloads Real-world trace replay (HP traces and SPC traces)

Experimental Results: Synthetic Workloads Effect of priority values on the output of constraint solver 1.5 1.5 1.5 Latency(%) 1.5 w1 w w w3 Latency(%) 1.5 w1 w w w3 Latency(%) 1.5 w w1 w w3.5 1 1.5 IOps(%)(t1=,t=,t3=33.7%,t=9.17%) IOps(%)(t1=,t=,t3=5.9%,t=1.%).5 1 1.5 IOps(%)(t1=,t=.1%,t3=33.%,t=1.1%) (a) Equal priority (b) Workload priorities (c ) Quadrant priorities Effect of model errors on output of the constraint solver 1.5 1.5 Latency(%) 1.5 w1 w w w3 Latency(%) 1.5 w1 w w w3.5 1 1.5 IOps(%)(t1=,t=,t3=33.7%,t=9.17%) (a) Without feedback.5 1 1.5 IOps(%)(t1=,t=,t3=1.93%,t=1.5%) (b) with feedback

Experiment Result: Real-world Trace Replay Real-world block-level traces from HP (cello9 trace) and SPC (web server) A phased synthetic workload acts as the third flow Test goals: Do they converge to SLAs? How reactive the system is? How does CHAMELEON handle unpredictable variations?

Real-world Trace Replay Without CHAMELEON: With throttling 3 5 Cello9 SPC SYN 5 Cello9 SPC SYN 15 IOPS 15 IOPS 1 1 5 5 latency (ms) 5 1 15 5 3 35 5 5 1 1 Cello9 Cello SLA latency (ms) 5 1 15 5 3 35 5 5 1 1 Cello9 Cello SLA 5 1 15 5 3 35 5 5 5 1 15 5 3 35 5 latency (ms) 1 1 SPC SPC SLA 5 1 15 5 3 35 5 5 latency (ms) 1 5 1 15 5 3 35 5 SPC SPC SLA latency (ms) 1 1 1 SYN SYN SLA 5 1 15 5 3 35 5 5 latency (ms) 1 1 SYN SYN SLA 5 1 15 5 3 35 5

Real-world Trace Replay With periodic un-throttling Handling system changes 35 Cello9 SPC SYN 5 Cello9 SPC SYN 3 5 15 IOPS IOPS 15 1 1 5 5 5 1 15 5 3 35 5 5 5 1 15 latency 11 1 9 7 5 3 1 Cello9 Cello9 SLA 5 1 15 5 3 35 5 5 latency (ms) 7 5 3 1 Cello9 Cello9 SLA 5 1 15 latency 9 7 5 3 1 SPC SPC SLA 5 1 15 5 3 35 5 5 latency (ms) 1 SPC SPC SLA 5 1 15 latency 1 9 7 5 3 1 SYN SYN SLA 5 1 15 5 3 35 5 5 latency (ms) 11 1 9 7 5 3 SYN SYN SLA 5 1 15

Other system management scenarios? Automate the observe-analyze-act loop for other selfmanagement scenarios Example: CHAMELEON for network applications Example: A proxy in front of server farm 1: monitor network/server : behaviors establish/ maintain models for system behaviors 3: Analyze and make action decision : Execute or Distribute action decision to execution points

Future Work Better methods to improve model accuracy More general constraint solver Combining with other actions CHAMELEON in other scenarios CHAMELEON for reliability and failure

References L. Yin, S. Uttamchandani, J. Palmer, R. Katz, G. Agha, AUTOLOOP: Automated Action Selection in the ``Observe-Analyze-Act Loop for Storage Systems, submitted for publication, 5 S. Uttamchandani, L. Yin, G. Alvarez, J. Palmer, G. Agha, CHAMELEON: a self-evovling, fully-adaptive resource arbitrator for storage systems, to appear in USENIX Annual Technical Conference (USENIX 5), 5 S. Uttamchandani, K. Voruganti, S. Srinivasan, J. Palmer, D. Pease, Polus: Growing Storage QoS Management beyond a -year old kid, 3 rd USENIX Conference on File and Storage Technologies (FAST ),

Questions? Email: yinli@eecs.berkeley.edu