PL-Grid: Polish Infrastructure for Supporting Computational Science in the European Research Space
  Scientific Computing and Domain-Specific Services of the PL-Grid Programme
  Jacek Kitowski, ACK Cyfronet AGH, PL-Grid Consortium
  ACK Cyfronet AGH Open Day, 24.11.2014
Outline
  The PL-Grid Programme in brief
  Family of projects: PL-Grid, PLGrid Plus, PLGrid NG, PLGrid Core
  Achievements
  Selected domain-specific services
  Summary
ACK Cyfronet AGH: 41 years of expertise
  Social Networking; Human Resources; Infrastructure Resources; High Performance Computing; Centre of Competence; Network Resources; High Performance Networking
  TOP500 entries (site: Cyfronet, Poland; system: ZEUS Cluster Platform, Hewlett-Packard):
    Rank 81 (VI.2011): 11,694 cores, Rmax 104.8 Tflops, Rpeak 124.4 Tflops
    Rank 211 (XI.2014): 25,468 cores, Rmax 266.9 Tflops, Rpeak 373.9 Tflops
TOP500 Polish Sites
  List editions: June 2011, Nov. 2012, June 2013, Nov. 2013, June 2014, Nov. 2014
  Cyfronet, Poland: Zeus - Cluster Platform SL390/BL2x220, Xeon X5650 6C, 2.660 GHz, Infiniband QDR, NVIDIA 2090 (Hewlett-Packard)
    Rank: 81, 106, 113, 145, 175, 211
    Cores: 11,694; 23,932; 25,468; 25,468; 25,468; 25,468
    Rmax (TFlop/s): 104.8, 234.3, 266.9, 266.9, 266.9, 266.9
    Rpeak (TFlop/s): 124.4, 357.5, 373.9, 373.9, 373.9, 373.9
  ICM Warsaw, Poland: BlueGene/Q, Power BQC 16C 1.600 GHz, Custom Interconnect (IBM)
    Rank: 143, 170, 221, 275, 342
    Cores: 16,384; 16,384; 16,384; 16,384; 16,384
    Rmax (TFlop/s): 172.7, 189.0, 189.0, 189.0, 189.0
    Rpeak (TFlop/s): 209.7, 209.7, 209.7, 209.7, 209.7
  TASK Gdańsk: GALERA PLUS - Action Xeon HP BL 2x220/BL490 E5345/L5640, Infiniband (ACTION)
    Rank 163; 10,384 cores; Rmax 65.6 TFlop/s; Rpeak 97.8 TFlop/s
  WCSS Wrocław: Cluster Platform 3000 BL2x220, X56xx, 2.66 GHz, Infiniband (Hewlett-Packard)
    Rank 194; 6,348 cores; Rmax 57.4 TFlop/s; Rpeak 67.5 TFlop/s
  PCSS, Poland: Rackable C1103-G15, Opteron 6234 12C 2.40 GHz, Infiniband QDR (SGI)
    Rank 375; 9,498 cores; Rmax 89.8 TFlop/s; Rpeak 211.1 TFlop/s
  Other Polish sites: Allegro (2011-13); Nasza Klasa (2008, 2010-11); Telecomm. Company (2008, 2010)
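The ratio Rmax/Rpeak indicates how much of the theoretical peak a system reached in the LINPACK run. The short sketch below computes it for the Zeus figures listed on this slide; the dictionary layout and the function name are illustrative choices, not part of any TOP500 tooling.

```python
# Rmax and Rpeak (TFlop/s) for Zeus as listed on the slide, keyed by TOP500 edition.
ZEUS = {
    "June 2011": (104.8, 124.4),
    "Nov. 2012": (234.3, 357.5),
    "June 2013": (266.9, 373.9),
}


def hpl_efficiency(rmax, rpeak):
    """Fraction of the theoretical peak achieved in the LINPACK benchmark."""
    return rmax / rpeak


efficiencies = {edition: hpl_efficiency(*values) for edition, values in ZEUS.items()}

for edition, eff in efficiencies.items():
    print(f"{edition}: {eff:.1%}")
```

The lower ratios of the later editions would be consistent with the addition of accelerator-equipped nodes, though the slides do not state this.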
New HPC Asset: New Cluster Prometheus
  Contract signed: 20.10.2014
  In operation: Jan. 2015
  Some data:
    1.65 PFlops
    41,472 Haswell cores
    216 TB RAM (DDR4)
    10 PB disks, 150 GB/s
    1728 servers, HP Apollo 8000
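A few per-node figures follow directly from the numbers quoted for Prometheus. The sketch below derives them; the value of 16 double-precision operations per cycle per core is an assumption about the Haswell architecture (two 256-bit FMA units), and reading TB as binary units is likewise an assumption. Neither comes from the slides.

```python
# Figures quoted on the slide for Prometheus.
PEAK_FLOPS = 1.65e15   # 1.65 PFlops
CORES = 41_472
SERVERS = 1_728
RAM_TB = 216

# Assumption (not from the slides): a Haswell core retires 16 double-precision
# FLOP per cycle (two 256-bit FMA units x 4 doubles x 2 operations).
FLOP_PER_CYCLE = 16

cores_per_server = CORES // SERVERS
ram_per_server_gb = RAM_TB * 1024 / SERVERS        # treating TB as binary units
flops_per_core = PEAK_FLOPS / CORES
implied_clock_ghz = flops_per_core / FLOP_PER_CYCLE / 1e9

print(f"{cores_per_server} cores and {ram_per_server_gb:.0f} GB RAM per server")
print(f"implied core clock: about {implied_clock_ghz:.2f} GHz")
```

Under these assumptions each server holds 24 cores and 128 GB of RAM, and the quoted peak corresponds to a core clock of roughly 2.5 GHz.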
PL-Grid Consortium
  Consortium creation: January 2007, a response to requirements from Polish scientists and to ongoing Grid activities in Europe and worldwide
  Aim: significant extension of computing resources and solutions provided to the scientific community
  PL-Grid Programme development based on:
    projects funded by the European Regional Development Fund as part of the Innovative Economy Program
    close international collaboration (EGI, ...)
    previous projects (5FP, 6FP, 7FP, EDA, ...)
  National network infrastructure available: Pionier National Project
  Computing resources: Top500 list
  Polish scientific communities: ~75% of highly rated Polish publications in 5 communities
  PL-Grid Consortium members: 5 Polish High Performance Computing centres, representing the communities, coordinated by ACC Cyfronet AGH
Implementation of the PL-Grid Programme
  Family of Projects under the Operational Programme: Innovative Economy
Family of PL-Grid Projects coordinated by Cyfronet
  PL-Grid (2009-2012)
    Number of people involved: ca. 80 (total, from different Polish centres)
    Outcome: common base infrastructure; National Grid Infrastructure (NGI_PL)
    Resources: 230 Tflops, 3.6 PB
    Real users
  PLGrid PLUS (2011-2015)
    Number of people involved: ca. 120
    Outcome: focus on users (training, helpdesk, ...); domain-specific solutions: 13 domains (specific computing environments)
    Extension of resources and services by: 500 Tflops, 4.4 PB
  PLGrid NG (2014-2015)
    Expected outcome: optimization of resources usage, training; extension of domain-specific solutions by 14 additional domains
    Extension of resources and services by: ca. 8 Tflops, some PB
  PLGrid CORE (2014-2015) (Cyfronet only)
    Expected outcome: Competence Center; end-user services; Open Science paradigm; large workflow applications; Data Farming mass computation
    Extension of resources and services by: ca. 1500 Tflops, 25 PB
Supercomputer Zeus
  Xeon, 23 TB, 169 TFlops
  Opteron, 26 TB, 61 TFlops
  Xeon, 3.6 TB, 136 TFlops
  Xeon, 6 TB, 8 TFlops
  ZEUS statistics for 2012 (2013 values in parentheses):
    Almost 8 mln jobs, 21,000+ daily
    80 mln CPU hours = 9130 years
    800+ active users
    100 PB+ usage of scratch (350 PB)
    The longest job: 76 days
    The biggest job: 576 cores (1024)
    Ca. 50% of CPU time for multicore jobs
    Users' needs taken into account
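The 2012 usage figures can be cross-checked with simple arithmetic. The sketch below converts the quoted CPU hours into years and into an average number of busy cores; the job count is rounded to 8 million as on the slide, so the daily rate is only approximate.

```python
HOURS_PER_YEAR = 24 * 365        # 8760, ignoring leap years
DAYS_IN_2012 = 366               # 2012 was a leap year

cpu_hours = 80e6                 # "80 mln CPU hours"
jobs = 8e6                       # "almost 8 mln jobs", rounded up

cpu_years = cpu_hours / HOURS_PER_YEAR
jobs_per_day = jobs / DAYS_IN_2012
# Average number of cores kept busy around the clock during the year.
avg_busy_cores = cpu_hours / (DAYS_IN_2012 * 24)

print(f"{cpu_years:.0f} CPU years, {jobs_per_day:.0f} jobs/day, "
      f"{avg_busy_cores:.0f} cores busy on average")
```

The result of about 9130 CPU years matches the slide, and it also means that roughly nine thousand cores were busy continuously throughout the year.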
Summary of Projects Results (up-to-date)
  Close collaboration between Partners and research communities
  Development of the PL-Grid IT infrastructure and ecosystem
  Development of tools, environments and middleware: services, clouds
  Integration, HPC, data-intensive computing, instruments
  Development of 27 domain-specific solutions
Summary of Projects Results (up-to-date), continued
  Facilitation of community participation in international collaboration:
    EGI Council, EGI Executive Board
    FP7 (VPH-Share, VirtROLL, ...)
    EGI-InSPIRE, FedSM, EGI-Engage, INDIGO-DataCloud, EPOS, CTA, ...
  Publications (conference papers / journal papers and book chapters / total):
    PL-Grid (07.2009-03.2012): 15 / 28 / 43
    PLGrid Plus (06.2012-10.2014): 77 / 40 / 117
    PLGrid Core (10.2014): 5 / 0 / 5
    PLGrid NG (10.2014): 5 / 0 / 5
    Total: 103 / 68 / 171
  26 papers on PL-Grid Project results
  36 papers on PLGrid Plus Project results: 147 authors, 76 reviewers
Journal Publications (subjective selection)
  Journal (impact factor):
    J.Chem.Theor.Phys.Appl. (5.31); Phys. Lett. B (6.019); J. High Energy Phys. (6.22); Astronomy & Astrophys. (4.479)
    Inorganic Chem. (4.794); J. Org. Chem. (4.638); Optics Lett. (3.179); Appl. Phys. Lett. (3.515)
    J. Comput. Chem. (3.601); J. Phys. Chem. B (3.377); Soft Matter (4.151); Int. J. Hydrogen Energy (2.93)
    Physica B (1.133); J. Chem. Phys. (3.122); J. Phys. Chem. Lett. (6.687); Phys. Chem. Chem. Phys. (4.638)
    Fuel Processing Techn. (3.019); J. Magn. & Magn. Mat. (2.002); Eur. J. Inorg. Chem. (2.965); Chem. Phys. Lett. (1.991)
    Phys. Rev. B (3.664); Eur. Phys. J. (2.421); Future Gen. Comp. Syst. (2.639); J. Phys. Chem. C (4.835)
    Crystal Growth & Design (4.558); Macromolecules (5.927); Astrophys. J. Lett. (5.602); Phys. Rev. Letters (7.728)
    J.Chem.Theor.Appl. (5.31); Astrophys. J. (6.28); Chem. Physics (2.028); Molec. Pharmaceutics (4.787)
    Eur. J. Pharmacology (2.684); Energy (4.159); Carbon (6.16); J. Biogeography (4.969)
    Electrochem. Comm. (4.287); J. Magn. & Magn. Mat. (1.892)
  Conferences: Cracow Grid Workshop (since 2001); KU KDM (since 2008)
Implementation of the PL-Grid Programme
  Deployed IT Platforms and Tools: selected examples (by Cyfronet)
GridSpace: A platform for e-science applications
  Experiment: an e-science application composed of code fragments (snippets), expressed in general-purpose scripting languages, domain-specific languages or purpose-specific notations. Each snippet is evaluated by a corresponding interpreter.
  GridSpace2 Experiment Workbench: a web application that serves as the entry point to GridSpace2. It facilitates exploratory development, execution and management of e-science experiments.
  Embedded Experiment: a published experiment embedded in a web site.
  GridSpace2 Core: a Java library providing an API for development, storage, management and execution of experiments. It records all available interpreters and their installations on the underlying computational resources.
  Computational Resources: servers, clusters, grids, clouds and e-infrastructures where the experiments are computed.
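The idea of an experiment as a sequence of snippets, each evaluated by its own interpreter, can be illustrated with a toy dispatcher. This is only a sketch of the concept and not the GridSpace2 API: the `Snippet` class, the interpreter registry and the simple `name = number` notation are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    interpreter: str   # name of the interpreter that should evaluate the code
    code: str


def run_python(code, context):
    """Evaluate a Python snippet; names it defines land in the shared context."""
    exec(code, context)


def run_keyvalue(code, context):
    """Evaluate a toy 'name = number' notation, one assignment per line."""
    for line in code.strip().splitlines():
        name, value = line.split("=")
        context[name.strip()] = float(value)


# Hypothetical registry: maps an interpreter name to the function evaluating it.
INTERPRETERS = {"python": run_python, "keyvalue": run_keyvalue}


def run_experiment(snippets):
    """Run snippets in order; the context dict passes data between them."""
    context = {}
    for snippet in snippets:
        INTERPRETERS[snippet.interpreter](snippet.code, context)
    return context


experiment = [
    Snippet("keyvalue", "radius = 2\nheight = 5"),
    Snippet("python", "volume = 3 * radius * radius * height"),
]
result = run_experiment(experiment)
print(result["volume"])
```

In a real platform the interpreters would run on remote computational resources rather than in the same process; the registry plays the role of the record of available interpreters described above.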
InSilicoLab science gateway framework
  Goals:
    Complex computations done in a non-complex way
      Separating users from the concept of jobs and the infrastructure
      Modelling the computation scenarios in an intuitive way
    Different granularity of the computations
    Interactive nature of applications
    Dependencies between applications
  Summary:
    The framework proved to be an easy way to integrate new domain-specific scenarios, even if done by external teams
    Natively supports multiple types of computational resources, including private resources, e.g. private clouds
    Supports various types of computations
  Different kinds of users, different kinds of resources
  Figure: Architecture of the InSilicoLab framework: Domain Layer, Mediation Layer with its Core Services, and Resource Layer. In the Resource Layer, Workers ('W') of different kinds (marked with different colors) are shown.
Scalarm
  Scalarm overview
    Self-scalable platform for parametric studies
    Adapting to experiment size and simulation type
    Exploratory approach for conducting experiments
    Supporting online analysis of experiment partial results
    Integrates with clusters, Grids, Clouds
    Self-scalability of the management/execution parts
  What problems are addressed with Scalarm?
    Data farming experiments with an exploratory approach
    Parameter space generation with support of design of experiment methods
    Accessing heterogeneous computational infrastructure
  Figure: Scalarm Graphical User Interface
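Parameter space generation for a data farming experiment can be sketched with a full factorial design, one of the simplest design-of-experiment methods. The code below is illustrative only and does not use the Scalarm API; the parameter names and the `simulate` function are hypothetical stand-ins for a real simulation.

```python
from itertools import product


def full_factorial(parameters):
    """Return every combination of the listed parameter values as a dict."""
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


def simulate(point):
    """Hypothetical stand-in for a single simulation run."""
    return point["agents"] * point["speed"] - point["noise"]


space = {"agents": [10, 20], "speed": [1.0, 2.0, 3.0], "noise": [0, 5]}
design = full_factorial(space)

# Run the whole design and pick the point with the highest output.
results = [(point, simulate(point)) for point in design]
best_point, best_value = max(results, key=lambda r: r[1])

print(len(design), "runs; best:", best_point, "->", best_value)
```

The number of runs is the product of the numbers of levels (here 2 x 3 x 2 = 12), which is why larger studies rely on sparser designs and on scalable execution across clusters, grids and clouds.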
Onedata
  A system that provides unified and efficient access to data stored in organizationally distributed environments.
  Onedata Global Registry
  Provides a uniform and coherent view of all data stored on the storage systems distributed across the infrastructure
  Supports working in groups by creating an easy-to-use shared workspace for each group
  Serves data efficiently
Cloud Computing
  The Cloud increases the elasticity of research, as scientists can tune the virtual machines to their specific needs.
  The catalogue of VMs offered by PL-Grid contains many operating systems. Thanks to this, users can run their software applications on operating systems other than Scientific Linux, including Windows or other Linux distributions.
  With the Cloud, it is easy to build and put into operation a test environment. This feature is very convenient for scientists developing their own software.
  It is possible to maintain communication with a computing job that is already running.
  In addition, every virtual machine can be easily duplicated, even in thousands of copies or more.
  The Cloud platform is also the best, and in many cases the only, solution for running jobs with legacy software packages.
  OpenNebula, OpenStack, ...
  IaaS, PaaS, STaaS, ...
Implementation of the PL-Grid Programme
  Dedicated Services deployed to date: selected examples
Chemistry: InSilicoLab for Chemistry
  The service aims to support the launch of complex computational quantum chemistry experiments in the PL-Grid Infrastructure.
  Experiments of this service facilitate planning sequential computation schemes that require the preparation of series of data files, based on a common schema.
Metallurgy: Simulations of extrusion process in 3D
  Main objective: optimization of the metallurgical process of profile extrusion.
  Optimization includes: shape of foramera, channel position on a die, calibration stripes, extrusion velocity, ingot temperatures, tools.
  The proposed grid-based software simulates extrusion of thin profiles and rods made of special magnesium alloys containing calcium additions. These alloys are characterized by extremely low technological plasticity during metal forming.
  A FEM mathematical model has been developed.
Extrusion of a rod made of MA2-1 alloy (collaboration: Moscow State University of Mechanical Engineering)
  Identification of the parameters of the technological plasticity model
  [Equation: technological plasticity model with exponential terms in k and t and coefficients d_1 to d_4]
  MA2-1: d_1 = 0.0229; d_2 = 0.128; d_3 = 0.0161; d_4 = -0.156
  Temperatures: 200 °C, 250 °C, 300 °C
  Simulation of the extrusion process
Extrusion of a tube made of MgCa08 alloy (plgkremsa, IMN, Skawina, Poland)
  Extrusion device: Variant 1, Variant 2
  A. Milenin, M. Gzyl, T. Rec, B. Plonka, Computer aided design of wires extrusion from biocompatible Mg-Ca magnesium alloy, Archives of Metallurgy and Materials, 2, 2014.
MCMicro service: grain growth
  Development of a parallelized version of the grain growth algorithm, based on the Monte Carlo (MC) method, to obtain a representation of the microstructure of a single-phase polycrystalline material.
  Recrystallization studies are possible.
  E = J_{gb} \sum_{\langle i,j \rangle} (1 - \delta_{S_i S_j})
  p(\Delta E) = 1 for \Delta E \le 0; p(\Delta E) = 0 for \Delta E > 0
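A minimal serial sketch of Monte Carlo grain growth is given below, assuming the standard Potts-model formulation: each lattice site carries a grain orientation S_i, the energy is J_gb times the number of unlike nearest-neighbour pairs, and a reorientation is accepted only when it does not raise the energy. The lattice size, the number of orientations, the 4-neighbourhood and all function names are illustrative; this is not the MCMicro implementation, which is parallelized.

```python
import random


def neighbours(i, j, n):
    """4-neighbourhood with periodic boundaries on an n x n lattice."""
    return [((i - 1) % n, j), ((i + 1) % n, j), (i, (j - 1) % n), (i, (j + 1) % n)]


def site_energy(lattice, i, j, state, j_gb=1.0):
    """J_gb times the number of neighbours whose orientation differs from state."""
    n = len(lattice)
    return j_gb * sum(lattice[a][b] != state for a, b in neighbours(i, j, n))


def total_energy(lattice, j_gb=1.0):
    """Grain-boundary energy; the factor 0.5 counts each unlike pair once."""
    n = len(lattice)
    return 0.5 * sum(site_energy(lattice, i, j, lattice[i][j], j_gb)
                     for i in range(n) for j in range(n))


def mc_step(lattice, rng):
    """One Monte Carlo step: n * n reorientation attempts."""
    n = len(lattice)
    for _ in range(n * n):
        i, j = rng.randrange(n), rng.randrange(n)
        # Propose the orientation of a randomly chosen neighbour.
        a, b = rng.choice(neighbours(i, j, n))
        new_state = lattice[a][b]
        delta = (site_energy(lattice, i, j, new_state)
                 - site_energy(lattice, i, j, lattice[i][j]))
        if delta <= 0:   # accept with probability 1, otherwise reject
            lattice[i][j] = new_state


rng = random.Random(0)
n, q = 16, 8             # lattice size and number of initial orientations
lattice = [[rng.randrange(q) for _ in range(n)] for _ in range(n)]
e_start = total_energy(lattice)
for _ in range(20):
    mc_step(lattice, rng)
e_end = total_energy(lattice)
print(f"grain-boundary energy: {e_start} -> {e_end}")
```

Because only moves with non-positive energy change are accepted, the total grain-boundary energy never increases, which corresponds to grains coarsening over time. A parallel version would typically split the lattice into subdomains and treat their borders with care.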
Life Science: Integromics, a system for researchers from biomedicine and biotechnology
  The system was developed to allow:
    data collection from experiments, laboratory diagnostics, diagnostic imaging, instrumental analysis and medical interviews,
    integration, management, processing and analysis of the collected data using specialized software and selected data mining techniques,
    hypothesis generation,
    data sharing and presentation of the results.
  Example: the diagram of an artificial neural network used to classify patients based on the expression of selected genes. The method makes it possible to raise new hypotheses about the influence of individual genes on changes in the organism.
Quantum Chemistry
  Support for computational experiments: the InSilicoLab for Chemistry service
    added support for the niedoida program (a QC computation program developed at UJ, the Jagiellonian University, implementing specific solutions)
    user support
  QC Advisor service
    information on the availability of quantum chemistry methods in the software installed in PL-Grid
    a prepared knowledge base available at docs.grid.pl
Platform for integromic analyses of DNA microarray data
  Capabilities:
    Support for Affymetrix and Agilent microarrays
    Integration with EBI ArrayExpress
    A range of advanced analyses
    Bilingual service
    Study of the correlation of gene expression with miRNA expression or lipid concentration
Galaxy Server: a platform for running NGS analyses
  Capabilities:
    Analysis of large data sets (10-200 GB)
    Integration with the file and queue systems of Zeus
    Sharing of results and visualizations with collaborators
    A range of tools and workflows prepared by NGS experts
SynchroGrid: Elegant, the service for those involved in the design and operation of a synchrotron
  Objectives:
    Preparation of tools needed for synchrotron deployment and running, aimed at operation and research of the beam line (collaboration with SOLARIS)
    Addressing the estimated users' needs in this scientific area, focusing on data access and management, especially the metadata for the experimental data gathered during beam time
  The developed service consists of:
    provision of the elegant (ELEctron Generation ANd Tracking) application in a parallel version on a cluster,
    configuration of the Matlab software to read output files produced by this application in the Self Describing Data Sets (SDDS) format and to generate the final results in the form of drawings.
Hearing service (Słuch)
  Processing of raw noise data (noise recordings)
  Modelling of auditory effects for given exposure conditions, based on the noise signal
  Modelling of auditory effects for given exposure conditions (computations for many observation points: a hearing hazard map)
  Portal prototype
  Noise maps for point sources: sum (for different frequencies)
Phenology: service workflow
  Actors and components: automatic photographs, Technical Operator, Phenologist Operator, photo viewer
  The Technical Operator defines scenes
  Quality verification (sharpness, scene); rejection of photos showing people or vehicles
  Substantive verification: acceptance by assigning phenological phases (vegetative, generative, fruit, generative, leaf)
  Photos available to end users
Meteorological viewer
Implementation of the PL-Grid Programme
  New Dedicated Services in development within PLGrid NG: selected examples
eBalticGrid
  Objective: to provide, for Baltic Sea research, a numerical model of the sea and its ice with a very advanced and complex structure; the service will allow users to review and analyze the physical fields characterizing the Baltic Sea and its ice cover without having to learn that nontrivial structure.
  Figures: Baltic Sea surface temperature distribution; surface salinity of a selected, enlarged area.
Hydrology
  Objectives:
    providing forecasts of changes in the level of the ocean,
    determination of evapotranspiration estimates for local or regional hydrological models.
  Figure: a map of sea-level anomalies derived from observations by altimetric satellites; example data at a spatial resolution of 1/4° x 1/4°, used to predict changes in the level of the ocean in real time.
Meteorology
  Objective: building a platform for the generation of meteorological forecasts at high temporal and spatial resolution, and sharing those forecasts through OGC (Open Geospatial Consortium) services.
  Figures (temperature + wind direction; temperature only): numerical weather forecasts made by the Weather Research and Forecasting (WRF) meteorological model at high spatial and temporal resolution (4 km x 4 km, 1 h) for the area of Poland and the Baltic Sea.
Nuclear Power and CFD
  Objectives:
    analysis and design of complex nuclear systems, both experimental and industrial, in which the object of interest is changes in matter caused by nuclear radiation, such as nuclear transmutation, activation, radiological hazard, destruction of the structure, nuclear heating and decay heat after shutdown,
    building a system for the optimization of rotodynamic machines using CFD techniques within the PL-Grid Infrastructure.
  Figure: an integrated system for the design and optimization of rotodynamic machines (the RoMa service). Stages: CAD geometry; model discretization; description with the use of equations and definition of boundary conditions; CFD computations (ANSYS CFX); presentation and post-processing of results.
  Figure (reactor core, fuel assembly, fuel rod): the multi-layer numerical model of the core of a 4th generation nuclear reactor cooled with liquid lead, developed with the support of the MCB service (the Monte Carlo Continuous Energy Burn-up Code).
Computational Chemistry
  Objectives:
    building a repository of experimental data (related mainly to the excited states of molecules), together with the results of tests of computational models relating to these data and tools that enable users to carry out their own tests and comparisons and, finally, to start the calculations with a method chosen as a result of these analyses,
    extension of the InSilicoLab for Chemistry platform with functionality that facilitates the launch of high-performance computing codes allowing for strong parallelization of calculations and the use of hardware acceleration (GPGPU).
  MOOSE, the Optical Molecular Spectroscopy Database, will contain experimental data and results of calculations in the field of spectroscopy, assisting users engaged in quantum chemical calculations in choosing the appropriate method and providing reference data and tools for their analysis.
  Support for GPU computing in InSilicoLab for Chemistry makes it possible to accelerate quantum-chemical calculations significantly on graphics cards while using the tools offered by the InSilicoLab portal.
Complex Networks
  Objectives:
    dynamic data collection from the Web, in particular Web 2.0 (including text data),
    sharing of collected data (raw or structured),
    processing of large data sets,
    sharing large collections of reference data for research.
Summary and Conclusions
  Three dimensions of development:
    HPC/GRID/CLOUDs
    Data & Knowledge layer
    Network & Future Internet
  Deployments have a national scope, but with close European links
  Development oriented towards end-users and research projects
  Achieving synergy between research projects and e-infrastructures through close cooperation and by offering relevant services
  Durability: at least 5 years after finishing the projects, confirmed in contracts
  Future plans: continuation of the current policy with support from EU Structural Funds
    Center of Excellence
    CGW as a place to exchange experience and to collaborate between e-science centers in Europe
More information
  Please visit our Web pages: http://www.plgrid.pl/en and http://www.plgrid.pl
Credits
  ACC Cyfronet AGH: Kazimierz Wiatr, Michał Turała, Łukasz Dutka, Marian Bubak, Krzysztof Zieliński, Karol Krawentek, Agnieszka Szymańska, Maciej Twardy, Angelika Zaleska-Walterbach, Andrzej Oziębło, Zofia Mosurska, Marcin Radecki, Tomasz Szepieniec, Mariusz Sterzel, Renata Słota, Tomasz Gubała, Darin Nikolow, Aleksandra Pałuk, Patryk Lasoń, Marek Magryś, Łukasz Flis, Robert Pająk
  ICM: Marek Niezgódka, Piotr Bała, Maciej Filocha
  PCSS: Maciej Stroiński, Norbert Meyer, Krzysztof Kurowski, Tomasz Piontek, Paweł Wolniewicz
  WCSS: Jacek Oko, Józef Janyszek, Mateusz Tykierko, Paweł Dziekoński, Bartłomiej Balcerek
  TASK: Rafał Tylman, Mścisław Nakonieczny, Jarosław Rybicki
  Special thanks to domain experts, and many others.