MICC JINR Multifunctional Information and Computing Complex


Technical details

Implementation

The LIT JINR cloud infrastructure (hereinafter referred to as the “JINR cloud service”, “cloud service” or “cloud”) runs on the OpenNebula software. It consists of several major components:

  • OpenNebula core,
  • OpenNebula scheduler,
  • MySQL database back-end,
  • user and API interfaces,
  • cluster nodes (CNs) where virtual machines (VMs) or containers (CTs) are running.
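
For illustration, a minimal sketch of how the OpenNebula core can be queried through its XML-RPC API is given below. The endpoint URL and the “oneadmin:password” session string are assumptions for the example, not the actual JINR cloud settings.

    # Minimal sketch: querying the OpenNebula core through its XML-RPC API.
    # The endpoint address and credentials below are hypothetical.
    import xmlrpc.client

    ONE_ENDPOINT = "http://cloud.example.org:2633/RPC2"   # hypothetical front-end address
    SESSION = "oneadmin:password"                         # hypothetical credentials

    server = xmlrpc.client.ServerProxy(ONE_ENDPOINT)

    # Ask the core for its version; OpenNebula replies with [success_flag, result, ...]
    resp = server.one.system.version(SESSION)
    if resp[0]:
        print("OpenNebula core version:", resp[1])

    # List the cluster nodes (CNs) registered in the cloud; the answer is an XML document
    resp = server.one.hostpool.info(SESSION)
    if resp[0]:
        print(resp[1][:300])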

Since version 5.4 of OpenNebula it has been possible to build a high-availability (HA) setup of front-end nodes (FNs) using only the built-in tools, which implement the Raft consensus algorithm. This provides a cheap and easy way to replicate the OpenNebula master node and eliminates the need for MySQL clustering.
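
A minimal sketch of checking which Raft role a front-end node currently plays is shown below; it uses the one.zone.raftstatus XML-RPC call with the same hypothetical endpoint and credentials as above, and the exact layout of the returned XML should be checked against the OpenNebula reference.

    # Minimal sketch: checking the Raft HA role of a front-end node (hypothetical endpoint).
    import xmlrpc.client
    import xml.etree.ElementTree as ET

    server = xmlrpc.client.ServerProxy("http://cloud.example.org:2633/RPC2")
    resp = server.one.zone.raftstatus("oneadmin:password")   # hypothetical credentials

    if resp[0]:
        raft = ET.fromstring(resp[1])
        # STATE is a numeric code (solo/candidate/follower/leader); TERM is the Raft term
        print("Raft state code:", raft.findtext("STATE"))
        print("Raft term:      ", raft.findtext("TERM"))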

Apart from that, a Ceph-based software-defined storage (SDS) was deployed as well.
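
A minimal sketch of inspecting such a Ceph cluster with the python-rados bindings is given below; the configuration file path is an assumption, and the script is meant to run on a node that has access to the Ceph cluster.

    # Minimal sketch: reading overall usage of a Ceph SDS via the python-rados bindings.
    import rados

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")  # hypothetical config path
    cluster.connect()

    stats = cluster.get_cluster_stats()   # total/used/available space and object count
    print("Used KB: ", stats["kb_used"])
    print("Avail KB:", stats["kb_avail"])
    print("Objects: ", stats["num_objects"])

    cluster.shutdown()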

A schema of the JINR cloud architecture is shown in the figure below.

The following types of OpenNebula 5.4 servers are shown in the figure:

  • Cloud worker nodes (CWNs), which host virtual machines (VMs) and containers (CTs); they are marked on the figure by the numeral “1” in a grey square;
  • Cloud front-end nodes (CFNs), where all core OpenNebula services, including the database, the scheduler and some others, are deployed (such hosts are marked on the figure by the black numeral “2” inside a square of the same color);
  • Cloud storage nodes (CSNs), based on the Ceph SDS, for keeping VMs’ and CTs’ images as well as users’ data (marked by the black numeral “3” inside a square of the same color).

All these servers are connected to the same set of networks:

  • JINR public and private subnets (marked on the figure by blue lines, near which the numeral “1” is placed in a circle of the same color, labelled “JINR pub and priv subnets”);
  • An isolated private network dedicated to the SDS traffic (dark green lines with the numeral “2” in a circle of the same color, labelled “Priv subnet for storage”);
  • A management network (black lines with the numeral “3” in a circle of the same color, labelled “management subnet”).

All network switches, except those used for the management network, have 48 × 10 GbE ports as well as four 40 Gbps SFP ports for uplinks.

Apart from HDDs for data, all CSNs have SSDs for caching.

All cloud resources are split into several clusters depending on the virtualization type (KVM or OpenVZ) and on the scientific experiment the resources are used by. In the case of KVM a virtual instance is called a “virtual machine” (VM), whereas in the case of OpenVZ it is a “container” (CT).
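
The split into clusters can be inspected through the one.clusterpool.info XML-RPC call; the sketch below assumes the same hypothetical endpoint and credentials as in the earlier examples.

    # Minimal sketch: listing the clusters the cloud resources are split into.
    import xmlrpc.client
    import xml.etree.ElementTree as ET

    server = xmlrpc.client.ServerProxy("http://cloud.example.org:2633/RPC2")
    resp = server.one.clusterpool.info("oneadmin:password")   # hypothetical credentials

    if resp[0]:
        pool = ET.fromstring(resp[1])
        for cluster in pool.findall("CLUSTER"):
            print(cluster.findtext("ID"), cluster.findtext("NAME"))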

The JINR cloud service provides two user interfaces:

  • command line interface;
  • a graphical web interface, “Sunstone” (either a simplified or a full-featured one, depending on the group the user belongs to).
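
Both interfaces ultimately talk to the same XML-RPC API. The sketch below shows roughly what a listing of the current user’s VMs looks like at that level; the endpoint and credentials are again hypothetical.

    # Minimal sketch: listing the connected user's VMs through the XML-RPC API.
    import xmlrpc.client
    import xml.etree.ElementTree as ET

    server = xmlrpc.client.ServerProxy("http://cloud.example.org:2633/RPC2")
    # filter -3 = connected user's resources; -1/-1 = whole ID range; -1 = any state but DONE
    resp = server.one.vmpool.info("user:password", -3, -1, -1, -1)

    if resp[0]:
        pool = ET.fromstring(resp[1])
        for vm in pool.findall("VM"):
            print(vm.findtext("ID"), vm.findtext("NAME"))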

Cloud servers and the most critical cloud components are monitored by a dedicated monitoring service based on the Nagios monitoring software. In case of any problem with the monitored objects, cloud administrators get notifications via both SMS and email.
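
As an illustration of the monitoring approach, the sketch below shows a Nagios-style check probing a hypothetical Sunstone URL; the exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL) follow the standard Nagios plugin convention, and the URL is an assumption, not the actual JINR check.

    # Minimal sketch of a Nagios-style check against a hypothetical Sunstone address.
    import sys
    import urllib.request

    SUNSTONE_URL = "https://cloud.example.org"   # hypothetical web-GUI address

    try:
        with urllib.request.urlopen(SUNSTONE_URL, timeout=10) as resp:
            if resp.status == 200:
                print("OK - Sunstone answered with HTTP 200")
                sys.exit(0)
            print(f"WARNING - unexpected HTTP status {resp.status}")
            sys.exit(1)
    except Exception as exc:
        print(f"CRITICAL - Sunstone unreachable: {exc}")
        sys.exit(2)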

Users can log in to VMs/CTs either with the help of their RSA/DSA key or using their own Kerberos login and password (see below). In the latter case, to get root privileges one needs to execute the ‘sudo -i’ command. Authentication in the Sunstone GUI is based on Kerberos. SSL encryption is enabled to secure the information exchange between the web GUI and users’ browsers.
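
A minimal sketch of the key-based login path using the paramiko library is given below; the VM address, user name and key path are illustrative assumptions.

    # Minimal sketch: key-based SSH login to a VM/CT (hypothetical host, user and key path).
    import paramiko

    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect("vm123.cloud.example.org",              # hypothetical VM/CT address
                   username="someuser",                    # hypothetical cloud user
                   key_filename="/home/someuser/.ssh/id_rsa")

    stdin, stdout, stderr = client.exec_command("hostname")
    print(stdout.read().decode().strip())
    client.close()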

Cloud service utilization

Currently, the JINR cloud is used in three directions:

  • test, educational, development and research tasks as part of participation in various projects;
  • systems and services deployment with high reliability and availability requirements;
  • extension of computing capacities of the grid-infrastructures.

The services and testbeds currently deployed in the JINR cloud are the following (they are also shown schematically in the figure below):

  1. EMI-based testbed (it is used for training, testing, development and research tasks related to grid technologies, as well as for fulfilling JINR obligations in local, national and international grid projects such as WLCG);
  2. PanDA services for the COMPASS experiment;
  3. DIRAC-based testbed (it is used for the development of monitoring tools for the BESIII experiment’s distributed computing infrastructure, as well as for its computing facility);
  4. A set of VMs of NOvA experiment users for analysis and software development;
  5. NICA testbed for grid middleware evaluation for NICA computing model development;
  6. EOS testbed for research on heterogeneous cyber-infrastructures, computing federation prototype creation and development based on high performance computing, cloud computing and supercomputing for Big Data storage, processing and analysis;
  7. Helpdesk (a web application for the day-to-day operations of an IT environment, including user technical support for JINR IT services);
  8. Computational resources for such experiments as JUNO, Daya Bay, Baikal-GVD;
  9. The HepWeb web service (provides the possibility to use different tools for Monte Carlo simulation in high-energy physics);
  10. Test instances of the JINR document server (JDS) and JINR Project Management Service (JPMS);
  11. A CT for website development, including the new JINR web portal;
  12. JINR GitLab – local GitLab installation for all JINR users;
  13. Hadoop testbed;
  14. A set of users’ VMs and CTs which are used for their own needs;
  15. A set of CTs for evaluation of various monitoring software to be used for JINR Tier-1 grid site monitoring system development.

Moreover, a set of OpenNebula testbeds is deployed on the JINR cloud service for developing and debugging the OpenVZ driver for current and new OpenNebula software versions. Each such testbed consists of one OpenVZ CT and two or three KVM VMs acting as CNs with the OpenVZ kernel installed.

Apart from that, the JINR cloud is used by the BESIII experiment as a computing resource.