.quattro architecture
http://cern.ch/quattro
German Cancio, CERN IT
.quattro architecture - overview
Configuration Database
CDB (Central Configuration Database)
CCM (Configuration Cache Manager)
Installation:
AII (Automated Installation Infrastructure)
NCM (Node Configuration Manager)
Software Repository (SWRep)
Software Package Management Agent (SPMA)
.quattro home page (under construction) http://cern.ch/quattro
Configuration DB design
(Diagram: Pan templates, entered via GUI or CLI, are compiled by pan in the CDB into XML profiles; nodes fetch the profiles into the CCM cache and access them through the NVA API.)
Server Modules
Provide different access patterns to Configuration Information
Configuration Data Base (CDB)
Configuration Information store. The information is updated in transactions; it is validated and versioned. Pan templates are compiled into XML profiles
Server Modules: SQL/LDAP/HTTP
HTTP + notifications
•nodes are notified about changes to their configuration
•nodes fetch the XML profiles via HTTP
Pan templates with configuration information are input into CDB via GUI & CLI
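As a rough illustration of what a node does with a compiled profile: the XML layout and the path helper below are invented for this sketch, but the path-based lookup mirrors the NVA-style access used later in the component example (getValue('/system/architecture')).

```python
# Minimal sketch (hypothetical profile layout): reading a value out of
# a compiled XML profile by a /a/b/c-style configuration path.
import xml.etree.ElementTree as ET

PROFILE = """\
<profile>
  <system>
    <architecture>i386</architecture>
  </system>
</profile>
"""

def get_value(root, path):
    """Resolve a /a/b/c-style path against the XML tree."""
    node = root
    for part in path.strip("/").split("/"):
        node = node.find(part)
        if node is None:
            raise KeyError(path)
    return node.text

root = ET.fromstring(PROFILE)
print(get_value(root, "/system/architecture"))  # i386
```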
Configuration DB status
System is implemented (except for CLI and Server Modules); most components are in the 1.0 production version,
Pilot deployment of the complete system for LCG 1 using the
“panguin” GUI (screenshot next slide). In parallel:
System being consolidated,
Issues of scalability and security being studied and addressed,
Server Modules under development (SQL).
More information:
http://cern.ch/hep-proj-grid-config/
panguin GUI for managing/editing PAN templates
XML profile generated by PAN (lxplus001)
install design
(Diagram: SWRep servers hold packages (rpm, pkg) behind a management API with ACLs, served over NFS/HTTP/FTP; on each client node the CCM drives the NCM components via Cdispd (registration/notification) and the SPMA with its SPMA.cfg and local package cache (RPM, PKG); the installation server handles DHCP, PXE and KS/JS generation for node (re)install.)
Automated Installation Infrastructure
• DHCP and Kickstart (or JumpStart) are re-generated according to CDB contents
•PXE can be set to reboot or reinstall by the operator
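A hedged sketch of the re-generation step: node data and field names here are invented, but the output follows standard ISC dhcpd host-declaration syntax, which is the kind of file AII-dhcp maintains from CDB contents.

```python
# Sketch: regenerate dhcpd host entries from per-node data that would
# come from the CDB (node names, MACs and IPs below are invented).
nodes = {
    "lxplus001": {"mac": "00:11:22:33:44:55", "ip": "137.138.1.10"},
    "lxplus002": {"mac": "00:11:22:33:44:56", "ip": "137.138.1.11"},
}

def dhcp_host_entry(name, info):
    # Standard ISC dhcpd host declaration for one node.
    return (
        "host %s {\n"
        "  hardware ethernet %s;\n"
        "  fixed-address %s;\n"
        "}" % (name, info["mac"], info["ip"])
    )

config = "\n".join(dhcp_host_entry(n, i) for n, i in sorted(nodes.items()))
print(config)
```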
Software Repository
• Packages (in RPM or PKG format) can be uploaded into multiple Software Repositories
•Client access uses HTTP, NFS/AFS or FTP
•Management access is subject to authentication/authorization
Node Configuration Manager (NCM)
• Configuration Management on the node is done by NCM Components
•Each component is responsible for configuring a service (network, NFS, sendmail, PBS)
•Components are notified by the Cdispd whenever there is a change in their configuration
Software Package Mgmt Agent (SPMA)
• SPMA manages the installed packages
•Runs on Linux (RPM) or Solaris (PKG)
•SPMA configuration done via an NCM component
•Can use a local cache for pre-fetching packages (simultaneous upgrades of large farms)
AII (Automated Installation Infrastructure)
Subsystem to automate the node base installation via the network
Layer on top of existing technologies (base system installer, DHCP, PXE)
Modules:
AII-dhcp:
manages the DHCP server for network installation information
AII-nbp (network bootstrap program):
manages the PXE configuration for each node (boot from HD / start the installation via network)
AII-osinstall:
manages OS configuration files required by the OS installation procedure (KickStart, JumpStart)
AII: current status
Architectural design finished
Detailed design and implementation progressing
First alpha version expected mid-July
Node Configuration Management (NCM)
Client software running on the node which takes care of
“implementing” what is in the configuration profile
Modules:
“Components”
Invocation and notification framework
Component support libraries
NCM: Components
“Components” (like SUE “features” or LCFG “objects”) are responsible for updating local config files, and for notifying services if needed
Components register their interest in configuration entries or subtrees, and get invoked in case of changes
Components only configure the system
Usually, this implies regenerating and/or updating local config files (e.g. /etc/sshd_config)
Components use standard system facilities (SysV scripts) for managing services
Components can notify services via SysV scripts when their configuration changes
Possible to define configuration dependencies between components
E.g. configure network before sendmail
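The register-and-invoke model above can be sketched as follows; the registration table, profile layout and paths are invented for illustration, not the real NCM API. Only components whose registered subtree differs between the old and new profile get invoked.

```python
# Illustrative sketch (not the real NCM API): components register the
# config subtrees they care about; on a profile change, only components
# whose subtree actually differs are invoked.
registrations = {
    "network":  "/system/network",
    "sendmail": "/software/sendmail",
}

def subtree(profile, path):
    # Walk a nested dict along a /a/b/c-style path.
    node = profile
    for part in path.strip("/").split("/"):
        node = node.get(part, {})
    return node

def components_to_invoke(old, new):
    return [c for c, path in registrations.items()
            if subtree(old, path) != subtree(new, path)]

old = {"system": {"network": {"hostname": "lx001"}},
       "software": {"sendmail": {"relay": "mail.cern.ch"}}}
new = {"system": {"network": {"hostname": "lx002"}},
       "software": {"sendmail": {"relay": "mail.cern.ch"}}}
print(components_to_invoke(old, new))  # ['network']
```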
Component example
sub Configure {
    my ($self) = @_;
    # access configuration information (NVA API)
    my $config = NVA::Config->new();
    my $arch = $config->getValue('/system/architecture');
    $self->Fail("not supported") unless ($arch eq 'i386');
    # (re)generate and/or update local config file(s)
    open (myconfig, '/etc/myconfig'); …
    # notify affected (SysV) services if required
    if ($changed) { … }
}
NCM (contd.)
cdispd (Configuration Dispatch Daemon)
Monitors the config profile, and invokes components via the ncd if there are changes
ncd (Node Configuration Deployer):
framework and front-end for executing components (via cron, cdispd, or manually)
Dependency ordering of components
Component support libraries:
For recurring system mgmt tasks (interfaces to system services, sysinfo), log handling, etc
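The dependency ordering the ncd performs is essentially a topological sort; the dependency graph below is invented for illustration (it matches the "network before sendmail" example from the previous slide).

```python
# Sketch of ncd-style dependency ordering: run components so that each
# one's prerequisites come first (graph invented for illustration).
deps = {
    "network":  [],
    "nfs":      ["network"],
    "sendmail": ["network"],
}

def run_order(deps):
    order, done = [], set()
    def visit(c):
        if c in done:
            return
        for d in deps[c]:   # prerequisites first
            visit(d)
        done.add(c)
        order.append(c)
    for c in sorted(deps):
        visit(c)
    return order

print(run_order(deps))  # ['network', 'nfs', 'sendmail']
```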
More details in NCM design document http://edms.cern.ch/document/372643
NCM: Status
Architectural design finished
Detailed (class) design progressing
First version expected end July
Porting/coding of base configuration components to be completed by mid-September
More than 60 components to be ported for a complete EDG solution (configuring all EDG middleware services)!
Pilot deployment on CERN central interactive/batch facilities
SPM (Software Package Mgmt) (I)
SWRep (Software Repository):
Client-server toolsuite for the management of software packages
Universal repository:
Extendable to multiple platforms and package formats (RHLinux/RPM, Solaris/PKG, … others like Debian dpkg)
Multiple package versions/releases
Management (“product maintainers”) interface:
ACL-based mechanism to grant/deny modification rights (packages associated to “areas”)
Current implementation using SSH
Client access: via standard protocols
HTTP (scalability), but also AFS/NFS, FTP
Replication: using standard tools (eg. rsync)
SPM (Software Package Mgmt) (II)
Software Package Management Agent (SPMA):
Runs on every target node
Multiple repositories can be accessed (eg. division/experiment specific)
Plug-in framework allows for portability
System packager specific transactional interface (RPMT, PKGT)
Can manage either all or a subset of packages on the nodes
Useful for add-on installations, and also for desktops
Configurable policies (partial or full control, mandatory and unwanted packages, conflict resolution…)
Addresses scalability
Packages can be stored ahead in a local cache, avoiding peak loads on software repository servers
SPM (Software Package Mgmt) (III)
SPMA functionality:
1. Compares the packages currently installed on the local node with the packages listed in the configuration
2. Computes the necessary install/deinstall/upgrade operations
3. Invokes the packager (rpmt/pkgt) with the right operation transaction set
The SPMA is driven via a local configuration file
For batch/servers: an NCM component generates/maintains this cf file out of CDB information
For desktops: Possible to write a GUI for locally editing the cf file
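Steps 1 and 2 above amount to a set difference over package lists; the sketch below uses invented package names and versions, and the resulting operation list is what would be handed to rpmt/pkgt as one transaction.

```python
# Sketch of SPMA steps 1-2: compare installed packages against the
# configured target list and compute the operations for one packager
# transaction (package names/versions invented for illustration).
def compute_operations(installed, target):
    ops = []
    for pkg in sorted(set(installed) | set(target)):
        if pkg not in target:
            ops.append(("remove", pkg, installed[pkg]))
        elif pkg not in installed:
            ops.append(("install", pkg, target[pkg]))
        elif installed[pkg] != target[pkg]:
            ops.append(("upgrade", pkg, target[pkg]))
    return ops

installed = {"openssh": "3.1-6", "sendmail": "8.11-2", "oldtool": "1.0-1"}
target    = {"openssh": "3.4-2", "sendmail": "8.11-2", "newtool": "2.0-1"}
print(compute_operations(installed, target))
```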
Software Package Manager (SPM)
RPMT
RPMT (RPM transactions) is a small tool on top of the RPM libraries, which allows for multiple simultaneous package operations resolving dependencies (unlike RPM)
Example: ‘upgrade X, deinstall Y, downgrade Z, install T’ and verify/resolve appropriate dependencies
Uses basic RPM library calls, no added intelligence
Ports available for RPM 3 and 4.0.X
Will try to feed this back to the RPM user community after porting to RPM 4.2
CERN IT/PS working on equivalent Solaris port (PKGT)
SPMA & SWRep: current status
First production version available
Being deployed in the CERN Computer Centre (next slide)
Enhanced functionality (package cache management) for mid-October
Solaris port progressing
SPMA/SWRep deployment @ CERN CC
Phased out legacy SW distribution systems (including ASIS) on the central batch/interactive servers (LXPLUS&LXBATCH)
Using HTTP as package access protocol (scalability)
1000 nodes currently running it in production
Deployment page: http://cern.ch/wp4-install/CERN/deploy
Server clustering solution
For CDB (XML profiles) and SWRep (RPMs over HTTP)
Replication done with rsync
Load balancing done with simple DNS round-robin
Currently, 3 servers in production (800 MHz, 500 MB RAM, Fast Ethernet) giving ~ 3×12 Mbyte throughput