

On-Demand XCache cluster

What's XCache

A description of XCache is available in this article.

For detailed information about the XRootD tool, see the official XRootD documentation.

XCache components

The infrastructure is shown in the figure below. Clients that run the payload can be instructed to request data from a cache system deployed on the same cloud provider, and thus reachable with low latency. The cache stack consists of:

a proxy server that acts as a bridge between the private network of the cache and the client. This server simply tunnels the requests from the cache servers.
a cache redirector that federates every deployed cache server. When a new server is added, it is automatically configured to contact this redirector for registration.
a configurable number of cache servers, the core of the tool, which are responsible for reading ahead from the remote site while caching.

Schema of the components deployed for using a caching on-demand system on cloud resources

This setup has been tested on different cloud providers. It has also been tested at a scale of 2k concurrent jobs on Open Telekom Cloud resources in the context of the Helix Nebula Science Cloud project.

In the context of the eXtreme Data-Cloud project, a collection of recipes has been produced for the automatic deployment of an on-demand cache service using different automation technologies. For bare-metal installations, an Ansible playbook is available that can deploy the whole stack either directly on the host or through Docker containers. For those who use Docker Swarm for container orchestration, a docker-compose recipe is also available, and for Kubernetes a Helm chart is provided. All these solutions have been integrated in DODAS, so the same setup can be replicated automatically on different kinds of cloud resources with very few changes.

AuthN/Z mode in XCache

Schema of AuthN/Z for caching on-demand system

The client shows its identity only to the cache server.
The cache server checks in its local mapfile whether the client is allowed to read the requested namespace.
If so, the cache server serves the file from its disk if it is already cached; otherwise it uses its own certificate (robot/service/power user, as needed) to authenticate with the remote storage and read the file.
The remote storage checks in its own mapfile whether the robot/service/power user certificate is allowed to read from that namespace.
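The four steps above can be sketched as a small model. This is only an illustration, not XCache code: the `Storage` class, the `serve_request` function, the identity strings, and the prefix-based namespace check are all hypothetical names and simplifications introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Storage:
    """Hypothetical stand-in for a server holding a mapfile and some files."""
    mapfile: dict  # identity -> list of namespaces it may read
    files: dict = field(default_factory=dict)  # path -> content

    def allows(self, identity: str, path: str) -> bool:
        # An identity may read a path that falls under one of its namespaces.
        return any(path.startswith(ns) for ns in self.mapfile.get(identity, ()))


def serve_request(client_id, path, cache, origin, service_id):
    """Model of the flow: the client identity never reaches the origin."""
    # The cache server checks its local mapfile for the client identity.
    if not cache.allows(client_id, path):
        raise PermissionError(f"{client_id} may not read {path} from the cache")
    # Serve from local disk if the file is already cached.
    if path in cache.files:
        return cache.files[path], "cache"
    # Otherwise contact the origin with the service identity, which the
    # origin checks against its own mapfile.
    if not origin.allows(service_id, path):
        raise PermissionError(f"{service_id} may not read {path} from the origin")
    data = origin.files[path]
    cache.files[path] = data  # keep a copy for later requests
    return data, "origin"


if __name__ == "__main__":
    origin = Storage(mapfile={"xcache-service": ["/store/"]},
                     files={"/store/a.root": b"payload"})
    cache = Storage(mapfile={"alice": ["/store/"]})
    # The first read goes to the origin, the second is served from the cache.
    print(serve_request("alice", "/store/a.root", cache, origin, "xcache-service"))
    print(serve_request("alice", "/store/a.root", cache, origin, "xcache-service"))
```

Note that the origin only ever sees the service identity, which is why its mapfile needs an entry for the robot/service/power user certificate rather than for individual clients.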

N.B. A procedure that uses user proxy forwarding is available, but it is not recommended for security reasons.

AuthN/Z mode in XCache with OIDC

Coming soon...

On-demand XCache deployment with docker compose

Please follow the instructions here

Deployment on Kubernetes with Helm

helm init --upgrade
helm repo add cloudpg https://cloud-pg.github.io/CachingOnDemand/
helm repo update
helm install -n cache-cluster cloudpg/cachingondemand

More details in this demo

Deployment with DODAS

A guided demo is available here

Ansible deployment

Requirements

Ansible 2.4
OS: CentOS 7
Valid CMS /etc/vomses
Port: one open service port
Valid grid host certificate
Valid service certificate that is able to read from the remote origin (to be stored in /etc/grid-security/xrd/xrdcert.pem and /etc/grid-security/xrd/xrdkey.pem)
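Before running the playbook it can help to confirm that the files named in this list are in place. The following is a minimal sketch of such a check; the `missing_files` helper and its `root` parameter are hypothetical and not part of the role, and the check covers only existence, not certificate validity.

```python
from pathlib import Path

# Paths taken from the requirements list. The grid host certificate is left
# out because its location is not stated there.
REQUIRED_FILES = (
    "/etc/vomses",
    "/etc/grid-security/xrd/xrdcert.pem",
    "/etc/grid-security/xrd/xrdkey.pem",
)


def missing_files(paths=REQUIRED_FILES, root="/"):
    """Return the required paths that do not exist under `root`."""
    base = Path(root)
    return [p for p in paths if not (base / p.lstrip("/")).exists()]


if __name__ == "__main__":
    for path in missing_files():
        print(f"missing: {path}")
```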

Role Variables

Create and customize your XCache Ansible configuration (e.g. xcache_config.yaml):

xrootd_version: 4.8.3-1.el7
metricbeat_version: 6.2.4-x86_64
monitoring: # enable metricbeat sensors by setting this to "metricbeat"
path_to_certs: /etc/grid-security/xrd
BLOCK_SIZE: 512k # size of the file block used by the cache
CACHE_LOG_LEVEL: info # server log level
CACHE_PATH: /data/xrd # folder for cached files
CACHE_RAM_GB: 12 # amount of RAM for caching in GB. Suggested ~50% of the total
HI_WM: "0.9" # higher watermark of used fs space
LOW_WM: "0.8" # lower watermark of used fs space
N_PREFETCH: "0" # number of blocks to be prefetched
ORIGIN_HOST: origin # hostname or IP address of the origin server
ORIGIN_XRD_PORT: "1094" # xrootd port to contact origin on
REDIR_HOST: xcache-service # hostname or IP address of the cache redirector
REDIR_CMSD_PORT: "31213" # cmsd port of the cache redirector
metricbeat_polltime: 60s # polling time of the metricbeat sensor
metric_sitename: changeme # sitename to be displayed for monitoring
elk_endpoint: localhost:9000 # elasticsearch endpoint url
elastic_username: dodas # elasticsearch username
elastic_password: testpass # elasticsearch password
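To illustrate how HI_WM and LOW_WM interact, the sketch below assumes the usual high/low watermark scheme: purging starts once the used fraction of the cache filesystem exceeds HI_WM, and it frees space until usage is back at LOW_WM. The `bytes_to_purge` function is a hypothetical helper for reasoning about sizing; the actual purge logic lives inside XRootD and may differ in detail.

```python
def bytes_to_purge(total_bytes, used_bytes, low_wm=0.8, high_wm=0.9):
    """Return how many bytes one purge cycle would free under the assumed scheme."""
    if not 0 <= low_wm < high_wm <= 1:
        raise ValueError("expected 0 <= LOW_WM < HI_WM <= 1")
    if used_bytes <= high_wm * total_bytes:
        return 0  # at or below the high watermark: nothing to purge
    # Above the high watermark: free enough to get back to the low watermark.
    return used_bytes - int(low_wm * total_bytes)


if __name__ == "__main__":
    GB = 10**9
    # A 1000 GB cache filesystem with 950 GB in use and the default watermarks.
    print(bytes_to_purge(1000 * GB, 950 * GB) // GB, "GB to purge")
```

With the default values, a 1000 GB filesystem holding 950 GB is above the 900 GB high watermark, so the sketch frees 150 GB to return to 800 GB. The gap between the two watermarks therefore sets how much data is evicted in one cycle.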

Example Playbook

First install the role:

ansible-galaxy install git+https://github.com/Cloud-PG/CachingOnDemand.git,ansible

Then create a basic Ansible playbook (e.g. xcache_playbook.yaml):

---
- hosts: localhost
  remote_user: root
  roles:
    - role: CachingOnDemand

Finally, run the playbook:

ansible-playbook --extra-vars "@xcache_config.yaml" xcache_playbook.yaml

Step by step deployment on bare metal: CMS XCache

here