CMS DAC21

As reported here, the INFN/CMS use case covers caches, interactive analysis, data ACLs and the integration of heterogeneous resources:

  • Analysis facility → interactive and batch analysis:
    • Multi-user JupyterHub integrated with IAM and RUCIO
    • Submission to an HTCondor batch system through the same IAM token authN/Z (a token-request sketch follows this list)
    • Interactive extension toward batch systems with Dask
  • Managing multiple layers of XCaches to serve different purposes: I/O-bound applications, HPC network fan-out, etc.
  • HPC integration in CMS
  • Data access (ACLs): managing embargoed data in RUCIO, with end-to-end token integration
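
All of the exercises below rely on the same IAM-issued access token for authN/Z. As a minimal sketch of the common first step, the snippet below requests a token from the IAM token endpoint with a standard OAuth2 client-credentials call; the endpoint, client ID/secret and scopes are placeholders rather than the actual DAC21 configuration, and interactive users would typically obtain the token via a device-code or refresh-token flow instead.

import os
import requests

# Hypothetical IAM endpoint and client credentials (placeholders only).
IAM_TOKEN_ENDPOINT = "https://iam-escape.example.org/token"
CLIENT_ID = os.environ["IAM_CLIENT_ID"]
CLIENT_SECRET = os.environ["IAM_CLIENT_SECRET"]

def get_access_token(scopes="openid profile"):
    """Request an access token via the OAuth2 client-credentials flow."""
    resp = requests.post(
        IAM_TOKEN_ENDPOINT,
        data={"grant_type": "client_credentials", "scope": scopes},
        auth=(CLIENT_ID, CLIENT_SECRET),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]

if __name__ == "__main__":
    token = get_access_token()
    # Store the token where WLCG bearer-token discovery looks for it, so
    # XRootD/HTTP clients can pick it up transparently.
    token_file = os.environ.get("BEARER_TOKEN_FILE", "bearer_token")
    with open(token_file, "w") as f:
        f.write(token)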

Analysis facility

Multi-user JupyterHub integrated with IAM and RUCIO

ID: CMS001
Goal: Demonstrate a typical end-to-end interactive analysis on CMS NanoAOD, reading from the data lake and storing the results back (a read sketch follows below).
Workflow:
Requirements: Input dataset from open data, if possible with embargoed CMS data
People: Diego Ciangottini
WP: 2
Success: The user is able to run the analysis in the notebook and seamlessly store the output back to the data lake
Things to test: IAM token storage authN, different input QoS
Impact: Demonstrate the capability to integrate CMS analysis facilities with the ESCAPE data lake
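
A minimal sketch of the read step of CMS001, assuming a Rucio client already configured against the ESCAPE data lake and an XRootD client that discovers the IAM token via the WLCG bearer-token convention; the scope and file name are placeholders.

import uproot
from rucio.client import Client

rucio = Client()  # reads rucio.cfg; token/OIDC authentication assumed configured

# Hypothetical NanoAOD DID on the data lake (placeholder scope/name).
did = {"scope": "cms", "name": "/store/opendata/nanoaod/example.root"}

# Resolve the replicas and take the first root:// URL returned.
replicas = next(rucio.list_replicas([did], schemes=["root"]))
url = next(iter(replicas["rses"].values()))[0]

# Read a few NanoAOD branches; the XRootD client is expected to present the
# token found via BEARER_TOKEN / BEARER_TOKEN_FILE.
with uproot.open(url) as f:
    events = f["Events"].arrays(["Muon_pt", "Muon_eta"], entry_stop=10_000)

print(len(events), "events read from", url)

Storing the results back to the data lake would follow the upload/registration pattern sketched under the embargoed-data exercise at the end of this page.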

Submission to an HTCondor batch system through the same IAM token authN/Z

ID: CMS002
Goal: Demonstrate the capability to access data from the ESCAPE data lake through CMS analysis jobs running on an HTCondor cluster, using IAM tokens for the whole chain of authN/Z (a submission sketch follows below).
Workflow:
Requirements: Data accessible via IAM token, even better if "embargoed"
People: D Ciangottini
WP: 2
Success: A user submits a job to an HTCondor cluster authenticating via IAM token and asks for the delegation of their identity/capabilities to the jobs, which in turn use the provided access_token to read data from the lake
Things to test: IAM token authN/Z for both HTCondor and the RSEs
Impact:
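
A hedged sketch of the CMS002 submission path using the HTCondor Python bindings. The wrapper script, the shipping of the token file with the job and the pool-side token authentication setup are all assumptions about the environment, not the confirmed DAC21 recipe.

import htcondor

# Submit description: run the analysis wrapper and ship the locally obtained
# bearer token along with the job (file names are placeholders).
submit = htcondor.Submit({
    "executable": "run_analysis.sh",
    "transfer_input_files": "bearer_token",
    "environment": "BEARER_TOKEN_FILE=bearer_token",
    "should_transfer_files": "YES",
    "output": "analysis.$(ClusterId).out",
    "error": "analysis.$(ClusterId).err",
    "log": "analysis.$(ClusterId).log",
})

schedd = htcondor.Schedd()              # local schedd; token authN to the pool assumed
result = schedd.submit(submit, count=1)
print("Submitted cluster", result.cluster())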

Interactive extension toward batch systems with Dask

ID: CMS003
Goal: Put together the tests of CMS001 and CMS002 by adding a dynamic scale-out toward batch resources for interactive analysis. This test will use Dask to distribute the payloads from the notebooks (a scale-out sketch follows below).
Workflow:
Requirements:
People: D Ciangottini
WP: 2/5
Success: The user analysis completes without issues in a distributed Dask environment
Things to test: End-to-end delegation of the IAM token to access data on the data lake
Impact:
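
A minimal sketch of the Dask scale-out from the notebook with dask-jobqueue; the resource requests are placeholders, and forwarding the token to the workers through the job environment is an assumption.

from dask.distributed import Client
from dask_jobqueue import HTCondorCluster

# Spawn Dask workers as HTCondor jobs (placeholder resource requests).
cluster = HTCondorCluster(cores=1, memory="2 GiB", disk="1 GiB")
cluster.scale(jobs=4)          # four worker jobs; the scheduler stays in the notebook

client = Client(cluster)

# Trivial payload to verify the workers are up; the real analysis would map
# the CMS001 read step over the list of input files instead.
futures = client.map(lambda x: x * x, range(10))
print(sum(client.gather(futures)))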

Managing XCaches

ID: CMS004
Goal: Access data through a dedicated XCache instance during exercises CMS001, CMS002 and CMS005 (a cache-prefix sketch follows below).
Workflow:
Requirements: XCache instance at CNAF
People:
WP: 2
Success: CMS001, CMS002 and CMS005 succeed with input data served from the cache
Things to test: Seamless reads from XCache
Impact:
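
Reading through XCache usually only requires prefixing the origin URL with the cache endpoint, so CMS001/002/005 can stay unchanged apart from the URL. A small sketch with a placeholder host instead of the actual CNAF instance:

import uproot

# Hypothetical XCache endpoint and origin file (placeholders).
XCACHE_PREFIX = "root://xcache.example.cnaf.infn.it:1094//"
origin_url = "root://origin-rse.example.org//store/opendata/nanoaod/example.root"

# The full origin URL is appended after the cache prefix: on a miss the cache
# fetches the file from the origin, later reads are served locally.
cached_url = XCACHE_PREFIX + origin_url

with uproot.open(cached_url) as f:
    n_events = f["Events"].num_entries

print(n_events, "events read via XCache")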

HPC integration in CMS

ID: CMS005
Goal: Replicate the user experience of CMS001 and CMS002 on CINECA HPC resources (a SLURM scale-out sketch follows below).
Workflow: Leverage SLURM instead of HTCondor
Requirements: Same as CMS001 and CMS002
People:
WP: 2
Success: The user can run both batch and interactive CMS analysis extending toward HPC resources
Things to test:
Impact:
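
The scale-out pattern of CMS003 translates almost one-to-one to the HPC case by swapping the cluster class; partition, account and resource values below are placeholders, not the real CINECA settings.

from dask.distributed import Client
from dask_jobqueue import SLURMCluster

# Same pattern as the HTCondor scale-out, but workers are SLURM jobs.
cluster = SLURMCluster(
    queue="example_partition",      # placeholder partition name
    account="example_account",      # placeholder accounting project
    cores=4,
    memory="8 GiB",
    walltime="01:00:00",
)
cluster.scale(jobs=2)

client = Client(cluster)
print(client)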

Managing embargoed data with tokens in RUCIO

ID: CMS006
Goal: Manage embargoed data with IAM tokens in RUCIO (an upload/registration sketch follows below).
Workflow:
Requirements:
People: D Ciangottini
WP: 2
Success: The user can access and register data on an RSE using their IAM token
Things to test: Embargoed data access through tokens
Impact: Heavily impacts the results of CMS002
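
A small sketch of the register-and-upload step with the Rucio upload client, assuming the client-side Rucio configuration uses OIDC/token authentication against IAM; scope, RSE and dataset names are placeholders.

from rucio.client.uploadclient import UploadClient

# Upload a local file and register it in Rucio (placeholder scope/RSE/dataset).
# Token-based (OIDC) authentication is assumed to be set in rucio.cfg.
upload_client = UploadClient()
upload_client.upload([
    {
        "path": "output_histograms.root",       # local analysis output
        "rse": "EXAMPLE-EMBARGOED-RSE",         # RSE hosting the embargoed area
        "did_scope": "cms",                     # destination scope
        "dataset_scope": "cms",                 # attach the file to a dataset DID
        "dataset_name": "dac21_cms_embargoed_test",
    },
])
print("Upload and registration done")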