1582165 : Automation of NAF data upload and access for CMS HGCAL upgrade

Created: 2026-03-06T09:53:02Z - current status: open



Summary of the Issue

A research group working on the [EXPERIMENT] upgrade is generating ~10 GB of data per module during daily testing (10–20 modules/day, scaling to ~2000 modules over 12 months). The workflow involves:

  1. Data storage: Test stations store data locally, then manually upload it to the NAF via rsync.
  2. Data access: Successive test stations fetch input data from the NAF (via sshfs mounts) for multi-step testing.

Problems Identified:

  1. Manual uploads: Requires manual rsync operations by team members, with credentials stored insecurely on test station PCs.
  2. File ownership: Data uploaded by different team members splits ownership, complicating administration (e.g., moving/deleting files).
  3. Data access: sshfs mounts rely on personal accounts, posing security risks.

Proposed Solution:

Create a functional account with:

  • Limited access (read/write to a dedicated NAF folder, e.g., naf:/data/dust/[PROJECT]).
  • Credentials stored on test station PCs for automated uploads/access.
  • Unified file ownership to simplify administration.

Questions:

  1. Is this approach feasible, and what limitations might apply?
  2. Are there alternative solutions to address these issues more efficiently?

Suggested Solution

Based on the NAF documentation, the following steps could resolve the issues:

  1. Functional Account:
    • Request a dedicated NAF account for the project via the UCO (or experiment-specific support, e.g., naf-[EXPERIMENT]-support@desy.de).
    • Specify the need for:
      • A shared folder (e.g., /data/dust/[PROJECT]) with group permissions.
      • Automated access (e.g., Kerberos keytabs or SSH keys) for test stations.
    • Limitation: Functional accounts may require approval and adherence to DESY’s security policies.
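
If SSH keys are chosen instead of a keytab, the key stored on each test station can be locked down on the NAF side with a forced command so it only permits rsync into the project folder. A sketch; the key comment, file paths, and the location of the rrsync helper (a restricted-rsync wrapper shipped with rsync) are assumptions:

```bash
# On a test station: create a dedicated, passphrase-less key used only for uploads.
ssh-keygen -t ed25519 -N '' -f ~/.ssh/naf_upload -C "hgcal-teststation-upload"

# On the NAF, one line in the functional account's ~/.ssh/authorized_keys:
# "restrict" disables forwarding/PTY; rrsync confines rsync to one directory.
# command="/usr/bin/rrsync /data/dust/[PROJECT]",restrict ssh-ed25519 AAAA... hgcal-teststation-upload
```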

  2. Automated Uploads:
    • Use Kerberos authentication (instead of passwords) for rsync/sshfs:
      • Configure test stations to use kinit with a keytab file (stored securely).
      • Example cron job:

        ```bash
        0 2 * * * kinit -k -t /path/to/keytab [FUNCTIONAL_ACCOUNT] && rsync -avz /local/data/ [FUNCTIONAL_ACCOUNT]@naf:/data/dust/[PROJECT]/
        ```

    • Alternative: Use GridFTP or xrootd if the data is grid-accessible (see WAN Reads).
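
The cron entry above can be made more robust by wrapping it in a small script that serializes runs and resumes interrupted transfers; a hypothetical sketch, with the keytab path, lock file, and directories as placeholders:

```bash
#!/bin/bash
# upload_to_naf.sh -- hypothetical wrapper around the cron job sketched above.
set -euo pipefail

KEYTAB=/path/to/keytab
PRINCIPAL="[FUNCTIONAL_ACCOUNT]"
SRC=/local/data/
DEST="[FUNCTIONAL_ACCOUNT]@naf:/data/dust/[PROJECT]/"

# flock prevents overlapping uploads if one run takes longer than the cron interval.
exec 9>/var/lock/naf-upload.lock
flock -n 9 || { echo "previous upload still running, skipping" >&2; exit 0; }

# Obtain a fresh Kerberos ticket from the keytab (no password on disk).
kinit -k -t "$KEYTAB" "$PRINCIPAL"

# --partial lets interrupted transfers of large module files resume next run.
rsync -avz --partial "$SRC" "$DEST"
```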

  3. File Ownership:
    • Set up a group (e.g., naf-[EXPERIMENT]-group) with write permissions for the shared folder.
    • Use chmod g+s (setgid) to ensure new files inherit group ownership.
    • Example:

      ```bash
      chgrp naf-[EXPERIMENT]-group /data/dust/[PROJECT]
      chmod 2770 /data/dust/[PROJECT]   # rwxrws---
      ```
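
The setgid behavior can be verified locally on a scratch directory before touching the shared folder (the directory and file names here are illustrative):

```bash
# Create a scratch directory and apply the same mode as the shared folder.
dir=$(mktemp -d)
chmod 2770 "$dir"
stat -c '%a' "$dir"    # prints 2770; the leading 2 is the setgid bit
# Files created inside inherit the directory's group rather than the
# creator's primary group, which is what unifies ownership.
touch "$dir/module_0001.raw"
stat -c '%a %G' "$dir/module_0001.raw"
rm -rf "$dir"
```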

  4. Data Access:
    • Replace sshfs with NFS/AFS mounts (if available) or automated rsync pulls using the functional account.
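
For the pull direction, the same keytab-based authentication can drive a periodic rsync instead of a persistent sshfs mount. A hypothetical crontab entry for a downstream test station; the stage subfolder and local input path are placeholders:

```bash
# Fetch the previous stage's output every 15 minutes (paths are illustrative).
*/15 * * * * kinit -k -t /path/to/keytab [FUNCTIONAL_ACCOUNT] && rsync -avz [FUNCTIONAL_ACCOUNT]@naf:/data/dust/[PROJECT]/stage1/ /local/input/
```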

Next Steps

  1. Contact the [EXPERIMENT] support team (e.g., naf-[EXPERIMENT]-support@desy.de) or the UCO to:
    • Request a functional account and shared folder.
    • Discuss Kerberos/keytab setup for automation.
  2. Test the workflow with a small subset of data before full deployment.
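
Before full deployment, rsync's dry-run mode previews exactly what would be transferred without copying any data, which makes the small-subset test cheap to repeat:

```bash
# -n (--dry-run) lists the files that would be sent; nothing is copied.
rsync -avzn /local/data/ [FUNCTIONAL_ACCOUNT]@naf:/data/dust/[PROJECT]/
```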

Sources Used

  1. Getting a NAF Account
  2. Getting Support (UCO)
  3. Experiment-Specific Support
  4. WAN Reads
  5. Prerequisites for NAF Batch Usage