Usage¶

To use ViewClust in a project:

import viewclust

All following examples are using vc as an alias:
```
import viewclust as vc
```
Functions can then be called with vc:
```
vc.job_use(jobs, d_from, target)
```

To use the ViewClust/Slurm sub-module:

from viewclust import slurm

Functions can then be called with slurm:

slurm.sacct_jobs(account, d_from, d_to=d_to)

ViewClust has the following collection of functions:

get_users_run (see docstring)
job_use (see docstring)
node_use (see docstring)
slurm.mem_info (see docstring)
slurm.sacct_jobs (see docstring)

Investigating Core-year Usage¶

The following pattern can be used to process jobs dataframes and get instantaneous core-year usage:

import viewclust as vc
from viewclust import slurm

# Query parameters
account = 'def-tk11br_cpu'
d_from = '2021-02-14T00:00:00'
d_to = '2021-03-16T00:00:00'

# DataFrame query
jobs_df = slurm.sacct_jobs(account, d_from, d_to=d_to)

# ViewClust processing
target = 50
clust_target, queued, running, dist = vc.job_use(jobs_df, d_from, target, d_to=d_to, use_unit='cpu-eqv')

Things to note about this example:

The functions use optional arguments. Docstrings are supported in all cases.
To generate the input jobs dataframe, we are using an included slurm helper function.
- The main job query from Slurm can take a long time if your query period is large!
Input frames can be anything as long as they are dataframes with the expected columns:
- submit (datetime64[ns]): submit time
- start (datetime64[ns]): start time
- end (datetime64[ns]): end time
- state (String or object): job state
- timelimit (timedelta64[ns]): reserved run time
- reqcpus (int64): total number of CPU cores
- mem (int64): total reserved memory
- reqtres (String or object): all requested resources, including GPUs
A plot can be generated with ViewClust-Vis

Investigating Memory Usage¶

In an effort to help with tickets, the follow pattern can be used to investigate memeory efficiency:

import viewclust as vc
import pandas as pd
from viewclust import slurm

# Define some accounts to look at
accounts = ['def-tk11br_cpu','def-jdesjard_cpu']

# Pick a time to start looking from
d_from = '2020-01-10T00:00:00'

# Loop over your queries if you want
for account in accounts:
    print('Working on: ', account) # Just some printing to make sure things are running
    # Queries slurm and uses viewclust to build memory usage plots via
    # MaxRSS field inside of sacct.
    slurm.mem_info(d_from, account, fig_out=account+'.html')
    # Saves output as an html figure!

Things to note about this example:

The functions use optional arguments. Docstrings are supported in all cases.
To generate the input job frame, we are using an included slurm helper function
Data is based on MaxRSS from slurm - which isn’t always a clear indicator

Usage¶

Investigating Core-year Usage¶

Investigating Memory Usage¶

ViewClust

Navigation

Related Topics