get_file_list.pl
get_file_list.pl queries the STAR File Catalog to find data files and build the file lists you feed to your analysis (or to the Data Carousel).
Full reference: File Catalog user manual
Syntax
get_file_list.pl -keys 'path,filename' -cond 'storage=XX,filetype=XX,filename~XX,production=XX,trgsetupname=XX' -limit NN -distinct -delim '/'
- -cond: what to select (see the table below). Use ~ for a substring match (filename~st_physics) and != to exclude (storage!=hpss). Keys are case-insensitive.
- -keys: what to print for each match (see What to print below).
- -distinct: drop duplicate rows.
- -limit NN: cap the number of results; -limit 0 returns all files.
- -delim '/': join the printed fields with / instead of the default ::.
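Long -cond strings are easy to mistype. As a minimal sketch, the conditions can be kept as separate shell words and joined with commas just before the call. The helper name join_cond is hypothetical and not part of the STAR tools; it only builds the string and does not query the catalog.

```shell
# join_cond: join all arguments with commas to form a -cond string.
# (Hypothetical helper for illustration; not part of get_file_list.pl.)
# The body runs in a subshell so the IFS change does not leak out.
join_cond() (
    IFS=,
    printf '%s\n' "$*"
)

# Possible usage (needs a STAR environment, so it is left commented out):
# get_file_list.pl -keys 'path,filename' \
#     -cond "$(join_cond 'storage!=hpss' 'filetype=daq_reco_picoDst' 'filename~st_physics')" \
#     -limit 10 -distinct -delim '/'
```

Keeping one condition per word makes it simple to comment out or swap a single term while experimenting.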
What to select (-cond)
| Key | Selects | Where to find values |
|---|---|---|
| storage | local (/home/starlib/…), NFS (/star/dataXX/…), or HPSS (/home/starreco/reco/…) | - |
| filetype | daq_reco_event (DST), daq_reco_muDst (MuDst), daq_reco_picoDst (PicoDst) | PicoDst format |
| filename | filename substring, e.g. filename~st_physics | - |
| production / library | production tag / software library | production options |
| trgsetupname | trigger setup | data summary · production |
What to print (-keys)
For a plain file list, use -keys 'path,filename' -delim '/'.
To see where the data lives and how big it is, add more fields, e.g. -keys 'fdid,storage,site,node,path,filename,events':
- storage: local, NFS, or HPSS (handy to know if a file is still on tape)
- site, node: where the file physically sits
- events: event count, useful to check before you restore from tape
- fdid: file descriptor ID
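When events is among the printed keys, the per-file counts can be totalled before deciding whether a restore is worthwhile. The sketch below assumes the default :: delimiter (no -delim option) and that events is the last key in -keys; the function name total_events is hypothetical.

```shell
# total_events: read get_file_list.pl output on stdin and print the sum of
# the last field. Assumes fields are joined with the default '::' and that
# 'events' is the LAST entry in -keys. (Hypothetical helper.)
total_events() {
    awk -F'::' '{ sum += $NF } END { print sum + 0 }'
}

# Possible usage (needs a STAR environment, so it is left commented out):
# get_file_list.pl -keys 'path,filename,events' \
#     -cond 'storage=hpss,filetype=daq_reco_muDst,production=P11id,trgsetupname=AuAu19_production' \
#     -limit 0 | total_events
```

If the event count is not the last key, adjust $NF to the matching field number.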
Most common query
Find PicoDst files on disk (not tape):
get_file_list.pl -keys 'path,filename' \
-cond 'storage!=hpss,filetype=daq_reco_picoDst,filename~st_physics,production=P12id,trgsetupname=pp200_production_2012' \
-limit 10 -distinct -delim '/'
Use storage=local to search only the distributed disks.
More examples
get_file_list.pl -keys 'path,filename' \
-cond 'storage=hpss,filetype=daq_reco_muDst,filename~st_physics,production=P11id,trgsetupname=AuAu19_production' \
-limit 10 -distinct -delim '/'
get_file_list.pl -keys 'path,filename' \
-cond 'storage=local,trgsetupname=production_pp200trans_2015,filetype=daq_reco_mudst,filename~st_fms_16' \
-delim '/'
get_file_list.pl -keys 'fdid,storage,site,node,path,filename,events' \
-cond 'trgsetupname=AuAu19_production,filetype=daq_reco_MuDst,filename~st_physics,storage!=hpss' \
-limit 60 -delim '/'
get_file_list.pl -keys 'path,filename' \
-cond 'production=P11id,filetype=daq_reco_MuDst,trgsetupname=AuAu19_production,tpx=1,filename~st_physics,sanity=1,storage!=HPSS' \
-limit 60 -delim '/'
Tips
- Check disk first, tape last. Query with storage!=hpss and only fall back to storage=hpss if nothing comes back; there is no point restoring a file that’s already on disk. (The catalog searches local → NFS → HPSS in that order anyway.)
- /home/starlib/… is distributed disk (DD). You can’t browse it, but root can open the files using the prefix root://xrdstar.rcf.bnl.gov:1095/.
- /home/starreco/reco/… is HPSS tape. Those files can’t be opened until you restore them; see below.
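The first two tips can be sketched as shell helpers. disk_first queries disk and only asks for tape copies when the disk query prints nothing; to_xrootd prepends the xrootd prefix to distributed-disk paths. Both function names and the GET_FILE_LIST variable are hypothetical, and the sketch assumes that a query with no matches produces empty output.

```shell
# Command used to query the catalog. The variable is an assumption of this
# sketch; it lets you substitute a stub when trying the helpers offline.
: "${GET_FILE_LIST:=get_file_list.pl}"

# disk_first BASE_COND: print files on disk; if none, print files on tape.
# BASE_COND must not contain a storage term; one is appended here.
# Assumes an unmatched query prints nothing. Runs in a subshell so the
# working variables stay local.
disk_first() (
    base_cond=$1
    out=$("$GET_FILE_LIST" -keys 'path,filename' \
        -cond "${base_cond},storage!=hpss" -limit 0 -delim '/')
    if [ -z "$out" ]; then
        out=$("$GET_FILE_LIST" -keys 'path,filename' \
            -cond "${base_cond},storage=hpss" -limit 0 -delim '/')
    fi
    if [ -n "$out" ]; then
        printf '%s\n' "$out"
    fi
)

# to_xrootd: read file paths on stdin; prefix /home/starlib/ paths with the
# xrootd prefix from the tip above and pass every other line through as is.
to_xrootd() {
    sed 's|^/home/starlib/|root://xrdstar.rcf.bnl.gov:1095//home/starlib/|'
}

# Possible usage (needs a STAR environment, so it is left commented out):
# disk_first 'filetype=daq_reco_picoDst,production=P12id,trgsetupname=pp200_production_2012' | to_xrootd
```

Note that -limit 0 returns every match, so narrow the condition (for example with filename~) before running this on a large production.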
Restoring files from HPSS
Once you have the HPSS paths, restore them through the Data Carousel — that page has the full hpss_user.pl walkthrough and request-list formats. Never pull MuDst straight off tape with hsi/htar; see HPSS (Tape) for why. The short version:
- Find the files with get_file_list.pl and storage=hpss.
- Pick a good run in the STAR RunLog Browser.
- Check event counts first by adding events to -keys (e.g. -keys 'path,filename,events').
- Make a target directory on your local or PWG disk.
- Submit the request: hpss_user.pl HPSSFilePath/ TargetFilePath/ (hpss_user.pl -h lists all options).
- Track it on the accounting page; restored files land in your target directory.
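As a rough sketch of the submission step, the helper below turns a list of HPSS paths into hpss_user.pl command lines and only prints them, so they can be reviewed before anything is requested. The function name print_restore_cmds is hypothetical, and issuing one command per file is an assumption; check hpss_user.pl -h and the Data Carousel page for the supported request formats.

```shell
# print_restore_cmds TARGET_DIR: read HPSS file paths on stdin and PRINT
# (not run) one "hpss_user.pl <hpss path> <target dir>/" line per path.
# One request per file is an assumption of this sketch.
# Runs in a subshell so the working variables stay local.
print_restore_cmds() (
    target=${1%/}/          # normalise to exactly one trailing slash
    while IFS= read -r hpss_path; do
        [ -n "$hpss_path" ] || continue   # skip blank lines
        printf 'hpss_user.pl %s %s\n' "$hpss_path" "$target"
    done
)

# Possible usage (needs a STAR environment, so it is left commented out):
# get_file_list.pl -keys 'path,filename' -cond 'storage=hpss,...' -delim '/' \
#     | print_restore_cmds /path/to/your/target
```

Printing instead of executing keeps the sketch safe to try; once the lines look right they can be run by hand.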