Open in Colab: https://colab.research.google.com/github/casangi/casaconfig/blob/master/docs/external_data.ipynb


[1]:
# this page contains live examples of casa usage from the following installation
!pip install casaconfig==0.0.26  > /dev/null
!pip install python-casacore==3.4.0 >/dev/null

External Data

Each CASA distribution requires a runtime configuration and a minimal repository of binary data for CASA to function properly. This is contained in the **casaconfig repository** and bundled into a **casaconfig package** for modular CASA. The repository includes Measures Tables that deal with the Earth Orientation Parameters (EOPs), ephemeris data, antenna configurations, beam models, and calibration corrections, along with a default configuration file that properly sets up and maintains the data repository contents.

By default, CASA maintains this system automatically, placing small daily updates of measures data in the user's ~/.casa/ directory. Users with custom ~/.casa/config.py files may modify or disable this functionality.

Warning

CASA now uses a single config.py to specify its entire configuration. Users must ensure their ~/.casa/config.py is up-to-date (see section on Default Configuration) or delete it to utilize default functionality from casaconfig. Old config.py files will not work.

The following figure depicts a high-level view of the external data management system, including the casaconfig GitHub repository, modular pip package and installation, monolithic tarball and installation, and ASTRON ftp server.

[Figure: casaconfig system diagram]

The following sections illustrate how to manually manipulate data contents. By default, CASA will generally handle this automatically unless the user overrides the settings with their own ~/.casa/config.py file.

Locating the Data Directory

The casaconfig package contains an internal __data__ subdirectory to hold the repository tables.

[2]:
# get the location of the data directory within the installation package
from casaconfig import get_data_dir
print(get_data_dir())
/usr/local/lib/python3.7/dist-packages/casaconfig/__data__/
[3]:
# see what's in it
!ls /usr/local/lib/python3.7/dist-packages/casaconfig/__data__/
README.txt

The initial installation of the casaconfig package leaves the __data__ folder empty, because the data tables are quite large and not suitable for Python package distribution. The __data__ folder must therefore be populated after installation (this is not necessary with the monolithic CASA tarball, which already has the data populated).

[4]:
# populate the installation __data__ folder
from casaconfig import pull_data
pull_data()

!ls /usr/local/lib/python3.7/dist-packages/casaconfig/__data__/
casaconfig downloading data contents to /usr/local/lib/python3.7/dist-packages/casaconfig/__data__/ ...
alma      demo         ephemerides  gui   README.txt
catalogs  dish_models  geodetic     nrao  telescope_layout
[5]:
# populate some custom location with the data folder contents
pull_data('./casadata')

!ls ./casadata
casaconfig downloading data contents to ./casadata ...
alma      demo         ephemerides  gui   telescope_layout
catalogs  dish_models  geodetic     nrao

WARNING: If you are using python-casacore directly (outside of CASA), you will need to set your .casarc file to point to wherever you installed casaconfig and/or populated a data folder.

[6]:
# tell casacore where to find casaconfig
from casaconfig import set_casacore_path
data_path = get_data_dir()
set_casacore_path(data_path)

!more ~/.casarc
writing /root/.casarc...
measures.directory: /usr/local/lib/python3.7/dist-packages/casaconfig/__data__
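
As a quick check that python-casacore will be pointed at the intended location, the entry written to .casarc can be read back. The following is a minimal sketch using only the standard library. The helper `read_measures_directory` is hypothetical (it is not part of casaconfig), and it assumes the simple `key: value` layout shown in the output above.

```python
import os
import tempfile


def read_measures_directory(casarc_path):
    """Return the value of 'measures.directory' from a .casarc-style file, or None."""
    path = os.path.expanduser(casarc_path)
    if not os.path.isfile(path):
        return None
    with open(path) as f:
        for line in f:
            # split at the first colon only, so the value may itself contain colons
            key, sep, value = line.partition(":")
            if sep and key.strip() == "measures.directory":
                return value.strip()
    return None


# demonstrate on a throwaway file rather than the real ~/.casarc
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "casarc")
    with open(demo, "w") as f:
        f.write("measures.directory: /path/to/casaconfig/__data__\n")
    found = read_measures_directory(demo)

print(found)
```

In practice one would call `read_measures_directory('~/.casarc')` and compare the result with the value returned by `get_data_dir()`.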

Updating the Data Directory

Most of the data tables (such as beam models, antenna and Jy/K correction tables, and the antenna configuration files for the CASA simulator) are versioned by CASA release and do not need to change. However, the Casacore Measures tables (i.e. the geodetic subdirectory) must be updated frequently after release.

It is important to know what version/date of Measures data is currently populated. This can be done by examining the readme.txt file in the geodetic subdirectory.

[7]:
!cat /usr/local/lib/python3.7/dist-packages/casaconfig/__data__/geodetic/readme.txt
# measures data populated by casaconfig
version : WSRT_Measures_20211102-160001.ztar
date : 2021-11-02
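
When the version needs to be checked from a script rather than by eye, the same file can be parsed. This is a minimal sketch that assumes the `key : value` layout shown in the output above. The function `parse_measures_readme` is a hypothetical helper, not a casaconfig function.

```python
def parse_measures_readme(text):
    """Parse 'key : value' lines of a geodetic readme.txt into a dict, skipping comments."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# sample text copied from the cell output above
sample = """# measures data populated by casaconfig
version : WSRT_Measures_20211102-160001.ztar
date : 2021-11-02
"""
info = parse_measures_readme(sample)
print(info["version"], info["date"])
```

To use it on a real installation, read the contents of geodetic/readme.txt from the data directory and pass the text to the function.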

Alternatively, the IERSpredict table has a version field (this requires that python-casacore also be installed).

[8]:
# view the current measures data date
from casaconfig import table_open
xds = table_open('/usr/local/lib/python3.7/dist-packages/casaconfig/__data__/geodetic/IERSpredict')

print(xds.attrs['vs_date'])
2021/11/02/15:00

The measures_update() function is used to download new measures data from the originating source. By default, this function will retrieve the latest data. If you already have the latest data, then nothing will happen.

[9]:
from casaconfig import measures_update
measures_update()
!cat /usr/local/lib/python3.7/dist-packages/casaconfig/__data__/geodetic/readme.txt
casaconfig connecting to ftp.astron.nl ...
casaconfig downloading WSRT_Measures_20220308-160001.ztar from ASTRON server to /usr/local/lib/python3.7/dist-packages/casaconfig/__data__/ ...
# measures data populated by casaconfig
version : WSRT_Measures_20220308-160001.ztar
date : 2022-03-08

Specific versions of past measures data can be retrieved as well. This may be important when trying to exactly replicate the conditions of a particular data reduction run in CASA. In general, though, measures data is appended over time, so past and current versions should have the same values at the same points in time (see the later section on casacore measures data contents).

[10]:
# see if something newer is available
from casaconfig import measures_available
versions = measures_available()

print(versions[-3:])
['WSRT_Measures_20220306-160001.ztar', 'WSRT_Measures_20220307-160001.ztar', 'WSRT_Measures_20220308-160001.ztar']
[11]:
# retrieve a version from a while back
measures_update(version=versions[-10], force=True)
!cat /usr/local/lib/python3.7/dist-packages/casaconfig/__data__/geodetic/readme.txt
casaconfig connecting to ftp.astron.nl ...
casaconfig downloading WSRT_Measures_20220227-160001.ztar from ASTRON server to /usr/local/lib/python3.7/dist-packages/casaconfig/__data__/ ...
# measures data populated by casaconfig
version : WSRT_Measures_20220227-160001.ztar
date : 2022-03-08
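
To replicate a reduction from a known date, it can help to pick the version by date instead of by list index. The sketch below assumes that version names follow the `WSRT_Measures_YYYYMMDD-HHMMSS.ztar` pattern seen in the output above. The helpers `version_date` and `latest_on_or_before` are hypothetical and not part of casaconfig.

```python
import re
from datetime import date

# pattern inferred from the file names printed by measures_available()
_VERSION_RE = re.compile(r"WSRT_Measures_(\d{4})(\d{2})(\d{2})-\d{6}\.ztar$")


def version_date(name):
    """Extract the calendar date encoded in a measures version name."""
    match = _VERSION_RE.search(name)
    if match is None:
        raise ValueError("unrecognized measures version name: %s" % name)
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def latest_on_or_before(versions, target):
    """Return the newest version dated on or before `target`, or None if there is none."""
    candidates = [v for v in versions if version_date(v) <= target]
    return max(candidates, key=version_date) if candidates else None


# the three names printed in cell [10]
versions = [
    "WSRT_Measures_20220306-160001.ztar",
    "WSRT_Measures_20220307-160001.ztar",
    "WSRT_Measures_20220308-160001.ztar",
]
chosen = latest_on_or_before(versions, date(2022, 3, 7))
print(chosen)
```

The selected name could then be passed as the `version` argument of `measures_update`, as in cell [11].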

Split Data Locations

With site installations of CASA, it is often convenient to leave the bulk of the data contents in a single shared location that is write-protected. At runtime, users can pull the latest Measures data (small) to their local CASA working directory. This scheme gives users complete control over which version and update frequency of Measures data they prefer, without the inefficiency of storing copies of the complete casaconfig data contents for every user.

To make use of this, we need to split the location of the measures data (i.e. geodetic folder) from the rest of the casaconfig data contents. Let’s assume we’ve configured CASA to look for data in two locations:

  1. The default casaconfig package installation location

  2. A local folder in each user's home directory (./my_local_data)

[12]:
# SYSTEM ADMIN - one-time setup of the package installation folder
from casaconfig import pull_data
pull_data()
casaconfig found populated data folder /usr/local/lib/python3.7/dist-packages/casaconfig/__data__/
[13]:
# USERS - every time you run CASA
from casaconfig import measures_update
measures_update('./my_local_data')
casaconfig connecting to ftp.astron.nl ...
casaconfig downloading WSRT_Measures_20220308-160001.ztar from ASTRON server to ./my_local_data ...

Now we have the large static data stored with the package installation, and the small measures data stored in a user writeable location for regular updates.

[14]:
!du -h -d 0 /usr/local/lib/python3.7/dist-packages/casaconfig/__data__
!du -h -d 0 ./my_local_data
543M    /usr/local/lib/python3.7/dist-packages/casaconfig/__data__
26M     ./my_local_data
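
The effect of the split can be pictured as an ordered search: the user-writable measures location is consulted first, then the shared locations. The sketch below only illustrates that ordering and is not how CASA itself resolves paths. The function `find_data_subdir` and the temporary folder names are hypothetical, and the ordering assumes that the measures location takes precedence over the general data path.

```python
import os
import tempfile


def find_data_subdir(name, measurespath, datapath):
    """Return the first existing directory called `name`, checking measurespath before datapath."""
    for root in [measurespath] + list(datapath):
        candidate = os.path.join(os.path.expanduser(root), name)
        if os.path.isdir(candidate):
            return candidate
    return None


# build a throwaway layout: a shared tree with everything, a local tree with geodetic only
with tempfile.TemporaryDirectory() as tmp:
    shared = os.path.join(tmp, "shared_data")
    local = os.path.join(tmp, "my_local_data")
    for sub in ("geodetic", "alma"):
        os.makedirs(os.path.join(shared, sub))
    os.makedirs(os.path.join(local, "geodetic"))

    geodetic = find_data_subdir("geodetic", local, [shared])
    alma = find_data_subdir("alma", local, [shared])
    geodetic_is_local = geodetic.startswith(local)
    alma_is_shared = alma.startswith(shared)

print(geodetic_is_local, alma_is_shared)
```

Here the measures data resolves to the local copy even though the shared tree also contains a geodetic folder, while everything else resolves to the shared tree.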

Default Configuration

A default configuration file is included in casaconfig that sets up split data locations (as described in the preceding section) and automatically updates the measures data each time CASA is run. This file can be retrieved and saved to the user's home directory. From there, users can modify it as desired to change default values, print/log additional information, etc.

[15]:
from casaconfig import write_default_config

write_default_config('~/.casa/config.py')

!cat ~/.casa/config.py
writing /root/.casa/config.py
import os, time, pkg_resources

# list of paths where CASA should search for data subdirectories
datapath = [pkg_resources.resource_filename('casaconfig', '__data__/')]

# location of required runtime measures data, takes precedence over location(s) in datapath list
measurespath = os.path.expanduser("~/.casa/measures")

# automatically update measures data if not current (measurespath must be user-writable)
measures_update = True

# log file path/name
logfile='casalog_%s.log' % time.strftime("%Y-%m-%d", time.localtime())

# do not create a log file when True, If True, then any logfile value is ignored and there is no log file
nologfile = False

# print log output to terminal when True (in addition to any logfile and CASA logger)
log2term = False

# do not start the CASA logger when True
nologger = False

# avoid starting GUI tools when True. If True then the CASA logger is not started even if nologger is False
nogui = False

# the IPython prompt color scheme. Must be one of "Neutral", "NoColor", "Linux" or "LightBG", default "Neutral"
colors = "Neutral"

# startup without a graphical backend if True
agg = False

# attempt to load the pipeline modules and set other options appropriate for pipeline use if True
pipeline = False

# create and use an IPython log in the current directory if True
iplog = False

# allow anonymous usage reporting
telemetry_enabled = True

# location to place telemetry data prior to reporting
telemetry_log_directory = os.path.expanduser("~/.casa/telemetry")

# maximum size of telemetry recording
telemetry_log_limit = 20480

# telemetry recording size that triggers a report
telemetry_log_size_interval = 60

# telemetry recording report frequency
telemetry_submit_interval = 604800

# allow anonymous crash reporting
crashreporter_enabled = True

# include the user's local site-packages in the python path if True. May conflict with CASA modules
user_site = False

This config.py file is executed when casatools is initially imported (or monolithic CASA is started). Changes to the file require restarting the python environment and re-importing casatools.
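
Because config.py is ordinary Python, the values it resolves to can be previewed before restarting the environment. The following is a minimal sketch using the standard-library `runpy` module on a throwaway file. It does not reproduce how CASA loads the file, and the three settings in the sample are simply taken from the default file shown above with two values changed for illustration.

```python
import os
import runpy
import tempfile
import types

# a small sample in the style of config.py (values changed for illustration only)
config_text = '''
import os
measurespath = os.path.expanduser("~/.casa/measures")
measures_update = False
log2term = True
'''

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.py")
    with open(path, "w") as f:
        f.write(config_text)
    # execute the file and collect the resulting global names
    namespace = runpy.run_path(path)

# keep only the settings: drop private names and imported modules
settings = {
    key: value
    for key, value in namespace.items()
    if not key.startswith("_") and not isinstance(value, types.ModuleType)
}
for key in sorted(settings):
    print(key, "=", settings[key])
```

Pointing `runpy.run_path` at `os.path.expanduser('~/.casa/config.py')` would list the settings of an existing file in the same way, provided every module that file imports is available.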

Data Directory Contents

Casacore Measures

The casacore Measures tables are needed to perform accurate conversions of reference frames. Casacore infrastructure includes classes to handle physical quantities with a reference frame, so-called Measures. Each type of Measure has its own distinct class in casacore which is derived from the Measure base class. One of the main functionalities provided by casacore with respect to Measures is the conversion of Measures from one reference frame to another using the MeasConvert classes.

Many of the spectral, spatial, and time reference frames are time-dependent and require the knowledge of the outcome of ongoing monitoring measurements of properties of the Earth and astronomical objects by certain service observatories. This data is tabulated in a number of tables (Measures Tables) which are stored in the casadata repository in the subdirectory geodetic. A snapshot of this repository is included in each tarball distribution of CASA and in the casadata module for CASA6+.

Measures tables are updated daily based on the refinement of the geodetic information from the relevant services like the International Earth Rotation and Reference Systems Service (IERS). Strictly speaking, the Measures tables are part of the casacore infrastructure which is developed by NRAO, ESO, NAOJ, CSIRO, and ASTRON. In order to keep the repository consistent between the partners, the Measures tables are initially created at a single institution (ASTRON) and then copied into the NRAO casadata repository from which all CASA users can retrieve them. As of March 2020, the update of the NRAO CASA copy of the Measures tables in geodetic and the planetary ephemerides in directory ephemerides takes place every day between 18 h UTC and 19 h UTC via two redundant servers at ESO (Garching).

CASA releases need to be updated with recent Measures tables (see above). For observatory use, the update period should not be longer than weekly in order to have the EOPs up-to-date for upcoming observations. The shortest reasonable update interval is daily. For offline data analysis use, the update period should not be longer than monthly. Weekly update is recommended.
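
The update periods above can be turned into a simple staleness check. This sketch assumes the `date` field of geodetic/readme.txt (format YYYY-MM-DD) is used as the age of the data, and it reads "monthly" as 30 days. The names `MAX_AGE_DAYS`, `measures_age_days`, and `needs_update` are hypothetical.

```python
from datetime import date

# maximum recommended age in days, following the text: weekly for observatory use,
# monthly (taken here as 30 days) for offline analysis
MAX_AGE_DAYS = {"observatory": 7, "offline": 30}


def measures_age_days(readme_date, today):
    """Days elapsed between the readme.txt 'date' field (YYYY-MM-DD) and `today`."""
    return (today - date.fromisoformat(readme_date)).days


def needs_update(readme_date, today, use="offline"):
    """True when the measures data is older than the limit for the given kind of use."""
    return measures_age_days(readme_date, today) > MAX_AGE_DAYS[use]


today = date(2022, 3, 8)
age = measures_age_days("2022-02-27", today)
print(age, needs_update("2022-02-27", today, "observatory"), needs_update("2022-02-27", today, "offline"))
```

With data dated 2022-02-27 and a current date of 2022-03-08, the data is nine days old: too old for observatory use, but still within the offline limit.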

Legacy installations processing old data do not have to be updated, because the relevant contents of the Measures Tables no longer change for the more distant past.

The following list describes the individual Tables in subdirectory geodetic:

  • IERSeop2000: The IERS EOP2000C04_05 Earth Orientation Parameters using the precession/nutation model “IAU2000” (files eopc04_IAU2000.xx)

  • IERSeop97: The IERS EOPC04_05 Earth Orientation Parameters using the precession/nutation model “IAU 1980” (files eopc04.xx)

  • IERSpredict: IERS Earth Orientation Data predicted from NEOS (from file ftp://ftp.iers.org/products/eop/rapid/daily/finals.daily)

  • IGRF: International Geomagnetic Reference Field Schmidt semi-normalised spherical harmonic coefficients. (Note that this still uses IGRF12. An update to IGRF13 is underway.)

  • IMF (not a Measures Table proper, access not integrated in Measures framework): Historical interplanetary magnetic field data until MJD 52618 (December 2002).

  • KpApF107 (not a Measures Table proper, access not integrated in Measures framework): Historical geomagnetic and solar activity indices until MJD 54921 (April 2009)

  • Observatories: Table of the official locations of radio observatories. Maintained by the CASA consortium.

  • SCHED_locations (not a Measures Table proper, access not integrated in Measures framework): VLBI station locations

  • TAI_UTC: TAI_UTC difference (i.e. leap second information) obtained from USNO
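
Several of these entries quote dates as Modified Julian Dates (MJD). A short conversion helps relate them to calendar dates. This is a minimal sketch using the standard definition that MJD 0 falls on 1858-11-17. The function `mjd_to_date` is a hypothetical helper and discards any fraction of a day.

```python
from datetime import date, timedelta

MJD_EPOCH = date(1858, 11, 17)  # MJD 0 corresponds to 1858-11-17 00:00 UTC


def mjd_to_date(mjd):
    """Convert a Modified Julian Date to a calendar date (fractional day discarded)."""
    return MJD_EPOCH + timedelta(days=int(mjd))


print(mjd_to_date(52618))  # 2002-12-10, the last IMF entry quoted above
print(mjd_to_date(51544))  # 2000-01-01
```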

Measures Tables in the directory ephemerides:

We can open and inspect the contents of the measures geodetic data directory using the casaconfig table_open function. This returns an **xarray dataset** object (abbreviated as xds).

[16]:
from casaconfig import table_open, get_data_dir
xds = table_open(get_data_dir()+'/geodetic/IERSpredict')

print(xds)
<xarray.Dataset>
Dimensions:  (d0: 929)
Dimensions without coordinates: d0
Data variables: (12/13)
    MJD      (d0) float64 5.88e+04 5.88e+04 5.88e+04 ... 5.973e+04 5.973e+04
    X        (d0) float64 0.1556 0.1544 0.1532 0.1521 ... 0.136 0.1374 0.1398
    DX       (d0) float64 1.9e-05 3e-05 3e-05 ... 0.007849 0.007854 0.007939
    Y        (d0) float64 0.275 0.275 0.275 0.275 ... 0.4688 0.4691 0.4683
    DY       (d0) float64 2.1e-05 2.4e-05 2.6e-05 ... 0.01043 0.01033 0.01031
    DUT1     (d0) float64 -0.1629 -0.1634 -0.1639 ... -0.1148 -0.1138 -0.1129
    ...       ...
    LOD      (d0) float64 0.0005854 0.0004804 0.0003702 ... 0.0 0.0 0.0
    DLOD     (d0) float64 2.7e-06 4.3e-06 3e-06 3.3e-06 ... 0.0 0.0 0.0 0.0
    DPSI     (d0) float64 -112.4 -112.5 -112.5 -112.3 -112.1 ... 0.0 0.0 0.0 0.0
    DDPSI    (d0) float64 0.174 0.086 0.086 0.086 0.081 ... 0.0 0.0 0.0 0.0 0.0
    DEPS     (d0) float64 -7.223 -7.307 -7.465 -7.583 -7.617 ... 0.0 0.0 0.0 0.0
    DDEPS    (d0) float64 0.183 0.26 0.26 0.26 0.272 ... 0.0 0.0 0.0 0.0 0.0
Attributes: (12/20)
    vs_create:    2020/02/10/03:00
    vs_date:      2022/02/27/15:00
    vs_version:   0623.0747
    vs_type:      IERS Earth Orientation Data predicted from NEOS
    tab_version:  0002.0000
    mjd0:         58798.0
    ...           ...
    lod_unit:     s
    dlod_unit:    s
    dpsi_unit:    arcsec
    ddpsi_unit:   arcsec
    deps_unit:    arcsec
    ddeps_unit:   arcsec
[17]:
xds.plot.scatter('MJD','DX')
[17]:
<matplotlib.collections.PathCollection at 0x7fb7a2679d10>
_images/external_data_32_1.png
[18]:
xds.plot.scatter('MJD','DUT1')
[18]:
<matplotlib.collections.PathCollection at 0x7fb7a2bef1d0>
_images/external_data_33_1.png

Observe how the IERSpredict table updates over time

[19]:
from casaconfig import measures_update, measures_available
versions = measures_available()
measures_update('./older_measures', version=versions[2], force=True)
measures_update('./newer_measures', version=versions[-1], force=True)
old_xds = table_open('./older_measures/geodetic/IERSpredict')
new_xds = table_open('./newer_measures/geodetic/IERSpredict')
new_xds.DX[200:].plot()
old_xds.DX[200:].plot()
casaconfig connecting to ftp.astron.nl ...
casaconfig downloading WSRT_Measures_20211121-160001.ztar from ASTRON server to ./older_measures ...
casaconfig connecting to ftp.astron.nl ...
casaconfig downloading WSRT_Measures_20220308-160001.ztar from ASTRON server to ./newer_measures ...
[19]:
[<matplotlib.lines.Line2D at 0x7fb7a2147590>]
_images/external_data_35_2.png

Ephemeris Data

The ephemeris tables hold a selection of the solar system objects from the JPL-Horizons database. The data tables are generated from the JPL Horizons system’s on-line solar system data and ephemeris computation service (https://ssd.jpl.nasa.gov/?horizons ). These are primarily used to determine flux models for the solar system objects used in the setjy task. These tables are stored as CASA tables in the casadata repository under ephemerides/JPL-Horizons. The current ephemeris tables cover ephemerides until December 31, 2030 for those objects officially supported in setjy.

Available objects, which include major planets, satellites, and asteroids, are: Mercury, Venus, Mars, Jupiter, Saturn, Uranus, Neptune, Pluto, Io, Europa, Ganymede, Callisto, Titan, Ceres, Vesta, Pallas, Juno, Lutetia, Sun and Moon (the objects in bold are those supported in ‘Butler-JPL-Horizons 2012’ standard in setjy.).

The format of the table name of these tables is objectname_startMJD_endMJD_J2000.tab. These tables, which are required by the setjy task, are included in the data directory in the CASA distribution. The available tables can be listed by the following commands:

#In CASA6

CASA <1>: import glob

CASA <2>: jpldatapath=os.getenv('CASAPATH').split(' ')[0]+'/data/ephemerides/JPL-Horizons/*J2000.tab'

CASA <3>: glob.glob(jpldatapath)
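
Once the table paths are listed, the naming convention can be used to see which object and MJD range each table covers. The sketch below parses names of the form objectname_startMJD_endMJD_J2000.tab as described above. The function `parse_ephem_name` and the example file name are hypothetical, and real file names in a given release may deviate from this pattern, in which case the function returns None.

```python
import os
import re

# objectname_startMJD_endMJD_J2000.tab, with integer or decimal MJD values
_EPHEM_RE = re.compile(
    r"^(?P<name>.+?)_(?P<start>\d+(?:\.\d+)?)_(?P<end>\d+(?:\.\d+)?)_J2000\.tab$"
)


def parse_ephem_name(path):
    """Return (object name, start MJD, end MJD) for an ephemeris table path, or None."""
    match = _EPHEM_RE.match(os.path.basename(path))
    if match is None:
        return None
    return match.group("name"), float(match.group("start")), float(match.group("end"))


# hypothetical file name, used only to illustrate the pattern
example = parse_ephem_name("/some/data/ephemerides/JPL-Horizons/Titan_55197_59214_J2000.tab")
print(example)
```

Applied to the list returned by `glob.glob(jpldatapath)`, this gives a quick overview of the time coverage of each object.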

The following data are retrieved from the JPL-Horizons system (the number in the parentheses indicates the column number listed in the JPL-Horizons system). Refer to https://ssd.jpl.nasa.gov/?horizons_doc for a detailed description of each of these quantities.

| Quantities | column no. | Unit/format | Description | column label |
|---|---|---|---|---|
| Date | n.a. | YYYY-MM-DD HH:MM | | Date__(UT)__HR:MN |
| Astrometric RA & DEC | 1 | degrees | Astrometric RA and Dec with respect to the observer’s location (GEOCENTRIC) | R.A._(ICRF)_DEC |
| Observer sub-long & sub-lat | 14 | degrees | Apparent planetodetic (“geodetic”) longitude and latitude of the center of the target seen by the OBSERVER at print-time | ob-lon, ob-lat |
| Solar sub-long & sub-lat | 15 | degrees | Apparent planetodetic (“geodetic”) longitude and latitude of the Sun seen by the OBSERVER at print-time | Sl-lon, Sl-lat |
| North Pole Pos. ang. & dist. | 17 | degrees and arcseconds | Target’s North Pole position angle and angular distance from the “sub-observer” point | NP.ang, NP.ds |
| Helio range & range rate | 19 | AU, km/s | Heliocentric range (r) and range-rate (rdot) | |
| Observer range & range rate | 20 | AU, km/s | Range (delta) and range-rate (deldot) of the target center with respect to the observer | delta, deldot |
| S-T-O angle | 24 | degrees | Sun-Target-Observer angle | S-T-O |

The script request.py (located in casatasks.private for CASA6) can be used to retrieve the ephemeris data from the JPL-Horizons system via e-mail (See also Manipulate Ephemeris Objects page). Further, the saved email text file is converted to a CASA table format using JPLephem_reader2.

#In CASA6

CASA <5>: from casatasks.private import JPLephem_reader2 as jplreader

CASA <6>: outdict = jplreader.readJPLephem('titan-jpl-horizons-ephem.eml')
opened the file=titan-jpl-horizons-ephem.eml

CASA <7>: jplreader.ephem_dict_to_table(outdict,'Titan_test_ephem.tab')
Got nrows = 3653 from MJD

The converted table contains the following columns.

| Column name | unit/format | description |
|---|---|---|
| MJD | day | modified Julian date |
| RA | degree | astrometric right ascension in ICRF/J2000 frame |
| DEC | degree | astrometric declination in ICRF/J2000 frame |
| Rho | AU | Geocentric distance |
| RadVal | AU/d | Geocentric distance rate |
| NP_ang | degree | North pole position angle |
| NP_dist | degree | North pole angular distance |
| DiskLong | degree | Sub-observer longitude |
| DiskLat | degree | Sub-observer latitude |
| Sl_lon | degree | Sub-Solar longitude |
| Sl_lat | degree | Sub-Solar latitude |
| r | AU | heliocentric distance |
| rdot | km/s | heliocentric distance rate |
| phang | degree | phase angle |
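
Note that the two rate columns use different units: RadVal is given in AU/d while rdot is given in km/s. When comparing them, a conversion such as the following can be used. This is a small sketch with hypothetical helper names, using the IAU 2012 definition of the astronomical unit and a day of 86400 seconds.

```python
AU_KM = 149_597_870.7       # astronomical unit in km (IAU 2012 definition)
SECONDS_PER_DAY = 86_400.0  # length of one day in seconds


def au_per_day_to_km_per_s(rate):
    """Convert a rate from AU/d (as in RadVal) to km/s (as in rdot)."""
    return rate * AU_KM / SECONDS_PER_DAY


def km_per_s_to_au_per_day(rate):
    """Convert a rate from km/s (as in rdot) to AU/d (as in RadVal)."""
    return rate * SECONDS_PER_DAY / AU_KM


one_au_per_day = au_per_day_to_km_per_s(1.0)
print(round(one_au_per_day, 4))  # about 1731.4568 km/s
```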

Array Configuration

Array configuration files for various telescopes are distributed with each CASA release. These configuration files can be used to define the telescope for simulator tools and tasks. Currently, configuration files for the following telescopes are available in CASA:

  • ALMA / 12m Array

  • ALMA / 7m ACA

  • VLA

  • VLBA

  • Next-Generation VLA (reference design)

  • ATCA

  • MeerKAT

  • PdBI (pre-NOEMA)

  • WSRT

  • SMA

  • CARMA

The full list of antenna configurations can be found in the CASA Guides on Simulations.

One can also locate the directory with the configurations in the CASA distribution and then list the configuration files, using the following commands in CASA:

CASA <1>: print(os.getenv('CASAPATH').split(' ')[0] + '/data/alma/simmos/')
/home/casa/packages/RHEL7/release/casa-release-5.4.0-68/data/alma/simmos/

CASA <2>: ls /home/casa/packages/RHEL7/release/casa-release-5.4.0-68/data/alma/simmos/
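
The same listing can be done from plain Python, which is convenient outside of an interactive CASA session. The sketch below assumes that configuration files carry a `.cfg` extension, and it runs against a throwaway directory with hypothetical file names. The function `list_antenna_configs` is not part of CASA or casaconfig. With a casaconfig installation the directory to pass in would presumably be the alma/simmos folder under the populated data directory, but that location is an assumption here.

```python
import glob
import os
import tempfile


def list_antenna_configs(simmos_dir):
    """Return sorted base names of *.cfg files found directly inside simmos_dir."""
    pattern = os.path.join(simmos_dir, "*.cfg")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


# throwaway directory with hypothetical file names, for demonstration only
with tempfile.TemporaryDirectory() as tmp:
    for name in ("example_b.cfg", "example_a.cfg", "notes.txt"):
        with open(os.path.join(tmp, name), "w") as f:
            f.write("# placeholder\n")
    configs = list_antenna_configs(tmp)

print(configs)
```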

If a configuration file is not distributed with CASA but retrieved elsewhere, it can be used by explicitly writing the full path to the configuration file in the antennalist parameter of the simulator tasks.

NOTE: the most recent ALMA configuration files may not always be available in the most recent CASA version. ALMA configuration files for all cycles are available for download here. For the Next-Generation VLA reference design, the latest information can be found here.