Data types and formats

There are many common data types and structures, each with its own terminology. Some examples:

How to work with file formats commonly found at NSIDC: In most cases, it’s best to avoid low-level libraries such as netCDF4 or h5py. Higher-level libraries provide more intuitive access, handle metadata automatically, and streamline analysis. Format descriptions and recommendations are in the table below.

| File Format | Description | Recommended Tools |
|---|---|---|
| NetCDF4 / NetCDFx | Multidimensional climate/remote sensing data (time, lat, lon, variables). | xarray (xr.open_dataset) in Python; terra or ncdf4 in R. |
| HDF5 | Hierarchical format for storing arrays, tables, and metadata; used widely in NASA products. | xarray, pandas; avoid h5py unless necessary. |
| HDF-EOS | Earth Observing System variant of HDF, often with swath, grid, or point structures. | xarray, h5netcdf, NASA harmony-py. |
| Shapefile | Vector geospatial data (points, lines, polygons) with CRS support. | geopandas (Python); sf (R). |
| GeoTIFF | Georeferenced raster imagery and gridded data. | rasterio, rioxarray (Python); terra, raster (R). |
| CSV/TSV | Tabular text-based files, rows = observations, columns = variables. | pandas (Python); readr/data.table/tibble (R). |
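
As a rough illustration of the Python recommendations in the table, the sketch below maps a file suffix to a suggested high-level library and opens the file with it. The suffix list, the function names (`recommended_tool`, `open_with_recommended_tool`), and the choice of a single tool per format are assumptions made for this example, not an NSIDC convention. Real products may use other extensions or need extra arguments.

```python
from pathlib import Path

# Suffix -> (format name, suggested Python tool), following the table above.
# The suffix list is an assumption; real products may use other extensions.
RECOMMENDED = {
    ".nc": ("NetCDF4", "xarray"),
    ".nc4": ("NetCDF4", "xarray"),
    ".h5": ("HDF5", "xarray"),
    ".he5": ("HDF-EOS", "xarray"),
    ".shp": ("Shapefile", "geopandas"),
    ".tif": ("GeoTIFF", "rioxarray"),
    ".tiff": ("GeoTIFF", "rioxarray"),
    ".csv": ("CSV/TSV", "pandas"),
    ".tsv": ("CSV/TSV", "pandas"),
}


def recommended_tool(path):
    """Return (format, tool) for a file path, based on its suffix only."""
    suffix = Path(path).suffix.lower()
    if suffix not in RECOMMENDED:
        raise ValueError(f"No recommendation for suffix {suffix!r}")
    return RECOMMENDED[suffix]


def open_with_recommended_tool(path):
    """Open a file with the high-level library suggested for its format.

    Libraries are imported lazily, so only the one that is needed
    has to be installed.
    """
    _, tool = recommended_tool(path)
    if tool == "xarray":
        import xarray as xr
        # HDF5 / HDF-EOS files that store data in nested groups may also
        # need the `group=` argument; that depends on the product.
        return xr.open_dataset(path)
    if tool == "geopandas":
        import geopandas as gpd
        return gpd.read_file(path)
    if tool == "rioxarray":
        import rioxarray
        return rioxarray.open_rasterio(path)
    # Remaining case is pandas; TSV files are tab-separated.
    import pandas as pd
    sep = "\t" if Path(path).suffix.lower() == ".tsv" else ","
    return pd.read_csv(path, sep=sep)
```

For example, `recommended_tool("sea_ice.nc")` (a hypothetical file name) returns `("NetCDF4", "xarray")`, and `open_with_recommended_tool` would then hand the same path to `xr.open_dataset`. Dispatching on the suffix alone is a simplification: some `.h5` files are not netCDF-compatible, in which case a lower-level reader may be the "unless necessary" case the table mentions.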