
# "Good" GRIB

GRIB files can contain multiple data layers, which makes them awkward to work with in GIS. Tools such as xarray can help pre-process them.

## Data source

[ERA5 hourly data on single levels from 1940 to present](https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=overview).

The downloaded GRIB file contains hourly surface pressure and sea surface temperature for 13-17 March 2023.

## QGIS

GRIB files are supported by QGIS and can easily be added to the map. However, not all of the metadata in the file is displayed properly.

<figure><img src="/files/nQ3alHH3WJ7xc0qgIwGG" alt=""><figcaption><p>QGIS doesn't show sub-dataset names</p></figcaption></figure>

The layer is georeferenced, but sub-dataset names are not read properly, and the time in the metadata is shown as a Unix timestamp.

<figure><img src="/files/UuPdmaiSsKnxHhZ58xA8" alt=""><figcaption><p>GRIB layer properties</p></figcaption></figure>
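A Unix timestamp can be turned into a readable date with the Python standard library. The sketch below assumes the value is a count of seconds since 1970-01-01 UTC; the helper name `unix_to_iso` and the sample value are illustrative and are not taken from the layer shown above.

```python
from datetime import datetime, timezone


def unix_to_iso(seconds: int) -> str:
    """Convert seconds since 1970-01-01 UTC to an ISO 8601 string."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()


# Illustrative value only (hypothetical, not copied from the screenshot)
print(unix_to_iso(1678863600))  # 2023-03-15T07:00:00+00:00
```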

## Python + Xarray

Xarray is a Python library for working with multidimensional data. It can read most multidimensional data formats along with their metadata, and it supports subsetting, resampling, and saving the data. Jupyter Notebook is a convenient interface for studying and working with datasets.

The full notebook download is below. To run it, [set up a conda environment](/working-with-scientific-data-formats-in-gis/intro-notes.md#configuring-environment-for-jupyter-notebook), save the notebook to your working directory, and run `jupyter notebook` from that directory. Data for the notebook is not provided.

{% file src="/files/EqWUZHraIH64oT6rL5NT" %}

### Open dataset

```python
import xarray as xr

# Reading GRIB with xarray relies on the cfgrib engine being installed
ds = xr.open_dataset('era5_sst_sp.grib', engine='cfgrib')
```

Printing the dataset is a useful way to study its structure.

<figure><img src="/files/Il3HvriVgmmRoTNj7ftw" alt=""><figcaption><p>dataset printout by xarray</p></figcaption></figure>

The dataset has 3 dimensions: `time`, `latitude`, and `longitude`. Two data variables are listed: `sst` (sea surface temperature) and `sp` (surface pressure). Both have the same dimensions: 120\*721\*1440. Each variable line can be expanded to view its details.
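The same structure can also be inspected programmatically. Since the data file is not provided, this minimal sketch builds a small synthetic dataset with the same dimension and variable names; the coarse 2.5-degree grid, the 6 time steps, and the constant values are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the ERA5 file (assumed grid and values, not real data)
time = pd.date_range("2023-03-13", periods=6, freq=pd.Timedelta(hours=1))
lat = np.linspace(90.0, -90.0, 73)    # north to south
lon = np.arange(0.0, 360.0, 2.5)      # 0..357.5
dims = ("time", "latitude", "longitude")
shape = (time.size, lat.size, lon.size)

ds = xr.Dataset(
    {
        "sst": (dims, np.full(shape, 280.0)),
        "sp": (dims, np.full(shape, 101325.0)),
    },
    coords={"time": time, "latitude": lat, "longitude": lon},
)

print(dict(ds.sizes))       # dimension name -> length
print(list(ds.data_vars))   # names of the data variables
print(ds["sst"].dims)       # dimensions of one variable
```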

### Plot data variable

Because all the dimensions are labeled, selecting data only requires naming the dimension and passing the value to select.

```python
import matplotlib.pyplot as plt
import pandas as pd
```

To view a single 1-hour layer:

{% code overflow="wrap" %}

```python
plt.imshow(ds.sel(time = pd.to_datetime('2023-03-15T07:00:00')).sst.values, cmap = 'Spectral')
plt.colorbar()
plt.show()
```

{% endcode %}

<figure><img src="/files/uno6ReSQzdIUlTJlnPui" alt=""><figcaption><p>Note that longitude has values from 0 to 360, not from -180 to +180</p></figcaption></figure>

To view values for a specific point:
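If a -180 to +180 longitude axis is preferred, the coordinate can be remapped and re-sorted. This is a minimal sketch on a tiny synthetic array (the four-point axis and the values are assumptions for illustration); the same `assign_coords` and `sortby` calls could be applied to a dataset with a `longitude` coordinate.

```python
import numpy as np
import xarray as xr

# Tiny synthetic array on a 0..360 longitude axis (illustrative values)
lon = np.arange(0.0, 360.0, 90.0)  # 0, 90, 180, 270
da = xr.DataArray(np.arange(4.0), coords={"longitude": lon}, dims=("longitude",))

# Map 0..360 to -180..180, then sort so the axis is increasing again
signed = da.assign_coords(
    longitude=((da.longitude + 180) % 360) - 180
).sortby("longitude")

print(signed.longitude.values)  # [-180.  -90.    0.   90.]
```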

{% code overflow="wrap" %}

```python
# Select the point once; 40°W is 360 - 40 on the 0-360 longitude axis
point = ds.sel(latitude=-55, longitude=360 - 40)

plt.plot(point.time, point.sp)
plt.ylabel('Pa')
plt.grid()
plt.show()
```

{% endcode %}

<figure><img src="/files/L2lFwylk9BB6SMBDYH8b" alt=""><figcaption><p>Plot of surface pressure for point (-55, -40)</p></figcaption></figure>
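An exact `.sel()` only works when the requested coordinates lie on the grid. For arbitrary points, `method="nearest"` picks the closest grid cell. The sketch below uses a small synthetic 0.25-degree grid with constant values (assumed, not real data), and `to_0_360` is a hypothetical helper that converts a west-negative longitude to the 0-360 convention.

```python
import numpy as np
import pandas as pd
import xarray as xr


def to_0_360(lon_deg: float) -> float:
    """Convert a longitude in -180..180 to the 0..360 convention."""
    return lon_deg % 360


# Small synthetic grid (assumed values), latitude stored north to south
lat = np.linspace(-50.0, -60.0, 41)
lon = np.linspace(315.0, 325.0, 41)
time = pd.date_range("2023-03-13", periods=3, freq=pd.Timedelta(hours=1))
ds = xr.Dataset(
    {"sp": (("time", "latitude", "longitude"), np.full((3, 41, 41), 100000.0))},
    coords={"time": time, "latitude": lat, "longitude": lon},
)

# The requested point is off-grid; the nearest cell is returned instead
point = ds.sel(latitude=-55.1, longitude=to_0_360(-40.1), method="nearest")
print(float(point.latitude), float(point.longitude))  # -55.0 320.0
```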

### Clip data to AOI

As shown above, subsetting by coordinates is simple because all the dimensions are labeled with physical values. To subset to South Georgia's extent, we will read the bounds from a vector file.

```python
import geopandas as gpd

extent = gpd.read_file('SouthGeorgiaExtent.shp')
extent.total_bounds  # [minx, miny, maxx, maxy]
# [-42.0, -58.0, -32.0, -52.0]
```

{% code overflow="wrap" %}

```python
sg_ds = ds.sel(latitude=slice(-52, -57) , longitude=slice(360-42, 360-32))
```

{% endcode %}

<figure><img src="/files/1jdbDVjGHYLGbNg3yv2Q" alt=""><figcaption><p>Resulting dataset has dimensions 120*21*45</p></figcaption></figure>
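The slices can also be derived from the bounds rather than typed by hand. This is a minimal sketch with a hypothetical helper, `bounds_to_slices`; it assumes latitude is stored north to south and longitude runs 0 to 360, as in the file on this page, and that the box does not cross the 0-degree meridian.

```python
def bounds_to_slices(bounds):
    """Turn (minx, miny, maxx, maxy) in -180..180 degrees into .sel() slices.

    Assumes a descending latitude axis and a 0..360 longitude axis,
    and a box that does not cross the 0-degree meridian.
    """
    minx, miny, maxx, maxy = bounds
    return {
        "latitude": slice(maxy, miny),               # north first: axis is descending
        "longitude": slice(minx % 360, maxx % 360),  # shift west-negative values
    }


slices = bounds_to_slices([-42.0, -58.0, -32.0, -52.0])
print(slices)
```

The result could be passed on as `ds.sel(**slices)`. With the full bounds, the latitude slice runs to -58, one degree further south than the manually typed slice above.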

To convert temperature from kelvin to degrees Celsius:

```python
sg_ds['sst'] = sg_ds['sst'] - 273.15
```
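After a unit conversion it is worth making sure the variable's metadata matches the new values. Whether arithmetic keeps attributes depends on the xarray version and options, so this sketch sets them explicitly. The array, its `units` value of `K`, and the `long_name` are assumptions for illustration.

```python
import numpy as np
import xarray as xr

# Illustrative temperatures in kelvin with assumed metadata
sst_k = xr.DataArray(
    np.array([271.35, 273.15, 278.15]),
    dims=("point",),
    name="sst",
    attrs={"units": "K", "long_name": "Sea surface temperature"},
)

sst_c = sst_k - 273.15
# Set the metadata explicitly so the units always describe the new values
sst_c.attrs = {**sst_k.attrs, "units": "degC"}

print(sst_c.attrs["units"])  # degC
```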

### Resample

```python
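# Daily means: 120 hourly steps become 5 daily steps
sg_ds_daily = sg_ds.resample(time="D").mean()
# Monthly means; recent pandas releases prefer the alias "ME" over "M"
sg_ds_monthly = sg_ds.resample(time="M").mean()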
```

<figure><img src="/files/bFHX7xdER1DTJQgXRvB4" alt=""><figcaption><p>Resulting dataset has dimensions 5*21*45, as it was resampled to daily means</p></figcaption></figure>
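The effect of resampling on the time axis can be checked on a synthetic series. This sketch assumes 120 hourly steps starting on 13 March 2023, matching the period described on this page; the values are simply the step index, chosen so the daily means are easy to verify.

```python
import numpy as np
import pandas as pd
import xarray as xr

# 120 hourly steps (5 days); values are just the step index (illustrative)
time = pd.date_range("2023-03-13", periods=120, freq=pd.Timedelta(hours=1))
sp = xr.DataArray(
    np.arange(120, dtype=float),
    coords={"time": time},
    dims=("time",),
    name="sp",
)

daily = sp.resample(time="1D").mean()

print(daily.sizes["time"])  # 5
print(float(daily[0]))      # 11.5, the mean of 0..23
```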

### Save data

```python
sg_ds_daily.to_netcdf('export_netcdf_daily.nc')
sg_ds_daily['sst'].to_netcdf('export_netcdf_daily_sst.nc')
```

### View in QGIS

The output file is georeferenced and has a variable name and a readable time.

<figure><img src="/files/HZl9BbFMTRI3kf87HmSo" alt=""><figcaption></figcaption></figure>

Converting NetCDF to GeoTIFF will be covered in the next section.

