Pandas is a library that provides data frame (stylized DataFrame) structures and includes a suite of I/O and data manipulation tools. Unlike NumPy, Pandas allows you to reference named columns instead of using indices. With Pandas, you can perform the same kinds of essential tasks that are available in spreadsheet programs (but now automated and with fewer mouse clicks!). For those who are familiar with the R programming language, Pandas mimics the R data.frame function.
A limitation of Pandas is that it can only operate with 2D data structures. More recently, the xarray package has been developed to handle higher‐dimensional datasets. In addition, Pandas can be somewhat inefficient because the library is technically a wrapper for NumPy, so it can consume up to three times as much memory, particularly in Jupyter Notebook. For larger row operations (500K rows or greater), the differences can even out (Goutham, 2017).
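To make the contrast with NumPy concrete, here is a minimal sketch of selecting a column by position versus by name. The station values and the column names (temperature_c, humidity_pct) are invented purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical observations, invented for illustration only:
# each row is one observation of (temperature, relative humidity).
values = np.array([[25.1, 60.0], [27.4, 55.0], [22.8, 70.0]])

# In NumPy, the temperature column is selected by its integer position.
temps_numpy = values[:, 0]

# In Pandas, the same column can be selected by its name.
df = pd.DataFrame(values, columns=["temperature_c", "humidity_pct"])
temps_pandas = df["temperature_c"]

# A spreadsheet-style summary of a named column.
mean_temp = df["temperature_c"].mean()
print(mean_temp)
```

The named version tends to be easier to read back later, since the code no longer depends on remembering which column sits at which index.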
Matplotlib is a plotting library, arguably the most popular one. Matplotlib can generate histograms, power spectra, bar charts, error charts, and scatterplots with a few lines of code. The plots can be completely customized to suit your aesthetics. Due to their similarities, this is another package where MATLAB experience may come in handy.
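As a rough illustration of the "few lines of code" claim, the sketch below draws a histogram of synthetic, randomly generated samples. The data, colors, and labels are arbitrary choices for the example, not recommendations.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: 1000 draws from a standard normal distribution.
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)

# Create the figure and draw a 20-bin histogram.
fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(samples, bins=20, color="steelblue", edgecolor="black")

# Customize the labels and title.
ax.set_xlabel("Value")
ax.set_ylabel("Count")
ax.set_title("Histogram of synthetic samples")
```

In an interactive session or a notebook, you would typically drop the backend line and call plt.show() to view the figure.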
I will discuss two common self‐describing data formats, netCDF and HDF, in Section 3.2.3. Two major packages for importing these formats are the netCDF4 and h5py packages. These tools are advantageous because the user does not have to have any knowledge of how to parse the input files, so long as the files follow standard formatting. These two packages import the data, which can then be converted to NumPy to perform more rigorous data operations.
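The import-then-convert workflow might look like the following sketch, shown here with h5py. So that the example is self-contained, it first builds a tiny HDF5 file in memory; the dataset name (sst), its values, and the units attribute are all hypothetical. With a real file, you would open it in read mode instead.

```python
import h5py
import numpy as np

# Build a tiny HDF5 file in memory so the example needs no input data.
# driver="core" with backing_store=False keeps it in RAM and writes nothing to disk.
with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
    # Hypothetical dataset and attribute, invented for illustration.
    f.create_dataset("sst", data=np.array([[290.1, 291.3], [289.7, 292.0]]))
    f["sst"].attrs["units"] = "K"

    # Slicing a dataset with [:] returns its contents as a NumPy array.
    sst = f["sst"][:]
    units = f["sst"].attrs["units"]

# From here on, ordinary NumPy operations apply.
sst_celsius = sst - 273.15
print(sst_celsius.mean())
```

Because the file is self-describing, the variable name and its units travel with the data, so no custom parsing code is needed.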
Cartopy is a package for projecting geospatial data to maps. It can also be used to access a wealth of features, including land/ocean masks and topography. Many projections are available, and you can easily transform the data between them.
Previously, Basemap was the primary package for creating maps. You may come across examples that use it online. However, the package is now deprecated and Cartopy has become the primary package that interfaces with Matplotlib.
Cartopy is available from the SciTools organization and was originally developed by the UK Met Office. It has since expanded into a community collaboration.
The packages detailed in this section are worth mentioning because they may apply to your specific project. Further, some features are too good to ignore, so they are highlighted below. However, if your code requires a long‐term shelf life, it may be best to find alternative solutions, as the following packages may change more rapidly than those listed in Section 2.3.
xarray is a package that borrows heavily from Pandas to organize multidimensional data. Mathematical operations are lightning fast thanks to dimensional and coordinate indexing. Visualization is also easy. xarray is valuable to Earth scientists because it permits opening multiple netCDF files with ease. Interpolation and group operations are also possible.
The xarray syntax can be challenging to newcomers. It can be difficult to wrangle the data into the format needed. Nevertheless, this tool is worth the time investment due to the many features of interest to Earth science.
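To give a feel for the syntax, here is a minimal sketch of dimensional and coordinate indexing on a small synthetic array. The dimension names, coordinate values, and data are all invented for the example.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic coordinates: 4 days, 2 latitudes, 3 longitudes.
times = pd.date_range("2020-01-01", periods=4, freq="D")
lats = [10.0, 20.0]
lons = [100.0, 110.0, 120.0]

# Synthetic data values 0..23 arranged as (time, lat, lon).
data = np.arange(24, dtype=float).reshape(4, 2, 3)

temperature = xr.DataArray(
    data,
    coords={"time": times, "lat": lats, "lon": lons},
    dims=("time", "lat", "lon"),
    name="temperature",
)

# Coordinate indexing: select by latitude/longitude value, not array position.
point_series = temperature.sel(lat=20.0, lon=110.0)

# Dimensional indexing: reduce over a dimension by name instead of axis number.
time_mean = temperature.mean(dim="time")

print(point_series.values)
print(time_mean.shape)
```

Once the data is labeled this way, most operations can be written in terms of names such as "time" or "lat", which is a large part of what makes the initial wrangling worthwhile.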
Dask interfaces with Pandas, Scikit‐Learn, and NumPy to perform parallel processing and out‐of‐memory operations, reading data in chunks so that the full dataset never has to sit in the computer’s RAM at once. This is very useful for working with large datasets. If speed needs to be prioritized, it would be worth learning this package.
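The chunking idea can be sketched with a Dask array. The array below is small enough to fit in memory, so it only illustrates the mechanics; the array shape and chunk size are arbitrary choices for the example.

```python
import dask.array as da

# A 2000 x 2000 array of ones, split into 500 x 500 chunks (16 chunks in total).
x = da.ones((2000, 2000), chunks=(500, 500))

# Operations are lazy: this builds a task graph but computes nothing yet.
total = x.sum()

# compute() runs the graph, processing chunks (potentially in parallel)
# and combining the partial results.
result = total.compute()
print(x.numblocks, result)
```

The same pattern applies to arrays far larger than RAM, since only the chunks currently being processed need to be held in memory.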
Iris is a format‐agnostic Python library for analyzing and visualizing Earth science data. If datasets follow the standard CF formatting conventions, Iris can easily load the data. The Iris package has a steep learning curve but can be useful for performing meteorological computations. Like Cartopy, Iris is a package available from the SciTools organization.
MetPy is a collection of tools in Python for reading, visualizing, and performing calculations with weather data. MetPy enables downloading a curated collection of remote sensing datasets. Unit conversions are easy to perform, which is helpful when making calculations of meteorological variables. MetPy is maintained by Unidata in Boulder, Colorado.
Cfgrib is a useful package for reading GRIB1 and GRIB2 data, a common format for reanalysis and model data, particularly from ECMWF. Cfgrib decodes GRIB data in a way that mimics the structure of netCDF files, using the ecCodes Python package. ecCodes was developed by ECMWF to decode and encode standard WMO GRIB and BUFR files.
I hope you are excited to begin your Python journey. Since it is free and open‐source, Python is a valuable tool that you can carry with you for the rest of your career. Furthermore, there are many existing packages to perform common tasks in the Earth Sciences, such as importing common datasets, organizing data, performing mathematical analysis, and displaying results. In the next chapter, I will describe some of the common satellite data formats you may encounter.
1 Dask: Scalable analytics in Python. (n.d.). Retrieved November 25, 2020, from https://dask.org/
2 ecmwf/cfgrib. (2020). Python, European Centre for Medium‐Range Weather Forecasts. Retrieved from https://github.com/ecmwf/cfgrib (Original work published July 16, 2018).
3 Matplotlib: Python plotting — Matplotlib 3.3.3 documentation. (n.d.). Retrieved November 25, 2020, from https://matplotlib.org/.
4 MetPy — MetPy 0.12. (n.d.). Retrieved November 25, 2020, from https://unidata.github.io/MetPy/latest/index.html.
5 NumPy. (n.d.). Retrieved November 25, 2020, from https://numpy.org/
6 Overview: Why xarray? — xarray 0.16.2.dev3+g18a59a6d.d20200920 documentation. (n.d.). Retrieved November 25, 2020, from http://xarray.pydata.org/en/stable/why‐xarray.html.
7 pandas documentation — pandas 1.1.4 documentation. (n.d.). Retrieved November 25, 2020, from https://pandas.pydata.org/pandas‐docs/stable/index.html.
8 Vaze, P., Neeck, S., Bannoura, W., Green, J., Wade, A., Mignogno, M., et al. (2010). The Jason‐3 Mission: completing the transition of ocean altimetry from research to operations. In R. Meynart, S. P. Neeck, & H. Shimoda (Eds.) (p. 78260Y). Presented at the Remote Sensing, Toulouse, France. https://doi.org/10.1117/12.868543.