Python is a free and open‐source programming language. There are over 200,000 packages registered online that expand Python’s capabilities. This chapter describes some packages that are useful for the Earth sciences, including NumPy, Pandas, Matplotlib, netCDF4, h5py, Cartopy, and xarray. These packages have a strong development base and a large community of support, making them appropriate for scientific investigation.
In this chapter, I discuss some reasons why Python is a valuable tool for Earth scientists. I will also provide an overview of some of the commonly used Python packages for remote sensing applications that I will use later in this book. Python evolves rapidly, so I expect these tools to improve and new ones to become available. However, the packages covered here will provide a solid foundation for you to begin your learning.
Chances are, you may already know a little about what Python is and have some motivation to learn it. Below, I outline common reasons to use Python relevant to the Earth sciences:
Python is open‐source and free. Some of the legacy languages are for profit, and licenses can be prohibitively expensive for individuals. If your career plans include remaining at your current institution or company that supplies you with language licenses, then open source may not be of concern to you. But often, career growth means working for different organizations. Python is portable, which frees up your skillset from being exclusively reliant on proprietary software.
Python can increase productivity. There are thousands of supported libraries to download and install. For instance, if you want to open multiple netCDF files at once, the package called xarray can do that. If you want to re‐grid an irregular dataset, there is a package called pyresample that will do this quickly. Even more subject‐specific plots, like Skew‐T diagrams, can be made with a prebuilt package called MetPy. For some datasets, you can download data directly into Python using OPeNDAP. Overall, you spend less time developing routines and more time analyzing results.
Python is easy to learn, upgrade, and share. Python code is very “readable” and easy to modularize, so that functions can be easily expanded or improved. Further, low‐ or no‐cost languages like Python increase the shareability of the code. When the code works, it should be distributed online for other users’ benefit. In fact, some grants and journals require online dissemination.
You may already have knowledge of other computer languages such as IDL, MATLAB, Fortran, C++, or R. Learning Python does not mean you will stop using other languages or rewrite all your existing code. In many cases, other languages will continue to be a valuable part of your daily work. Python also has a few drawbacks:
Python may be slower than compiled languages. While many of the core scientific packages compile code on the back end, Python itself is not a compiled language. For a novice user, Python will run more slowly, especially if loops are present in the code. For a typical user, this speed penalty may not be noticeable, and advanced users can tap into other runtime frameworks like Dask or Cython or even run compiled Fortran subroutines to enhance performance. However, new users might not feel comfortable learning these workarounds, and even with runtime frameworks and subroutines, performance might not improve. If speed is a concern, then Python could be used as a prototype code tool prior to converting into a compiled language.
New users often run packages “as‐is” and the contents are not inspected. There are thousands of libraries available, but many are open‐source, community projects and bugs and errors will exist. For example, irregular syntax can result whenever there is a large community of developers. Thus, scientists and researchers should be extra vigilant and only use vetted packages.
Python packages may change function syntax or be discontinued. Python changes rapidly. While most developers refrain from abruptly changing syntax, this practice is not always followed. In addition, because much of the work in developing these packages is done on a volunteer basis, the communities supporting them could move on to other projects, and those who take over could introduce a completely new syntax structure. While this is unlikely to be the case for highly used packages, anything is possible. For example, Basemap, a popular map‐plotting toolkit for Matplotlib, was discontinued and replaced with Cartopy, which is a newer package. I recommend using packages that have backing from Earth science research institutions (e.g., the UK Met Office, NASA, Lamont, etc.) to raise confidence that the packages you choose to use will be relatively stable.
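The speed penalty for loops mentioned in the first drawback can be illustrated with a small sketch. The data, the function names, and the array size below are hypothetical and chosen only for illustration; actual timings depend on your machine and package versions.

```python
import timeit

import numpy as np

# Hypothetical data: 100,000 brightness temperatures in kelvin.
rng = np.random.default_rng(seed=0)
temps_k = rng.uniform(200.0, 320.0, size=100_000)


def to_celsius_loop(values):
    """Convert kelvin to Celsius one element at a time in a Python loop."""
    out = np.empty_like(values)
    for i in range(values.size):
        out[i] = values[i] - 273.15
    return out


def to_celsius_vectorized(values):
    """Convert kelvin to Celsius with a single NumPy array operation."""
    return values - 273.15


# Both approaches give the same answer; only the run time differs.
assert np.allclose(to_celsius_loop(temps_k), to_celsius_vectorized(temps_k))

loop_s = timeit.timeit(lambda: to_celsius_loop(temps_k), number=1)
vec_s = timeit.timeit(lambda: to_celsius_vectorized(temps_k), number=1)
print(f"loop: {loop_s:.4f} s, vectorized: {vec_s:.4f} s")
```

The vectorized version hands the arithmetic to NumPy's compiled back end, which is usually why it finishes much sooner than the explicit loop.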
Unlike legacy languages such as Fortran and C++, there is no guarantee that code written in Python will remain stable for 30+ years. However, the packages presented in this book are “mature” and are likely to continue to be supported for many years. Additionally, you can reproduce the exact packages and versions using virtual environments (Section 11.3). This text highlights newer packages that save significant amounts of development time and streamline certain processes, including opening and reading netCDF files and performing gridding operations.
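One simple way to see which versions you would need to reproduce is to record what is currently installed. The sketch below uses only the standard library; the package list and the helper names (installed_versions, as_requirements) are hypothetical, and this is not necessarily the approach taken in Section 11.3.

```python
from importlib import metadata

# Hypothetical list of packages a project depends on; edit to match your own.
PACKAGES = ["numpy", "pandas", "matplotlib", "netCDF4", "h5py", "cartopy", "xarray"]


def installed_versions(names):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def as_requirements(versions):
    """Format the installed packages as pinned 'name==version' lines."""
    return [f"{name}=={ver}" for name, ver in versions.items() if ver is not None]


snapshot = installed_versions(PACKAGES)
print("\n".join(as_requirements(snapshot)))
```

Saving these pinned lines alongside your analysis code gives a later reader a starting point for rebuilding a matching environment. The command pip freeze produces a similar list for every installed package.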
2.2 Useful Packages for Remote Sensing Visualization
Python contains intrinsic structure and mathematical commands, but its capabilities can be dramatically increased using modules. Modules are written in Python or a compiled language like C to help simplify common, general, or redundant tasks. For instance, the datetime module helps programmers manipulate calendar dates and times using a variety of units. Packages contain one or more modules, which are often designed to facilitate tasks that follow a central theme. Some other terms used interchangeably for packages are libraries and distributions.
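As a brief illustration of the datetime module mentioned above, the following sketch adds a time interval to a timestamp and extracts the day of year. The scan time is a made-up value used only for this example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical scan start time, in UTC.
scan_start = datetime(2020, 6, 12, 23, 50, 0, tzinfo=timezone.utc)

# Add ten minutes; the date rolls over to the next day automatically.
scan_end = scan_start + timedelta(minutes=10)

# Subtracting two datetimes gives a timedelta, which converts to seconds.
elapsed_seconds = (scan_end - scan_start).total_seconds()

# Day of year (1 to 366), which appears in some satellite file names.
day_of_year = scan_start.timetuple().tm_yday

print(scan_end.isoformat())  # 2020-06-13T00:00:00+00:00
print(elapsed_seconds)       # 600.0
print(day_of_year)           # 164
```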
At the time of writing, there are over 200,000 Python packages registered on pypi.org and more that live on the internet in code repositories such as GitHub (https://github.com/). Many of the most popular packages are developed and maintained by large online communities. This massive effort benefits you as a scientist because many common tasks have already been developed in Python by someone else. This can also create a dilemma for scientists and researchers: the trade‐off between using existing code to save time and the time spent researching and vetting so many code options. Additionally, because many of these packages do not have full‐time staff support, the projects can be abandoned by their development teams, and your code could eventually become obsolete.
In your research, I suggest you use three rules when choosing packages to learn and work with:
1 Use established packages.
2 Use packages that have a large community of support.
3 Use code that is efficient with respect to reduced coding time and increased speed of performance.
Following is a list of the main Python packages that I will cover in this text.
NumPy is the fundamental package for scientific computing with Python. It can work with multidimensional arrays, contains many advanced mathematical functions, and is useful for linear algebra, Fourier transforms, and for generating random numbers. NumPy also allows users to encapsulate data efficiently. If you are familiar with MATLAB, you will feel very comfortable using this package.
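The NumPy capabilities listed above can be sketched in a few lines. All array values, variable names, and the 5 Hz test signal are hypothetical and chosen only to keep the example small.

```python
import numpy as np

# Multidimensional arrays: a hypothetical 2 x 3 grid (rows x columns).
grid = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
print(grid.shape)         # (2, 3)
print(grid.mean(axis=0))  # column means: [2.5 3.5 4.5]

# Linear algebra: solve the system A x = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)  # approximately [0.8, 1.4]

# Fourier transform: find the dominant frequency of a sampled sine wave.
sample_rate = 100                        # samples per second
t = np.arange(sample_rate) / sample_rate  # one second of sample times
signal = np.sin(2 * np.pi * 5 * t)       # 5 Hz sine wave
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(signal.size, d=1 / sample_rate)
dominant_hz = freqs[np.argmax(spectrum)]
print(dominant_hz)        # 5.0

# Random numbers: a seeded generator gives reproducible draws.
rng = np.random.default_rng(seed=42)
noise = rng.normal(loc=0.0, scale=1.0, size=1000)
print(noise.shape)        # (1000,)
```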