In this article, we will look at the Python library NumPy, a widely used array-processing library in data science.
What exactly is NumPy?
It is a general-purpose array-processing package. It provides a multidimensional array object, along with tools for working with these arrays.
It is a fundamental package for scientific computing in Python. Some of its important features are as follows:
- It provides a powerful N-dimensional array object.
- It has sophisticated broadcasting functions.
- It includes tools for integrating C/C++ and Fortran code.
- It also has linear algebra, Fourier transform, and random number capabilities.
Apart from its scientific uses, NumPy can also serve as an efficient multidimensional container of generic data.
We can define arbitrary data types in NumPy, which allows it to integrate quickly and seamlessly with a wide variety of databases.
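As a minimal sketch of what a user-defined data type can look like, the example below builds a structured array. The record layout (name, age, weight) and the sample values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical record layout, chosen only for illustration:
# a string of up to 10 characters, a 32-bit integer and a 64-bit float.
person_dtype = np.dtype([("name", "U10"), ("age", "i4"), ("weight", "f8")])

people = np.array(
    [("Alice", 31, 58.5), ("Bob", 45, 82.0)],
    dtype=person_dtype,
)

# Each field can be read as an ordinary array of its own type.
names = people["name"]
mean_age = people["age"].mean()
print(names, mean_age)
```

Each row behaves like a record in a table, which is why this kind of array maps naturally onto data coming from a database.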
- On macOS and Linux, we can install NumPy using the pip command:
pip install numpy
- On Windows, the same pip command works as well, since pip is included with current Python installers and pre-built NumPy packages (wheels) are available for Windows.
- Arrays in NumPy: Its main object is a homogeneous multidimensional array.
- It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.
- In NumPy, dimensions are called axes. The number of axes is the rank.
- NumPy's array class is called ndarray.
- Creation of arrays: We can create arrays in various ways in NumPy.
- We can create an array from a regular Python list or tuple using the array function. The type of the resulting array is deduced from the type of the elements in the sequence.
- Often, the elements of an array are not known in advance, but its size is. NumPy offers functions that create arrays with initial placeholder content, for example np.zeros, np.ones, and np.full.
- To create sequences of numbers, NumPy provides functions analogous to Python's range that return arrays instead of lists.
- arange: It returns evenly spaced values within a given interval. The step size is specified.
- linspace: It returns evenly spaced values within a given interval. The number of elements is specified, and the step size is derived from it.
- Reshaping an array: We can use the reshape method to reshape an array. The new shape must be compatible with the original one, that is, the total number of elements must remain the same.
- Flattening an array: We can get a copy of the whole array collapsed into one dimension using the flatten method. It accepts an order argument, whose default value is 'C' (row-major order). For column-major order, we use 'F'.
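The creation, reshaping, and flattening functions described in the list above can be sketched as follows. The variable names and sample values are only illustrative.

```python
import numpy as np

# From a Python list: the dtype is deduced from the elements.
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([1.5, 2, 3])            # mixed int/float gives float64

# Placeholder content when only the shape is known.
zeros = np.zeros((2, 3))
ones = np.ones((2, 3))
sevens = np.full((2, 2), 7)

# arange takes a step size; linspace takes a number of points.
stepped = np.arange(0, 10, 2)        # 0, 2, 4, 6, 8
spaced = np.linspace(0, 1, 5)        # 0, 0.25, 0.5, 0.75, 1

# reshape keeps the same 6 elements in a new shape.
reshaped = a.reshape(3, 2)           # [[1, 2], [3, 4], [5, 6]]

# flatten returns a 1-D copy; order="F" reads column by column.
flat_c = a.flatten()                 # 1, 2, 3, 4, 5, 6
flat_f = a.flatten(order="F")        # 1, 4, 2, 5, 3, 6
```

Calling a.reshape(4, 2) here would raise a ValueError, because 8 slots cannot hold exactly 6 elements.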
Array indexing is important for analyzing and manipulating array objects. NumPy provides several ways of indexing arrays.
- Slicing: NumPy arrays can be sliced, just like Python lists. Because arrays can be multidimensional, we specify a slice for each dimension of the array.
- Integer array indexing: Here, a list of indices is passed for each dimension. The corresponding elements of the lists are paired one-to-one to select elements, which lets us construct a new arbitrary array.
- Boolean array indexing: This method is used to select the elements of an array that satisfy some condition.
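The three indexing styles above can be compared on one small array. This is a minimal sketch with made-up values.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]])

# Slicing: one slice per dimension (first two rows, every other column).
sliced = arr[:2, ::2]                # [[1, 3], [5, 7]]

# Integer array indexing: the row and column lists are paired one-to-one,
# selecting the elements at (0, 3), (1, 2) and (2, 0).
picked = arr[[0, 1, 2], [3, 2, 0]]   # [4, 7, 9]

# Boolean array indexing: keep the elements that satisfy a condition.
mask = arr > 8
large = arr[mask]                    # [9, 10, 11, 12]
```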
NumPy provides a wide range of built-in arithmetic operations.
- Operations on a single array: We can use overloaded arithmetic operators to perform element-wise operations on an array and create a new array. With the +=, -=, and *= operators, the existing array is modified in place.
Unary operators: Many unary operations are provided as methods of ndarray, including sum, min, and max. By setting the axis parameter, we can apply them row-wise or column-wise.
Binary operators: These operations are applied element-wise to two arrays, and a new array is created. All basic arithmetic operators can be used here. With the +=, -=, and *= operators, the existing array is modified.
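The following sketch shows these operations side by side; the arrays a, b, and c are illustrative names only.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

# Operator on a single array: a new array is created, a is unchanged.
doubled = a * 2                      # [[2, 4], [6, 8]]

# Operator between two arrays, applied element-wise.
total = a + b                        # [[11, 22], [33, 44]]

# In-place operator: the existing array c is modified.
c = a.copy()
c += 1                               # c is now [[2, 3], [4, 5]]

# Unary operations as ndarray methods, optionally along an axis.
overall_sum = a.sum()                # 10
column_sums = a.sum(axis=0)          # [4, 6]
row_max = a.max(axis=1)              # [2, 4]
```

Note that axis=0 collapses the rows and so yields one result per column, while axis=1 yields one result per row.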
- Universal functions (ufuncs): NumPy provides familiar mathematical functions such as sin, cos, and exp. These functions operate element-wise on an array and produce an array as output.
- Sorting arrays: We can use the sort method to sort a NumPy array in place. The np.sort function returns a sorted copy instead.
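As a short illustration of universal functions and sorting, consider the sketch below. The input values are arbitrary.

```python
import numpy as np

angles = np.array([0.0, np.pi / 2, np.pi])

# Universal functions work element-wise and return a new array.
sines = np.sin(angles)               # approximately [0, 1, 0]
powers = np.exp(np.array([0.0, 1.0]))  # [1, e]

data = np.array([[3, 1, 2], [0, 7, 1]])

# np.sort returns a sorted copy (along the last axis by default).
sorted_rows = np.sort(data)          # [[1, 2, 3], [0, 1, 7]]
sorted_cols = np.sort(data, axis=0)  # [[0, 1, 1], [3, 7, 2]]

# The sort method sorts the array itself, in place.
values = np.array([5, 2, 4])
values.sort()                        # values is now [2, 4, 5]
```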