
Learn NumPy - Free tutorial

NumPy stands for Numerical Python and is a Python library used for working with arrays. It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices. It was created in 2005 by Travis Oliphant and is an open source project that can be used freely.

Why Use NumPy

Python has lists that serve the purpose of arrays, but they are slow to process. NumPy aims to provide an array object that is up to 50x faster than traditional Python lists. The array object in NumPy is called ndarray. It provides a lot of supporting functions that make working with ndarray very easy.

NumPy arrays are stored in one contiguous place in memory, unlike lists, so processes can access and manipulate them very efficiently. This behaviour is called locality of reference in computer science. NumPy is also optimized to work with the latest CPU architectures. This is why NumPy is faster than lists. NumPy is written partially in Python, but most of the parts that require fast computation are written in C or C++.
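
The speed difference can be checked with a rough timing sketch like the one below. The array size, the sum-of-squares workload and the variable names are assumptions chosen for illustration; the measured ratio depends on the machine, the NumPy version and the operation, so it will not necessarily be 50x.

```python
import timeit

import numpy as np

n = 100_000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)  # explicit 64-bit integers to avoid overflow

# Both approaches compute the same sum of squares.
list_result = sum(x * x for x in py_list)
array_result = int(np.sum(np_array * np_array))

# Time 10 repetitions of each approach.
list_time = timeit.timeit(lambda: sum(x * x for x in py_list), number=10)
array_time = timeit.timeit(lambda: np.sum(np_array * np_array), number=10)

print('same result:', list_result == array_result)
print(f'list: {list_time:.4f}s, ndarray: {array_time:.4f}s')
```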

The source code for NumPy can be found on NumPy's official GitHub Page.

Installation

If you have Python and pip already installed, then installing NumPy is very easy. Use the following command:

 pip install numpy  

If this command fails, use a Python distribution that already has NumPy installed, like Anaconda, Spyder etc. Once it is installed, you can import it for use in your applications using the import keyword. It is usually imported under the np alias. You can create an alias with the as keyword while importing. Consider the example below:

 import numpy as np  
 array1 = np.array([1, 2, 3, 4, 5])  
 print(array1)  

To check the NumPy version, use the __version__ attribute, which stores the version string.

 import numpy as np  
 print(np.__version__)  

Creating Arrays

As previously stated, NumPy is used to work with arrays. The array object in NumPy is called ndarray. We can create a NumPy ndarray object using the array( ) function. 

 import numpy as np  
 array1 = np.array([1, 2, 3, 4, 5]) 
 print(array1)  
 print(type(array1))  

The built-in Python function type( ) tells us the type of the object passed to it. In the code above, the type( ) function shows that array1 is of type numpy.ndarray. To create an ndarray, we can pass a list, tuple or any array-like object into the array( ) function, and it will be converted into an ndarray. For example, if we pass a tuple, the code will appear as below:

 import numpy as np  
 array1 = np.array((1, 2, 3, 4, 5))  
 print(array1)  

Dimensions in Arrays

A dimension in arrays is one level of array depth (nested arrays are arrays that have arrays as their elements).

0-D arrays, or scalars, are the elements in an array. Each value in an array is a 0-D array. We can create a 0-D array with any number, for example 11:

 import numpy as np  
 array1 = np.array(11)  
 print(array1)  

A 1-D array is an array that has 0-D arrays as its elements. It is also called a uni-dimensional array. These are the most common and basic arrays. For example:

 import numpy as np  
 array1 = np.array([1, 2, 3, 4, 5])  
 print(array1)  

2-D arrays

It is an array that has 1-D arrays as its elements. These are often used to represent matrices or second-order tensors. NumPy has a whole submodule dedicated to linear algebra operations on matrices, called numpy.linalg.

For example, we can create a 2-D array containing the numbers 1 to 6:

 import numpy as np  
 array1 = np.array([[1, 2, 3], [4, 5, 6]])  
 print(array1)  

3-D arrays

It is an array that has 2-D arrays (matrices) as its elements. These are often used to represent third-order tensors. We can create a 3-D array with two 2-D arrays, both containing two arrays with the values 1 to 6:

 import numpy as np  
 array1 = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])  
 print(array1)  

How to check the number of dimensions

NumPy arrays provide the ndim attribute, which returns an integer that tells us how many dimensions the array has. To check how many dimensions an array has, we can use code like the one provided below. We will reuse the arrays we created in the examples above.

 import numpy as np  
 x = np.array(11)  
 y = np.array([1, 2, 3, 4, 5])  
 z = np.array([[1, 2, 3], [4, 5, 6]])  
 a = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])  
   
 print(x.ndim)  
 print(y.ndim)  
 print(z.ndim)  
 print(a.ndim)  

Output:

0

1

2

3

Higher dimensional arrays

An array can have any number of dimensions. When the array is created, you can define the minimum number of dimensions by using the ndmin argument. As an example, we will create an array with 5 dimensions and verify that it indeed has 5 dimensions.

 import numpy as np  
 array1 = np.array([1, 2, 3, 4], ndmin = 5)  
 print(array1)  
 print('Number of dimensions:', array1.ndim)   

In this array, the innermost dimension (5th dim) has 4 elements. The 4th dim has 1 element, which is the vector; the 3rd dim has 1 element, which is the matrix containing the vector; the 2nd dim has 1 element, which is a 3-D array; and the 1st dim has 1 element, which is a 4-D array.

Array Indexing

Array indexing is the same as accessing an array element. You can access an array element by referring to its index number. The indexes in NumPy arrays start with 0 meaning that the first element has index 0 and the second element has an index 1 and so on. To get the first element in an array:

 import numpy as np  
 array1 = np.array([1, 2, 3, 4])  
 print(array1[0])  

To get the 5th and 6th elements from an array and add them:

 import numpy as np  
 array1 = np.array([1, 2, 3, 4, 5, 6, 7])  
 print(array1[4] + array1[5]) 

Access 2-D arrays

To access elements from 2-D arrays, we can use comma-separated integers representing the dimension and index of the element. To access the 2nd element on the 1st dim:

 import numpy as np  
 array1 = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])  
 print('2nd element on 1st dim:', array1[0, 1])   

To access the 4th element on the 2nd dim:

 import numpy as np  
 array1 = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])  
 print('4th element on 2nd dim:', array1[1, 3])    

Access 3-D arrays

To access elements from 3-D arrays, we can use comma-separated integers representing the dimensions and the index of the element. For example, to access the third element of the second array of the first array:

 import numpy as np  
 array1 = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])  
 print(array1[0, 1, 2])   

The output is 6.

Explanation

The first number represents the first dimension, which contains 2 arrays.

[[1, 2, 3], [4, 5, 6]] and [[7, 8, 9], [10, 11, 12]].

Since we selected 0, we are left with the 1st array. [[1, 2, 3], [4, 5, 6]]. 

The second number represents the 2nd dimension which also contains 2 arrays. [1, 2, 3] and [4, 5, 6].

Since we selected 1 we are left with the 2nd array [4, 5, 6].

The third number represents the 3rd dimension, which contains 3 values: 4, 5, 6. Since we selected 2, we end up with the third value, which is 6.
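
The same walk-through can be reproduced in code by indexing one dimension at a time. This is a minimal sketch; the intermediate names first, second and third are introduced here only for illustration.

```python
import numpy as np

array1 = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

first = array1[0]   # 2-D array holding [1, 2, 3] and [4, 5, 6]
second = first[1]   # 1-D array holding 4, 5, 6
third = second[2]   # the scalar 6

# Indexing step by step gives the same element as array1[0, 1, 2].
print(third == array1[0, 1, 2])  # True
```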

Negative Indexing

Use negative indexing to access an array from the end. For example, to print the last element from the 2nd dim:

 import numpy as np  
 array1 = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])  
 print('Last element from 2nd dim:', array1[1, -1])  

Output is 10

Array Slicing

Slicing in Python means taking elements from one given index to another given index. We pass a slice instead of an index, like this: [start:end]. We can also define the step, like this: [start:end:step]. If we don't pass start, it's considered 0. If we don't pass end, it's considered the length of the array in that dimension. If we don't pass step, it's considered 1.
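
The default values for start, end and step can be seen in a short sketch like this one, which compares each shortened slice with its fully spelled-out form:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

from_start = arr[:3]     # start omitted, same as arr[0:3]
to_end = arr[4:]         # end omitted, same as arr[4:7]
default_step = arr[1:5]  # step omitted, same as arr[1:5:1]

print(from_start)    # [1 2 3]
print(to_end)        # [5 6 7]
print(default_step)  # [2 3 4 5]
```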

Example

Slice elements from index 1 to index 5 (not included) from the following array:

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4, 5, 6, 7])  
   
 print(arr[1:5])   

The result includes the start index but excludes the end index.

Negative slicing

Use the minus operator to refer to an index from the end.

Example

Slice from the index 3 from the end to index 1 from the end:

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4, 5, 6, 7])  
   
 print(arr[-3:-1])   

Output: [5 6]

Step

Use the step value to determine the step of the slicing:

Example

Return every other element from index 1 to index 5:

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4, 5, 6, 7])  
   
 print(arr[1:5:2])   

Output: [2 4]

Return every other element from the entire array:

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4, 5, 6, 7])  
   
 print(arr[::2])  

Output: [1 3 5 7]

Slicing 2-D arrays

From the second element, slice elements from index 1 to index 4 (not included):

 import numpy as np  
   
 arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])  
   
 print(arr[1, 1:4])  

Output: [7 8 9]

From both elements return index 2:

 import numpy as np  
   
 arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])  
   
 print(arr[0:2, 2])   

Output: [3 8]

From both elements, slice index 1 to index 4 (not included), this will return a 2-D array:

 import numpy as np  
   
 arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])  
   
 print(arr[0:2, 1:4])  

Output: [[2 3 4] [7 8 9]]

Data Types

By default, Python has these data types:

  • strings, integer, float, boolean, complex (e.g. 1.0 + 2.0j)

NumPy has some extra data types and refers to data types with one character, like i for integers and u for unsigned integers. Below is the full list:

i - integer, b - boolean, u - unsigned integer, f - float, c - complex float, m - timedelta, M - datetime, O - object, S - string, U - unicode string, V - fixed chunk of memory for other types (void)
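
These one-character codes can be read back from an array through the kind attribute of its dtype. The sketch below builds a few small sample arrays (the values are arbitrary) and prints the character for each:

```python
import numpy as np

samples = [
    np.array([1, 2]),         # integers
    np.array([1.5, 2.5]),     # floats
    np.array([True, False]),  # booleans
    np.array([1 + 2j]),       # complex numbers
    np.array(['a', 'b']),     # unicode strings
]

# dtype.kind is the single character that identifies the general type.
kinds = [a.dtype.kind for a in samples]
print(kinds)  # ['i', 'f', 'b', 'c', 'U']
```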

Checking the data type for an array

The NumPy array object has a property called dtype that returns the data type of the array.

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4])  
   
 print(arr.dtype)  

Output: int64

Creating arrays with a defined data type

We use the array( ) function to create arrays. This function can take an optional argument, dtype, that allows us to define the expected data type of the array elements:

Example - Create an array with data type string:

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4], dtype='S')  
   
 print(arr)  
 print(arr.dtype)  

Output: [b'1' b'2' b'3' b'4'] 

|S1

For i, u, f, S, and U we can define the size as well.

Example 2 - Create an array with a 4-byte integer data type:

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4], dtype='i4')  
   
 print(arr)  
 print(arr.dtype)  

Output: [1 2 3 4] 

int32

What if a value cannot be converted? If a type is given to which the elements cannot be cast, NumPy will raise a ValueError. In Python, ValueError is raised when an argument passed to a function has the right type but an inappropriate value.
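
As an illustration, the string 'a' cannot be converted to an integer, so requesting an integer dtype for it fails. This is a minimal sketch; the raised flag is only there to record what happened.

```python
import numpy as np

raised = False
try:
    # 'a' has no integer representation, so the conversion fails.
    arr = np.array(['a', '2', '3'], dtype='i')
except ValueError as err:
    raised = True
    print('Conversion failed:', err)

print(raised)  # True
```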

Converting data type on existing arrays

The best way to change the data type of an existing array is to make a copy of the array with the astype() method. The astype() method creates a copy of the array and allows you to specify the data type as a parameter.

The data type can be specified using a string, like 'f' for float, 'i' for integer etc or you can use the data type directly like float for float and int for integer.

Change data type from float to integer by using 'i' as parameter value:

 import numpy as np  
   
 arr = np.array([1.1, 2.1, 3.1])  
   
 newarr = arr.astype('i')  
   
 print(newarr)  
 print(newarr.dtype)  

Output: [1 2 3]

int32

Change data type from float to integer by using int as parameter value:

 import numpy as np  
   
 arr = np.array([1.1, 2.1, 3.1])  
   
 newarr = arr.astype(int)  
   
 print(newarr)  
 print(newarr.dtype)  

Output: [1 2 3]

int64

Change data type from integer to boolean:

 import numpy as np  
   
 arr = np.array([1, 0, 3])  
   
 newarr = arr.astype(bool)  
   
 print(newarr)  
 print(newarr.dtype)  

Output: [True False True]

bool

Copy vs View

The main difference between a copy and a view of an array is that the copy is a new array, and the view is just a view of the original array.

The copy owns the data and any changes made to the copy will not affect original array, and any changes made to the original array will not affect the copy.

The view does not own the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view.

Copy

Make a copy, change the original array, and display both arrays:

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4, 5])  
 x = arr.copy()  
 arr[0] = 42  
   
 print(arr)  
 print(x)  

Output: [42 2 3 4 5]

[1 2 3 4 5]

The copy should not be affected by the changes made to the original array.

View

Make a view, change the original array, and display both arrays:

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4, 5])  
 x = arr.view()  
 arr[0] = 42  
   
 print(arr)  
 print(x)  

Output: [42 2 3 4 5]

[42 2 3 4 5]

The view should be affected by the changes made to the original array.

Make changes in the view:

Make a view, change the view, and display both arrays:

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4, 5])  
 x = arr.view()  
 x[0] = 31  
   
 print(arr)  
 print(x)  

Output: [31 2 3 4 5]

[31 2 3 4 5]

The original array should be affected by the changes made to the view.

Check if an array owns its data

Copy owns the data and view does not own the data. So how can we check this?

Every NumPy array has the attribute base that returns None if the array owns the data. Otherwise, the base attribute refers to the original object.

Print the value of the base attribute to check if an array owns its data or not:

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4, 5])  
   
 x = arr.copy()  
 y = arr.view()  
   
 print(x.base)  
 print(y.base)  

Output: None

[1 2 3 4 5]

The copy returns None. The view returns the original array.
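
The base attribute is also a handy way to see that basic slicing returns a view rather than a copy. In the sketch below (the name part is illustrative), a change made through the slice shows up in the original array:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

part = arr[1:4]          # basic slicing gives a view of arr
print(part.base is arr)  # True

part[0] = 99             # writes through to the original array
print(arr[1])            # 99
```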

Array Shape

The shape of an array is the number of elements in each dimension. NumPy arrays have an attribute called shape that returns a tuple with each index having the number of corresponding elements.

To print the shape of a 2-D array:

 import numpy as np  
   
 arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])  
   
 print(arr.shape)  

The code above returns (2, 4), which means that the array has 2 dimensions: the first dimension has 2 elements and the second has 4.

Create an array with 5 dimensions using ndmin from a vector with the values 1, 2, 3, 4, and verify that the last dimension has 4 elements:

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4], ndmin=5)  
   
 print(arr)  
 print('shape of array:', arr.shape)  

Output: [[[[[1 2 3 4]]]]]

shape of array: (1, 1, 1, 1, 4)

Explanation: In this array, the innermost dimension (5th dim) has 4 elements. The 4th dim has 1 element, which is the vector; the 3rd dim has 1 element, which is the matrix containing the vector; the 2nd dim has 1 element, which is a 3-D array; and the 1st dim has 1 element, which is a 4-D array.

What does the shape tuple represent?

The integer at each index tells us the number of elements the corresponding dimension has. In the example above, at index 4 we have the value 4, so we can say that the 5th (4 + 1) dimension has 4 elements.
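
The relationship between shape, ndim and the total number of elements can be checked directly. This short sketch reuses the 5-dimensional array from above:

```python
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

# The shape tuple has one entry per dimension.
print(len(arr.shape) == arr.ndim)  # True

# Index 4 of the shape tuple describes the 5th dimension.
print(arr.shape[4])  # 4

# The total number of elements is the product of all entries in shape.
print(arr.size)  # 4
```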

Reshaping arrays

Reshaping means changing the shape of an array. The shape of an array is the number of elements in each dimension. By reshaping we can add or remove dimensions or change number of elements in each dimension.

Example

Convert the following 1-D array with 12 elements into a 2-D array

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])  
   
 newarr = arr.reshape(4, 3)  
   
 print(newarr)  

The outermost dimension will have 4 arrays each with 3 elements.

Output: [[1 2 3] 

[4 5 6] 

[7 8 9] 

[10 11 12]]

Reshape from 1-D to 3-D

Convert the following 1-D array with 12 elements into a 3-D array. The outermost dimension will have 2 arrays that contain 3 arrays, each with 2 elements.

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])  
   
 newarr = arr.reshape(2, 3, 2)  
   
 print(newarr)  

Output

[[[1 2] 

[3 4]

[5 6]]


[[ 7 8]

[9 10]

[11 12]]]

Can We Reshape into any Shape?

Yes, as long as the number of elements required for reshaping is equal in both shapes.

We can reshape an 8-element 1-D array into a 2-D array with 2 rows of 4 elements, but we cannot reshape it into a 2-D array with 3 rows of 3 elements, as that would require 3x3 = 9 elements.
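
A quick way to see this rule in action is to try both reshapes. This is a minimal sketch; the failed flag is only there to record the outcome.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# 2 x 4 = 8 elements, so this works.
ok = arr.reshape(2, 4)
print(ok.shape)  # (2, 4)

# 3 x 3 = 9 elements, so NumPy raises a ValueError.
failed = False
try:
    arr.reshape(3, 3)
except ValueError as err:
    failed = True
    print('Reshape failed:', err)
```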

Check if the returned array is a copy or a view

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])  
   
 print(arr.reshape(2, 4).base)  

Output: [1 2 3 4 5 6 7 8]

The example above returns the original array, so it is a view.

You are allowed to have one "unknown" dimension.

Meaning that you do not have to specify an exact number for one of the dimensions in the reshape method.

Pass -1 as the value, and NumPy will calculate this number for you.

Convert a 1D array with 8 elements to a 3D array with 2x2 elements:

 import numpy as np  
   
 arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])  
   
 newarr = arr.reshape(2, 2, -1)  
   
 print(newarr)  

Output: [[[1 2]

[3 4]]


[[5 6]

[7 8]]]

We cannot pass -1 to more than one dimension.
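
The following sketch shows both cases: a single -1 is resolved by NumPy, while two -1 values leave the shape ambiguous and raise a ValueError. The names inferred and failed are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# One unknown dimension: NumPy works out that it must be 2.
inferred = arr.reshape(-1, 4)
print(inferred.shape)  # (2, 4)

# Two unknown dimensions: NumPy cannot decide and raises an error.
failed = False
try:
    arr.reshape(-1, -1)
except ValueError as err:
    failed = True
    print('Reshape failed:', err)
```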

Flattening Arrays

Flattening array means converting a multidimensional array into a 1D array.

We can use reshape(-1) to do this.

Convert the array into a 1D array.

 import numpy as np  
   
 arr = np.array([[1, 2, 3], [4, 5, 6]])  
   
 newarr = arr.reshape(-1)  
   
 print(newarr)  

Output: [1 2 3 4 5 6]

There are a lot of functions for changing the shapes of arrays in NumPy, such as flatten and ravel, and also for rearranging the elements, such as rot90, flip, fliplr, flipud etc. These fall under the intermediate to advanced section of NumPy.
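
As a brief taste of two of these, flatten always returns a copy, while ravel returns a view when the memory layout allows it. The sketch below uses np.shares_memory to tell the two apart for a simple contiguous array; for other layouts ravel may have to copy as well.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr.flatten()  # always a new array
flat_view = arr.ravel()    # a view here, because arr is contiguous

print(flat_copy)                         # [1 2 3 4 5 6]
print(np.shares_memory(arr, flat_copy))  # False
print(np.shares_memory(arr, flat_view))  # True
```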

Array Iterating

Iterating means going through elements one by one. As we deal with multi-dimensional arrays in NumPy, we can do this using the basic for loop of Python. If we iterate on a 1D array, it will go through each element one by one. Iterate on the elements of the following 1D array:

 import numpy as np  
   
 arr = np.array([1, 2, 3])  
   
 for x in arr:  
  print(x)  

Output:

1

2

3

Iterating 2D arrays

In a 2D array it will go through all the rows. Iterate on the elements in the following 2D array:

 import numpy as np  
   
 arr = np.array([[1, 2, 3], [4, 5, 6]])  
   
 for x in arr:  
  print(x)  

Output:

[1 2 3]

[4 5 6]

If we iterate on an n-D array, it will go through the (n-1)-D arrays it contains one by one. To return the actual values, the scalars, we have to iterate the arrays in each dimension. Iterate on each scalar element of the 2D array:

 import numpy as np  
   
 arr = np.array([[1, 2, 3], [4, 5, 6]])  
   
 for x in arr:  
  for y in x:  
   print(y)  

Output:

1

2

3

4

5

6

Iterating 3D arrays

In a 3D array it will go through all the 2D arrays. Iterate on the elements of the following 3D array:

 import numpy as np  
   
 arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])  
   
 for x in arr:  
  print(x)  

To return the actual values, the scalars, we have to iterate the arrays in each dimension. Iterate down to the scalars:

 import numpy as np  
   
 arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])  
   
 for x in arr:  
  for y in x:  
   for z in y:  
    print(z)  

Output (each value is printed on its own line):

1 2 3 4 5 6 7 8 9 10 11 12

Iterating Arrays using nditer( )

The function nditer( ) is a helper function that can be used for very basic to very advanced iterations. It solves some basic issues which we face in iteration.
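
One such issue is the need for one nested for loop per dimension. The sketch below uses np.nditer to visit every scalar of a 3D array with a single loop; the array values and the name values are chosen only for illustration.

```python
import numpy as np

arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# A single loop reaches every scalar, however many dimensions there are.
values = []
for x in np.nditer(arr):
    values.append(int(x))

print(values)  # [1, 2, 3, 4, 5, 6, 7, 8]
```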

