Lesson 4

01 - What is NumPy

NumPy:

is a Python library that acts as a wrapper around underlying C and Fortran code, which makes it very fast.
focuses on matrices, which are called ndarrays (n-dimensional arrays). Its syntax is very similar to MATLAB's.

02 - Lesson outline

If you’re familiar with NumPy (esp. the following operations), feel free to skim through this lesson.

Create a NumPy array:

from a pandas dataframe: pandas.DataFrame.values

from a Python sequence: numpy.array

with constant initial values: numpy.ones, numpy.zeros

with random values: numpy.random

Access array attributes: shape, ndim, size, dtype

Compute statistics: sum, min, max, mean

Carry out arithmetic operations: add, subtract, multiply, divide

Measure execution time: time.time, profile

Manipulate array elements: Using simple indices and slices, integer arrays, boolean arrays

03 - Relationship to Pandas

Pandas is a kind of wrapper around NumPy.

A data frame is just a wrapper around an ndarray that lets you access the columns by symbol and the rows by date. You CAN treat a data frame as an ndarray directly. However, the data frame gives us many more routines.

04 - Notes on Notation

How to access cells within an ndarray:

nd1[row, col]

In NumPy, rows and columns are numbered from 0, so nd1[0,0] is the first item.
Slicing: how do we address a sub-portion of the ndarray?
Use the colon (:)

nd1[0:3,1:3]

[0:3,1:3] selects from row 0 up to, but not including, row 3, and from column 1 up to, but not including, column 3. The last value in each range is one past the last index you actually want to include.

Use the colon by itself in the row position to select all of the rows.
Negative indices count from the end: the last row is -1 and the second-to-last row is -2.
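The notation above can be tried out on a small array. This is a minimal sketch; nd1 here is a made-up 4x4 array built with np.arange and reshape, not data from the lesson.

```python
import numpy as np

# Made-up 4x4 array holding the values 0..15, filled row by row.
nd1 = np.arange(16).reshape(4, 4)

first = nd1[0, 0]        # element in row 0, column 0
block = nd1[0:3, 1:3]    # rows 0, 1, 2 and columns 1, 2
col = nd1[:, 1]          # every row of column 1
last_row = nd1[-1]       # the last row
second_last = nd1[-2]    # the second-to-last row

print(block)
```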

05 - Quiz: Replace a slice

Suppose we have two ndarrays, nd1 and nd2, and we want to replace some of the values in nd1 with values from nd2.

Which are correct?
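The quiz's arrays and answer options are not reproduced in these notes. As a general illustration of replacing a slice, here is a sketch with made-up arrays; the shapes and values are assumptions chosen only to make the effect visible.

```python
import numpy as np

# Made-up arrays; these are not the ones from the quiz.
nd1 = np.zeros((4, 4), dtype=int)
nd2 = np.arange(1, 17).reshape(4, 4)

# Copy the last two rows, columns 2 and 3, of nd2
# into the first two rows, columns 0 and 1, of nd1.
nd1[0:2, 0:2] = nd2[-2:, 2:4]

print(nd1)
```

The slice on the left and the slice on the right must have compatible shapes, here 2x2 on both sides.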

06 - Creating NumPy arrays

You can access the underlying NumPy array within a Pandas data frame using the values property.
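A minimal sketch of pulling the ndarray out of a data frame. The column names and prices are placeholders invented for this example.

```python
import numpy as np
import pandas as pd

# Made-up price table; the symbols and numbers are placeholders.
df = pd.DataFrame({"SPY": [100.0, 101.5], "XOM": [50.0, 49.5]})

arr = df.values          # the data as a NumPy ndarray, without row or column labels

print(type(arr), arr.shape)
```

Newer pandas versions also offer df.to_numpy(), which serves the same purpose.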

Let’s create NumPy arrays from scratch.

First, a one-dimensional array from known values.

Use the array function to convert most array-like objects into an ndarray.

The example imports the numpy library under the name np, then calls the function np.array and passes it a list with the values [2,3,4].
The np.array function can take as input a list, a tuple, or another sequence.

Now create a 2D array by passing in a sequence of sequences to the np.array function.

The sequence of sequences can be a list of lists, a list of tuples, and so on.
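The two cases described above might look like this. The 1D values come from the notes; the 2D values are made up for illustration.

```python
import numpy as np

one_d = np.array([2, 3, 4])                 # 1D array from a list
two_d = np.array([(2, 3, 4), (5, 6, 7)])    # 2D array from a list of tuples

print(one_d)
print(two_d)
```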

07 - Arrays with initial values

Empty array

The np.empty() function takes the shape of the array as input.
Passing a single number, such as 5, creates an empty 1D array with five elements.
Passing a tuple with the values 5 and 4 creates an empty array with five rows and four columns.
Add another number to the tuple to create arrays of higher dimension.
The empty array is not actually empty! Its elements take on whatever values happened to be present in the corresponding memory locations.
By default, the elements are floating-point numbers.
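A short sketch of the three shapes mentioned above. The printed contents are arbitrary and will differ from run to run, so only the shape and dtype are meaningful.

```python
import numpy as np

a1 = np.empty(5)            # 1D array with 5 elements
a2 = np.empty((5, 4))       # 2D array: 5 rows, 4 columns
a3 = np.empty((5, 4, 3))    # 3D array

# The values are whatever was in memory; do not rely on them.
print(a2.shape, a2.dtype)
```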

np.ones

Next, we create an array full of ones using np.ones(). For example, np.ones((5, 4)) creates an array of 5 rows and 4 columns with all the values equal to 1.
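A minimal sketch of constant-valued arrays, using np.ones together with np.zeros from the lesson outline.

```python
import numpy as np

ones = np.ones((5, 4))      # 5 rows, 4 columns, every value 1.0
zeros = np.zeros((5, 4))    # same shape, every value 0.0

print(ones)
```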

08 - Specify the datatype

What parameter do you need to add to this function to create an array of integers instead?

The documentation for the numpy.ones() function might be helpful.

Documentation: numpy.ones

NumPy User Guide: Data types

Documentation:

numpy.empty

numpy.ones

numpy.zeros

numpy.array

numpy.ndarray (direct ndarray constructor)

Answer: dtype is the parameter. Set it to the NumPy data type np.int_ to make the values integers.
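The answer can be checked with a short sketch that compares the default dtype with an explicit integer dtype.

```python
import numpy as np

float_ones = np.ones((5, 4))                  # default dtype is float
int_ones = np.ones((5, 4), dtype=np.int_)     # request integers explicitly

print(float_ones.dtype, int_ones.dtype)
```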

09 - Generating random numbers

NumPy functions to generate arrays filled with random values:

np.random.random(), np.random.rand(), np.random.normal()

The np.random.random() function generates uniformly sampled floating-point values in [0.0, 1.0).

Note: the array shape is normally passed as a tuple. However, np.random.rand() accepts the dimensions as separate arguments instead of a tuple, so np.random.rand(5,4) produces an array with the same shape and distribution as np.random.random((5,4)). NumPy provides this for compatibility with MATLAB syntax.

What if you wanted a sample from a different distribution?

The np.random.normal() function samples from a normal distribution. By default the mean is 0 and the standard deviation is 1; the function also accepts the mean and standard deviation of the distribution as inputs.

To generate integers, use the np.random.randint() function.

Passing the two values 0 and 10, np.random.randint() generates a single integer in the range [0, 10), so 10 itself is never returned.
Passing size=5 to np.random.randint() generates 5 integers in that range as a 1D array.
Passing a tuple to the size parameter creates a 2D array with all the values in that range.
Check out the random sampling routines on the NumPy website for more distributions and usage variations.
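The calls described in this section can be sketched as follows. The shapes, mean, and standard deviation are arbitrary choices for illustration; the values differ on every run because no seed is set.

```python
import numpy as np

u1 = np.random.random((5, 4))                 # uniform in [0.0, 1.0), shape given as a tuple
u2 = np.random.rand(5, 4)                     # same distribution, shape given as separate arguments

n1 = np.random.normal(size=(2, 3))            # standard normal: mean 0, std 1
n2 = np.random.normal(50, 10, size=(2, 3))    # mean 50, std 10

i1 = np.random.randint(0, 10)                 # a single integer in [0, 10)
i2 = np.random.randint(0, 10, size=5)         # 5 integers in a 1D array
i3 = np.random.randint(0, 10, size=(2, 3))    # 2x3 array of integers

print(i3)
```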

NumPy Reference: Random sampling

Sampling functions:

numpy.random.random: Samples a Uniform distribution in [0.0, 1.0)

numpy.random.rand: Like random, but slightly different syntax

numpy.random.normal: Normal or Gaussian distribution

numpy.random.randint: Integers from Uniform distribution

10 - Array attributes

Attributes like size and shape are very useful when you have to iterate over array elements to perform some computation.

Given an ndarray a, created with a = np.random.random((5,4)):

a.shape returns the shape of the array as a tuple: (5, 4)

a.shape[0] returns the number of rows: 5

a.shape[1] returns the number of columns: 4

len(a.shape) and a.ndim return the number of dimensions of the array; a has 2 dimensions

a.size returns the total number of elements in the array

a.dtype gives the data type of the values in the array
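A quick sketch that reads each attribute from the 5x4 array used in this section.

```python
import numpy as np

a = np.random.random((5, 4))

rows = a.shape[0]     # number of rows
cols = a.shape[1]     # number of columns
dims = a.ndim         # number of dimensions, same as len(a.shape)
count = a.size        # total number of elements
kind = a.dtype        # data type of the elements

print(a.shape, dims, count, kind)
```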

Attributes of numpy.ndarray:

numpy.ndarray.shape: Dimensions (height, width, …)

numpy.ndarray.ndim: No. of dimensions = len(shape)

numpy.ndarray.size: Total number of elements

numpy.ndarray.dtype: Datatype

Time: 00:02:33

11 - Operations on ndarrays

Mathematical operations on NumPy arrays.

Seed the random number generator so that the random numbers are reproducible:

import numpy as np

np.random.seed(693)
a = np.random.randint(0,10, size = (5,4))

The output is an array with five rows and four columns, with all the values in the range 0 to 9 (the upper bound 10 is excluded). Because the seed function is called with a constant, the random number generator produces the same sequence of numbers every time.
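One way to convince yourself that seeding works is to reseed with the same constant and compare the two results. This is a sketch; the variable names first and second are made up.

```python
import numpy as np

np.random.seed(693)
first = np.random.randint(0, 10, size=(5, 4))

np.random.seed(693)   # reset the generator to the same state
second = np.random.randint(0, 10, size=(5, 4))

print(np.array_equal(first, second))
```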

Summing

a.sum() sums all the elements in the array.
a.sum(axis=0) returns the sum of each column.
a.sum(axis=1) returns the sum of each row.

Minimum, maximum, and mean of an array:

a.min(axis=0) returns the minimum of each column.
a.max(axis=1) returns the maximum of each row.
a.mean() returns the mean of the entire array.
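To make the axis argument concrete, here is a sketch on a small fixed array (made up for this example rather than the seeded random one), so the results are easy to check by hand.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

total = a.sum()             # 21
col_sums = a.sum(axis=0)    # one sum per column: [5, 7, 9]
row_sums = a.sum(axis=1)    # one sum per row: [6, 15]

col_mins = a.min(axis=0)    # [1, 2, 3]
row_maxs = a.max(axis=1)    # [3, 6]
mean = a.mean()             # 3.5
```

axis=0 collapses the rows, leaving one result per column; axis=1 collapses the columns, leaving one result per row.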

More operations:
NumPy Reference: Mathematical functions
- numpy.sum: Sum of elements - along rows, columns or all
- numpy.min, numpy.max, numpy.mean: Simple statistics
Also: numpy.random.seed to (re)set the random number generator.

Time: 00:03:40

12 - Quiz: Locate maximum value

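Finding the position of an element in an array.

The answer can also be written as return a.argmax(). Note that for a multi-dimensional array, argmax() returns the index into the flattened array.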
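A sketch of the argmax approach. The function name get_max_index and the sample arrays are assumptions made for this example; np.unravel_index is shown as one way to turn a flat index back into row and column positions.

```python
import numpy as np

def get_max_index(a):
    """Return the index of the maximum value in a 1D array (hypothetical helper)."""
    return a.argmax()

a = np.array([9, 6, 2, 3, 12, 14, 7, 10])
idx = get_max_index(a)          # 5, because a[5] == 14

# For a 2D array, argmax() indexes into the flattened array.
b = np.array([[1, 8],
              [9, 4]])
flat = b.argmax()               # 2
row, col = np.unravel_index(flat, b.shape)   # (1, 0)
```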

NumPy Reference: Sorting, searching, and counting

13 - Timing python operations

How to time a particular operation.

The time library can tell us how fast an operation is.

Capture a time snapshot before and after the operation is performed, then subtract the two times.
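The before-and-after pattern can be sketched like this. The operation being timed, summing a large array, is an arbitrary stand-in.

```python
import time
import numpy as np

big = np.random.random((1000, 1000))   # made-up workload

t1 = time.time()            # snapshot before
result = big.sum()          # the operation being timed
t2 = time.time()            # snapshot after

elapsed = t2 - t1
print("The time taken is", elapsed, "seconds")
```

A single measurement like this is noisy for very fast operations; averaging over many runs gives a steadier number.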

Time: 00:00:56

Documentation:

time.time: Time in seconds, as a floating-point number

14 - How fast is NumPy

This module demos how fast NumPy can perform certain operations. These notes skip it; all you need to know is that NumPy is fast.

Documentation:

time.time: Current time in seconds (float value)

timeit: Average execution time measurement

profile: Code profiling

IPython “magics”:

%time: How long does it take to run once

%timeit: Averaged over multiple runs

%prun/%lprun: Per-function/line profiling

15 - Accessing array elements

a[3,2] accesses the element at row 3, column 2.

a[0:2, 0:2] accesses the elements in rows 0 and 1, columns 0 and 1.

a[:, 0:3:2] uses a slice of the form n:m:t, which gives the values from n up to, but not including, m in steps of size t. This statement therefore returns the values of column 0, skips column 1, and returns the values of column 2, for every row.
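The three access patterns can be checked on a small array. This is a sketch; the 5x4 array of consecutive integers is made up so that each result is easy to verify.

```python
import numpy as np

# Made-up 5x4 array holding 0..19, filled row by row.
a = np.arange(20).reshape(5, 4)

element = a[3, 2]         # row 3, column 2
corner = a[0:2, 0:2]      # rows 0-1, columns 0-1
stepped = a[:, 0:3:2]     # all rows, columns 0 and 2 (column 1 is skipped)

print(stepped)
```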

NumPy Reference: Indexing

Note: Indexing starts at 0 (zero).

Time: 00:02:29

16 - Modifying array elements

a[0, 0] = 1 accesses the element at position (0, 0) in a and uses the assignment operator = to assign the value 1 to it.

a[0, :] = 2 assigns the value 2 to the entire first row.

a[:, 3] = [1, 2, 3, 4, 5] assigns a list of values to a column (here column 3); the same works for a row. The length of the list must match the number of elements being replaced.
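Applying the three assignments in order to an array of zeros shows the cumulative effect. The starting array is an assumption made for this sketch.

```python
import numpy as np

a = np.zeros((5, 4), dtype=int)   # made-up starting array

a[0, 0] = 1                  # single element
a[0, :] = 2                  # entire first row (overwrites the 1 above)
a[:, 3] = [1, 2, 3, 4, 5]    # entire column 3, one value per row

print(a)
```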

Time: 00:01:32

17 - Indexing an array with another array

A NumPy array can be indexed with other arrays.

Given an index array holding 1, 1, 2, 3, indexing a with it returns the values of a at indices 1, 1, 2, and 3. The returned array has the same length as the index array.
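A minimal sketch of integer array indexing with the indices 1, 1, 2, 3. The contents of a are made up for illustration.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])    # made-up values
indices = np.array([1, 1, 2, 3])

picked = a[indices]    # values of a at positions 1, 1, 2, 3

print(picked)
```

An index may appear more than once, in which case the same element is returned more than once.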

NumPy Reference: Indexing

Integer array indexing: Select array elements with another array

Time: 00:01:33

18 - Boolean or mask index arrays

a[a < mean]: each value in array a is compared with the mean, and only the values that are less than the mean are kept.

a[a < mean] = mean replaces those values with the mean value.
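Both uses of the mask can be sketched as follows. The array values are made up, and the array is created as float on purpose: assigning a fractional mean into an integer array would silently truncate it.

```python
import numpy as np

# Made-up data, stored as floats so the mean can be assigned without truncation.
a = np.array([[20, 25, 10, 23, 26],
              [0, 2, 50, 20, 0]], dtype=float)

mean = a.mean()             # 17.6

below = a[a < mean]         # copy of the values less than the mean
a[a < mean] = mean          # replace those values in place

print(below)
print(a)
```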

NumPy Reference: Indexing

Integer array indexing

Boolean array indexing

Time: 00:01:47

19 - Arithmetic operations

Arithmetic operations on arrays are always applied element-wise.

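2 * a is element-wise multiplication: every element of a is multiplied by 2.

a / 2.0: in Python 2, if both the array and the divisor are integers, the output is also integers, so use 2.0 instead of 2 as the divisor to get float values. In Python 3, the / operator always performs true division and returns floats; use // for integer (floor) division.

Important note: a and b must have the same shape (or shapes that NumPy can broadcast together) before an operation such as a + b; otherwise NumPy raises an error.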

a + b

a * b

a / b

As seen before, in Python 2, since arrays a and b are both integers, a / b gives an integer array as well; convert one of the arrays to float to get float results. In Python 3, a / b already returns floats.
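The operations in this section can be sketched with two small integer arrays, made up for this example. The comments describe Python 3 behavior.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([[10, 20, 30],
              [40, 50, 60]])

doubled = 2 * a        # element-wise multiplication by a scalar
halved = a / 2.0       # float result
added = a + b          # element-wise sum
product = a * b        # element-wise product, not matrix multiplication
ratio = a / b          # true division: floats in Python 3
floored = a // b       # floor division: stays integer

print(ratio)
```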

20 - Learning more NumPy

Resources from NumPy User Guide and Reference:

The N-dimensional array

Data types

Array creation [more]

Indexing [more]

Broadcasting

Random sampling

Mathematical functions

Linear algebra

Time: 00:00:16

Total Time: 00:35:59

First Draft 2019-01-10

ML4T Notes 01-03: The Power of NumPy
