Cosc 1201 Final
and, or vs. &, | (2.6)
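& and | are bitwise operators; on NumPy arrays they work element-wise, combining the arrays item by item. and and or evaluate the truth of each operand as a single whole object and return one result.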
int16
16-bit signed integer (-32_768 to 32_767).
uint16
16-bit unsigned integer (0 to 65_535).
float16
16-bit-precision floating-point number type: sign bit, 5 bits exponent, 10 bits mantissa.
int32
32-bit signed integer (-2_147_483_648 to 2_147_483_647).
uint32
32-bit unsigned integer (0 to 4_294_967_295).
float32
32-bit-precision floating-point number type: sign bit, 8 bits exponent, 23 bits mantissa.
float64
64-bit precision floating-point number type: sign bit, 11 bits exponent, 52 bits mantissa.
int64
64-bit signed integer (-9_223_372_036_854_775_808 to 9_223_372_036_854_775_807).
uint64
64-bit unsigned integer (0 to 18_446_744_073_709_551_615).
int8
8-bit signed integer (-128 to 127).
uint8
8-bit unsigned integer (0 to 255).
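The ranges and bit layouts listed on these cards can be checked directly. The sketch below assumes NumPy is installed and uses np.iinfo and np.finfo; the variable names are only illustrative.

```python
import numpy as np

# np.iinfo reports the representable range of an integer dtype.
for dtype in (np.int8, np.uint8, np.int16, np.uint16):
    info = np.iinfo(dtype)
    print(dtype.__name__, info.min, info.max)

# np.finfo gives the corresponding details for floating-point dtypes.
float32_bits = np.finfo(np.float32).bits        # total width in bits
float32_mantissa = np.finfo(np.float32).nmant   # number of mantissa bits
```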
inheritance
Inheritance (an IS-A relationship) is a mechanism that allows a new class (subclass or derived class) to inherit attributes and methods from an existing class (base class or superclass), which is named in parentheses after the subclass name. The subclass can then extend or override the inherited functionality.
Syntax:
    class BaseClass:
        pass  # Base class definition

    class SubClass(BaseClass):
        pass  # Subclass inherits from BaseClass; additional attributes and methods can be defined here
Example:
    class Animal:
        def __init__(self, name):
            self.name = name

        def speak(self):
            pass  # Placeholder for the speak method

    class Dog(Animal):
        def speak(self):
            return "Woof!"

    class Cat(Animal):
        def speak(self):
            return "Meow!"

    # Instances of subclasses
    dog = Dog(name="Buddy")
    cat = Cat(name="Whiskers")
    print(dog.speak())  # Outputs: Woof!
    print(cat.speak())  # Outputs: Meow!
In this example, Dog and Cat are subclasses of the Animal class. They inherit __init__ (and therefore the name attribute) from Animal and provide their own implementation of the speak method.
numpy.newaxis
A convenient alias for None, useful for indexing arrays. Examples >>> newaxis is None True >>> x = np.arange(3) >>> x array([0, 1, 2]) >>> x[:, newaxis] array([[0], [1], [2]]) >>> x[:, newaxis, newaxis] array([[[0]], [[1]], [[2]]]) >>> x[:, newaxis] * x array([[0, 0, 0], [0, 1, 2], [0, 2, 4]])
Masks
A more powerful pattern is to use Boolean arrays as masks, to select particular subsets of the data themselves. x Out[26]: array([[5, 0, 3, 3], [7, 9, 3, 5], [2, 4, 7, 6]]) We can obtain a Boolean array for this condition easily, as we've already seen: In [27]: x < 5 Out[27]: array([[False, True, True, True], [False, False, True, False], [ True, True, False, False]], dtype=bool) Now to select these values from the array, we can simply index on this Boolean array; this is known as a masking operation: In [28]: x[x < 5] Out[28]: array([0, 3, 3, 3, 2, 4]) What is returned is a one-dimensional array filled with all the values that meet this condition; in other words, all the values in positions at which the mask array is True.
np.ufunc.reduce
A reduce repeatedly applies a given operation to the elements of an array until only a single result remains. For example, calling reduce on the add ufunc returns the sum of all elements in the array: In [26]: x = np.arange(1, 6) np.add.reduce(x) Out[26]: 15 Similarly, calling reduce on the multiply ufunc results in the product of all array elements: In [27]: np.multiply.reduce(x) Out[27]: 120
intp
Alias for the signed integer type (one of numpy.byte, numpy.short, numpy.intc, numpy.int_ and numpy.longlong) that is the same size as a pointer. Compatible with the C intptr_t.
Combined Indexing
And we can combine fancy indexing with masking: In [12]: mask = np.array([1, 0, 1, 0], dtype=bool) X[row[:, np.newaxis], mask] Out[12]: array([[ 0, 2], [ 4, 6], [ 8, 10]]) All of these indexing options combined lead to a very flexible set of operations for accessing and modifying array values.
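The card above uses X and row without defining them. A self-contained version, assuming the same X and row as on the fancy indexing card, might look like this:

```python
import numpy as np

X = np.arange(12).reshape((3, 4))      # 3 rows, 4 columns holding 0..11
row = np.array([0, 1, 2])              # row indices (fancy indexing)
mask = np.array([1, 0, 1, 0], dtype=bool)  # keep columns 0 and 2 (masking)

# The column vector of row indices is broadcast against the selected columns.
result = X[row[:, np.newaxis], mask]
print(result)
```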
zeros(shape[, dtype])
Returns a new array of zeros with the given shape and dtype. shape : int or sequence of ints. Shape of the new array, e.g., (2, 3) or 2. dtype : data-type, optional. The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64. >>> np.zeros((5,), dtype=int) array([0, 0, 0, 0, 0])
Summing the Values in an Array
As a quick example, consider computing the sum of all values in an array. Python itself can do this using the built-in sum function: In [1]: import numpy as np In [2]: L = np.random.random(100) sum(L) Out[2]: 55.61209116604941 The syntax is quite similar to that of NumPy's sum function, and the result is the same in the simplest case: In [3]: np.sum(L) Out[3]: 55.612091166049424 However, because it executes the operation in compiled code, NumPy's version of the operation is computed much more quickly: In [4]: big_array = np.random.rand(1000000) %timeit sum(big_array) %timeit np.sum(big_array) 10 loops, best of 3: 104 ms per loop 1000 loops, best of 3: 442 µs per loop
np.log2(a)
Base-2 logarithm of x, computed element-wise. >>> x = np.array([0, 1, 2, 2**4]) >>> np.log2(x) array([-inf, 0., 1., 4.])
bool_
Boolean type (True or False), stored as a byte.
methods
Class instances can also have methods (defined by their class) for modifying their state.
Instance methods: Methods in Python are typically defined within a class and are associated with instances of that class. They are called instance methods because they operate on an instance of the class.
    class Dog:
        def __init__(self, name, age):
            self.name = name
            self.age = age

        def bark(self):
            print(f"{self.name} says Woof!")
In this example, bark is an instance method of the Dog class.
Accessing methods: Methods are called using dot notation (object.method()). For example:
    my_dog = Dog(name="Buddy", age=3)
    my_dog.bark()  # Calling the 'bark' method
This prints the bark message for the my_dog object: Buddy says Woof!
Self parameter: The first parameter of an instance method is, by convention, named self. It is a reference to the instance of the class on which the method is called, and it allows the method to access and modify the attributes of the object.
    def bark(self):
        print(f"{self.name} says Woof!")
Here, self.name refers to the name attribute of the object.
Modifying attributes: Methods can modify the attributes of an object. For example:
    class Dog:
        def __init__(self, name, age):
            self.name = name
            self.age = age

        def celebrate_birthday(self):
            self.age += 1
            print(f"{self.name} is now {self.age} years old!")
Calling celebrate_birthday on a Dog object increments its age attribute.
np.copy()
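Returns an explicit, independent copy of an array's data instead of a view. This is most easily done with the copy() method, e.g. x2[:2, :2].copy(); modifying the copy leaves the original array untouched.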
Classes
Classes provide a means of bundling data and functionality together. They allow you to create user-defined data types, which can have attributes (variables) and methods (functions). Here are some key concepts related to classes in Python: class Dog: def __init__(self, name, age): self.name = name self.age = age def bark(self): print(f"{self.name} says Woof!") # Creating an instance of the Dog class my_dog = Dog(name="Buddy", age=3) # Accessing attributes print(f"{my_dog.name} is {my_dog.age} years old.") # Calling a method my_dog.bark() In this example, Dog is a class with attributes (name and age) and a method (bark). my_dog is an instance of the Dog class, and you can access its attributes and call its methods.
complex128
Complex number type composed of two double-precision (64-bit) floating-point numbers, compatible with Python complex.
complex_
Alias for complex128: a complex number type composed of two double-precision (64-bit) floating-point numbers, compatible with Python complex.
complex64
Complex number type composed of two single-precision (32-bit) floating-point numbers.
Ufuncs vectorization
Computations using vectorization through ufuncs are nearly always more efficient than their counterparts implemented using Python loops, especially as the arrays grow in size. In [5]: np.arange(5) / np.arange(1, 6) Out[5]: array([ 0. , 0.5 , 0.66666667, 0.75 , 0.8 ])
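To make the loop-versus-ufunc comparison concrete, here is a minimal sketch. The helper reciprocal_loop is a hypothetical name introduced only for illustration; both versions compute the same values, and only the vectorized one runs in compiled code.

```python
import numpy as np

def reciprocal_loop(values):
    """Compute 1/x one element at a time in a Python loop."""
    out = np.empty(len(values))
    for i in range(len(values)):
        out[i] = 1.0 / values[i]
    return out

values = np.arange(1, 6)
loop_result = reciprocal_loop(values)   # explicit Python loop
vectorized_result = 1.0 / values        # one ufunc call over the whole array
```

On small arrays the difference is negligible; the gap tends to grow with array size.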
array slicing (2.2) Creating copies of arrays
Despite the nice features of array views, it is sometimes useful to instead explicitly copy the data within an array or a subarray. This can be most easily done with the copy() method: In [35]: x2_sub_copy = x2[:2, :2].copy() print(x2_sub_copy) [[99 5] [ 7 6]] If we now modify this subarray, the original array is not touched: In [36]: x2_sub_copy[0, 0] = 42 print(x2_sub_copy) [[42 5] [ 7 6]] In [37]: print(x2) [[99 5 2 4] [ 7 6 8 8] [ 1 6 7 7]]
float_
Double-precision (64-bit) floating-point number type, compatible with Python float and C double: sign bit, 11 bits exponent, 52 bits mantissa.
random.normal([mean, std, size])
Draw random samples from a normal (Gaussian) distribution. In NumPy's own signature the mean is called loc and the standard deviation is called scale.
mean (loc) : float or array_like of floats. Mean ("centre") of the distribution.
std (scale) : float or array_like of floats. Standard deviation (spread or "width") of the distribution. Must be non-negative.
size : int or tuple of ints, optional. Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.
>>> np.random.normal(3, 2.5, size=(2, 4))
array([[-4.49401501, 4.00950034, -1.81814867, 7.29718677],   # random
       [ 0.39924804, 4.68456316, 4.99394529, 4.84057254]])   # random
numpy.e
Euler's constant, base of natural logarithms, Napier's constant. e = 2.71828182845904523536028747135266249775724709369995...
np.all
Evaluate whether all elements are true import numpy as np arr = np.array([[True, True, True], [False, True, True], [True, True, True]]) # Check if all elements are True result = np.all(arr) print(result) # Output: False
np.any
Evaluate whether any elements are true arr = np.array([[False, False, False], [False, True, False], [False, False, False]]) # Check if any element is True result = np.any(arr) print(result) # Output: True
fancy indexing (2.7)
Fancy indexing is like the simple indexing we've already seen, but we pass arrays of indices in place of single scalars. This allows us to very quickly access and modify complicated subsets of an array's values. When using fancy indexing, the shape of the result reflects the shape of the index arrays rather than the shape of the array being indexed: In [4]: ind = np.array([[3, 7], [4, 5]]) x[ind] Out[4]: array([[71, 86], [60, 20]]) Fancy indexing also works in multiple dimensions. Consider the following array: In [5]: X = np.arange(12).reshape((3, 4)) X Out[5]: array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]) Like with standard indexing, the first index refers to the row, and the second to the column: In [6]: row = np.array([0, 1, 2]) col = np.array([2, 1, 3]) X[row, col] Out[6]: array([ 2, 5, 11]) The pairing of indices in fancy indexing follows all the broadcasting rules that were mentioned in Computation on Arrays: Broadcasting. So, for example, if we combine a column vector and a row vector within the indices, we get a two-dimensional result: In [7]: X[row[:, np.newaxis], col] Out[7]: array([[ 2, 1, 3], [ 6, 5, 7], [10, 9, 11]]) Here, each row value is matched with each column vector, exactly as we saw in broadcasting of arithmetic operations. For example: In [8]: row[:, np.newaxis] * col Out[8]: array([[0, 0, 0], [2, 1, 3], [4, 2, 6]]) It is always important to remember with fancy indexing that the return value reflects the broadcasted shape of the indices, rather than the shape of the array being indexed.
np.ufunc.outer
Finally, any ufunc can compute the output of all pairs of two different inputs using the outer method. This allows you, in one line, to do things like create a multiplication table: x = np.arange(1, 6) np.multiply.outer(x, x) Out[30]: array([[ 1, 2, 3, 4, 5], [ 2, 4, 6, 8, 10], [ 3, 6, 9, 12, 15], [ 4, 8, 12, 16, 20], [ 5, 10, 15, 20, 25]])
x[start:stop:step]
If any of these are unspecified, they default to the values start=0, stop=size of dimension, step=1. We'll take a look at accessing sub-arrays in one dimension and in multiple dimensions.
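A few one-dimensional slices illustrate how the defaults fill in. This is a small sketch on an assumed example array; the variable names are illustrative.

```python
import numpy as np

x = np.arange(10)          # array([0, 1, ..., 9])

first_five = x[:5]         # start defaults to 0
from_five = x[5:]          # stop defaults to the size of the dimension
middle = x[4:7]            # step defaults to 1
every_other = x[::2]       # only step given
reversed_x = x[::-1]       # a negative step walks the array backwards
```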
np.ufunc.accumulate(x)
If we'd like to store all the intermediate results of the computation, we can instead use accumulate: In [28]: np.add.accumulate(x) Out[28]: array([ 1, 3, 6, 10, 15]) In [29]: np.multiply.accumulate(x) Out[29]: array([ 1, 2, 6, 24, 120])
implicit vs. explicit typecasting (2.1)
Implicit typecasting: automatic type conversion by the interpreter. Example: num_int + num_float converts num_int from int to float before adding. Explicit typecasting: manual type conversion by the programmer. Example: int(num_str) explicitly converts "10" from str to int. Implicit conversions in Python arithmetic: bool to int, int to float, and int/float to complex. In a Boolean context (such as an if condition), any value is tested for truth without an explicit conversion. No implicit conversion: a str is never converted automatically to a number, list, or tuple, and dict and NoneType values are never converted implicitly either; these need an explicit call such as int(), float(), list(), or tuple(). NumPy data types: when constructing an array, the dtype can be specified using a string: np.zeros(10, dtype='int16') Or using the associated NumPy object: np.zeros(10, dtype=np.int16)
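The two kinds of conversion can be seen side by side in a short sketch. The variable names follow the card's num_int, num_float and num_str; the NumPy lines assume NumPy is installed.

```python
import numpy as np

num_int = 3
num_float = 2.5
total = num_int + num_float        # implicit: int is promoted to float

num_str = "10"
converted = int(num_str)           # explicit: str converted to int by the programmer

# NumPy arrays have one fixed dtype, so mixed input is upcast on creation,
# and astype() performs an explicit conversion.
mixed = np.array([1, 2.5])                        # ints upcast to float64
as_int16 = np.array([1, 2, 3]).astype('int16')    # explicit dtype change
```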
constructors
In Python, a constructor is a special method that is automatically called when an object of a class is created. Its purpose is to initialize the attributes (variables) of the object. The constructor method is named __init__. Here's a basic example:
    class Dog:
        def __init__(self, name, age):
            self.name = name
            self.age = age

    # Creating an instance of the Dog class and calling the constructor
    my_dog = Dog(name="Buddy", age=3)
In this example: The Dog class has a constructor method __init__. The constructor takes three parameters: self, name, and age. Inside the constructor, self.name and self.age are instance attributes that are initialized with the values passed as arguments (name and age). When you create an instance of the Dog class (my_dog in this case), the __init__ method is automatically called, and it initializes the attributes of the object with the specified values.
array indexing (2.2)
In a multi-dimensional array, items can be accessed using a comma-separated tuple of indices: In [10]: x2 Out[10]: array([[3, 5, 2, 4], [7, 6, 8, 8], [1, 6, 7, 7]]) In [12]: x2[2, 0] Out[12]: 1 In [13]: x2[2, -1] Out[13]: 7 values can also be modified using any of the above index notation: In [14]: x2[0, 0] = 12 x2 Out[14]: array([[12, 5, 2, 4], [ 7, 6, 8, 8], [ 1, 6, 7, 7]]) Keep in mind that, unlike Python lists, NumPy arrays have a fixed type. This means, for example, that if you attempt to insert a floating-point value to an integer array, the value will be silently truncated. Don't be caught unaware by this behavior! In [15]: x1[0] = 3.14159 # this will be truncated! x1 Out[15]: array([3, 0, 3, 3, 7, 9])
centering an array
In the previous section, we saw that ufuncs allow a NumPy user to remove the need to explicitly write slow Python loops. Broadcasting extends this ability. One commonly seen example is when centering an array of data. Imagine you have an array of 10 observations, each of which consists of 3 values. Using the standard convention (see Data Representation in Scikit-Learn), we'll store this in a $10 \times 3$ array: In [17]: X = np.random.random((10, 3)) We can compute the mean of each feature using the mean aggregate across the first dimension: In [18]: Xmean = X.mean(0) Xmean Out[18]: array([ 0.53514715, 0.66567217, 0.44385899]) And now we can center the X array by subtracting the mean (this is a broadcasting operation): In [19]: X_centered = X - Xmean To double-check that we've done this correctly, we can check that the centered array has near zero mean: In [20]: X_centered.mean(0) Out[20]: array([ 2.22044605e-17, -7.77156117e-17, -1.66533454e-17]) To within machine precision, the mean is now zero.
comparison operations and ufuncs (2.6)
It is also possible to do an element-wise comparison of two arrays, and to include compound expressions: In [11]: (2 * x) == (x ** 2) Out[11]: array([False, True, False, False, False], dtype=bool) Just as in the case of arithmetic ufuncs, these will work on arrays of any size and shape. Here is a two-dimensional example: In [12]: rng = np.random.RandomState(0) x = rng.randint(10, size=(3, 4)) x Out[12]: array([[5, 0, 3, 3], [7, 9, 3, 5], [2, 4, 7, 6]]) In [13]: x < 6 Out[13]: array([[ True, True, True, True], [False, False, True, True], [ True, True, False, False]], dtype=bool)
Pandas Dataframes (3.1)
Like the Series object, the DataFrame has an index attribute that gives access to the index labels: In [20]: states.index Out[20]: Index(['California', 'Florida', 'Illinois', 'New York', 'Texas'], dtype='object') Additionally, the DataFrame has a columns attribute, which is an Index object holding the column labels: In [21]: states.columns Out[21]: Index(['area', 'population'], dtype='object') Thus the DataFrame can be thought of as a generalization of a two-dimensional NumPy array, where both the rows and columns have a generalized index for accessing the data.
Binary Search Big O
O(lg n)
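The O(lg n) bound comes from halving the search range on every step, which only works on sorted input. A minimal sketch of an iterative binary search (the function name and sample data are illustrative):

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1      # target can only be in the upper half
        else:
            high = mid - 1     # target can only be in the lower half
    return -1

data = [1, 3, 5, 7, 9, 11]
found_at = binary_search(data, 7)
missing = binary_search(data, 4)
```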
Merge Sort
O(n lg n)
Quick Sort
O(n lg n) on average; O(n^2) in the worst case
Linear Search
O(n)
Insertion Sort
O(n^2)
Selection Sort Big O
O(n^2)
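The O(n^2) cost of selection sort comes from the nested loops: for each position, the rest of the list is scanned for the smallest remaining item. A minimal sketch (the function name is illustrative; it returns a sorted copy rather than sorting in place):

```python
def selection_sort(items):
    """Return a new sorted list using selection sort."""
    result = list(items)
    n = len(result)
    for i in range(n - 1):
        smallest = i
        # Scan the unsorted tail for the smallest remaining element.
        for j in range(i + 1, n):
            if result[j] < result[smallest]:
                smallest = j
        # Move it into position i.
        result[i], result[smallest] = result[smallest], result[i]
    return result

sorted_values = selection_sort([5, 2, 9, 1, 5, 6])
```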
Multi dimensional aggregates
One common type of aggregation operation is an aggregate along a row or column. Say you have some data stored in a two-dimensional array: In [9]: M = np.random.random((3, 4)) print(M) [[ 0.8967576 0.03783739 0.75952519 0.06682827] [ 0.8354065 0.99196818 0.19544769 0.43447084] [ 0.66859307 0.15038721 0.37911423 0.6687194 ]] By default, each NumPy aggregation function will return the aggregate over the entire array: In [10]: M.sum() Out[10]: 6.0850555667307118 Aggregation functions take an additional argument specifying the axis along which the aggregate is computed. For example, we can find the minimum value within each column by specifying axis=0: In [11]: M.min(axis=0) Out[11]: array([ 0.66859307, 0.03783739, 0.19544769, 0.06682827]) The function returns four values, corresponding to the four columns of numbers. Similarly, we can find the maximum value within each row: In [12]: M.max(axis=1) Out[12]: array([ 0.8967576 , 0.99196818, 0.6687194 ]) The way the axis is specified here can be confusing to users coming from other languages. The axis keyword specifies the dimension of the array that will be collapsed, rather than the dimension that will be returned. So specifying axis=0 means that the first axis will be collapsed: for two-dimensional arrays, this means that values within each column will be aggregated.
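Because the card's example uses random data, a deterministic array may make the axis behaviour easier to verify. A small sketch with assumed values:

```python
import numpy as np

M = np.arange(12).reshape((3, 4))   # rows: [0..3], [4..7], [8..11]

total = M.sum()            # aggregate over the entire array
col_min = M.min(axis=0)    # collapse the rows: one value per column
row_max = M.max(axis=1)    # collapse the columns: one value per row
```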
array slicing (2.2) subarrays as no-copy views
One important-and extremely useful-thing to know about array slices is that they return views rather than copies of the array data. This is one area in which NumPy array slicing differs from Python list slicing: in lists, slices will be copies. Consider our two-dimensional array from before: print(x2) [[12 5 2 4] [ 7 6 8 8] [ 1 6 7 7]] Let's extract a $2 \times 2$ subarray from this: In [32]: x2_sub = x2[:2, :2] print(x2_sub) [[12 5] [ 7 6]] Now if we modify this subarray, we'll see that the original array is changed! Observe: In [33]: x2_sub[0, 0] = 99 print(x2_sub) [[99 5] [ 7 6]] In [34]: print(x2) [[99 5 2 4] [ 7 6 8 8] [ 1 6 7 7]] This default behavior is actually quite useful: it means that when we work with large datasets, we can access and process pieces of these datasets without the need to copy the underlying data buffer.
np.concatenate((a1, a2, ...), axis=0, out=None, dtype=None, casting="same_kind")
Join a sequence of arrays along an existing axis.
Parameters:
a1, a2, ... : sequence of array_like. The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).
axis : int, optional. The axis along which the arrays will be joined. If axis is None, arrays are flattened before use. Default is 0.
out : ndarray, optional. If provided, the destination to place the result. The shape must be correct, matching that of what concatenate would have returned if no out argument were specified.
dtype : str or dtype. If provided, the destination array will have this dtype. Cannot be provided together with out. New in version 1.20.0.
casting : {'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional. Controls what kind of data casting may occur. Defaults to 'same_kind'.
Examples:
>>> x = np.array([1, 2, 3])
>>> y = np.array([3, 2, 1])
>>> np.concatenate([x, y])
array([1, 2, 3, 3, 2, 1])
>>> grid = np.array([[1, 2, 3], [4, 5, 6]])
>>> np.concatenate([grid, grid], axis=1)  # concatenate along the second axis (zero-indexed)
array([[1, 2, 3, 1, 2, 3],
       [4, 5, 6, 4, 5, 6]])
>>> a = np.array([[1, 2], [3, 4]])
>>> b = np.array([[5, 6]])
>>> np.concatenate((a, b), axis=0)
array([[1, 2],
       [3, 4],
       [5, 6]])
np.reshape(newshape)
Gives a new shape to an array without changing its data.
Parameters:
newshape : int or tuple of ints. The new shape should be compatible with the original shape. If an integer, then the result will be a 1-D array of that length. One shape dimension can be -1. In this case, the value is inferred from the length of the array and the remaining dimensions.
Examples:
>>> a = np.arange(6).reshape((3, 2))
>>> a
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> x = np.array([1, 2, 3])
>>> x.reshape((3, 1))  # column vector via reshape
array([[1],
       [2],
       [3]])
random.seed([seed])
Reseed the singleton RandomState instance. >>> numpy.random.seed(0) ; numpy.random.rand(4) array([ 0.55, 0.72, 0.6 , 0.54]) >>> numpy.random.seed(0) ; numpy.random.rand(4) array([ 0.55, 0.72, 0.6 , 0.54]) With the seed reset (every time), the same set of numbers will appear every time. If the random seed is not reset, different numbers appear with every invocation: >>> numpy.random.rand(4) array([ 0.42, 0.65, 0.44, 0.89]) >>> numpy.random.rand(4) array([ 0.96, 0.38, 0.79, 0.53])
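The reproducibility described above can be checked programmatically. This sketch uses the same legacy np.random.seed and np.random.rand calls as the card; the variable names are illustrative.

```python
import numpy as np

np.random.seed(0)
first = np.random.rand(4)

np.random.seed(0)              # reseeding restarts the same sequence
second = np.random.rand(4)

third = np.random.rand(4)      # no reseed: the sequence simply continues
```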
random.randint(low[, high, size, dtype])
Return random integers from low (inclusive) to high (exclusive).
low : int or array-like of ints. Lowest (signed) integers to be drawn from the distribution (unless high=None, in which case this parameter is one above the highest such integer).
high : int or array-like of ints, optional. If provided, one above the largest (signed) integer to be drawn from the distribution (see above for behavior if high=None). If array-like, must contain integer values.
size : int or tuple of ints, optional. Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.
dtype : dtype, optional. Desired dtype of the result. Byteorder must be native. The default value is int.
>>> np.random.randint([1, 3, 5, 7], [[10], [20]], dtype=np.uint8)
array([[ 8,  6,  9,  7],   # random
       [ 1, 16,  9, 12]], dtype=uint8)
broadcasting (2.5)
Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side. Rule 2: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape. Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised. Let's take a look at an example where both arrays need to be broadcast: In [10]: a = np.arange(3).reshape((3, 1)) b = np.arange(3) Again, we'll start by writing out the shape of the arrays: a.shape = (3, 1) b.shape = (3,) Rule 1 says we must pad the shape of b with ones: a.shape -> (3, 1) b.shape -> (1, 3) And rule 2 tells us that we upgrade each of these ones to match the corresponding size of the other array: a.shape -> (3, 3) b.shape -> (3, 3) Because the result matches, these shapes are compatible. We can see this here: In [11]: a + b Out[11]: array([[0, 1, 2], [1, 2, 3], [2, 3, 4]])
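Rule 3 is easiest to see with a pair of shapes that do not line up. In this sketch (arrays chosen for illustration), a (3, 2) array and a (3,) array are incompatible, because padding gives (1, 3) and the trailing sizes 2 and 3 disagree; reshaping the second array to (3, 1) makes them compatible.

```python
import numpy as np

M = np.ones((3, 2))
a = np.arange(3)

try:
    M + a                      # shapes (3, 2) and (3,) -> (1, 3): 2 vs 3 conflict
    compatible = True
except ValueError:
    compatible = False         # rule 3: an error is raised

# Adding a new axis on the right gives a shape of (3, 1),
# which rule 2 can stretch to (3, 2).
fixed = M + a[:, np.newaxis]
```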
int_
Signed integer type, compatible with Python int and C long.
np.split(ary, indices_or_sections, axis=0)
Split an array into multiple sub-arrays as views into ary.
Parameters:
ary : ndarray. Array to be divided into sub-arrays.
indices_or_sections : int or 1-D array. If indices_or_sections is an integer, N, the array will be divided into N equal arrays along axis. If such a split is not possible, an error is raised. If indices_or_sections is a 1-D array of sorted integers, the entries indicate where along axis the array is split. For example, [2, 3] would, for axis=0, result in ary[:2], ary[2:3], ary[3:]. If an index exceeds the dimension of the array along axis, an empty sub-array is returned correspondingly.
axis : int, optional. The axis along which to split, default is 0.
Examples:
>>> x = np.arange(9.0)
>>> np.split(x, 3)
[array([0., 1., 2.]), array([3., 4., 5.]), array([6., 7., 8.])]
>>> x = [1, 2, 3, 99, 99, 3, 2, 1]
>>> x1, x2, x3 = np.split(x, [3, 5])
>>> print(x1, x2, x3)
[1 2 3] [99 99] [3 2 1]
Python list vs. NumPy array
The advantage of the Python dynamic-type list is flexibility: because each list element is a full structure containing both data and type information, the list can be filled with data of any desired type. Fixed-type NumPy-style arrays lack this flexibility, but are much more efficient for storing and manipulating data.
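A short sketch of the trade-off: a list can hold items of different types, while an array stores a single dtype and upcasts mixed input to fit it. The sample values are only illustrative.

```python
import numpy as np

# A list keeps each element as its own fully typed Python object.
mixed_list = [True, "2", 3.0, 4]
types = [type(item).__name__ for item in mixed_list]

# An array has one dtype for all elements; ints here are upcast to float64.
arr = np.array([1, 2, 3.5])
arr_dtype = arr.dtype
```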
and, or vs. &, | (2.6)
The difference is this: and and or gauge the truth or falsehood of an entire object, while & and | refer to bits within each object. When you use and or or, it's equivalent to asking Python to treat the object as a single Boolean entity. In Python, all nonzero integers will evaluate as True. When you have an array of Boolean values in NumPy, this can be thought of as a string of bits where 1 = True and 0 = False, and the result of & and | operates similarly to above:
In [37]: A = np.array([1, 0, 1, 0, 1, 0], dtype=bool)
         B = np.array([1, 1, 1, 0, 1, 1], dtype=bool)
         A | B
Out[37]: array([ True, True, True, False, True, True], dtype=bool)
Using or on these arrays will try to evaluate the truth or falsehood of the entire array object, which is not a well-defined value, so A or B raises ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). Similarly, when doing a Boolean expression on a given array, you should use | or & rather than or or and:
In [39]: x = np.arange(10)
         (x > 4) & (x < 8)
Out[39]: array([False, False, False, False, False, True, True, True, False, False], dtype=bool)
Trying to evaluate the truth or falsehood of the entire array will give the same ValueError:
In [40]: (x > 4) and (x < 8)
So remember this: and and or perform a single Boolean evaluation on an entire object, while & and | perform multiple Boolean evaluations on the content (the individual bits or bytes) of an object. For Boolean NumPy arrays, the latter is nearly always the desired operation.
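The contrast can be reproduced in a few lines. This sketch catches the ValueError so that it runs to completion; the flag name raised is illustrative.

```python
import numpy as np

x = np.arange(10)

# Element-wise: one Boolean per item, usable as a mask.
mask = (x > 4) & (x < 8)
selected = x[mask]

# Whole-object: Python asks for the truth value of an entire array.
try:
    (x > 4) and (x < 8)
    raised = False
except ValueError:
    raised = True     # the truth value of a multi-element array is ambiguous
```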
eye(N[, M, k, dtype])
The eye() function is used to create a 2-D array with ones on the diagonal and zeros elsewhere. It is commonly used in linear algebra and matrix operations, for example to generate identity matrices as a starting point for transforming, rotating, or scaling vectors.
N : Number of rows in the output.
M : Number of columns in the output. If None, defaults to N.
k : Index of the diagonal: 0 (the default) refers to the main diagonal, a positive value refers to an upper diagonal, and a negative value to a lower diagonal.
dtype : Data-type of the returned array.
>>> import numpy as np
>>> np.eye(2, dtype=int)
array([[1, 0],
       [0, 1]])
>>> np.eye(2, 2, dtype=int)
array([[1, 0],
       [0, 1]])
>>> np.eye(2, 2, dtype=float)
array([[ 1., 0.],
       [ 0., 1.]])
arange([start, ]stop[, step, dtype])
The numpy.arange() function is used to generate an array with evenly spaced values within a specified interval. The function returns a one-dimensional array of type numpy.ndarray.
start : Start of interval. The interval includes this value. The default start value is 0.
stop : End of interval. The interval does not include this value, except in some cases where step is not an integer and floating point round-off affects the length of out.
step : Spacing between values. For any output out, this is the distance between two adjacent values, out[i+1] - out[i]. The default step size is 1. If step is specified as a positional argument, start must also be given.
dtype : The type of the output array. If dtype is not given, infer the data type from the other input arguments.
>>> import numpy as np
>>> np.arange(5, 9)
array([5, 6, 7, 8])
>>> np.arange(5, 9, 3)
array([5, 8])
>>> np.arange(5.0)
array([ 0., 1., 2., 3., 4.])
array(object[, dtype])
The numpy.array() function is used to create an array. This function takes an iterable object as input and returns a new NumPy array with a specified data type (if provided) and shape. The array() function is useful when working with data that can be converted into an array, such as a list of numbers. It is often used to create a NumPy array from an existing Python list or tuple. object An array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence. dtype The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence. This argument can only be used to 'upcast' the array. For downcasting, use the .astype(t) method. >>> import numpy as np >>> np.array([2, 4, 6]) array([2, 4, 6]) >>> np.array([2, 4, 6.0]) array([ 2., 4., 6.]) >>> np.array([[2, 3], [4, 5]]) array([[2, 3], [4, 5]])
full(shape, fill_value[, dtype])
The numpy.full() function is used to create a new array of the specified shape and type, filled with a specified value.
shape : Shape of the new array, e.g., (2, 3) or 2.
fill_value : Fill value.
dtype : The desired data-type for the array. The default, None, means np.array(fill_value).dtype.
>>> import numpy as np
>>> np.full((3, 3), np.inf)
array([[ inf, inf, inf],
       [ inf, inf, inf],
       [ inf, inf, inf]])
>>> np.full((3, 3), 10.1)
array([[ 10.1, 10.1, 10.1],
       [ 10.1, 10.1, 10.1],
       [ 10.1, 10.1, 10.1]])
np.hstack()
The numpy.hstack() function is used to stack arrays in sequence horizontally (column wise). This is equivalent to concatenation along the second axis, except for 1-D arrays where it concatenates along the first axis. Rebuilds arrays divided by hsplit. This function is useful when we have to concatenate two arrays of different shapes along the second axis (column-wise), for example to combine two arrays of shape (n, m) and (n, l) to form an array of shape (n, m+l).
tup : The arrays must have the same shape along all but the second axis, except 1-D arrays which can be any length. Required.
>>> import numpy as np
>>> x = np.array((3, 5, 7))
>>> y = np.array((5, 7, 9))
>>> np.hstack((x, y))
array([3, 5, 7, 5, 7, 9])
linspace(start, stop[, num, dtype])
The numpy.linspace() function is used to create an array of evenly spaced numbers within a specified range. The range is defined by the start and end points of the sequence, and the number of evenly spaced points to be generated between them. start The starting value of the sequence. stop The end value of the sequence, unless endpoint is set to False. In that case, the sequence consists of all but the last of num + 1 evenly spaced samples, so that stop is excluded. Note that the step size changes when endpoint is False. num Number of samples to generate. Default is 50. Must be non-negative. dtype The type of the output array. If dtype is not given, infer the data type from the other input arguments. np.linspace(2.0, 3.0, num=5) array([2. , 2.25, 2.5 , 2.75, 3. ]) >>> np.linspace(2.0, 3.0, num=5, endpoint=False) array([2. , 2.2, 2.4, 2.6, 2.8]) >>> np.linspace(2.0, 3.0, num=5, retstep=True) (array([2. , 2.25, 2.5 , 2.75, 3. ]), 0.25)
ones(shape[, dtype])
The numpy.ones() function is used to create a new array of given shape and type, filled with ones. It is useful in situations where we need an array of ones with a specific shape and data type, for example in matrix operations or when initializing an array with default values.
shape : Shape of the new array, e.g., (2, 3) or 2.
dtype : The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64.
>>> import numpy as np
>>> np.ones(7)
array([ 1., 1., 1., 1., 1., 1., 1.])
>>> np.ones((2, 1))
array([[ 1.],
       [ 1.]])
>>> x = (2, 3)
>>> np.ones(x)
array([[ 1., 1., 1.],
       [ 1., 1., 1.]])
Pandas Series (3.1)
This explicit index definition gives the Series object additional capabilities. For example, the index need not be an integer, but can consist of values of any desired type. If we wish, we can use strings as an index: In [7]: data = pd.Series([0.25, 0.5, 0.75, 1.0], index=['a', 'b', 'c', 'd']) data Out[7]: a 0.25 b 0.50 c 0.75 d 1.00 dtype: float64 A Series can also be built from a dictionary: population_dict = {'California': 38332521, 'Texas': 26448193, 'New York': 19651127, 'Florida': 19552860, 'Illinois': 12882135} population = pd.Series(population_dict) population Out[11]: California 38332521 Florida 19552860 Illinois 12882135 New York 19651127 Texas 26448193 dtype: int64 In the pandas version that produced this output, the index is drawn from the sorted keys; pandas 0.23 and later (on Python 3.6+) keep the dictionary's insertion order instead. From here, typical dictionary-style item access can be performed: In [12]: population['California'] Out[12]: 38332521 Unlike a dictionary, though, the Series also supports array-style operations such as slicing (shown here with the sorted index above): In [13]: population['California':'Illinois'] Out[13]: California 38332521 Florida 19552860 Illinois 12882135 dtype: int64
np.log10(a)
Returns the base-10 logarithm of each element of the input array. Negative inputs produce nan. >>> np.log10([1e-15, -3.]) array([-15., nan])
intc
Signed integer type, compatible with C int. (The unsigned counterpart, compatible with C unsigned int, is uintc.)
Constructing Pandas Series objects
We've already seen a few ways of constructing a Pandas Series from scratch; all of them are some version of the following: >>> pd.Series(data, index=index) where index is an optional argument, and data can be one of many entities. For example, data can be a list or NumPy array, in which case index defaults to an integer sequence: In [14]: pd.Series([2, 4, 6]) Out[14]: 0 2 1 4 2 6 dtype: int64 data can be a scalar, which is repeated to fill the specified index: In [15]: pd.Series(5, index=[100, 200, 300]) Out[15]: 100 5 200 5 300 5 dtype: int64 data can be a dictionary, in which case index defaults to the dictionary keys (sorted in the pandas version that produced this output; pandas 0.23 and later on Python 3.6+ keep insertion order): In [16]: pd.Series({2:'a', 1:'b', 3:'c'}) Out[16]: 1 b 2 a 3 c dtype: object In each case, the index can be explicitly set if a different result is preferred: In [17]: pd.Series({2:'a', 1:'b', 3:'c'}, index=[3, 2]) Out[17]: 3 c 2 a dtype: object Notice that in this case, the Series is populated only with the explicitly identified keys.
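The three constructor forms above can be checked side by side. This is a small sketch; the variable names are made up for illustration, and it only inspects results that do not depend on dictionary ordering.

```python
import pandas as pd

from_list = pd.Series([2, 4, 6])                    # index defaults to 0, 1, 2
from_scalar = pd.Series(5, index=[100, 200, 300])   # scalar repeated for each label
from_dict = pd.Series({2: 'a', 1: 'b', 3: 'c'}, index=[3, 2])  # only keys 3 and 2 kept

print(list(from_list.index))   # [0, 1, 2]
print(list(from_scalar))       # [5, 5, 5]
print(list(from_dict.index))   # [3, 2]
print(list(from_dict))         # ['c', 'a']
```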
multidimensional arrays (2.1)
A 2-D array from nested lists: array_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]). A 3-D array: array_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]). A 3x5 floating-point array filled with ones: np.ones((3, 5), dtype=float)
np.hsplit()
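Splits an array into multiple sub-arrays horizontally (column-wise). ary (required): input array. indices_or_sections: either an integer number of equal sections or a sequence of column indices at which to split. >>> import numpy as np >>> a = np.arange(16.0).reshape(4,4) >>> np.hsplit(a, 2) [array([[ 0., 1.], [ 4., 5.], [ 8., 9.], [12., 13.]]), array([[ 2., 3.], [ 6., 7.], [10., 11.], [14., 15.]])]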
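Since hstack is described as rebuilding arrays divided by hsplit, a quick round trip makes the relationship concrete. The sketch also shows the index-list form of the second argument.

```python
import numpy as np

a = np.arange(16.0).reshape(4, 4)

# Integer argument: split into that many equal column blocks.
left, right = np.hsplit(a, 2)
print(left.shape, right.shape)      # (4, 2) (4, 2)

# hstack puts the pieces back together.
rebuilt = np.hstack((left, right))
print(np.array_equal(rebuilt, a))   # True

# Sequence argument: split before the given column indices.
parts = np.hsplit(a, [1, 3])        # columns [0:1], [1:3], [3:]
print([p.shape for p in parts])     # [(4, 1), (4, 2), (4, 1)]
```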
attributes
Attributes are variables that store data within an object or a class. They represent the state or characteristics of the object and provide a way to access and manipulate that information. Key points about attributes in Python: Instance Attributes: Attributes that belong to a specific instance of a class. class Dog: def __init__(self, name, age): self.name = name # Instance attribute self.age = age # Instance attribute In this example, name and age are instance attributes of the Dog class. Accessing Attributes: You can access the attributes of an object using dot notation (object.attribute). For example: my_dog = Dog(name="Buddy", age=3) print(my_dog.name) # Accessing the 'name' attribute Setting Attributes: You can set or modify the values of attributes using assignment. my_dog.age = 4 # Setting the 'age' attribute to 4 Class Attributes: In addition to instance attributes, classes can also have attributes that are shared among all instances of the class. These are called class attributes. class Dog: species = "Canis familiaris" # Class attribute Class attributes are usually accessed through the class name, although they can also be read through any instance. print(Dog.species)
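One detail worth seeing in action is what happens when a class attribute is assigned through an instance. In the sketch below, the second dog (rex) and the value "Wolf" are made up for illustration.

```python
class Dog:
    species = "Canis familiaris"   # class attribute, shared by all instances

    def __init__(self, name, age):
        self.name = name           # instance attribute
        self.age = age             # instance attribute


buddy = Dog("Buddy", 3)
rex = Dog("Rex", 5)

print(buddy.species)   # Canis familiaris (found on the class)

# Assigning through an instance creates a new instance attribute on buddy
# that shadows the class attribute; the class and other instances are unchanged.
buddy.species = "Wolf"
print(buddy.species)   # Wolf
print(rex.species)     # Canis familiaris
print(Dog.species)     # Canis familiaris
```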
numpy.pi
pi = 3.1415926535897932384626433...
random.random([size])
random.random(size=None) Return random floats in the half-open interval [0.0, 1.0). Alias for random_sample to ease forward-porting to the new random API.
composition
Class composition = (HAS-A): a class that has attributes whose type is another class, e.g. a Pandas DataFrame that holds Series. Definition: Composition is a design principle where a class contains objects of other classes, allowing it to use their functionality. This is achieved by creating instances of other classes within the class, forming a "has-a" relationship. Example: class Engine: def start(self): return "Engine started" class Car: def __init__(self): self.engine = Engine() # Composition: Car has an Engine def start(self): return f"Car started. {self.engine.start()}" # Instance of the Car class my_car = Car() print(my_car.start()) # Outputs: Car started. Engine started In this example, the Car class has a composition relationship with the Engine class. It contains an instance of the Engine class as an attribute, and it can use the functionality of the engine.
dtype
dtype gives the data type of the elements of a NumPy array. In a NumPy array, all the elements have the same data type. e.g. for the NumPy array [[3,4,6], [0,8,1]], dtype will be int64 (the default integer type on most 64-bit platforms).
nbytes
nbytes is an attribute giving the total number of bytes consumed by the elements of the array (size * itemsize). >>> arr = np.zeros((1, 2, 3), dtype=np.complex128) >>> arr.nbytes 96
minimum and maximum
Similarly, Python has built-in min and max functions, used to find the minimum value and maximum value of any given array: In [5]: min(big_array), max(big_array) Out[5]: (1.1717128136634614e-06, 0.9999976784968716) NumPy's corresponding functions have similar syntax, and again operate much more quickly: In [6]: np.min(big_array), np.max(big_array) Out[6]: (1.1717128136634614e-06, 0.9999976784968716) In [7]: %timeit min(big_array) %timeit np.min(big_array) 10 loops, best of 3: 82.3 ms per loop 1000 loops, best of 3: 497 µs per loop For min, max, sum, and several other NumPy aggregates, a shorter syntax is to use methods of the array object itself: In [8]: print(big_array.min(), big_array.max(), big_array.sum()) 1.17171281366e-06 0.999997678497 499911.628197 Whenever possible, make sure that you are using the NumPy version of these aggregates when operating on NumPy arrays!
dunder methods
In Python, dunder methods are methods that allow instances of a class to interact with the built-in functions and operators of the language. The word "dunder" comes from "double underscore", because the names of dunder methods start and end with two underscores, for example __str__ or __add__. Typically, dunder methods are not invoked directly by the programmer, which makes it look as if they are called by magic. That is why dunder methods are sometimes also referred to as "magic methods".
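A minimal sketch of the two dunder methods named above. The Vector2 class is hypothetical and exists only to show that str() and the + operator call __str__ and __add__ behind the scenes.

```python
class Vector2:
    """Hypothetical 2-D vector, used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        # Called by str() and print().
        return f"Vector2({self.x}, {self.y})"

    def __add__(self, other):
        # Called by the + operator.
        return Vector2(self.x + other.x, self.y + other.y)


total = Vector2(1, 2) + Vector2(3, 4)   # invokes Vector2.__add__
print(total)                            # invokes Vector2.__str__ -> Vector2(4, 6)
```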
itemsize
itemsize returns the size (in bytes) of each element of a NumPy array. e.g. for the NumPy array [[3,4,6], [0,8,1]], itemsize will be 8, because the array's elements are 64-bit integers (int64, the default on most 64-bit platforms) and each one occupies 8 bytes.
ndim
ndim represents the number of dimensions (axes) of the ndarray. e.g. for this 2-dimensional array [ [3,4,6], [0,8,1]], value of ndim will be 2. This ndarray has two dimensions (axes) - rows (axis=0) and columns (axis=1)
&
np.bitwise_and
~
np.bitwise_not
|
np.bitwise_or
^
np.bitwise_xor
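The operator-to-ufunc pairs above are most often used to combine boolean masks. A short sketch, with sample values chosen only for illustration, that also shows why the keyword and cannot be used on whole arrays:

```python
import numpy as np

x = np.array([1, 4, 6, 9])

# & works element by element; the parentheses are needed because
# & binds more tightly than the comparison operators.
mask = (x > 2) & (x < 8)
print(mask.tolist())                 # [False, True, True, False]
print(np.array_equal(mask, np.bitwise_and(x > 2, x < 8)))  # True

print((~mask).tolist())              # [True, False, False, True]
print(x[mask].tolist())              # [4, 6]

# The keyword `and` asks for the truth value of a whole array, which is ambiguous.
try:
    (x > 2) and (x < 8)
    and_failed = False
except ValueError:
    and_failed = True
print(and_failed)                    # True
```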
np.vstack((a1, a2, ...)[, dtype])
Stacks arrays in sequence vertically (row-wise). tup: sequence of ndarrays. The arrays must have the same shape along all but the first axis; 1-D arrays must have the same length. This is equivalent to concatenation along the first axis after 1-D arrays of shape (N,) have been reshaped to (1,N). Rebuilds arrays divided by vsplit. This function makes most sense for arrays with up to 3 dimensions, for instance pixel data with a height (first axis), width (second axis), and r/g/b channels (third axis). The functions concatenate, stack and block provide more general stacking and concatenation operations. >>> a = np.array([1, 2, 3]) >>> b = np.array([4, 5, 6]) >>> np.vstack((a,b)) array([[1, 2, 3], [4, 5, 6]])
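The "reshape to (1, N), then concatenate along the first axis" description can be verified directly. In this sketch the extra grid array is made up for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

stacked = np.vstack((a, b))
print(stacked.shape)   # (2, 3)

# Same result as reshaping each 1-D array to (1, N) and concatenating on axis 0.
manual = np.concatenate((a.reshape(1, 3), b.reshape(1, 3)), axis=0)
print(np.array_equal(stacked, manual))   # True

# 2-D arrays only need matching column counts.
grid = np.array([[7, 8, 9], [10, 11, 12]])
taller = np.vstack((stacked, grid))
print(taller.shape)    # (4, 3)
```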
shape
shape is a tuple of integers representing the size of the ndarray in each dimension. e.g. for this 2-dimensional array [ [3,4,6], [0,8,1]], value of shape will be (2,3) because this ndarray has two dimensions - rows and columns - and the number of rows is 2 and the number of columns is 3
size
size is the total number of elements in the ndarray. It is equal to the product of elements of the shape. e.g. for this 2-dimensional array [ [3,4,6], [0,8,1]], shape is (2,3), size will be product (multiplication) of 2 and 3 i.e. (2*3) = 6. Hence, the size is 6.
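The attribute cards (dtype, nbytes, itemsize, ndim, shape, size) all use the same example array, so they can be checked together. The dtype is set explicitly here, as an assumption, so the numbers do not depend on the platform default.

```python
import numpy as np

# dtype fixed to int64 so itemsize is 8 on every platform.
arr = np.array([[3, 4, 6], [0, 8, 1]], dtype=np.int64)

print(arr.dtype)      # int64
print(arr.ndim)       # 2  (rows and columns)
print(arr.shape)      # (2, 3)
print(arr.size)       # 6  (2 * 3)
print(arr.itemsize)   # 8  bytes per element
print(arr.nbytes)     # 48 (size * itemsize)
```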
np.vsplit(a, indices)
Splits an array into multiple sub-arrays vertically (row-wise). vsplit is equivalent to split with axis=0 (the default): the array is always split along the first axis regardless of the array dimension. >>> x = np.arange(16.0).reshape(4, 4) >>> x array([[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.], [12., 13., 14., 15.]]) >>> np.vsplit(x, 2) [array([[0., 1., 2., 3.], [4., 5., 6., 7.]]), array([[ 8., 9., 10., 11.], [12., 13., 14., 15.]])] >>> np.vsplit(x, np.array([3, 6])) [array([[ 0., 1., 2., 3.], [ 4., 5., 6., 7.], [ 8., 9., 10., 11.]]), array([[12., 13., 14., 15.]]), array([], shape=(0, 4), dtype=float64)]
array slicing (2.2) One-dimensional subarrays
In [16]: x = np.arange(10) x Out[16]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [17]: x[:5] # first five elements Out[17]: array([0, 1, 2, 3, 4]) In [18]: x[5:] # elements from index 5 onward Out[18]: array([5, 6, 7, 8, 9]) In [19]: x[4:7] # middle sub-array Out[19]: array([4, 5, 6]) In [20]: x[::2] # every other element Out[20]: array([0, 2, 4, 6, 8]) In [21]: x[1::2] # every other element, starting at index 1 Out[21]: array([1, 3, 5, 7, 9]) In [22]: x[::-1] # all elements, reversed Out[22]: array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0]) In [23]: x[5::-2] # reversed every other from index 5 Out[23]: array([5, 3, 1])
Array arithmetic
In [7]: x = np.arange(4) print("x =", x) print("x + 5 =", x + 5) print("x - 5 =", x - 5) print("x * 2 =", x * 2) print("x / 2 =", x / 2) print("x // 2 =", x // 2) # floor division x = [0 1 2 3] x + 5 = [5 6 7 8] x - 5 = [-5 -4 -3 -2] x * 2 = [0 2 4 6] x / 2 = [ 0. 0.5 1. 1.5] x // 2 = [0 0 1 1] There is also a unary ufunc for negation, a ** operator for exponentiation, and a % operator for modulus: In [8]: print("-x = ", -x) print("x ** 2 = ", x ** 2) print("x % 2 = ", x % 2) -x = [ 0 -1 -2 -3] x ** 2 = [0 1 4 9] x % 2 = [0 1 0 1] In addition, these can be strung together however you wish, and the standard order of operations is respected: In [9]: -(0.5*x + 1) ** 2 Out[9]: array([-1. , -2.25, -4. , -6.25]) Each of these arithmetic operations is simply a convenient wrapper around a specific function built into NumPy; for example, the + operator is a wrapper for the add function: In [10]: np.add(x, 2) Out[10]: array([2, 3, 4, 5])
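The card names only np.add as the function behind an operator. The sketch below pairs each of the other operators used above with the NumPy ufunc it corresponds to and checks that both give the same result.

```python
import numpy as np

x = np.arange(4)

# Each entry: (result via operator, result via the equivalent ufunc).
pairs = {
    "x + 5":  (x + 5,  np.add(x, 5)),
    "x - 5":  (x - 5,  np.subtract(x, 5)),
    "x * 2":  (x * 2,  np.multiply(x, 2)),
    "x / 2":  (x / 2,  np.divide(x, 2)),
    "x // 2": (x // 2, np.floor_divide(x, 2)),
    "x ** 2": (x ** 2, np.power(x, 2)),
    "x % 2":  (x % 2,  np.mod(x, 2)),
    "-x":     (-x,     np.negative(x)),
}

for label, (via_operator, via_ufunc) in pairs.items():
    print(label, np.array_equal(via_operator, via_ufunc))  # True on every line

all_match = all(np.array_equal(a, b) for a, b in pairs.values())
```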
np.abs(a)
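np.abs is shorthand for np.absolute, which computes the absolute value element-wise. For complex input it returns the magnitude. >>> x = np.array([-1.2, 1.2]) >>> np.absolute(x) array([1.2, 1.2]) >>> np.absolute(1.2 + 1j) 1.5620499351813308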
array slicing (2.2) Multi-dimensional subarrays
In [24]: x2 Out[24]: array([[12, 5, 2, 4], [ 7, 6, 8, 8], [ 1, 6, 7, 7]]) In [25]: x2[:2, :3] # two rows, three columns Out[25]: array([[12, 5, 2], [ 7, 6, 8]]) In [26]: x2[:3, ::2] # all rows, every other column Out[26]: array([[12, 2], [ 7, 8], [ 1, 7]]) Finally, subarray dimensions can even be reversed together: In [27]: x2[::-1, ::-1] Out[27]: array([[ 7, 7, 6, 1], [ 8, 8, 6, 7], [ 4, 2, 5, 12]]) Accessing array rows and columns: One commonly needed routine is accessing single rows or columns of an array. This can be done by combining indexing and slicing, using an empty slice marked by a single colon (:): In [28]: print(x2[:, 0]) # first column of x2 [12 7 1] In [29]: print(x2[0, :]) # first row of x2 [12 5 2 4] In the case of row access, the empty slice can be omitted for a more compact syntax: In [30]: print(x2[0]) # equivalent to x2[0, :] [12 5 2 4]
array slicing (2.2)
x[start:stop:step] If any of these are unspecified, they default to the values start=0, stop=size of dimension, step=1. We'll take a look at accessing sub-arrays in one dimension and in multiple dimensions.
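A small sketch of the slice defaults described above, using an illustrative 10-element array. It also shows the negative-step case, where the defaults for start and stop are swapped.

```python
import numpy as np

x = np.arange(10)

# Omitted values fall back to start=0, stop=size of the dimension, step=1.
full_default = x[::]
full_explicit = x[0:x.size:1]
print(np.array_equal(full_default, full_explicit))   # True

# With a negative step the defaults for start and stop are swapped,
# so x[::-1] walks from the last element back to the first.
reversed_all = x[::-1]
print(reversed_all.tolist())   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

from_five = x[5::-2]
print(from_five.tolist())      # [5, 3, 1]
```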
numpy.log
Natural logarithm, element-wise. x: array_like, input value. >>> np.log([1, np.e, np.e**2, 0]) array([ 0., 1., 2., -inf])
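To connect np.log with the np.log10 card, the sketch below checks the change-of-base relationship on a few sample values (chosen for illustration) and shows the -inf result for zero, with the divide-by-zero warning silenced.

```python
import numpy as np

vals = np.array([1.0, 10.0, 100.0])

# Change of base: log10(v) == ln(v) / ln(10), up to floating-point rounding.
matches = np.allclose(np.log10(vals), np.log(vals) / np.log(10))
print(matches)        # True

# log(0) is -inf; errstate suppresses the RuntimeWarning NumPy would emit.
with np.errstate(divide="ignore"):
    log_zero = np.log(0.0)
print(log_zero)       # -inf
```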
