Tutorial 1: Mathematical Foundations and Terminology¶
Course: Mathematics for Machine Learning Instructor: Mohammed Alnemari
š Learning Objectives¶
By the end of this tutorial, you will understand:
- Essential mathematical notation and terminology
- Basic set theory concepts
- Fundamental number systems
- Vector and matrix terminology
- Key mathematical operations
Part 1: Mathematical Notation¶
1.1 Basic Symbols¶
| Symbol | Meaning | Example |
|---|---|---|
| \(\in\) | "is an element of" / "belongs to" | \(x \in \mathbb{R}\) means "x is a real number" |
| \(\notin\) | "is not an element of" | \(i \notin \mathbb{R}\) |
| \(\subset\) | "is a subset of" | \(\mathbb{N} \subset \mathbb{Z}\) |
| \(\cup\) | "union" | \(A \cup B\) (all elements in A or B) |
| \(\cap\) | "intersection" | \(A \cap B\) (elements in both A and B) |
| \(\emptyset\) | "empty set" | A set with no elements |
| \(\forall\) | "for all" | \(\forall x \in \mathbb{R}\) means "for all x in real numbers" |
| \(\exists\) | "there exists" | \(\exists x : x > 0\) means "there exists an x such that x > 0" |
1.2 Common Notation Conventions¶
Scalars (single numbers): - Lower case letters: \(a, b, c, x, y, z\) - Greek letters: \(\alpha, \beta, \gamma, \lambda, \mu\)
Vectors (ordered lists of numbers): - Bold lower case: \(\mathbf{x}, \mathbf{y}, \mathbf{v}, \mathbf{w}\) - Sometimes with arrows: \(\vec{x}, \vec{y}\)
Matrices (rectangular arrays of numbers): - Bold upper case: \(\mathbf{A}, \mathbf{B}, \mathbf{X}, \mathbf{Y}\) - Sometimes just upper case: \(A, B, X, Y\)
Sets: - Upper case letters: \(A, B, S, V\) - Number systems: \(\mathbb{N}, \mathbb{Z}, \mathbb{Q}, \mathbb{R}, \mathbb{C}\)
Part 2: Number Systems¶
2.1 Common Number Systems¶
| Symbol | Name | Description | Examples |
|---|---|---|---|
| \(\mathbb{N}\) | Natural numbers | Positive integers | \(1, 2, 3, 4, \ldots\) |
| \(\mathbb{Z}\) | Integers | Whole numbers | \(\ldots, -2, -1, 0, 1, 2, \ldots\) |
| \(\mathbb{Q}\) | Rational numbers | Fractions | \(\frac{1}{2}, \frac{3}{4}, -\frac{5}{3}\) |
| \(\mathbb{R}\) | Real numbers | All numbers on number line | \(\pi, \sqrt{2}, -3.5\) |
| \(\mathbb{C}\) | Complex numbers | Numbers with imaginary part | \(2 + 3i, -1 + i\) |
2.2 Vector Spaces¶
\(\mathbb{R}^n\) - The set of all n-dimensional real vectors
- \(\mathbb{R}^1\) = Real numbers (1D)
- \(\mathbb{R}^2\) = Pairs of real numbers (2D plane)
- \(\mathbb{R}^3\) = Triples of real numbers (3D space)
- \(\mathbb{R}^n\) = n-tuples of real numbers (n-dimensional space)
Example: $\(\mathbf{x} = \begin{bmatrix} 2 \\ 3 \end{bmatrix} \in \mathbb{R}^2\)$
This means \(\mathbf{x}\) is a 2-dimensional vector with components 2 and 3.
Part 3: Vectors¶
3.1 What is a Vector?¶
A vector is an ordered list of numbers representing: - A point in space - A direction and magnitude - Features in machine learning
Example in \(\mathbb{R}^3\): $\(\mathbf{v} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\)$
3.2 Vector Terminology¶
Dimension: The number of components in a vector
Notation: - Column vector (default): \(\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\) - Row vector: \(\mathbf{x}^T = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}\)
Components/Entries: The individual numbers in a vector - \(x_1\) is the first component - \(x_2\) is the second component - \(x_i\) is the i-th component
3.3 Basic Vector Operations¶
Addition¶
Example: $\(\begin{bmatrix} 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 4 \\ 6 \end{bmatrix}\)$
Scalar Multiplication¶
Example: $\(3 \cdot \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 6 \end{bmatrix}\)$
Dot Product (Inner Product)¶
Example: $\(\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \cdot \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} = 1(4) + 2(5) + 3(6) = 4 + 10 + 18 = 32\)$
Part 4: Matrices¶
4.1 What is a Matrix?¶
A matrix is a rectangular array of numbers arranged in rows and columns.
General form: $\(\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}\)$
4.2 Matrix Terminology¶
Dimension/Size: \(m \times n\) where - \(m\) = number of rows - \(n\) = number of columns
Example: $\(\mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}_{2 \times 3}\)$
This is a \(2 \times 3\) matrix (2 rows, 3 columns).
Entry/Element: \(a_{ij}\) is the element in row \(i\), column \(j\)
Square Matrix: A matrix where \(m = n\) (same number of rows and columns)
Example: $\(\mathbf{B} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}_{2 \times 2}\)$
4.3 Special Matrices¶
Identity Matrix (\(\mathbf{I}\))¶
A square matrix with 1s on the diagonal and 0s elsewhere.
Property: \(\mathbf{A} \mathbf{I} = \mathbf{I} \mathbf{A} = \mathbf{A}\)
Zero Matrix¶
A matrix where all elements are zero.
Diagonal Matrix¶
A square matrix where all off-diagonal elements are zero.
Transpose (\(\mathbf{A}^T\))¶
Flip rows and columns.
Example: $\(\mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}, \quad \mathbf{A}^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}\)$
Rule: \((a_{ij})^T = a_{ji}\)
Part 5: Matrix Operations¶
5.1 Matrix Addition¶
Add corresponding elements (matrices must have same dimensions).
Example: $\(\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix}\)$
5.2 Scalar Multiplication¶
Multiply every element by a scalar.
Example: $\(2 \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix}\)$
5.3 Matrix Multiplication¶
Rule: To multiply \(\mathbf{A}_{m \times n}\) and \(\mathbf{B}_{n \times p}\): - Number of columns in \(\mathbf{A}\) must equal number of rows in \(\mathbf{B}\) - Result is \(\mathbf{C}_{m \times p}\)
Formula: $\(c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}\)$
Example: $\(\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 1(5) + 2(7) & 1(6) + 2(8) \\ 3(5) + 4(7) & 3(6) + 4(8) \end{bmatrix} = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}\)$
Important: Matrix multiplication is NOT commutative - \(\mathbf{AB} \neq \mathbf{BA}\) (in general)
Part 6: Important Concepts¶
6.1 Linear Combination¶
A sum of scalar multiples of vectors.
Example: $\(\mathbf{v} = 2\begin{bmatrix} 1 \\ 0 \end{bmatrix} + 3\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}\)$
6.2 Linear Independence¶
Vectors are linearly independent if no vector can be written as a linear combination of the others.
Example (Independent): $\(\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\)$
Example (Dependent): $\(\mathbf{v}_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} 2 \\ 4 \end{bmatrix}\)$ (Here \(\mathbf{v}_2 = 2\mathbf{v}_1\))
6.3 Span¶
The span of a set of vectors is all possible linear combinations of those vectors.
6.4 Basis¶
A basis for a vector space is a set of linearly independent vectors that span the space.
Standard basis for \(\mathbb{R}^3\): $\(\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \mathbf{e}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\)$
6.5 Dimension¶
The dimension of a vector space is the number of vectors in any basis.
- \(\mathbb{R}^1\) has dimension 1
- \(\mathbb{R}^2\) has dimension 2
- \(\mathbb{R}^3\) has dimension 3
- \(\mathbb{R}^n\) has dimension n
Part 7: Functions and Mappings¶
7.1 Function Notation¶
Means: "function \(f\) maps from set \(A\) to set \(B\)" - \(A\) is the domain (input set) - \(B\) is the codomain (output set) - \(f(x)\) is the image of \(x\)
Example: $\(f: \mathbb{R} \to \mathbb{R}, \quad f(x) = x^2\)$
7.2 Linear Transformation¶
A function \(T: \mathbb{R}^n \to \mathbb{R}^m\) is linear if:
- \(T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})\) (additivity)
- \(T(c\mathbf{u}) = cT(\mathbf{u})\) (homogeneity)
Key fact: Every linear transformation can be represented by a matrix!
Part 8: Norms and Distance¶
8.1 Vector Norm¶
A norm measures the "length" or "magnitude" of a vector.
Euclidean Norm (L2 norm): $\(\|\mathbf{x}\|_2 = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} = \sqrt{\sum_{i=1}^{n} x_i^2}\)$
Example: $\(\left\|\begin{bmatrix} 3 \\ 4 \end{bmatrix}\right\|_2 = \sqrt{3^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25} = 5\)$
Other norms: - L1 norm: \(\|\mathbf{x}\|_1 = |x_1| + |x_2| + \cdots + |x_n|\) - Lā norm: \(\|\mathbf{x}\|_\infty = \max\{|x_1|, |x_2|, \ldots, |x_n|\}\)
8.2 Distance¶
The distance between two vectors: $\(d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\|\)$
Example: $\(d\left(\begin{bmatrix} 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 4 \\ 6 \end{bmatrix}\right) = \left\|\begin{bmatrix} -3 \\ -4 \end{bmatrix}\right\| = \sqrt{9 + 16} = 5\)$
Part 9: Summation and Product Notation¶
9.1 Summation (\(\Sigma\))¶
Examples: $\(\sum_{i=1}^{5} i = 1 + 2 + 3 + 4 + 5 = 15\)$
9.2 Product (\(\Pi\))¶
Example: $\(\prod_{i=1}^{4} i = 1 \times 2 \times 3 \times 4 = 24\)$
Part 10: Common Functions¶
10.1 Exponential Function¶
Properties: - \(e^0 = 1\) - \(e^{a+b} = e^a \cdot e^b\) - \((e^x)' = e^x\)
10.2 Logarithm¶
Natural logarithm: \(\ln(x) = \log_e(x)\)
Properties: - \(\ln(1) = 0\) - \(\ln(ab) = \ln(a) + \ln(b)\) - \(\ln(a^b) = b\ln(a)\)
10.3 Sigmoid Function¶
Properties: - Range: \((0, 1)\) - \(\sigma(0) = 0.5\) - Used in logistic regression and neural networks
Summary: Key Takeaways¶
Essential Notation¶
- Vectors: \(\mathbf{x} \in \mathbb{R}^n\)
- Matrices: \(\mathbf{A} \in \mathbb{R}^{m \times n}\)
- Sets: \(A \subset B\), \(x \in S\)
Fundamental Operations¶
- Vector addition, scalar multiplication, dot product
- Matrix addition, multiplication, transpose
- Norms and distances
Core Concepts¶
- Linear independence and span
- Basis and dimension
- Linear transformations
- Functions and mappings
Practice Problems¶
Problem 1¶
Calculate the following: $\(\begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}\)$
Problem 2¶
Find the norm: $\(\left\|\begin{bmatrix} 5 \\ 12 \end{bmatrix}\right\|_2\)$
Problem 3¶
Multiply these matrices: $\(\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 1 & 3 \end{bmatrix}\)$
Problem 4¶
Are these vectors linearly independent? $\(\mathbf{v}_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad \mathbf{v}_2 = \begin{bmatrix} 3 \\ 6 \end{bmatrix}\)$
Solutions¶
Solution 1: $\(2(1) + 3(4) + 1(2) = 2 + 12 + 2 = 16\)$
Solution 2: $\(\sqrt{5^2 + 12^2} = \sqrt{25 + 144} = \sqrt{169} = 13\)$
Solution 3: $\(\begin{bmatrix} 1(2) + 2(1) & 1(0) + 2(3) \\ 3(2) + 4(1) & 3(0) + 4(3) \end{bmatrix} = \begin{bmatrix} 4 & 6 \\ 10 & 12 \end{bmatrix}\)$
Solution 4: No, they are dependent because \(\mathbf{v}_2 = 3\mathbf{v}_1\)
Course: Mathematics for Machine Learning Instructor: Mohammed Alnemari
Next: Tutorial 2 - Vector Spaces and Linear Transformations