Matrix Transformations and Eigenvalues

Matrices are machines that reshape space. Once you understand what a matrix does to every point, you understand the whole transformation.

Type: Build
Languages: Python, Julia
Prerequisites: Phase 1, Lessons 01-02 (linear algebra intuition, vector and matrix operations)
Estimated time: about 75 minutes

Learning objectives

Build rotation, scaling, shearing, and reflection matrices and apply them to 2D and 3D points.
Compose several transformations through matrix multiplication and verify that order matters.
Compute the eigenvalues and eigenvectors of a 2x2 matrix from the characteristic equation.
Explain why eigenvalues and eigenvectors determine PCA directions, RNN stability, and spectral clustering behavior.

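The problem

When you read about PCA, you see the sentence "find the eigenvectors of the covariance matrix." When you read about model stability, you see "check that every eigenvalue has magnitude less than 1." When you read about data augmentation, you see "apply a random rotation." None of these statements really land until you understand what a matrix does to space geometrically.

A matrix is more than a grid of numbers. It is a machine that moves space. Rotation matrices rotate points, scaling matrices stretch or shrink them, and shearing matrices slant them. The linear part of every neural network layer is one of these operations or a combination of them. This lesson makes those operations concrete.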

Pre-test

2 questions · Check how much you already know before starting this lesson

1. What is an eigenvector of a matrix?

2. What does the determinant of a 2D transformation matrix represent geometrically?


Concepts

Representing transformations as matrices

Every linear transformation in 2D can be written as a 2x2 matrix. The columns of the matrix tell you where the basis vectors [1, 0] and [0, 1] land. Every other point follows from those two by linearity.

graph LR
    subgraph Before["Standard basis"]
        e1["e1 = [1, 0] (x direction)"]
        e2["e2 = [0, 1] (y direction)"]
    end
    subgraph Transform["Matrix M"]
        M["M = columns are the new basis vectors"]
    end
    subgraph After["After applying M"]
        e1p["e1' = new x basis vector"]
        e2p["e2' = new y basis vector"]
    end
    e1 --> M --> e1p
    e2 --> M --> e2p
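
As a quick check of the idea that the columns of a matrix are the images of the basis vectors, here is a minimal NumPy sketch. The matrix `M` and the point `p` are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical example matrix; any 2x2 matrix would work here.
M = np.array([[2.0, -1.0],
              [1.0,  3.0]])

e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

# The images of the basis vectors are exactly the columns of M.
e1_image = M @ e1   # first column of M
e2_image = M @ e2   # second column of M

# Any other point follows by linearity: p = 3*e1 + 2*e2.
p = np.array([3.0, 2.0])
p_image = M @ p
by_linearity = 3.0 * e1_image + 2.0 * e2_image

print("M @ e1 =", e1_image, " M @ e2 =", e2_image)
print("M @ p  =", p_image, " via basis images =", by_linearity)
```

Both routes give the same point, which is why knowing where the two basis vectors land is enough to know the whole transformation.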

Rotation

A 2D rotation by angle theta preserves distances and angles. Every point moves along a circular arc centered at the origin.

graph LR
    subgraph Before["Before rotation"]
        A["A(2, 1)"]
        B["B(0, 2)"]
    end
    subgraph Rot["Rotate 45 degrees"]
        R["R(θ) = [[cos θ, -sin θ], [sin θ, cos θ]]"]
    end
    subgraph After["After rotation"]
        Ap["A'(0.71, 2.12)"]
        Bp["B'(-1.41, 1.41)"]
    end
    A --> R --> Ap
    B --> R --> Bp
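
The numbers in the diagram, and the claim that rotation preserves distances, can be checked with a few lines of plain Python. This is a small sketch; the helper name `rotate_2d` is an assumption made for this example, not part of the lesson code.

```python
import math

def rotate_2d(point, theta):
    """Rotate a 2D point counterclockwise by theta radians about the origin."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)

theta = math.radians(45)
A, B = (2.0, 1.0), (0.0, 2.0)
A_rot, B_rot = rotate_2d(A, theta), rotate_2d(B, theta)

# Distance between the two points before and after the rotation.
dist_before = math.dist(A, B)
dist_after = math.dist(A_rot, B_rot)

print(f"A' = ({A_rot[0]:.2f}, {A_rot[1]:.2f})")
print(f"B' = ({B_rot[0]:.2f}, {B_rot[1]:.2f})")
print(f"distance before = {dist_before:.4f}, after = {dist_after:.4f}")
```

The two distances agree up to floating-point rounding.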

In 3D, a rotation happens about an axis. Each coordinate axis has its own rotation matrix.

Rz(theta) = | cos  -sin  0 |     rotation about the z-axis
            | sin   cos  0 |     (the x-y plane rotates, z stays fixed)
            |  0     0   1 |

Rx(theta) = | 1   0     0    |   rotation about the x-axis
            | 0  cos  -sin   |   (the y-z plane rotates, x stays fixed)
            | 0  sin   cos   |

Ry(theta) = |  cos  0  sin |     rotation about the y-axis
            |   0   1   0  |     (the x-z plane rotates, y stays fixed)
            | -sin  0  cos |

Scaling

Scaling stretches or shrinks each axis independently.

graph LR
    subgraph Before["Before scaling"]
        A["A(2, 1)"]
        B["B(0, 2)"]
    end
    subgraph Scale["Scale sx=2, sy=0.5"]
        S["S = [[2, 0], [0, 0.5]]"]
    end
    subgraph After["After scaling"]
        Ap["A'(4, 0.5)"]
        Bp["B'(0, 1)"]
    end
    A --> S --> Ap
    B --> S --> Bp

Shearing

Shearing holds one axis fixed and slants the other. It turns a rectangle into a parallelogram.

graph LR
    subgraph Before["Before shearing"]
        A["A(1, 0)"]
        B["B(0, 1)"]
    end
    subgraph Shear["Shear along x, k=1"]
        Sh["Shx = [[1, k], [0, 1]]"]
    end
    subgraph After["After shearing"]
        Ap["A'(1, 0) unchanged"]
        Bp["B'(1, 1) moved"]
    end
    A --> Sh --> Ap
    B --> Sh --> Bp

The shear matrices are:

Shx = [[1, k], [0, 1]]: shifts x by k * y.
Shy = [[1, 0], [k, 1]]: shifts y by k * x.

Reflection

Reflection mirrors points across an axis or a line.

graph LR
    subgraph Before["Before reflection"]
        A["A(2, 1)"]
    end
    subgraph Reflect["Reflect across the y-axis"]
        R["[[-1, 0], [0, 1]]"]
    end
    subgraph After["After reflection"]
        Ap["A'(-2, 1)"]
    end
    A --> R --> Ap

The reflection matrices are:

Reflection across the y-axis: [[-1, 0], [0, 1]]
Reflection across the x-axis: [[1, 0], [0, -1]]

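Composition: chaining transformations

Applying transformation A and then B is the same as multiplying the matrices: result = B @ A @ point. The matrix closest to the point acts first, so the product reads right to left. Order matters: rotating and then scaling gives a different result from scaling and then rotating.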

graph LR
    subgraph Path1["Rotate 90 degrees, then scale (2, 0.5)"]
        P1["(1, 0)"] -->|"Rotate 90 degrees"| P2["(0, 1)"] -->|"Scale"| P3["(0, 0.5)"]
    end

Composed: S @ R = [[0, -2], [0.5, 0]]

graph LR
    subgraph Path2["Scale (2, 0.5), then rotate 90 degrees"]
        Q1["(1, 0)"] -->|"Scale"| Q2["(2, 0)"] -->|"Rotate 90 degrees"| Q3["(0, 2)"]
    end

Composed: R @ S = [[0, -0.5], [2, 0]]

The results differ. Matrix multiplication is not commutative.

Eigenvalues and eigenvectors

Most vectors change direction when a matrix acts on them. Eigenvectors are special: the matrix only scales them and keeps them on the same line through the origin. The scale factor is the eigenvalue.

A @ v = lambda * v

v is the eigenvector: the direction that survives.
lambda is the eigenvalue: how much that direction is stretched.

Example: A = | 2  1 |
             | 1  2 |

Eigenvector [1, 1], eigenvalue 3:
  A @ [1,1] = [3, 3] = 3 * [1, 1]     (same direction, scaled by 3)

Eigenvector [1, -1], eigenvalue 1:
  A @ [1,-1] = [1, -1] = 1 * [1, -1]  (same direction, unchanged)

This matrix stretches space by a factor of 3 along [1, 1] and leaves the [1, -1] direction unchanged. Every other vector is a combination of these two directions.
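
To make "every other vector is a combination of these two directions" concrete, the sketch below splits a vector into its two eigenvector components, scales each by its eigenvalue, and compares the result with a direct matrix product. The test vector `x` is a hypothetical choice.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
v1 = np.array([1.0, 1.0])    # eigenvector with eigenvalue 3
v2 = np.array([1.0, -1.0])   # eigenvector with eigenvalue 1

x = np.array([2.0, 0.0])     # hypothetical test vector

# Solve x = c1*v1 + c2*v2 for the coefficients c1 and c2.
c1, c2 = np.linalg.solve(np.column_stack([v1, v2]), x)

# Applying A scales each component by its own eigenvalue.
via_eigen = 3.0 * c1 * v1 + 1.0 * c2 * v2
direct = A @ x

print(f"x = {c1:.1f} * v1 + {c2:.1f} * v2")
print("scaled components:", via_eigen)
print("direct A @ x:     ", direct)
```

Here x = v1 + v2, so A @ x = 3 * v1 + 1 * v2 = [4, 2], which matches the direct product.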

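Eigendecomposition

If an n x n matrix has n linearly independent eigenvectors, it can be decomposed as:

A = V @ D @ V^(-1)

V = matrix whose columns are the eigenvectors
D = diagonal matrix with the eigenvalues on the diagonal
V^(-1) = inverse of V

Read right to left, this says: change into the eigenvector coordinate system (a pure rotation when A is symmetric), scale along each axis, then change back to the original coordinates.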
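
One practical consequence of this factorization is that powers of A only touch the diagonal: A^k = V @ D^k @ V^(-1). The sketch below checks that against `np.linalg.matrix_power`; the exponent `k = 5` is an arbitrary choice for illustration.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

vals, V = np.linalg.eig(A)

k = 5
# Raising A to the k-th power only raises the eigenvalues to the k-th power.
A_k_eig = V @ np.diag(vals ** k) @ np.linalg.inv(V)
A_k_direct = np.linalg.matrix_power(A, k)

print(np.round(A_k_eig, 6))
print(A_k_direct)
```

With eigenvalues 3 and 1, the fifth power has eigenvalues 243 and 1, which gives [[122, 121], [121, 122]].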

Why eigenvalues matter

PCA. The eigenvectors of the covariance matrix are the principal components. Each eigenvalue is the variance of the data along its component. Sorting by eigenvalue and keeping only the top k components gives dimensionality reduction.
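
The PCA recipe described here can be sketched in a few lines of NumPy. This is an illustrative sketch on made-up data (points stretched along the [1, 1] direction), not a full PCA implementation. It uses `np.linalg.eigh` because a covariance matrix is symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable

# Hypothetical toy data: 500 points stretched along the [1, 1] direction.
t = rng.normal(size=500)
noise = rng.normal(scale=0.2, size=500)
X = np.column_stack([t + noise, t - noise])

Xc = X - X.mean(axis=0)                 # center the data
cov = np.cov(Xc, rowvar=False)          # 2x2 covariance matrix

# eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
vals, vecs = np.linalg.eigh(cov)
order = np.argsort(vals)[::-1]          # sort from largest to smallest
vals, vecs = vals[order], vecs[:, order]

explained = vals / vals.sum()           # fraction of variance per component
projected = Xc @ vecs[:, :1]            # keep only the top component (k = 1)

print("explained variance ratio:", np.round(explained, 3))
print("first principal direction:", np.round(vecs[:, 0], 3))
```

The first principal direction should come out close to [1, 1] / sqrt(2), up to sign, and it should carry most of the variance.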

Stability. In recurrent networks and dynamical systems, an eigenvalue with magnitude greater than 1 makes repeated applications of the matrix blow up along its eigenvector. A magnitude below 1 makes them shrink toward zero. That is the vanishing/exploding gradient problem in one sentence.
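
The effect is easy to see in a simplified setting. The sketch below assumes a purely linear recurrence h <- W @ h with no inputs and no nonlinearity, which is a simplification of a real RNN. The two weight matrices are hypothetical examples.

```python
import numpy as np

def spectral_radius(W):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(W))))

def state_norm_after(W, steps=50):
    """Norm of h after applying h <- W @ h `steps` times, starting from all ones."""
    h = np.ones(W.shape[0])
    for _ in range(steps):
        h = W @ h
    return float(np.linalg.norm(h))

W_shrink = np.array([[0.5, 0.2], [0.2, 0.5]])   # eigenvalues 0.7 and 0.3
W_grow = np.array([[1.0, 0.2], [0.2, 1.0]])     # eigenvalues 1.2 and 0.8

for name, W in [("shrink", W_shrink), ("grow", W_grow)]:
    print(f"{name}: spectral radius = {spectral_radius(W):.2f}, "
          f"norm after 50 steps = {state_norm_after(W):.3e}")
```

With a spectral radius of 0.7 the state collapses toward zero, and with 1.2 it grows by several orders of magnitude over the same 50 steps.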

Spectral methods. Spectral graph neural networks use the eigenvalues of the adjacency matrix. Spectral clustering uses the eigenvalues and eigenvectors of the graph Laplacian. The eigenvectors reveal the structure of the graph.
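
A minimal sketch of the spectral clustering idea: build the unnormalized Laplacian L = D - A of a small graph and split the nodes by the sign of the eigenvector for the second-smallest eigenvalue (often called the Fiedler vector). The toy graph is a hypothetical example, and real spectral clustering typically adds normalization and a k-means step.

```python
import numpy as np

# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6

adjacency = np.zeros((n, n))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1.0

degree = np.diag(adjacency.sum(axis=1))
laplacian = degree - adjacency          # unnormalized graph Laplacian

# eigh handles symmetric matrices and returns eigenvalues in ascending order.
vals, vecs = np.linalg.eigh(laplacian)

fiedler = vecs[:, 1]                    # eigenvector of the second-smallest eigenvalue
labels = (fiedler > 0).astype(int)      # its sign splits the nodes into two groups

print("Laplacian eigenvalues:", np.round(vals, 3))
print("cluster labels:", labels)
```

The smallest eigenvalue is 0 because the graph is connected, and the sign pattern of the Fiedler vector separates the two triangles. Which triangle gets label 0 depends on the arbitrary sign of the eigenvector.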

The determinant is the volume scaling factor

The determinant of a transformation matrix tells you by what factor it scales area (2D) or volume (3D).

det = 1:   area preserved (rotation)
det = 2:   area doubled
det = 0:   space crushed to a lower dimension (singular)
det = -1:  area preserved but orientation flipped (reflection)

det(Rotation) = 1            (always)
det(Scale sx, sy) = sx * sy
det(Shear) = 1               (area preserved)
det(Reflection) = -1         (orientation flipped)
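
The area interpretation can be checked numerically: transform the corners of the unit square and measure the signed area of the result with the shoelace formula. This is a sketch with a helper name (`signed_area`) introduced only for this example.

```python
import numpy as np

def signed_area(points):
    """Signed polygon area via the shoelace formula (positive when counterclockwise)."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * float(np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y))

# Unit square, corners listed counterclockwise, so its signed area is +1.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])

c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
matrices = {
    "rotation 45": np.array([[c, -s], [s, c]]),
    "scale (2, 3)": np.diag([2.0, 3.0]),
    "shear k=1": np.array([[1.0, 1.0], [0.0, 1.0]]),
    "reflect y": np.array([[-1.0, 0.0], [0.0, 1.0]]),
    "singular": np.array([[1.0, 2.0], [2.0, 4.0]]),
}

areas = {}
for name, M in matrices.items():
    transformed = square @ M.T          # apply M to every corner (rows are points)
    areas[name] = signed_area(transformed)
    print(f"{name:13s} area = {areas[name]:+.4f}   det = {np.linalg.det(M):+.4f}")
```

The signed area of the transformed square matches the determinant in every case, including the negative sign for the reflection and the zero for the singular matrix.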

Build it

Step 1: Build the transformation matrices by hand (Python)

import math

def rotation_2d(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def scaling_2d(sx, sy):
    return [[sx, 0], [0, sy]]

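def shearing_2d(kx, ky):
    # kx shears along x (x += kx * y); ky shears along y (y += ky * x).
    # Set one of them to 0 for a pure shear with determinant 1.
    return [[1, kx], [ky, 1]]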

def reflection_x():
    return [[1, 0], [0, -1]]

def reflection_y():
    return [[-1, 0], [0, 1]]

def mat_vec_mul(matrix, vector):
    return [
        sum(matrix[i][j] * vector[j] for j in range(len(vector)))
        for i in range(len(matrix))
    ]

def mat_mul(a, b):
    rows_a, cols_b = len(a), len(b[0])
    cols_a = len(a[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]

point = [1.0, 0.0]
angle = math.pi / 4

rotated = mat_vec_mul(rotation_2d(angle), point)
print(f"(1,0) rotated 45 degrees: ({rotated[0]:.4f}, {rotated[1]:.4f})")

scaled = mat_vec_mul(scaling_2d(2, 3), [1.0, 1.0])
print(f"(1,1) scaled by (2,3): ({scaled[0]:.1f}, {scaled[1]:.1f})")

sheared = mat_vec_mul(shearing_2d(1, 0), [1.0, 1.0])
print(f"(1,1) sheared with kx=1: ({sheared[0]:.1f}, {sheared[1]:.1f})")

reflected = mat_vec_mul(reflection_y(), [2.0, 1.0])
print(f"(2,1) reflected across the y-axis: ({reflected[0]:.1f}, {reflected[1]:.1f})")

Step 2: Compose transformations

R = rotation_2d(math.pi / 2)
S = scaling_2d(2, 0.5)

rotate_then_scale = mat_mul(S, R)
scale_then_rotate = mat_mul(R, S)

point = [1.0, 0.0]
result1 = mat_vec_mul(rotate_then_scale, point)
result2 = mat_vec_mul(scale_then_rotate, point)

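print(f"Rotate 90 degrees, then scale: ({result1[0]:.2f}, {result1[1]:.2f})")
print(f"Scale, then rotate 90 degrees: ({result2[0]:.2f}, {result2[1]:.2f})")
# Compare with a tolerance, since exact float equality is fragile.
same = all(math.isclose(p, q, abs_tol=1e-9) for p, q in zip(result1, result2))
print(f"Equal? {same}")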

Step 3: Compute eigenvalues by hand (2x2)

The eigenvalues of a 2x2 matrix [[a, b], [c, d]] are the roots of the characteristic equation lambda^2 - (a+d)*lambda + (ad - bc) = 0.
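
For reference, this quadratic comes from expanding the definition det(A - lambda * I) = 0. The LaTeX below sketches that standard expansion, using the same entries a, b, c, d, and then applies the quadratic formula.

```latex
\begin{aligned}
\det(A - \lambda I)
  &= \det\begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix} \\
  &= (a - \lambda)(d - \lambda) - bc \\
  &= \lambda^2 - (a + d)\,\lambda + (ad - bc) = 0, \\[4pt]
\lambda_{\pm}
  &= \frac{(a + d) \pm \sqrt{(a + d)^2 - 4\,(ad - bc)}}{2}.
\end{aligned}
```

Here a + d is the trace and ad - bc is the determinant, which is why the function below computes those two quantities first.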

def eigenvalues_2x2(matrix):
    a, b = matrix[0]
    c, d = matrix[1]
    trace = a + d
    det = a * d - b * c
    discriminant = trace ** 2 - 4 * det
    if discriminant < 0:
        real = trace / 2
        imag = (-discriminant) ** 0.5 / 2
        return (complex(real, imag), complex(real, -imag))
    sqrt_disc = discriminant ** 0.5
    return ((trace + sqrt_disc) / 2, (trace - sqrt_disc) / 2)

def eigenvector_2x2(matrix, eigenvalue):
    a, b = matrix[0]
    c, d = matrix[1]
    if abs(b) > 1e-10:
        v = [b, eigenvalue - a]
    elif abs(c) > 1e-10:
        v = [eigenvalue - d, c]
    else:
        if abs(a - eigenvalue) < 1e-10:
            v = [1, 0]
        else:
            v = [0, 1]
    # abs() keeps the normalization correct for complex eigenvalues as well.
    mag = (abs(v[0]) ** 2 + abs(v[1]) ** 2) ** 0.5
    return [v[0] / mag, v[1] / mag]

A = [[2, 1], [1, 2]]
vals = eigenvalues_2x2(A)
print(f"Matrix: {A}")
print(f"Eigenvalues: {vals[0]:.4f}, {vals[1]:.4f}")

for val in vals:
    vec = eigenvector_2x2(A, val)
    result = mat_vec_mul(A, vec)
    scaled = [val * vec[0], val * vec[1]]
    print(f"  lambda={val:.1f}, v={[round(x,4) for x in vec]}")
    print(f"    A@v = {[round(x,4) for x in result]}")
    print(f"    l*v = {[round(x,4) for x in scaled]}")
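
The `eigenvalues_2x2` function above has a branch for a negative discriminant, which is the case where no real direction survives the transformation. A pure rotation is the standard example. The sketch below uses NumPy's `np.linalg.eigvals` on a 90-degree rotation to show the complex pair.

```python
import numpy as np

theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# A rotation turns every real vector, so its eigenvalues are a complex pair.
vals = np.linalg.eigvals(R)

print("eigenvalues:", np.round(vals, 4))
print("magnitudes: ", np.round(np.abs(vals), 4))
```

The eigenvalues are +i and -i up to rounding. Their magnitude is 1, consistent with a rotation neither stretching nor shrinking anything.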

Step 4: The determinant as a volume scaling factor

def det_2x2(matrix):
    return matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]

print(f"det(rotation 45) = {det_2x2(rotation_2d(math.pi/4)):.4f}")
print(f"det(scale 2,3)   = {det_2x2(scaling_2d(2, 3)):.1f}")
print(f"det(shear kx=1)  = {det_2x2(shearing_2d(1, 0)):.1f}")
print(f"det(reflect y)   = {det_2x2(reflection_y()):.1f}")

singular = [[1, 2], [2, 4]]
print(f"det(singular)     = {det_2x2(singular):.1f}")
print("Singular: the columns are proportional, so space collapses onto a single line.")

Use it

NumPy handles all of these computations with optimized routines.

import numpy as np

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

point = np.array([1.0, 0.0])
print(f"(1,0) rotated 45 degrees: {R @ point}")

S = np.diag([2.0, 3.0])
composed = S @ R
print(f"Rotate 45 degrees, then Scale(2,3): {composed @ point}")

A = np.array([[2, 1], [1, 2]], dtype=float)
eigenvalues, eigenvectors = np.linalg.eig(A)
print(f"\nEigenvalues: {eigenvalues}")
print(f"Eigenvectors (columns):\n{eigenvectors}")

for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    lam = eigenvalues[i]
    print(f"  A @ v{i} = {A @ v}, lambda * v{i} = {lam * v}")

print(f"\ndet(R) = {np.linalg.det(R):.4f}")
print(f"det(S) = {np.linalg.det(S):.1f}")

B = np.array([[3, 1], [0, 2]], dtype=float)
vals, vecs = np.linalg.eig(B)
D = np.diag(vals)
V = vecs
reconstructed = V @ D @ np.linalg.inv(V)
print(f"\nEigendecomposition B = V @ D @ V^-1:")
print(f"Original:\n{B}")
print(f"Reconstructed:\n{reconstructed}")

3D rotations with NumPy

def rotation_3d_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rotation_3d_x(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

point_3d = np.array([1.0, 0.0, 0.0])
rotated_z = rotation_3d_z(np.pi / 2) @ point_3d
rotated_x = rotation_3d_x(np.pi / 2) @ point_3d

print(f"\n3D point: {point_3d}")
print(f"Rotated 90 degrees about the z-axis: {np.round(rotated_z, 4)}")
print(f"Rotated 90 degrees about the x-axis: {np.round(rotated_x, 4)}")
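
The code above covers the z-axis and x-axis. For completeness, here is a sketch of the y-axis rotation from the Concepts section, together with a check of the two properties a rotation matrix should have: orthogonality and determinant +1. The function name `rotation_3d_y` is chosen to match the naming used above.

```python
import numpy as np

def rotation_3d_y(theta):
    """Rotation about the y-axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

Ry = rotation_3d_y(np.pi / 2)
rotated_y = Ry @ np.array([1.0, 0.0, 0.0])

# A rotation matrix should be orthogonal (R^T R = I) with determinant +1.
is_orthogonal = np.allclose(Ry.T @ Ry, np.eye(3))
det_Ry = np.linalg.det(Ry)

print(f"Rotated 90 degrees about the y-axis: {np.round(rotated_y, 4)}")
print(f"orthogonal: {is_orthogonal}, det = {det_Ry:.4f}")
```

With this sign convention, a 90-degree rotation about the y-axis sends the x unit vector to the negative z direction.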

Ship it

This lesson builds the geometric foundation for understanding PCA (Phase 2) and neural network weight analysis.

outputs/prompt-transformation-visualizer.md: a prompt that helps explain geometrically how a matrix changes space

The eigenvalue/eigenvector code built here uses the same core ideas as the algorithms behind dimensionality reduction, spectral clustering, and stability analysis in production ML systems.

Exercises

Apply rotation, scaling, and shearing to the four corners of the unit square: [0,0], [1,0], [1,1], [0,1]. Print the corner coordinates after each transformation. Verify that rotation preserves the distances between corners.
Find the eigenvalues of the matrix [[4, 2], [1, 3]] by hand using the characteristic equation. Then verify them with your own function and with NumPy.
Compose three transformations: a 30-degree rotation, scaling by [1.5, 0.8], and a shear with kx=0.3. Apply the composition to 8 points placed on a circle and print the before/after coordinates. Compute the determinant of the composed matrix and check that it equals the product of the individual determinants.

Key terms

Term	What people say	What it actually means
Rotation matrix	It rotates things	An orthogonal matrix that moves points along circular arcs while preserving distances and angles. Its determinant is always 1.
Scaling matrix	It makes things bigger	A diagonal matrix that stretches or shrinks each axis independently. Its determinant is the product of the scale factors.
Shearing matrix	It slants things	A matrix that shifts one coordinate in proportion to another, turning rectangles into parallelograms. Its determinant is 1.
Reflection	Flipping like a mirror	A matrix that flips space across an axis or plane. Its determinant is -1.
Composition	Doing two things in a row	Chaining operations by multiplying transformation matrices. Order matters: `B @ A` means apply A first, then B.
Eigenvector	A special direction	A direction that the matrix only scales and does not rotate. A fingerprint of the transformation.
Eigenvalue	How much it stretches	The scalar factor by which the matrix scales an eigenvector. It can be negative, or complex when the transformation involves rotation.
Eigendecomposition	Breaking a matrix apart	Writing a matrix as `V @ D @ V^(-1)` to separate its fundamental scaling directions from their magnitudes.
Determinant	One number that comes out of a matrix	The factor by which a transformation scales area (2D) or volume (3D). If it is 0, the transformation is not invertible.
Characteristic equation	The equation that gives the eigenvalues	`det(A - lambda * I) = 0`. The roots of this polynomial are the eigenvalues.

Further reading

3Blue1Brown: Linear Transformations. A good visual explanation of how a matrix changes space.
3Blue1Brown: Eigenvectors and Eigenvalues. A good review of the geometric meaning of eigenvectors.
MIT 18.06 Linear Algebra. Gilbert Strang's linear algebra course, for a deeper treatment of eigenvalues and eigenvectors.

Review questions

3 questions · Answer all of them correctly to mark the lesson complete

1. Why does the order of matrix transformations matter? In other words, why do `R @ S` and `S @ R` differ?

2. What happens in a recurrent neural network when the weight matrix has an eigenvalue with magnitude greater than 1?

3. The matrix A = [[2, 1], [1, 2]] has eigenvalues 3 and 1. What does the eigendecomposition `A = V @ D @ V^(-1)` show?
