Understanding the fundamentals of random walks and how RandomWalker implements them.
A random walk is a mathematical model describing a path consisting of a succession of random steps. At each point in time, the next step is determined by chance.
Imagine flipping a coin:
- Heads: Move one step forward (+1)
- Tails: Move one step backward (-1)
- Start: Position 0
After 10 flips, you might be at position +2, -4, or anywhere else. This is a random walk!
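For intuition, the same coin-flip walk can be sketched in base R without the package. This is only an illustration of the idea, not how RandomWalker builds its walks; the names `flips` and `position` and the seed are choices made for this sketch.

```r
# Base R sketch of the coin-flip walk (illustrative; not RandomWalker internals)
set.seed(42)                                          # arbitrary seed for reproducibility
flips <- sample(c(-1, 1), size = 10, replace = TRUE)  # tails = -1, heads = +1
position <- cumsum(flips)                             # position after each flip, starting from 0
position
```

Because every step is ±1, the position after 10 flips is always an even number between -10 and +10.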
# Coin flip random walk
coin_walk <- discrete_walk(
.num_walks = 1,
.n = 100,
.upper_bound = 1,
.lower_bound = -1,
.upper_probability = 0.5
)
coin_walk |> visualize_walks(.pluck = "cum_sum")
Each step is ±1 with equal probability:
discrete_walk(
.num_walks = 10,
.upper_bound = 1,
.lower_bound = -1,
.upper_probability = 0.5
) |> head(10)
#> # A tibble: 10 × 8
#> walk_number step_number y cum_sum_y cum_prod_y cum_min_y cum_max_y
#> <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 1 101 200 101 101
#> 2 1 2 1 102 400 101 101
#> 3 1 3 1 103 800 101 101
#> 4 1 4 -1 102 0 99 101
#> 5 1 5 1 103 0 99 101
#> 6 1 6 -1 102 0 99 101
#> 7 1 7 -1 101 0 99 101
#> 8 1 8 -1 100 0 99 101
#> 9 1 9 -1 99 0 99 101
#> 10 1 10 -1 98 0 99 101
#> # ℹ 1 more variable: cum_mean_y <dbl>
Properties:
- Symmetric (unbiased)
- Steps are independent
- Mean position = 0
- Variance grows linearly with time
Steps have a non-zero mean (bias in one direction):
random_normal_drift_walk(
.num_walks = 10,
.drift = 0.1 # Positive drift
) |> head(10)
#> # A tibble: 10 × 8
#> walk_number step_number y cum_sum_y cum_prod_y cum_min_y cum_max_y
#> <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 1.62 1.62 0 1.62 1.62
#> 2 1 2 0.376 1.99 0 0.376 1.62
#> 3 1 3 0.105 2.10 0 0.105 1.62
#> 4 1 4 1.47 3.57 0 0.105 1.62
#> 5 1 5 1.04 4.61 0 0.105 1.62
#> 6 1 6 1.27 5.88 0 0.105 1.62
#> 7 1 7 1.97 7.85 0 0.105 1.97
#> 8 1 8 1.26 9.11 0 0.105 1.97
#> 9 1 9 0.311 9.42 0 0.105 1.97
#> 10 1 10 -0.330 9.09 0 -0.330 1.97
#> # ℹ 1 more variable: cum_mean_y <dbl>
Properties:
- Asymmetric (biased)
- Tends to move in one direction
- Mean position ≠ 0
- Can model trending data
Continuous-time random walk:
brownian_motion(
.num_walks = 10,
.delta_time = 1
) |> head(10)
#> # A tibble: 10 × 8
#> walk_number step_number y cum_sum_y cum_prod_y cum_min_y cum_max_y
#> <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 -0.791 -0.791 0 -0.791 -0.791
#> 2 1 2 0.236 -0.555 0 -0.791 0.236
#> 3 1 3 -0.422 -0.977 0 -0.791 0.236
#> 4 1 4 -0.875 -1.85 0 -0.875 0.236
#> 5 1 5 -2.11 -3.96 0 -2.11 0.236
#> 6 1 6 -0.784 -4.75 0 -2.11 0.236
#> 7 1 7 -1.82 -6.57 0 -2.11 0.236
#> 8 1 8 -0.333 -6.90 0 -2.11 0.236
#> 9 1 9 -0.282 -7.18 0 -2.11 0.236
#> 10 1 10 -1.86 -9.04 0 -2.11 0.236
#> # ℹ 1 more variable: cum_mean_y <dbl>
Properties:
- Continuous in time
- Normally distributed increments
- Foundation of stochastic calculus
- Used in physics and finance
Multiplicative random walk (always positive):
geometric_brownian_motion(
.num_walks = 10,
.initial_value = 100
) |> head(10)
#> # A tibble: 10 × 8
#> walk_number step_number y cum_sum_y cum_prod_y cum_min_y cum_max_y
#> <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 0.998 101. 200. 101. 101.
#> 2 1 2 0.992 102. 398. 101. 101.
#> 3 1 3 0.976 103. 786. 101. 101.
#> 4 1 4 0.982 104. 1558. 101. 101.
#> 5 1 5 0.992 105. 3105. 101. 101.
#> 6 1 6 0.995 106. 6194. 101. 101.
#> 7 1 7 0.999 107. 12381. 101. 101.
#> 8 1 8 0.996 108. 24717. 101. 101.
#> 9 1 9 0.997 109. 49358. 101. 101.
#> 10 1 10 1.00 110. 98721. 101. 101.
#> # ℹ 1 more variable: cum_mean_y <dbl>
Properties:
- Cannot go negative
- Used for stock prices
- Log-normal distribution
- Percentage changes are normal
For a symmetric random walk starting at 0:
Expected value after n steps = 0
For a standard random walk (independent steps with variance 1):
Variance after n steps = n
Expected distance grows as √n:
E[|position|] ∝ √n
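The square-root growth follows from the independence of the steps. A short derivation, assuming independent steps Δᵢ with mean 0 and variance 1 and a start at X(0) = 0; the last line additionally assumes normally distributed steps (for ±1 steps it holds only approximately, for large n):

```latex
\begin{aligned}
\operatorname{Var}[X(n)] &= \operatorname{Var}\Big[\sum_{i=1}^{n} \Delta_i\Big]
  = \sum_{i=1}^{n} \operatorname{Var}[\Delta_i] = n, \\
\sqrt{E\big[X(n)^2\big]} &= \sqrt{n}, \\
E\big[|X(n)|\big] &= \sqrt{\tfrac{2n}{\pi}} \quad \text{when } X(n) \sim N(0, n).
\end{aligned}
```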
# Verify with 2D walk
walks_2d <- random_normal_walk(.num_walks = 100, .n = 500, .dimensions = 2)
walks_2d |>
euclidean_distance(.x = x, .y = y) |>
group_by(step_number) |>
reframe(
mean_distance = mean(distance),
theoretical = sqrt(step_number)
) |>
filter(step_number %% 50 == 0) |>
head(10)
#> # A tibble: 10 × 3
#> step_number mean_distance theoretical
#> <int> <dbl> <dbl>
#> 1 50 0.186 7.07
#> 2 50 0.186 7.07
#> 3 50 0.186 7.07
#> 4 50 0.186 7.07
#> 5 50 0.186 7.07
#> 6 50 0.186 7.07
#> 7 50 0.186 7.07
#> 8 50 0.186 7.07
#> 9 50 0.186 7.07
#> 10 50 0.186 7.07
For 1D symmetric walk:
- Probability of eventual return = 1 (certain to return)
- Expected return time = ∞ (infinite expected time!)
For 2D symmetric walk:
- Probability of eventual return = 1
For 3D symmetric walk:
- Probability of eventual return ≈ 0.34 (not certain!)
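A finite simulation can only hint at recurrence, since "eventual" return allows unbounded time. As a rough base R sketch (not using RandomWalker; the 1,000-step horizon, the number of walks, and the variable names are choices made here), the fraction of 1D coin-flip walks that revisit 0 at least once is already close to 1:

```r
# Fraction of 1D ±1 walks that return to the origin within a finite horizon
set.seed(7)
n_walks <- 2000
n <- 1000
steps <- matrix(sample(c(-1, 1), n_walks * n, replace = TRUE), nrow = n_walks)
paths <- t(apply(steps, 1, cumsum))        # one walk per row, positions after each step
returned <- apply(paths == 0, 1, any)      # TRUE if the walk ever revisits 0
mean(returned)                             # share of walks that returned at least once
```

Lengthening the horizon pushes this share closer to 1, but slowly, which is consistent with the infinite expected return time.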
Random walks exhibit scaling invariance. If you zoom out by a factor k:
- Time scales by k²
- Position scales by k
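One way to see this scaling numerically is to compare the spread of positions after n steps with the spread after k² · n steps. The following is a base R sketch under the assumption of standard normal steps; `k`, `n`, and the other names are illustrative.

```r
# Scaling check: running k^2 times longer spreads positions by a factor of about k
set.seed(1)
n_walks <- 2000
n <- 100
k <- 2
steps <- matrix(rnorm(n_walks * n * k^2), nrow = n_walks)  # one walk per row
pos_n   <- rowSums(steps[, 1:n])   # position after n steps
pos_k2n <- rowSums(steps)          # position after k^2 * n steps
ratio <- sd(pos_k2n) / sd(pos_n)
ratio                              # should be close to k
```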
Position after n steps:
X(n) = X(0) + Σ(i=1 to n) Δᵢ
Where Δᵢ are independent random steps.
For a standard normal walk starting at X(0) = 0:
- Δᵢ ~ N(0, 1)
- X(n) ~ N(0, n)
- E[X(n)] = 0
- Var[X(n)] = n
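These moments can be checked empirically in base R, independent of the package. A minimal sketch, assuming N(0, 1) steps and X(0) = 0; the number of walks and the seed are arbitrary.

```r
# Base R check of E[X(n)] = 0 and Var[X(n)] = n (illustrative sketch)
set.seed(2025)
n_walks <- 5000
n <- 100
steps <- matrix(rnorm(n_walks * n), nrow = n_walks)  # one walk per row, steps ~ N(0, 1)
final_position <- rowSums(steps)                     # X(n) with X(0) = 0
c(mean = mean(final_position), variance = var(final_position))
```

The sample mean should land near 0 and the sample variance near n = 100, with some sampling noise.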
Continuous-time stochastic process:
dX(t) = μ dt + σ dW(t)
Where:
- μ = drift coefficient
- σ = volatility coefficient
- W(t) = standard Wiener process
Properties:
- W(0) = 0
- W(t) ~ N(0, t)
- W(t) - W(s) ~ N(0, t-s) for t > s
- Independent increments
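On a discrete time grid, the increment property W(t) - W(s) ~ N(0, t-s) gives a direct way to simulate this process. The sketch below is plain base R, not the package's `brownian_motion()` implementation; the values of `mu`, `sigma`, and `dt` are hypothetical.

```r
# Simulate dX = mu dt + sigma dW on a grid with spacing dt (illustrative values)
set.seed(123)
mu <- 0.1        # drift coefficient (hypothetical)
sigma <- 0.5     # volatility coefficient (hypothetical)
dt <- 0.01
n_steps <- 1000
dW <- rnorm(n_steps, mean = 0, sd = sqrt(dt))  # Wiener increments ~ N(0, dt)
dX <- mu * dt + sigma * dW                     # increment of X over each interval
X <- c(0, cumsum(dX))                          # path starting at X(0) = 0
time_grid <- seq(0, by = dt, length.out = n_steps + 1)
```

Since μ and σ are constant, each increment has exactly the N(μ dt, σ² dt) distribution, so no discretization error is introduced at the grid points.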
For stock prices:
dS(t) = μ S(t) dt + σ S(t) dW(t)
Solution:
S(t) = S(0) exp((μ - σ²/2)t + σW(t))
Properties:
- Always positive
- Log-normal distribution
- Used in Black-Scholes model
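The closed-form solution can be evaluated directly on a simulated Wiener path. The following base R sketch is not the package's `geometric_brownian_motion()` code; `s0`, `mu`, `sigma`, and the daily grid of 252 steps are assumptions chosen for illustration.

```r
# Evaluate S(t) = S(0) * exp((mu - sigma^2 / 2) * t + sigma * W(t)) on a grid
set.seed(123)
s0 <- 100          # initial value S(0) (hypothetical)
mu <- 0.05         # drift (hypothetical)
sigma <- 0.2       # volatility (hypothetical)
dt <- 1 / 252      # one trading day as a fraction of a year (assumption)
n_steps <- 252
W <- cumsum(rnorm(n_steps, mean = 0, sd = sqrt(dt)))  # Wiener path at grid times
time_grid <- dt * seq_len(n_steps)
S <- s0 * exp((mu - sigma^2 / 2) * time_grid + sigma * W)
all(S > 0)         # the exponential keeps every value positive
```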
# What rw30() does internally:
# 1. Generate random steps
steps <- rnorm(100, mean = 0, sd = 1)
# 2. Compute cumulative sum
positions <- cumsum(c(0, steps[-100]))
# 3. Add to tibble
walk_data <- tibble::tibble(
step_number = 1:100,
y = steps,
cum_sum = positions
)
# 4. Add more cumulative functions
walk_data <- walk_data |>
mutate(
cum_prod = cumprod(1 + y),
cum_min = cummin(y),
cum_max = cummax(y),
cum_mean = cumsum(y) / step_number
)
walk_data |> head(10)
#> # A tibble: 10 × 7
#> step_number y cum_sum cum_prod cum_min cum_max cum_mean
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 -0.230 0 0.770 -0.230 -0.230 -0.230
#> 2 2 -1.21 -0.230 -0.160 -1.21 -0.230 -0.719
#> 3 3 -1.28 -1.44 0.0442 -1.28 -0.230 -0.905
#> 4 4 1.50 -2.71 0.111 -1.28 1.50 -0.303
#> 5 5 0.177 -1.21 0.130 -1.28 1.50 -0.207
#> 6 6 1.44 -1.03 0.318 -1.28 1.50 0.0681
#> 7 7 -1.05 0.409 -0.0174 -1.28 1.50 -0.0923
#> 8 8 1.14 -0.646 -0.0372 -1.28 1.50 0.0622
#> 9 9 -2.02 0.498 0.0381 -2.02 1.50 -0.170
#> 10 10 0.136 -1.53 0.0433 -2.02 1.50 -0.139
1D Walk:
- Single value per step: y
- Position: cum_sum
2D Walk:
- Two values per step: x, y
- Position: (cum_sum_x, cum_sum_y)
- Distance: sqrt(cum_sum_x² + cum_sum_y²)
3D Walk:
- Three values per step: x, y, z
- Position: (cum_sum_x, cum_sum_y, cum_sum_z)
- Distance: sqrt(cum_sum_x² + cum_sum_y² + cum_sum_z²)
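The 2D case can be sketched in base R as follows. The column names mirror the ones listed above, but this is an illustration with standard normal steps rather than the package's own code.

```r
# 2D walk: two step values per time point, position = pair of cumulative sums
set.seed(2024)
n <- 100
x <- rnorm(n)                                 # step in the x direction
y <- rnorm(n)                                 # step in the y direction
cum_sum_x <- cumsum(x)
cum_sum_y <- cumsum(y)
distance <- sqrt(cum_sum_x^2 + cum_sum_y^2)   # distance from the origin at each step
head(distance)
```

The 3D case adds a `z` step and a `cum_sum_z^2` term under the square root.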
| Term | Definition | Example |
|---|---|---|
| Walk | A single realization of the random process | One stock price path |
| Step | One random increment | Daily price change |
| Trajectory | Path taken by the walk | Price history |
| Cumulative sum | Running total of steps | Stock price level |
| Displacement | Distance from starting point | Profit/loss |
| Excursion | Distance from reference point | Drawdown |
| First passage time | Time to first reach a level | Time to profit |
| Return time | Time to return to starting point | Recovery time |
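Two of these terms, first passage time and return time, are easy to compute on a simulated path. A base R sketch, assuming a ±1 walk starting at 0; the target `level` of 5 is a hypothetical choice.

```r
# First passage time and return time for a single ±1 walk starting at 0
set.seed(99)
steps <- sample(c(-1, 1), size = 1000, replace = TRUE)
position <- cumsum(steps)
level <- 5                                     # hypothetical target level
first_passage <- which(position >= level)[1]   # first step at or above level (NA if never)
return_time <- which(position == 0)[1]         # first step back at the start (NA if never)
c(first_passage = first_passage, return_time = return_time)
```

Either value is `NA` when the event does not occur within the simulated horizon.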
| Term | Definition |
|---|---|
| Mean | Average value |
| Variance | Spread of values |
| Standard deviation | √Variance |
| Skewness | Asymmetry measure |
| Kurtosis | Tail heaviness |
| Quantile | Percentile value |
| Confidence interval | Range that contains the true value with a stated probability |
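Most of these quantities are available in base R; skewness and kurtosis are not, so the sketch below computes them from moments. The sample `x` is a stand-in for a set of final walk positions, and the moment-based formulas are one common convention among several.

```r
# Summary statistics for a sample of positions (x is a stand-in sample)
set.seed(314)
x <- rnorm(10000)
m <- mean(x)
s <- sqrt(mean((x - m)^2))                # standard deviation with divisor n
skewness <- mean((x - m)^3) / s^3         # about 0 for a symmetric distribution
kurtosis <- mean((x - m)^4) / s^4         # about 3 for a normal distribution
quantile(x, probs = c(0.05, 0.5, 0.95))   # 5th, 50th, and 95th percentiles
# 95% confidence interval for the mean, based on the t distribution
ci_95 <- m + c(-1, 1) * qt(0.975, df = length(x) - 1) * sd(x) / sqrt(length(x))
```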
| Distribution | Use Case | Parameters |
|---|---|---|
| Normal | General purpose | μ (mean), σ (sd) |
| Uniform | Equal probabilities | min, max |
| Exponential | Waiting times | λ (rate) |
| Poisson | Event counts | λ (rate) |
| Cauchy | Heavy tails | location, scale |
| Binomial | Success counts | n (trials), p (prob) |
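Each distribution in the table has a base R generator, so steps from any of them can be turned into a walk with `cumsum()`. This sketch uses only the `stats` functions and illustrative parameter values; it does not call the package's own generators.

```r
# Steps drawn from each distribution in the table, accumulated into walks
set.seed(10)
n <- 100
step_draws <- list(
  normal      = rnorm(n, mean = 0, sd = 1),
  uniform     = runif(n, min = -1, max = 1),
  exponential = rexp(n, rate = 1),
  poisson     = rpois(n, lambda = 1),
  cauchy      = rcauchy(n, location = 0, scale = 1),
  binomial    = rbinom(n, size = 10, prob = 0.5)
)
base_walks <- lapply(step_draws, cumsum)   # one walk per distribution
sapply(base_walks, function(w) w[n])       # final position of each walk
```

Walks built from exponential, Poisson, or binomial steps never decrease, because those steps are non-negative.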
# Generate many walks
walks <- random_normal_walk(.num_walks = 1000, .n = 100)
# Property 1: Mean = 0
walks |>
summarize(overall_mean = mean(cum_sum_y))
#> # A tibble: 1 × 1
#> overall_mean
#> <dbl>
#> 1 -0.00659
# Property 2: Variance = n
walks |>
filter(step_number == 80) |>
summarize(
variance = var(cum_sum_y),
theoretical = 80
)
#> # A tibble: 1 × 2
#> variance theoretical
#> <dbl> <dbl>
#> 1 1.44 80
# Property 3: Distance ∝ √n
walks |>
group_by(step_number) |>
reframe(
mean_abs_position = mean(abs(cum_sum_y)),
theoretical = sqrt(2/pi) * sqrt(step_number) # Exact for normal
) |>
filter(step_number %% 20 == 0) |>
head(5)
#> # A tibble: 5 × 3
#> step_number mean_abs_position theoretical
#> <int> <dbl> <dbl>
#> 1 20 0.373 3.57
#> 2 20 0.373 3.57
#> 3 20 0.373 3.57
#> 4 20 0.373 3.57
#> 5 20 0.373 3.57
# Generate walks
walks <- random_normal_walk(.num_walks = 10000, .n = 100)
# Get final positions
final_pos <- walks |>
group_by(walk_number) |>
slice_max(step_number) |>
pull(cum_sum_y)
# Plot
tibble::tibble(position = final_pos) |>
ggplot(aes(x = position)) +
geom_histogram(aes(y = after_stat(density)), bins = 50,
fill = "steelblue", alpha = 0.7) +
stat_function(fun = dnorm, args = list(mean = 0, sd = 1),
color = "red", linewidth = 1) +
theme_minimal() +
labs(
title = "Distribution of Final Positions (n=100)",
subtitle = "Theoretical N(0, 1) in red",
x = "Final Position",
y = "Density"
)
Random walks are path-dependent; the endpoint doesn't tell you the route:
# Generate walks ending at similar positions
set.seed(123)
walks <- random_normal_walk(.num_walks = 100, .n = 100)
# Find walks ending near 1
similar_end <- walks |>
group_by(walk_number) |>
filter(step_number == 80, abs(cum_sum_y - 1) < 0.5)
# Plot their paths - very different!
walks |>
filter(walk_number %in% similar_end$walk_number) |>
visualize_walks(.pluck = "cum_sum", .alpha = 0.5)
Now that you understand the basics, you're ready to generate walks. Head to the Getting Started vignette!