---
title: "Basic Concepts"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Basic Concepts}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5
)
```

```{r setup, echo=FALSE, message=FALSE}
library(RandomWalker)
library(dplyr)
library(ggplot2)
```

This vignette introduces the fundamentals of random walks and shows how RandomWalker implements them.

## Table of Contents

- [What is a Random Walk?](#what-is-a-random-walk)
- [Types of Random Walks](#types-of-random-walks)
- [Key Properties](#key-properties)
- [Mathematical Background](#mathematical-background)
- [RandomWalker Implementation](#randomwalker-implementation)
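- [Common Terminology](#common-terminology)
- [Worked Examples](#worked-examples)
- [Next Steps](#next-steps)
- [Further Reading](#further-reading)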

## What is a Random Walk?

A **random walk** is a mathematical model describing a path consisting of a succession of random steps. At each point in time, the next step is determined by chance.

### Simple Example

Imagine flipping a coin:

- **Heads**: Move one step forward (+1)
- **Tails**: Move one step backward (-1)
- **Start**: Position 0

After 10 flips, you might be at position +2, -4, or any other even number between -10 and +10. The sequence of positions you visit is a random walk.

```{r coin_flip_example, fig.width=7, fig.height=4}
#| fig.alt: >
#|   Line plot showing a single random walk simulating coin flips over 100 steps.
#|   The walk moves up or down by 1 with equal probability at each step, starting
#|   from position 0. The x-axis shows the step number and the y-axis shows the
#|   cumulative sum position.
# Coin flip random walk
coin_walk <- discrete_walk(
  .num_walks = 1,
  .n = 100,
  .upper_bound = 1,
  .lower_bound = -1,
  .upper_probability = 0.5
)

coin_walk |> visualize_walks(.pluck = "cum_sum")
```

### Real-World Analogies

- **Stock prices**: Daily price changes are like random steps
- **Particle motion**: Molecules moving due to thermal energy
- **Drunk person walking**: Each step in a random direction
- **Photon path**: Light scattering through a medium

## Types of Random Walks

### 1. Simple Random Walk

Each step is ±1 with equal probability:

```{r simple_walk}
discrete_walk(
  .num_walks = 10,
  .upper_bound = 1,
  .lower_bound = -1,
  .upper_probability = 0.5
) |> head(10)
```

**Properties:**

- Symmetric (unbiased)
- Steps are independent
- Mean position = 0
- Variance grows linearly with time
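
These properties can also be checked without any package functions. The following is a minimal base-R sketch, assuming ±1 steps drawn with `sample()`; the names `n_walks`, `n_steps`, and `final_pos` are illustrative and are not part of the RandomWalker API.

```{r simple_walk_base_sketch}
set.seed(42)
n_walks <- 2000   # number of independent walks (illustrative value)
n_steps <- 100    # steps per walk (illustrative value)

# Each column holds the +1/-1 steps of one walk
step_matrix <- matrix(
  sample(c(-1, 1), n_walks * n_steps, replace = TRUE),
  nrow = n_steps
)

# Accumulate the steps column by column to get positions over time
position_matrix <- apply(step_matrix, 2, cumsum)

# Final positions should average near 0 with variance near n_steps
final_pos <- position_matrix[n_steps, ]
c(mean = mean(final_pos), variance = var(final_pos))
```

Because every walk takes an even number of ±1 steps, all final positions in this sketch are even.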

### 2. Random Walk with Drift

Steps have a non-zero mean (bias in one direction):

```{r drift_walk}
random_normal_drift_walk(
  .num_walks = 10,
  .drift = 0.1  # Positive drift
) |> head(10)
```

**Properties:**

- Asymmetric (biased)
- Tends to move in one direction
- Mean position moves away from 0 in proportion to the number of steps
- Can model trending data
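
The effect of drift on the mean position can be sketched in base R. This example assumes each step is drawn from N(drift, 1), which may differ from how `random_normal_drift_walk()` parameterises its steps; all variable names are illustrative.

```{r drift_base_sketch}
set.seed(42)
drift_value <- 0.1     # mean of each step (assumed parameterisation)
drift_n_steps <- 100
drift_n_walks <- 2000

# Each column is one walk with N(drift_value, 1) steps
drift_steps <- matrix(
  rnorm(drift_n_walks * drift_n_steps, mean = drift_value, sd = 1),
  nrow = drift_n_steps
)
drift_positions <- apply(drift_steps, 2, cumsum)

# Average position across walks at each step
mean_path <- rowMeans(drift_positions)

# The average position should track drift_value * step
check_steps <- c(25, 50, 100)
data.frame(
  step = check_steps,
  empirical = mean_path[check_steps],
  theoretical = drift_value * check_steps
)
```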

### 3. Brownian Motion (Wiener Process)

Continuous-time random walk:

```{r brownian_motion}
brownian_motion(
  .num_walks = 10,
  .delta_time = 1
) |> head(10)
```

**Properties:**

- Continuous in time
- Normally distributed increments
- Foundation of stochastic calculus
- Used in physics and finance

### 4. Geometric Brownian Motion

Multiplicative random walk (always positive):

```{r geometric_brownian}
geometric_brownian_motion(
  .num_walks = 10,
  .initial_value = 100
) |> head(10)
```

**Properties:**

- Cannot go negative
- Used for stock prices
- Values are log-normally distributed
- Log returns are normally distributed

## Key Properties

### Property 1: Mean Displacement

For a symmetric random walk starting at 0:

**Expected value after n steps = 0**

```{r mean_displacement}
# Verify empirically
walks <- random_normal_walk(.num_walks = 1000, .n = 100)

walks |>
  summarize(overall_mean = mean(cum_sum_y))
```

### Property 2: Variance Growth

For a standard random walk, whose steps are independent with mean 0 and variance 1:

**Variance after n steps = n**

```{r variance_growth}
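# Verify empirically with unit-variance steps; resampling of steps is
# switched off so that all 100 steps are independent draws
walks <- random_normal_walk(.num_walks = 1000, .n = 100, .sd = 1, .samp = FALSE)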

walks |>
  filter(step_number == 80) |>
  summarize(
    variance = var(cum_sum_y),
    theoretical = 80
  )
```

### Property 3: Distance from Origin

Expected distance grows as √n:

**E[|position|] ∝ √n**

```{r distance_origin}
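# Verify with a 2D walk (unit-variance steps, no resampling)
walks_2d <- random_normal_walk(
  .num_walks = 100, .n = 500, .sd = 1, .samp = FALSE, .dimensions = 2
)

walks_2d |>
  # Distance from the origin uses the cumulative positions, not the raw steps
  mutate(distance = sqrt(cum_sum_x^2 + cum_sum_y^2)) |>
  group_by(step_number) |>
  summarize(mean_distance = mean(distance)) |>
  # For 2D Gaussian steps the expected distance is sqrt(pi / 2) * sqrt(n)
  mutate(theoretical = sqrt(pi / 2) * sqrt(step_number)) |>
  filter(step_number %% 50 == 0) |>
  head(10)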
```

### Property 4: First Return to Origin

For a simple symmetric walk on the 1D integer lattice:

- **Probability of eventual return = 1** (certain to return)
- **Expected return time = ∞** (the walk returns with certainty, but the average waiting time is infinite)

For a simple symmetric walk on the 2D lattice:

- **Probability of eventual return = 1**

For a simple symmetric walk on the 3D lattice:

- **Probability of eventual return ≈ 0.34** (not certain!)
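
A simulation can only watch a walk for a finite number of steps, so it gives a lower bound on these return probabilities rather than the limiting values. The sketch below uses base R only and assumes a lattice walk that moves ±1 along one randomly chosen axis per step; `estimate_return_prob()` is a hypothetical helper written for this vignette, not a RandomWalker function.

```{r return_probability_sketch}
set.seed(42)

# Estimate the chance that a simple lattice walk revisits the origin
# within n_steps steps
estimate_return_prob <- function(n_dims, n_steps = 1000, n_walks = 500) {
  returned <- logical(n_walks)
  for (w in seq_len(n_walks)) {
    # Each step moves +1 or -1 along one randomly chosen axis
    axis <- sample(n_dims, n_steps, replace = TRUE)
    direction <- sample(c(-1, 1), n_steps, replace = TRUE)
    path <- matrix(0, nrow = n_steps, ncol = n_dims)
    path[cbind(seq_len(n_steps), axis)] <- direction
    # Cumulative sum per axis gives the position after each step
    path <- apply(path, 2, cumsum)
    # The walk is back at the origin when every coordinate is 0
    returned[w] <- any(rowSums(abs(path)) == 0)
  }
  mean(returned)
}

return_probs <- sapply(c(`1D` = 1, `2D` = 2, `3D` = 3), estimate_return_prob)
return_probs
```

The 2D estimate approaches 1 only very slowly as the horizon grows, so it stays visibly below the 1D estimate at 1,000 steps.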

### Property 5: Scaling

Random walks exhibit **scale invariance**: if you zoom out by a factor k in position, you must zoom out by k² in time for the walk to look statistically the same.

- Position scales by k
- Time scales by k²
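
One way to see this scaling numerically is to compare the spread of final positions after n steps and after k²n steps. The following base-R sketch assumes standard normal steps; `final_position()` is a hypothetical helper, and the constants are illustrative.

```{r scaling_sketch}
set.seed(42)
scale_k <- 2            # zoom factor k
scale_n_base <- 100     # baseline number of steps
scale_n_walks <- 5000   # walks per experiment

# Final position of each walk: the sum of n independent N(0, 1) steps
final_position <- function(n) {
  colSums(matrix(rnorm(n * scale_n_walks), nrow = n))
}

sd_base <- sd(final_position(scale_n_base))
sd_zoomed <- sd(final_position(scale_k^2 * scale_n_base))

# Multiplying time by k^2 should multiply the typical position by about k
scale_ratio <- sd_zoomed / sd_base
c(sd_base = sd_base, sd_zoomed = sd_zoomed, ratio = scale_ratio)
```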

## Mathematical Background

### One-Dimensional Random Walk

**Position after n steps:**

```
X(n) = X(0) + Σ(i=1 to n) Δᵢ
```

Where Δᵢ are independent random steps.

**For a standard normal walk starting at X(0) = 0:**

- Δᵢ ~ N(0, 1)
- X(n) ~ N(0, n)
- E[X(n)] = 0
- Var[X(n)] = n
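
The mean and variance follow from linearity of expectation and the independence of the steps. A short derivation, assuming X(0) = 0 and unit-variance steps as stated above:

$$
\begin{aligned}
\mathbb{E}[X(n)] &= \sum_{i=1}^{n} \mathbb{E}[\Delta_i] = 0, \\
\operatorname{Var}[X(n)] &= \sum_{i=1}^{n} \operatorname{Var}[\Delta_i] = n .
\end{aligned}
$$

The variance line relies on independence, which makes the covariance between different steps vanish. For steps with variance σ², the same argument gives Var[X(n)] = nσ².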

### Brownian Motion

**Continuous-time stochastic process (Brownian motion with drift):**

```
dX(t) = μ dt + σ dW(t)
```

Where:

- μ = drift coefficient
- σ = volatility coefficient
- W(t) = standard Wiener process

**Properties of W(t):**

- W(0) = 0
- W(t) ~ N(0, t)
- W(t) - W(s) ~ N(0, t-s) for t > s
- Independent increments
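
These properties suggest a simple way to approximate a Wiener process on a time grid: add up independent N(0, dt) increments. The sketch below uses base R only and is not how `brownian_motion()` is necessarily implemented; the grid size and variable names are assumptions made for illustration.

```{r wiener_sketch}
set.seed(42)
bm_dt <- 0.01        # time step (illustrative)
bm_n_incr <- 100     # number of increments, so paths run from t = 0 to t = 1
bm_n_paths <- 5000

# Independent increments, each distributed N(0, dt)
bm_increments <- matrix(
  rnorm(bm_n_incr * bm_n_paths, mean = 0, sd = sqrt(bm_dt)),
  nrow = bm_n_incr
)

# Prepend a row of zeros so that W(0) = 0; row k + 1 holds W(k * dt)
bm_paths <- rbind(0, apply(bm_increments, 2, cumsum))

w_half <- bm_paths[bm_n_incr %/% 2 + 1, ]   # W(0.5)
w_end <- bm_paths[bm_n_incr + 1, ]          # W(1)

# Expect Var[W(1)] near 1 and Var[W(1) - W(0.5)] near 0.5
c(var_w_end = var(w_end), var_increment = var(w_end - w_half))
```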

### Geometric Brownian Motion

**Stochastic differential equation (commonly used for stock prices):**

```
dS(t) = μ S(t) dt + σ S(t) dW(t)
```

**Solution:**

```
S(t) = S(0) exp((μ - σ²/2)t + σW(t))
```

**Properties:**

- Always positive
- Log-normal distribution
- Used in Black-Scholes model
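
The closed-form solution can be sampled directly, because W(t) ~ N(0, t). The following base-R sketch draws terminal prices at a single time point; the parameter values are illustrative assumptions and the code does not call `geometric_brownian_motion()`.

```{r gbm_solution_sketch}
set.seed(42)
gbm_mu <- 0.05       # drift (illustrative)
gbm_sigma <- 0.2     # volatility (illustrative)
gbm_s0 <- 100        # initial value S(0)
gbm_t <- 1           # time horizon
gbm_n_paths <- 10000

# Sample W(t) ~ N(0, t), then apply the solution formula
gbm_w <- rnorm(gbm_n_paths, mean = 0, sd = sqrt(gbm_t))
gbm_prices <- gbm_s0 * exp((gbm_mu - gbm_sigma^2 / 2) * gbm_t + gbm_sigma * gbm_w)

# Prices stay positive, and their mean should be close to S(0) * exp(mu * t)
c(
  min_price = min(gbm_prices),
  mean_price = mean(gbm_prices),
  theoretical_mean = gbm_s0 * exp(gbm_mu * gbm_t)
)
```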

## RandomWalker Implementation

### How RandomWalker Works

1. **Generate random steps** from specified distribution
2. **Compute cumulative sum** (position over time)
3. **Add cumulative statistics** (min, max, mean, product)
4. **Return tidy tibble** for analysis

### Example: Behind the Scenes

```{r behind_scenes}
# A simplified sketch of what a generator such as rw30() does internally:

# 1. Generate random steps
n_steps <- 100
steps <- rnorm(n_steps, mean = 0, sd = 1)

# 2. Compute cumulative sum (position after each step)
positions <- cumsum(steps)

# 3. Collect steps and positions in a tibble
walk_data <- tibble::tibble(
  step_number = seq_len(n_steps),
  y = steps,
  cum_sum = positions
)

# 4. Add the remaining cumulative statistics
walk_data <- walk_data |>
  mutate(
    cum_prod = cumprod(1 + y),
    cum_min = cummin(y),
    cum_max = cummax(y),
    cum_mean = cumsum(y) / step_number
  )

walk_data |> head(10)
```

### Dimensions

**1D Walk:**

- Single value per step: `y`
- Position: `cum_sum_y`

**2D Walk:**

- Two values per step: `x`, `y`
- Position: `(cum_sum_x, cum_sum_y)`
- Distance: `sqrt(cum_sum_x^2 + cum_sum_y^2)`

**3D Walk:**

- Three values per step: `x`, `y`, `z`
- Position: `(cum_sum_x, cum_sum_y, cum_sum_z)`
- Distance: `sqrt(cum_sum_x^2 + cum_sum_y^2 + cum_sum_z^2)`

## Common Terminology

### Terms Used in RandomWalker

| Term | Definition | Example |
|------|------------|---------|
| **Walk** | A single realization of the random process | One stock price path |
| **Step** | One random increment | Daily price change |
| **Trajectory** | Path taken by the walk | Price history |
| **Cumulative sum** | Running total of steps | Stock price level |
| **Displacement** | Signed change in position from the starting point | Profit/loss |
| **Excursion** | Distance from reference point | Drawdown |
| **First passage time** | Time to first reach a level | Time to profit |
| **Return time** | Time to return to starting point | Recovery time |
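
Several of these terms can be made concrete on a small hand-written walk. The sketch below uses a fixed vector of steps so the results are reproducible by eye; the variable names are illustrative and none of this relies on RandomWalker functions.

```{r terminology_sketch}
# A fixed sequence of +1/-1 steps (no randomness)
toy_steps <- c(1, 1, -1, -1, 1, 1, 1, -1, -1, -1)

# Cumulative sum gives the trajectory: 1 2 1 0 1 2 3 2 1 0
toy_positions <- cumsum(toy_steps)

# Displacement: final position relative to the start at 0
toy_displacement <- toy_positions[length(toy_positions)]

# First passage time: first step at which the walk reaches level 3
toy_first_passage <- which(toy_positions >= 3)[1]

# Return time: first step at which the walk is back at the origin
toy_return_time <- which(toy_positions == 0)[1]

c(
  displacement = toy_displacement,
  first_passage_time = toy_first_passage,
  return_time = toy_return_time
)
```

Here the walk first reaches level 3 at step 7, first returns to the origin at step 4, and ends with a displacement of 0.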

### Statistical Terms

| Term | Definition |
|------|------------|
| **Mean** | Average value |
| **Variance** | Average squared deviation from the mean |
| **Standard deviation** | Square root of the variance |
| **Skewness** | Asymmetry measure |
| **Kurtosis** | Tail heaviness |
| **Quantile** | Percentile value |
| **Confidence interval** | Range constructed to contain the true value with a stated probability (e.g. 95%) |

### Probability Distributions

| Distribution | Use Case | Parameters |
|--------------|----------|------------|
| **Normal** | General purpose | μ (mean), σ (sd) |
| **Uniform** | Equal probabilities | min, max |
| **Exponential** | Waiting times | λ (rate) |
| **Poisson** | Event counts | λ (rate) |
| **Cauchy** | Heavy tails | location, scale |
| **Binomial** | Success counts | n (trials), p (prob) |
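
Each of these distributions has a sampler in base R's `stats` package, and any vector of draws becomes a walk once it is passed through `cumsum()`. The sketch below uses those base samplers with illustrative parameter values; it is not a substitute for the package's own generators, whose arguments may differ.

```{r distribution_steps_sketch}
set.seed(42)
n_draws <- 1000   # steps per walk (illustrative)

# One vector of steps per distribution in the table above
step_draws <- list(
  normal      = rnorm(n_draws, mean = 0, sd = 1),
  uniform     = runif(n_draws, min = -1, max = 1),
  exponential = rexp(n_draws, rate = 1),
  poisson     = rpois(n_draws, lambda = 1),
  cauchy      = rcauchy(n_draws, location = 0, scale = 1),
  binomial    = rbinom(n_draws, size = 10, prob = 0.5)
)

# Turn each set of steps into a walk and report its final position
final_by_distribution <- sapply(step_draws, function(s) sum(s))
final_by_distribution
```

Steps from the exponential, Poisson, and binomial distributions are never negative, so walks built from them can only move upward unless the steps are centered first.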

## Worked Examples

### Example 1: Verify Properties

```{r verify_properties}
# Generate many walks with unit-variance steps and no resampling of steps
walks <- random_normal_walk(.num_walks = 1000, .n = 100, .sd = 1, .samp = FALSE)

# Property 1: Mean = 0
walks |>
  summarize(overall_mean = mean(cum_sum_y))

# Property 2: Variance = n
walks |>
  filter(step_number == 80) |>
  summarize(
    variance = var(cum_sum_y),
    theoretical = 80
  )

# Property 3: Distance ∝ √n
walks |>
  group_by(step_number) |>
  summarize(mean_abs_position = mean(abs(cum_sum_y))) |>
  # E|X(n)| = sqrt(2 / pi) * sqrt(n), exact for unit-variance normal steps
  mutate(theoretical = sqrt(2 / pi) * sqrt(step_number)) |>
  filter(step_number %% 20 == 0) |>
  head(5)
```

### Example 2: Distribution of Final Position

```{r final_position_dist, fig.width=7, fig.height=4}
#| fig.alt: >
#|   Histogram showing the distribution of final positions for 10,000 random walks
#|   after 100 steps each. The histogram uses blue bars showing the empirical density,
#|   overlaid with a red curve representing the theoretical normal distribution
#|   N(0, 1). The distribution is centered near 0 with spread approximately 1,
#|   demonstrating that final positions follow a normal distribution.
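# Generate walks: 100 independent steps with sd 0.1, so the final position
# has variance 100 * 0.1^2 = 1
walks <- random_normal_walk(.num_walks = 10000, .n = 100, .sd = 0.1, .samp = FALSE)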

# Get final positions
final_pos <- walks |>
  group_by(walk_number) |>
  slice_max(step_number) |>
  pull(cum_sum_y)

# Plot
tibble::tibble(position = final_pos) |>
  ggplot(aes(x = position)) +
  geom_histogram(aes(y = after_stat(density)), bins = 50,
                 fill = "steelblue", alpha = 0.7) +
  stat_function(fun = dnorm, args = list(mean = 0, sd = 1),
                color = "red", linewidth = 1) +
  theme_minimal() +
  labs(
    title = "Distribution of Final Positions (n=100)",
    subtitle = "Theoretical N(0, 1) in red",
    x = "Final Position",
    y = "Density"
  )
```

### Example 3: Path Dependency

Random walks are **path-dependent** in the sense that a walk's position at a given step does not tell you the route it took to get there:

```{r path_dependency, fig.width=7, fig.height=4}
#| fig.alt: >
#|   Line plot showing multiple random walk trajectories that pass through similar positions
#|   (near 1 at step 80) but take very different paths. Each semi-transparent line shows the
#|   complete 100-step trajectory of one walk, demonstrating path dependency - walks passing
#|   through the same point can have very different histories and futures.
# Generate 100 walks of 100 steps each
set.seed(123)
walks <- random_normal_walk(.num_walks = 100, .n = 100, .sd = 0.1, .samp = FALSE)

# Find walks that pass near position 1 at step 80
similar_end <- walks |>
  filter(step_number == 80, abs(cum_sum_y - 1) < 0.5)

# Plot their paths - very different!
walks |>
  filter(walk_number %in% similar_end$walk_number) |>
  visualize_walks(.pluck = "cum_sum", .alpha = 0.5)
```

## Next Steps

Now that you understand the basics:

- **Quick Start Guide** - Start using RandomWalker (see Getting Started vignette)
- **Continuous Distribution Generators** - Explore distributions (see API Reference)
- **Statistical Analysis Guide** - Analyze properties
- **Use Cases and Examples** - Real-world applications

## Further Reading

### Academic Resources

- **Books:**
  - "Random Walks and Electric Networks" by Doyle & Snell
  - "Brownian Motion" by Mörters & Peres
  - "Stochastic Processes" by Ross

- **Papers:**
  - Einstein's 1905 paper on Brownian motion
  - Pearson's 1905 paper introducing the term "random walk"

### Online Resources

- [Wikipedia: Random Walk](https://en.wikipedia.org/wiki/Random_walk)
- [Wikipedia: Brownian Motion](https://en.wikipedia.org/wiki/Brownian_motion)
- [MIT OpenCourseWare: Stochastic Processes](https://ocw.mit.edu/)

---

**Ready to generate walks?** Head to the **Getting Started** vignette!