Predicting Bitcoin Price through ML and Technical Indicators


Imagine for a moment that you’re a detective tasked with predicting Bitcoin’s price tomorrow — no small feat considering the digital currency behaves more like an unpredictable celebrity than a stable asset. Bitcoin, notorious for its dramatic ups and downs, makes predicting its next move feel akin to forecasting human emotions based purely on past interactions: intriguing yet uncertain.

Our trusty ally in this cryptic pursuit is a Python script, aptly named bitcoinML.py, which acts like our detective toolkit or perhaps a magical cookbook. It smartly blends history, mathematics, computer magic, and a dash of intuition to forecast Bitcoin’s next price.

import pandas as pd
import numpy as np
np.NaN = np.nan  # Must run before importing pandas_ta, which still imports np.NaN
import matplotlib.pyplot as plt
import requests
from datetime import datetime, timedelta
import pandas_ta as ta
import mplfinance as mpf
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
import warnings
warnings.filterwarnings('ignore')

class BitcoinTechnicalAnalysisML:
    def __init__(self):
        """Set up our tools for predicting Bitcoin prices"""
        self.data = None  # Where we'll store Bitcoin info
        self.api_url = "https://api.coingecko.com/api/v3/coins/bitcoin/market_chart"
        # Our crystal ball: a smart model that learns patterns, using the GPU for speed.
        # device='cuda' needs XGBoost 2.0+ and a CUDA GPU; switch to device='cpu' otherwise.
        self.model = xgb.XGBRegressor(n_estimators=100, tree_method='hist', device='cuda', random_state=42)
        self.scaler = StandardScaler()  # A tool to make numbers easier for the model

    def fetch_bitcoin_data(self, days=365, vs_currency='usd'):
        """Step 1: Grab Bitcoin's price history, like checking a year's worth of receipts"""
        try:
            params = {'vs_currency': vs_currency, 'days': days, 'interval': 'daily'}
            # A timeout keeps the script from hanging if the API never answers
            response = requests.get(self.api_url, params=params, timeout=30)
            response.raise_for_status()

            data = response.json()

            # Each entry is a [timestamp_ms, value] pair
            df = pd.DataFrame({
                'Timestamp': [x[0] for x in data['prices']],
                'Close': [x[1] for x in data['prices']],
                'Volume': [x[1] for x in data['total_volumes']]
            })

            df['Timestamp'] = pd.to_datetime(df['Timestamp'], unit='ms')
            df.set_index('Timestamp', inplace=True)

            # The endpoint has no OHLC data, so these three columns are approximations
            df['High'] = df['Close'] * 1.02  # Guess the day's high (a little above close)
            df['Low'] = df['Close'] * 0.98  # Guess the day's low (a little below close)
            df['Open'] = df['Close'].shift(1)  # Yesterday’s close is today’s open

            self.data = df.dropna()  # Drops the first day, which has no Open
            start = self.data.index[0].strftime('%Y-%m-%d')
            end = self.data.index[-1].strftime('%Y-%m-%d')
            print(f"We’ve grabbed {len(self.data)} days of Bitcoin prices, from {start} to {end}, like a year-long diary of Bitcoin’s ups and downs!")
            return self.data

        except requests.exceptions.RequestException as e:
            print(f"Oops! Couldn’t get the Bitcoin data because: {e}. Maybe the internet’s down?")
            return None

    def calculate_indicators(self):
        """Step 2: Add clues to guess where Bitcoin’s price is heading"""
        if self.data is None:
            print("Hold on! We need Bitcoin data first. Run fetch_bitcoin_data() to get it.")
            return None

        df = self.data.copy()
        print("Now, we’re adding some smart clues (Bitcoin’s mood, speed, and patterns) to help us predict its next move.")

        # Moving averages: Like smoothing out a bumpy road to see the trend
        df['SMA7'] = ta.sma(df['Close'], length=7)  # 7-day average
        df['SMA25'] = ta.sma(df['Close'], length=25)  # 25-day average
        df['SMA50'] = ta.sma(df['Close'], length=50)
        df['SMA99'] = ta.sma(df['Close'], length=99)
        df['SMA200'] = ta.sma(df['Close'], length=200)

        df['EMA12'] = ta.ema(df['Close'], length=12)  # Quick 12-day trend
        df['EMA26'] = ta.ema(df['Close'], length=26)  # Slower 26-day trend

        df['MA111'] = ta.sma(df['Close'], length=111)
        # Long-term average doubled; stays NaN until 350 days of history exist
        df['MA350x2'] = ta.sma(df['Close'], length=350) * 2

        macd = ta.macd(df['Close'], fast=12, slow=26, signal=9)  # Momentum checker
        df['MACD'] = macd['MACD_12_26_9']
        df['MACD_Signal'] = macd['MACDs_12_26_9']
        df['MACD_Hist'] = macd['MACDh_12_26_9']

        # Parabolic SAR (trend direction): pandas_ta returns separate long and short
        # columns, each NaN while the other trend is active, so merge them into one series
        psar = ta.psar(df['High'], df['Low'], df['Close'])
        df['SAR'] = psar['PSARl_0.02_0.2'].fillna(psar['PSARs_0.02_0.2'])

        df['RSI'] = ta.rsi(df['Close'], length=14)  # Is Bitcoin overexcited or sleepy?

        stoch = ta.stoch(df['High'], df['Low'], df['Close'], k=14, d=3, smooth_k=3)  # Speed gauge
        df['StochK'] = stoch['STOCHk_14_3_3']
        df['StochD'] = stoch['STOCHd_14_3_3']

        bbands = ta.bbands(df['Close'], length=20, std=2)  # Price range bands
        df['BB_Upper'] = bbands['BBU_20_2.0']
        df['BB_Middle'] = bbands['BBM_20_2.0']
        df['BB_Lower'] = bbands['BBL_20_2.0']

        df['CCI'] = ta.cci(df['High'], df['Low'], df['Close'], length=14)  # Overbought/oversold

        df['OBV'] = ta.obv(df['Close'], df['Volume'])  # Volume trend
        # ta.adosc is the Chaikin A/D oscillator; the column keeps the name 'CMF'.
        # (pandas_ta's ta.cmf computes Chaikin Money Flow proper.)
        df['CMF'] = ta.adosc(df['High'], df['Low'], df['Close'], df['Volume'], fast=3, slow=10)

        df['ForceIndex'] = df['Close'].diff(1) * df['Volume']  # Price push
        df['ForceIndex13'] = ta.ema(df['ForceIndex'], length=13)

        df['ATR'] = ta.atr(df['High'], df['Low'], df['Close'], length=14)  # Volatility

        # Fibonacci levels between the last 100 days' low and high (same value on every row)
        recent_high = df['High'].iloc[-100:].max()  # Last 100 days' peak
        recent_low = df['Low'].iloc[-100:].min()  # Last 100 days' dip
        df['Fib_0'] = recent_low
        df['Fib_23.6'] = recent_low + 0.236 * (recent_high - recent_low)
        df['Fib_38.2'] = recent_low + 0.382 * (recent_high - recent_low)
        df['Fib_50'] = recent_low + 0.5 * (recent_high - recent_low)
        df['Fib_61.8'] = recent_low + 0.618 * (recent_high - recent_low)
        df['Fib_100'] = recent_high

        self.data = df
        print(f"Done! The table now has {len(df.columns)} columns of clues, like Bitcoin’s mood swings and spending habits, to make our prediction smarter.")
        return df

    def prepare_ml_data(self):
        """Step 3: Get our clues ready for the prediction machine"""
        if self.data is None:
            print("Oops! We need the clues first. Run calculate_indicators() after fetching data.")
            return None

        df = self.data.copy()
        print("We’re setting up the puzzle: tomorrow’s price is what we want to guess, using today’s clues.")

        df['Target'] = df['Close'].shift(-1)  # Tomorrow’s price is our goal
        df = df.dropna()  # Skip days with missing pieces

        features = ['Open', 'High', 'Low', 'Close', 'Volume', 'SMA7', 'SMA25', 'SMA50', 'SMA99', 'SMA200',
                    'EMA12', 'EMA26', 'MA111', 'MA350x2', 'MACD', 'MACD_Signal', 'MACD_Hist', 'SAR',
                    'RSI', 'StochK', 'StochD', 'BB_Upper', 'BB_Middle', 'BB_Lower', 'CCI', 'OBV',
                    'CMF', 'ForceIndex', 'ForceIndex13', 'ATR']

        X = df[features]  # Our clue pile
        y = df['Target']  # The answer we’re after

        # Make clues easier to compare, like putting them all in the same language
        X_scaled = self.scaler.fit_transform(X)
        print(f"Puzzle ready! We have {len(X)} days to learn from, with {len(features)} clues each, like ingredients for a Bitcoin price recipe.")
        return X_scaled, y, X.index

    def train_model(self, test_size=0.2):
        """Step 4: Teach our crystal ball to predict Bitcoin prices"""
        prepared = self.prepare_ml_data()
        if prepared is None:  # prepare_ml_data() returns None when no data is loaded
            return None
        X_scaled, y, dates = prepared

        # shuffle=False keeps the rows in date order: train on the past, test on the most recent days
        X_train, X_test, y_train, y_test, dates_train, dates_test = train_test_split(
            X_scaled, y, dates, test_size=test_size, shuffle=False
        )
        print(f"We’re teaching our prediction machine with {len(X_train)} days of history and testing it on the last {len(X_test)} days, like practicing with old weather forecasts before predicting tomorrow’s rain.")

        self.model.fit(X_train, y_train)  # Let the machine learn the patterns

        y_pred = self.model.predict(X_test)  # Test its guesses
        mse = mean_squared_error(y_test, y_pred)
        rmse = np.sqrt(mse)
        print(f"Training done! Our machine’s guesses were typically off by about ${rmse:.2f} (that’s the RMSE). The MSE ({mse:.2f}) is the average squared error, so it is a much bigger number. Smaller is better!")

        return X_test, y_test, y_pred, dates_test

    def predict_next_day(self):
        """Step 5: Look into the future with our trained crystal ball"""
        if self.data is None:
            print("Wait! We need data and clues first. Run the earlier steps.")
            return None

        last_data = self.data.tail(1)  # The most recent day in the dataset
        features = ['Open', 'High', 'Low', 'Close', 'Volume', 'SMA7', 'SMA25', 'SMA50', 'SMA99', 'SMA200',
                    'EMA12', 'EMA26', 'MA111', 'MA350x2', 'MACD', 'MACD_Signal', 'MACD_Hist', 'SAR',
                    'RSI', 'StochK', 'StochD', 'BB_Upper', 'BB_Middle', 'BB_Lower', 'CCI', 'OBV',
                    'CMF', 'ForceIndex', 'ForceIndex13', 'ATR']

        X_last = last_data[features]
        X_last_scaled = self.scaler.transform(X_last)  # Reuse the scaling learned in Step 3
        prediction = self.model.predict(X_last_scaled)[0]

        last_date = self.data.index[-1]
        next_date = last_date + timedelta(days=1)
        print(f"Looking at the latest day ({last_date.strftime('%Y-%m-%d')}, price was ${last_data['Close'].values[0]:.2f}), our crystal ball says the next day ({next_date.strftime('%Y-%m-%d')}) will be ${prediction:.2f}. It’s using all those clues we gathered!")
        return prediction

    def plot_predictions(self, X_test, y_test, y_pred, dates_test):
        """Step 6: Draw a picture of our guesses vs. reality"""
        plt.figure(figsize=(14, 7))
        plt.plot(dates_test, y_test, label='Actual Price', color='blue')
        plt.plot(dates_test, y_pred, label='Predicted Price', color='red', linestyle='--')
        plt.title('Bitcoin Price Prediction (Our Smart Guess vs. What Really Happened)')
        plt.xlabel('Date')
        plt.ylabel('Price (USD)')
        plt.legend()
        plt.xticks(rotation=45)
        plt.tight_layout()
        plt.show()
        print("Here’s a picture! The blue line is what Bitcoin actually did in the test days. The red dashed line is what our machine guessed. Closer lines mean better guesses!")

if __name__ == "__main__":
    print("Let’s predict Bitcoin’s next price, step-by-step, like baking a cake with a magic recipe!")
    btc = BitcoinTechnicalAnalysisML()
    btc.fetch_bitcoin_data(days=365)
    btc.calculate_indicators()

    result = btc.train_model(test_size=0.2)
    if result is not None:
        X_test, y_test, y_pred, dates_test = result
        btc.plot_predictions(X_test, y_test, y_pred, dates_test)
        btc.predict_next_day()  # Only predict once the model has been trained

Step 0: Preparing the Detective’s Desk (Setup Phase)

Before solving any mystery, a detective gathers clues, tools, and resources. Our Python script starts similarly — collecting powerful libraries:

Pandas: Think of it as our organized ledger, meticulously noting Bitcoin’s daily prices.
NumPy: Our mathematical genius, making quick work of complex calculations.
Matplotlib: A talented artist drawing detailed sketches of our predictions.
Requests: Our internet informant, fetching historical Bitcoin data.
Datetime and Timedelta: The diligent timekeepers ensuring we track every important day precisely.
Pandas_ta: A specialized book containing secret formulas to decipher market sentiment.
Mplfinance: Artistic flair for candlestick-style financial charts. The script imports it but never actually calls it.
XGBoost: Our machine-learning crystal ball, analyzing past behavior to predict the future efficiently, powered by GPUs for added speed.
Scikit-learn (StandardScaler, train_test_split, mean_squared_error): Our quality control team, standardizing numbers, splitting data logically, and evaluating prediction accuracy.
Warnings: The mute button silencing irrelevant messages, ensuring a smooth investigation.

We also adjust one minor detail: aliasing np.NaN to np.nan before importing pandas_ta. NumPy 2.0 removed the np.NaN name that pandas_ta still imports, so without this line the import can fail before our investigation even begins.

Step 1: Retrieving Bitcoin’s Historical Story (Data Acquisition)

Every detective starts by examining history. Our script, through a function called fetch_bitcoin_data, politely requests a year's worth of Bitcoin’s price from the CoinGecko API. Imagine this as flipping through an extensive archive:

The request fetches daily Bitcoin prices, trading volumes, and timestamps for the past 365 days.
We neatly organize these into a structured DataFrame, clearly marking each day’s closing price, volume, and date.
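The endpoint returns no daily open, high, or low, so the script approximates them: the high as 2% above the close, the low as 2% below it, and the open as the previous day’s close, like passing the baton in a relay race. Because these values are derived from the close, range-based indicators such as ATR, CCI, and the Stochastic Oscillator carry no information beyond the closing prices themselves.
The first day has no previous close, and therefore no open, so it is dropped, leaving one row fewer than the API returned (about 364 days).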

Our script cheerfully announces the successful completion of this stage, stating precisely the collected date range. If the data fetch fails, perhaps due to network hiccups, it politely acknowledges the difficulty.
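
The parsing and gap-filling described above can be sketched without any network call. This is a minimal, dependency-free illustration rather than the script's pandas code: the payload values are made up (only the shape, lists of [millisecond timestamp, value] pairs, follows what the script reads), and build_rows and band are hypothetical names.

```python
from datetime import datetime, timezone

# Made-up sample shaped like the "prices" and "total_volumes" lists the script reads:
# each entry is [timestamp_in_milliseconds, value].
sample = {
    "prices": [[1704067200000, 42000.0], [1704153600000, 43000.0], [1704240000000, 42500.0]],
    "total_volumes": [[1704067200000, 1.0e10], [1704153600000, 1.2e10], [1704240000000, 0.9e10]],
}

def build_rows(payload, band=0.02):
    """Turn the payload into daily rows with approximated open/high/low."""
    rows = []
    prev_close = None
    for (ts, close), (_, volume) in zip(payload["prices"], payload["total_volumes"]):
        if prev_close is not None:  # the first day has no previous close, so it is skipped
            day = datetime.fromtimestamp(ts / 1000, tz=timezone.utc).date().isoformat()
            rows.append({
                "date": day,
                "open": prev_close,          # yesterday's close stands in for today's open
                "high": close * (1 + band),  # assumed 2% above the close
                "low": close * (1 - band),   # assumed 2% below the close
                "close": close,
                "volume": volume,
            })
        prev_close = close
    return rows

rows = build_rows(sample)
for row in rows:
    print(row)
```

Three input days produce two rows, which mirrors why the real dataset ends up one day shorter than the API response.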

Step 2: Enhancing Our Investigation (Indicator Calculation)

Having acquired Bitcoin’s detailed historical diary, we enhance it with insightful notes — technical indicators reflecting the cryptocurrency’s emotional and financial health:

Simple Moving Averages (SMAs): Smooth out Bitcoin’s volatility over various periods (7, 25, 50, 99, 200 days), clearly illustrating trends.
Exponential Moving Averages (EMAs): Provide sharper, more responsive indicators by emphasizing recent data.
MACD (Moving Average Convergence Divergence): Indicates momentum shifts, much like sensing Bitcoin’s acceleration or deceleration.
RSI (Relative Strength Index): Acts like a mood detector, signaling when Bitcoin feels overly exuberant (above 70) or exhausted (below 30).
Stochastic Oscillator: Shows where the latest close sits within the recent high-low range, hinting at whether Bitcoin is moving energetically or lethargically.
Bollinger Bands: Visualize volatility ranges, revealing if Bitcoin is poised to bounce or consolidate.
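On-Balance Volume (OBV) & Chaikin Money Flow (CMF): Reflect trading enthusiasm, analogous to monitoring crowd intensity at an event. Note that the script fills its CMF column with ta.adosc, the Chaikin accumulation/distribution oscillator, which is a related but different measure.
Fibonacci Retracements: Levels at 23.6%, 38.2%, 50%, and 61.8% of the range between the last 100 days’ low and high, often watched as points where Bitcoin may pause or reverse. The script computes them but does not pass them to the model.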
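
To make a few of these clues less mysterious, here is a rough pure-Python sketch of the moving averages and RSI. It is not the pandas_ta implementation: the function names are hypothetical, the EMA is seeded with the first value (libraries often seed with an SMA), and rsi_simple uses plain averages of gains and losses rather than Wilder's smoothing, so its numbers will differ somewhat from the script's.

```python
def sma(values, length):
    """Simple moving average; None until `length` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < length:
            out.append(None)
        else:
            window = values[i + 1 - length : i + 1]
            out.append(sum(window) / length)
    return out

def ema(values, length):
    """Exponential moving average: recent values get more weight."""
    alpha = 2 / (length + 1)
    out = [values[0]]  # seed with the first value (an assumption of this sketch)
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def rsi_simple(values, length=14):
    """RSI from the last `length` price changes, using plain sums of gains and losses."""
    changes = [b - a for a, b in zip(values[:-1], values[1:])][-length:]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        return 100.0  # no down moves in the window
    rs = gains / losses
    return 100 - 100 / (1 + rs)

closes = [10, 11, 12, 11, 13, 14, 13, 15]  # toy prices, not real Bitcoin data
print(sma(closes, 3))
print(ema(closes, 3))
print(round(rsi_simple(closes, 7), 2))
```

The leading None values in the SMA output are the same warm-up gap that shows up as NaN in the real indicator columns.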

By incorporating these 30+ carefully selected indicators using pandas_ta, our script enriches Bitcoin’s narrative, transforming numerical data into meaningful insights.
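
The two range-style clues, Fibonacci levels and Bollinger Bands, reduce to a few lines of arithmetic. The sketch below uses hypothetical helper names and toy numbers; the Bollinger version uses the population standard deviation over a single window, which is an assumption of this example and may not match every library's default.

```python
import statistics

def fib_levels(low, high):
    """Retracement levels between a recent low and high."""
    span = high - low
    return {ratio: low + ratio * span for ratio in (0.0, 0.236, 0.382, 0.5, 0.618, 1.0)}

def bollinger(values, length=20, num_std=2.0):
    """Lower, middle, and upper band for the most recent `length` values."""
    window = values[-length:]
    mid = sum(window) / len(window)
    sd = statistics.pstdev(window)  # population standard deviation (assumed here)
    return mid - num_std * sd, mid, mid + num_std * sd

levels = fib_levels(100.0, 200.0)  # toy range, not real prices
lower, middle, upper = bollinger([1, 2, 3, 4], length=4)
print(levels)
print(lower, middle, upper)
```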

Step 3: Structuring the Predictive Puzzle (Data Preparation)

Now, our script arranges this extensive data into a structured format suitable for predictive analysis:

We shift the closing prices back by one row (shift(-1)), so each day’s row carries the next day’s close as its target: today’s features predict tomorrow’s price.
After cleaning the dataset (removing incomplete entries), we select 30 significant features (e.g., Open, High, RSI, MACD).
To ensure consistent comparisons, we standardize these features using StandardScaler — much like translating different dialects into one universal language.

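On paper, this preparation leaves up to 363 days ready for the predictive model. In practice, dropna() discards every row where any indicator is still undefined, and the 350-day average behind MA350x2 is undefined for the first 349 days, so a 365-day fetch leaves only the last two weeks or so of complete rows. Fetching more history, or dropping the longest-window features, restores a usable sample.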
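
The two preparation moves, pairing each day with the next day's close and standardizing a feature column, look like this on a toy series. It is a plain-Python sketch with hypothetical names; the script itself uses pandas shift(-1) and scikit-learn's StandardScaler, which likewise divides by the population standard deviation.

```python
import statistics

closes = [100.0, 102.0, 101.0, 105.0]  # toy prices

# Pair each day's close with the next day's close.
# The last day has no "tomorrow" yet, so it drops out of the training rows.
features = closes[:-1]
targets = closes[1:]

def standardize(column):
    """Rescale a column to mean 0 and standard deviation 1."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(x - mean) / std for x in column], mean, std

scaled, mean, std = standardize(features)
print(list(zip(features, targets)))
print(scaled)
```

Keeping mean and std around matters later: a new day's features have to be rescaled with these same numbers, not with statistics recomputed from the new data.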

Step 4: Training the Predictive Engine (Model Training)

Here’s where excitement intensifies: we train our crystal ball — XGBoost — by showing it past examples to guide future predictions:

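The data is divided chronologically, without shuffling: the first 80% of rows for training and the last 20% for testing. With 363 rows, that would be 290 training days and 73 test days. Keeping the order intact is like reading chapters sequentially rather than randomly, and it stops the model from peeking at the future.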
XGBoost swiftly analyzes training data, creating numerous decision trees — logic paths like “if RSI high and MACD rising, anticipate price increase.”
After learning, XGBoost predicts test period prices, evaluated against actual outcomes.
We measure prediction accuracy with Mean Squared Error (MSE) and Root Mean Squared Error (RMSE). MSE is the average of the squared prediction errors; RMSE is its square root, which brings the figure back into dollars and reads as a typical miss, with large misses weighted more heavily.
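
The split and the two error measures above can be sketched in a few lines. The helper names are hypothetical and the prices are toy values; the rounding in chronological_split (test rows rounded up) is chosen to reproduce the 290/73 split quoted for 363 rows.

```python
import math

def chronological_split(rows, test_size=0.2):
    """Keep date order: the earliest rows train the model, the latest rows test it."""
    n_test = max(1, math.ceil(len(rows) * test_size))
    cut = len(rows) - n_test
    return rows[:cut], rows[cut:]

def mse(actual, predicted):
    """Mean of the squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Square root of the MSE, back in the same units as the prices."""
    return math.sqrt(mse(actual, predicted))

train, test = chronological_split(list(range(363)))
print(len(train), len(test))

actual = [100.0, 110.0]     # toy prices
predicted = [98.0, 114.0]   # toy guesses
print(mse(actual, predicted), rmse(actual, predicted))
```

Here the misses are $2 and $4, so the MSE is (4 + 16) / 2 = 10 and the RMSE is about $3.16, slightly above the plain average miss of $3 because squaring penalizes the larger error more.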

Step 5: Foreseeing Tomorrow (Making Predictions)

Now prepared and educated, XGBoost confidently predicts the next day’s Bitcoin price:

Taking today’s (last day’s) standardized features, XGBoost generates a calculated prediction, extrapolated from learned patterns.
This forecast, clearly communicated by the script, reflects Bitcoin’s expected trajectory, informed by all available indicators and past behaviors.

Step 6: Visual Confirmation (Plotting Results)

Visualization elegantly captures the prediction’s effectiveness:

A graph juxtaposes actual prices (blue line) and predicted prices (red dashed line), clearly displaying XGBoost’s predictive accuracy.
The graphical clarity offers intuitive insights into the predictive performance, vividly showing close predictions as confidence boosters and distant deviations as learning opportunities.

With graceful execution, our script sequentially orchestrates data retrieval, indicator calculation, training, prediction, and visualization, narrating each step clearly and humorously. Any hiccups, like network interruptions, are acknowledged with wit, ensuring transparency.

Our model’s precise numerical prediction (e.g., $90,604.13) emerges from comprehensive analysis of recent indicators, like rising RSI or strong MACD momentum. RMSE serves as a reminder of inherent uncertainty, indicating potential variations around this prediction, given Bitcoin’s legendary volatility.
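
One rough way to read a forecast alongside its RMSE is as a band of typical error around the predicted price. In the sketch below, the prediction is the example figure quoted above, while the RMSE is a hypothetical value; substitute whatever your own run prints. The band is not a statistical confidence interval, only a reminder of how far off the model has typically been on the test days.

```python
prediction = 90604.13  # example figure quoted in the article
rmse = 2500.00         # hypothetical value, not a result from the script

low = prediction - rmse
high = prediction + rmse
relative_error = rmse / prediction

print(f"Forecast: ${prediction:,.2f}")
print(f"Typical-miss band: ${low:,.2f} to ${high:,.2f}")
print(f"RMSE as a share of the forecast: {relative_error:.1%}")
```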

This analytical journey, powered by Python and enhanced with nuanced indicators, transforms seemingly chaotic Bitcoin movements into understandable narratives. While predictive perfection remains elusive due to Bitcoin’s unpredictable nature, our sophisticated, math-driven approach significantly reduces uncertainty, adding clarity and confidence to investment decisions.

This script represents more than a mere predictive tool — it’s a detective story, a philosophical reflection, and a sophisticated mathematical endeavor, all rolled into one. By dissecting Bitcoin’s history, interpreting emotional signals, and forecasting its future, we turn complexity into elegance, chaos into understanding, and mystery into informed intrigue.
