
Finnhub Sentiment Integration: How I Pull, Normalize, and Age-Weight News Scores for the Alpaca Bot

TL;DR / Key Takeaways

  • Finnhub's /news-sentiment endpoint returns a companyNewsScore normalized between 0 and 1, but it treats a two-hour-old article the same as a two-minute-old one, which is a problem for intraday signals.
  • Age-weighting with a buzz-based freshness factor reduces the influence of stale news without discarding it entirely.
  • Combining the age-weighted sentiment score with a volatility filter prevents the bot from trading on noise during low-liquidity periods.
  • The full pipeline runs in under 200ms per ticker, which is fast enough for a 5-minute signal refresh cycle on the Alpaca bot.

The Alpaca stock bot I run alongside the Kalshi weather bot uses three inputs to decide whether to place a trade: technical indicators, Kronos AI time-series predictions, and news sentiment from Finnhub. Of the three, sentiment is the most volatile and the most abused.

Most implementations I have seen treat sentiment as a binary gate — positive sentiment means consider a long, negative means stay out. That is too coarse. The Finnhub API gives you a score with real numerical precision. Ignoring that precision and rounding to a binary flag throws away information you already paid for in API calls.

The bigger problem is time. Finnhub aggregates recent news articles and scores them, but the aggregation window is not always clear. If the last meaningful headline about a stock came in at 9:30 AM and you are running a signal at 3:45 PM, that sentiment score is six hours stale. For intraday trading it is almost irrelevant. Without some form of age-weighting, you are trading on yesterday's news.

Here is how I handle it.

Fetching the Raw Sentiment

The Finnhub endpoint is straightforward:

# predictandprofit.io
import requests

FINNHUB_API_KEY = "your_api_key_here"

def get_finnhub_sentiment(ticker: str) -> dict | None:
    url = "https://finnhub.io/api/v1/news-sentiment"
    params = {"symbol": ticker, "token": FINNHUB_API_KEY}
    try:
        resp = requests.get(url, params=params, timeout=5)
        resp.raise_for_status()
        return resp.json()
    except requests.RequestException as e:
        print(f"Finnhub fetch failed for {ticker}: {e}")
        return None

The response includes a companyNewsScore (overall sentiment), sectorAverageBullishPercent, and a buzz object with article count and weekly average. The score I care about most is companyNewsScore, which Finnhub normalizes to a range of 0 to 1 where 0.5 is neutral.
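For reference, a trimmed response has roughly this shape (the field names are the ones this post reads; the values are invented for illustration):

# Illustrative response shape only, not real data
{
    "symbol": "AAPL",
    "companyNewsScore": 0.73,
    "sectorAverageBullishPercent": 0.61,
    "buzz": {
        "articlesInLastWeek": 20,
        "weeklyAverage": 15.6
    }
}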

First thing I do is re-center it to a -1 to 1 scale so it plays well with the other signal inputs:

# predictandprofit.io
def normalize_sentiment(raw_score: float) -> float:
    """Re-center Finnhub's 0-1 score to -1 to 1."""
    return (raw_score - 0.5) * 2.0

A Finnhub score of 0.75 becomes 0.5. A score of 0.25 becomes -0.5. A neutral 0.5 maps to 0. Now the scale matches the output of the other signal components.

The Age-Weighting Problem

Finnhub does not return timestamps on individual articles in the sentiment endpoint. What it does return is a buzz object with an article count for the past week and a historical weekly average. I use the ratio of those two as a freshness proxy.

# predictandprofit.io
def freshness_factor(buzz: dict) -> float:
    """
    Returns a value between 0 and 1 representing how 'fresh' the news activity is.
    A high ratio of this week's article count to the historical weekly average
    = fresh signal.
    """
    weekly_avg = buzz.get("weeklyAverage", 0)
    articles_last_week = buzz.get("articlesInLastWeek", 0)
    if weekly_avg == 0:
        return 0.5  # no data, assume neutral freshness
    # weeklyAverage is a per-week figure, so compare the weekly count directly
    ratio = articles_last_week / weekly_avg
    # cap at 2.0 to prevent extreme spikes from inflating the weight
    return min(ratio, 2.0) / 2.0

This gives me a freshness scalar. If the past week's article volume is in line with the historical weekly average, freshness is around 0.5. If there is a spike of news (earnings, macro event, product announcement), freshness climbs toward 1.0. If the stock has been quieter than usual, freshness drops toward 0.

The age-weighted sentiment is then:

# predictandprofit.io
def age_weighted_sentiment(raw_score: float, buzz: dict) -> float:
    normalized = normalize_sentiment(raw_score)
    freshness = freshness_factor(buzz)
    return normalized * freshness

A strong positive signal with high freshness passes through mostly intact. A strong positive signal with stale news gets dampened significantly. This is the behavior I want.
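A quick sanity check with made-up buzz values shows the dampening in action:

# Made-up buzz values for illustration
quiet = {"articlesInLastWeek": 4, "weeklyAverage": 16.0}    # slow news week
spike = {"articlesInLastWeek": 40, "weeklyAverage": 16.0}   # earnings-day burst

print(freshness_factor(quiet))               # 0.125  -- heavily dampened
print(freshness_factor(spike))               # 1.0    -- ratio capped at 2.0
print(age_weighted_sentiment(0.85, spike))   # 0.7    -- strong and fresh: passes through
print(age_weighted_sentiment(0.85, quiet))   # 0.0875 -- strong but stale: suppressed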

Adding a Volatility Gate

Even a fresh, strong sentiment signal should not trigger a trade if the market is illiquid or if volatility is unusually low. I use Alpaca's historical bars to compute a simple intraday range ratio before letting sentiment contribute to the final signal.

# predictandprofit.io
from alpaca.data.historical import StockHistoricalDataClient
from alpaca.data.requests import StockBarsRequest
from alpaca.data.timeframe import TimeFrame
from datetime import datetime

ALPACA_API_KEY = "your_api_key_here"
ALPACA_SECRET = "your_secret_here"

data_client = StockHistoricalDataClient(api_key=ALPACA_API_KEY, secret_key=ALPACA_SECRET)

def intraday_volatility_ok(ticker: str, min_range_pct: float = 0.005) -> bool:
    """
    Returns True if the stock has moved at least min_range_pct intraday.
    Prevents trading stagnant low-volume periods.
    """
    now = datetime.utcnow()
    # 13:30 UTC is 9:30 AM ET during daylight saving time; shift to 14:30 in winter
    start = now.replace(hour=13, minute=30, second=0, microsecond=0)
    request = StockBarsRequest(
        symbol_or_symbols=[ticker],
        timeframe=TimeFrame.Minute,
        start=start,
        end=now,
    )
    bars = data_client.get_stock_bars(request).df
    if bars.empty:
        return False
    high = bars["high"].max()
    low = bars["low"].min()
    if low == 0:
        return False
    range_pct = (high - low) / low
    return range_pct >= min_range_pct

If the intraday range is less than 0.5%, the bot does not trade on sentiment that day. The signal might be real but the market is not moving, and a small-cap spread will eat any edge the sentiment gave you.
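Since min_range_pct is a parameter, tightening the gate for thinner names is a one-line change (the tickers here are placeholders):

# Tickers are placeholders
intraday_volatility_ok("AAPL")                      # default: require a 0.5% intraday range
intraday_volatility_ok("XYZT", min_range_pct=0.01)  # demand a full 1% range for a thin small cap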

Putting It Together

The final composite sentiment score runs through all three stages before being passed to the signal aggregator:

# predictandprofit.io
def get_scored_sentiment(ticker: str) -> float | None:
    data = get_finnhub_sentiment(ticker)
    if data is None:
        return None

    raw_score = data.get("companyNewsScore", 0.5)
    buzz = data.get("buzz", {})

    if not intraday_volatility_ok(ticker):
        return 0.0  # flat signal, not None — tell the aggregator sentiment is neutral

    score = age_weighted_sentiment(raw_score, buzz)
    return round(score, 4)

Returning 0.0 instead of None when volatility is low is intentional. None means the data call failed. 0.0 means the data is valid but the signal is suppressed. The downstream aggregator handles them differently — None triggers a retry; 0.0 is used as-is.
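The aggregator itself is outside the scope of this post, but the branch looks roughly like this (a sketch; queue_retry and aggregate_signals are hypothetical stand-ins, not the bot's real functions):

# Sketch only; queue_retry() and aggregate_signals() are hypothetical stand-ins
sentiment = get_scored_sentiment("AAPL")
if sentiment is None:
    queue_retry("AAPL")  # fetch failed, try again next cycle
else:
    aggregate_signals("AAPL", sentiment=sentiment)  # 0.0 is valid input: neutral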

Caching and Rate Limits

Finnhub's free tier allows 60 calls per minute. The Alpaca bot watches a rotating watchlist of up to 20 tickers. At a 5-minute refresh cycle, that is 20 calls every 5 minutes, which stays well within limits even without caching.

But I cache anyway. If the signal refresh fires at 10:00:00 and the bot takes 8 seconds to evaluate all tickers, I do not want to re-fetch Finnhub data at 10:00:08 for tickers that are still in evaluation. A simple TTL cache with a 3-minute expiry prevents redundant calls and keeps the pipeline fast:

# predictandprofit.io
import time

_sentiment_cache: dict[str, tuple[float, float]] = {}  # ticker -> (score, timestamp)
CACHE_TTL_SECONDS = 180

def get_cached_sentiment(ticker: str) -> float | None:
    now = time.time()
    if ticker in _sentiment_cache:
        score, ts = _sentiment_cache[ticker]
        if now - ts < CACHE_TTL_SECONDS:
            return score
    score = get_scored_sentiment(ticker)
    if score is not None:
        _sentiment_cache[ticker] = (score, now)
    return score

This is the same TTL pattern I use for the Open-Meteo GFS ensemble data on the Kalshi bot. Build the cache once, use it everywhere.
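If you want that pattern as a reusable piece rather than a module-level dict, a generic decorator version looks something like this (my sketch, not the bot's actual code):

import time
from functools import wraps

def ttl_cache(ttl_seconds: float):
    """Cache a function's result per argument tuple for ttl_seconds."""
    def decorator(fn):
        cache: dict[tuple, tuple[object, float]] = {}
        @wraps(fn)
        def wrapper(*args):
            now = time.time()
            if args in cache:
                value, ts = cache[args]
                if now - ts < ttl_seconds:
                    return value
            value = fn(*args)
            if value is not None:  # never cache failures
                cache[args] = (value, now)
            return value
        return wrapper
    return decorator

@ttl_cache(180)
def cached_sentiment(ticker: str) -> float | None:
    return get_scored_sentiment(ticker)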

What I Learned the Hard Way

The first version of this pipeline did not have the freshness factor; it fed normalized sentiment straight into the signal aggregator. The result was that the bot sometimes traded on positive sentiment from an earnings announcement the market had already priced in. The sentiment was accurate, but it was four hours old.

Adding freshness dampening cut the number of sentiment-driven trades significantly. The ones that remained were much better correlated with actual intraday price movement. Fewer trades, higher quality signals. That pattern shows up consistently across every component of the bot: the tighter the filter, the better the output.

The Predict & Profit bot code — including the full Alpaca sentiment pipeline, technical indicator layer, and Kronos AI integration — is available at predictandprofit.gumroad.com.
