Awesome ETF Analysis

A practical guide to ETF selection and analysis — expense ratio impact, tracking error measurement, tax efficiency, liquidity assessment, and Python tools for systematic comparison.

Why this list exists: As of 2026 there are 10,000+ ETFs globally. Most investors use 3–5 of them, yet the process of selecting those 3–5 is poorly documented in open-source tools. Morningstar and ETFdb are excellent but paywalled or ad-heavy. This list provides Python scripts and free data sources to systematically compare ETFs on the metrics that actually matter.

ETF Selection Framework

The decision sequence for choosing between similar ETFs:

Step 1: Define what you need
  → What index/exposure do you want?
  → Accumulating (dividends reinvested inside the fund) or distributing?
  → Account type (taxable vs. IRA)?

Step 2: Filter for adequate liquidity
  → AUM > $500M (minimum) or > $1B (preferred)
  → Average daily volume > 100,000 shares
  → Bid-ask spread < 0.10% for large-caps, < 0.20% for others

Step 3: Compare expense ratios among survivors
  → Difference of 0.05% compounds to real money over decades

Step 4: Check tracking difference (not just tracking error)
  → Tracking difference = index annual return minus actual ETF annual return (the cost-style sign used throughout this guide)
  → A negative tracking difference (ETF outperforms its index) is possible due to securities lending

Step 5: Assess tax efficiency in taxable accounts
  → Capital gains distribution history (past 5 years)
  → Dividend tax treatment (qualified vs. ordinary)
  → ETF structure (open-end vs. UIT vs. grantor trust)

Step 6: Verify index methodology
  → Market-cap weighted vs. equal-weighted vs. fundamental-weighted
  → Rebalancing frequency and reconstitution rules
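
The Step 2 thresholds can be turned into a simple screen. The sketch below is only an illustration: `EtfQuote`, `spread_pct`, and `passes_liquidity_screen` are hypothetical names, and the tickers and figures in the usage lines are made up. It assumes you already have AUM, volume, and a bid/ask quote from your broker or the fund provider.

```python
from dataclasses import dataclass


@dataclass
class EtfQuote:
    """Hypothetical container for the inputs Step 2 needs."""
    ticker: str
    aum_usd: float            # assets under management, in dollars
    avg_daily_volume: float   # average shares traded per day
    bid: float
    ask: float
    large_cap: bool = True    # selects the 0.10% vs. 0.20% spread limit


def spread_pct(bid: float, ask: float) -> float:
    """Quoted bid-ask spread as a percentage of the midpoint."""
    mid = (bid + ask) / 2
    return (ask - bid) / mid * 100


def passes_liquidity_screen(q: EtfQuote) -> bool:
    """Apply the Step 2 minimums for AUM, volume, and spread."""
    max_spread = 0.10 if q.large_cap else 0.20
    return (
        q.aum_usd > 500e6
        and q.avg_daily_volume > 100_000
        and spread_pct(q.bid, q.ask) < max_spread
    )


# Made-up quotes for illustration only
candidates = [
    EtfQuote("AAA", 2.0e9, 1_500_000, 100.00, 100.02),
    EtfQuote("BBB", 3.0e8, 40_000, 50.00, 50.20, large_cap=False),
]
survivors = [q.ticker for q in candidates if passes_liquidity_screen(q)]
print(survivors)  # ['AAA']
```

Only the funds that pass this screen move on to the expense ratio comparison in Step 3.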

Critical ETF Metrics Explained

Expense Ratio (ER)

The annual fee deducted from assets. The most important factor for long-term, passive ETFs.

Net Return = Index Return - Expense Ratio - Other Costs

Note: The expense ratio is neither a ceiling nor a floor on cost drag. Trading costs and cash drag add to it, while securities lending income can partially or fully offset it, so the effective cost (tracking difference) can land above or below the stated ER.

Tracking Difference vs. Tracking Error

These are related but different concepts that are frequently confused:

Tracking Difference = Index Annual Return − ETF Annual Return
  (Negative = ETF outperformed the index net of costs; positive = underperformed)
  → Sign conventions vary: many providers report ETF return minus index return, which flips the sign
  → This is what you actually care about

Tracking Error = Standard Deviation of daily return differences
  → Measures consistency of tracking; high TE means erratic performance vs. index
  → A low TE with a consistently negative tracking difference is the ideal
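
The two measures can be computed side by side from matched return series. The following is a minimal sketch using only the standard library and synthetic returns. `annualized` and `tracking_stats` are illustrative names, and the sign follows this guide's usage, where a negative tracking difference means the ETF beat its index.

```python
import statistics


def annualized(returns: list[float], periods_per_year: int = 252) -> float:
    """Geometric annualized return from a list of periodic returns."""
    growth = 1.0
    for r in returns:
        growth *= 1 + r
    return growth ** (periods_per_year / len(returns)) - 1


def tracking_stats(etf: list[float], index: list[float],
                   periods_per_year: int = 252) -> tuple[float, float]:
    """Return (tracking difference, annualized tracking error).

    Tracking difference is index minus ETF, so a negative value means
    the ETF came out ahead. Tracking error is the sample standard
    deviation of the per-period return gaps, scaled by sqrt(periods).
    """
    gaps = [e - i for e, i in zip(etf, index)]
    td = annualized(index, periods_per_year) - annualized(etf, periods_per_year)
    te = statistics.stdev(gaps) * periods_per_year ** 0.5
    return td, te


# Synthetic data: the ETF lags the index by the same tiny amount every day
index_returns = [0.001, -0.002, 0.0015, 0.0005] * 63   # 252 trading days
etf_returns = [r - 0.000001 for r in index_returns]

td, te = tracking_stats(etf_returns, index_returns)
print(f"Tracking difference: {td:+.4%}")
print(f"Tracking error:      {te:.4%}")
```

Because the daily lag is constant, the tracking error is essentially zero while the tracking difference is small and positive: steady tracking with a persistent cost.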

Securities Lending Income

Vanguard, Fidelity, and iShares ETFs lend portfolio securities to short sellers and pass the income, net of any agent fees or issuer share, back to the fund. This can partially or fully offset the expense ratio:

Effective Cost = Expense Ratio − Securities Lending Income

Example: Vanguard VTI
  Stated ER:                  0.03%
  Securities Lending Income: -0.02%
  Effective Cost:             ~0.01%
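
The same arithmetic as a small helper, offered as a sketch only. `effective_cost` is a hypothetical name, and the `other_costs` parameter is an addition here to mirror the "Other Costs" term in the net return formula. The second call uses made-up figures.

```python
def effective_cost(expense_ratio: float, lending_income: float,
                   other_costs: float = 0.0) -> float:
    """Effective annual cost as a fraction of assets.

    lending_income is the securities lending revenue credited to the
    fund; other_costs covers items such as trading costs and cash drag.
    """
    return expense_ratio + other_costs - lending_income


# The VTI figures from the example above (0.03% ER, 0.02% lending income)
vti = effective_cost(0.0003, 0.0002)
print(f"{vti:.2%}")  # 0.01%

# Made-up fund whose lending income exceeds its fee: negative effective cost
surplus = effective_cost(0.0006, 0.0008)
print(f"{surplus:+.2%}")
```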

Premium/Discount to NAV

ETFs can trade above (premium) or below (discount) their net asset value. For large, liquid ETFs on major US stocks, this difference is typically < 0.05%. For illiquid or international ETFs, it can be significant.

import yfinance as yf

def check_premium_discount(etf_ticker: str) -> dict:
    """
    Check ETF's approximate premium/discount to NAV.
    Note: Intraday NAV (iNAV) is more accurate but harder to access freely.
    """
    etf = yf.Ticker(etf_ticker)
    info = etf.info

    market_price = info.get('regularMarketPrice', None)
    nav = info.get('navPrice', None)

    if market_price and nav:
        premium_pct = (market_price - nav) / nav * 100
        return {
            'Ticker': etf_ticker,
            'Market Price': f"${market_price:.2f}",
            'NAV': f"${nav:.2f}",
            'Premium/Discount': f"{premium_pct:+.3f}%"
        }
    return {'Ticker': etf_ticker, 'Note': 'NAV not available via yfinance'}

The Long-Term Cost of Expense Ratios

Small differences in expense ratios compound to significant amounts over time:

import numpy as np
import pandas as pd

def expense_ratio_drag(
    investment: float,
    gross_return: float,    # Annual return before fees (e.g., 0.10 for 10%)
    years: int,
    expense_ratios: list    # e.g., [0.0003, 0.0007, 0.0020, 0.0075]
) -> pd.DataFrame:
    """
    Show the compounded cost of different expense ratios over time.
    """
    rows = []
    for er in expense_ratios:
        net_return = gross_return - er
        final_value = investment * (1 + net_return) ** years
        gross_value = investment * (1 + gross_return) ** years
        cost = gross_value - final_value

        rows.append({
            'Expense Ratio': f"{er:.2%}",
            'Net Annual Return': f"{net_return:.2%}",
            f'Value after {years}yr': f"${final_value:,.0f}",
            'Cost vs. No-Fee': f"${cost:,.0f}",
            'Cost %': f"{cost / gross_value:.1%}",
        })

    df = pd.DataFrame(rows)
    print(f"\nExpense Ratio Impact: ${investment:,.0f} invested, {gross_return:.0%} gross return, {years} years\n")
    print(df.to_string(index=False))
    return df


expense_ratio_drag(
    investment=100_000,
    gross_return=0.10,
    years=30,
    expense_ratios=[0.0003, 0.0006, 0.0020, 0.0050, 0.0075, 0.0100]
)

# Output (approximate, rounded to the nearest $1k; the no-fee ending value is $1,745k):
# ER 0.03% → $1,731k (cost: $14k)
# ER 0.06% → $1,717k (cost: $28k)
# ER 0.20% → $1,652k (cost: $93k)
# ER 0.50% → $1,522k (cost: $223k)
# ER 0.75% → $1,421k (cost: $324k)
# ER 1.00% → $1,327k (cost: $418k)

Key takeaways:

Going from 1.00% to 0.03% ER saves ~$400,000 over 30 years on a $100,000 investment (at 10% gross return)
Even the difference between 0.03% (VTI) and 0.20% (typical of higher-priced index ETFs) is about $78,000 over 30 years
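
The drag can also be written in closed form. With gross return g, expense ratio f, and horizon n, the share of the no-fee ending value lost to fees is 1 − ((1 + g − f) / (1 + g))^n, independent of the amount invested. A short sketch under the same simplifying assumptions as the function above (single lump sum, constant return, fee subtracted from the annual return); `fee_share_lost` is a name chosen here for illustration.

```python
def fee_share_lost(gross_return: float, expense_ratio: float, years: int) -> float:
    """Fraction of the no-fee ending value given up to fees."""
    net_growth = 1 + gross_return - expense_ratio
    return 1 - (net_growth / (1 + gross_return)) ** years


for er in (0.0003, 0.0020, 0.0100):
    share = fee_share_lost(0.10, er, 30)
    print(f"ER {er:.2%}: {share:.1%} of ending wealth lost")
```

At a 10% gross return over 30 years, a 1.00% fee gives up roughly 24% of the ending value.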

Tracking Error & Tracking Difference

Calculating Tracking Difference

import yfinance as yf
import pandas as pd

def calculate_tracking_difference(etf: str, benchmark: str,
                                   start: str = '2020-01-01') -> dict:
    """
    Estimate tracking difference between an ETF and its benchmark proxy.

    Args:
        etf: ETF ticker (e.g., 'VTI')
        benchmark: Benchmark proxy ticker (e.g., 'ITOT', a comparable index ETF)

    Note: True tracking difference requires the actual index return, which
    is only available from fund providers. This uses a comparable ETF as proxy.
    """
    data = yf.download([etf, benchmark], start=start, auto_adjust=True)['Close']
    returns = data.pct_change().dropna()

    etf_annual = (1 + returns[etf]).prod() ** (252 / len(returns)) - 1
    bmark_annual = (1 + returns[benchmark]).prod() ** (252 / len(returns)) - 1

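    # Cost-style sign, matching the tables in this guide:
    # negative means the ETF beat its benchmark proxy.
    tracking_diff = bmark_annual - etf_annual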
    tracking_error = (returns[etf] - returns[benchmark]).std() * (252 ** 0.5)

    return {
        'ETF': etf,
        'Benchmark Proxy': benchmark,
        'ETF Annual Return': f"{etf_annual:.3%}",
        'Benchmark Annual Return': f"{bmark_annual:.3%}",
        'Tracking Difference': f"{tracking_diff:+.3%}",
        'Tracking Error (Ann.)': f"{tracking_error:.3%}",
        'Period': f"{start} to {returns.index[-1].date()}",
    }

# Compare two S&P 500 ETFs (VOO vs. IVV as proxies for each other)
print(calculate_tracking_difference('VOO', 'IVV', start='2015-01-01'))

# Compare total market ETFs
print(calculate_tracking_difference('VTI', 'ITOT', start='2015-01-01'))

Annual Tracking Difference: Real-World Examples (2026 estimates)

ETF	Index	Stated ER	Approx. Tracking Difference	Explanation
VTI	CRSP US Total Market	0.03%	~-0.01% (ETF outperforms)	Securities lending offsets cost
VOO	S&P 500	0.03%	~0.00%	Near-perfect tracking
SCHB	Dow Jones US Broad	0.03%	~+0.02%	Slight underperformance
IVV	S&P 500	0.03%	~-0.01%	iShares lending program
SPY	S&P 500 (SPDR)	0.0945%	~+0.03%	Higher ER, UIT structure
QQQ	NASDAQ-100	0.20%	~+0.10%	High ER, less lending offset

Note: Tracking difference changes year to year. Verify with fund provider's annual reports.

Tax Efficiency Analysis

ETF Structures and Tax Treatment

Structure	Examples	Capital Gains Dist. Risk	Dividend Treatment
Open-End Fund (ETF)	VTI, VOO, SCHB	Very low (in-kind creation/redemption)	Qualified if held 61+ days
Unit Investment Trust (UIT)	SPY, QQQ, DIA	Low-medium (must hold all index stocks)	Qualified if holding period is met; dividends sit in cash until paid out, not reinvested
Grantor Trust	GLD, SLV	None (pass-through)	N/A (physically held asset)
Exchange-Traded Note (ETN)	Some commodity, VIX products	Very low (unsecured debt with no underlying portfolio)	Typically no dividends; tax treatment differs by product, verify

Why open-end ETFs are most tax-efficient: The in-kind creation/redemption mechanism lets authorized participants (large institutional traders) swap ETF shares for the underlying basket of stocks, so the fund rarely has to sell securities and realize gains to meet redemptions. It can also hand off its lowest-cost-basis shares in the process. This is why broad index equity ETFs almost never distribute capital gains.
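
To make the mechanism concrete, here is a toy model of one holding with two tax lots. It is a deliberately simplified sketch: `Lot`, `redeem_for_cash`, `redeem_in_kind`, and `embedded_gain` are hypothetical names, the prices are made up, and real funds track many securities and lots.

```python
from dataclasses import dataclass


@dataclass
class Lot:
    shares: float
    cost_basis: float   # per share


def redeem_for_cash(lots: list[Lot], shares: float, price: float) -> float:
    """Sell highest-basis lots first; return the gain the fund realizes."""
    gain = 0.0
    for lot in sorted(lots, key=lambda x: x.cost_basis, reverse=True):
        sold = min(lot.shares, shares)
        gain += sold * (price - lot.cost_basis)
        lot.shares -= sold
        shares -= sold
        if shares <= 0:
            break
    return gain


def redeem_in_kind(lots: list[Lot], shares: float) -> float:
    """Hand lowest-basis lots to the redeeming party without selling.

    Nothing is sold, so the fund realizes no gain, and the embedded
    gain in the transferred lots leaves the portfolio.
    """
    for lot in sorted(lots, key=lambda x: x.cost_basis):
        moved = min(lot.shares, shares)
        lot.shares -= moved
        shares -= moved
        if shares <= 0:
            break
    return 0.0


def embedded_gain(lots: list[Lot], price: float) -> float:
    """Unrealized gain still sitting in the portfolio."""
    return sum(lot.shares * (price - lot.cost_basis) for lot in lots)


price = 50.0
cash_lots = [Lot(100, 20.0), Lot(100, 45.0)]
kind_lots = [Lot(100, 20.0), Lot(100, 45.0)]

cash_gain = redeem_for_cash(cash_lots, 100, price)
kind_gain = redeem_in_kind(kind_lots, 100)

print(f"Cash redemption:    realized {cash_gain:.0f}, still embedded {embedded_gain(cash_lots, price):.0f}")
print(f"In-kind redemption: realized {kind_gain:.0f}, still embedded {embedded_gain(kind_lots, price):.0f}")
```

In this toy case the cash redemption realizes a gain and leaves the low-basis lot behind, while the in-kind redemption realizes nothing and removes most of the embedded gain.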

Screening for Capital Gains Distributions

import pandas as pd
import yfinance as yf

def check_capital_gains_history(tickers: list) -> pd.DataFrame:
    """
    Check capital gains distribution history.
    Note: yfinance combines dividends and capital gains in .dividends
    For accurate CG data, check fund provider websites or Morningstar.
    """
    # Practical approach: check dividend yield relative to category
    results = []
    for ticker in tickers:
        info = yf.Ticker(ticker).info
        results.append({
            'ETF': ticker,
            'Fund Type': info.get('quoteType', 'N/A'),
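            # yfinance does not expose legal structure here, so report the inception date as-is
            'Inception Date': info.get('fundInceptionDate', 'N/A'),
            # Guard against a missing or None yield; Yahoo's units for this field
            # have varied across yfinance versions, so sanity-check against a known fund
            'Dividend Yield': f"{(info.get('dividendYield') or 0) * 100:.2f}%",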
            'Tax Note': 'Verify capital gains at fund provider website'
        })
    return pd.DataFrame(results)

# For actual capital gains history, use:
# - Vanguard: personal.vanguard.com → Tax center
# - iShares: ishares.com → Tax information
# - Schwab: schwabfunds.com → Tax information

Tax-Efficient ETF Ranking by Category

Most tax-efficient (for taxable accounts):

Broad US market ETFs (VTI, SCHB, ITOT) — near-zero capital gains distributions
Developed international (IEFA, VEA) — very low capital gains
S&P 500 ETFs (VOO, IVV, SPLG) — essentially zero capital gains

Less tax-efficient (better in tax-advantaged accounts):

Bond ETFs (BND, AGG) — interest income taxed as ordinary income
REIT ETFs (VNQ, SCHH) — dividends mostly ordinary income
High-dividend ETFs (SCHD, HDV) — higher dividend yield = more taxable events
Actively managed ETFs — higher turnover = more potential gain distributions
Commodity ETFs structured as limited partnerships (these issue a Schedule K-1 rather than a Form 1099)
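
The ranking above largely comes down to how much of a fund's yield is taxed each year and at what rate. A rough sketch of that arithmetic follows. `annual_tax_drag` is a hypothetical helper, and the yields, qualified shares, and tax rates are placeholders rather than data for any real fund. It ignores state taxes and capital gains distributions.

```python
def annual_tax_drag(dividend_yield: float, qualified_share: float,
                    ordinary_rate: float, qualified_rate: float) -> float:
    """Estimated yearly tax cost as a fraction of the position's value."""
    qualified = dividend_yield * qualified_share * qualified_rate
    ordinary = dividend_yield * (1 - qualified_share) * ordinary_rate
    return qualified + ordinary


# Placeholder inputs: 24% ordinary rate, 15% qualified dividend rate
equity_drag = annual_tax_drag(0.015, 0.95, ordinary_rate=0.24, qualified_rate=0.15)
bond_drag = annual_tax_drag(0.040, 0.00, ordinary_rate=0.24, qualified_rate=0.15)

print(f"Broad equity ETF (assumed 1.5% yield, 95% qualified): {equity_drag:.2%}")
print(f"Bond ETF (assumed 4.0% yield, all ordinary):          {bond_drag:.2%}")
```

Under these assumptions the bond fund's yearly tax cost is several times the equity fund's, which is the case for holding it in a tax-advantaged account.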

Liquidity & Bid-Ask Spread Assessment

For large-cap broad market ETFs (VTI, SPY, QQQ), liquidity is not a concern. It matters most for:

Sector ETFs
Emerging market ETFs
Small/mid-cap ETFs
Thematic/niche ETFs

import pandas as pd
import yfinance as yf

def liquidity_assessment(tickers: list) -> pd.DataFrame:
    """Assess ETF liquidity metrics."""
    results = []
    for ticker in tickers:
        info = yf.Ticker(ticker).info
        # .info can return None for missing fields, so fall back to 0
        aum = info.get('totalAssets') or 0
        avg_volume = info.get('averageVolume') or 0
        price = info.get('regularMarketPrice') or 0

        # Estimate daily dollar volume
        daily_dollar_vol = avg_volume * price

        # AUM-based tier as a rough liquidity proxy (no spread data is used here)
        # For actual spread: check Bloomberg, Morningstar, or broker platform
        liquidity_tier = (
            'Excellent' if aum > 5e9 else
            'Good' if aum > 1e9 else
            'Adequate' if aum > 500e6 else
            'Caution' if aum > 100e6 else
            'High Risk of Closure'
        )

        results.append({
            'ETF': ticker,
            'AUM': f"${aum/1e9:.1f}B" if aum > 1e9 else f"${aum/1e6:.0f}M",
            'Avg Daily Volume': f"{avg_volume:,.0f}",
            'Daily $ Volume': f"${daily_dollar_vol/1e6:.1f}M",
            'Liquidity': liquidity_tier,
        })

    return pd.DataFrame(results)

# Compare similar ETFs
print(liquidity_assessment(['VTI', 'SCHB', 'ITOT', 'SPTM']))

ETF closure risk: ETFs with < $50M AUM are at risk of being liquidated by their issuer (not a permanent loss, but forces a taxable event). Prefer ETFs with > $500M AUM.
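
Spread and expense ratio can be weighed against each other by amortizing the spread over the expected holding period. The sketch below is a back-of-the-envelope approximation, not a standard formula from any provider. `annualized_holding_cost` is a hypothetical name and the two funds are made up.

```python
def annualized_holding_cost(expense_ratio: float, spread: float,
                            holding_years: float) -> float:
    """Expense ratio plus the round-trip spread amortized per year.

    Buying at the ask and later selling at the bid costs roughly one
    full quoted spread in total (half on the way in, half on the way out).
    Both inputs are fractions, e.g. 0.0010 for 0.10%.
    """
    return expense_ratio + spread / holding_years


# Made-up funds: A is liquid but pricier, B is cheaper with a wide spread
for years in (1, 10):
    a = annualized_holding_cost(0.0010, 0.0002, years)
    b = annualized_holding_cost(0.0005, 0.0030, years)
    print(f"{years:>2} yr hold: fund A {a:.3%}/yr, fund B {b:.3%}/yr")
```

With these inputs the liquid fund is cheaper for a one-year hold, while the lower-fee fund comes out ahead over ten years, so spread matters most for short holding periods.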

Python Tools & Scripts

Full ETF Comparison Tool

import yfinance as yf
import pandas as pd

def compare_etfs(tickers: list, start: str = '2015-01-01') -> pd.DataFrame:
    """
    Comprehensive ETF comparison: returns, cost, and basic structure.
    """
    price_data = yf.download(tickers, start=start, auto_adjust=True)['Close']
    returns = price_data.pct_change().dropna()

    rows = []
    for ticker in tickers:
        info = yf.Ticker(ticker).info
        r = returns[ticker].dropna()

        annual_ret = (1 + r).prod() ** (252 / len(r)) - 1
        annual_vol = r.std() * (252 ** 0.5)
        sharpe = (annual_ret - 0.05) / annual_vol  # Assume 5% risk-free (adjust to current)
        dd = ((1 + r).cumprod() / (1 + r).cumprod().cummax() - 1).min()
        er = info.get('annualReportExpenseRatio') or info.get('netExpenseRatio') or 0

        rows.append({
            'ETF': ticker,
            'Category': info.get('category', 'N/A'),
            'AUM': f"${(info.get('totalAssets') or 0)/1e9:.1f}B",  # None-safe
            'Expense Ratio': f"{er:.2%}",
            f'Ann. Return ({start[:4]}-)': f"{annual_ret:.2%}",
            'Annual Volatility': f"{annual_vol:.2%}",
            'Sharpe (5% RF)': f"{sharpe:.2f}",
            'Max Drawdown': f"{dd:.2%}",
        })

    df = pd.DataFrame(rows)
    return df


# US Total Market comparison (FSKAX is an index mutual fund, included as a non-ETF reference)
print(compare_etfs(['VTI', 'SCHB', 'ITOT', 'FSKAX'], start='2015-01-01'))

# Core bond comparison (FXNAX is likewise a mutual fund)
print(compare_etfs(['BND', 'AGG', 'SCHZ', 'FXNAX'], start='2015-01-01'))

ETF Reference: Key Categories

U.S. Equity — Core Holdings

Category	Cheapest Options	AUM Tier	Index
Total US Market	VTI (0.03%), SCHB (0.03%), ITOT (0.03%)	$300B+	CRSP / DJ / S&P TMI
S&P 500	VOO (0.03%), IVV (0.03%), SPLG (0.02%)	$1T+	S&P 500
S&P 500 (legacy)	SPY (0.0945%)	$550B+	S&P 500 (UIT)
US Small Cap	VB (0.05%), SCHA (0.04%)	$50B+	CRSP / DJ
US Mid Cap	VO (0.04%), SCHM (0.04%)	$50B+	CRSP / DJ
NASDAQ-100	QQQ (0.20%), QQQM (0.15%)	$200B+	NASDAQ-100

International Equity

Category	Options	ER	Index
Total International	VXUS (0.07%), IXUS (0.07%)	Low	FTSE Global All Cap ex US / MSCI ACWI ex USA IMI
Developed Markets	VEA (0.05%), IEFA (0.07%), SCHF (0.06%)	Low	FTSE / MSCI EAFE
Emerging Markets	VWO (0.08%), IEMG (0.09%)	Low	FTSE / MSCI EM

Fixed Income — Core

Category	Options	ER
US Total Bond	BND (0.03%), AGG (0.03%), SCHZ (0.03%)	Very low
Short-Term	VGSH, SHY, SCHO	Very low
Intermediate-Term	VGIT, IEI, SCHR	Very low
Long-Term	VGLT, TLH	Low
TIPS (Inflation)	VTIP (0.04%), SCHP (0.03%)	Very low

Specialty / Factor

Category	Options	ER
Dividend Growth	VIG (0.06%), DGRO (0.08%), SCHD (0.06%)	Low
Value Factor	VTV (0.04%), SCHV (0.04%), IVE (0.18%)	Low-medium
Small-Cap Value	VBR (0.07%), IJS (0.18%)	Low-medium
Momentum	MTUM (0.15%), QMOM (0.35%)	Medium
Quality	QUAL (0.15%), JQUA (0.12%)	Low-medium

Free Data Sources for ETF Research

Source	What You Get	Access
ETF provider websites	Exact expense ratios, tracking difference, capital gains history	Free, no API
yfinance	Basic ETF info, price data, dividends	`pip install yfinance`
ETFdb.com	Comparison tools, screener	Free basic, paid premium
etf.com	Fund flows, analytics	Free basic
Morningstar	Fund ratings, portfolio analytics	Free basic, paid for full data
SEC EDGAR (N-CEN, N-PORT)	Fund holdings, expense disclosures	Free API
FRED	Index levels and interest-rate series for context (no ETF prices)	Free API; series IDs such as `SP500`, `DGS10`

Common ETF Misconceptions

"Higher AUM = Better ETF": Not always. $500M+ is sufficient for most purposes; beyond that, AUM has little bearing on tracking quality.

"The ETF with the lowest ER is always best": Tracking difference matters more than stated ER. An ETF with 0.06% ER but -0.02% tracking difference costs less than one with 0.03% ER and +0.03% tracking difference.

"All S&P 500 ETFs are identical": Nearly identical returns, but structural differences matter. A UIT (SPY) must hold dividends in cash until its quarterly payout and cannot lend securities, which slightly reduces returns vs. open-end ETFs (VOO, IVV) in rising markets.

"ETFs are always tax-efficient": Open-end index equity ETFs generally are. Bond ETFs generate ordinary interest income. Commodity ETFs, actively managed ETFs, and some leveraged ETFs can distribute taxable gains.

"Inverse/leveraged ETFs are long-term holds": These funds reset to their target multiple every day, so volatility decay makes them unsuitable for long-term holding. They are designed for short-term tactical use. A 2× ETF does not deliver 2× the long-term return.

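Volatility decay is easy to reproduce with a toy return series. The sketch below compounds an index that alternates between +10% and -10% days, first unlevered and then at 2× daily leverage. It ignores fees and financing costs, and `compound` is an illustrative helper name.

```python
def compound(returns: list[float], leverage: float = 1.0) -> float:
    """Total return from compounding daily returns at a fixed daily leverage."""
    value = 1.0
    for r in returns:
        value *= 1 + leverage * r
    return value - 1


# An index that rises 10% one day and falls 10% the next, five times over
daily = [0.10, -0.10] * 5

index_total = compound(daily)
levered_total = compound(daily, leverage=2.0)

print(f"Index:              {index_total:+.2%}")
print(f"2x daily fund:      {levered_total:+.2%}")
print(f"2x the index total: {2 * index_total:+.2%}")
```

Over these ten days the index ends down about 4.9%, while the 2× daily fund ends down about 18.5%, far worse than twice the index loss.
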
Contributing

See CONTRIBUTING.md. Most welcome:

Tracking difference data with dates and sources (fund provider annual reports)
Non-US ETF equivalents (European UCITS ETFs, Australian ETFs)
Updated expense ratio data when providers cut fees

License

CC0 1.0 Universal — Public domain.
