Major institutional investors oversee significant amounts of capital and often possess resources and information beyond the reach of individual investors. By monitoring shifts in their portfolios, we can glean valuable insights into which stocks are being favored or falling out of favor.
Moreover, organizations managing more than $100 million in assets are mandated to report their portfolio holdings quarterly through regulatory disclosures, such as the SEC’s Form 13F.
Extracting meaningful insights from these regulatory filings can be challenging. This article seeks to simplify that process by presenting a systematic approach to tracking these portfolios, leveraging publicly available data sources alongside specialized APIs.
Before we begin, I extend an invitation for you to join my dividend investing community. By joining, you won’t miss any crucial articles and can enhance your skills as an investor. As a bonus, you’ll receive a complimentary welcome gift: A 2024 Model Book With The Newest Trading Strategies (both Options and Dividends)
- Obtaining Structured Data Through Public Resources and the FMP API
- Analyzing Portfolio and Asset Holdings Variations Over Time
- Evaluating Buy and Sell Activities by Asset Type and Investment Firms
- Assessing Performance Metrics Over Time
1.1 Retrieving Data Manually
Manually extracting data from SEC filings can be a labor-intensive process, but it provides valuable insights into the structure and details of the disclosed information. Follow these steps:
Step 1: Search for a Company in the SEC EDGAR Database
Visit the SEC EDGAR platform at www.sec.gov/edgar.shtml and use the search functionality to locate the institution or company of interest. You can input the company name or its Central Index Key (CIK) to find its filings.
Step 2: Review the Latest 13F Filings and Open the ‘Filing’ Link
After identifying the company, search for the most recent 13F filing. These filings are usually designated as ‘Form 13F-HR’ or ‘Form 13F-HR/A’ (indicating an amended filing). Click the ‘Filing’ link to access the complete report.
Step 3: Open the HTML Version of the Information Table
Within the 13F filing, you will typically find several document types, such as the primary report and any accompanying exhibits. Locate the ‘Information Table’ section and select the HTML link to view the portfolio holdings presented in a more accessible and readable format.
Step 4: Voilà! Review and Analyze the Investor’s Latest Portfolio Holdings
The HTML version of the Information Table will display a detailed breakdown of the portfolio holdings, including information such as the securities, number of shares held, market value, and each holding’s percentage of the total portfolio. This data can be manually examined and analyzed to uncover the institutional investor’s strategies and identify prevailing investment trends.
The Financial Modeling Prep (FMP) API offers a streamlined approach to accessing institutional investors’ portfolio holdings data. This includes details on acquisitions, dispositions, and fluctuations in portfolio values over time.
The API enriches the dataset by providing key metrics, such as securities held, shares owned, market valuations, and changes in portfolio weights across different periods. This allows for a detailed analysis of the investment strategies and patterns followed by monitored institutional investors.
Below is a Python code snippet illustrating how to utilize the FMP API to retrieve portfolio holdings data for a selection of top institutional investors, based on their Central Index Keys (CIKs) and specific quarter-end dates:
import requestsimport pandas as pd# Base endpoint for accessing the API
api_endpoint = 'https://financialmodelingprep.com/api/v4/institutional-ownership/portfolio-holdings'# Dates corresponding to quarter-ends for both 2022 and 2023
quarter_dates = [
'2022-03-31', '2022-06-30', '2022-09-30', '2022-12-31',
'2023-03-31', '2023-06-30', '2023-09-30', '2023-12-31',
'2024-03-31'
]# List of Central Index Keys (CIKs) for target institutions
institution_ciks = [
'0001067983',
'0000019617',
'0000070858',
'0000895421',
'0000886982'
]# Container DataFrame for collecting results
combined_holdings = pd.DataFrame()# API authentication token (replace with actual token)
authentication_key = ''# Loop through each CIK and quarter-end date to retrieve data
for institution_id in institution_ciks:
for quarter_date in quarter_dates:
# Build request parameters dynamically
request_parameters = {
'date': quarter_date, # Current quarter-end
'cik': institution_id, # Institution's CIK
'page': 0, # Fetching the first page only
'apikey': authentication_key
}
# Perform the GET request
api_response = requests.get(api_endpoint, params=request_parameters)
# Check if the API request succeeded
if api_response.status_code == 200:
# Parse the response JSON and convert it to a DataFrame
institution_data = api_response.json()
holdings_data = pd.DataFrame(institution_data)
# Append the retrieved data to the main DataFrame
combined_holdings = pd.concat([combined_holdings, holdings_data], ignore_index=True)
else:
print(f"Error retrieving data for CIK {institution_id} on {quarter_date}: {api_response.status_code}")
"reportDate": "2023-06-30",
"investorCIK": "0001067983",
"filingDate": "2023-08-14",
"ticker": "AAPL",
"companyName": "APPLE INC",
"securityType": "COMMON STOCK",
"cusipCode": "037833100",
"shareCategory": "SH",
"positionType": "Share",
"investmentControl": "DEFINED",
"currentWeight": 51.0035,
"previousWeight": 46.4386,
"weightChange": 4.5649,
"weightChangePercent": 9.8301,
"currentShares": 915560382,
"previousShares": 915560382,
"sharesDifference": 0,
"sharesChangePercent": 0,
"isNewHolding": false,
"initialReportDate": "2016-03-31",
"quarterClosePrice": 193.97,
"averagePurchasePrice": 41.18,
"isCompletelySold": false,
"currentOwnershipPercent": 5.7994,
"previousOwnershipPercent": 5.7994,
"ownershipChange": 0,
"ownershipChangePercent": 0,
"holdingDuration": 30,
"isIncludedInPerformance": true,
"currentMarketValue": 177591247296,
"previousMarketValue": 150975906991,
"marketValueDifference": 26615340305,
"marketValueChangePercent": 17.6289,
"currentPerformanceValue": 26615340304,
"previousPerformanceValue": 31302912039,
"performanceDifference": -4687571734,
"performanceRatePercent": 17.6289,
"holderName": "BERKSHIRE HATHAWAY INC",
"sectorTitle": "ELECTRONIC COMPUTERS"
}
The Financial Modeling Prep (FMP) API is an efficient solution for accessing institutional portfolio data, but there are scenarios where a completely open-source, automated method becomes necessary. In such cases, web scraping serves as a powerful alternative to collect data directly from the SEC’s EDGAR database.
Although effective, web scraping introduces specific challenges:
- Website Fragility: A scraped website may undergo structural changes, causing scrapers to break.
- Legal Considerations: The SEC’s terms of service and rate limits must be respected to avoid misuse.
- Maintenance Costs: Regular updates to the scraping logic are required to keep the tool functional.
Follow these streamlined steps to extract SEC data using Selenium:
Step 1 : Prepare the Setup and Retrieve Initial Data
- Objective: Configure the scraping script with relevant input variables, including the target institution’s Central Index Key (CIK) and HTTP headers to emulate browser behavior.
- Goal: Gather a list of filing URLs from the EDGAR database for further processing.
Here’s how:
- Set up Python along with Selenium for the web browser automation tasks.
- Input necessary credentials, such as the institution’s CIK, into your code as parameters.
- Use request headers to emulate legitimate browser activity, helping to minimize blocking by servers.
cik = '0001067983' # Example CIK for Berkshire Hathaway
headers = {'User-Agent': '[email protected]'}# Fetch initial filing URLs
response = requests.get(f"https://data.sec.gov/submissions/CIK{cik}.json", headers=headers)
# Handling of the response goes here
Phase 2 — Retrieving Comprehensive Holdings Data with Selenium: In this phase, the script goes through each filing URL to find and gather the XML URLs that contain information about the holdings. This step requires real-time interaction with the web page to guarantee that all required links are identified and collected.
# Initialize Selenium WebDriverdriver = webdriver.Chrome()
for filing_url in filing_urls:
driver.get(filing_url)
# Code to locate and extract XML URL
# Example snippet:
xml_url = driver.find_element(By.XPATH, "//a[contains(@href, '.xml')]").get_attribute('href')
Phase 3 — Data Compilation and Refinement: Once the detailed data is collected, it is parsed and organized into a well-structured format. This process includes cleaning and reformatting the data to ensure its usefulness and integrity, akin to API results but with more manual intervention for adjustment.
# Example snippet to parse and clean datadataframes = []
for xml_url in xml_urls:
data = pd.read_xml(xml_url)
cleaned_data = data.dropna() # Simplified cleaning example
dataframes.append(cleaned_data)final_df = pd.concat(dataframes)
2. Analyzing Investor Portfolio Evolution Over Time
After successfully retrieving institutional investors’ portfolio data, we can begin examining how their holdings change over time.
The first step involves converting the ‘date’ column into a datetime format. This enables us to easily extract the year and quarter from each data entry. Next, we select the relevant columns from the original DataFrame, renaming some for clarity. Specifically, we retain ‘year’, ‘quarter’, ‘investorName’, ‘symbol’ (which we rename to ‘ticker’), ‘weight’, and ‘marketValue’ (which we rename to ‘value’).
A critical task is calculating each investor’s total portfolio value for every quarter. We use the groupby() function to aggregate the 'value' by 'year', 'quarter', and 'investorName', and store the total portfolio value in a new column called 'total_portfolio_value'.
With the total portfolio value computed, we can then derive the percentage weight of each asset within the portfolio. This is done by creating a new column, ‘computed_weight’, which represents this percentage. Optionally, we can rename this column back to ‘weight’ to maintain consistency.
At this point, the transformed DataFrame contains all the essential information for examining changes in investor portfolios over time, including the year, quarter, investor name, ticker, weight, and total portfolio value.
import requestsimport pandas as pd
import matplotlib.pyplot as plt# Set the base URL for API access
api_endpoint = 'https://financialmodelingprep.com/api/v4/institutional-ownership/portfolio-holdings'# Define the quarter-end dates for the data we want to fetch
dates_of_interest = [
'2022-03-31', '2022-06-30', '2022-09-30', '2022-12-31',
'2023-03-31', '2023-06-30', '2023-09-30', '2023-12-31',
'2024-03-31'
]# List of institutional identifiers (CIKs) to query data for
investor_ciks = [
'0001067983',
'0000019617',
'0000070858',
'0000895421',
'0000886982'
]# Initialize an empty DataFrame to store all the retrieved data
consolidated_data = pd.DataFrame()# Placeholder for your actual API key
api_token = ''# Loop over the CIKs and quarter-end dates to retrieve data
for cik in investor_ciks:
for date in dates_of_interest:
# Set up the parameters for the API request
params = {
'date': date, # Use the current quarter-end date
'cik': cik, # Use the current investor CIK
'page': 0, # We assume you're interested only in the first page of results
'apikey': api_token
}
# Make the API request for the current CIK and quarter
api_response = requests.get(api_endpoint, params=params)
if api_response.status_code == 200:
# If the request succeeds, convert the data to a DataFrame and append it
fetched_data = api_response.json()
cik_data_df = pd.DataFrame(fetched_data)
consolidated_data = pd.concat([consolidated_data, cik_data_df], ignore_index=True)
else:
print(f"Failed to fetch data for CIK {cik} on {date}: {api_response.status_code}")
consolidated_data['date'] = pd.to_datetime(consolidated_data['date'])
consolidated_data['year'] = consolidated_data['date'].dt.year
consolidated_data['quarter'] = consolidated_data['date'].dt.quarter# Transform the data by selecting the relevant columns
portfolio_data = consolidated_data[['year', 'quarter', 'investorName', 'symbol', 'weight', 'marketValue']].copy()# Rename columns for better clarity
portfolio_data.rename(columns={'symbol': 'ticker', 'marketValue': 'value'}, inplace=True)# Convert the 'value' column to numeric for further calculations
portfolio_data['value'] = pd.to_numeric(portfolio_data['value'])# Calculate total portfolio value by grouping data by investor and quarter
portfolio_totals = portfolio_data.groupby(['year', 'quarter', 'investorName'])['value'].sum().reset_index(name='total_portfolio_value')# Merge the total portfolio value back to the original data
portfolio_data = pd.merge(portfolio_data, portfolio_totals, on=['year', 'quarter', 'investorName'])# Calculate each asset's weight in the total portfolio
portfolio_data['calculated_weight'] = portfolio_data['value'] / portfolio_data['total_portfolio_value']# Drop the original 'weight' column and rename the new weight column
portfolio_data.drop('weight', axis=1, inplace=True)
portfolio_data.rename(columns={'calculated_weight': 'weight'}, inplace=True)# Filter out assets with weight less than 5% for the final dataset
final_portfolio_data = portfolio_data[portfolio_data['weight'] > 0.05].copy()# Create a string for the quarter period
final_portfolio_data['period'] = final_portfolio_data['year'].astype(str) + ' Q' + final_portfolio_data['quarter'].astype(str)# Sort data for better clarity in the plot
final_portfolio_data.sort_values(by=['period', 'investorName', 'ticker'], inplace=True)# Calculate total value for each investor for each period
total_portfolio_values = final_portfolio_data.groupby(['period', 'investorName'])['value'].sum().reset_index(name='total_value')# Merge total portfolio value back to the final data
final_portfolio_data = pd.merge(final_portfolio_data, total_portfolio_values, on=['period', 'investorName'], how='left')# Format the total value as a string for easy reading
final_portfolio_data['total_value_text'] = '$' + (final_portfolio_data['total_value'] / 1e9).round(2).astype(str) + 'B'# Initialize a plot for the bar chart
fig, ax = plt.subplots(figsize=(24, 12))# Create dictionaries to track the unique labels
unique_periods = final_portfolio_data['period'].unique()
unique_investors = final_portfolio_data['investorName'].unique()
unique_tickers = final_portfolio_data['ticker'].unique()# Assign colors to each unique ticker
ticker_colors = {ticker: plt.cm.tab20(i/len(unique_tickers)) for i, ticker in enumerate(unique_tickers)}# Create index mappings for easy position plotting
period_index_map = {period: i for i, period in enumerate(unique_periods)}
investor_index_map = {investorName: i for i, investorName in enumerate(unique_investors)}# Set parameters for the bar plot
bar_width = 0.2
spacing_between_bars = 0.02# Create the bars for each period and investor's portfolio
for period in unique_periods:
period_index = period_index_map[period]
for investorName in unique_investors:
investor_portfolio = final_portfolio_data[(final_portfolio_data['period'] == period) & (final_portfolio_data['investorName'] == investorName)]
if not investor_portfolio.empty:
bar_bottom_position = 0
total_portfolio_text = investor_portfolio['total_value_text'].iloc[0]
for _, row in investor_portfolio.iterrows():
bar_position = period_index + investor_index_map[investorName]*(bar_width + spacing_between_bars)
asset_weight = row['weight']
ax.bar(bar_position, asset_weight, bottom=bar_bottom_position, width=bar_width, color=ticker_colors[row['ticker']], edgecolor='white')
bar_bottom_position += asset_weight
# Add text annotations within the bar
ax.text(bar_position, bar_bottom_position - (asset_weight / 2), f"{row['ticker']}\n{row['weight']*100:.1f}%\n${row['value']/1e9:.1f}B", ha='center', va='center', fontsize=9, rotation=90)
# Add total portfolio value beside the investor name
investor_label = f"{investorName} ({total_portfolio_text})"
ax.text(bar_position, bar_bottom_position + 0.03, investor_label, ha='center', va='bottom', fontsize=12, rotation=90, transform=ax.transData)
ax.set_xticks([i + (bar_width + spacing_between_bars) * len(unique_investors) / 2 - bar_width/2 for i in range(len(unique_periods))])
ax.set_ylim(0, ax.get_ylim()[1] * 1.4)
ax.set_xticklabels(unique_periods)
ax.set_ylabel('Weight (%)')
ax.set_title('Quarterly Investor Portfolio Breakdown (Assets > 5% Weight)', fontsize=20)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
Assessing the changes in the portfolios of institutional investors on a quarterly basis reveals valuable insights into their trading patterns.
1. Define a custom colormap
We initiate a custom colormap using LinearSegmentedColormap from Matplotlib’s colors module. This colormap visualizes relative portfolio weight fluctuations, where:
- Red indicates a reduction.
- White signifies minimal or no change.
- Green suggests an increase.
2. Extract unique quarters
Next, we extract all unique quarter-end dates from the DataFrame all_data_df and sort them in reverse order, starting with the latest quarter.
3. Process each quarter’s data
For each quarter, we apply the following steps:
- Filter the data for that specific quarter.
- Group the data by investor name and stock ticker (symbol), computing the average value of the relative_change_pct column.
- Transform the grouped data into a matrix format, where:
- Rows represent investors.
- Columns represent stock tickers.
- Each cell holds the average change in portfolio weight for the corresponding investor and stock.
4. Generate the heatmap plot
We use Seaborn’s heatmap() function to plot the data as a heatmap. The cells are colored using our previously defined colormap, with the midpoint representing no change (value = 0).
5. Enhance the heatmap’s appearance
The figure’s size, gridlines, and tick labels are customized to enhance visual clarity:
- Set the linewidths and line colors for the grid.
- Label the x and y-axes.
- Apply proper title and rotation settings for axis tick labels.
6. Display the heatmap
Finally, the plot is displayed with plt.show(), providing an accessible view of the investor activity across the quarter, including the buy and sell actions for each ticker.
import matplotlib.pyplot as pltimport seaborn as sns
from matplotlib.colors import LinearSegmentedColormap# Define a custom colormap from red (decrease) to white (no change) to green (increase)
colormap = ["red", "white", "green"] # Represents decrease, no change, and increase
color_map_name = "custom_map"
custom_colormap = LinearSegmentedColormap.from_list(color_map_name, colormap)# Extract all distinct quarter-year combinations
quarters_available = all_data_df['quarter_year'].unique()
quarters_available = sorted(quarters_available, key=lambda x: pd.Period(x), reverse=True)# Iterate over the quarters to generate the heatmap for each one
for current_quarter in quarters_available:
# Filter the data for the current quarter
current_quarter_data = all_data_df[all_data_df['quarter_year'] == current_quarter]
# Group the data by investor and asset ticker, calculating the mean relative change percentage
grouped_data = current_quarter_data.groupby(['investorName', 'symbol'])['relative_change_pct'].mean().reset_index()
# Pivot the grouped data into a matrix with investors as rows, tickers as columns, and relative change percentage as values
pivot_data = grouped_data.pivot_table(index='investorName', columns='symbol', values='relative_change_pct', fill_value=0)
# Plot the heatmap with custom figure size and gridline adjustments
plt.figure(figsize=(30, 15)) # Increase the figure size for clearer visibility
ax = sns.heatmap(pivot_data, cmap=custom_colormap, center=0, annot=False, cbar=True,
linewidths=0.05, linecolor='gray') # Customize gridline appearance
plt.title(f'Portfolio Change for {current_quarter} (Green: Increase, Red: Decrease, White: Little to No Change)', fontsize=16)
plt.xlabel('Ticker (Asset)', fontsize=14)
plt.ylabel('Investor', fontsize=14)
plt.xticks(rotation=90, fontsize=12) # Rotate and size x-axis labels
plt.yticks(rotation=0, fontsize=12) # Adjust y-axis label appearance
plt.show()
Below is a paraphrased approach to calculate the performance of an investment company’s portfolio, accounting for both price changes and the impact of buying or selling shares.
Step-by-Step Approach:
- Calculate Periodic Holding Changes
- Track changes in the number of shares held for each asset between consecutive periods. This helps identify when buying or selling activities took place.
- Calculate Periodic Market Value Changes
- Determine the changes in market value for each asset due to market price movements, excluding the effects of buy or sell actions.
3. Calculate Period Returns for Each Asset
- For each period, calculate the return for each asset, considering both changes in holding size and market value changes.
- The formula for the return of each asset can be represented as:
Where Net Cash Flow is the amount spent or received from buying/selling shares. This is approximated by multiplying the change in the number of shares by the average price in the period.
4. Aggregate Asset Returns to Compute Investment Company Returns
- For each period, compute the total return of the investment company by aggregating the individual asset returns.
5. Calculate Time-Weighted Return (TWR) for Each Investment Company (CIK)
- To calculate the overall performance, compound the returns across each period using the following formula:
- This reflects the true performance of an investment company, regardless of when cash flows occurred.
6. Adjust for Cash Flows
- If the net cash flows cannot be precisely tracked due to the lack of detailed transaction data, we simplify this step by assuming that all market value changes not explained by price fluctuations come from buying or selling shares.
df['quarter_period'] = pd.to_datetime(df['date']).dt.to_period('Q') # Extracting quarterly periods from dates
df['initialPricePerUnit'] = df['previousMarketValue'] / df['previousShareCount'] # Price at the start of the period
df['finalPricePerUnit'] = df['currentMarketValue'] / df['currentShareCount'] # Price at the end of the period
df['averagePricePerUnit'] = (df['initialPricePerUnit'] + df['finalPricePerUnit']) / 2 # Average price during the period
df['shareDifference'] = df['currentShareCount'] - df['previousShareCount'] # Difference in shareholding between periods
df['cashFlowFromTransaction'] = df['shareDifference'] * df['averagePricePerUnit'] # Cash inflow/outflow from transactions
df['marketValueAdjustment'] = (df['currentMarketValue'] - df['previousMarketValue']) - df['cashFlowFromTransaction'] # Adjustment for market value changes
df['adjustedReturnForPeriod'] = df['marketValueAdjustment'] / (df['previousMarketValue'] - df['cashFlowFromTransaction']) # Adjusted return formula# Aggregate the results by investment firm and quarter, averaging adjusted returns
quarterly_performance_by_firm = df.groupby(['firmName', 'quarter_period'])['adjustedReturnForPeriod'].mean().reset_index()# Reshape data to show investment firms as rows and quarters as columns for easier plotting
quarterly_returns_matrix = quarterly_performance_by_firm.pivot(index='firmName', columns='quarter_period', values='adjustedReturnForPeriod')# Convert PeriodIndex columns to strings for proper labeling in the plot
quarterly_returns_matrix.columns = quarterly_returns_matrix.columns.astype(str)# Plot the returns over each quarter for all firms
plt.figure(figsize=(25, 8)) # Set plot size
for firm in quarterly_returns_matrix.index: # Iterate through each firm
plt.plot(quarterly_returns_matrix.columns, quarterly_returns_matrix.loc[firm], marker='o', label=firm) # Plot return for each firm
plt.title('Quarterly Performance by Investment Firm', fontsize=16) # Plot title
plt.xlabel('Quarter', fontsize=12) # X-axis label
plt.ylabel('Adjusted Return', fontsize=12) # Y-axis label
plt.xticks(rotation=45) # Rotate x-axis labels to avoid overlap
plt.legend(title='Firm Name', bbox_to_anchor=(1.05, 1), loc='upper left') # Adjust legend to avoid overlap
plt.grid(True) # Add grid lines for readability
plt.tight_layout() # Adjust layout to make space for legend
plt.show() # Display the plot
The study of institutional investor portfolios holds various practical benefits for market participants.
5.1 Spotting Investment Opportunities
By monitoring the buying patterns and trading activities of major institutions, we can uncover valuable investment opportunities. Noticing the sectors, industries, or individual stocks catching the interest of sophisticated investors can signal potential undervalued or growing investments.
5.2 Gauging Market Sentiment
Changes in institutional portfolios, particularly their buying and selling behaviors, can offer a useful gauge of market sentiment. Watching how these large investors adjust their allocations provides insight into perceived risks and opportunities, helping investors make more informed decisions about their own portfolios.
5.3 Tracking Industry Movements
Assessing how portfolio allocations and weights shift across industries can highlight evolving trends and sectors with high growth potential. This analysis can be valuable not just for investors and analysts, but also for policymakers seeking to understand broader market dynamics and make well-informed choices.
Understanding the moves of institutional investors gives us a competitive advantage when identifying promising prospects or avoiding potential pitfalls. While it’s essential to keep pace with data sources and changes in reporting standards, staying current with these shifts can lead to insights that significantly impact investment success.