Indie Market Strategy: Uncovering Underserved Genres for Smart Investment

A Data-Driven Case Study on Steam Genre Performance

Executive Summary

This analysis identifies Open World Survival Craft as the leading candidate for investment. With low saturation in specialized niches, the genre delivers a strong sales floor and significant upside, with top performers exceeding $6M in revenue. Its high engagement also supports the retention needed to sustain profitability long after launch.

Tools used: PostgreSQL, Power BI, Python, Excel

PHASE 1: ASK

For this fictional case study, I am working with Raw Digital, an indie game publisher, to provide data-driven insights that guide their investment strategy.

This analysis seeks to answer the following business questions:

Capital Allocation: Which game genre should we invest in for our next project to maximize the chances of commercial success while avoiding market saturation?

Risk Mitigation: Which genres offer the best balance of market opportunity and downside protection, while still supporting long-term player engagement?

This analysis is prepared for the Portfolio Director at Raw Digital, the primary stakeholder.

PHASE 2: PREPARE

Data Sourcing Strategy

I engineered a custom extraction strategy aggregating data from three industry-standard sources.

  • SteamDB (Commercial Performance): Extracted review volumes and pricing history to calculate Sales Potential. (Source)

  • SteamSpy (Engagement Depth): Sourced median and average playtime minutes to quantify player retention. (Source)

  • Steam API (Taxonomy): Retrieved official game tags and metadata for accurate genre classification. (Source)

Data Integrity & Limitations

  • Reliability: Data is sourced via industry-standard tracking services (SteamDB, SteamSpy). These metrics serve as reliable comparative indicators for genre performance and saturation.

  • Proxies: In the absence of public revenue data, Total Review Counts were utilized as a validated proxy for unit sales.

PHASE 3: PROCESS

Tools & Architecture

  • Pipeline: Custom Python-based ETL pipeline (Extract, Transform, Load) to scrape, validate, and normalize raw data.

    Data extraction adhered to API rate limits and robots.txt protocols to ensure ethical data collection and prevent server load.

  • Warehousing: The resulting dataset was unified in a PostgreSQL data warehouse to enable complex relational queries between performance metrics and genre tags.

Scope & Filtering Criteria

To ensure the analysis focused on the Viable Indie Market relevant to Raw Digital, the raw dataset underwent strict filtering:

  1. Recency Filter: Limited to titles released within the 2024–2025 window.

  2. Traction Filter: Titles with <500 reviews were excluded to filter out hobbyist projects and noise.

  3. Genre Filter: Applied strictly to titles with the user-defined "Indie" tag on Steam.

  • Result: This process refined an initial pool of 2,671 releases down to a final number of 867 Verified Indie Titles.
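
The three filters above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual pipeline: the field names (`app_id`, `release_date`, `review_count`, `tags`) and the sample records are hypothetical, and the window is assumed to run from January 1, 2024 to December 31, 2025.

```python
from datetime import date

# Hypothetical raw records; field names are assumptions, not the project's schema.
raw_titles = [
    {"app_id": 1, "release_date": date(2024, 3, 14), "review_count": 1200, "tags": ["Indie", "RPG"]},
    {"app_id": 2, "release_date": date(2023, 11, 2), "review_count": 5000, "tags": ["Indie", "Action"]},
    {"app_id": 3, "release_date": date(2025, 1, 20), "review_count": 120, "tags": ["Indie", "Puzzle"]},
    {"app_id": 4, "release_date": date(2024, 8, 9), "review_count": 900, "tags": ["Action", "Shooter"]},
]

WINDOW_START = date(2024, 1, 1)   # assumed start of the 2024-2025 window
WINDOW_END = date(2025, 12, 31)   # assumed end of the 2024-2025 window
MIN_REVIEWS = 500                 # titles below this are excluded


def is_viable_indie(title):
    """Apply the recency, traction, and genre filters to one title."""
    in_window = WINDOW_START <= title["release_date"] <= WINDOW_END
    has_traction = title["review_count"] >= MIN_REVIEWS
    is_indie = "Indie" in title["tags"]
    return in_window and has_traction and is_indie


viable_titles = [t for t in raw_titles if is_viable_indie(t)]
print([t["app_id"] for t in viable_titles])
```

In this toy sample only the first title passes all three checks; the others fail on recency, traction, and the "Indie" tag respectively.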

Defining the Metrics

I calculated specific custom metrics to align with business goals:

  • Market Saturation: Aggregated volume of annual releases per genre.

  • Sales Potential: Review counts to estimate revenue tiers.

  • Engagement Baseline: Playtime in minutes to estimate player retention.

Data Cleaning Process:

  • Validated that all pricing metadata was extracted directly in USD, ensuring financial consistency across the dataset without the need for exchange rate normalization.

  • Parsed raw JSON arrays (e.g., ['Action', 'Indie', 'RPG']) into a relational SQL schema. This structure allows a single game to be accurately counted across multiple genres.

  • Transformed raw string inputs into strict SQL data formats (converting text dates to DATE objects and numeric strings to INTEGERS) to enable time-series analysis and mathematical aggregation.

  • Collected raw Steam Game IDs and tags, restructuring them into a custom relational schema to optimize query performance and workflow fluidity.
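
As a rough illustration of the type conversion and tag parsing described above, the sketch below turns one raw row into a typed game record plus one (game, tag) pair per tag, which is the shape a junction table would hold. The field names, the ISO date format, and the sample values are assumptions made for the example, not the project's real schema.

```python
import json
from datetime import datetime

# Hypothetical raw row as scraped: every value is still a string.
raw_row = {
    "app_id": "12345",
    "release_date": "2024-03-14",
    "review_count": "1200",
    "tags": '["Action", "Indie", "RPG"]',
}


def clean_row(row):
    """Convert string fields to typed values and split tags into link rows."""
    game = {
        "app_id": int(row["app_id"]),
        # Assumes ISO-formatted dates (YYYY-MM-DD) in the raw data.
        "release_date": datetime.strptime(row["release_date"], "%Y-%m-%d").date(),
        "review_count": int(row["review_count"]),
    }
    # One (app_id, tag) pair per tag, so a game can be counted in every genre it carries.
    game_tags = [(game["app_id"], tag) for tag in json.loads(row["tags"])]
    return game, game_tags


game, game_tags = clean_row(raw_row)
print(game)
print(game_tags)
```

Keeping tags in their own pair list rather than as a text column is what lets a later query group by genre without string matching.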

Data Limitations:

  • Missing Tags: About 10% of the games didn't return any genre tags (mainly due to store delistings or data gaps), so I excluded them from the genre charts to prevent errors.

  • Missing Playtime: SteamSpy didn't have playtime data for 12% of the list, so I calculated the retention benchmarks using the remaining 88% to keep the numbers accurate.

To get access to the full data cleaning process and data warehouse structure, please visit the GitHub page.

PHASE 4: ANALYSIS

1. Stability vs. Scalability: Which Genre Offers the Greatest Revenue Upside?

I use Median Review Count as a proxy for the "Sales Floor" and the 75th Percentile (Q3) as a proxy for the "Revenue Ceiling." To ensure statistical reliability, genres with fewer than 20 releases were excluded.
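
The floor and ceiling calculation can be sketched as follows. This is a hedged example using Python's standard library: the function name, the input shape, and the sample data are hypothetical, and the "inclusive" quantile method is an assumption, since the original analysis may use a different percentile definition (for instance PostgreSQL's `percentile_cont`).

```python
from collections import defaultdict
from statistics import median, quantiles

MIN_RELEASES = 20  # genres with fewer releases are excluded


def floor_and_ceiling(genre_reviews, min_releases=MIN_RELEASES):
    """Return {genre: (median, q3)} for genres with enough releases.

    genre_reviews is an iterable of (genre, review_count) pairs; a game with
    several tags contributes one pair per tag.
    """
    by_genre = defaultdict(list)
    for genre, reviews in genre_reviews:
        by_genre[genre].append(reviews)

    summary = {}
    for genre, counts in by_genre.items():
        if len(counts) < min_releases:
            continue  # too few titles for a stable percentile
        q3 = quantiles(counts, n=4, method="inclusive")[2]  # third cut point = Q3
        summary[genre] = (median(counts), q3)
    return summary


# Hypothetical data: 21 titles in one genre, 3 in another.
sample = [("Genre A", 500 + 100 * i) for i in range(21)]
sample += [("Genre B", 9000), ("Genre B", 700), ("Genre B", 650)]
print(floor_and_ceiling(sample))
```

"Genre B" is dropped despite containing a 9,000-review hit, which is the point of the minimum release threshold: a single outlier should not define a genre.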

The Star Performer: Open World Survival Craft (OWSC)

While Open World Survival Craft has a slightly lower median than some competitors, its ceiling potential is vastly superior—reaching up to 8.5x its median. This combination of high retention and explosive growth potential makes it our primary investment target.

The Watchlist: Stealth, Souls-like, Third Person Shooter

These genres exhibit a very strong "floor," meaning the typical (median) title performs well. However, a lower Q3 ceiling compared to OWSC suggests limited scalability, positioning them as secondary "Watchlist" candidates.

Next, I looked at the median selling price to see which genre offers the best revenue potential. Here again, Open World Survival Craft is the clear winner. This higher price point allows for better profit margins and higher gross revenue per user.


2. Avoiding Market Saturation: Which Genres Are Not Overcrowded?

Overlap: Because a game can carry several genre tags at once, genre release counts add up to more than the number of unique game releases.

To classify these overlapping categories, I applied K-Means clustering. This method avoids human bias by letting the data define the natural boundaries between Overcrowded and Niche markets based on statistical fracture points.
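
To make the clustering step concrete, here is a minimal sketch of K-Means with two clusters on a single feature. It assumes the only input is each genre's release count, which the text does not state explicitly; the function names and the sample counts are hypothetical, and a library such as scikit-learn would normally replace the hand-written loop.

```python
def kmeans_1d_two_clusters(values, max_iter=100):
    """Split values into a low and a high cluster with 1-D K-Means (k=2)."""
    # Start the two centers at the extremes of the data.
    low_center, high_center = float(min(values)), float(max(values))
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest center.
        low = [v for v in values if abs(v - low_center) <= abs(v - high_center)]
        high = [v for v in values if abs(v - low_center) > abs(v - high_center)]
        # Update step: each center moves to the mean of its members.
        new_low = sum(low) / len(low)
        new_high = sum(high) / len(high) if high else high_center
        if (new_low, new_high) == (low_center, high_center):
            break  # centers stopped moving
        low_center, high_center = new_low, new_high
    return low_center, high_center


def label_genres(release_counts):
    """Label each genre 'Niche' or 'Overcrowded' by its nearest cluster center."""
    low_center, high_center = kmeans_1d_two_clusters(list(release_counts.values()))
    return {
        genre: "Niche" if abs(n - low_center) <= abs(n - high_center) else "Overcrowded"
        for genre, n in release_counts.items()
    }


# Hypothetical release counts per genre.
counts = {"Genre A": 25, "Genre B": 30, "Genre C": 40, "Genre D": 210, "Genre E": 260}
labels = label_genres(counts)
print(labels)
```

With these sample counts the boundary settles between 40 and 210 releases, so the split comes from the gap in the data rather than from a hand-picked threshold.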

The top 6 highest-rated genres all belong to the "Niche" segment, revealing a high-value opportunity to capture engaged audiences with minimal market competition.


3. Engagement & Retention: Which Genres Offer the Strongest Engagement?

High engagement is a leading indicator for long-term revenue. Genres with deep retention are ideal candidates for post-launch monetization (DLC, expansions, cosmetics) and sustained community growth. To evaluate this, I compared the median vs. 75th percentile playtime for the 6 high-potential candidates identified in the previous section.

Star Performer: Open World Survival Craft is once again the clear winner. It combines a solid baseline with an exceptional engagement ceiling (Q3), confirming it as the most promising choice for long-term retention.

High Engagement: indicating potential core mechanics that can be leveraged to secure good playtime.


4. Comparative Analysis: Ranking the Final Candidates

To compare the different metrics, I normalized each one to a 1–5 score based on percentile ranks (median/Q3 reviews, median playtime, and release counts).

Weights were assigned based on the relative importance of each metric to investment decisions. Floor, Upside, and Engagement are weighted equally at 18% each to balance downside protection, breakout potential, selling price and player retention. Saturation is given a lower weight (10%) because while market crowding affects risk, it is less determinative of overall performance in comparison to these core metrics.
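
A minimal sketch of this scoring approach is shown below. Several details are assumptions: the exact percentile-rank formula, the metric keys (`floor`, `upside`, `engagement`, `saturation`), and the sample values are hypothetical. Because the weights stated in the text do not add up to 100%, the sketch divides by the total weight so the composite stays on the 1–5 scale.

```python
# Weights as stated in the text; any other weighted metric is not modeled here.
WEIGHTS = {"floor": 0.18, "upside": 0.18, "engagement": 0.18, "saturation": 0.10}
LOWER_IS_BETTER = {"saturation"}  # fewer releases means less crowding


def percentile_scores(values, lower_is_better=False):
    """Map raw values to 1-5 scores by percentile rank within the list."""
    n = len(values)
    scores = []
    for v in values:
        if lower_is_better:
            beaten = sum(1 for other in values if other > v)
        else:
            beaten = sum(1 for other in values if other < v)
        pct = beaten / (n - 1) if n > 1 else 1.0
        scores.append(1 + 4 * pct)  # 0th percentile -> 1, 100th -> 5
    return scores


def composite_scores(metrics):
    """metrics: {genre: {metric_name: raw_value}} -> {genre: weighted 1-5 score}."""
    genres = list(metrics)
    total_weight = sum(WEIGHTS.values())
    totals = {g: 0.0 for g in genres}
    for name, weight in WEIGHTS.items():
        column = [metrics[g][name] for g in genres]
        for g, s in zip(genres, percentile_scores(column, name in LOWER_IS_BETTER)):
            totals[g] += weight * s
    return {g: round(t / total_weight, 2) for g, t in totals.items()}


# Hypothetical inputs: median reviews, Q3 reviews, median playtime (min), release count.
sample_metrics = {
    "Genre A": {"floor": 900, "upside": 7000, "engagement": 480, "saturation": 25},
    "Genre B": {"floor": 1200, "upside": 4000, "engagement": 300, "saturation": 40},
    "Genre C": {"floor": 700, "upside": 3000, "engagement": 200, "saturation": 200},
}
scores = composite_scores(sample_metrics)
print(sorted(scores, key=scores.get, reverse=True))
```

In this sample "Genre A" ranks first even though "Genre B" has the higher floor, which mirrors how a genre with a slightly lower median can still win on upside, engagement, and saturation.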

As expected, Open World Survival Craft is our winner here, with dominant values across all our metrics.

Looter Shooter and Third Person Shooter are our plan B, offering lower review counts (a proxy for initial sales) but strong saturation scores and potential post-release profit.

Genres that perform better at release tend to have weaker retention, which makes them riskier choices: if the initial launch is not a success, the chance of post-launch revenue is much lower.

PHASE 5: ACT & RECOMMENDATION

The data shows that Raw Digital should focus on Open World Survival Craft as its primary target:

  • Why: This genre offers the best mix across all our metrics, combining High Safety (Top 6 Median), High Ceiling (8.5x Multiplier), High Retention, and a higher median selling price.

  • Context: The market is currently underserved. With low release volume in 2024–2025, a polished entry faces minimal competition while tapping into a high-demand player base.

  • Strategy: Scout for teams with strong technical foundations to capture the post-launch revenue opportunity (DLC/Live Ops).

Using a standard Steam Revenue Calculator model and the genre's median price, we estimate gross revenue between $1M (Median performance) and $6M (Q3 'Hit' performance). Furthermore, engagement metrics show a median playtime of 8 hours, doubling to 16 hours for top-quartile titles, creating an opportunity for post-launch profit.
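
For readers unfamiliar with review-based revenue estimates, the general shape of such a calculation is sketched below. The parameters of the calculator used in this study are not given here, so the sales-per-review multiplier of 30 and the sample inputs are placeholder assumptions chosen only to show the arithmetic; they do not reproduce the $1M and $6M figures.

```python
def estimate_gross_revenue(review_count, price_usd, sales_per_review=30):
    """Estimate gross revenue as reviews x sales-per-review x price.

    sales_per_review is a placeholder assumption, not a value from this study.
    """
    estimated_units = review_count * sales_per_review
    return estimated_units * price_usd


# Hypothetical inputs: 1,000 reviews at a $20.00 price point.
print(estimate_gross_revenue(1000, 20.0))
```

The estimate is gross revenue only: platform fees, refunds, discounts, and regional pricing would all reduce the amount a publisher actually receives.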

Beyond our winning genre, the genres on the watchlist could be secondary investment options. While they don’t outperform Open World Survival Craft in overall potential, they offer portfolio diversification.

Looter Shooter and Third Person Shooter offer strong retention potential but a lower sales ceiling, which means less initial revenue but good potential for long-term revenue.