# Universe Selection Strategy Question: Why Only 585 Stocks Instead of 900?

## Problem Summary

Our algorithm is designed to monitor **900 stocks** (`UNIVERSE_SIZE = 900`), but consistently selects only **585 stocks**, which are then saved to ObjectStore and reused on every restart.

**Key Questions:**
1. Why does the system select 900 stocks in Coarse Selection but only 585 pass Fine Selection?
2. What is the best strategy for stock selection and storage?
3. Should we change the approach to select from ALL available stocks instead of pre-filtering to 900?

---

## Current Behavior

### Configuration
```python
# In config.py
UNIVERSE_SIZE = 900  # Target number of stocks
MAX_SHARES_OUTSTANDING = 30_000_000  # 30 million shares (Low Float)
MIN_MARKET_CAP = 5_000_000  # $5M (Micro-cap)
MAX_MARKET_CAP = 100_000_000  # $100M (Micro-cap)
OBJECT_STORE_KEY = "filtered_universe_900"
USE_OBJECT_STORE = True
```

### Actual Results from Logs
```
[CoarseSelection] Total available stocks: 10,000
[CoarseSelection] After price filter ($1-$50): 900 stocks ✅

[FineSelection] Received 900 stocks from Coarse Selection ✅
[FineSelection] After Market Cap + Shares filter: 585 stocks ⚠️
[FineSelection] Sorted and selected best 585 stocks (Low Float)
[FineSelection] Sending 585 stocks to on_securities_changed

[OnSecuritiesChanged] Added 586, Removed 0
[OnSecuritiesChanged] Total active stocks: 585

[ObjectStore] Saved 585 stocks to ObjectStore for next startup
```

**Result:** Only **585 stocks** (65% of target) are monitored, and this number remains fixed across all restarts.

---

## Complete Function Code

### Function 1: `coarse_selection()`

```python
def coarse_selection(self, coarse):
    """
    Initial filtering based on price and volume
    """
    try:
        # Skip if the universe was already loaded from ObjectStore
        if self.bootstrap_completed:
            return Universe.UNCHANGED

        self.log(f"[CoarseSelection] called at {self.time}")

        coarse_list = list(coarse)
        self.log(f"[CoarseSelection] Total available stocks: {len(coarse_list):,}")

        # Filter by price (and require fundamental data)
        filtered = [
            x for x in coarse_list
            if config.MIN_PRICE <= x.price <= config.MAX_PRICE
            and x.has_fundamental_data
        ]

        self.log(f"[CoarseSelection] After price filter: {len(filtered)} stocks")

        # Keep the top UNIVERSE_SIZE (900) stocks by dollar volume
        sorted_by_volume = sorted(filtered, key=lambda x: x.dollar_volume, reverse=True)
        selected = sorted_by_volume[:config.UNIVERSE_SIZE]

        self.log(f"[CoarseSelection] Coarse Selection finished - sending {len(selected)} stocks to fine_selection")

        return [x.symbol for x in selected]

    except Exception as e:
        self.error(f"[CoarseSelection] Error: {e}")
        return []
```

**Issues:**
1. ✅ Selects 900 stocks based on price and dollar volume
2. ⚠️ Only considers stocks with `has_fundamental_data = True`
3. ⚠️ Sorts by dollar volume, not by shares outstanding
4. ❌ Doesn't apply shares outstanding filter here

---

### Function 2: `fine_selection()`

```python
def fine_selection(self, fine):
    """
    Final filtering based on Market Cap and Shares Outstanding
    """
    try:
        fine_list = list(fine)

        self.log(f"[FineSelection] called at {self.time}")
        self.log(f"[FineSelection] Received {len(fine_list)} stocks from Coarse Selection")

        # Filter by Market Cap + Shares Outstanding
        filtered = []
        debug_count = 0

        for x in fine_list:
            try:
                # Fetch the data
                shares_outstanding = self._get_shares_outstanding(x)
                market_cap = x.MarketCap if hasattr(x, 'MarketCap') else 0

                # Log the first 10 stocks for diagnostics
                if debug_count < 10:
                    from helpers import safe_format
                    shares_str = safe_format(shares_outstanding, ",.0f", log_warning=True,
                                             algo=self, field_name=f"{x.symbol.value}_shares")
                    market_cap_str = "$" + safe_format(market_cap, ",.0f", log_warning=True,
                                                       algo=self, field_name=f"{x.symbol.value}_market_cap")
                    self.log(f"[FineSelection] {x.symbol.value}: Shares={shares_str}, MarketCap={market_cap_str}")
                    debug_count += 1

                # Apply the filters
                if (shares_outstanding and shares_outstanding <= config.MAX_SHARES_OUTSTANDING
                        and market_cap >= config.MIN_MARKET_CAP
                        and market_cap <= config.MAX_MARKET_CAP):
                    filtered.append((x, shares_outstanding))

            except Exception as e:
                if debug_count < 10:
                    self.log(f"[FineSelection] {x.symbol.value}: Error - {e}")
                    debug_count += 1
                continue

        self.log(f"[FineSelection] After Market Cap + Shares filter: {len(filtered)} stocks")

        # Sort by Shares Outstanding (ascending, Low Float first)
        sorted_by_shares = sorted(filtered, key=lambda x: x[1])

        # Take up to UNIVERSE_SIZE (900) stocks
        selected = [x[0].symbol for x in sorted_by_shares[:config.UNIVERSE_SIZE]]

        self.log(f"[FineSelection] Sorted and selected best {len(selected)} stocks (Low Float)")
        self.log(f"[FineSelection] Fine Selection finished - sending {len(selected)} stocks to on_securities_changed")

        return selected

    except Exception as e:
        self.error(f"[FineSelection] Error: {e}")
        return []
```

**Issues:**
1. ✅ Applies Market Cap and Shares Outstanding filters
2. ✅ Sorts by shares outstanding (Low Float focus)
3. ❌ **Only 585 stocks pass the filters** (315 rejected = 35%)
4. ⚠️ Returns 585 stocks instead of 900

---

### Function 3: `_get_shares_outstanding()`

```python
def _get_shares_outstanding(self, fine_fundamental):
    """
    V7.3.5: Extract the free-float share count from fundamental data (improved)

    Tries several sources in order:
    1. CompanyReference.SharesOutstanding (most reliable)
    2. EarningReports.BasicAverageShares
    3. FinancialStatements.SharesOutstanding
    4. CompanyProfile.SharesOutstanding

    Note: these fields report shares outstanding (or average shares),
    which approximates the float rather than measuring it directly.
    """
    try:
        # Attempt 1: CompanyReference.SharesOutstanding
        if hasattr(fine_fundamental, 'CompanyReference') and hasattr(fine_fundamental.CompanyReference, 'SharesOutstanding'):
            shares = fine_fundamental.CompanyReference.SharesOutstanding
            if shares and shares > 0:
                return shares

        # Attempt 2: EarningReports.BasicAverageShares
        if hasattr(fine_fundamental, 'EarningReports') and hasattr(fine_fundamental.EarningReports, 'BasicAverageShares'):
            shares = fine_fundamental.EarningReports.BasicAverageShares.ThreeMonths
            if shares and shares > 0:
                return shares

        # Attempt 3: FinancialStatements.SharesOutstanding
        if hasattr(fine_fundamental, 'FinancialStatements') and hasattr(fine_fundamental.FinancialStatements, 'SharesOutstanding'):
            shares = fine_fundamental.FinancialStatements.SharesOutstanding.ThreeMonths
            if shares and shares > 0:
                return shares

        # Attempt 4: CompanyProfile.SharesOutstanding
        if hasattr(fine_fundamental, 'CompanyProfile') and hasattr(fine_fundamental.CompanyProfile, 'SharesOutstanding'):
            shares = fine_fundamental.CompanyProfile.SharesOutstanding
            if shares and shares > 0:
                return shares

        return None  # No share count found in any source

    except Exception:
        # Any lookup failure is treated as "no data"
        return None
```
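The four-step fallback above repeats the same pattern for each source. As a minimal sketch (not the code we currently run), the chain could be expressed as a list of dotted attribute paths plus one generic resolver. The names `SHARE_FIELD_PATHS`, `resolve_path`, and `first_positive_share_count` are hypothetical, and the `SimpleNamespace` object is only a toy stand-in for a fundamental record. Returning the source path along with the value would also let the logs show which field supplied each share count.

```python
from types import SimpleNamespace

# Hypothetical constant: the same fallback order as the function above.
SHARE_FIELD_PATHS = (
    "CompanyReference.SharesOutstanding",
    "EarningReports.BasicAverageShares.ThreeMonths",
    "FinancialStatements.SharesOutstanding.ThreeMonths",
    "CompanyProfile.SharesOutstanding",
)


def resolve_path(obj, dotted_path):
    """Follow a dotted attribute path; return None if any hop is missing."""
    current = obj
    for name in dotted_path.split("."):
        current = getattr(current, name, None)
        if current is None:
            return None
    return current


def first_positive_share_count(fundamental, paths=SHARE_FIELD_PATHS):
    """Return (value, path) for the first positive value, else (None, None)."""
    for path in paths:
        value = resolve_path(fundamental, path)
        try:
            if value is not None and value > 0:
                return value, path
        except TypeError:
            # Value is not comparable to a number; try the next source.
            continue
    return None, None


# Toy record: the first source is zero, so the second one is used.
sample = SimpleNamespace(
    CompanyReference=SimpleNamespace(SharesOutstanding=0),
    EarningReports=SimpleNamespace(
        BasicAverageShares=SimpleNamespace(ThreeMonths=12_500_000)
    ),
)
value, source = first_positive_share_count(sample)
```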

---

### Function 4: `on_securities_changed()` - Storage Logic

```python
def on_securities_changed(self, changes):
    """Handle universe changes"""
    try:
        self.universe_updates_count += 1

        # ... (code for adding/removing stocks)

        # V7.2: Save the new list to ObjectStore for the next startup (bot recommendation)
        if config.USE_OBJECT_STORE and self.universe_updates_count >= 1:
            try:
                symbols_list = [symbol.value for symbol in self.symbol_data_dict.keys() if symbol != self.spy_symbol]
                if len(symbols_list) > 0:
                    symbols_str = ','.join(symbols_list)
                    self.object_store.save(config.OBJECT_STORE_KEY, symbols_str)
                    self.log(f"[ObjectStore] Saved {len(symbols_list)} stocks to ObjectStore for next startup")
            except Exception as e:
                self.error(f"[ObjectStore] Save error: {e}")

    except Exception as e:
        self.error(f"[OnSecuritiesChanged] Error: {e}")
```

**Storage Behavior:**
1. ✅ Saves whatever stocks were selected (585 in this case)
2. ⚠️ Saves immediately after first universe update
3. ⚠️ Every restart loads the same 585 stocks
4. ❌ No mechanism to refresh or update the list
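One reason the list can never be refreshed is that the saved string carries no record of when it was produced. A minimal sketch, assuming we are free to change the stored format: wrap the tickers in a small JSON payload with a timestamp, and keep reading the existing comma-separated format as a fallback. `encode_universe` and `decode_universe` are hypothetical helper names; the resulting string would be passed to the same `self.object_store.save(config.OBJECT_STORE_KEY, ...)` call shown above.

```python
import json
from datetime import datetime


def encode_universe(symbols, saved_at):
    """Serialize de-duplicated tickers together with the selection time."""
    return json.dumps(
        {"saved_at": saved_at.isoformat(), "symbols": sorted(set(symbols))}
    )


def decode_universe(payload):
    """Return (symbols, saved_at); saved_at is None for the legacy format."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        data = None
    if isinstance(data, dict):
        return data["symbols"], datetime.fromisoformat(data["saved_at"])
    # Legacy format: plain comma-separated tickers, no timestamp available.
    return [s for s in payload.split(",") if s], None


payload = encode_universe(["B", "A", "B"], datetime(2024, 1, 2, 9, 30))
symbols, saved_at = decode_universe(payload)
```

With a timestamp available, the loader can decide whether the stored list is still fresh enough to reuse.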

---

### Function 5: `_load_target_symbols_from_objectstore()` - Loading Logic

```python
def _load_target_symbols_from_objectstore(self):
    """V7.3.7: Load the target symbol list from ObjectStore (Fixed)"""
    try:
        if self.object_store.contains_key(config.OBJECT_STORE_KEY):
            symbols_str = self.object_store.read(config.OBJECT_STORE_KEY)

            if not symbols_str or len(symbols_str) == 0:
                self.log("[ObjectStore] Empty data in ObjectStore")
                return False

            self.target_symbols = symbols_str.split(',')

            if not self.target_symbols or len(self.target_symbols) == 0:
                self.log("[ObjectStore] No symbols after split")
                return False

            self.log(f"[ObjectStore] Loaded {len(self.target_symbols)} stocks")

            if len(self.target_symbols) >= 10:
                sample = ', '.join(self.target_symbols[:10])
                self.log(f"[ObjectStore] Sample: {sample}...")

            return True  # ✅ Success

        else:
            self.log("[ObjectStore] No saved list found")
            return False

    except Exception as e:
        self.error(f"[ObjectStore] Load error: {e}")
        import traceback
        self.error(traceback.format_exc())
        return False
```

---

## Analysis: Why Only 585 Stocks?

### Breakdown of Rejected Stocks (315 out of 900)

The debug log shows a typical rejection. AAT exceeds both the 30M share limit and the $100M market cap limit:

```
[FineSelection] AAT: Shares=60,540,125.0, MarketCap=$1,240,173,551
```

**Rejection Reasons:**

| Reason | Estimated Count | Percentage |
| :--- | ---: | ---: |
| **Shares Outstanding > 30M** | ~250 | 28% |
| **Market Cap > $100M** | ~50 | 5% |
| **Market Cap < $5M** | ~10 | 1% |
| **Missing fundamental data** | ~5 | 1% |
| **Total Rejected** | **315** | **35%** |
| **Accepted** | **585** | **65%** |
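The counts in the table are estimates. A minimal sketch of how they could be measured instead: classify each stock by the first filter it fails and tally the results. `rejection_reason` and `tally_rejections` are hypothetical helpers; the default thresholds are the values from `config.py`. A stock that fails several filters is counted only under the first failing check, so the order of the checks affects the breakdown.

```python
from collections import Counter


def rejection_reason(shares, market_cap, max_shares, min_cap, max_cap):
    """Return None if accepted, otherwise the first failing reason."""
    if not shares or market_cap is None:
        return "missing_data"
    if shares > max_shares:
        return "shares_above_max"
    if market_cap > max_cap:
        return "market_cap_above_max"
    if market_cap < min_cap:
        return "market_cap_below_min"
    return None


def tally_rejections(rows, max_shares=30_000_000,
                     min_cap=5_000_000, max_cap=100_000_000):
    """Count outcomes for an iterable of (shares, market_cap) pairs."""
    counts = Counter()
    for shares, market_cap in rows:
        reason = rejection_reason(shares, market_cap, max_shares, min_cap, max_cap)
        counts[reason or "accepted"] += 1
    return counts


# Toy rows; the first one uses the AAT values from the log line above.
rows = [
    (60_540_125, 1_240_173_551),
    (10_000_000, 50_000_000),
    (None, 20_000_000),
    (5_000_000, 2_000_000),
    (20_000_000, 500_000_000),
]
counts = tally_rejections(rows)
```

Logging `counts` once per selection would replace the estimated table with actual numbers.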

---

## Questions for the Bot

### Question 1: Current Strategy - Is It Optimal?

**Current Approach:**
```
Step 1: Coarse Selection
 - Filter by price ($1-$50)
 - Sort by dollar volume
 - Select top 900 stocks
 
Step 2: Fine Selection
 - Filter by shares outstanding (<= 30M)
 - Filter by market cap ($5M-$100M)
 - Sort by shares outstanding (Low Float)
 - Select top 900 (but only 585 pass filters)
 
Step 3: ObjectStore
 - Save 585 stocks
 - Reuse on every restart
```

**Is this the best approach for finding Low Float stocks?**

---

### Question 2: Alternative Strategy - Process ALL Stocks?

**Proposed Alternative:**
```
Step 1: Coarse Selection
 - Return ALL stocks (10,000+)
 - Or return all stocks with has_fundamental_data = True
 
Step 2: Fine Selection
 - Filter by price ($1-$50)
 - Filter by shares outstanding (<= 30M)
 - Filter by market cap ($5M-$100M)
 - Sort by shares outstanding (Low Float)
 - Select top 900 stocks
```

**Advantages:**
- ✅ Considers ALL available stocks
- ✅ More likely to find 900 stocks that meet criteria
- ✅ Better Low Float selection

**Disadvantages:**
- ❌ Higher computational cost in Fine Selection
- ❌ Longer processing time
- ❌ More data to process

**Question:** Is this approach feasible in QuantConnect? Are there performance or cost implications?

---

### Question 3: Why Pre-filter to 900 in Coarse Selection?

**Current Logic:**
```python
# In coarse_selection()
sorted_by_volume = sorted(filtered, key=lambda x: x.dollar_volume, reverse=True)
selected = sorted_by_volume[:config.UNIVERSE_SIZE]  # Only top 900 by volume
```

**Problem:**
- We sort by **dollar volume**, not by **shares outstanding**
- High dollar volume stocks often have **high float** (many shares)
- We're selecting the WRONG 900 stocks for Low Float strategy!

**Example:**
- Stock A: $50M dollar volume, 100M shares outstanding → Selected ✅ (but high float!)
- Stock B: $10M dollar volume, 10M shares outstanding → Rejected ❌ (but low float!)

**Question:** Should we change the sorting criteria in Coarse Selection to prioritize Low Float stocks?
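The effect described in this question can be reproduced on toy data. The sketch below compares the current order (take the top N by dollar volume, then filter by share count) with the reverse order (filter by share count, then rank by lowest share count). The `Candidate` class, both function names, and tickers C and D are hypothetical; A and B use the numbers from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    ticker: str
    dollar_volume: float
    shares_outstanding: float


def prefilter_then_filter(candidates, top_n, max_shares):
    """Current order: top N by dollar volume first, share filter second."""
    top = sorted(candidates, key=lambda c: c.dollar_volume, reverse=True)[:top_n]
    return [c.ticker for c in top if c.shares_outstanding <= max_shares]


def filter_then_rank(candidates, top_n, max_shares):
    """Alternative order: share filter first, then lowest share count."""
    eligible = [c for c in candidates if c.shares_outstanding <= max_shares]
    ranked = sorted(eligible, key=lambda c: c.shares_outstanding)
    return [c.ticker for c in ranked[:top_n]]


toy = [
    Candidate("A", 50_000_000, 100_000_000),
    Candidate("B", 10_000_000, 10_000_000),
    Candidate("C", 40_000_000, 80_000_000),
    Candidate("D", 5_000_000, 8_000_000),
]

current = prefilter_then_filter(toy, top_n=2, max_shares=30_000_000)
alternative = filter_then_rank(toy, top_n=2, max_shares=30_000_000)
```

On this toy set the current order keeps nothing, because both pre-selected names are high float, while the alternative order keeps D and B.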

---

### Question 4: Optimal MAX_SHARES_OUTSTANDING Value

**Current:** `MAX_SHARES_OUTSTANDING = 30_000_000` (30M)

**Analysis:**
- Only 585 of the 900 pre-selected stocks pass the combined shares (<= 30M) and market cap filters
- For a 900-stock target, this combination is **too restrictive**

**Question:** What is a reasonable value for Low Float stocks?

**Market Standards:**
- **Micro Float:** < 10M shares
- **Low Float:** 10M - 50M shares
- **Medium Float:** 50M - 100M shares
- **High Float:** > 100M shares

**Proposed Values:**
- Conservative: 50M (estimated ~750 stocks, not yet measured)
- Moderate: 75M (estimated ~850 stocks, not yet measured)
- Aggressive: 100M (estimated ~900 stocks, not yet measured)

**Question:** What value do you recommend for a Low Float strategy targeting 900 stocks?
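Rather than guessing how many stocks each threshold would admit, the counts could be computed from the share counts already gathered in `fine_selection`. A minimal sketch, using a hypothetical helper `count_passing_by_threshold` and made-up share counts. The input should contain only stocks that already pass the price and market cap filters, since the $100M market cap limit can be the binding constraint regardless of the share limit.

```python
from bisect import bisect_right


def count_passing_by_threshold(share_counts, thresholds):
    """Map each threshold to the number of share counts at or below it."""
    ordered = sorted(share_counts)
    return {t: bisect_right(ordered, t) for t in thresholds}


# Made-up share counts, for illustration only.
share_counts = [8_000_000, 10_000_000, 30_000_000,
                45_000_000, 60_000_000, 120_000_000]
thresholds = (30_000_000, 50_000_000, 75_000_000, 100_000_000)

passing = count_passing_by_threshold(share_counts, thresholds)
```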

---

### Question 5: Storage Strategy - When to Refresh?

**Current Behavior:**
- Save 585 stocks to ObjectStore after first selection
- Load same 585 stocks on every restart
- **Never refresh the list**

**Problems:**
1. Stocks may no longer meet criteria (float increased, market cap changed)
2. New IPOs or stocks that now meet criteria are never added
3. Delisted stocks remain in the list

**Question:** What is the best strategy for refreshing the universe?

**Options:**

**Option A: Weekly Refresh**
```python
if self.time.weekday() == 0 and self.time.hour == 0:  # Monday midnight
   self.object_store.delete(config.OBJECT_STORE_KEY)
   self.bootstrap_completed = False
   # Trigger new universe selection
```

**Option B: Monthly Refresh**
```python
if self.time.day == 1 and self.time.hour == 0:  # First day of month
   self.object_store.delete(config.OBJECT_STORE_KEY)
   self.bootstrap_completed = False
```

**Option C: Conditional Refresh**
```python
# Refresh if:
# - Number of active stocks drops below threshold (e.g., < 500)
# - Manual trigger
# - Significant market event
```

**Option D: No ObjectStore (Always Use Universe Selection)**
```python
USE_OBJECT_STORE = False
# Always run universe selection (slower startup but always fresh)
```
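Options A to C can be combined into one decision function that is evaluated at startup or on a schedule. This is a sketch under assumptions: it presumes the stored list carries a timestamp (the current comma-separated format does not), `should_refresh` is a hypothetical name, and the defaults of 7 days and 500 stocks are taken from Option A and the Option C example.

```python
from datetime import datetime, timedelta


def should_refresh(saved_at, now, active_count,
                   max_age=timedelta(days=7), min_active=500):
    """Decide whether the stored universe should be rebuilt."""
    if saved_at is None:
        # No timestamp (legacy format or nothing stored): rebuild.
        return True
    if active_count < min_active:
        # Too few active stocks left (Option C).
        return True
    # Age-based refresh (Option A with 7 days; use ~30 days for Option B).
    return now - saved_at >= max_age


now = datetime(2024, 1, 15)
fresh = should_refresh(datetime(2024, 1, 12), now, active_count=585)
stale = should_refresh(datetime(2024, 1, 1), now, active_count=585)
thin = should_refresh(datetime(2024, 1, 14), now, active_count=400)
```

When the function returns `True`, the algorithm would delete the stored key and reset `bootstrap_completed`, as in Options A and B.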

---

### Question 6: Performance and Cost Considerations

**Current Setup:**
- Coarse Selection: ~10,000 stocks → 900 stocks
- Fine Selection: 900 stocks → 585 stocks
- Processing time: 5-20 minutes

**If we process ALL stocks in Fine Selection:**
- Coarse Selection: ~10,000 stocks → ~10,000 stocks (no filtering)
- Fine Selection: ~10,000 stocks → 900 stocks
- Processing time: ??? (unknown)

**Questions:**
1. Is there a performance penalty for processing 10,000 stocks in Fine Selection?
2. Are there QuantConnect API rate limits or costs for accessing fundamental data?
3. What is the recommended approach for large-scale universe selection?
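On the pure-Python side, picking the 900 lowest share counts out of roughly 10,000 candidates does not require a full sort. A minimal sketch using `heapq.nsmallest` from the standard library follows; `lowest_share_counts` is a hypothetical helper. This only addresses the ranking step. It says nothing about the time QuantConnect needs to load fundamental data, which we have not measured.

```python
import heapq


def lowest_share_counts(pairs, limit):
    """Return up to `limit` symbols with the smallest share counts.

    `pairs` is an iterable of (symbol, shares_outstanding) tuples.
    """
    smallest = heapq.nsmallest(limit, pairs, key=lambda pair: pair[1])
    return [symbol for symbol, _ in smallest]


# Toy data, for illustration only.
pairs = [("A", 100_000_000), ("B", 10_000_000),
         ("C", 50_000_000), ("D", 5_000_000)]

top_two = lowest_share_counts(pairs, limit=2)
everything = lowest_share_counts(pairs, limit=900)
```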

---

### Question 7: Best Practice for Low Float Strategy

**Our Goal:**
- Monitor **900 Low Float stocks** (< 50M shares outstanding; note that the current config uses a stricter 30M limit)
- Focus on **Micro-cap to Small-cap** ($5M - $100M market cap)
- Prioritize **lowest float first** (most volatile)

**Current Results:**
- Only **585 stocks** of the 900 pre-selected in Coarse Selection meet the criteria
- **35% shortfall** from target

**Questions:**
1. Is 900 Low Float stocks a realistic target in the US market?
2. Should we adjust our criteria to reach 900 stocks?
3. Or should we accept 585 stocks as the maximum available?
4. What do professional Low Float scanners typically monitor?

---

## Proposed Solutions

### Solution 1: Increase MAX_SHARES_OUTSTANDING (Simple)

```python
# In config.py
MAX_SHARES_OUTSTANDING = 75_000_000  # 75M instead of 30M
```

**Expected Result:** ~850 stocks (an unverified estimate; the $100M market cap limit may still reject many of them)

---

### Solution 2: Change Coarse Selection to Filter by Shares (Better)

```python
def coarse_selection(self, coarse):
   # Filter by price
   filtered = [
       x for x in coarse
       if config.MIN_PRICE <= x.price <= config.MAX_PRICE
       and x.has_fundamental_data
   ]
   
   # NEW: shares outstanding is not read here, so ascending dollar volume
   # stands in as a rough proxy for low float. This is an assumption: it also
   # favors illiquid names, so fine_selection must still apply the real filter.
   sorted_by_volume_asc = sorted(filtered, key=lambda x: x.dollar_volume)  # ascending
   selected = sorted_by_volume_asc[:config.UNIVERSE_SIZE * 2]  # Select 1800 stocks
   
   return [x.symbol for x in selected]
```

**Expected Result:** More Low Float stocks in Fine Selection

---

### Solution 3: Process ALL Stocks in Fine Selection (Best)

```python
def coarse_selection(self, coarse):
   # Only filter by has_fundamental_data
   filtered = [x for x in coarse if x.has_fundamental_data]
   return [x.symbol for x in filtered]  # Return ALL stocks

def fine_selection(self, fine):
   # Apply ALL filters here
   filtered = []
   
   for x in fine:
       shares_outstanding = self._get_shares_outstanding(x)
       market_cap = x.MarketCap
       price = x.Price
       
       if (config.MIN_PRICE <= price <= config.MAX_PRICE
           and shares_outstanding and shares_outstanding <= config.MAX_SHARES_OUTSTANDING
           and config.MIN_MARKET_CAP <= market_cap <= config.MAX_MARKET_CAP):
           filtered.append((x, shares_outstanding))
   
   # Sort by shares outstanding (Low Float)
   sorted_by_shares = sorted(filtered, key=lambda x: x[1])
   selected = [x[0].symbol for x in sorted_by_shares[:config.UNIVERSE_SIZE]]
   
   return selected
```

**Expected Result:** 900 stocks (or close to it)

---

## Summary of Questions

1. **Is the current two-stage filtering strategy optimal for Low Float stocks?**
2. **Should we process ALL stocks in Fine Selection instead of pre-filtering to 900?**
3. **Why do we sort by dollar volume in Coarse Selection instead of shares outstanding?**
4. **What is the optimal MAX_SHARES_OUTSTANDING value for 900 Low Float stocks?**
5. **What is the best strategy for refreshing the universe (weekly, monthly, never)?**
6. **Are there performance or cost implications for processing 10,000 stocks in Fine Selection?**
7. **Is 900 Low Float stocks a realistic target, or should we accept 585?**

---

## Request

Please analyze our current universe selection strategy and provide recommendations on:

1. **Best approach** for selecting 900 Low Float stocks
2. **Optimal configuration values** (MAX_SHARES_OUTSTANDING, etc.)
3. **Performance considerations** for processing large universes
4. **Storage and refresh strategy** for ObjectStore
5. **Any QuantConnect-specific best practices** we should follow

Thank you!