DSM - Market Data Retrieval Workflow
Source: Notion | Last edited: 2025-05-15 | ID: 1c22d2dc-3ef...
graph TB
    %% Define four columns to maximize horizontal space
    subgraph "Initial Request"
        A["Start: Data Request<br/>symbol, time range, interval"] --> B["**Check Cache (Daily)?**<br/>use_cache=True<br/><br/><sup>User preference & config</sup>"]
    end

    subgraph "Cache Check"
        B -- Yes --> C["**Cache Hit (Daily)?**<br/>Valid & Recent Data for Day?<br/><br/><sup>Metadata & checksum validation</sup><br/><sup>Data freshness threshold</sup>"]
        C -- Yes --> E["**Load Data from Cache**<br/>UnifiedCacheManager.load_from_cache<br/><br/><sup>Fast daily retrieval</sup><br/><sup>REST API boundary aligned</sup>"]
        E --> F["Return Data<br/>DataFrame from Cache"]
    end

    subgraph "API Strategy"
        B -- No --> D["**Data Source Selection**<br/>_should_use_vision_api<br/><br/><sup>Estimate data points</sup><br/><sup>Vision API for large requests</sup>"]
        C -- No --> D
        D --> G1["**Vision API (Primary)**<br/>VisionDataClient.fetch<br/><br/><sup>Download-First Approach</sup><br/><sup>No pre-checking - faster retrieval</sup>"]
        G1 --> G{"**Vision API Fetch**<br/>VisionDataClient._download_data<br/><br/><sup>Direct download with dynamic concurrency</sup><br/><sup>Aligned boundaries via ApiBoundaryValidator</sup>"}
    end

    subgraph "Results & Caching"
        G -- Success --> I{"**Save to Cache (Daily)?**<br/>UnifiedCacheManager.save_to_cache<br/><br/><sup>Saves with REST API-aligned boundaries</sup><br/><sup>using TimeRangeManager.align_vision_api_to_rest</sup>"}
        G -- Fail --> H["**Automatic Fallback**<br/>RestDataClient.fetch<br/><br/><sup>Transparent fallback for the user</sup><br/><sup>Same consistent interface</sup>"]

        H -- Success --> K{"**Save to Cache (Daily)?**<br/>UnifiedCacheManager.save_to_cache<br/><br/><sup>Caches successful REST API data</sup><br/><sup>Same format as Vision API data</sup>"}
        H -- Fail --> M["**Error Handling**<br/>raise Exception<br/><br/><sup>Retrieval failure</sup><br/><sup>Logged error details</sup>"]

        I --> J["Return Data<br/>DataFrame from Vision API<br/><br/><sup>Aligned with REST API boundaries</sup>"]
        K --> L["Return Data<br/>DataFrame from REST API"]
    end

    %% Connect across subgraphs
    F --> N["End: Data Retrieval<br/>Returns DataFrame"]
    J --> N
    L --> N
    M --> N

    %% Styling
    style I fill:#f9f,stroke:#333,stroke-width:2px,color:#000
    style K fill:#f9f,stroke:#333,stroke-width:2px,color:#000
    style B fill:#ccf,stroke:#333,stroke-width:2px,color:#000,shape:rect
    style C fill:#ccf,stroke:#333,stroke-width:2px,color:#000,shape:rect
    style D fill:#ccf,stroke:#333,stroke-width:2px,color:#000,shape:rect
    style G1 fill:#cfc,stroke:#333,stroke-width:2px,color:#000
    style H fill:#cfc,stroke:#333,stroke-width:2px,color:#000,stroke-dasharray: 5, 5
    style E fill:#cfc,stroke:#333,stroke-width:2px,color:#000
    style G fill:#eee,stroke:#333,stroke-width:2px,color:#000
    style M fill:#fee,stroke:#333,stroke-width:2px,color:#000

    %% Define larger font class
    classDef largeText fontSize:18px;

    %% Apply large text to all nodes
    class A,B,C,D,E,F,G,G1,H,I,J,K,L,M,N largeText;

Updated Workflow Overview
This diagram illustrates the improved market data retrieval workflow with two key optimizations:
- Download-First Approach: The Vision API client now downloads files directly instead of first checking that they exist, which removes the unnecessary HEAD requests and speeds up retrieval.
- Automatic Fallback: If the Vision API fails to retrieve data, the system automatically and transparently falls back to the REST API.

The workflow retains the existing advantages while adding these performance and reliability improvements.
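The download-first idea can be illustrated with a minimal sketch. It is not the `VisionDataClient` implementation: `download_first` and its `opener` parameter are hypothetical names, and the sketch assumes a missing file surfaces as an HTTP 404. The point is only that a single GET replaces the HEAD-then-GET pair.

```python
import urllib.error
import urllib.request
from typing import Any, Callable, Optional


def download_first(
    url: str,
    opener: Callable[[str], Any] = urllib.request.urlopen,
) -> Optional[bytes]:
    """Hypothetical helper: GET the file directly, with no HEAD pre-check.

    Returns the body on success and None when the server answers 404,
    so the caller can treat "file not published" as a normal outcome.
    Any other HTTP error is re-raised.
    """
    try:
        with opener(url) as response:
            return response.read()
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            return None
        raise
```

The `opener` argument exists only so the sketch can be exercised without network access; a `None` result is the kind of outcome that would send the request down the fallback path.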
Process Description
The data retrieval process begins with a user request for market data. The system first checks for valid REST API-aligned cached data. If found, it’s immediately returned.
Otherwise, the data source selection process is triggered:
- Primary Path (Vision API with Download-First):
  - The system tries Vision API first for most requests, especially larger historical ones
  - Uses download-first approach (no pre-checking) for optimal performance
  - Applies dynamic concurrency optimization based on batch size
  - Downloads data by day, combines results, and caches with REST API-aligned boundaries
- Automatic Fallback Path (REST API):
  - If Vision API fails or returns no data, the system automatically falls back to REST API
  - This fallback is transparent to the user - same interface and data format
  - REST API data is also cached for future retrieval

All data sources (Vision API, REST API, and cache) deliver consistent results with identical time boundaries, ensuring a seamless experience regardless of which source ultimately provides the data.
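The source priority described above (cache, then Vision API, then REST API) can be sketched as follows. This is an illustration under stated assumptions, not the DSM code: `get_data`, the `(symbol, interval, day)` key, the plain `dict` cache, and the two fetch callables are all hypothetical stand-ins for `UnifiedCacheManager`, `VisionDataClient.fetch`, and `RestDataClient.fetch`, whose real signatures are not shown in this document.

```python
import logging
from typing import Callable, Dict, List, Tuple

logger = logging.getLogger(__name__)

Rows = List[dict]
Key = Tuple[str, str, str]  # hypothetical cache key: (symbol, interval, day)


def get_data(
    key: Key,
    cache: Dict[Key, Rows],
    fetch_vision: Callable[[Key], Rows],
    fetch_rest: Callable[[Key], Rows],
    use_cache: bool = True,
) -> Rows:
    """Resolve a request in priority order: cache, Vision API, REST API."""
    if use_cache and key in cache:
        return cache[key]

    rows: Rows = []
    try:
        rows = fetch_vision(key)
    except Exception as exc:  # any Vision failure triggers the fallback
        logger.warning("Vision fetch failed, falling back to REST: %r", exc)

    if not rows:
        # Failure or empty result: fall back. REST errors propagate to the caller.
        rows = fetch_rest(key)

    if use_cache and rows:
        cache[key] = rows
    return rows
```

Because the caller only ever sees `get_data`, the fallback stays transparent: the return type is the same whichever source supplied the rows.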
Key Benefits
- Improved Performance: The download-first approach eliminates unnecessary HEAD requests
- Higher Reliability: Automatic fallback ensures data retrieval even when Vision API is unavailable
- Optimized Resource Usage: Dynamic concurrency adjustment based on batch size
- Consistent Data Format: All sources return identical data structure
- Transparent Experience: Users don’t need to worry about which source provides the data
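One way to picture the dynamic concurrency adjustment is a worker count derived from the number of daily files in the batch. The heuristic below (one worker per file, capped at a maximum) is an assumption made for illustration; the document does not state the actual rule, and `pick_worker_count`, `download_days`, and `download_one` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def pick_worker_count(batch_size: int, max_workers: int = 8) -> int:
    """Assumed heuristic: one worker per file, at least 1, at most max_workers."""
    return max(1, min(batch_size, max_workers))


def download_days(
    days: Sequence[str],
    download_one: Callable[[str], List[dict]],
    max_workers: int = 8,
) -> List[dict]:
    """Download each day concurrently and combine the rows in day order."""
    if not days:
        return []
    workers = pick_worker_count(len(days), max_workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so days stay sorted.
        per_day = list(pool.map(download_one, days))
    return [row for rows in per_day for row in rows]
```

A small batch therefore does not spin up idle threads, while a large historical request is bounded by `max_workers`.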
Update
flowchart TD
    A[Application] --> B[DataSourceManager.get_data]
    B --> C[Check Cache]
    C -->|Hit| D[Return Cached Data]
    C -->|Miss| E[Time Range Analysis]
    E --> F[Split into Sub-Ranges if Needed]
    F --> G[For Each Sub-Range]

    G --> H[Try Vision API]
    H -->|Success| I[Format Data]
    H -->|Failure| J[Try REST API]

    J -->|Success| I
    J -->|Failure| K[Error]

    I --> L[Merge Data]
    L --> M[Cache Result]
    M --> N[Return Data]
    D --> N

    %% Place improvements as a vertical list
    subgraph Improvements [Key Improvements]
        direction TB
        R1[1s intervals in SPOT]
        R2[Smart Chunking]
        R3[Single Cache Entry]
        R4[Better Error Handling]
    end

    %% Place priority as a vertical list
    subgraph Priority [Source Priority]
        direction TB
        P1[1. Cache]
        P2[2. Vision API]
        P3[3. REST API]
    end

    %% Position subgraphs
    Improvements -.- Priority
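The "Split into Sub-Ranges" and "Merge Data" steps in this flowchart could look roughly like the sketch below. It assumes sub-ranges are cut at UTC midnight, that inputs are timezone-aware datetimes, and that each row carries an `open_time` field; none of these details are confirmed by the document, and `split_by_day` and `merge_rows` are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple


def split_by_day(start: datetime, end: datetime) -> List[Tuple[datetime, datetime]]:
    """Split the half-open range [start, end) at midnight boundaries."""
    ranges: List[Tuple[datetime, datetime]] = []
    cursor = start
    while cursor < end:
        midnight = datetime(cursor.year, cursor.month, cursor.day, tzinfo=cursor.tzinfo)
        chunk_end = min(midnight + timedelta(days=1), end)
        ranges.append((cursor, chunk_end))
        cursor = chunk_end
    return ranges


def merge_rows(chunks: Iterable[List[dict]]) -> List[dict]:
    """Combine per-range results: first row per open_time wins, sorted by time."""
    by_time: Dict[object, dict] = {}
    for rows in chunks:
        for row in rows:
            by_time.setdefault(row["open_time"], row)
    return [by_time[t] for t in sorted(by_time)]
```

Each sub-range can then go through the Vision-then-REST attempt independently, and the merged result can be stored once, which matches the "Single Cache Entry" item in the diagram.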