Why OOP Matters for BFSI AI Work
On Day 1 I wrote procedural code — one script, top to bottom, downloading data and printing results. That's fine for exploration. But every production AI system in banking or financial services is built around objects: a Transaction, a LoanApplication, a RiskScore. OOP is the language that financial engineers speak.
Today I built a FinancialRecord class — the simplest possible representation of one row of market data — and made each CSV row an instance of it. This is exactly the pattern that scales to ML pipelines: instead of passing raw dicts around, you pass structured objects with validated fields and methods.
BFSI relevance: Fraud detection models, credit scoring systems, and trade reconciliation tools at banks are all built on class hierarchies. A Transaction object carries its own validation, formatting, and risk methods. Learning this pattern on Day 2 means you're building the right mental model from the start.
The Dataset — NIFTY50 Historical Data
Day 1 used AAPL from yfinance. Day 2 shifts to NIFTY50 — India's benchmark index, directly relevant for BFSI roles in the Indian market. The data is free, requires no login, and downloads directly through yfinance, just like AAPL.
```bash
# Activate your environment first
conda activate ai_dev
python day2_financial_project.py
```

```text
CSV already exists. Skipping download.
Date: 2024-01-02 | Open: 21665.599609 | Close: 21710.800781
Bullish Day
----------------------------------------
Date: 2024-01-03 | Open: 21719.800781 | Close: 21517.349609
Bearish Day
----------------------------------------
Date: 2024-01-04 | Open: 21519.199219 | Close: 21737.599609
Bullish Day
----------------------------------------
```
Why "CSV already exists. Skipping download"? The script checks if the file is already on disk before calling yfinance. This is a real engineering habit — never make a network request you don't need. In production pipelines, redundant downloads waste time and can hit API rate limits.
The KeyError Bug — and Why It Happened
The first run crashed. This is the actual error from the terminal:
```text
Traceback (most recent call last):
  File "day2_financial_project.py", line 38, in <module>
    row["Date"],
    ~~~^^^^^^^^
KeyError: 'Date'
```
The cause: yfinance sets the Date as the DataFrame index, not a regular column. When you export to CSV without calling reset_index(), the date gets written with a cryptic label — not "Date". So row["Date"] in csv.DictReader raises a KeyError because the key simply doesn't exist.
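A quick way to confirm what csv.DictReader actually sees is to print the keys of the first row before assuming any of them exist. Here is a minimal sketch against an in-memory stand-in for the file, whose first header cell is blank (illustrative values; the real exported header may differ):

```python
import csv
import io

# A miniature stand-in for the exported CSV: the first header cell
# is blank, as happens when an unnamed index is written out.
raw = ",Open,Close\n2024-01-02,21665.6,21710.8\n"

reader = csv.DictReader(io.StringIO(raw))
row = next(reader)
print(list(row.keys()))  # shows the actual column names: ['', 'Open', 'Close']
# row["Date"] here would raise KeyError, exactly as in the traceback
```

The same one-liner, `print(list(row.keys()))`, works inside the real loading loop.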
Fix — one line: Call data.reset_index(inplace=True) before saving the CSV. This promotes the index into a proper column named "Date", and row["Date"] works perfectly from then on.
The Fix
```python
# Without this, Date is the index — not a column
data = yf.download("^NSEI", start="2024-01-01")

# This promotes the index into a proper "Date" column
data.reset_index(inplace=True)  # ← the fix

data.to_csv("nifty50.csv", index=False)

# Now csv.DictReader sees: Date, Open, High, Low, Close, Volume
# And row["Date"] works correctly
```
The FinancialRecord Class
Here is the full day2_financial_project.py — every line I wrote today, with the bug fixed:
```python
import os
import csv

import yfinance as yf

file_name = "nifty50.csv"

# ── Step 1: Download only if CSV doesn't already exist ───
if os.path.exists(file_name) and os.path.getsize(file_name) > 0:
    print("CSV already exists. Skipping download.")
else:
    print("Downloading NIFTY50 data...")
    data = yf.download("^NSEI", start="2024-01-01")
    data.reset_index(inplace=True)  # moves Date from index → column
    data.to_csv("nifty50.csv", index=False)
    print("NIFTY50 CSV saved successfully")


# ── Step 2: Define the FinancialRecord class ─────────────
class FinancialRecord:
    def __init__(self, date, open_price, high, low, close):
        self.date = date
        # CSV values arrive as strings; convert so that
        # comparisons are numeric, not lexicographic
        self.open_price = float(open_price)
        self.high = float(high)
        self.low = float(low)
        self.close = float(close)

    def summary(self):
        print(
            f"Date: {self.date} | "
            f"Open: {self.open_price} | "
            f"Close: {self.close}"
        )

    def is_bullish(self):
        return self.close > self.open_price


# ── Step 3: Load CSV rows into FinancialRecord objects ───
records = []
with open(file_name, "r") as file:
    csv_reader = csv.DictReader(file)
    for row in csv_reader:
        record = FinancialRecord(
            row["Date"],
            row["Open"],
            row["High"],
            row["Low"],
            row["Close"],
        )
        records.append(record)

# ── Step 4: Print first 5 records with bullish/bearish ──
for record in records[:5]:
    record.summary()
    if record.is_bullish():
        print("Bullish Day")
    else:
        print("Bearish Day")
    print("-" * 40)
```

One fix worth calling out: the constructor converts each price to `float`. `csv.DictReader` returns every field as a string, and comparing price strings with `>` would compare them character by character rather than numerically.
Breaking Down What Each Part Does
os.path.exists() + os.path.getsize() — Guard the Download
Before calling yfinance, check if the CSV is already on disk and non-empty. Both conditions matter — an empty file from a failed download would otherwise fool the check. This prevents redundant API calls every run, which matters in production pipelines that execute daily.
data.reset_index(inplace=True) — Why This Fixes the KeyError
yfinance sets the Date as the DataFrame index, not a regular column. Without reset_index(), the CSV has no Date column — it gets a cryptic label instead. Calling reset_index() promotes the index into a proper column named "Date", so row["Date"] works cleanly.
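The effect of reset_index() can be seen on a tiny hand-built frame (illustrative values, simplified shape compared with what yfinance returns):

```python
import pandas as pd

# A one-row frame with a DatetimeIndex named "Date",
# mimicking the structure yfinance produces.
df = pd.DataFrame(
    {"Open": [21665.6], "Close": [21710.8]},
    index=pd.DatetimeIndex(["2024-01-02"], name="Date"),
)

print(df.columns.tolist())   # ['Open', 'Close'] (no Date column yet)
df.reset_index(inplace=True)
print(df.columns.tolist())   # ['Date', 'Open', 'Close']
```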
summary() and is_bullish() — Two Methods, Two Responsibilities
summary() handles display — it prints the record's key fields directly. is_bullish() handles logic — it returns True or False based on whether close beat open. This separation — display logic vs business logic — is a core OOP principle. Later you might swap summary() for a JSON formatter without touching is_bullish() at all.
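The swap described above can be sketched with a trimmed-down version of the class. `to_json` is a hypothetical method name for illustration, not part of today's script:

```python
import json


class FinancialRecord:
    def __init__(self, date, open_price, close):
        self.date = date
        self.open_price = open_price
        self.close = close

    def is_bullish(self):
        # Business logic: unchanged no matter how display evolves
        return self.close > self.open_price

    def to_json(self):
        # Display logic: a JSON formatter swapped in for summary()
        return json.dumps(
            {"date": self.date, "open": self.open_price, "close": self.close}
        )


record = FinancialRecord("2024-01-02", 21665.6, 21710.8)
print(record.to_json())
print(record.is_bullish())  # True
```

Because the two responsibilities never overlap, replacing one method cannot break the other.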
records[:5] — Slicing the List of Objects
records[:5] gives you the first 5 objects from the list. Each iteration calls summary() to print the row, then is_bullish() to print the market signal, then a divider line. This pattern — iterate over objects, call methods — is how every production data pipeline operates at its core.
Why records[-1]? Negative indexing in Python counts from the end. records[-1] is always the last element regardless of list length. In time-series financial data, the last record is always the most recent date — useful for "get the latest price" patterns.
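A minimal illustration of the "get the latest price" pattern, using made-up closing prices:

```python
# Illustrative closing prices, ordered oldest to newest
prices = [100.0, 101.5, 99.8, 102.3]

latest = prices[-1]       # last element, i.e. the most recent value
previous = prices[-2]     # second to last
print(latest, previous)   # 102.3 99.8
```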
Setting Up Poetry
On Day 1, packages were installed with bare pip. That works but it's not reproducible — there's no record of which versions were installed or why. Poetry solves this by creating a pyproject.toml (the project manifest) and a poetry.lock (the exact version lock). Anyone who clones the project runs poetry install and gets the exact same environment.
Installing Poetry
```bash
# Install Poetry using the official installer
curl -sSL https://install.python-poetry.org | python3 -

# Add Poetry to PATH (add this line to ~/.zshrc too)
export PATH="$HOME/.local/bin:$PATH"

# Verify
poetry --version
Poetry (version 2.4.1)
```
Initialising the Project
```bash
# Run from your project root (AI-Architect-Roadmap/)
poetry init

# Answer the prompts:
#   Package name: ai-architect-roadmap
#   Version: 0.1.0
#   Description: AI Architect learning roadmap — BFSI Edition
#   License: (leave empty, press Enter)
#   Define dependencies interactively? → no
#   Define dev dependencies? → no
#   Confirm generation? → yes
Generated file

# Now register your actual packages
poetry add pandas yfinance matplotlib
Updating dependencies
Resolving dependencies... (1.2s)
Writing lock file
```
The pyproject.toml That Was Generated
```toml
[project]
name = "ai-architect-roadmap"
version = "0.1.0"
description = "AI Architect learning roadmap projects using Python, AI, and BFSI examples"
authors = [
    {name = "Prabhu"}
]
requires-python = ">=3.11"
dependencies = [
    "pandas (>=3.0.3,<4.0.0)",
    "yfinance (>=1.3.0,<2.0.0)",
    "matplotlib (>=3.10.9,<4.0.0)"
]

[build-system]
requires = ["poetry-core>=2.0.0,<3.0.0"]
build-backend = "poetry.core.masonry.api"
```
What does poetry.lock do? It records the exact version of every package and every transitive dependency — even packages your packages depend on. This means poetry install on any machine produces an identical environment to yours, forever. This is what makes builds reproducible — a requirement for any production AI system in banking.
Project Structure at End of Day 2
This is what the project directory looks like after two days:
```text
AI-Architect-Roadmap/
│
├── pyproject.toml               # ← NEW: Poetry project manifest
├── poetry.lock                  # ← NEW: Exact dependency versions
│
├── ai_dev/                      # conda environment (not committed to git)
│
├── Day1/
│   ├── basics.py
│   ├── day1_stock_project.py
│   ├── aapl_stock_data.csv
│   └── chart.pdf
│
└── Day2/
    ├── day2_financial_project.py   # ← NEW: OOP + CSV loading
    └── nifty50.csv                 # ← NEW: NIFTY50 historical data
```
Key OOP Concepts Internalised Today
```python
# Class — the blueprint
class FinancialRecord:
    pass

# Instance — one specific object created from the blueprint
record = FinancialRecord(...)

# __init__ — constructor; called automatically on creation
def __init__(self, date, close):
    self.date = date      # instance attribute
    self.close = close    # each instance has its own copy

# Method — a function that belongs to the class
# Always takes self as first argument
def summary(self):
    return f"{self.date}: {self.close}"

# Calling a method on an instance
print(record.summary())

# Storing many instances in a list — the standard pattern
records = []
records.append(FinancialRecord(...))
records.append(FinancialRecord(...))

# Iterating — same as any list
for r in records:
    print(r.summary())
```
The Writing Prompt — Day 2 Reflection
Today's prompt: "Why does structuring data as objects (OOP) produce more maintainable code than raw dictionaries for financial data?"
"A raw dictionary like {'Date': '2020-01-02', 'Close': '12282.20'} is just data — it carries no behaviour, no validation, and no guaranteed structure. If I rename a key, every piece of code that touches that dictionary breaks silently. A FinancialRecord object, by contrast, defines its fields once in __init__ and exposes behaviour through methods. For BFSI systems where a Transaction object might pass through fraud detection, accounting, and reporting in the same pipeline, encapsulating data and behaviour together means each stage only needs to call a method — it doesn't need to know the internal structure of the object. That separation is what makes financial AI systems auditable and maintainable at scale."
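The failure modes described in the reflection can be shown side by side in a tiny sketch (illustrative values):

```python
# With a raw dict, a renamed key fails silently
row = {"Dt": "2020-01-02", "Close": 12282.20}  # key renamed from "Date"
print(row.get("Date"))  # None: no error, bad data flows downstream


# With a class, the field list is defined once in __init__
class FinancialRecord:
    def __init__(self, date, close):
        self.date = date
        self.close = close


record = FinancialRecord("2020-01-02", 12282.20)
print(record.date)  # attribute access; a typo raises AttributeError immediately
```

The dict lookup degrades silently to None, while the object either works or fails loudly at the point of the mistake.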
What I Built by End of Day 2
One honest note: The KeyError bug took longer to debug than expected. But that's realistic — real engineering is mostly reading error messages and inspecting data. Printing list(row.keys()) was what exposed the problem. The lesson: when data doesn't match your expectation, inspect the data before editing the code.
What's Coming on Day 3
Day 3 covers NumPy + Pandas Foundations. I'll move beyond csv.DictReader and load the NIFTY50 data directly into a Pandas DataFrame. Then I'll compute real financial statistics — rolling averages, daily returns, volatility — using NumPy operations. The FinancialRecord objects from today will evolve into a proper DataFrame-based pipeline.