Why OOP Matters for BFSI AI Work
On Day 1 I wrote procedural code: one script, top to bottom, downloading data and printing results. That's fine for exploration. But production AI systems in banking and financial services are typically organised around domain objects such as a Transaction, a LoanApplication, or a RiskScore. OOP is the language that financial engineers speak.
Today I built a FinancialRecord class — the simplest possible representation of one row of market data — and made each CSV row an instance of it. This is exactly the pattern that scales to ML pipelines: instead of passing raw dicts around, you pass structured objects with validated fields and methods.
Fraud detection models, credit scoring systems, and trade reconciliation tools at banks are commonly built on class hierarchies. A Transaction object carries its own validation, formatting, and risk methods. Learning this pattern on Day 2 means you're building the right mental model from the start.
The Dataset — NIFTY50 Historical Data
Day 1 used AAPL from yfinance. Day 2 shifts to NIFTY50 — India's benchmark index, directly relevant for BFSI roles in the Indian market. The data is free, no login required, downloaded directly using yfinance with the ticker ^NSEI.
    # Activate your environment first
    conda activate ai_dev
    python day2_financial_project.py

    CSV already exists. Skipping download.
    Date: 2024-01-02 | Open: 21665.599609 | Close: 21710.800781
    Bullish Day
    ----------------------------------------
    Date: 2024-01-03 | Open: 21719.800781 | Close: 21517.349609
    Bearish Day
    ----------------------------------------
    Date: 2024-01-04 | Open: 21519.199219 | Close: 21737.599609
    Bullish Day
    ----------------------------------------
The script checks if the file is already on disk before calling yfinance. This is a real engineering habit — never make a network request you don't need. In production pipelines, redundant downloads waste time and can hit API rate limits.
The KeyError Bug — and Why It Happened
The first run crashed with this traceback:
    Traceback (most recent call last):
      File "day2_financial_project.py", line 38, in <module>
        row["Date"],
        ~~~^^^^^^^^
    KeyError: 'Date'
The cause: yfinance sets the Date as the DataFrame index, not a regular column. When you export to CSV without calling reset_index(), the date is not written under a plain "Date" header; in my file it came out under a cryptic multi-level label. So row["Date"] inside csv.DictReader raises a KeyError because that key doesn't exist in the CSV headers.
The fix: call data.reset_index(inplace=True) before saving the CSV. This promotes the index into a proper column named "Date", and row["Date"] works correctly from then on. The lesson: when data doesn't match your expectation, print list(row.keys()) before editing the code.
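The inspection step can be tried without the real file. This is a minimal sketch using an in-memory CSV; the header names in sample_csv are made up for illustration and are not the exact headers yfinance produced.

```python
import csv
import io

# Hypothetical CSV whose first header is not "Date" (a stand-in for the real file)
sample_csv = io.StringIO(
    "Price,Close,Open\n"
    "2024-01-02,21710.80,21665.60\n"
)

reader = csv.DictReader(sample_csv)
first_row = next(reader)

# Look at the keys the reader actually produced before changing any other code
headers = list(first_row.keys())
print(headers)

# A membership test shows the mismatch without raising KeyError
has_date = "Date" in first_row
print(f"'Date' column present: {has_date}")
```

Seeing the real header names first tells you whether the problem is in how the file was written or in how it is read.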
The One-Line Fix
    import yfinance as yf

    data = yf.download("^NSEI", start="2024-01-01")

    # Without the next line, Date is the DataFrame index, not a column.
    # reset_index() promotes the index into a proper "Date" column.
    data.reset_index(inplace=True)  # ← the fix

    data.to_csv("nifty50.csv", index=False)

    # Now csv.DictReader sees: Date, Open, High, Low, Close, Volume
    # And row["Date"] works correctly
The FinancialRecord Class — Full Code
Here is the complete day2_financial_project.py — every line written on Day 2, with the bug fixed and all four steps in one script:
    import csv
    import os

    import yfinance as yf

    file_name = "nifty50.csv"

    # ── Step 1: Download only if CSV doesn't already exist ───
    if os.path.exists(file_name) and os.path.getsize(file_name) > 0:
        print("CSV already exists. Skipping download.")
    else:
        print("Downloading NIFTY50 data...")
        data = yf.download("^NSEI", start="2024-01-01")
        data.reset_index(inplace=True)  # moves Date from index → column
        data.to_csv(file_name, index=False)
        print("NIFTY50 CSV saved successfully")


    # ── Step 2: Define the FinancialRecord class ─────────────
    class FinancialRecord:
        def __init__(self, date, open_price, high, low, close):
            self.date = date
            self.open_price = open_price
            self.high = high
            self.low = low
            self.close = close

        def summary(self):
            print(
                f"Date: {self.date} | "
                f"Open: {self.open_price} | "
                f"Close: {self.close}"
            )

        def is_bullish(self):
            return self.close > self.open_price


    # ── Step 3: Load CSV rows into FinancialRecord objects ───
    records = []

    with open(file_name, "r", newline="") as file:
        csv_reader = csv.DictReader(file)
        for row in csv_reader:
            # csv.DictReader yields strings; convert prices to float so
            # is_bullish() compares numbers rather than text
            record = FinancialRecord(
                row["Date"],
                float(row["Open"]),
                float(row["High"]),
                float(row["Low"]),
                float(row["Close"]),
            )
            records.append(record)

    # ── Step 4: Print first 5 records with bullish/bearish ──
    for record in records[:5]:
        record.summary()
        if record.is_bullish():
            print("Bullish Day")
        else:
            print("Bearish Day")
        print("-" * 40)
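One detail worth keeping in mind when loading rows this way: csv.DictReader returns every value as a string, so a price comparison is only numeric after a float() conversion. The sketch below uses two made-up prices (not NIFTY50 values) chosen to show where text comparison and numeric comparison disagree.

```python
# Values read by csv.DictReader are always strings (hypothetical prices)
open_text, close_text = "9999.50", "10000.25"

# String comparison goes character by character, and "1" sorts before "9"
string_says_bullish = close_text > open_text

# Converting first gives the numeric comparison that is actually intended
float_says_bullish = float(close_text) > float(open_text)

print(f"as strings: {string_says_bullish}, as floats: {float_says_bullish}")
```

The two results differ whenever the numbers have a different count of digits before the decimal point, which is easy to miss when every price in a sample has the same width.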
Breaking Down What Each Part Does
os.path.exists() + os.path.getsize() — Guard the Download
Before calling yfinance, check if the CSV is already on disk and non-empty. Both conditions matter — an empty file from a failed download would otherwise fool the check. This prevents redundant API calls every run, which matters in production pipelines that execute daily.
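The two-part check can be pulled into a small helper and exercised against a temporary directory. This is a sketch only; needs_download is a hypothetical name, not a function from the script.

```python
import os
import tempfile


def needs_download(path):
    """Return True when the file is missing or empty (hypothetical helper)."""
    return not (os.path.exists(path) and os.path.getsize(path) > 0)


with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "nifty50.csv")

    # Case 1: nothing on disk yet
    missing = needs_download(csv_path)

    # Case 2: an empty file, as a failed download might leave behind
    open(csv_path, "w").close()
    empty = needs_download(csv_path)

    # Case 3: a file with real content
    with open(csv_path, "w") as handle:
        handle.write("Date,Open,Close\n")
    populated = needs_download(csv_path)

print(missing, empty, populated)
```

Only the third case skips the download, which is the behaviour the existence-plus-size check is meant to give.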
data.reset_index(inplace=True) — Why This Fixes the KeyError
yfinance sets the Date as the DataFrame index, not a regular column. Without reset_index(), the CSV has no Date column — it gets a cryptic multi-level label instead. Calling reset_index() promotes the index into a proper column named "Date", so row["Date"] works cleanly in csv.DictReader.
summary() and is_bullish() — Two Methods, Two Responsibilities
summary() handles display — it prints the record's key fields. is_bullish() handles logic — it returns True or False based on whether close beat open. This separation of display logic from business logic is a core OOP principle. Later you might swap summary() for a JSON formatter without touching is_bullish() at all.
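To illustrate that swap, here is a minimal sketch of a trimmed-down FinancialRecord where the display side becomes a JSON formatter. The to_json method is a hypothetical addition and the class carries fewer fields than the full script; is_bullish() is unchanged.

```python
import json


class FinancialRecord:
    def __init__(self, date, open_price, close):
        self.date = date
        self.open_price = open_price
        self.close = close

    # Business logic: unchanged from the original class
    def is_bullish(self):
        return self.close > self.open_price

    # Display logic: hypothetical replacement for summary()
    def to_json(self):
        return json.dumps(
            {
                "date": self.date,
                "open": self.open_price,
                "close": self.close,
                "bullish": self.is_bullish(),
            }
        )


record = FinancialRecord("2024-01-02", 21665.599609, 21710.800781)
payload = record.to_json()
print(payload)
```

Because the formatter only calls is_bullish() rather than re-implementing it, the signal logic still lives in one place.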
records[:5] — Slicing the List of Objects
records[:5] gives you the first 5 objects from the list. Each iteration calls summary() to print the row, then is_bullish() for the market signal, then prints a divider line. This pattern of iterating over objects and calling their methods sits at the core of most data pipelines. records[-1] returns the last object in the list, which is the most recent date as long as the rows are in ascending date order.
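Slicing depends on the order of the list, so it can help to sort first when that order is not guaranteed. A small sketch, using (date, close) tuples taken from the sample output and deliberately shuffled; tuples stand in for FinancialRecord objects to keep it short.

```python
# (date, close) pairs from the sample output, deliberately out of order
rows = [
    ("2024-01-03", 21517.349609),
    ("2024-01-02", 21710.800781),
    ("2024-01-04", 21737.599609),
]

# ISO dates (YYYY-MM-DD) sort chronologically when compared as plain strings
ordered = sorted(rows, key=lambda row: row[0])

first_two = ordered[:2]   # slice: the two earliest rows
latest = ordered[-1]      # negative index: the most recent row

print(first_two)
print(latest)
```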
Setting Up Poetry
On Day 1, packages were installed with bare pip. That works but it's not reproducible — there's no record of which versions were installed or why. Poetry solves this by creating a pyproject.toml (the project manifest) and a poetry.lock (the exact version lock). Anyone who clones the project runs poetry install and gets the exact same environment.
Installing Poetry
    # Install Poetry using the official installer
    curl -sSL https://install.python-poetry.org | python3 -

    # Add Poetry to PATH (add this line to ~/.zshrc too)
    export PATH="$HOME/.local/bin:$PATH"

    # Verify the installation
    poetry --version

    Poetry (version 2.4.1)
Initialising the Project
    # Run from your project root (AI-Architect-Roadmap/)
    poetry init

    # Answer the prompts:
    # Package name: ai-architect-roadmap
    # Version: 0.1.0
    # Description: AI Architect learning roadmap — BFSI Edition
    # License: (leave empty, press Enter)
    # Define dependencies interactively? → no
    # Confirm generation? → yes

    Generated file

    # Now register your actual packages
    poetry add pandas yfinance matplotlib

    Updating dependencies
    Resolving dependencies... (1.2s)
    Writing lock file
The pyproject.toml That Was Generated
    [project]
    name = "ai-architect-roadmap"
    version = "0.1.0"
    description = "AI Architect learning roadmap projects using Python, AI, and BFSI examples"
    authors = [
        {name = "Prabhu"}
    ]
    requires-python = ">=3.11"
    dependencies = [
        "pandas (>=3.0.3,<4.0.0)",
        "yfinance (>=1.3.0,<2.0.0)",
        "matplotlib (>=3.10.9,<4.0.0)"
    ]

    [build-system]
    requires = ["poetry-core>=2.0.0,<3.0.0"]
    build-backend = "poetry.core.masonry.api"
pyproject.toml declares the acceptable version ranges. poetry.lock records the exact version of every package and every transitive dependency, including the packages your packages depend on. Running poetry install on another machine with that lock file installs the same package versions as yours. This is what makes builds reproducible, a strict requirement for any production AI system in banking.
Project Structure at End of Day 2
    AI-Architect-Roadmap/
    │
    ├── pyproject.toml                  # ← NEW: Poetry project manifest
    ├── poetry.lock                     # ← NEW: Exact dependency versions
    │
    ├── ai_dev/                         # conda environment (not committed to git)
    │
    ├── Day1/
    │   ├── basics.py
    │   ├── day1_stock_project.py
    │   ├── aapl_stock_data.csv
    │   └── chart.pdf
    │
    └── Day2/
        ├── day2_financial_project.py   # ← NEW: OOP + CSV loading
        └── nifty50.csv                 # ← NEW: NIFTY50 historical data
Key OOP Concepts Internalised Today
    # Class: the blueprint
    class FinancialRecord:
        # __init__: constructor, called automatically on creation
        def __init__(self, date, close):
            self.date = date      # instance attribute
            self.close = close    # each instance has its own copy

        # Method: a function that belongs to the class
        # Always takes self as first argument
        def summary(self):
            return f"{self.date}: {self.close}"


    # Instance: one specific object created from the blueprint
    record = FinancialRecord("2024-01-02", 21710.800781)

    # Calling a method on an instance
    print(record.summary())

    # Storing many instances in a list, the standard pattern
    records = []
    records.append(record)
    records.append(FinancialRecord("2024-01-03", 21517.349609))

    # Iterating: same as any list
    for r in records:
        print(r.summary())
Day 2 Writing Reflection
Today's prompt: "Why does structuring data as objects (OOP) produce more maintainable code than raw dictionaries for financial data?"
"A raw dictionary like {'Date': '2020-01-02', 'Close': '12282.20'} is just data — it carries no behaviour, no validation, and no guaranteed structure. If I rename a key, every piece of code that touches that dictionary breaks silently. A FinancialRecord object, by contrast, defines its fields once in __init__ and exposes behaviour through methods. For BFSI systems where a Transaction object might pass through fraud detection, accounting, and reporting in the same pipeline, encapsulating data and behaviour together means each stage only needs to call a method — it doesn't need to know the internal structure of the object. That separation is what makes financial AI systems auditable and maintainable at scale."
What I Built by End of Day 2
The KeyError bug took longer to debug than expected. But that's realistic: real engineering is mostly reading error messages and inspecting data. Printing list(row.keys()) exposed the problem, and reset_index() fixed it. The lesson: when data doesn't match your expectation, inspect the data before editing the code.
What's Coming on Day 3
Day 3 covers NumPy + Pandas Foundations. I'll move beyond csv.DictReader and load the NIFTY50 data directly into a Pandas DataFrame. Then I'll compute real financial statistics — rolling averages, daily returns, volatility — using NumPy operations. The FinancialRecord objects from today will evolve into a proper DataFrame-based pipeline.