📅 Day 2 of 80 AI ARCHITECT ROADMAP · BFSI EDITION 2026

Python OOP, File I/O & Dependency Management with Poetry

Day 2. The CSV from Day 1 is sitting on disk. Today the job was to stop treating data as rows and start treating it as objects. I built a FinancialRecord class that loads the NIFTY50 CSV, wraps each row into a Python object, and exposes a summary() method. Then I set up Poetry — the tool that replaces bare pip with professional dependency management. This is where Python stops feeling like scripting and starts feeling like software engineering.

Why OOP Matters for BFSI AI Work

On Day 1 I wrote procedural code — one script, top to bottom, downloading data and printing results. That's fine for exploration. But every production AI system in banking or financial services is built around objects: a Transaction, a LoanApplication, a RiskScore. OOP is the language that financial engineers speak.

Today I built a FinancialRecord class — the simplest possible representation of one row of market data — and made each CSV row an instance of it. This is exactly the pattern that scales to ML pipelines: instead of passing raw dicts around, you pass structured objects with validated fields and methods.

BFSI relevance: Fraud detection models, credit scoring systems, and trade reconciliation tools at banks are all built on class hierarchies. A Transaction object carries its own validation, formatting, and risk methods. Learning this pattern on Day 2 means you're building the right mental model from the start.
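To make that concrete, here is a minimal sketch of the pattern (the Transaction class, its fields, and the threshold are all illustrative, not code from any real banking system):

```python
# Hypothetical sketch: a Transaction that carries its own validation
# and risk logic, the pattern described above. All names illustrative.
class Transaction:
    def __init__(self, txn_id, amount, currency):
        # Validation lives with the data, not scattered across scripts
        if amount <= 0:
            raise ValueError(f"Invalid amount: {amount}")
        if len(currency) != 3:
            raise ValueError(f"Invalid currency code: {currency}")
        self.txn_id = txn_id
        self.amount = amount
        self.currency = currency

    def is_high_value(self, threshold=1_000_000):
        # One simple risk method; real systems layer many such checks
        return self.amount >= threshold


txn = Transaction("TXN001", 2_500_000, "INR")
print(txn.is_high_value())   # True
```

Every stage of a pipeline that receives this object can call `is_high_value()` without knowing how the check is implemented.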

The Dataset — NIFTY50 Historical Data

Day 1 used AAPL from yfinance. Day 2 shifts to NIFTY50 (ticker ^NSEI) — India's benchmark index, directly relevant for BFSI roles in the Indian market. The data is free, requires no login, and downloads through yfinance exactly like AAPL did.

Terminal — Run Day 2 Script
# Activate your environment first
conda activate ai_dev
python day2_financial_project.py

CSV already exists. Skipping download.
Date: 2024-01-02 | Open: 21665.599609 | Close: 21710.800781
Bullish Day
----------------------------------------
Date: 2024-01-03 | Open: 21719.800781 | Close: 21517.349609
Bearish Day
----------------------------------------
Date: 2024-01-04 | Open: 21519.199219 | Close: 21737.599609
Bullish Day
----------------------------------------

Why "CSV already exists. Skipping download"? The script checks if the file is already on disk before calling yfinance. This is a real engineering habit — never make a network request you don't need. In production pipelines, redundant downloads waste time and can hit API rate limits.

The KeyError Bug — and Why It Happened

The first run crashed. This is the actual error from the terminal:

⚠ TRACEBACK — KeyError
Traceback (most recent call last):
  File "day2_financial_project.py", line 38, in <module>
    row["Date"],
    ~~~^^^^^^^^
KeyError: 'Date'

The cause: yfinance sets the Date as the DataFrame index, not a regular column. When you export to CSV without calling reset_index(), the date gets written with a cryptic label — not "Date". So row["Date"] in csv.DictReader raises a KeyError because the key simply doesn't exist.

Fix — one line: Call data.reset_index(inplace=True) before saving the CSV. This promotes the index into a proper column named "Date", and row["Date"] works perfectly from then on.

The Fix

Python the one-line fix
# Without this, Date is the index — not a column
data = yf.download("^NSEI", start="2024-01-01")

# This promotes the index into a proper "Date" column
data.reset_index(inplace=True)   # ← the fix

data.to_csv("nifty50.csv", index=False)

# Now csv.DictReader sees: Date, Open, High, Low, Close, Volume
# And row["Date"] works correctly

The FinancialRecord Class

Here is the full day2_financial_project.py — every line I wrote today, with the bug fixed:

Python day2_financial_project.py
import os
import yfinance as yf
import csv

file_name = "nifty50.csv"

# ── Step 1: Download only if CSV doesn't already exist ───
if os.path.exists(file_name) and os.path.getsize(file_name) > 0:
    print("CSV already exists. Skipping download.")
else:
    print("Downloading NIFTY50 data...")
    data = yf.download("^NSEI", start="2024-01-01")
    data.reset_index(inplace=True)   # moves Date from index → column
    data.to_csv("nifty50.csv", index=False)
    print("NIFTY50 CSV saved successfully")


# ── Step 2: Define the FinancialRecord class ─────────────
class FinancialRecord:
    def __init__(self, date, open_price, high, low, close):
        self.date       = date
        self.open_price = open_price
        self.high       = high
        self.low        = low
        self.close      = close

    def summary(self):
        print(
            f"Date: {self.date} | "
            f"Open: {self.open_price} | "
            f"Close: {self.close}"
        )

    def is_bullish(self):
        return self.close > self.open_price


# ── Step 3: Load CSV rows into FinancialRecord objects ───
records = []

with open(file_name, "r") as file:
    csv_reader = csv.DictReader(file)

    for row in csv_reader:
        record = FinancialRecord(
            row["Date"],
            float(row["Open"]),    # csv.DictReader yields strings;
            float(row["High"]),    # convert to float so that
            float(row["Low"]),     # is_bullish() compares numbers,
            float(row["Close"])    # not strings
        )
        records.append(record)


# ── Step 4: Print first 5 records with bullish/bearish ──
for record in records[:5]:
    record.summary()

    if record.is_bullish():
        print("Bullish Day")
    else:
        print("Bearish Day")

    print("-" * 40)

Breaking Down What Each Part Does

1. os.path.exists() + os.path.getsize() — Guard the Download

Before calling yfinance, check if the CSV is already on disk and non-empty. Both conditions matter — an empty file from a failed download would otherwise fool the check. This prevents redundant API calls every run, which matters in production pipelines that execute daily.
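The same guard can be written with pathlib, which some teams prefer for readability (an equivalent sketch, not the version in the script):

```python
from pathlib import Path

def needs_download(path: Path) -> bool:
    # True when the file is missing OR empty; an empty file left behind
    # by a failed download must not pass the check
    return not path.exists() or path.stat().st_size == 0

csv_path = Path("nifty50.csv")
if needs_download(csv_path):
    print("Downloading NIFTY50 data...")
else:
    print("CSV already exists. Skipping download.")
```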

2. data.reset_index(inplace=True) — Why This Fixes the KeyError

yfinance sets the Date as the DataFrame index, not a regular column. Without reset_index(), the CSV has no Date column — it gets a cryptic label instead. Calling reset_index() promotes the index into a proper column named "Date", so row["Date"] works cleanly.

3. summary() and is_bullish() — Two Methods, Two Responsibilities

summary() handles display — it prints the record's key fields directly. is_bullish() handles logic — it returns True or False based on whether close beat open. This separation — display logic vs business logic — is a core OOP principle. Later you might swap summary() for a JSON formatter without touching is_bullish() at all.
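Here is what that swap might look like — a sketch with the class trimmed to three fields and a hypothetical to_json() method standing in for summary(); is_bullish() is byte-for-byte unchanged:

```python
import json

class FinancialRecord:
    def __init__(self, date, open_price, close):
        self.date = date
        self.open_price = open_price
        self.close = close

    def to_json(self):
        # Display concern: serialise instead of printing
        return json.dumps(
            {"date": self.date, "open": self.open_price, "close": self.close}
        )

    def is_bullish(self):
        # Business concern: untouched by the formatter swap
        return self.close > self.open_price


r = FinancialRecord("2024-01-02", 21665.60, 21710.80)
print(r.to_json())
print(r.is_bullish())   # True
```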

4. records[:5] — Slicing the List of Objects

records[:5] gives you the first 5 objects from the list. Each iteration calls summary() to print the row, then is_bullish() to print the market signal, then a divider line. This pattern — iterate over objects, call methods — is how every production data pipeline operates at its core.

An aside on negative indexing: Python counts from the end, so records[-1] is always the last element regardless of list length. In time-series financial data, the last record is the most recent trading day, which makes records[-1] the natural way to implement "get the latest price".
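Both patterns in miniature, on a plain list of dates standing in for the records list:

```python
# Slicing and negative indexing on a stand-in list
dates = ["2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05", "2024-01-08"]

first_three = dates[:3]   # first three elements
latest = dates[-1]        # last element, however long the list grows

print(first_three)   # ['2024-01-02', '2024-01-03', '2024-01-04']
print(latest)        # 2024-01-08
```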

Setting Up Poetry

On Day 1, packages were installed with bare pip. That works but it's not reproducible — there's no record of which versions were installed or why. Poetry solves this by creating a pyproject.toml (the project manifest) and a poetry.lock (the exact version lock). Anyone who clones the project runs poetry install and gets the exact same environment.

Installing Poetry

Terminal — Install Poetry
# Install Poetry using the official installer
curl -sSL https://install.python-poetry.org | python3 -

# Add Poetry to PATH (add this line to ~/.zshrc too)
export PATH="$HOME/.local/bin:$PATH"

# Verify
poetry --version
Poetry (version 2.4.1)

Initialising the Project

Terminal — poetry init
# Run from your project root (AI-Architect-Roadmap/)
poetry init

# Answer the prompts:
# Package name: ai-architect-roadmap
# Version: 0.1.0
# Description: AI Architect learning roadmap — BFSI Edition
# License: (leave empty, press Enter)
# Define dependencies interactively? → no
# Define dev dependencies? → no
# Confirm generation? → yes

Terminal — poetry add

# Now register your actual packages
poetry add pandas yfinance matplotlib
Updating dependencies
Resolving dependencies... (1.2s)
Writing lock file

The pyproject.toml That Was Generated

TOML pyproject.toml
[project]
name = "ai-architect-roadmap"
version = "0.1.0"
description = "AI Architect learning roadmap projects using Python, AI, and BFSI examples"
authors = [
    {name = "Prabhu"}
]
requires-python = ">=3.11"
dependencies = [
    "pandas (>=3.0.3,<4.0.0)",
    "yfinance (>=1.3.0,<2.0.0)",
    "matplotlib (>=3.10.9,<4.0.0)"
]

[build-system]
requires = ["poetry-core>=2.0.0,<3.0.0"]
build-backend = "poetry.core.masonry.api"

What does poetry.lock do? It records the exact version of every package and every transitive dependency — even packages your packages depend on. This means poetry install on any machine produces an identical environment to yours, forever. This is what makes builds reproducible — a requirement for any production AI system in banking.

Project Structure at End of Day 2

This is what the project directory looks like after two days:

TREE AI-Architect-Roadmap/
AI-Architect-Roadmap/
│
├── pyproject.toml          # ← NEW: Poetry project manifest
├── poetry.lock             # ← NEW: Exact dependency versions
│
├── ai_dev/                 # conda environment (not committed to git)
│
├── Day1/
│   ├── basics.py
│   ├── day1_stock_project.py
│   ├── aapl_stock_data.csv
│   └── chart.pdf
│
└── Day2/
    ├── day2_financial_project.py   # ← NEW: OOP + CSV loading
    └── nifty50.csv                 # ← NEW: NIFTY50 historical data

Key OOP Concepts Internalised Today

Python OOP concepts in context
# Class — the blueprint
class FinancialRecord:
    pass

# Instance — one specific object created from the blueprint
record = FinancialRecord(...)

# __init__ — constructor; called automatically on creation
def __init__(self, date, close):
    self.date  = date     # instance attribute
    self.close = close    # each instance has its own copy

# Method — a function that belongs to the class
# Always takes self as first argument
def summary(self):
    return f"{self.date}: {self.close}"

# Calling a method on an instance
print(record.summary())

# Storing many instances in a list — the standard pattern
records = []
records.append(FinancialRecord(...))
records.append(FinancialRecord(...))

# Iterating — same as any list
for r in records:
    print(r.summary())

The Writing Prompt — Day 2 Reflection

Today's prompt: "Why does structuring data as objects (OOP) produce more maintainable code than raw dictionaries for financial data?"

✍️ DAY 2 WRITING REFLECTION

"A raw dictionary like {'Date': '2020-01-02', 'Close': '12282.20'} is just data — it carries no behaviour, no validation, and no guaranteed structure. If I rename a key, every piece of code that touches that dictionary breaks silently. A FinancialRecord object, by contrast, defines its fields once in __init__ and exposes behaviour through methods. For BFSI systems where a Transaction object might pass through fraud detection, accounting, and reporting in the same pipeline, encapsulating data and behaviour together means each stage only needs to call a method — it doesn't need to know the internal structure of the object. That separation is what makes financial AI systems auditable and maintainable at scale."

What I Built by End of Day 2

✅ DAY 2 DELIVERABLES
✅ FinancialRecord class
✅ __init__ + summary()
✅ is_bullish() method
✅ csv.DictReader loading
✅ NIFTY50 2024 dataset
✅ reset_index() fix
✅ Bullish/bearish signal
✅ Poetry installed
✅ pyproject.toml
✅ Writing reflection

One honest note: The KeyError bug took longer to debug than expected. But that's realistic — real engineering is mostly reading error messages and inspecting data. Printing list(row.keys()) revealed the real column names and pointed straight at the fix. The lesson: when data doesn't match your expectation, inspect the data before editing the code.

What's Coming on Day 3

Day 3 covers NumPy + Pandas Foundations. I'll move beyond csv.DictReader and load the NIFTY50 data directly into a Pandas DataFrame. Then I'll compute real financial statistics — rolling averages, daily returns, volatility — using NumPy operations. The FinancialRecord objects from today will evolve into a proper DataFrame-based pipeline.

← Day 1 · Python Environment Phase 1 · Days 1–10 · Python & Math Foundations
