How to Use SQL with Python: Pandas, SQLAlchemy & SQLite
Python developers often ask: should I use SQL directly, or an ORM? Should I use Pandas? SQLite or PostgreSQL? Here's the practical answer — with working code for each approach.
Three Ways to Query Databases from Python
1. SQLite + sqlite3 (Built-in, No Setup)
Python includes sqlite3 in the standard library. It's perfect for local development, testing, and small projects.
```python
import sqlite3

# Connect to the database (creates the file if it doesn't exist)
conn = sqlite3.connect('users.db')
cursor = conn.cursor()

# Create table
cursor.execute('''
    CREATE TABLE IF NOT EXISTS users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        email TEXT UNIQUE,
        signup_date DATE
    )
''')

# Insert data, then commit so the row persists
cursor.execute(
    "INSERT INTO users (name, email, signup_date) VALUES (?, ?, ?)",
    ("Alice", "alice@example.com", "2026-03-01")
)
conn.commit()

# Query data
cursor.execute("SELECT * FROM users WHERE signup_date >= ?", ("2026-03-01",))
rows = cursor.fetchall()
for row in rows:
    print(row)

conn.close()
```

Best for: Scripts, data analysis, prototyping, testing. No external dependencies.
2. Pandas + SQL (Data Analysis)
If you're already using Pandas for data analysis, read_sql and to_sql let you move data between SQL and DataFrames seamlessly.
```python
import pandas as pd
import sqlite3

conn = sqlite3.connect('users.db')

# Read SQL query into a DataFrame
df = pd.read_sql(
    "SELECT * FROM users WHERE signup_date >= ?",
    conn,
    params=("2026-03-01",)
)

# Analyze
print(df.groupby('signup_date').size())

# Write the DataFrame back to the database
df.to_sql('users_backup', conn, if_exists='replace', index=False)

conn.close()
```

Best for: Data analysis, reporting, ETL pipelines. Combines SQL querying with Pandas manipulation.
3. SQLAlchemy ORM (Web Apps & Complex Logic)
SQLAlchemy is an ORM (Object-Relational Mapper) that lets you write database queries as Python code instead of raw SQL. It works with all major databases (SQLite, PostgreSQL, MySQL, and more) without changing your application code.
```python
from sqlalchemy import create_engine, Column, Integer, String, Date
from sqlalchemy.orm import declarative_base, sessionmaker
from datetime import date

# Setup
engine = create_engine('sqlite:///users.db')
Base = declarative_base()

# Define model
class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    email = Column(String, unique=True)
    signup_date = Column(Date)

Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()

# Insert
new_user = User(name="Bob", email="bob@example.com", signup_date=date.today())
session.add(new_user)
session.commit()

# Query
users = session.query(User).filter(User.signup_date >= date(2026, 3, 1)).all()
for user in users:
    print(user.name, user.email)

session.close()
```

Best for: Web applications (Flask, FastAPI), complex business logic, multi-database support.
Comparison: Which Should You Use?
| Approach | Setup | Learning Curve | Best For |
|---|---|---|---|
| sqlite3 | Built-in | Low | Scripts, prototypes |
| Pandas | pip install pandas | Low-Medium | Data analysis, ETL |
| SQLAlchemy | pip install sqlalchemy | Medium-High | Web apps, complex logic |
Real-World Examples
Example 1: ETL Pipeline with Pandas
Extract data from a CSV, transform it, load it into a database:
```python
import pandas as pd
import sqlite3

# Extract: Read CSV
df = pd.read_csv('sales_data.csv')

# Transform: Clean and aggregate
df['date'] = pd.to_datetime(df['date'])
# Store the month as text; SQLite can't adapt pandas Period objects
df['month'] = df['date'].dt.to_period('M').astype(str)
monthly_sales = df.groupby('month')['amount'].sum()

# Load: Write to database
conn = sqlite3.connect('analytics.db')
monthly_sales.to_sql('monthly_sales', conn, if_exists='append')
conn.close()
print("ETL complete")
```

Example 2: Flask Web App with SQLAlchemy
A simple Flask app that queries users from a database:
```python
from flask import Flask, jsonify
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from models import User  # Your SQLAlchemy model

app = Flask(__name__)
engine = create_engine('postgresql://user:pass@localhost/mydb')
Session = sessionmaker(bind=engine)

@app.route('/users')
def get_users():
    session = Session()
    try:
        users = session.query(User).filter(User.active == True).all()
        return jsonify([{'id': u.id, 'name': u.name} for u in users])
    finally:
        session.close()  # Runs even if the query raises

if __name__ == '__main__':
    app.run()
```

Example 3: Batch Processing with sqlite3
Process millions of rows efficiently:
```python
import sqlite3

conn = sqlite3.connect('large_dataset.db')
cursor = conn.cursor()

# Process in batches to avoid loading everything into memory
batch_size = 10000
offset = 0

while True:
    cursor.execute(
        "SELECT id, value FROM data LIMIT ? OFFSET ?",
        (batch_size, offset)
    )
    rows = cursor.fetchall()
    if not rows:
        break

    # Process the batch
    for row_id, value in rows:
        processed_value = value * 2  # Your logic here
        cursor.execute(
            "UPDATE data SET processed_value = ? WHERE id = ?",
            (processed_value, row_id)
        )

    conn.commit()
    offset += batch_size
    print(f"Processed {offset} rows")

conn.close()
```

Common Pitfalls & How to Avoid Them
1. SQL Injection (Always Use Parameterized Queries)
❌ Wrong:

```python
# VULNERABLE TO SQL INJECTION
user_id = "1; DROP TABLE users; --"
cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")
```

✅ Right:

```python
# SAFE: the driver escapes the parameter for you
cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
```

2. Forgetting to Commit Changes
INSERT, UPDATE, and DELETE don't persist until you call commit().
```python
cursor.execute("INSERT INTO users (name) VALUES (?)", ("Charlie",))
conn.commit()  # Don't forget this!
```

3. Not Closing Connections
Always close connections to free up resources. Note that sqlite3's connection context manager only commits or rolls back a transaction; it does not close the connection. Use contextlib.closing to close automatically:

```python
from contextlib import closing
import sqlite3

# closing() guarantees conn.close() when the block exits
with closing(sqlite3.connect('users.db')) as conn:
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM users")
    rows = cursor.fetchall()
```

Frequently Asked Questions
Should I use an ORM or write raw SQL?
ORMs (like SQLAlchemy) are great for web apps because they handle database abstraction and prevent SQL injection. Raw SQL is faster for data analysis and one-off queries. Use both — ORMs for application code, raw SQL for analytics.
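One way to mix the two is SQLAlchemy's text() construct, which runs raw SQL through the same engine your ORM code uses. A minimal sketch (the in-memory SQLite URL and the users table are just for illustration):

```python
from sqlalchemy import create_engine, text

engine = create_engine("sqlite:///:memory:")

# engine.begin() opens a transaction and commits it on exit
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"))
    # Named parameters (:name) keep raw SQL injection-safe
    conn.execute(text("INSERT INTO users (name) VALUES (:name)"), {"name": "Alice"})
    names = [row[0] for row in conn.execute(text("SELECT name FROM users"))]

print(names)  # ['Alice']
```

This keeps one connection pool and one transaction model for both styles of query.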
Can I use SQLAlchemy with SQLite?
Yes. SQLAlchemy works with any database — SQLite, PostgreSQL, MySQL, etc. Just change the connection string: sqlite:///local.db vs postgresql://user:pass@localhost/db.
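For instance, a sketch where only the URL changes between backends (the PostgreSQL credentials below are placeholders, and the PostgreSQL line needs a driver such as psycopg2 installed):

```python
from sqlalchemy import create_engine, text

# SQLite: file-based, no server required
engine = create_engine("sqlite:///local.db")

# PostgreSQL would differ only in the URL:
# engine = create_engine("postgresql://user:pass@localhost/db")

# Verify the connection works
with engine.connect() as conn:
    result = conn.execute(text("SELECT 1")).scalar()

print(result)  # 1
```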
How do I handle large datasets without running out of memory?
Use batch processing (fetch 10,000 rows at a time) or streaming. With Pandas, use the chunksize parameter: pd.read_sql(..., chunksize=10000).
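As a small sketch of chunked reading (the data table and its ten rows are made up for the example), chunksize turns read_sql into an iterator of DataFrames so only one chunk is in memory at a time:

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany("INSERT INTO data (value) VALUES (?)", [(i,) for i in range(10)])
conn.commit()

# Each iteration yields a DataFrame of at most 4 rows
total = 0
for chunk in pd.read_sql("SELECT * FROM data", conn, chunksize=4):
    total += len(chunk)  # process the chunk, then let it be garbage-collected

print(total)  # 10
conn.close()
```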
What's the performance difference between sqlite3, Pandas, and SQLAlchemy?
For raw query speed: sqlite3 > SQLAlchemy > Pandas. But the difference is usually negligible unless you're processing millions of rows. Choose based on your use case, not micro-optimizations.
Need to generate SQL queries from Python?
RegSQL generates accurate SQL from plain English. Use it to draft queries, then paste them into your Python code. Free, no sign-up.
🗄️ Generate SQL Queries Free →

The Bottom Line
sqlite3 for scripts and prototypes. Pandas for data analysis and ETL. SQLAlchemy for web applications. All three are powerful — pick the right tool for your job.
The key is understanding the tradeoffs: simplicity vs. abstraction, speed vs. convenience. Start simple with sqlite3, graduate to Pandas or SQLAlchemy as your needs grow.