ETL - Howto
A collection of ETL processes
Functions
Movies Table - BS into DB
Extract an online movies table with requests.get. Parse it with BeautifulSoup (BS) into a DataFrame (df). Save the df to CSV. Create a SQLite3 DB. Load the df into the DB
REQUESTS.GET
BeautifulSoup
FIND_ALL tables
FIND_ALL rows
LOOP rows into df
SAVE df to CSV
CREATE SQLITE3 DB
SAVE df to db
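The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual function: the names `parse_first_table` and `movies_etl`, the default table name `movies`, and the assumption that the first row of the table holds `<th>` header cells are all hypothetical.

```python
import sqlite3

import pandas as pd
import requests
from bs4 import BeautifulSoup


def parse_first_table(html):
    """Parse the first <table> in an HTML string into a DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    tables = soup.find_all("table")          # find all tables
    rows = tables[0].find_all("tr")          # find all rows of the first table
    # Assumption: the first row holds <th> header cells.
    header = [th.get_text(strip=True) for th in rows[0].find_all("th")]
    records = []
    for row in rows[1:]:                     # loop rows into the df
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) == len(header):        # skip malformed or spacer rows
            records.append(cells)
    return pd.DataFrame(records, columns=header)


def movies_etl(url, csv_path, db_path, table_name="movies"):
    """Download the page, parse the table, save it to CSV and to SQLite3."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    df = parse_first_table(response.text)
    df.to_csv(csv_path, index=False)         # save df to CSV
    conn = sqlite3.connect(db_path)          # creates the DB file if missing
    try:
        df.to_sql(table_name, conn, if_exists="replace", index=False)
    finally:
        conn.close()
    return df
```

All parsed cells are kept as strings; any numeric conversion would be a separate transform step.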
Scripts
This section holds complete scripts: each file contains a full ETL that runs when the file is executed. The scripts are not automated yet; automation will be covered in another section.
Employee CSV - SQLite3 - SQL DB
Import CSV file with open(). Create DB. Query. Transform. Save DB
WITH OPEN
Read and write the file
READ_CSV
create SQLite DB
TO_SQL to load data to DB
READ_SQL to query DB
save DB
CLOSE connection
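A minimal sketch of this flow, under stated assumptions: the column names (`name`, `department`, `salary`), the sample rows, the table name `employees`, and the function names are all hypothetical and only serve to make the example self-contained.

```python
import sqlite3

import pandas as pd


def write_sample_csv(csv_path):
    """Open the file in write mode and create a tiny CSV (hypothetical data)."""
    with open(csv_path, "w", encoding="utf-8", newline="") as f:
        f.write("name,department,salary\n")
        f.write("Ana,Sales,50000\n")
        f.write("Ben,IT,65000\n")
        f.write("Cid,IT,70000\n")


def employee_etl(csv_path, db_path, table_name="employees"):
    """Read the CSV, load it into SQLite3, and query it back."""
    # Open the file in read mode to peek at the raw header line.
    with open(csv_path, "r", encoding="utf-8") as f:
        header = f.readline().strip().split(",")

    df = pd.read_csv(csv_path)                      # READ_CSV
    conn = sqlite3.connect(db_path)                 # create the SQLite DB
    try:
        df.to_sql(table_name, conn, if_exists="replace", index=False)  # TO_SQL
        query = (
            f"SELECT department, AVG(salary) AS avg_salary "
            f"FROM {table_name} GROUP BY department ORDER BY department"
        )
        summary = pd.read_sql(query, conn)          # READ_SQL
        conn.commit()                               # save the DB
    finally:
        conn.close()                                # close the connection
    return header, summary
```

Closing the connection in a `finally` block means the DB file is released even if the load or the query raises.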
Multiple Sources - json csv xml - Pandas - GLOB - Log Processes
Import from multiple sources: json, csv, xml. Extract with GLOB. Transform. Load to CSV. Log the entire ETL process
WGET zip - shell
UNZIP - shell
REQUESTS.GET
RESPONSE.CONTENT
EXTRACTALL
CSV EXTRACT
JSON EXTRACT
XML EXTRACT
ElementTree
GLOB
Loop over the GLOB list to extract each file
Transform data
Load to CSV
LOG process
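The Python side of these steps could look roughly like the sketch below. It is an illustration under assumptions, not the script itself: the JSON files are assumed to hold one object per line, the XML root is assumed to hold one child element per record, the transform is reduced to a single numeric cast, and all function names and the logger name `etl_sketch` are hypothetical.

```python
import glob
import io
import logging
import os
import xml.etree.ElementTree as ET
import zipfile

import pandas as pd


def unzip_bytes(content, dest_folder):
    """Unpack zip bytes (e.g. requests.get(url).content) with extractall."""
    with zipfile.ZipFile(io.BytesIO(content)) as archive:
        archive.extractall(dest_folder)


def extract_csv(path):
    return pd.read_csv(path)


def extract_json(path):
    # Assumption: one JSON object per line (JSON Lines).
    return pd.read_json(path, lines=True)


def extract_xml(path):
    # Assumption: the root element holds one child element per record.
    root = ET.parse(path).getroot()
    return pd.DataFrame([{field.tag: field.text for field in record} for record in root])


def make_logger(log_path):
    """Return a logger that appends timestamped lines to log_path."""
    logger = logging.getLogger("etl_sketch")
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        handler = logging.FileHandler(log_path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def run_etl(source_folder, out_csv, log_path, numeric_column):
    logger = make_logger(log_path)
    readers = {"*.csv": extract_csv, "*.json": extract_json, "*.xml": extract_xml}
    frames = []
    for pattern, reader in readers.items():
        # glob builds the list of matching files; loop over it to extract each one.
        for path in sorted(glob.glob(os.path.join(source_folder, pattern))):
            logger.info("extract %s", path)
            frames.append(reader(path))
    df = pd.concat(frames, ignore_index=True)

    # Transform: XML values arrive as strings, so cast the column to numbers.
    logger.info("transform: cast %s to numeric", numeric_column)
    df[numeric_column] = pd.to_numeric(df[numeric_column])

    df.to_csv(out_csv, index=False)
    logger.info("load: wrote %d rows to %s", len(df), out_csv)
    return df
```

Writing the output CSV outside `source_folder` keeps a later run from picking it up through the `*.csv` pattern.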
GDP - Pandas - SQLite3 - ETL w Log
Scrape online GDP table from site with Pandas. Save to CSV. Save in SQLite3 DB. Query. Log ETL process
SQLite3
READ_HTML
df.TO_CSV
df.TO_SQL
READ_SQL
LOG file
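As a rough sketch of these steps, assuming hypothetical function names, table name, and query: `extract_table` wraps `pandas.read_html`, and `load_and_query` covers the CSV, SQLite3, and query steps while appending a timestamped line to a log file after each one.

```python
import sqlite3
from datetime import datetime

import pandas as pd


def log_progress(message, log_path):
    """Append one timestamped line per ETL step to the log file."""
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(f"{timestamp} : {message}\n")


def extract_table(url, table_index=0):
    """Return one of the tables pandas.read_html finds on the page.

    Note: read_html needs an HTML parser installed (lxml, or bs4 with html5lib).
    Which table index holds the GDP data depends on the page.
    """
    return pd.read_html(url)[table_index]


def load_and_query(df, csv_path, db_path, table_name, query, log_path):
    """Save df to CSV and SQLite3, run a query, and log every step."""
    df.to_csv(csv_path, index=False)
    log_progress("Saved DataFrame to CSV", log_path)

    conn = sqlite3.connect(db_path)
    try:
        df.to_sql(table_name, conn, if_exists="replace", index=False)
        log_progress("Loaded DataFrame into SQLite3", log_path)
        result = pd.read_sql(query, conn)
        log_progress("Ran query", log_path)
    finally:
        conn.close()
    return result
```

A call might look like `load_and_query(extract_table(url), "gdp.csv", "gdp.db", "gdp", "SELECT * FROM gdp", "etl.log")`, where every file and table name is a placeholder.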
GDP - BS - SQLite3 - ETL w Log
Scrape online GDP table from site. Parse with BeautifulSoup. Save to CSV. Save in SQLite3 DB. Query. Log ETL process