Published

July 11, 2010

Projects

This part of the site collects some of the projects I have worked on.

  • Scenarios described in the projects have been edited.

  • Data has been changed.

  • Names have been replaced.

  • All identities have been erased and replaced with similar entities to maintain the original idea behind the projects.

Note: The API & ETL sections describe scraping projects as well, but they are more complete ETL examples than the General & Scrape sections, which focus on the specific methods used.

The projects and snippets in the first two sections, General & Scrape, are the foundation for the remaining projects.

General


Count Words

This short snippet describes the process used most often when scraping data for content. It is probably the most common sequence of steps for extracting relevant text from social media sites, APIs, or text files. Using a class, it searches for unique words and counts how often each one occurs.

Create constructor

Create methods

Transform lowercase

Replace punctuations

COUNT unique words

COUNT frequency
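
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the original project code: the class name `WordCounter` and its method names are assumptions, and punctuation is replaced with spaces before splitting.

```python
import string
from collections import Counter


class WordCounter:
    """Counts unique words and their frequencies in a piece of text.

    The class and method names are hypothetical, chosen for illustration.
    """

    def __init__(self, text):
        # Constructor: keep the raw text and prepare the cleaned word list.
        self.text = text
        self.words = self._clean(text)

    @staticmethod
    def _clean(text):
        # Lowercase, replace ASCII punctuation with spaces, split on whitespace.
        table = str.maketrans(string.punctuation, " " * len(string.punctuation))
        return text.lower().translate(table).split()

    def unique_words(self):
        # Sorted list of the distinct words.
        return sorted(set(self.words))

    def frequencies(self):
        # Mapping of word -> number of occurrences.
        return Counter(self.words)


counter = WordCounter("The cat sat. The cat ran!")
print(counter.unique_words())
print(counter.frequencies())
```

Replacing punctuation with spaces, not deleting it, keeps words such as "sat.The" from merging when a space is missing after a period.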

Dictionary - Count Words

This is a variation on the first method that counts words using a dictionary instead of a class.

Create dict

Conditional count

Fprint
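
A dictionary-based version might look like the sketch below. The function name `count_words` is an assumption, and "Fprint" is read here as printing with an f-string, which is also an assumption.

```python
def count_words(text):
    """Return a dict mapping each word to how often it appears."""
    counts = {}
    for word in text.lower().split():
        # Conditional count: increment known words, initialise new ones.
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts


word_counts = count_words("to be or not to be")
for word, count in word_counts.items():
    # f-string print of each word and its frequency.
    print(f"{word}: {count}")
```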

Scrape


IBM Site w BeautifulSoup

Scrape a webpage for links and images using BeautifulSoup.

REQUEST.GET().TEXT

BeautifulSoup

Scrape images
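
As a rough sketch of the link and image extraction, the example below parses a small inline HTML string so that it runs without network access. The HTML content and URLs are hypothetical. With a live page the string would come from `requests.get(url).text`.

```python
from bs4 import BeautifulSoup

# Hypothetical page content; in practice this string would come from
# requests.get(url).text for the page being scraped.
html = """
<html><body>
  <a href="https://example.com/about">About</a>
  <a href="https://example.com/contact">Contact</a>
  <img src="/images/logo.png" alt="Logo">
</body></html>
"""

# "html.parser" is Python's built-in parser, so no extra parser is needed.
soup = BeautifulSoup(html, "html.parser")

# Collect the href of every anchor tag that has one.
links = [a.get("href") for a in soup.find_all("a", href=True)]

# Collect the src of every image tag that has one.
images = [img.get("src") for img in soup.find_all("img", src=True)]

print(links)
print(images)
```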

Online GDP Table to CSV - Pandas

Scrape a data table from a webpage with Pandas, then sort, rename columns, extract the top 10, convert types, and save to CSV.

READ_HTML

.SHAPE

Col reName

Col extract

SORT

.ILOC

TYPE

Convert TYPE

ROUND
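
The table-processing steps above could be sketched as follows. The column names, placeholder values, and output filename are all assumptions. The DataFrame is built by hand here so the example runs offline, whereas a real run would load it with something like `pd.read_html(url)[0]`.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table (placeholder values only).
# With a real page this DataFrame would come from pd.read_html(url)[0].
raw = pd.DataFrame({
    "Country/Territory": [f"Country {i}" for i in range(1, 13)],
    "Estimate": [str(1000 * i) for i in range(1, 13)],  # numbers stored as text
    "Year": [2010] * 12,
})
print(raw.shape)  # (rows, columns)

# Rename columns, then keep only the two of interest.
df = raw.rename(columns={"Country/Territory": "Country",
                         "Estimate": "GDP (Million USD)"})
df = df[["Country", "GDP (Million USD)"]]

# Check the type, then convert the text column to integers so that
# sorting is numeric rather than alphabetical.
print(df["GDP (Million USD)"].dtype)
df["GDP (Million USD)"] = df["GDP (Million USD)"].astype(int)

# Sort descending and take the top 10 rows by position.
top10 = df.sort_values("GDP (Million USD)", ascending=False).iloc[:10].copy()

# Convert millions to billions and round to two decimals.
top10["GDP (Billion USD)"] = (top10["GDP (Million USD)"] / 1000).round(2)

# Save the result; the filename is an arbitrary choice.
top10.to_csv("top10_gdp.csv", index=False)
```

Converting the type before sorting matters: sorted as text, "9000" would rank above "12000".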

Online Bank Table - Pandas

Scrape online bank data table with Pandas.

READ_HTML

API


NBA JSON to DF - Pandas

Extract data from the NBA API. Parse the returned JSON, convert it to a df, then analyze it and generate insights using Pandas.

Create dict

Convert list to dict

Convert dict to df

Filter df

Search df

CALL API - Pandas

Review response

Create df

MEAN df calculation

matplotlib plot
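
The conversion, filtering, and mean steps can be illustrated with the sketch below. It does not call the NBA API. The records are hypothetical placeholders shaped like a list of per-game dicts, and the field names are assumptions. Plotting is left out.

```python
import pandas as pd

# Hypothetical records standing in for a parsed API response.
records = [
    {"game": 1, "location": "home", "points": 100},
    {"game": 2, "location": "away", "points": 90},
    {"game": 3, "location": "home", "points": 110},
]

# Convert the list of dicts into a dict of lists (one list per column).
columns = {key: [] for key in records[0]}
for record in records:
    for key, value in record.items():
        columns[key].append(value)

# Convert the dict to a DataFrame.
df = pd.DataFrame(columns)

# Filter: keep only the home games.
home_games = df[df["location"] == "home"]

# Search: look up a single row by game number and read one value.
game_2_points = df[df["game"] == 2]["points"].iloc[0]

# Mean calculation on the filtered rows.
home_mean = home_games["points"].mean()

print(home_games)
print(game_2_points, home_mean)
```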

RandomUser - Pandas

Call the API for a list and convert the list to a df. Call the API to get users' information, append it to a list, and convert the list to a df using Pandas.

Create user

Generate users

Convert list to df

Generate columns and save to df

APPEND to df

Fruityvice JSON to DF

Send a request to the API, retrieve the JSON text, normalize/flatten the JSON, convert it to a df, and analyze.

REQUEST.GET

JSON.LOADS

Convert JSON to DF

NORMALIZE/FLATTEN df

Filter/Extract df

.LOC
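
A minimal sketch of the parse, flatten, and select steps is shown below. The JSON text is a hypothetical placeholder with a nested `nutritions` object. The field names and values are assumptions, not real API output. A live run would obtain the text with `requests.get(url).text`.

```python
import json

import pandas as pd

# Hypothetical response text with placeholder values.
text = """
[
  {"name": "Fruit A", "family": "Family X",
   "nutritions": {"calories": 50, "sugar": 10.0}},
  {"name": "Fruit B", "family": "Family Y",
   "nutritions": {"calories": 90, "sugar": 12.5}}
]
"""

# Parse the JSON text into a list of dicts.
data = json.loads(text)

# Flatten nested objects into dotted column names such as
# "nutritions.calories".
df = pd.json_normalize(data)

# Label-based selection with .loc: rows for one fruit, two columns.
row = df.loc[df["name"] == "Fruit B", ["family", "nutritions.calories"]]
family = row.iloc[0]["family"]
calories = row.iloc[0]["nutritions.calories"]

print(df.columns.tolist())
print(family, calories)
```

Passing the parsed list straight to `pd.DataFrame` would leave `nutritions` as a single column of dicts, which is why the flattening step is useful.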

Jokes JSON to DF

Send a request to the jokes API, retrieve the JSON text, convert it to a df, drop columns, analyze, and enjoy the jokes.

REQUEST.GET

JSON.LOADS

Convert JSON to DF

DROP col
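
The JSON-to-df and column-drop steps might look like this sketch. The payload is a hypothetical placeholder, and the field names (`id`, `type`, `setup`, `punchline`) are assumptions about the response shape.

```python
import json

import pandas as pd

# Hypothetical response text; a real call would use requests.get(url).text.
text = """
[
  {"id": 1, "type": "general", "setup": "Setup one", "punchline": "Punchline one"},
  {"id": 2, "type": "general", "setup": "Setup two", "punchline": "Punchline two"}
]
"""

# Parse the JSON text and build a DataFrame with one row per joke.
jokes = json.loads(text)
df = pd.DataFrame(jokes)

# Drop the columns that are not needed for reading the jokes.
df = df.drop(columns=["id", "type"])

print(df)
```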