Movie Data Analysis (HTML)

Author

Keith Galli

Overview

Code
import pandas as pd
import matplotlib.pyplot as plt
from helpers import plot_release_year_distribution, create_keyword_wordcloud, create_genre_distribution

df = pd.read_csv("./data/TMDB-Small.csv")

# Uncomment to preview the first rows:
# df.head()

Distribution by year

Code
plot_release_year_distribution(df)

Figure: Movie Release Year Distribution
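The `plot_release_year_distribution` helper is imported from `helpers` and its body is not shown here. As a rough sketch of the counting step such a plot needs, the following assumes a `release_date` column holding date strings; the column name, the function name `release_year_counts`, and the toy rows are all assumptions for illustration, not taken from the dataset.

```python
import pandas as pd

def release_year_counts(frame, date_col="release_date"):
    """Count movies per release year; missing or unparseable dates are dropped."""
    # errors="coerce" turns anything that is not a date into NaT instead of raising
    years = pd.to_datetime(frame[date_col], errors="coerce").dt.year
    return years.dropna().astype(int).value_counts().sort_index()

# Hypothetical rows, including a missing and a malformed date
toy = pd.DataFrame(
    {"release_date": ["2019-06-28", "2019-01-01", "1999-03-31", None, "not a date"]}
)
counts = release_year_counts(toy)
print(counts)
```

Calling `counts.plot(kind="bar")` on the result would draw the distribution with matplotlib.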

Plotly Scatter Plot

Code
import plotly.express as px

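# Use the first entry of the comma-separated genres string as the primary genre
df['primary_genre'] = df['genres'].str.split(',').str[0].str.strip()

# One point per movie, colored by primary genre; hovering shows the title
fig = px.scatter(
    df,
    x='vote_count',
    y='vote_average',
    hover_data=['title'],
    color='primary_genre',
    title='Vote Count vs Vote Average',
)
fig.show()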
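To see what the `primary_genre` expression does, here is a small sketch on made-up rows. It assumes `genres` is a comma-separated string, as the `split(',')` call above implies. The `fillna("Unknown")` step is an addition for illustration and is not part of the original cell.

```python
import pandas as pd

# Hypothetical rows: a multi-genre movie, a single-genre movie, and a missing value
toy = pd.DataFrame({"genres": ["Comedy, Music, Romance", "Drama", None]})

# Same chain as in the report: split on commas, take the first piece, trim spaces
toy["primary_genre"] = toy["genres"].str.split(",").str[0].str.strip()

# Missing genres stay missing through the .str accessors; label them explicitly
labeled = toy["primary_genre"].fillna("Unknown")
print(labeled.tolist())
```

Labeling the missing values is one way to keep those movies visible as their own group rather than leaving them without a color category.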

Word Cloud

Code
create_keyword_wordcloud(df)
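The body of `create_keyword_wordcloud` is not shown either. A word cloud scales each term by its frequency, so the core of such a helper is a frequency count. The sketch below assumes keywords arrive as comma-separated strings; the function name `keyword_frequencies` and the sample strings are hypothetical.

```python
from collections import Counter

def keyword_frequencies(keyword_strings):
    """Count how often each keyword appears across comma-separated strings."""
    counts = Counter()
    for entry in keyword_strings:
        if not isinstance(entry, str):
            continue  # skip missing values such as None or NaN
        # Normalize case and whitespace so "Musician" and " musician" match
        counts.update(k.strip().lower() for k in entry.split(",") if k.strip())
    return counts

# Hypothetical keyword strings, one per movie
sample = ["The Beatles, musician, alternate reality", "musician, london", None]
freqs = keyword_frequencies(sample)
print(freqs.most_common(2))
```

A mapping like `freqs` is the kind of input that `WordCloud.generate_from_frequencies` in the `wordcloud` package accepts, though whether the helper uses that package is an assumption.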

Genre Distribution

Code
create_genre_distribution(df)
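Unlike `primary_genre`, which keeps only the first genre per movie, a genre distribution would plausibly count every genre a movie lists. One possible sketch of that count uses `explode`, again assuming a comma-separated `genres` column; `genre_counts` is a hypothetical name and may differ from what `create_genre_distribution` actually does.

```python
import pandas as pd

def genre_counts(frame, genre_col="genres"):
    """Count every listed genre, so a movie with three genres contributes three times."""
    genres = (
        frame[genre_col]
        .dropna()          # ignore movies with no genre information
        .str.split(",")    # "Comedy, Music" -> ["Comedy", " Music"]
        .explode()         # one row per genre
        .str.strip()
    )
    return genres[genres != ""].value_counts()

# Hypothetical rows
toy = pd.DataFrame({"genres": ["Comedy, Music, Romance", "Drama, Romance", None]})
counts = genre_counts(toy)
print(counts)
```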

Random Movie

Code
# Pick one random row from the dataset
movie = df.sample(1)

# Build the poster image URL from that row's poster_path
url = f"https://image.tmdb.org/t/p/w600_and_h900_bestv2{movie['poster_path'].values[0]}"

Title: Yesterday

Description: Jack Malik is a struggling singer-songwriter in an English seaside town whose dreams of fame are rapidly fading, despite the fierce devotion and support of his childhood best friend, Ellie. After a freak bus accident during a mysterious global blackout, Jack wakes up to discover that he’s the only person on Earth who can remember The Beatles.
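The random-movie cell builds the poster URL by direct string formatting, which would produce a URL ending in `nan` for a row with no poster. A guarded variant might look like the sketch below. The base URL is the one from the cell above; the function name `poster_url` and the toy rows are hypothetical.

```python
import pandas as pd

POSTER_BASE = "https://image.tmdb.org/t/p/w600_and_h900_bestv2"

def poster_url(poster_path):
    """Return the full poster URL, or None when the path is missing or empty."""
    if not isinstance(poster_path, str) or not poster_path:
        return None
    return POSTER_BASE + poster_path

# Hypothetical rows; the second one has no poster
toy = pd.DataFrame(
    {"title": ["Example A", "Example B"], "poster_path": ["/abc123.jpg", None]}
)

# random_state makes the sample reproducible; drop it for a new pick on each run
movie = toy.sample(1, random_state=0).iloc[0]
print(movie["title"], poster_url(movie["poster_path"]))
```

In a notebook, passing a non-empty URL to `IPython.display.Image(url=...)` is one way to render the poster next to the title and description.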