@willettk
Last active November 24, 2016 17:12
Oscar winners in acting (as of 2016)
Emil Jannings
Janet Gaynor
Warner Baxter
Mary Pickford
George Arliss
Norma Shearer
Lionel Barrymore
Marie Dressler
Wallace Beery
Fredric March
Helen Hayes
Charles Laughton
Katharine Hepburn
Clark Gable
Claudette Colbert
Victor McLaglen
Bette Davis
Paul Muni
Walter Brennan
Luise Rainer
Gale Sondergaard
Spencer Tracy
Joseph Schildkraut
Luise Rainer
Alice Brady
Spencer Tracy
Walter Brennan
Bette Davis
Fay Bainter
Robert Donat
Thomas Mitchell
Vivien Leigh
Hattie McDaniel
James Stewart
Walter Brennan
Ginger Rogers
Jane Darwell
Gary Cooper
Donald Crisp
Joan Fontaine
Mary Astor
James Cagney
Van Heflin
Greer Garson
Teresa Wright
Paul Lukas
Charles Coburn
Jennifer Jones
Katina Paxinou
Bing Crosby
Barry Fitzgerald
Ingrid Bergman
Ethel Barrymore
Ray Milland
James Dunn
Joan Crawford
Anne Revere
Fredric March
Harold Russell
Olivia de Havilland
Anne Baxter
Ronald Colman
Edmund Gwenn
Loretta Young
Celeste Holm
Laurence Olivier
Walter Huston
Jane Wyman
Claire Trevor
Broderick Crawford
Dean Jagger
Olivia de Havilland
Mercedes McCambridge
José Ferrer
George Sanders
Judy Holliday
Josephine Hull
Humphrey Bogart
Karl Malden
Vivien Leigh
Kim Hunter
Gary Cooper
Anthony Quinn
Shirley Booth
Gloria Grahame
William Holden
Frank Sinatra
Audrey Hepburn
Donna Reed
Marlon Brando
Edmond O'Brien
Grace Kelly
Eva Marie Saint
Ernest Borgnine
Jack Lemmon
Anna Magnani
Jo Van Fleet
Yul Brynner
Anthony Quinn
Ingrid Bergman
Dorothy Malone
Alec Guinness
Red Buttons
Joanne Woodward
Miyoshi Umeki
David Niven
Burl Ives
Susan Hayward
Wendy Hiller
Charlton Heston
Hugh Griffith
Simone Signoret
Shelley Winters
Burt Lancaster
Peter Ustinov
Elizabeth Taylor
Shirley Jones
Maximilian Schell
George Chakiris
Sophia Loren
Rita Moreno
Gregory Peck
Ed Begley
Anne Bancroft
Patty Duke
Sidney Poitier
Melvyn Douglas
Patricia Neal
Margaret Rutherford
Rex Harrison
Peter Ustinov
Julie Andrews
Lila Kedrova
Lee Marvin
Martin Balsam
Julie Christie
Shelley Winters
Paul Scofield
Walter Matthau
Elizabeth Taylor
Sandy Dennis
Rod Steiger
George Kennedy
Katharine Hepburn
Estelle Parsons
Cliff Robertson
Jack Albertson
Katharine Hepburn
Barbra Streisand
Ruth Gordon
John Wayne
Gig Young
Maggie Smith
Goldie Hawn
George C. Scott
John Mills
Glenda Jackson
Helen Hayes
Gene Hackman
Ben Johnson
Jane Fonda
Cloris Leachman
Marlon Brando
Joel Grey
Liza Minnelli
Eileen Heckart
Jack Lemmon
John Houseman
Glenda Jackson
Tatum O'Neal
Art Carney
Robert De Niro
Ellen Burstyn
Ingrid Bergman
Jack Nicholson
George Burns
Louise Fletcher
Lee Grant
Peter Finch
Jason Robards
Faye Dunaway
Beatrice Straight
Richard Dreyfuss
Jason Robards
Diane Keaton
Vanessa Redgrave
Jon Voight
Christopher Walken
Jane Fonda
Maggie Smith
Dustin Hoffman
Melvyn Douglas
Sally Field
Meryl Streep
Robert De Niro
Timothy Hutton
Sissy Spacek
Mary Steenburgen
Henry Fonda
John Gielgud
Katharine Hepburn
Maureen Stapleton
Ben Kingsley
Louis Gossett, Jr.
Meryl Streep
Jessica Lange
Robert Duvall
Jack Nicholson
Shirley MacLaine
Linda Hunt
F. Murray Abraham
Haing S. Ngor
Sally Field
Peggy Ashcroft
William Hurt
Don Ameche
Geraldine Page
Anjelica Huston
Paul Newman
Michael Caine
Marlee Matlin
Dianne Wiest
Michael Douglas
Sean Connery
Cher
Olympia Dukakis
Dustin Hoffman
Kevin Kline
Jodie Foster
Geena Davis
Daniel Day-Lewis
Denzel Washington
Jessica Tandy
Brenda Fricker
Jeremy Irons
Joe Pesci
Kathy Bates
Whoopi Goldberg
Anthony Hopkins
Jack Palance
Jodie Foster
Mercedes Ruehl
Al Pacino
Gene Hackman
Emma Thompson
Marisa Tomei
Tom Hanks
Tommy Lee Jones
Holly Hunter
Anna Paquin
Tom Hanks
Martin Landau
Jessica Lange
Dianne Wiest
Nicolas Cage
Kevin Spacey
Susan Sarandon
Mira Sorvino
Geoffrey Rush
Cuba Gooding, Jr.
Frances McDormand
Juliette Binoche
Jack Nicholson
Robin Williams
Helen Hunt
Kim Basinger
Roberto Benigni
James Coburn
Gwyneth Paltrow
Judi Dench
Kevin Spacey
Michael Caine
Hilary Swank
Angelina Jolie
Russell Crowe
Benicio Del Toro
Julia Roberts
Marcia Gay Harden
Denzel Washington
Jim Broadbent
Halle Berry
Jennifer Connelly
Adrien Brody
Chris Cooper
Nicole Kidman
Catherine Zeta-Jones
Sean Penn
Tim Robbins
Charlize Theron
Renée Zellweger
Jamie Foxx
Morgan Freeman
Hilary Swank
Cate Blanchett
Philip Seymour Hoffman
George Clooney
Reese Witherspoon
Rachel Weisz
Forest Whitaker
Alan Arkin
Helen Mirren
Jennifer Hudson
Daniel Day-Lewis
Javier Bardem
Marion Cotillard
Tilda Swinton
Sean Penn
Heath Ledger
Kate Winslet
Penélope Cruz
Jeff Bridges
Christoph Waltz
Sandra Bullock
Mo'Nique
Colin Firth
Christian Bale
Natalie Portman
Melissa Leo
Jean Dujardin
Christopher Plummer
Meryl Streep
Octavia Spencer
Daniel Day-Lewis
Christoph Waltz
Jennifer Lawrence
Anne Hathaway
Matthew McConaughey
Jared Leto
Cate Blanchett
Lupita Nyong'o
Eddie Redmayne
J.K. Simmons
Julianne Moore
Patricia Arquette
Leonardo DiCaprio
Mark Rylance
Brie Larson
Alicia Vikander
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The most star-studded\\* movie of all time\n",
"### *Kyle Willett*\n",
"##### 6 Mar 2016"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I play a lot of trivia games (college Quiz Bowl, LL, various pub quizzes, etc.). One of the things I learned early on, especially in Quiz Bowl, was the disproportionate attention paid to award winners. You're more likely to get asked about (and to have been taught in school, or learned elsewhere) things that have won prizes: books that won the Pulitzer, or authors who won the Nobel Prize. This is usually an additional bump *separate from the actual importance of the work*, which usually takes time and historical context to evaluate. There are prize-winners that are cringingly embarrassing in hindsight (frontal lobotomies winning the Nobel Prize in Medicine, Henry Kissinger winning the Nobel Peace Prize, *Crash* over *Brokeback Mountain* for Best Picture), and people who were overlooked for what are now pretty much universally recognized as genius-level accomplishments (Einstein never won a Nobel Prize for relativity; *Citizen Kane* didn't win Best Picture or Best Director; Leo Tolstoy, Virginia Woolf, and Chinua Achebe never won the Nobel Prize).\n",
"\n",
"So while counting only the people who've won specific awards is never a measure of actual quality (there are so, so many biases in play), it does give a discrete data set that can be fun to play with. With the [88th Academy Awards](https://en.wikipedia.org/wiki/88th_Academy_Awards) taking place last week, a question popped into my head that I thought it'd be fun to answer:\n",
"\n",
"> *\"Which movie had the highest number of Oscar winners in its cast?\"*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ground assumptions I made:\n",
"\n",
"* This only counts individuals who won competitive Academy Awards for [Best Actor](https://en.wikipedia.org/wiki/Academy_Award_for_Best_Actor), [Best Actress](https://en.wikipedia.org/wiki/Academy_Award_for_Best_Actress), [Best Supporting Actor](https://en.wikipedia.org/wiki/Academy_Award_for_Best_Supporting_Actor), or [Best Supporting Actress](https://en.wikipedia.org/wiki/Academy_Award_for_Best_Supporting_Actress). I'm not counting people who only won an [Academy Honorary Award](https://en.wikipedia.org/wiki/Academy_Honorary_Award) or [Academy Juvenile Award](https://en.wikipedia.org/wiki/Academy_Juvenile_Award).\n",
"* Only people who acted in a film and won an acting Oscar are considered. Many well-known actors have won Academy Awards in other categories (*e.g.*, **Mel Gibson** and **Kevin Costner** for directing; **Woody Allen**, **Emma Thompson**, and **Matt Damon** for screenplays; **Brad Pitt** and **George Clooney** for producing). However, they're only considered here if they won an acting award.\n",
"* Similarly, I'm only counting people who were acting in the film. *Into the Wild* was directed by two-time Oscar-winner **Sean Penn**, but he doesn't appear in the movie. So that movie would only count as having two Oscar winners (**Marcia Gay Harden** and **William Hurt**).\n",
"* No double-counting. Forty actors have multiple acting Academy Awards (**Katharine Hepburn** leading the pack with four), but each actor only counts once as an Oscar-winner for any given film.\n",
"* This only counts theatrically released movies. That excludes TV series, made-for-TV movies, and direct-to-video releases.\n",
"* Roles credited as appearing as himself/herself are not counted, mostly because they're separately stored in IMDb.\n",
"* Finally: this list gives credit for someone winning an Academy Award *at any point in their career*. This means a person didn't have to win their Oscar prior to (or as a result of) the film in question. So *The Sound of Music*, for example, has two Oscar-winners (**Julie Andrews** and **Christopher Plummer**), even though only Andrews had won hers by 1965, the year the movie was released (Plummer wouldn't win one for another 47 years)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The definitive source for this will be the [Internet Movie Database (IMDb)](http://www.imdb.com/), which has more than 5.4 million actors and 17 million movies/TV shows as of 2016. There's a Python wrapper to access the database called [IMDbPY](http://imdbpy.sourceforge.net/), which we'll use here. The package has some frustrating limitations (it's not clear whether the package or the IMDb API is the cause), but it's sufficient to answer this question."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Import the packages that we'll need\n",
"\n",
"%matplotlib inline\n",
"from collections import Counter  # used below to tally appearances per movie\n",
"\n",
"from imdb import IMDb\n",
"import numpy as np\n",
"from matplotlib import pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, I need a list of all the Oscar winners. I went to the horse's mouth and used the [official database](http://awardsdatabase.oscars.org) from the Academy of Motion Picture Arts and Sciences. I did a Basic Search with Award Category \"Acting ...(all)\" and selected \"Winners Only\".\n",
"\n",
"This gives a list of all Oscar winners. While the list looks nice in the browser, it isn't well-formatted for analysis, and there are no options for output in a format like JSON or CSV. The URL for the list also includes a link to a JavaScript cursor that times out quickly and isn't archivable. So, the messy option: copy the page's source code, delete all lines that don't have actors' names on them (in Vim, something like `:g!/BSNomination/d`), and then do some visual block cuts to get rid of the remaining HTML. I also took the opportunity to delete the special acting awards in this list (there were 14 citations as of 2016).\n",
"\n",
"I saved the resulting list as a CSV file and stored it online as a gist. "
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Download the list of winners\n",
"\n",
"import urllib2\n",
"\n",
"gist_url = \"https://gist.githubusercontent.com/willettk/f6850049e279ec1775c8\"\n",
"raw_extension = \"raw/f18316b1b86ffd74cb3dfb74f73b947be48d2a93/oscar_winners.csv\"\n",
"txt = urllib2.urlopen(\"{0}/{1}\".format(gist_url,raw_extension)).read()\n",
"winners = txt.split(\"\\n\")"
]
},
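{
"cell_type": "markdown",
"metadata": {},
"source": [
"Splitting on newlines assumes exactly one name per line and no trailing newline. A slightly more defensive version might look like the sketch below. The `parse_winners` helper is hypothetical (it isn't part of the analysis that was actually run), and the sample text is just three names from the list. Names are kept whole, so an entry containing a comma, like `Louis Gossett, Jr.`, stays a single winner.\n",
"\n",
"```python\n",
"def parse_winners(txt):\n",
"    # One name per line; surrounding whitespace is stripped\n",
"    names = [line.strip() for line in txt.splitlines()]\n",
"    # Drop empty entries, e.g. from a blank line or a trailing newline\n",
"    return [name for name in names if name]\n",
"\n",
"# Small sample with a blank line and a trailing newline\n",
"sample = '''Emil Jannings\n",
"Janet Gaynor\n",
"\n",
"Louis Gossett, Jr.\n",
"'''\n",
"print(parse_winners(sample))\n",
"```"
]
},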
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1st Academy Awards: ['Emil Jannings', 'Janet Gaynor']\n",
"88th Academy Awards: ['Leonardo DiCaprio', 'Mark Rylance', 'Brie Larson', 'Alicia Vikander']\n"
]
}
],
"source": [
"# Check to see that data is there:\n",
"\n",
"print \"1st Academy Awards: {0}\".format(winners[:2]) # No Supporting Actor/Actress awards until 1937.\n",
"print \"88th Academy Awards: {0}\".format(winners[-4:])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now to find all the movies for everyone in the list of Oscar winners. This can be done by querying the IMDb API."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Use the web version of the IMDb database. I tried installing a local SQL copy of the database,\n",
"# but operations were ludicrously slow for searching and retrieving data. Will have to brave\n",
"# the rate limits of the online API.\n",
"\n",
"ia = IMDb()\n",
"\n",
"def get_full_record(name):\n",
"    '''\n",
"    We're going to assume that the first result in the list (which is sorted by IMDb for \"relevance\") is\n",
"    actually the actor/actress we want, and not someone with a similar name. This should mostly be\n",
"    a safe bet for an Oscar-winning performer, although not certain.\n",
"\n",
"    There is a function in IMDbPY for get_person_awards(), but it doesn't seem to work. We could\n",
"    search for the matching movie for which they won their Oscar, but that would require\n",
"    an additional table. For now, going with the top result.\n",
"    '''\n",
"    results = ia.search_person(name)\n",
"    person = results[0]\n",
"    # Fetch the full record (filmography, biography, etc.) for this person\n",
"    ia.update(person)\n",
"\n",
"    return person"
]
},
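{
"cell_type": "markdown",
"metadata": {},
"source": [
"Taking the top search result can go wrong in two ways: the search may return nothing, or the first hit may be a different person with a similar name. A minimal sketch of a more careful choice is below. It's an illustration under assumptions, not the method used here: `pick_best_match` is a hypothetical helper, it only assumes that each result can be indexed with `['name']`, and plain dictionaries (one with a made-up name) stand in for IMDbPY objects.\n",
"\n",
"```python\n",
"def pick_best_match(name, results):\n",
"    # results: search hits in relevance order, each indexable by 'name'\n",
"    if not results:\n",
"        return None  # nothing found; the caller decides what to do\n",
"    wanted = name.strip().lower()\n",
"    for hit in results:\n",
"        if hit['name'].strip().lower() == wanted:\n",
"            return hit  # exact (case-insensitive) name match\n",
"    return results[0]  # otherwise fall back to the most relevant hit\n",
"\n",
"# Stand-in search results; the first name is invented for the example\n",
"hits = [{'name': 'Helen Hunter'}, {'name': 'Helen Hunt'}]\n",
"best = pick_best_match('Helen Hunt', hits)\n",
"```\n",
"\n",
"Returning `None` instead of raising keeps a long batch of lookups from dying on a single bad name."
]
},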
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Get data for all actors/actresses\n",
"\n",
"# IMPORTANT: do not run this query more than once if at all possible. The IMDb site limits either the\n",
"# total number of queries (or the query rate; not sure which) that you can run for the API.\n",
"# Too many and you'll start getting empty responses and have to wait a day (or risk being blocked).\n",
"\n",
"data = [get_full_record(name) for name in winners]"
]
},
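{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because the query above is slow and rate-limited, and because the winners list repeats anyone with more than one Oscar, it can help to pause between requests and to look each name up only once. The sketch below shows the idea under stated assumptions: `fetch_all`, the `delay` value, and the in-memory `cache` are hypothetical, and `fetch` stands in for a function like `get_full_record`. It is not what was run for this notebook.\n",
"\n",
"```python\n",
"import time\n",
"\n",
"def fetch_all(names, fetch, cache=None, delay=1.0):\n",
"    # cache maps name -> record, so a repeated name costs only one query\n",
"    cache = {} if cache is None else cache\n",
"    records = []\n",
"    for name in names:\n",
"        if name not in cache:\n",
"            cache[name] = fetch(name)\n",
"            time.sleep(delay)  # hypothetical pause between queries\n",
"        records.append(cache[name])\n",
"    return records\n",
"\n",
"# Usage with a stand-in fetch function, so no network access is needed\n",
"calls = []\n",
"def fake_fetch(name):\n",
"    calls.append(name)\n",
"    return name.upper()\n",
"\n",
"out = fetch_all(['Luise Rainer', 'Spencer Tracy', 'Luise Rainer'], fake_fetch, delay=0)\n",
"```\n",
"\n",
"The returned list still has one entry per line of the winners file, so repeat winners appear more than once in it, just as they do in `data`."
]
},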
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Alicia Vikander\n"
]
}
],
"source": [
"# Look at an example\n",
"\n",
"ex = data[-1]\n",
"\n",
"print ex"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'imdb.Person.Person'>\n",
"name Alicia Vikander\n",
"archive footage [<Movie id:0081857[http] title:_Entertainment Tonight (2015)_>, <Movie id:0247094[http] title:_Extra (2016)_>]\n",
"self [<Movie id:0081857[http] title:_Entertainment Tonight (2015)_>, <Movie id:0930831[http] title:_Gomorron (2011)_>, <Movie id:4991632[http] title:_The Oscars (2016)_>, <Movie id:0124932[http] title:_20/20 (2016)_>, <Movie id:0320037[http] title:_Jimmy Kimmel Live! (2016)_>, <Movie id:5421678[http] title:_22nd Annual Screen Actors Guild Awards (2016)_>, <Movie id:4346344[http] title:_E! Live from the Red Carpet (2016)_>, <Movie id:5352686[http] title:_21st Annual Critics' Choice Awards (II) (2016)_>, <Movie id:0247094[http] title:_Extra (2016)_>, <Movie id:5363742[http] title:_2016 Golden Globe Arrivals Special (2016)_>, <Movie id:4399942[http] title:_73rd Golden Globe Awards (2016)_>, <Movie id:4280606[http] title:_The Late Late Show with James Corden (2016)_>, <Movie id:2163227[http] title:_CBS This Morning (2015)_>, <Movie id:0192897[http] title:_Film 2016 (2015)_>, <Movie id:3444938[http] title:_The Tonight Show Starring Jimmy Fallon (2015)_>, <Movie id:0044298[http] title:_Today (2015)_>, <Movie id:0911896[http] title:_Made in Hollywood (2015)_>, <Movie id:0390699[http] title:_Días de cine (2015)_>, <Movie id:0072506[http] title:_Good Morning America (2015)_>, <Movie id:3513388[http] title:_Late Night with Seth Meyers (2015)_>, <Movie id:0305056[http] title:_Last Call with Carson Daly (2015)_>, <Movie id:1637574[http] title:_Conan (2015)_>, <Movie id:3412000[http] title:_IMDb: What to Watch (2015)_>, <Movie id:3453002[http] title:_The EE British Academy Film Awards (2014)_>, <Movie id:0111920[http] title:_Cinema 3 (2013)_>, <Movie id:1366792[http] title:_Skavlan (2012)_>, <Movie id:5332066[http] title:_Småstjärnorna (1997)_>, <Movie id:5262674[http] title:_Ex Machina: Behind the Scenes Vignettes (2015)_>, <Movie id:5262634[http] title:_Through the Looking Glass: Making 'Ex Machina' (2015)_>, <Movie id:2673622[http] title:_Anna Karenina: A Story of Epic Love (2013)_>, <Movie id:2673658[http] title:_Creating the Extraordinary World of Anna Karenina (2013)_>, 
<Movie id:2673692[http] title:_Keira Knightley: Becoming Anna (2013)_>, <Movie id:4621016[http] title:_Ingrid Bergman in Her Own Words (2015)_>]\n",
"mini biography [u'Alicia Amanda Vikander (born 3 October 1988) is a Swedish actress. Vikander began her career by appearing in Swedish short films and television series, most notably in the popular TV drama Andra Avenyn. She made her feature film debut in the film Pure, for which she won the Guldbagge Award for Best Actress. Vikander gained international attention when she appeared in the 2012 adaptation of Anna Karenina, co-starred in the Academy Award-nominated Danish film A Royal Affair and in the Julian Assange biopic The Fifth Estate.In 2015, she portrayed Vera Brittain in Testament of Youth and an AI in Ex Machina, for which she has received a Golden Globe nomination. Vikander also portrayed painter Gerda Wegener in The Danish Girl, receiving Golden Globe and SAG nominations.::doctoracula-32567']\n",
"actress [<Movie id:1377379[http] title:_Susans längtan (2009)_>, <Movie id:1494794[http] title:_My Name Is Love (V) (2008)_>, <Movie id:0954345[http] title:_The Rain (2007)_>, <Movie id:1366371[http] title:_Darkness of Truth (2007)_>, <Movie id:0997023[http] title:_Höök (2008)_>, <Movie id:1087819[http] title:_Second Avenue (2007)_>, <Movie id:0997267[http] title:_Levande föda (2007)_>, <Movie id:0473576[http] title:_En decemberdröm (2005)_>, <Movie id:0385405[http] title:_The Befallen (2003)_>, <Movie id:0997473[http] title:_Min balsamerade mor (2002)_>, <Movie id:3563262[http] title:_Submergence (2017)_>, <Movie id:2547584[http] title:_The Light Between Oceans (2016)_>, <Movie id:4196776[http] title:_Jason Bourne (2016)_>, <Movie id:0491203[http] title:_Tulip Fever (2016)_>, <Movie id:2503944[http] title:_Burnt (I) (2015)_>, <Movie id:0810819[http] title:_The Danish Girl (2015)_>, <Movie id:1638355[http] title:_The Man from U.N.C.L.E. (2015)_>, <Movie id:0470752[http] title:_Ex Machina (2015)_>, <Movie id:1121096[http] title:_Seventh Son (I) (2014)_>, <Movie id:2452200[http] title:_Son of a Gun (2014)_>, <Movie id:1441953[http] title:_Testament of Youth (2014)_>, <Movie id:2363178[http] title:_Hotell (2013)_>, <Movie id:1837703[http] title:_The Fifth Estate (2013)_>, <Movie id:1781769[http] title:_Anna Karenina (I) (2012)_>, <Movie id:1276419[http] title:_A Royal Affair (2012)_>, <Movie id:1815782[http] title:_The Crown Jewels (2011)_>, <Movie id:1483753[http] title:_Pure (2009)_>]\n",
"height 5' 5½\" (1.66 m)\n",
"birth notes Gothenburg, Västra Götalands län, Sweden\n",
"headshot http://ia.media-imdb.com/images/M/MV5BMjI4ODAzMjg3MF5BMl5BanBnXkFtZTgwNDUyOTYyNzE@._V1_UY317_CR3,0,214,317_AL_.jpg\n",
"birth name Vikander, Alicia Amanda\n",
"birth date 1988\n",
"canonical name Vikander, Alicia\n",
"long imdb name Alicia Vikander\n",
"long imdb canonical name Vikander, Alicia\n",
"full-size headshot http://ia.media-imdb.com/images/M/MV5BMjI4ODAzMjg3MF5BMl5BanBnXkFtZTgwNDUyOTYyNzE@._V1_UY317_CR3,0,214,317_AL_.jpg\n"
]
}
],
"source": [
"print type(ex)\n",
"for k in ex.keys():\n",
"    print k,ex[k]"
]
},
{
"cell_type": "code",
"execution_count": 199,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Average of 75 acting appearances for each Oscar winner.\n"
]
},
{
"data": {
"image/png": "(base64-encoded PNG omitted: histogram of the number of acting appearances per Oscar winner, with a dashed red line at the median)",
"text/plain": [
"<matplotlib.figure.Figure at 0x12e868cd0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# How many movies did the average Oscar-winner act in?\n",
"\n",
"movie_count = []\n",
"for d in data:\n",
"    # Filmographies are stored under 'actor' or 'actress', depending on the person\n",
"    try:\n",
"        movies = d['actor']\n",
"    except KeyError:\n",
"        movies = d['actress']\n",
"    movie_count.append(len(movies))\n",
"\n",
"# The \"average\" reported here is the median\n",
"medcount = np.median(movie_count)\n",
"print \"Average of {0:.0f} acting appearances for each Oscar winner.\".format(medcount)\n",
"\n",
"fig = plt.figure(figsize=(6,6))\n",
"ax = fig.add_subplot(111)\n",
"\n",
"ax.hist(movie_count,bins=np.arange(60)*5)\n",
"ax.axvline(medcount,ls='--',color='r')\n",
"ax.set_xlabel(\"Number of acting appearances\")\n",
"ax.set_ylabel(\"Count\");"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Make a big list of every acting appearance for every Oscar winner\n",
"\n",
"all_movies = []\n",
"for d in data:\n",
"    try:\n",
"        movies = d['actor']\n",
"    except KeyError:\n",
"        movies = d['actress']\n",
"    for m in movies:\n",
"        all_movies.append(m)\n",
"\n",
"# Count how often each movie appears in the master list\n",
"c = Counter(all_movies)"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(<Movie id:0361185[http] title:_Freedom: A History of Us (2003)_>, 23)\n",
"(<Movie id:0048893[http] title:_Playhouse 90 (1957)_>, 20)\n",
"(<Movie id:0046637[http] title:_Producers' Showcase (1955)_>, 16)\n",
"(<Movie id:0056742[http] title:_Bob Hope Presents the Chrysler Theatre (1964)_>, 15)\n",
"(<Movie id:0045395[http] title:_General Electric Theater (1956)_>, 12)\n",
"(<Movie id:0042141[http] title:_Robert Montgomery Presents (1952)_>, 12)\n",
"(<Movie id:0041024[http] title:_The Ford Television Theatre (1954)_>, 11)\n",
"(<Movie id:0048893[http] title:_Playhouse 90 (1958)_>, 11)\n",
"(<Movie id:0048893[http] title:_Playhouse 90 (1959)_>, 11)\n",
"(<Movie id:0446859[http] title:_Play of the Week (1960)_>, 11)\n"
]
}
],
"source": [
"for x in c.most_common(10):\n",
"    print x"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So these should be the top movies with the most Oscar winners - *Freedom: A History of Us* apparently had 23 Oscar winners appear in it. But let's look closer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"m = ia.search_movie(\"Freedom: A History of Us\")\n",
"freedom = m[0]\n",
"ia.update(freedom)\n",
"\n",
"# Who was in it?\n",
"\n",
"freedom_actors = []\n",
"for d in data:\n",
"    try:\n",
"        movies = d['actor']\n",
"    except KeyError:\n",
"        movies = d['actress']\n",
"    if freedom in movies:\n",
"        print d\n",
"        freedom_actors.append(d)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So there are at least a couple of problems with this. First, there are clearly actors showing up more than once for a single film."
]
},
{
"cell_type": "code",
"execution_count": 160,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Only 18 unique Oscar-winning actors in this film.\n"
]
}
],
"source": [
"print \"Only {0} unique Oscar-winning actors in this film.\".format(len(set(freedom_actors)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Still quite a lot, but I need to check for uniqueness from now on. In addition, there's the issue of category:"
]
},
{
"cell_type": "code",
"execution_count": 78,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tv series\n"
]
}
],
"source": [
"print freedom['kind']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As noted above, I'm only counting theatrical movies for this. The top 10 hits in the list above are dominated by television productions, with a heavy emphasis on anthology series of the 1950s and 60s like *Playhouse 90*. These shows had rotating casts, with new people every week; while the total amount of talent is impressive, it doesn't fit the criteria we're looking for. So I need to limit the results to theatrically released movies.\n",
"\n",
"The result from IMDb *should* tell me what kind of feature it is. However, there's a problem with the current data."
]
},
{
"cell_type": "code",
"execution_count": 105,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Counter({u'movie': 27})\n"
]
}
],
"source": [
"movies = ex['actress']\n",
"kinds = [m['kind'] for m in movies]\n",
"\n",
"print Counter(kinds)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"According to the filmography attached to the actor object, every appearance is a theatrical movie. But when I fetch one of those titles directly with `ia.get_movie`, its kind changes. Example:"
]
},
{
"cell_type": "code",
"execution_count": 95,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"u'tv series'"
]
},
"execution_count": 95,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"movie7 = ia.get_movie(movies[7].movieID)\n",
"movie7['kind']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So I need to double-check the status of each movie in our top list, since the results I currently have can't be trusted. Since this involves querying the IMDb API again (and I don't want to overload it), I'll only do this for potential movies with 6 or more Oscar-winning actors."
]
},
{
"cell_type": "code",
"execution_count": 197,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"210 works with 6 or more actors.\n"
]
}
],
"source": [
"sixplus = []\n",
"nlim = 6\n",
"for movie in c:\n",
"    if c[movie] >= nlim:\n",
"        sixplus.append(movie)\n",
"\n",
"print \"{0} works with {1} or more actors.\".format(len(sixplus),nlim)"
]
},
{
"cell_type": "code",
"execution_count": 104,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Only run this cell once, if at all possible.\n",
"\n",
"movies_only = []\n",
"for m in sixplus:\n",
"    movie = ia.get_movie(m.movieID)\n",
"    if movie['kind'] == 'movie':\n",
"        movies_only.append(movie)"
]
},
{
"cell_type": "code",
"execution_count": 106,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"107 works with 6 or more actors are theatrically-released movies.\n"
]
}
],
"source": [
"print \"{0} works with {1} or more actors are theatrically-released movies.\".format(len(movies_only),nlim)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Check again to make sure I don't have doubled-up actors (as happened with *Freedom: A History of Us*):"
]
},
{
"cell_type": "code",
"execution_count": 186,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Store final counts, title, and actors in a dictionary\n",
"final_counts = {}\n",
"\n",
"for m in movies_only:\n",
"    actors = []\n",
"    for d in data:\n",
"        try:\n",
"            movies = d['actor']\n",
"        except KeyError:\n",
"            movies = d['actress']\n",
"        # Movie ID is more reliable than just the Movie object itself\n",
"        movie_ids = [x.movieID for x in movies]\n",
"        if m.movieID in movie_ids:\n",
"            actors.append(d['name'])\n",
"    # Deduplicate names so each actor counts once per film\n",
"    s = set(actors)\n",
"    n = len(s)\n",
"    d2 = {m['title']:list(s)}\n",
"    if n in final_counts:\n",
"        final_counts[n].append(d2)\n",
"    else:\n",
"        final_counts[n] = [d2]"
]
},
{
"cell_type": "code",
"execution_count": 187,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The most Oscar winners in a single film is 8.\n",
"3 films share this distinction.\n",
"\n",
"\n",
"\"Around the World in Eighty Days\"\n",
"\tCharles Coburn\n",
"\tJohn Mills\n",
"\tRonald Colman\n",
"\tFrank Sinatra\n",
"\tVictor McLaglen\n",
"\tJohn Gielgud\n",
"\tShirley MacLaine\n",
"\tDavid Niven\n",
"\n",
"\"The Greatest Story Ever Told\"\n",
"\tJoseph Schildkraut\n",
"\tJosé Ferrer\n",
"\tMartin Landau\n",
"\tShelley Winters\n",
"\tJohn Wayne\n",
"\tCharlton Heston\n",
"\tVan Heflin\n",
"\tSidney Poitier\n",
"\n",
"\"Hamlet\"\n",
"\tRobin Williams\n",
"\tJohn Mills\n",
"\tJack Lemmon\n",
"\tKate Winslet\n",
"\tJulie Christie\n",
"\tJudi Dench\n",
"\tJohn Gielgud\n",
"\tCharlton Heston\n"
]
}
],
"source": [
"# Print the final results\n",
"\n",
"nmax = max(final_counts.keys())\n",
"\n",
"print \"The most Oscar winners in a single film is {0}.\".format(nmax)\n",
"print \"{0} films share this distinction.\\n\".format(len(final_counts[nmax]))\n",
"\n",
"\n",
"for x in final_counts[nmax]:\n",
"    title = x.keys()[0]\n",
"    print '\\n\"{0}\"'.format(title)\n",
"    for actor in x[title]:\n",
"        print '\\t{0}'.format(actor.encode('utf-8'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So our answer is **8 Oscar winners in one film**, in a three-way tie between *Around the World in Eighty Days*, *The Greatest Story Ever Told*, and *Hamlet*.\n",
"\n",
"### *Around the World in Eighty Days* (1956)\n",
"\n",
"*Around the World in Eighty Days* won Best Picture for 1956, beating both *The Ten Commandments* and *The King and I*. It was notable for its huge number of cameos, with over 40 Hollywood celebrities making short appearances. **David Niven** and **Shirley MacLaine** are the only two Oscar winners in the film with significant screen time. Almost the entire cast has long since retired or passed away in the sixty years since the movie was released, so it's almost certain to stay at its current number of Oscar winners.\n",
"\n",
"Of the eight Oscar winners, 4 (**Colman, Coburn, Sinatra, McLaglen**) had already won their award by the time they appeared in the film, and 4 (**Niven, MacLaine, Gielgud, Mills**) won their award afterwards.\n",
"\n",
"### *The Greatest Story Ever Told* (1965)\n",
"\n",
"*The Greatest Story Ever Told* was also known for its large cast, although **Charlton Heston**, **José Ferrer**, **Martin Landau**, and **Joseph Schildkraut** all had significant, non-cameo parts. It was nominated for five Oscars, but ended up winning none (and wasn't even nominated for Best Picture). Lead actor **Max von Sydow** is still active, though well into his 80s, and was nominated for his second Academy Award in 2012. **Angela Lansbury** also appeared in the film and has been nominated for Best Supporting Actress three times. Either could still push this film to a record 9.\n",
"\n",
"Of the eight Oscar winners, 5 (**Heston, Ferrer, Schildkraut, Heflin, Poitier**) had already won their award by the time they appeared in the film, and 2 (**Landau, Wayne**) won their award afterwards. **Shelley Winters** won two Oscars, one before and one after appearing in this film.\n",
"\n",
"### *Hamlet* (1996)\n",
"\n",
"*Hamlet* was nominated for four Oscars (also winning none, and also not nominated for Best Picture). **Richard Attenborough** appears in this film and did win an Oscar, but for directing (*Gandhi* in 1983). Director and star **Kenneth Branagh** has been nominated for five Oscars in his career, including two for acting, although he hasn't yet won. He is another possible candidate to make his film the lone record holder someday.\n",
"\n",
"Of the eight Oscar winners, 5 (**Christie, Lemmon, Heston, Gielgud, Mills**) had already won their award by the time they appeared in the film, and 3 (**Winslet, Williams, Dench**) won their award afterwards."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It's interesting that none of these movies, crammed to the brim with award winners and future award winners, featured an Oscar-winning performance of its own."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Runners-up"
]
},
{
"cell_type": "code",
"execution_count": 198,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\t 6 films have 7 Oscar-winning actors in them:\n",
"\t\tMain Street to Broadway\n",
"\t\tThe Stolen Jools\n",
"\t\tHow the West Was Won\n",
"\t\tPepe\n",
"\t\tIn This Our Life\n",
"\t\tThe Swarm\n",
"\t13 films have 6 Oscar-winning actors in them:\n",
"\t\tBreakdowns of 1938\n",
"\t\tGone with the Wind\n",
"\t\tPrêt-à-Porter\n",
"\t\tMurder on the Orient Express\n",
"\t\tA Time to Kill\n",
"\t\tBen-Hur: A Tale of the Christ\n",
"\t\tThe Good Shepherd\n",
"\t\tA Bridge Too Far\n",
"\t\tVariety Girl\n",
"\t\tForever and a Day\n",
"\t\tThe Longest Day\n",
"\t\tThe First Wives Club\n",
"\t\tNine\n"
]
}
],
"source": [
"# Walk down from one below the maximum to the nlim cutoff\n",
"for n in range(nmax - 1, nlim - 1, -1):\n",
"    mt = [x.keys()[0] for x in final_counts[n]]\n",
"    print \"\\t{0:2d} films have {1} Oscar-winning actors in them:\".format(len(mt),n)\n",
"    for mts in mt:\n",
"        print \"\\t\\t{0}\".format(mts.encode('utf-8'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notes:\n",
"\n",
"* None of the films with 7 Oscar-winning actors has good odds of moving up. *The Swarm* (1978) is the only movie in the set made after 1962, meaning almost all cast members from these films are either retired or dead.\n",
"* Of the films with 6 Oscar-winning actors (so far), *Prêt-à-Porter* (1994), *The First Wives Club* (1996), *A Time to Kill* (1996), *The Good Shepherd* (2006), and *Nine* (2009) are all relatively recent and could have actors with long careers ahead of them.\n",
"* *Prêt-à-Porter* [released as *Ready to Wear (Prêt-à-Porter)* in the US] had **Cher** appearing in the movie as herself. If this is counted as a role, then it would bring the total number of Oscar-winning actors up to 7.\n",
"* The film *Nine* was heavily marketed as starring six actors who had *already* won Oscars: **Daniel Day-Lewis**, **Sophia Loren**, **Marion Cotillard**, **Penélope Cruz**, **Nicole Kidman**, and **Judi Dench**. This is a possible record for award winners at the time of filming, although I haven't checked it explicitly."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12"
}
},
"nbformat": 4,
"nbformat_minor": 0
}