Last active
November 24, 2016 17:12
Oscar winners in acting (as of 2016)
Emil Jannings | |
Janet Gaynor | |
Warner Baxter | |
Mary Pickford | |
George Arliss | |
Norma Shearer | |
Lionel Barrymore | |
Marie Dressler | |
Wallace Beery | |
Fredric March | |
Helen Hayes | |
Charles Laughton | |
Katharine Hepburn | |
Clark Gable | |
Claudette Colbert | |
Victor McLaglen | |
Bette Davis | |
Paul Muni | |
Walter Brennan | |
Luise Rainer | |
Gale Sondergaard | |
Spencer Tracy | |
Joseph Schildkraut | |
Luise Rainer | |
Alice Brady | |
Spencer Tracy | |
Walter Brennan | |
Bette Davis | |
Fay Bainter | |
Robert Donat | |
Thomas Mitchell | |
Vivien Leigh | |
Hattie McDaniel | |
James Stewart | |
Walter Brennan | |
Ginger Rogers | |
Jane Darwell | |
Gary Cooper | |
Donald Crisp | |
Joan Fontaine | |
Mary Astor | |
James Cagney | |
Van Heflin | |
Greer Garson | |
Teresa Wright | |
Paul Lukas | |
Charles Coburn | |
Jennifer Jones | |
Katina Paxinou | |
Bing Crosby | |
Barry Fitzgerald | |
Ingrid Bergman | |
Ethel Barrymore | |
Ray Milland | |
James Dunn | |
Joan Crawford | |
Anne Revere | |
Fredric March | |
Harold Russell | |
Olivia de Havilland | |
Anne Baxter | |
Ronald Colman | |
Edmund Gwenn | |
Loretta Young | |
Celeste Holm | |
Laurence Olivier | |
Walter Huston | |
Jane Wyman | |
Claire Trevor | |
Broderick Crawford | |
Dean Jagger | |
Olivia de Havilland | |
Mercedes McCambridge | |
José Ferrer | |
George Sanders | |
Judy Holliday | |
Josephine Hull | |
Humphrey Bogart | |
Karl Malden | |
Vivien Leigh | |
Kim Hunter | |
Gary Cooper | |
Anthony Quinn | |
Shirley Booth | |
Gloria Grahame | |
William Holden | |
Frank Sinatra | |
Audrey Hepburn | |
Donna Reed | |
Marlon Brando | |
Edmond O'Brien | |
Grace Kelly | |
Eva Marie Saint | |
Ernest Borgnine | |
Jack Lemmon | |
Anna Magnani | |
Jo Van Fleet | |
Yul Brynner | |
Anthony Quinn | |
Ingrid Bergman | |
Dorothy Malone | |
Alec Guinness | |
Red Buttons | |
Joanne Woodward | |
Miyoshi Umeki | |
David Niven | |
Burl Ives | |
Susan Hayward | |
Wendy Hiller | |
Charlton Heston | |
Hugh Griffith | |
Simone Signoret | |
Shelley Winters | |
Burt Lancaster | |
Peter Ustinov | |
Elizabeth Taylor | |
Shirley Jones | |
Maximilian Schell | |
George Chakiris | |
Sophia Loren | |
Rita Moreno | |
Gregory Peck | |
Ed Begley | |
Anne Bancroft | |
Patty Duke | |
Sidney Poitier | |
Melvyn Douglas | |
Patricia Neal | |
Margaret Rutherford | |
Rex Harrison | |
Peter Ustinov | |
Julie Andrews | |
Lila Kedrova | |
Lee Marvin | |
Martin Balsam | |
Julie Christie | |
Shelley Winters | |
Paul Scofield | |
Walter Matthau | |
Elizabeth Taylor | |
Sandy Dennis | |
Rod Steiger | |
George Kennedy | |
Katharine Hepburn | |
Estelle Parsons | |
Cliff Robertson | |
Jack Albertson | |
Katharine Hepburn | |
Barbra Streisand | |
Ruth Gordon | |
John Wayne | |
Gig Young | |
Maggie Smith | |
Goldie Hawn | |
George C. Scott | |
John Mills | |
Glenda Jackson | |
Helen Hayes | |
Gene Hackman | |
Ben Johnson | |
Jane Fonda | |
Cloris Leachman | |
Marlon Brando | |
Joel Grey | |
Liza Minnelli | |
Eileen Heckart | |
Jack Lemmon | |
John Houseman | |
Glenda Jackson | |
Tatum O'Neal | |
Art Carney | |
Robert De Niro | |
Ellen Burstyn | |
Ingrid Bergman | |
Jack Nicholson | |
George Burns | |
Louise Fletcher | |
Lee Grant | |
Peter Finch | |
Jason Robards | |
Faye Dunaway | |
Beatrice Straight | |
Richard Dreyfuss | |
Jason Robards | |
Diane Keaton | |
Vanessa Redgrave | |
Jon Voight | |
Christopher Walken | |
Jane Fonda | |
Maggie Smith | |
Dustin Hoffman | |
Melvyn Douglas | |
Sally Field | |
Meryl Streep | |
Robert De Niro | |
Timothy Hutton | |
Sissy Spacek | |
Mary Steenburgen | |
Henry Fonda | |
John Gielgud | |
Katharine Hepburn | |
Maureen Stapleton | |
Ben Kingsley | |
Louis Gossett, Jr. | |
Meryl Streep | |
Jessica Lange | |
Robert Duvall | |
Jack Nicholson | |
Shirley MacLaine | |
Linda Hunt | |
F. Murray Abraham | |
Haing S. Ngor | |
Sally Field | |
Peggy Ashcroft | |
William Hurt | |
Don Ameche | |
Geraldine Page | |
Anjelica Huston | |
Paul Newman | |
Michael Caine | |
Marlee Matlin | |
Dianne Wiest | |
Michael Douglas | |
Sean Connery | |
Cher | |
Olympia Dukakis | |
Dustin Hoffman | |
Kevin Kline | |
Jodie Foster | |
Geena Davis | |
Daniel Day-Lewis | |
Denzel Washington | |
Jessica Tandy | |
Brenda Fricker | |
Jeremy Irons | |
Joe Pesci | |
Kathy Bates | |
Whoopi Goldberg | |
Anthony Hopkins | |
Jack Palance | |
Jodie Foster | |
Mercedes Ruehl | |
Al Pacino | |
Gene Hackman | |
Emma Thompson | |
Marisa Tomei | |
Tom Hanks | |
Tommy Lee Jones | |
Holly Hunter | |
Anna Paquin | |
Tom Hanks | |
Martin Landau | |
Jessica Lange | |
Dianne Wiest | |
Nicolas Cage | |
Kevin Spacey | |
Susan Sarandon | |
Mira Sorvino | |
Geoffrey Rush | |
Cuba Gooding, Jr. | |
Frances McDormand | |
Juliette Binoche | |
Jack Nicholson | |
Robin Williams | |
Helen Hunt | |
Kim Basinger | |
Roberto Benigni | |
James Coburn | |
Gwyneth Paltrow | |
Judi Dench | |
Kevin Spacey | |
Michael Caine | |
Hilary Swank | |
Angelina Jolie | |
Russell Crowe | |
Benicio Del Toro | |
Julia Roberts | |
Marcia Gay Harden | |
Denzel Washington | |
Jim Broadbent | |
Halle Berry | |
Jennifer Connelly | |
Adrien Brody | |
Chris Cooper | |
Nicole Kidman | |
Catherine Zeta-Jones | |
Sean Penn | |
Tim Robbins | |
Charlize Theron | |
Renée Zellweger | |
Jamie Foxx | |
Morgan Freeman | |
Hilary Swank | |
Cate Blanchett | |
Philip Seymour Hoffman | |
George Clooney | |
Reese Witherspoon | |
Rachel Weisz | |
Forest Whitaker | |
Alan Arkin | |
Helen Mirren | |
Jennifer Hudson | |
Daniel Day-Lewis | |
Javier Bardem | |
Marion Cotillard | |
Tilda Swinton | |
Sean Penn | |
Heath Ledger | |
Kate Winslet | |
Penélope Cruz | |
Jeff Bridges | |
Christoph Waltz | |
Sandra Bullock | |
Mo'Nique | |
Colin Firth | |
Christian Bale | |
Natalie Portman | |
Melissa Leo | |
Jean Dujardin | |
Christopher Plummer | |
Meryl Streep | |
Octavia Spencer | |
Daniel Day-Lewis | |
Christoph Waltz | |
Jennifer Lawrence | |
Anne Hathaway | |
Matthew McConaughey | |
Jared Leto | |
Cate Blanchett | |
Lupita Nyong'o | |
Eddie Redmayne | |
J.K. Simmons | |
Julianne Moore | |
Patricia Arquette | |
Leonardo DiCaprio | |
Mark Rylance | |
Brie Larson | |
Alicia Vikander |
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# The most star-studded\\* movie of all time\n", | |
"### *Kyle Willett*\n", | |
"##### 6 Mar 2016" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"I like playing lots of different trivia games (college Quiz Bowl, LL, various pub quizzes, etc.). One of the things I learned early on, especially in Quiz Bowl, was the disproportionate attention paid to award-winners. You're more likely to be asked about (and to have been taught in school, or learned elsewhere) things that have won prizes: books that won the Pulitzer, or authors who won the Nobel Prize. This is an additional bump *separate from the actual importance of the work*, which usually takes time and evaluation in a historical context to judge. There are prize-winners that are cringe-inducing in hindsight (frontal lobotomies winning the Nobel Prize in Medicine, Henry Kissinger winning the Nobel Peace Prize, *Crash* over *Brokeback Mountain* for Best Picture), and people who were overlooked for what are now pretty much universally recognized as genius-level accomplishments (Einstein never won a Nobel Prize for relativity; *Citizen Kane* didn't win Best Picture or Best Director; Leo Tolstoy, Virginia Woolf, and Chinua Achebe never won the Nobel Prize).\n", | |
"\n", | |
"So while counting people who've only won specific awards is never an exercise in actual quality (there are so, so many biases in play), it does give a discrete data set that can be fun to play with. With the [88th Academy Awards](https://en.wikipedia.org/wiki/88th_Academy_Awards) taking place last week, a question popped into my head that I thought it'd be fun to answer:\n", | |
"\n", | |
"> *\"Which movie had the highest number of Oscar winners in its cast?\"*" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The ground assumptions I made:\n", | |
"\n", | |
"* This only counts individuals who won competitive Academy Awards for [Best Actor](https://en.wikipedia.org/wiki/Academy_Award_for_Best_Actor), [Best Actress](https://en.wikipedia.org/wiki/Academy_Award_for_Best_Actress), [Best Supporting Actor](https://en.wikipedia.org/wiki/Academy_Award_for_Best_Supporting_Actor), or [Best Supporting Actress](https://en.wikipedia.org/wiki/Academy_Award_for_Best_Supporting_Actress). I'm not counting people who only won an [Academy Honorary Award](https://en.wikipedia.org/wiki/Academy_Honorary_Award) or [Academy Juvenile Award](https://en.wikipedia.org/wiki/Academy_Juvenile_Award).\n", | |
"* Only people who acted in a film and won an acting Oscar are considered. Many well-known actors have won Academy Awards in other categories (*e.g.*, **Mel Gibson** and **Kevin Costner** for directing; **Woody Allen**, **Emma Thompson**, and **Matt Damon** for screenplays; **Brad Pitt** and **George Clooney** for producing). However, they're only considered here if they won an acting award.\n", | |
"* Similarly, I'm only counting people who were acting in the film. *Into the Wild* was directed by two-time Oscar-winner **Sean Penn**, but he doesn't appear in the movie. So Penn adds nothing to that movie's count; only Oscar-winning cast members such as **Marcia Gay Harden** do.\n", | |
"* No double-counting. Forty actors have multiple acting Academy Awards (**Katharine Hepburn** leading the pack with four), but each actor only counts once as an Oscar-winner for any given film.\n", | |
"* This only counts theatrically-released movies. That excludes TV series, made-for-TV movies, and direct-to-video releases.\n", | |
"* Appearances credited as himself/herself are not counted, mostly because IMDb stores them separately.\n", | |
"* Finally: this list gives credit for someone winning an Academy Award *at any point in their career*. This means a person didn't have to win their Oscar prior to (or as a result of) the film in question. So *The Sound of Music*, for example, has two Oscar-winners (**Julie Andrews** and **Christopher Plummer**), even though only Andrews had an Oscar prior to the movie being released in 1965 (Plummer wouldn't win one for another 47 years)." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The definitive source for this will be the [Internet Movie Database (IMDb)](http://www.imdb.com/), which has more than 5.4 million actors and 17 million movies/TV shows as of 2016. There's a Python wrapper to access the database called [IMDbPY](http://imdbpy.sourceforge.net/), which we'll use here. The package has some frustrating limitations (it's not clear whether the package or the IMDb API is the cause), but it's sufficient to answer this question." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 33, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"# Import the packages that we'll need\n", | |
"\n", | |
"%matplotlib inline\n", | |
"from imdb import IMDb\n", | |
"import numpy as np\n", | |
"from matplotlib import pyplot as plt" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"First, I need a list of all the Oscar winners. I went to the horse's mouth and used the [official database](http://awardsdatabase.oscars.org) from the Academy of Motion Picture Arts and Sciences. I did a Basic Search with Award Category \"Acting ...(all)\" and selected \"Winners Only\".\n", | |
"\n", | |
"This gives a list of all Oscar winners. While the list looks nice in the browser, it isn't well-formatted for analysis (and there are no options for something like JSON or CSV output). The URL for the list also includes a link to a JavaScript cursor that times out quickly and isn't archivable. So, the messy option: copy the page's source code, delete all lines that don't have actors' names on them (in Vim, something like `:g!/BSNomination/d`), and then do some visual block cuts to get rid of the remaining HTML. I also took the opportunity to delete the special acting awards in this list (there were 14 citations as of 2016).\n", | |
"\n", | |
"I saved the resulting list as a CSV file and stored it online as a gist. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 14, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"# Download the list of winners\n", | |
"\n", | |
"import urllib2\n", | |
"\n", | |
"gist_url = \"https://gist.githubusercontent.com/willettk/f6850049e279ec1775c8\"\n", | |
"raw_extension = \"raw/f18316b1b86ffd74cb3dfb74f73b947be48d2a93/oscar_winners.csv\"\n", | |
"txt = urllib2.urlopen(\"{0}/{1}\".format(gist_url,raw_extension)).read()\n", | |
"winners = txt.split(\"\\n\")" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 15, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"1st Academy Awards: ['Emil Jannings', 'Janet Gaynor']\n", | |
"88th Academy Awards: ['Leonardo DiCaprio', 'Mark Rylance', 'Brie Larson', 'Alicia Vikander']\n" | |
] | |
} | |
], | |
"source": [ | |
"# Check to see that data is there:\n", | |
"\n", | |
"print \"1st Academy Awards: {0}\".format(winners[:2]) # No Supporting Actor/Actress awards until 1937.\n", | |
"print \"88th Academy Awards: {0}\".format(winners[-4:])" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Now to find all the movies for everyone in the list of Oscar winners. This can be done by querying the IMDb API." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 21, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"# Use the web version of the IMDb database. I tried installing a local SQL copy of the database,\n", | |
"# but operations were ludicrously slow for searching and retrieving data. Will have to brave\n", | |
"# the rate limits of the online API.\n", | |
"\n", | |
"ia = IMDb()\n", | |
"\n", | |
"def get_full_record(name):\n", | |
"    '''\n", | |
"    Return the IMDbPY Person record for the top search hit on `name`, updated with its main data.\n", | |
"\n", | |
"    We're going to assume that the first result in the list (which is sorted by IMDb for \"relevance\") is\n", | |
"    actually the actor/actress we want, and not someone with a similar name. This should mostly be\n", | |
"    a safe bet for an Oscar-winning performer, although not certain.\n", | |
"\n", | |
"    There is a function in IMDbPY for get_person_awards(), but it doesn't seem to work. We could\n", | |
"    search for the matching movie for which they won their Oscar, but that would require\n", | |
"    an additional table. For now, going with the top result.\n", | |
"    '''\n", | |
"    results = ia.search_person(name)\n", | |
"    person = results[0]\n", | |
"    ia.update(person)\n", | |
"\n", | |
"    return person" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 22, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"# Get data for all actors/actresses\n", | |
"\n", | |
"# IMPORTANT: do not run this query more than once if at all possible. The IMDb site limits either the\n", | |
"# total number of queries (or the query rate; not sure which) that you can run for the API.\n", | |
"# Too many and you'll start getting empty responses and have to wait a day (or risk being blocked).\n", | |
"\n", | |
"data = [get_full_record(name) for name in winners]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 25, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Alicia Vikander\n" | |
] | |
} | |
], | |
"source": [ | |
"# Look at an example\n", | |
"\n", | |
"ex = data[-1]\n", | |
"\n", | |
"print ex" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 47, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"<class 'imdb.Person.Person'>\n", | |
"name Alicia Vikander\n", | |
"archive footage [<Movie id:0081857[http] title:_Entertainment Tonight (2015)_>, <Movie id:0247094[http] title:_Extra (2016)_>]\n", | |
"self [<Movie id:0081857[http] title:_Entertainment Tonight (2015)_>, <Movie id:0930831[http] title:_Gomorron (2011)_>, <Movie id:4991632[http] title:_The Oscars (2016)_>, <Movie id:0124932[http] title:_20/20 (2016)_>, <Movie id:0320037[http] title:_Jimmy Kimmel Live! (2016)_>, <Movie id:5421678[http] title:_22nd Annual Screen Actors Guild Awards (2016)_>, <Movie id:4346344[http] title:_E! Live from the Red Carpet (2016)_>, <Movie id:5352686[http] title:_21st Annual Critics' Choice Awards (II) (2016)_>, <Movie id:0247094[http] title:_Extra (2016)_>, <Movie id:5363742[http] title:_2016 Golden Globe Arrivals Special (2016)_>, <Movie id:4399942[http] title:_73rd Golden Globe Awards (2016)_>, <Movie id:4280606[http] title:_The Late Late Show with James Corden (2016)_>, <Movie id:2163227[http] title:_CBS This Morning (2015)_>, <Movie id:0192897[http] title:_Film 2016 (2015)_>, <Movie id:3444938[http] title:_The Tonight Show Starring Jimmy Fallon (2015)_>, <Movie id:0044298[http] title:_Today (2015)_>, <Movie id:0911896[http] title:_Made in Hollywood (2015)_>, <Movie id:0390699[http] title:_Días de cine (2015)_>, <Movie id:0072506[http] title:_Good Morning America (2015)_>, <Movie id:3513388[http] title:_Late Night with Seth Meyers (2015)_>, <Movie id:0305056[http] title:_Last Call with Carson Daly (2015)_>, <Movie id:1637574[http] title:_Conan (2015)_>, <Movie id:3412000[http] title:_IMDb: What to Watch (2015)_>, <Movie id:3453002[http] title:_The EE British Academy Film Awards (2014)_>, <Movie id:0111920[http] title:_Cinema 3 (2013)_>, <Movie id:1366792[http] title:_Skavlan (2012)_>, <Movie id:5332066[http] title:_Småstjärnorna (1997)_>, <Movie id:5262674[http] title:_Ex Machina: Behind the Scenes Vignettes (2015)_>, <Movie id:5262634[http] title:_Through the Looking Glass: Making 'Ex Machina' (2015)_>, <Movie id:2673622[http] title:_Anna Karenina: A Story of Epic Love (2013)_>, <Movie id:2673658[http] title:_Creating the Extraordinary World of Anna Karenina (2013)_>, 
<Movie id:2673692[http] title:_Keira Knightley: Becoming Anna (2013)_>, <Movie id:4621016[http] title:_Ingrid Bergman in Her Own Words (2015)_>]\n", | |
"mini biography [u'Alicia Amanda Vikander (born 3 October 1988) is a Swedish actress. Vikander began her career by appearing in Swedish short films and television series, most notably in the popular TV drama Andra Avenyn. She made her feature film debut in the film Pure, for which she won the Guldbagge Award for Best Actress. Vikander gained international attention when she appeared in the 2012 adaptation of Anna Karenina, co-starred in the Academy Award-nominated Danish film A Royal Affair and in the Julian Assange biopic The Fifth Estate.In 2015, she portrayed Vera Brittain in Testament of Youth and an AI in Ex Machina, for which she has received a Golden Globe nomination. Vikander also portrayed painter Gerda Wegener in The Danish Girl, receiving Golden Globe and SAG nominations.::doctoracula-32567']\n", | |
"actress [<Movie id:1377379[http] title:_Susans längtan (2009)_>, <Movie id:1494794[http] title:_My Name Is Love (V) (2008)_>, <Movie id:0954345[http] title:_The Rain (2007)_>, <Movie id:1366371[http] title:_Darkness of Truth (2007)_>, <Movie id:0997023[http] title:_Höök (2008)_>, <Movie id:1087819[http] title:_Second Avenue (2007)_>, <Movie id:0997267[http] title:_Levande föda (2007)_>, <Movie id:0473576[http] title:_En decemberdröm (2005)_>, <Movie id:0385405[http] title:_The Befallen (2003)_>, <Movie id:0997473[http] title:_Min balsamerade mor (2002)_>, <Movie id:3563262[http] title:_Submergence (2017)_>, <Movie id:2547584[http] title:_The Light Between Oceans (2016)_>, <Movie id:4196776[http] title:_Jason Bourne (2016)_>, <Movie id:0491203[http] title:_Tulip Fever (2016)_>, <Movie id:2503944[http] title:_Burnt (I) (2015)_>, <Movie id:0810819[http] title:_The Danish Girl (2015)_>, <Movie id:1638355[http] title:_The Man from U.N.C.L.E. (2015)_>, <Movie id:0470752[http] title:_Ex Machina (2015)_>, <Movie id:1121096[http] title:_Seventh Son (I) (2014)_>, <Movie id:2452200[http] title:_Son of a Gun (2014)_>, <Movie id:1441953[http] title:_Testament of Youth (2014)_>, <Movie id:2363178[http] title:_Hotell (2013)_>, <Movie id:1837703[http] title:_The Fifth Estate (2013)_>, <Movie id:1781769[http] title:_Anna Karenina (I) (2012)_>, <Movie id:1276419[http] title:_A Royal Affair (2012)_>, <Movie id:1815782[http] title:_The Crown Jewels (2011)_>, <Movie id:1483753[http] title:_Pure (2009)_>]\n", | |
"height 5' 5½\" (1.66 m)\n", | |
"birth notes Gothenburg, Västra Götalands län, Sweden\n", | |
"headshot http://ia.media-imdb.com/images/M/MV5BMjI4ODAzMjg3MF5BMl5BanBnXkFtZTgwNDUyOTYyNzE@._V1_UY317_CR3,0,214,317_AL_.jpg\n", | |
"birth name Vikander, Alicia Amanda\n", | |
"birth date 1988\n", | |
"canonical name Vikander, Alicia\n", | |
"long imdb name Alicia Vikander\n", | |
"long imdb canonical name Vikander, Alicia\n", | |
"full-size headshot http://ia.media-imdb.com/images/M/MV5BMjI4ODAzMjg3MF5BMl5BanBnXkFtZTgwNDUyOTYyNzE@._V1_UY317_CR3,0,214,317_AL_.jpg\n" | |
] | |
} | |
], | |
"source": [ | |
"print type(ex)\n", | |
"for k in ex.keys():\n", | |
"    print k,ex[k]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 199, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Average of 75 acting appearances for each Oscar winner.\n" | |
] | |
}, | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYYAAAF/CAYAAABNHW40AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAGM1JREFUeJzt3XuQZGd53/Hvb6VIBgSSMGhFrIAgYMtgp4RjVMIiZjAX\nbZIKAoU7MWAMRYVrIspGYIfdVTkFIrEqlGOc4qYsFBfLFCDkUKyExZACvJJAWrSAUCBBwhB2RQxB\nXBxsaZ/8cc5o+52d2TmzOz090/39VHVt9+nu08/bZ3Z+857T/ZxUFZIkLdgy6QIkSRuLwSBJahgM\nkqSGwSBJahgMkqSGwSBJaow1GJKcmOS6JDcl2Zdke7/81CRXJ7k1ye4kJ4+zDknScBn39xiS3Luq\nfpLkOOCzwKuBfwn8dVW9JcnrgFOr6uKxFiJJGmTsu5Kq6if91ROB44ECLgB29ct3AU8bdx2SpGHG\nHgxJtiS5CdgPXFNVNwBbq+oAQFXtB04bdx2SpGHWY8ZwsKoeDZwBnJPkUXSzhuZh465DkjTM8ev1\nQlV1Z5J5YBtwIMnWqjqQ5HTgjqWek8TAkKSjUFU52ueO+1NJD1j4xFGSewFPBm4BPga8qH/YC4Er\nl1tHVU3tZfv27ROvwfE5Nsc3fZdjNe4Zw4OAXUm20IXQn1bVx5PsAa5I8mLgduBZY65DkjTQWIOh\nqvYBv7LE8u8BTxrna0uSjo7ffJ6gubm5SZcwVtM8vmkeGzi+WTf2L7gdiyS1kevbVNIfh/L9lKZe\nEmqjHnyWJG0+BoMkqWEwSJIaBoMkqWEwSJIa69YSQxPmp5EkDeSMQZLUMBgkSQ2DQZLUMBgkSQ2D\nQZLUMBhmRXKoX5IkHYHBIElqGAySpIbBIElqGAySpIYtMSbo3O27D1u2Z+f5E6hEkg4xGGaFvZIk\nDeSuJElSw2CQJDUMBklSw2CQJDUMBklSw2CYFfZKkjSQwSBJahgMkqSGwSBJavjN5xmzuA2HLTgk\nLeaMQZLUMBhmRRXnvvETk65C0iZgMEiSGgaDJKlhMEiSGgaDJKlhMEiSGgbDrEjYc8m2SVchaRMw\nGCRJDYNBktQwGCRJDYNBktQwGCRJDYNhVtgrSdJABoMkqWEwSJIaBoMkqTHWYEhyRpJrk3w5yb4k\nr+qXb0/yrSQ39he/kitJG8S4T+15F3BRVe1NchLwhSTX9PddVlWXjfn1JUmrNNYZQ1Xtr6q9/fUf\nAbcAP9ffnXG+thaxV5KkgdbtGEOSM4Gzgev6Ra9MsjfJO5OcvF51SJKObF2Cod+N9CHgNf3M4W3A\nw6rqbGA/4C4lSdogxn2MgSTH04XCe6vqSoCq+u7IQ94BXLXc83fs2HHP9bm5Oebm5sZSpyRtVvPz\n88zPz6/Z+sYeDMC7ga9U1VsXFiQ5var29zcvBL603JNHg0GSdLjFfzTv3LnzmNY31mBIch7wfGBf\nkpuAAt4APC/J2cBB4DbgZeOsQ5I03FiDoao+Cxy3xF027VlvVZy7ffekq5C0CfjNZ0lSw2CQJDUM\nBklSw2CQJDUMBklSw2CYFfZKkjSQwSBJaqzHN5+1Bpb6DsKenedPoBJJ084ZgySpYTBIkhoGgySp\nYTDMiirOfaMtqiStzGCQJDUMBklSw2CQJDUMBklSw2CQJDUMhllhryRJAxkMkqSGwSBJahgMkqSG\nwSBJahgMkqSGwTAr7JUkaSCDQZLUMBgkSQ2DQZLUMBgkSQ2DQZLUOH7SBah17vbd41lxwh7wk0mS\nVuSMQZLUMBgkSQ2DQZLUMBgkSQ2DQZLUMBhmhb2SJA1kMEiSGgaDJKlhMEiSGgaDJKlhMEiSGgbD\nrEjYc8m2SVchaRMwGCRJDYNBktQwGCRJDYNBktQwGCRJjbEGQ5Izklyb5MtJ9iV5db/81CRXJ7k1\nye4kJ4+zDmGvJEmDjXvGcBdwUVU9Cngs8Iok
ZwEXA5+sql8ArgVeP+Y6JEkDjTUYqmp/Ve3tr/8I\nuAU4A7gA2NU/bBfwtHHWIUkabt2OMSQ5Ezgb2ANsraoD0IUHcNp61SFJOrJ1CYYkJwEfAl7Tzxxq\n0UMW35YkTcjx436BJMfThcJ7q+rKfvGBJFur6kCS04E7lnv+jh077rk+NzfH3NzcGKuVpM1nfn6e\n+fn5NVtfqsb7x3qS9wD/p6ouGll2KfC9qro0yeuAU6vq4iWeW+Oub5LO3b77mJ6/Z+f5wx+cdK+5\n6JNJq1qHpE0hCVWVo33+WGcMSc4Dng/sS3IT3S6jNwCXAlckeTFwO/CscdYhSRpurMFQVZ8Fjlvm\n7ieN87UlSUfHbz5LkhoGgySpYTBIkhoGw6ywV5KkgQwGSVLDYJAkNQwGSVLDYJAkNQwGSVLDYJgV\nCXsu2TbpKiRtAgaDJKlhMEiSGgaDJKlhMEiSGgaDJKlhMMwKeyVJGshgkCQ1DAZJUsNgkCQ1DAZJ\nUsNgkCQ1DIZZYa8kSQMZDJKkhsEgSWoYDJKkhsEgSWoYDJKkhsEwK+yVJGkgg0GS1DAYJEkNg0GS\n1DAYJEkNg0GS1DAYZoW9kiQNZDBIkhoGgySpYTBIkhoGgySpMSgYkpw3ZJkkafMbOmP4o4HLtFHZ\nK0nSQMcf6c4kjwV+DXhgkotG7rofcNw4C5MkTcYRgwE4ATipf9x9R5bfCTxjXEVJkibniMFQVZ8G\nPp3kv1bV7etUkyRpglaaMSw4McnbgTNHn1NVvzGOoiRJkzM0GP4M+C/AO4G7x1eOJGnShgbDXVX1\nJ2OtROOVsAf8ZJKkFQ39uOpVSV6e5EFJ7r9wGWtlkqSJGBoMLwR+B/gc8IX+8vmVnpTkXUkOJLl5\nZNn2JN9KcmN/seWnJG0gg3YlVdVDj3L9l9N9Ee49i5ZfVlWXHeU6JUljNCgYkrxgqeVVtfgX/uL7\nP5PkIUutcsjrSpLW39CDz48Zuf4zwBOBGzl8JjDUK5P8Jt3uqNdW1Q+Ocj2SpDU2dFfSq0ZvJzkF\n+OBRvubbgEuqqpL8AXAZ8NvLPXjHjh33XJ+bm2Nubu4oX3Zyzt2+e9IldL2SNkIdktbc/Pw88/Pz\na7a+oTOGxX4MHNVxh6r67sjNdwBXHenxo8EgSTrc4j+ad+7ceUzrG3qM4Sqg+pvHAb8IXDHwNcLI\nMYUkp1fV/v7mhcCXBq5HkrQOhs4Y/uPI9buA26vqWys9Kcn7gTngZ5N8E9gOPCHJ2cBB4DbgZasp\nWJI0XkOPMXw6yVYOHYT+2sDnPW+JxZcPrE2SNAFDz+D2LOB64JnAs4Drkth2W5Km0NBdSb8HPKaq\n7gBI8kDgk8CHxlWY1pi9kiQNNLQlxpaFUOj99SqeK0naRIbOGD6RZDfwgf72s4GPj6ckSdIkrXTO\n54cDW6vqd5JcCDyuv+svgfeNuzhJ0vpbacbwn4DXA1TVh4EPAyT55f6+fzHW6iRJ626l4wRbq2rf\n4oX9sjPHUpEkaaJWmjGccoT77rWWhWjMJtAraanX27Pz/HWtQdLqrTRj+HySly5emOQldCfrkSRN\nmZVmDP8G+EiS53MoCH4VOAF4+jgLkyRNxhGDoaoOAL+W5AnAL/WL/1tVXTv2yiRJEzG0V9KngE+N\nuRZJ0gbgt5clSQ2DYVYk7Llk26SrkLQJGAySpIbBIElqGAySpIbBIElqDG27PRNs4XDIer8XvvfS\nxuGMYVZUefY2SYMYDJKkhsEgSWoYDJKkhsEgSWoYDJKkhsEwK+yVJGkgg0GS1DAYJEkNg0GS1LAl\nxhpbqrXDer/WalpJrGe9kjYHZwySpIbBMCvslSRpIINBktQwGCRJDYNBktQwGCRJDYNBktQwGGaF\nvZIkDWQwSJIaBoMkqWFLjCm0VJuLPROoYym24JA2PmcMkqSGwSBJargraUbYJ0nSUM4YJEkNg0GS\n1BhrMCR5
V5IDSW4eWXZqkquT3Jpkd5KTx1mDJGl1xj1juBxYfDqxi4FPVtUvANcCrx9zDZKkVRhr\nMFTVZ4DvL1p8AbCrv74LeNo4a5Akrc4kjjGcVlUHAKpqP3DaBGqYOXsu2WavJEmDbISDzzXpAiRJ\nh0ziewwHkmytqgNJTgfuONKDd+zYcc/1ubk55ubmjrkA2zJImibz8/PMz8+v2frWIxjSXxZ8DHgR\ncCnwQuDKIz15NBgkSYdb/Efzzp07j2l94/646vuBzwE/n+SbSX4LeDPw5CS3Ak/sb0uSNoixzhiq\n6nnL3PWkcb6uJOno2StpRtgrSdJQG+FTSZKkDcRgkCQ1DAZJUsNgkCQ1DAZJUsNgmBH2SpI0lB9X\n1TEbV4uR5da7Z+fiTu6S1pIzBklSw2CQJDUMBklSw2CQJDU8+Dwj7JUkaShnDJKkhsEgSWoYDJKk\nhsEgSWoYDJKkhsEwI+yVJGkoP66qqWa/JWn1nDFIkhoGgySpYTBIkhoGgySp4cHnGWGvJElDOWOQ\nJDUMBklSw2CQJDUMBklSw2CQJDX8VNKMWOiTdCyfTlquvYSk6eKMQZLUMBgkSQ2DQZLUMBgkSQ2D\nQZLU8FNJM8JeSZKGcsYgSWoYDJKkhsEgSWoYDJKkhgefV2AbiM3B7SStHWcMM2LPJdvu6ZckSUdi\nMEiSGgaDJKlhMEiSGgaDJKkxsU8lJbkN+AFwEPi7qjpnUrVIkg6Z5MdVDwJzVfX9CdYwM+yVJGmo\nSe5KyoRfX5K0hEn+Yi7gmiQ3JHnpBOuQJI2Y5K6k86rqO0keSBcQt1TVZyZYjySJCQZDVX2n//e7\nST4CnAMcFgw7duy45/rc3Bxzc3PrVKEkbQ7z8/PMz8+v2fpSVWu2ssEvmtwb2FJVP0pyH+BqYGdV\nXb3ocTWO+uyrs7nt2Xn+YctWu02XWoc0LZJQVTna509qxrAV+EiS6mt43+JQ0Npa6JPkp5MkrWQi\nwVBV3wDOnsRrS5KOzI+LSpIaBoMkqWEwSJIaBoMkqeGpPWeEn0aSNJQzBklSw2CQJDWmfleS33Ke\nPuPapsutdy2+Jb3Uuv329dryPV47zhgkSQ2DQZLUMBhmxJ5Ltt3TL0mSjsRgkCQ1DAZJUsNgkCQ1\nDAZJUsNgkCQ1pv4LburYK0nSUM4YJEkNZwzSUViL9hlr0drDlg8aB2cMkqSGwSBJahgMkqSGwTAj\n7JUkaSiDQZLUMBgkSQ2DQZLUMBgkSQ2DQZLU8JvPM8JeSZKG2pTBsFQrAVsDaDVW045iLVpXjIv/\nFzQO7kqSJDUMBklSw2CQJDUMBklSw2CYEfZKkjSUwSBJahgMkqSGwSBJahgMkqSGwSBJamzKlhha\nPXslSRpqaoJhI/ez0ezYCD+Hy9Wwmh5KazGOpV5vLWpbC+N6j9ZiHBvhPXJXkiSpYTBIkhoGgySp\nYTBIkhoGw4ywV5KkoSYWDEm2Jflqkv+R5HWTqkOS1JpIMCTZAvxn4HzgUcBzk5w1iVom6c7bvjjp\nEsZqmsc3zWMDxzfrJjVjOAf4WlXdXlV/B3wQuGBCtUzMnbfdPOkSxmqaxzfNYwPHN+smFQw/B/zV\nyO1v9cskSRPmwWdJUiNVtf4vmpwL7Kiqbf3ti4GqqksXPW79i5OkKVBVOdrnTioYjgNuBZ4IfAe4\nHnhuVd2y7sVIkhoTaaJXVXcneSVwNd3urHcZCpK0MUxkxiBJ2rg25MHnafzyW5LbknwxyU1Jru+X\nnZrk6iS3Jtmd5ORJ1zlUknclOZDk5pFly44nyeuTfC3JLUmeMpmqh1tmfNuTfCvJjf1l28h9m2Z8\nSc5Icm2SLyfZl+TV/fKp2H5LjO9V/fJp2X4nJrmu/12yL8n2fvnabb+q2lAXurD6OvAQ4O8Be4Gz\nJl3XGozrfwGnLlp2KfC7/fXXAW+edJ2rGM/jgLOBm1caD/BI4Ca6XZdn9t
s3kx7DUYxvO3DREo/9\nxc00PuB04Oz++kl0x/vOmpbtd4TxTcX262u+d//vccAeuu+Grdn224gzhmn98ls4fIZ2AbCrv74L\neNq6VnQMquozwPcXLV5uPE8FPlhVd1XVbcDX6LbzhrXM+KDbjotdwCYaX1Xtr6q9/fUfAbcAZzAl\n22+Z8S18T2rTbz+AqvpJf/VEul/4xRpuv40YDNP65bcCrklyQ5KX9Mu2VtUB6H6YgdMmVt3aOG2Z\n8Szept9m827TVybZm+SdI1P1TTu+JGfSzYz2sPzP4zSM77p+0VRsvyRbktwE7AeuqaobWMPttxGD\nYVqdV1W/Avwz4BVJ/gldWIyatk8CTNt43gY8rKrOpvsP+YcTrueYJDkJ+BDwmv4v66n6eVxifFOz\n/arqYFU9mm6md06SR7GG228jBsO3gQeP3D6jX7apVdV3+n+/C3yUbip3IMlWgCSnA3dMrsI1sdx4\nvg38g5HHbcptWlXfrX6nLfAODk3HN934khxP90vzvVV1Zb94arbfUuObpu23oKruBOaBbazh9tuI\nwXAD8PAkD0lyAvAc4GMTrumYJLl3/9cLSe4DPAXYRzeuF/UPeyFw5ZIr2LhCu892ufF8DHhOkhOS\nPBR4ON2XGje6Znz9f7YFFwJf6q9vxvG9G/hKVb11ZNk0bb/Dxjct2y/JAxZ2gyW5F/BkuuMoa7f9\nJn10fZkj7tvoPknwNeDiSdezBuN5KN2nq26iC4SL++X3Bz7Zj/Vq4JRJ17qKMb0f+N/AT4FvAr8F\nnLrceIDX030a4hbgKZOu/yjH9x7g5n5bfpRun+6mGx9wHnD3yM/kjf3/uWV/HqdkfNOy/X65H9Pe\nfjy/1y9fs+3nF9wkSY2NuCtJkjRBBoMkqWEwSJIaBoMkqWEwSJIaBoMkqWEwCIAkB5P8h5Hbr03y\nxjVa9+VJLlyLda3wOs9I8pUkf3GM67kgyVkjt3cm+Y1jr1DaHAwGLfgpcGGS+0+6kFHpTgM71G8D\nL6mqJx7jyz4NeNTCjaraXlXXHuM6N5QkR30+YE0/g0EL7gLeDly0+I7Ff/En+WH/7+OTzCf5aJKv\nJ3lTkuf1JxH5Yv/1+wVP7jvLfjXJP++fvyXJW/rH703y0pH1/vckVwJfXqKe5ya5ub+8qV/27+jO\nofCuJJcuevx9knwyyef7up46ct8LcugESruSPJauTfFb+pO5PHR0/Em+kWRHki/0z/v5fvkD+pOk\n7EvyjnQnZjosZJO8Lcn1GTnBysh6L+3HtCfJw0be+z9ZxXu35Fj7FjNf7ce4DzhjhVqWGuN9kry7\nr3Fvkqf3y5+c5HP9a/5pknv3y9+c5Ev9Y9+y+L3QBjbpr3d72RgX4E66k5p8A7gv8Frgjf19lwMX\njj62//fxwPfo2vueQNcifXt/36uBy0ae//H++sPpWgCfALwUeEO//AS6PlkP6df7Q+DBS9T5IOB2\nuq//bwH+Anhqf9+ngEcv8ZwtwEn99Z+lO98HdLOCr9KfQIm+hcAS473ndv/+vLy//q+Bt/fX/wh4\nXX/9fLqWDPdfopZTRmr6FPBLI+tdaJXym8BVR/neLTfWh9CF/2MG1rLUGN+8sE372yf3r/Fp4F79\nst8Ffr/fPl8deez9Jv0z7mX4xRmD7lFda+JdwGtW8bQbquqOqvpb4H/S9WiBrifUmSOPu6J/ja/3\njzuLrpngC9L1lb+O7pfJI/rHX19V31zi9R4DfKqqvldVB4H3Ab8+cv9Su0i2AG9K8kW6XjJ/P8lp\nwBOAP6uq7/e1/d+BY/5I/+8XRsb4OLqTSlFVu1n6JD/QNTP7Al0Pn0f2lwUf7P/9AHDuyPLVvHdb\ngDcvMVaA26vr2z+klqXG+CTgjxceUFU/6Ot8JPDZvpYX0HVH/gHwN+nOe/B04G+WeT+0AR0/6QK0\n4byVrkHX5SPL7qLf7djvmz5h5L6fjl
w/OHL7IO3P12hTrvS3A7yqqq4ZLSDJ44EfH6HG1e4ffz7w\nALrZxMEk3wB+5ijXBYfGeDfL/x86bL3pThrzWuAfV9WdSS4fqQPa92i56yu9dy+k+yt+qbH+eORx\nK9UyZIwL9VxdVc9fYrznAE8Engm8sr+uTcAZgxYEoP/r+Qq6A7kLbgN+tb9+Ad25uFfrmen8Q7pu\ns7cCu4GXp+udT5JHLOyfPoLrgV9Pcv90B6afS9eP/khOBu7of1E+gW63CsC1wDMWjgUkObVf/kPg\nfqsbHp8Fnt2v5ynAKUs85n7Aj4Afpuub/08X3f/s/t/nAH85snw1791yY4U2rFaqZSnXAK+4Z2XJ\nKXRnfjuvr22hxfwj0rWXP6WqPkF33OofDVi/NghnDFow+lfpH9L9Ahg9qcmV/a6C3Sz/1/yRWvV+\nk+6X+n2Bl1XV3yZ5J91uihv7mcgdrHDe66ran+RiDoXBn1fVn6/w+u8Drup3r3yervUwVfWVJP8e\n+HSSu+h2qbyYbpfOO5K8CngGy//1Pmon8P4k/4rul/p+uoAZrf3mJHv71/8r4DOL1nFqX+P/owu8\nBat575Yc6+LaV6hluTH+AfDH/cHru4CdVfXRJC8CPpDkxP65v9+P/cokC7OQf7vMOrUB2XZbWgPp\nTip1d1XdneRc4G3Vncp16PO/Qbdb53uLll9OdyD6w2tbsbQ8ZwzS2ngwcEWSLXT751+6yucv9xea\nf7lp3TljkCQ1PPgsSWoYDJKkhsEgSWoYDJKkhsEgSWoYDJKkxv8HJVBea0DIohsAAAAASUVORK5C\nYII=\n", | |
"text/plain": [ | |
"<matplotlib.figure.Figure at 0x12e868cd0>" | |
] | |
}, | |
"metadata": {}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"# How many movies did the average Oscar-winner act in?\n", | |
"\n", | |
"movie_count = []\n", | |
"for d in data:\n", | |
"    try:\n", | |
"        movies = d['actor']\n", | |
"    except KeyError:\n", | |
"        movies = d['actress']\n", | |
"    movie_count.append(len(movies))\n", | |
"\n", | |
"medcount = np.median(movie_count)\n", | |
"print \"Average of {0:.0f} acting appearances for each Oscar winner.\".format(medcount)\n", | |
"\n", | |
"fig = plt.figure(figsize=(6,6))\n", | |
"ax = fig.add_subplot(111)\n", | |
"\n", | |
"ax.hist(movie_count,bins=np.arange(60)*5)\n", | |
"ax.axvline(medcount,ls='--',color='r')\n", | |
"ax.set_xlabel(\"Number of acting appearances\")\n", | |
"ax.set_ylabel(\"Count\");" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 62, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"# Make a big list of every acting appearance for every Oscar winner\n", | |
"\n", | |
"all_movies = []\n", | |
"for d in data:\n", | |
" try:\n", | |
" movies = d['actor']\n", | |
" except KeyError:\n", | |
" movies = d['actress']\n", | |
" for m in movies:\n", | |
" all_movies.append(m)\n", | |
"\n", | |
"# Count how often each movie appears in the master list\n", | |
"c = Counter(all_movies)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 80, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"(<Movie id:0361185[http] title:_Freedom: A History of Us (2003)_>, 23)\n", | |
"(<Movie id:0048893[http] title:_Playhouse 90 (1957)_>, 20)\n", | |
"(<Movie id:0046637[http] title:_Producers' Showcase (1955)_>, 16)\n", | |
"(<Movie id:0056742[http] title:_Bob Hope Presents the Chrysler Theatre (1964)_>, 15)\n", | |
"(<Movie id:0045395[http] title:_General Electric Theater (1956)_>, 12)\n", | |
"(<Movie id:0042141[http] title:_Robert Montgomery Presents (1952)_>, 12)\n", | |
"(<Movie id:0041024[http] title:_The Ford Television Theatre (1954)_>, 11)\n", | |
"(<Movie id:0048893[http] title:_Playhouse 90 (1958)_>, 11)\n", | |
"(<Movie id:0048893[http] title:_Playhouse 90 (1959)_>, 11)\n", | |
"(<Movie id:0446859[http] title:_Play of the Week (1960)_>, 11)\n" | |
] | |
} | |
], | |
"source": [ | |
"for x in c.most_common(10):\n", | |
" print x" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"So these should be the top movies with the most Oscar winners - *Freedom: A History of Us* apparently had 23 Oscar winners appear in it. But let's look closer." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"m = ia.search_movie(\"Freedom: A History of Us\")\n", | |
"freedom = m[0]\n", | |
"ia.update(freedom)\n", | |
"\n", | |
"# Who was in it?\n", | |
"\n", | |
"freedom_actors = []\n", | |
"for d in data:\n", | |
" try:\n", | |
" movies = d['actor']\n", | |
" except KeyError:\n", | |
" movies = d['actress']\n", | |
" if freedom in movies:\n", | |
" print d\n", | |
" freedom_actors.append(d)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"So there are at least a couple problems with this. First, there are clearly actors showing up more than once for a single film." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 160, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Only 18 unique Oscar-winning actors in this film.\n" | |
] | |
} | |
], | |
"source": [ | |
"print \"Only {0} unique Oscar-winning actors in this film.\".format(len(set(freedom_actors)))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Still quite a lot, but I need to check for uniqueness from now on. In addition, there's the issue of category:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 78, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"tv series\n" | |
] | |
} | |
], | |
"source": [ | |
"print freedom['kind']" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"As said above, I'm only counting theatrical movies for this. The top 10 hits in the list above are dominated by television productions, with a huge emphasis on anthology series of 1950s and 60s like *Playhouse 90*. These shows had rotating casts, with new people every week; while the total amount of talent is impressive, it doesn't fit the criteria we're looking for. So I need to limit the results to theatrically released movies.\n", | |
"\n", | |
"The result from IMDb *should* tell me what kind of feature it is. However, there's a problem with the current data." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 105, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Counter({u'movie': 27})\n" | |
] | |
} | |
], | |
"source": [ | |
"movies = ex['actress']\n", | |
"kinds = [m['kind'] for m in movies]\n", | |
"\n", | |
"print Counter(kinds)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"According to the filmography attached to the actor object, every appearance is a theatrical movie. But when I run a different IMDb search, the type changes. Example:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 95, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"u'tv series'" | |
] | |
}, | |
"execution_count": 95, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"movie7 = ia.get_movie(movies[7].movieID)\n", | |
"movie7['kind']" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"So I need to double-check the status of each movie in our top list, since the results I currently have can't be trusted. Since this involves querying the IMDb API again (and I don't want to overload it), I'll only do this for potential movies with 6 or more Oscar-winning actors." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 197, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"210 works with 6 or more actors.\n" | |
] | |
} | |
], | |
"source": [ | |
"sixplus = []\n", | |
"nlim = 6\n", | |
"for movie in c:\n", | |
" if c[movie] >= nlim:\n", | |
" sixplus.append(movie)\n", | |
" \n", | |
"print \"{0} works with {1} or more actors.\".format(len(sixplus),nlim)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 104, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"# Only run this cell once, if at all possible.\n", | |
"\n", | |
"movies_only = []\n", | |
"for m in sixplus:\n", | |
" movie = ia.get_movie(m.movieID)\n", | |
" if movie['kind'] == 'movie':\n", | |
" movies_only.append(movie)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 106, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"107 works with 6 or more actors are theatrically-released movies.\n" | |
] | |
} | |
], | |
"source": [ | |
"print \"{0} works with {1} or more actors are theatrically-released movies.\".format(len(movies_only),n)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Check again to make sure I don't have doubled-up actors (like this one):" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 186, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"# Store final counts, title, and actors in a dictionary\n", | |
"final_counts = {}\n", | |
"\n", | |
"for m in movies_only:\n", | |
" actors = []\n", | |
" for d in data:\n", | |
" try:\n", | |
" movies = d['actor']\n", | |
" except KeyError:\n", | |
" movies = d['actress']\n", | |
" # Movie ID is more reliable than just the Movie object itself\n", | |
" movie_ids = [x.movieID for x in movies]\n", | |
" if m.movieID in movie_ids:\n", | |
" actors.append(d['name'])\n", | |
" s = set(actors)\n", | |
" n = len(s)\n", | |
" d2 = {m['title']:list(s)}\n", | |
" if final_counts.has_key(n):\n", | |
" final_counts[n].append(d2)\n", | |
" else:\n", | |
" final_counts[n] = [d2]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 187, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"The most Oscar winners in a single film is 8.\n", | |
"3 films share this distinction.\n", | |
"\n", | |
"\n", | |
"\"Around the World in Eighty Days\"\n", | |
"\tCharles Coburn\n", | |
"\tJohn Mills\n", | |
"\tRonald Colman\n", | |
"\tFrank Sinatra\n", | |
"\tVictor McLaglen\n", | |
"\tJohn Gielgud\n", | |
"\tShirley MacLaine\n", | |
"\tDavid Niven\n", | |
"\n", | |
"\"The Greatest Story Ever Told\"\n", | |
"\tJoseph Schildkraut\n", | |
"\tJosé Ferrer\n", | |
"\tMartin Landau\n", | |
"\tShelley Winters\n", | |
"\tJohn Wayne\n", | |
"\tCharlton Heston\n", | |
"\tVan Heflin\n", | |
"\tSidney Poitier\n", | |
"\n", | |
"\"Hamlet\"\n", | |
"\tRobin Williams\n", | |
"\tJohn Mills\n", | |
"\tJack Lemmon\n", | |
"\tKate Winslet\n", | |
"\tJulie Christie\n", | |
"\tJudi Dench\n", | |
"\tJohn Gielgud\n", | |
"\tCharlton Heston\n" | |
] | |
} | |
], | |
"source": [ | |
"# Print the final results\n", | |
"\n", | |
"nmax = max(final_counts.keys())\n", | |
"\n", | |
"print \"The most Oscar winners in a single film is {0}.\".format(nmax)\n", | |
"print \"{0} films share this distinction.\\n\".format(len(final_counts[nmax]))\n", | |
"\n", | |
"\n", | |
"for x in final_counts[nmax]:\n", | |
" title = x.keys()[0]\n", | |
" print '\\n\"{0}\"'.format(title)\n", | |
" for films in x[title]:\n", | |
" print '\\t{0}'.format(films.encode('utf-8'))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"So our answer is **8 Oscar winners in one film**, in a three-way tie between *Around the World in Eighty Days*, *The Greatest Story Ever Told*, and *Hamlet*. \n", | |
"\n", | |
"### *Around the World in Eighty Days* (1956)\n", | |
"\n", | |
"*Around the World in Eighty Days* won Best Picture in 1956, beating both *The Ten Commandments* and *The King and I*. It was notable for the huge number of cameos in the film, with over 40 Hollywood celebrities making short appearances. **David Niven** and **Shirley MacLaine** are the only two Oscar winners in the film with significant screen time. Almost the entire cast has long-since retired or passed away in the sixty years since the movie was released, so it's almost certain to stay at its current number of Oscar-winners.\n", | |
"\n", | |
"Of the eight Oscar-winners, 4 (**Colman, Coburn, Sinatra, McLaglen**) had already won their award by the time they appeared in the film, and 4 (**Niven, MacLaine, Gielgud, Mills**) won their award afterwards. \n", | |
"\n", | |
"### *The Greatest Story Ever Told* (1965)\n", | |
"\n", | |
"*The Greatest Story Ever Told* was also known for its large cast, although **Charlton Heston**, **José Ferrer**, **Martin Landau**, and **Joseph Schildkraut** all had significant, non-cameo parts. It was nominated for five Oscars, but ended up winning none (and wasn't even nominated for Best Picture). Lead actor **Max von Sydow** is still active, though well into his 80s, and was nominated for his second Academy Award in 2012. **Angela Lansbury** also appeared in the film and has been nominated for Best Supporting Actress three times. Either could possibly still push this film to a record 9. \n", | |
"\n", | |
"Of the eight Oscar-winners, 5 (**Heston, Ferrer, Schildkraut, Heflin, Poitier**) had already won their award by the time they appeared in the film, and 2 (**Landau, Wayne**) won their award afterwards. **Shelley Winters** won two Oscars, one before and one after appearing in this film. \n", | |
"\n", | |
"### *Hamlet* (1996)\n", | |
"\n", | |
"*Hamlet* was nominated for four Oscars (also winning none and not nominated for Best Picture). **Richard Attenborough** appears in this film and did win an Oscar, but for directing (*Gandhi* in 1983). Director and star of the film **Kenneth Branagh** has been nominated for five Oscars in his career, including two for acting, although he hasn't yet won. Another possible candidate for becoming the lone recordholder someday.\n", | |
"\n", | |
"Of the eight Oscar-winners, 5 (**Christie, Lemmon, Heston, Gielgud, Mills**) had already won their award by the time they appeared in the film, and 3 (**Winslet, Williams, Dench**) won their award afterwards. " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"It's interesting that none of these movies, crammed to the brim with award-winners and future award-winners, had an Oscar-winning performance in this film." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Runners-up" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 198, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"\t 6 films have 7 Oscar-winning actors in them:\n", | |
"\t\tMain Street to Broadway\n", | |
"\t\tThe Stolen Jools\n", | |
"\t\tHow the West Was Won\n", | |
"\t\tPepe\n", | |
"\t\tIn This Our Life\n", | |
"\t\tThe Swarm\n", | |
"\t13 films have 6 Oscar-winning actors in them:\n", | |
"\t\tBreakdowns of 1938\n", | |
"\t\tGone with the Wind\n", | |
"\t\tPrêt-à-Porter\n", | |
"\t\tMurder on the Orient Express\n", | |
"\t\tA Time to Kill\n", | |
"\t\tBen-Hur: A Tale of the Christ\n", | |
"\t\tThe Good Shepherd\n", | |
"\t\tA Bridge Too Far\n", | |
"\t\tVariety Girl\n", | |
"\t\tForever and a Day\n", | |
"\t\tThe Longest Day\n", | |
"\t\tThe First Wives Club\n", | |
"\t\tNine\n" | |
] | |
} | |
], | |
"source": [ | |
"for n in np.arange(nmax,nlim,-1)-1:\n", | |
" mt = [x.keys()[0] for x in final_counts[n]]\n", | |
" mtstring = '\"'+('\", \"'.join(mt))+'\"'\n", | |
" print \"\\t{0:2d} films have {1} Oscar-winning actors in them:\".format(len(mt),n)\n", | |
" for mts in mt:\n", | |
" print \"\\t\\t{0}\".format(mts.encode('utf-8'))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Notes:\n", | |
"\n", | |
"* None of the films with 7 Oscar-winning actors have good odds to move up. *The Swarm* (1978) is the only movie in the set made after 1962, meaning almost all cast members from these films are either retired or dead. \n", | |
"* Of the films with 6 Oscar-winning actors (so far), *Prêt-à-Porter* (1994), *The First Wives Club* (1996), *A Time to Kill* (1996), *The Good Shepherd* (2006), and *Nine* (2009), are all relatively recent and could have actors with long careers ahead of them.\n", | |
"* *Prêt-à-Porter* [released as *Ready to Wear (Prêt-à-Porter)* in the US] had **Cher** appearing in the movie as herself. If this is counted as a role, then it would bring the total number of Oscar-winning actors up to 7. \n", | |
"* The film *Nine* was heavily marketed as starring six actors who had *already* won Oscars: **Daniel Day-Lewis**, **Sophia Loren**, **Marion Cotillard**, **Penelope Cruz**, **Nicole Kidman**, and **Judi Dench**. This is a possible record for award-winners at the time of filming, although I haven't checked it explicitly." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 2", | |
"language": "python", | |
"name": "python2" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 2 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython2", | |
"version": "2.7.12" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 0 | |
} |