@cmarat
Created March 19, 2015 08:39
{
"metadata": {
"name": "",
"signature": "sha256:e3b545bd63c280510a0335a5a4227bb4ab0a0c8086036bdbc699b17a414c603f"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "code",
"collapsed": false,
"input": [
"from nltk.corpus import treebank"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "code",
"collapsed": false,
"input": [
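"def is_verb_particle_vp(t):\n",
"    # Match a VP whose first child is a verb (VB*) and whose third child is a\n",
"    # single-child constituent labelled PR* (e.g. PRT, as in 'gave it up').\n",
"    return (\n",
"        t.label() == 'VP'\n",
"        and len(t) > 2\n",
"        and len(t[2]) == 1\n",
"        and t[2].label()[:2] == 'PR'\n",
"        and t[0].label()[:2] == 'VB'\n",
"        # and t[1].label() == 'NP'\n",
"        and t[1][0].label() != '-NONE-'  # second child must not start with an empty element\n",
"    )\n",
"\n",
"def tree_match(s):\n",
"    return list(s.subtrees(filter=is_verb_particle_vp))"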
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"matches = ((i, ' '.join(s.leaves())) for i, s in enumerate(treebank.parsed_sents()) if tree_match(s))"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for i, s in matches:\n",
" print(\"[{}] {}\".format(i, s))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"[656] `` The effect will be * to pull Asia together not as a common market but as an integrated production zone , '' says 0 *T*-1 Goldman Sachs 's Mr. Hormats .\n",
"[675] `` They do n't want Japan to monopolize the region and sew it up , '' says *T*-1 Chong-sik Lee , professor of East Asian politics at the University of Pennsylvania .\n",
"[746] `` She just never gave it up , '' says *T*-1 Mary Marchand , Mary Beth 's mother ."
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[933] * Filling out detailed forms about these individuals would tip the IRS off and spark action against the clients , he said 0 *T*-1 ."
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[1208] In Detroit , a Chrysler Corp. official said 0 the company currently has no rear-seat lap and shoulder belts in its light trucks , but plans *-1 to begin *-2 phasing them in by the end of the 1990 model year ."
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[1244] `` This is the peak of my wine-making experience , '' Mr. Winiarski declared *T*-1 when he introduced the wine at a dinner in New York *T*-2 , `` and I wanted *-3 to single it out as such . ''\n",
"[1427] Koito has refused *-1 to grant Mr. Pickens seats on its board , *-1 asserting 0 he is a greenmailer trying * to pressure Koito 's other shareholders into * buying him out at a profit ."
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[1589] But when the contract reopened *T*-1 , the subsequent flood of sell orders that *T*-211 quickly knocked the contract down to the 30-point limit indicated that the intermediate limit of 20 points was needed *-128 *-128 to help keep stock and stock-index futures prices synchronized ."
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[1775] `` Ideas are going over borders , and there 's no SDI ideological weapon that *T*-245 can shoot them down , '' he told a group of Americans *T*-1 at the U.S. Embassy on Wednesday ."
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[1960] But can Mr. Hahn carry it off ?"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[2099] `` The purpose of the bill is * to put the brakes on airline acquisitions that *T*-2 would so load a carrier up with debt that it would impede safety or a carrier 's ability * to compete , '' Rep. John Paul Hammerschmidt , -LRB- R. , Ark . -RRB- said *T*-1 ."
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[2361] Speculation had it that the company was asking $ 100 million *U* for an operation said * to be losing about $ 20 million *U* a year , but others said 0 Hearst might have virtually given the paper away ."
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[2608] If * slowing things down could reduce volatility , stone tablets should become the trade ticket of the future ."
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[2856] Last month , Phoenix voters turned thumbs down on a $ 100 million *U* stadium bond and tax proposition ."
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[3105] Syndicate officials at lead underwriter Salomon Brothers Inc. said 0 the debentures were snapped by up *-1 pension funds , banks , insurance companies and other institutional investors ."
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[3421] `` I sense that some people are reluctant *-2 to stick their necks out in any aggressive way until after the figures come out , '' said *T*-1 Richard Eakle , president of Eakle Associates , Fair Haven ,"
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n",
"[3751] Programs like Section 8 -LRB- A -RRB- are a little like * leaving gold in the street and then expressing surprise when thieves walk by *-2 to scoop it up *T*-1 ."
]
},
{
"output_type": "stream",
"stream": "stdout",
"text": [
"\n"
]
}
],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 4
}
],
"metadata": {}
}
]
}