More up-to-date material on this topic can be found here.
From WRDS:
The WRDS-created linking dataset (ccmxpf_linktable) has been deprecated. It will continue to be created for a transition period of 1 year. SAS programmers should use the Link History dataset (ccmxpf_lnkhist) from CRSP.
And from here:
LINKPRIM clarifies the link's relationship to Compustat's marked primary security within the related range. "P" indicates a primary link marker, as identified by Compustat in monthly security data. "C" indicates a primary link marker, as identified by CRSP to resolve ranges of overlapping or missing primary markers from Compustat in order to produce one primary security throughout the company history. "J" indicates a joiner secondary issue of a company, identified by Compustat in monthly security data.
So let's check it out. Given any GVKEY and a date, is there only one PERMNO that is matched with linkprim IN ('P', 'C')
?
ccmxpf_lnkhist <-
tbl(pg, sql("SELECT * FROM crsp.ccmxpf_lnkhist"))
gvkey_permno <-
ccmxpf_lnkhist %>%
filter(linkprim %in% c("C", "P"),
!is.na(lpermno))
# Look for overlapping date ranges
gvkey_permno %>%
group_by(gvkey) %>%
arrange(gvkey, linkdt) %>%
mutate(lead_linkdt = lead(linkdt),
lag_linkenddt = lag(linkenddt)) %>%
filter(linkenddt >= lead_linkdt | lag_linkenddt >= linkdt)
Output:
> gvkey_permno %>%
+ group_by(gvkey) %>%
+ arrange(gvkey, linkdt) %>%
+ mutate(lead_linkdt = lead(linkdt),
+ lag_linkenddt = lag(linkenddt)) %>%
+ filter(linkenddt >= lead_linkdt | lag_linkenddt >= linkdt)
Source: query [?? x 10]
Database: postgres 9.6.1 [igow@iangow.me:5432/crsp]
Groups: gvkey
gvkey linkprim liid linktype lpermno lpermco linkdt linkenddt
<chr> <chr> <chr> <chr> <dbl> <dbl> <date> <date>
1 017196 P 01C LX 77571 28850 1989-11-17 1989-11-29
2 017196 P 01 LC 77571 28850 1989-11-17 1994-07-21
# ... with 2 more variables: lead_linkdt <date>, lag_linkenddt <date>
So there is just one case of overlapping dates, and it has linkprim
equal to P
in both rows. But the answer is effectively "no" because you get the same PERMNO (77571
) in either row.
Note that the question here begins with a GVKEY
and asks "given this GVKEY, which PERMNO provides the correct security-related information (return, stock price) for a given date?" The answer is given by lpermno
above. Note that there is no mention of datadate
(from Compustat) here. But it may be that I am interested in security information on datadate
and thus that would drive the selection of lpermno
.
Note that the vast majority of GVKEYs map to just one PERMNO:
> gvkey_permno %>%
+ select(gvkey, lpermno) %>%
+ distinct() %>%
+ group_by(gvkey) %>%
+ mutate(num_permnos = n()) %>%
+ ungroup() %>%
+ count(num_permnos) %>%
+ arrange(num_permnos)
Source: query [?? x 2]
Database: postgres 9.6.1 [igow@iangow.me:5432/crsp]
num_permnos n
<dbl> <dbl>
1 1 26520
2 2 1496
3 3 129
4 4 8
5 5 5
I think I've looked into the case with 5 PERMNOs before. If I recall correctly, it's a total mess with tracking stock, spin-offs, etc. But one observation doesn't matter much.
> stocknames <- tbl(pg, sql("SELECT * FROM crsp.stocknames"))
> gvkey_permno %>%
+ select(gvkey, lpermno) %>%
+ distinct() %>%
+ group_by(gvkey) %>%
+ mutate(num_permnos = n()) %>%
+ filter(num_permnos==5L) %>%
+ inner_join(ccmxpf_lnkhist) %>%
+ rename(permno=lpermno) %>%
+ inner_join(stocknames) %>%
+ arrange(linkdt) %>%
+ select(gvkey, permno, linkdt, linkenddt, ncusip, comnam)
Joining, by = c("gvkey", "lpermno")
Joining, by = "permno"
Source: query [?? x 6]
Database: postgres 9.6.1 [igow@iangow.me:5432/crsp]
Groups: gvkey
gvkey permno linkdt linkenddt ncusip comnam
<chr> <dbl> <date> <date> <chr> <chr>
1 032280 81875 1995-08-11 1999-03-09 87924V50 TELE COMMUNICATIONS INC NEW
2 032280 86746 1999-03-10 2001-08-09 00195720 A T & T CORP
3 032280 89130 2001-08-10 2006-05-09 53071810 LIBERTY MEDIA CORP NEW
4 032280 89130 2001-08-10 2006-05-09 53071810 LIBERTY MEDIA CORP NEW
5 032280 91275 2006-05-10 2011-09-25 53071M30 LIBERTY MEDIA CORP NEW
6 032280 13008 2011-09-26 <NA> 85571Q10 STARZ
7 032280 13008 2011-09-26 <NA> 53032210 LIBERTY MEDIA CORP 2ND NEW
8 032280 13008 2011-09-26 <NA> 53032210 LIBERTY MEDIA CORP 2ND NEW
It seems that there's no real difference between
ccmxpf_lnkhist
andccmxpf_linktable
once you throw out values with missinggvkey
orpermno
. This should probably be done any way. I figure the goal of these tables is to facilitateLEFT JOIN
s from CRSP or Compustat. But, if so, just useLEFT JOIN
.There are additional rows in
ccmxpf_lnkused
but these all haveusedflag=-1
and thus probably shouldn't be used.