@rmhrisk
Created March 18, 2024 22:45
from io import StringIO

import matplotlib.pyplot as plt
import pandas as pd
import requests
from cryptography import x509
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes


def download_csv(url):
    # Fetch a CSV report and wrap it in a file-like object for pandas
    response = requests.get(url, timeout=60)
    response.raise_for_status()
    return StringIO(response.text)


def compute_fingerprint(pem_data):
    # Return the uppercase hex SHA-256 fingerprint of a PEM certificate, or None on failure
    try:
        cert = x509.load_pem_x509_certificate(pem_data.encode(), default_backend())
        return cert.fingerprint(hashes.SHA256()).hex().upper()
    except Exception as e:
        print(f"Error computing fingerprint: {e}")
        return None


def extract_country_from_certificate(pem_data):
    # Return the issuer's C (country) values as a comma-separated string, or "" if there are none
    try:
        cert = x509.load_pem_x509_certificate(pem_data.encode(), default_backend())
        issuer_countries = [i.value for i in cert.issuer.get_attributes_for_oid(x509.NameOID.COUNTRY_NAME)]
        # Sort so the joined string is deterministic across runs
        return ",".join(sorted(set(issuer_countries)))
    except Exception as e:
        print(f"Error extracting country: {e}")
        return ""


def generate_pie_chart_with_legend(ca_countries):
    # Transform the ca_countries into a DataFrame
    country_counts = pd.Series(ca_countries).value_counts().rename_axis('Country').reset_index(name='Counts')
    # Increase the figure size to make more room for the pie chart and the legend
    fig, ax = plt.subplots(figsize=(15, 7))
    # Create the pie chart with the autopct set to display percentages
    wedges, _, autotexts = ax.pie(
        country_counts['Counts'],
        startangle=140,
        autopct='%1.1f%%',
        textprops=dict(color="w")
    )
    # Draw a circle at the center to make it a donut chart
    ax.add_artist(plt.Circle((0, 0), 0.70, color='white'))
    # Set legend with country names and percentages, placed on the right side.
    # The counts are converted to a share of the total so the "%" label is accurate.
    total = country_counts['Counts'].sum()
    legend_labels = [f"{country}: {count / total:.2%}" for country, count in zip(country_counts['Country'], country_counts['Counts'])]
    ax.legend(wedges, legend_labels, title="Country", loc="center left", bbox_to_anchor=(1.1, 0.5))
    # Adjust figure to prevent cutoff of legend or labels
    plt.subplots_adjust(left=0.1, bottom=0.1, right=0.75)
    # Set the title and show the plot
    plt.title('Country Distribution of Certificate Authorities')
    plt.show()


def generate_trusted_ca_markdown_table_from_url(ca_url, roots_url):
    ca_csv_data = download_csv(ca_url)
    ca_data = pd.read_csv(ca_csv_data)
    ca_data = ca_data[ca_data['Certificate Record Type'] == 'Root Certificate']
    roots_csv_data = download_csv(roots_url)
    roots_data = pd.read_csv(roots_csv_data)
    roots_data['Computed SHA-256 Fingerprint'] = roots_data['PEM'].apply(compute_fingerprint)
    fingerprint_to_country = dict(zip(roots_data['Computed SHA-256 Fingerprint'], roots_data['PEM'].apply(extract_country_from_certificate)))
    trusted_roots = {}
    ca_countries = {}
    for _, row in ca_data.iterrows():
        ca_owner = row['CA Owner']
        fingerprint = row.get('SHA-256 Fingerprint', '')
        country = fingerprint_to_country.get(fingerprint, "Unknown")  # Use "Unknown" for CAs without a country
        status = row['Status of Root Cert']
        # Only include CAs that are trusted by at least one program
        if any(trust in status for trust in ["Apple: Included", "Google Chrome: Included", "Microsoft: Included", "Mozilla: Included"]):
            if ca_owner not in trusted_roots:
                trusted_roots[ca_owner] = set()
                ca_countries[ca_owner] = country if country else "Unknown"
            # Check for inclusion by each program
            if "Apple: Included" in status:
                trusted_roots[ca_owner].add("Apple")
            if "Google Chrome: Included" in status:
                trusted_roots[ca_owner].add("Google Chrome")
            if "Microsoft: Included" in status:
                trusted_roots[ca_owner].add("Microsoft")
            if "Mozilla: Included" in status:
                trusted_roots[ca_owner].add("Mozilla")
    # Generating markdown table
    markdown_table = "CA Owner | Countries | Apple | Google Chrome | Microsoft | Mozilla\n"
    markdown_table += "--- | --- | --- | --- | --- | ---\n"
    for ca_owner, stores in trusted_roots.items():
        countries = ca_countries.get(ca_owner, "Unknown")
        row = [ca_owner, countries] + ["✓" if store in stores else "" for store in ["Apple", "Google Chrome", "Microsoft", "Mozilla"]]
        markdown_table += " | ".join(row) + "\n"
    markdown_table += f"\nTotal CAs: {len(trusted_roots)}\n"
    print(markdown_table)
    # Convert ca_countries to a list and then to a Series object for value counts
    ca_countries_list = list(ca_countries.values())
    generate_pie_chart_with_legend(ca_countries_list)


# URLs for the datasets
ca_url = 'https://ccadb.my.salesforce-sites.com/ccadb/AllCertificateRecordsCSVFormatv2'
roots_url = 'https://ccadb.my.salesforce-sites.com/mozilla/IncludedRootsDistrustTLSSSLPEMCSV?TrustBitsInclude=Websites'
# Generate the markdown table and plot the pie chart with legend
generate_trusted_ca_markdown_table_from_url(ca_url, roots_url)
@rmhrisk
Author

rmhrisk commented Mar 18, 2024

Above is a script I put together that lists the trusted Certificate Authorities (CAs) across all root stores, shows which root programs (Apple, Google Chrome, Microsoft, Mozilla) include each one, and prints the total count. The focus is on TLS (Transport Layer Security) trust.
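The script decides inclusion with substring checks such as `"Apple: Included" in status`. A slightly stricter alternative is to split the status string into per-program entries first. The sketch below assumes the field is laid out as `Program: State` pairs separated by semicolons; that layout, the helper name `included_programs`, and the sample string are assumptions for illustration, not something taken from the CCADB documentation.

```python
PROGRAMS = ("Apple", "Google Chrome", "Microsoft", "Mozilla")


def included_programs(status):
    """Return the set of root programs whose state in `status` is exactly 'Included'."""
    included = set()
    # Assumed layout: "Program: State; Program: State; ..."
    for part in status.split(";"):
        name, sep, state = part.partition(":")
        if sep and name.strip() in PROGRAMS and state.strip() == "Included":
            included.add(name.strip())
    return included


# Made-up status string, for illustration only
sample = "Apple: Included; Google Chrome: Not Included; Microsoft: Included; Mozilla: Removed"
print(sorted(included_programs(sample)))  # ['Apple', 'Microsoft']
```

Matching the state exactly avoids counting an entry like `Not Included` by accident if the wording of the field ever changes.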

NOTE: It's important to understand that although some CAs may be part of root programs, they might only be trusted for purposes other than TLS, such as S/MIME (this list exclusively includes CAs trusted for TLS).

To determine the country of origin for each CA, I utilized the issuer DN in the associated root certificate, specifically examining the C RDN. However, this method has limitations because:

  • Not all CAs include a C value,
  • Each CA entity may have multiple root certificates and each may have different C values,
  • These certificates have a long lifespan and may be sold or moved, affecting their original association,
  • The C value may not accurately reflect the actual legal jurisdiction associated with the business that owns the keys, and
  • The C value may not reflect the physical location of the keys.

Despite these issues, I've decided to include the country column to offer a rough overview of the geographic distribution of CAs. Should I come up with a better method in the future, I will update the script accordingly.
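Because one owner can have several roots with different or missing C values, one possible refinement is to collect every C value seen for an owner rather than a single one. This is only a sketch with made-up records. The helper name `countries_by_owner` is hypothetical, and it is not what the script above does (the script keeps the value from the first trusted root it encounters for each owner).

```python
from collections import defaultdict


def countries_by_owner(roots):
    """Map each CA owner to the sorted, comma-joined C values found across its roots."""
    seen = defaultdict(set)
    for owner, country in roots:
        countries = seen[owner]  # creates the entry even if this root has no C value
        if country:
            countries.add(country)
    # Owners whose roots carry no C value at all are reported as "Unknown"
    return {owner: ",".join(sorted(cs)) or "Unknown" for owner, cs in seen.items()}


# Made-up (owner, C value) pairs, one per root certificate
roots = [
    ("Example CA One", "US"),
    ("Example CA One", ""),
    ("Example CA One", "BE"),
    ("Example CA Two", None),
]
print(countries_by_owner(roots))
# {'Example CA One': 'BE,US', 'Example CA Two': 'Unknown'}
```

This still inherits the other limitations listed above: a C value says nothing certain about legal jurisdiction or where the keys physically are.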

Here's what the output looks like as of today:

[Image: screenshot of the script's output]
CA Owner Countries Apple Google Chrome Microsoft Mozilla
A-Trust Unknown
AC Camerfirma, S.A. Unknown
Actalis IT
Agence Nationale de Certification Electronique TN
Agencia Notarial de Certificación (ANCERT) Unknown
Amazon Trust Services US
Asseco Data Systems S.A. (previously Unizeto Certum) PL
Autoridad de Certificacion Firmaprofesional ES
Autoridad de Certificación (ANF AC) ES
BEIJING CERTIFICATE AUTHORITY Co., Ltd. CN
Buypass NO
Byte Computer S.A. Unknown
Carillon Information Security Inc. Unknown
Certainly LLC US
Certicámara Unknown
Certigna FR
certSIGN RO
China Financial Certification Authority (CFCA) Unknown
Chunghwa Telecom TW
CommScope US
ComSign Unknown
Consejo General de la Abogacía Española Unknown
Consorci Administració Oberta de Catalunya (Consorci AOC, CATCert) Unknown
Cybertrust Japan / JCSI JP
D-TRUST Unknown
Department of Defence Australia Unknown
Deutsche Telekom Security GmbH DE
DigiCert Unknown
DigitalSign - Certificadora Digital, S.A Unknown
Disig, a.s. SK
Docaposte Certinomis SAS Unknown
e-commerce monitoring GmbH AT
E-Tugra Unknown
Echoworx Unknown
EDICOM Unknown
eMudhra Technologies Limited Unknown
Entrust Unknown
Eviden DE
Financijska agencija (Fina) Unknown
Global Digital Cybersecurity Authority Co., Ltd. (Formerly Guang Dong Certificate Authority (GDCA)) CN
GlobalSign nv-sa Unknown
GoDaddy US
Google Trust Services LLC Unknown
Government of Brazil, Instituto Nacional de Tecnologia da Informação (ITI) Unknown
Government of Finland, Population Register Centre’s (Väestörekisterikeskus, VRK) Unknown
Government of Hong Kong (SAR), Hongkong Post, Certizen HK
Government of India, Ministry of Communications & Information Technology, Controller of Certifying Authorities (CCA) Unknown
Government of Korea, KLID Unknown
Government of Saudi Arabia, NCDC Unknown
Government of Spain, Autoritat de Certificació de la Comunitat Valenciana (ACCV) ES
Government of Spain, Fábrica Nacional de Moneda y Timbre (FNMT) Unknown
Government of Sweden (Försäkringskassan) Unknown
Government of The Netherlands, PKIoverheid (Logius) Unknown
Government of Turkey, Kamu Sertifikasyon Merkezi (Kamu SM) TR
Halcom D.D. Unknown
HARICA GR
IdenTrust Services, LLC US
Internet Security Research Group US
iTrusChina Co., Ltd. CN
Izenpe S.A. ES
Krajowa Izba Rozliczeniowa S.A. (KIR) PL
LAWtrust Unknown
Macao Post and Telecommunications Bureau Unknown
Microsec Ltd. HU
Microsoft Corporation Unknown
MULTICERT Unknown
NAVER Cloud Trust Services KR
Netlock Unknown
Netrust Pte Ltd Unknown
NISZ Nemzeti Infokommunikációs Szolgáltató Zrt. Unknown
Notarius Unknown
OISTE CH
Open Access Technology International, Inc. (OATI) Unknown
PostSignum Unknown
První certifikační autorita, a.s. Unknown
QuoVadis BM
SECOM Trust Systems CO., LTD. JP
Sectigo Unknown
Shanghai Electronic Certification Authority Co., Ltd. CN
SI-TRUST Unknown
SSL.com US
Swiss BIT, Swiss Federal Office of Information Technology, Systems and Telecommunication (FOITT) Unknown
SwissSign AG CH
Taiwan-CA Inc. (TWCA) TW
Telia Company Unknown
Thailand National Root Certificate Authority (Electronic Transactions Development Agency) Unknown
TrustAsia Technologies, Inc. CN
TrustFactory(Pty)Ltd Unknown
TurkTrust Unknown
Viking Cloud, Inc. US
Visa Unknown
Zetes Unknown

Total CAs: 92

While there are a few takeaways from this dataset, one thing that is clear is that Microsoft is the most permissive of the root programs.
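One way to put a number on that observation is to count how many CA owners each program includes. The sketch below assumes a mapping shaped like the script's `trusted_roots` dictionary (owner to set of store names). The helper name `owners_per_store` and the sample data are made up for illustration.

```python
from collections import Counter


def owners_per_store(trusted_roots):
    """Count, for each root store, how many CA owners it includes."""
    counts = Counter()
    for stores in trusted_roots.values():
        counts.update(stores)
    return counts


# Made-up data in the same shape as the script's trusted_roots dict
example = {
    "Example CA One": {"Apple", "Google Chrome", "Microsoft", "Mozilla"},
    "Example CA Two": {"Microsoft"},
    "Example CA Three": {"Apple", "Microsoft"},
}
for store, count in owners_per_store(example).most_common():
    print(f"{store}: {count}")
```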

@rmhrisk
Author

rmhrisk commented Mar 18, 2024

Hrm, looks like there is a bug in the country logic. I just noticed Google Trust Services showed up as Unknown, and its roots for sure have a C value. I'll investigate.
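Two things in the script could plausibly produce an Unknown here, though neither is confirmed as the cause. The country lookup is built only from the roots in `roots_url`, so a root missing from that report has no entry. In addition, the country is recorded only for the first trusted root seen per owner, so a first root with no match or no C value wins. A small diagnostic along these lines may help tell the cases apart. The function name `explain_country` and the fingerprints are made up for illustration.

```python
def explain_country(fingerprint, fingerprint_to_country):
    """Describe why a root would or would not get a country from the lookup."""
    if fingerprint not in fingerprint_to_country:
        return "root not in the country lookup"
    if not fingerprint_to_country[fingerprint]:
        return "root found, but its issuer has no C value"
    return "country found: " + fingerprint_to_country[fingerprint]


# Made-up lookup (fingerprint -> C value) and made-up (owner, fingerprint) records
lookup = {"AA11": "US", "BB22": ""}
records = [("Example CA", "CC33"), ("Example CA", "BB22"), ("Example CA", "AA11")]

for owner, fp in records:
    print(owner, fp, "->", explain_country(fp, lookup))
```

In this made-up case the owner's first root is absent from the lookup, so a first-root-wins rule would report Unknown even though a later root carries `US`.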

@rmhrisk
Author

rmhrisk commented Mar 19, 2024

If we filter the 92 CAs down to those that are in all root stores (Apple, Google Chrome, Microsoft, Mozilla), this is what we see:

CA Owner Countries Apple Google Chrome Microsoft Mozilla
Actalis IT
Amazon Trust Services US
Asseco Data Systems S.A. (previously Unizeto Certum) PL
Autoridad de Certificacion Firmaprofesional ES
Buypass NO
Certigna FR
certSIGN RO
China Financial Certification Authority (CFCA) Unknown
Chunghwa Telecom TW
D-TRUST Unknown
Deutsche Telekom Security GmbH DE
DigiCert Unknown
Disig, a.s. SK
e-commerce monitoring GmbH AT
eMudhra Technologies Limited Unknown
Entrust Unknown
Eviden DE
Global Digital Cybersecurity Authority Co., Ltd. (Formerly Guang Dong Certificate Authority (GDCA)) CN
GlobalSign nv-sa Unknown
GoDaddy US
Google Trust Services LLC Unknown
Government of Hong Kong (SAR), Hongkong Post, Certizen HK
Government of Spain, Autoritat de Certificació de la Comunitat Valenciana (ACCV) ES
Government of Spain, Fábrica Nacional de Moneda y Timbre (FNMT) Unknown
Government of Turkey, Kamu Sertifikasyon Merkezi (Kamu SM) TR
HARICA GR
IdenTrust Services, LLC US
Internet Security Research Group US
Izenpe S.A. ES
Microsec Ltd. HU
Microsoft Corporation Unknown
NAVER Cloud Trust Services KR
Netlock Unknown
OISTE CH
QuoVadis BM
SECOM Trust Systems CO., LTD. JP
Sectigo Unknown
SSL.com US
SwissSign AG CH
Taiwan-CA Inc. (TWCA) TW
Telia Company Unknown
Viking Cloud, Inc. US

Total CAs in all root stores: 41
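The filter itself is not shown in the gist. A minimal sketch of how it might be done, assuming the same owner-to-stores mapping the script builds in `trusted_roots`, follows; the helper name `in_all_stores` and the sample data are hypothetical.

```python
ALL_STORES = {"Apple", "Google Chrome", "Microsoft", "Mozilla"}


def in_all_stores(trusted_roots):
    """Keep only the owners whose set of stores covers all four programs."""
    return {owner: stores for owner, stores in trusted_roots.items() if ALL_STORES <= stores}


# Made-up data in the same shape as the script's trusted_roots dict
example = {
    "Example CA One": {"Apple", "Google Chrome", "Microsoft", "Mozilla"},
    "Example CA Two": {"Microsoft"},
    "Example CA Three": {"Apple", "Google Chrome", "Mozilla"},
}
filtered = in_all_stores(example)
print(f"Total CAs in all root stores: {len(filtered)}")  # Total CAs in all root stores: 1
```

Because `trusted_roots` merges the stores across all of an owner's roots, an owner passes this filter when every program includes at least one of its roots, not necessarily the same root.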

This matters because a WebPKI TLS certificate loses much of its utility if the root it chains to isn't included in every browser root store. Browser market share is split across several vendors, as illustrated by StatCounter, which could explain why Certificate Authorities (CAs) outside the set of broadly trusted roots issue few, if any, WebPKI TLS certificates. A CA needs inclusion in all of the major root stores before its certificates are trusted widely enough to be useful, which is the challenge faced by CAs operating outside this trusted circle.

By removing long-standing members of the root programs that have not issued a significant number of certificates, or no longer do, you could reduce the trusted set of CAs down to 51 certificates. This could result in an attack surface reduction of up to 52%.
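The comment does not spell out how the reduction was calculated. As a rough check, the sketch below assumes attack surface is measured simply as the number of CA owners, and that the reduction means going from the 92 CAs trusted by at least one program to the 41 reported as trusted by all four. Both assumptions are mine, and a different baseline would give a somewhat different percentage.

```python
total_cas = 92        # "Total CAs" reported for at least one root store
in_all_stores = 41    # "Total CAs in all root stores" reported above

# Assumption: the reduction is the share of CA owners removed from the trusted set
removed = total_cas - in_all_stores
reduction_pct = 100 * removed / total_cas

print(f"CAs removed: {removed}")              # CAs removed: 51
print(f"Reduction: {reduction_pct:.1f}%")     # Reduction: 55.4%
```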

@rmhrisk
Author

rmhrisk commented Mar 19, 2024

With that said, I would argue that the lack of eventual inclusion in all root programs is merely a signal, not an absolute indicator, that a CA isn't providing enough value to the web to justify the exposure it represents. A much better indicator would be the ultimate issuance volume over a fixed period of time. For example, if you meet all the requirements and successfully pass audits for 5 years, yet fail to achieve any material issuance volume, should you still be trusted?
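To make the issuance-volume idea concrete, here is a sketch that flags CAs whose issuance over a fixed window falls below a threshold. Everything in it is hypothetical: the helper name `low_volume_cas`, the threshold, the window, and the sample records are placeholders, and real issuance data would have to come from somewhere such as Certificate Transparency logs.

```python
from datetime import date


def low_volume_cas(owners, issuance, start, end, min_certs):
    """Return the owners that issued fewer than `min_certs` certificates in [start, end]."""
    counts = dict.fromkeys(owners, 0)  # owners with no issuance at all still count as 0
    for owner, issued_on in issuance:
        if owner in counts and start <= issued_on <= end:
            counts[owner] += 1
    return sorted(owner for owner, n in counts.items() if n < min_certs)


# Made-up owners and (owner, issuance date) records
owners = ["Example CA One", "Example CA Two", "Example CA Three"]
issuance = [
    ("Example CA One", date(2021, 5, 1)),
    ("Example CA One", date(2023, 2, 10)),
    ("Example CA Two", date(2015, 1, 1)),  # outside the window
]
flagged = low_volume_cas(owners, issuance, date(2019, 1, 1), date(2023, 12, 31), min_certs=2)
print(flagged)  # ['Example CA Three', 'Example CA Two']
```

What counts as "material" volume is a policy question; the threshold here only shows where such a number would plug in.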
