yup_sto's comments

yup_sto · on Dec 5, 2024

I know this is an oversimplification, but if the main issue is the single point of failure (centralized trust), wouldn’t a potential solution be to layer independent verification mechanisms on top of the current system?

For example, a secondary DNS-based verification layer where a site’s public key is published as a DNS record (though that would likely need DNSSEC to be effective). It seems like it could complement the existing CA structure without replacing it entirely.

woodruffw · on Dec 5, 2024

The problem with the CA system is arguably not that it’s a single point of failure: it’s that it’s N points of failure, all of which were originally unaccountable to user agents. The CAs are themselves decentralized entities, and each was largely unaccountable to the larger web PKI until CT came along.

I think a DNS layer would probably make that problem worse, not better, which is one of the enduring criticisms of DNSSEC and DANE.

yup_sto · on Dec 5, 2024

You're likely right here, combining two trust systems does add complexity without solving the core problem. While browsers requiring CT was a great step forward, it's surprisingly under-utilized by orgs. I wonder if this is due to limited tooling for log interaction, or just general lack of awareness?

tptacek · on Dec 5, 2024

It's pretty widely used; orgs that have security teams with more than a handful of people tend to be monitoring CT for their own names. Could it be more usable? Yeah, there's probably more than one startup in there. Have at it! You don't need anybody's permission to build something like that.

j16sdiz · on Dec 5, 2024

> wouldn’t a potential solution be to layer independent verification mechanisms on top of the current system?

CT[1] allows some kind of external audit, but this is _mostly_ after the fact.

DNSSEC have much worse trust issue.

[1] https://certificate.transparency.dev/howctworks/

yup_sto · on Dec 4, 2024

Baader-Meinhof strikes again - checked this out in the morning and just caught your Citibike tweet. You're on a roll today!

yup_sto · on Sept 8, 2024

Have you considered adding a monitoring feature where a user can enter a domain to be monitored and then be notified if a "similar" domain comes across the ingestion pipeline.

This would be useful for early detection of potential impersonations/typo-squatting domains typically used for phishing/scams.

Something as simple as a configurable levenshtein distance/jaro-winkler similarity check across CN and SAN of all new certs maybe? (user can configure with threshold to control how "noisy" they want their feed).

1970-01-01 · on Sept 8, 2024

Try https://dnstwister.report/

Eikon · on Sept 8, 2024

For sure, it was on my todo list :)

yup_sto · on Sept 8, 2024

Awesome, I will keep my eye on this for sure, I've spent the past few months tinkering with ingesting CT logs for bug bounty automation.

Curious if you're running your own CertStream server, or just continuously polling known CT logs with your own implementation.

yup_sto · on Sept 8, 2024

I also noticed you are ingesting/storing flowers-to-the-world.com certs, not sure what stage of optimization you are at but blacklisting/ignoring these certs in my ingestion pipeline helped with avoiding storing unnecessary data

I'm not sure but I believe that's used by Google internally for testing purposes.

For example if you search google, it returns 120k+ results, and these useless results are at the front.

Eikon · on Sept 8, 2024

> I also noticed you are ingesting/storing flowers-to-the-world.com certs, not sure what stage of optimization you are at but blacklisting/ignoring these certs in my ingestion pipeline helped with avoiding storing unnecessary data

The goal is to have something exhaustive so I'll keep them. But you are right that I probably should not put them at front. Not sure how important it is though as these results shouldn't match many queries.

yup_sto · on Sept 8, 2024

Exhaustive/Robust is the way for sure.

Minimizing storage was a priority for me since it's just a small side-project/automation.

I've looked for information on what the hell the `flowers-to-the-world` entries are that pop and have found nothing, curious what's going on there.

Eikon · on Sept 8, 2024

It's actually a google thing!

I found that back then when I wondered the same: https://medium.com/@hadfieldp/hey-ryan-c0fee84b5c39

yup_sto · on Sept 8, 2024

Ahhhh, that tracks, cheers mate.

Eikon · on Sept 8, 2024

I am not using certstream as we'd lose data on the first network error. The way it's designed is more "Rsync for ct logs" than something like a stream => storage system.

Btw, you can get our feed like that:

    curl -N 'https://api.merklemap.com/live-domains?no_throttle=true'

yup_sto · on Sept 8, 2024

I'd imagine it's a combination of

- CT log monitoring (https://github.com/CaliDog/CertStream-Server)

- Mass-Scanning across ipv4 on 80/443 at the least?

- Brute-forcing subdomains on wildcards with large DNS wordlist (like something from assetnote: https://wordlists-cdn.assetnote.io/data/manual/best-dns-word...)

- Scraping/extracting subdomains/domains from JS

But I've never attempted to enumerate subdomains on this scale before, so I could be missing something obvious