Common Crawl Foundation

Team
non-profit
Verified
https://commoncrawl.org
commoncrawl

AI & ML interests

Crawled data and metadata

Recent Activity

malteos  updated a bucket 1 day ago
commoncrawl/commoncrawl
malteos  published a bucket 1 day ago
commoncrawl/commoncrawl
malteos  updated a bucket 1 day ago
commoncrawl/test-bucket

Members: Thom Vaughan, Pedro Ortiz Suarez, Paul Lazar, Greg Lindahl, Ford H, Jen English, Sebastian Nagel, Laurie Burchell, Hande Celikkanat, malteos, Thijs Dalhuijsen, Luca, Catherine Arnett

commoncrawl's datasets (7)

commoncrawl/citations

Viewer • Updated 23 days ago • 9.18k • 68 • 2

commoncrawl/statistics

Viewer • Updated Mar 19 • 621k • 269 • 26

commoncrawl/CommonLID

Viewer • Updated Feb 10 • 373k • 155 • 51

commoncrawl/gneissweb-annotation-host-testing-v1

Viewer • Updated Dec 11, 2025 • 617M • 75

commoncrawl/gneissweb-annotation-url-testing-v1

Viewer • Updated Dec 10, 2025 • 11.5B • 109

commoncrawl/host-index-testing-v2

Preview • Updated Nov 10, 2025 • 1.43k

commoncrawl/eot2024_hostlevel_logs

Viewer • Updated Oct 9, 2024 • 271k • 6 • 1