PD12M
Source.Plus,
Dec 06, 2024
From Alan Levine comes this link: "At 12.4 million image-caption pairs, PD12M is the largest public domain image-text dataset to date, with sufficient size to train foundation models while minimizing copyright concerns. Through the Source.Plus platform, we also introduce novel, community-driven dataset governance mechanisms that reduce harm and support reproducibility over time." Search could be better, but the images are great.