The first wave of major generative AI tools largely were trained on “publicly available” data—basically, anything and everything that could be scraped from the Internet. Now, sources of training data ...