Relevant to the post I sent earlier today, https://www.howtogeek.com/is-microsoft-using-your-word-documents-to-train-ai/ says that an unnamed spokesperson at Microsoft claims that “Microsoft does not use customer data from Microsoft 365 consumer and commercial applications to train large language models. Additionally, the Connected Services setting has no connection to how Microsoft trains large language models,” despite the fact that Connected Services offers “experiences that analyze your content” by “analyzing a vast amount of data.”
This is certainly logically possible, though not well explained elsewhere in Microsoft’s documentation, which says (vaguely) that “connected experiences [may] use machine learning services to … Analyze Data…,” among other things.
Absent a clear statement in the Terms of Service, I personally remain uncertain. What systems are doing this analysis? What data are they trained on? Is any of this subject to change without notice?
I would appreciate further, on-the-record clarification from a named member of the company, resolving the appearance of possible contradiction.
I have reached out, and may update further.
So I too freaked out over this one, ended up doing my homework, and found that it looks legit. It is optional, it sends data only when asked, and it follows sound data-management practices, minimizing both the data sent and its retention time.
The one remaining issue is that many users reported being opted in by default, which could be partly explained by company policy, or by users clicking away the nag box without reading it.
Still.
When both the CEO and CTO make shockingly dumb statements in public about IP; when they partner with OpenAI of all people; when they own part of LinkedIn, which just pulled a silent opt-in stunt; and when they roll out the piracy-based, celebrity-deepfake engine DALL-E 3 with their OS as safe for minors, there is negative trust to build on.
It depends, I suppose, on what you mean by “training.” Maybe the lawyers define what they do as “using data,” not “training.”