Relevant to the post I sent earlier today, https://www.howtogeek.com/is-microsoft-using-your-word-documents-to-train-ai/ says that an unnamed spokesperson at Microsoft claims that “Microsoft does not use customer data from Microsoft 365 consumer and commercial applications to train large language models. Additionally, the Connected Services setting has no connection to how Microsoft trains large language models,” despite the fact that Connected Services offers “experiences that analyze your content” by “analyzing a vast amount of data.”
This is certainly logically possible, though not well explained elsewhere in Microsoft’s documentation, which says (vaguely) that “connected experiences [may] use machine learning services to … Analyze Data…,” among other things.
Absent a clear statement in the Terms of Service, I personally remain uncertain. What systems are doing this analysis? What data are they trained on? Is any of this subject to change without notice?
I would appreciate further, on-the-record clarification from a named member of the company, resolving the appearance of possible contradiction.
I have reached out, and may update further.
So I too freaked out over this one, ended up doing my homework, and found that it looks legit. It is optional, it sends data only when asked, and it follows sound data-management practices, minimizing both the data sent and its retention time.
The one remaining issue is that many users reported being opted in by default, which could be partly explained by company policy, or by users clicking away the nag box without reading it.
Still.
When both the CEO and CTO make shockingly dumb statements in public about IP; when they partner with OpenAI of all people; when they own part of LinkedIn, which just pulled a silent opt-in stunt; and when they roll out the piracy-based, celebrity-deepfake engine DALL-E 3 with their OS as safe for minors, there is negative trust to build on.
It depends, I suppose, on what you mean by “training.” Maybe the lawyers define what they do as “using data,” not “training.”