16 Comments

It depends, I suppose, on what you mean by training. Maybe the lawyers define what they do as using data, not training.

Thank you. 🙏

Another rule: PR = BS!

"Facebook cares about the mental health of teens, pinky promise."

A vast number of fortune 5000 companies use OneDrive for secure storage. Microsoft is not going to enter into parallel lawsuits with that threat.

You can use 365 to do simple textual analysis with models MSFT has already trained for simple things like sentiment analysis.

I’m not sure where or how using a textual classification service devolves into widespread violation of intellectual property law. Perhaps people worrying about it should read their contracts for use? That usually clears up any question.

I had to answer similar questions on projects 5-6 years ago. The master services agreements in force with companies cover this.

So I too freaked out over this one and ended up doing my homework and seeing that it looks legit. It is optional, it sends data only when asked, and it follows sound data management practices of minimizing the data sent and the retention time.

The one remaining issue is that many users reported being opted in by default, which could be partly explained by company policy or by users clicking away the nag box without reading.

Still.

When both the CEO and CTO make shockingly dumb statements in public about IP, and they partner with OpenAI of all people, and own parts of LinkedIn, which pulled a silent opt-in stunt just now, and roll out the piracy-based celeb deepfake engine DALL-E 3 as safe for minors with their OS, there is negative trust to build on.

m$crosoft is not to be trusted at all - see this: https://sneak.berlin/20200307/the-case-against-microsoft-and-github/ .

Windows 11 is one of the most horrible OSes they have released, if not the most horrible.

Microsoft is constantly prompting me to back up my documents on their cloud server (OneDrive). Is their intention here to make my data more readily available for training their LLMs?

I doubt it; they've been bothering me about that since 2018.

I just recently yanked everything I have off of Google Drive, and I never trusted OneDrive due to its various eccentricities.

And because I'm a cheap bastard I'm using OpenOffice now for writing and spreadsheets.

It's inconvenient as hell, but I don't trust Google or Microsoft any more. Not just with regard to scraping all my writing into an LLM, but because I don't trust them not to abruptly lock my documents in place and charge me a fee to use them.

Google Docs and Drive have been great, but as they say, if the product is free you are the product.

Reading the original post, it looks to me that people are confusing training and runtime.

But they might (and in fact do, if I understand the training they offered) use them to produce the boundary API systems which are not LLMs (because not ANNs) - that possibility is consistent with that statement. I have tried to raise this question officially and gotten nowhere, but I will also keep pressing.

Under GDPR this is punishable by fines. It remains to be seen whether Trump will stand with us Europeans to protect users or on the side of companies because "they are American".

Thanks for asking for clarity. So far, very little is explained, nor is permission requested from consumers. What about teach-ins and town halls all across the country explaining the objectives of this technology? Why passive partners?

Why would anyone waste cycles training AI with a bunch of random documents?

The cesspool of the internet is bad enough. But data from a bunch of Word documents, not curated or verified in any way?

That option has been around for a long time. I don't think it has anything to do with AI, unless they've recently and secretly extended the definition to include it. But given that it could cost them up to 4% of their global turnover in the world's most obvious GDPR violation, it's unlikely.

Keep them on their toes, Marcus! They can say what they will, but it's a fine line that we're "trusting" them.
