Digital Guinea Pigs: Take Back Control of Your LinkedIn Data
If you’ve found yourself snacking on lettuce and carrots lately, the craving may have an unexpected origin beyond a desire to be healthier. LinkedIn has started turning all of its users into unsuspecting digital guinea pigs by training its new generative AI models on their data. Granted, the data will help LinkedIn users produce better messages and posts and increase engagement, all valuable aspects of being on the site. However, LinkedIn is training on everyone’s data by default, with an automatic opt-in, unless you shut it off. Don’t worry; I’ll tell you how to shut it off and regain your digital power.
First, you might be asking yourself, “Why would the people at LinkedIn want to turn me into a guinea pig? What gives?” That’s an excellent question. No, they didn’t watch a cartoon and become inspired to work a spell on you; rather, the explanation is simpler than that, and it’s based on how generative AI needs to be fed. Quick side note: when coming up with the examples for this piece, I had considered calling LinkedIn users digital cabbages since we are technically being cultivated and harvested, but I figured furry pets are a bit more relatable. Also, did you know that it’s illegal to own only one guinea pig in Switzerland? Google it—it’s true.
Back to why we were almost classified as cabbages. All AI, but generative AI in particular, requires massive amounts of training data to produce the results we have come to expect when conversing with these systems. For the AI to interact with us in a more natural manner, it needs examples of how we interact with one another across a wide variety of tasks. If you trained a system on only one example of a conversation or task, there is no way it could reflect all the tasks or conversations we would want it to handle later on. But with millions or billions of examples, the AI starts to capture the subtle nuances we are looking for when working with any system on a particular task.
AI needs more than just generalized data
Now, you might wonder: haven’t they already trained on plenty of data? Why do they need my LinkedIn data? What good is it? This brings up the difference between generalized AIs and domain-specific AIs intended for more specialized use. Take two scenarios: hospital users and LinkedIn users. These are two very different uses of artificial intelligence in terms of data, workflows, and so on.
In a hospital, the users are primarily doctors, nurses, and patients, who would likely be discussing topics related to exams, test results, diagnoses, referrals, insurance, and the like. All of these conversations and interactions have generated a significant database that can be used to train AI models to serve those “domain users” more efficiently. From helping to fill out a form, to asking better questions, to understanding a result, all of these tasks will (ideally) improve over the long run. Now, while this data is very useful in a hospital context, it would not offer the same value to, say, a LinkedIn user trying to write a better post, generate higher engagement, or send better InMails. The hospital data doesn’t apply, and if used for LinkedIn, it would produce some truly off-key hallucinations. Imagine sending out a sales call request with the subject line, “We apologize for the inconvenience: Please confirm a call time to discuss.” It sounds more like an insurance letter rejecting coverage.
It’s precisely for this reason, among many others, that LinkedIn decided to use user data to train a generative AI model on conversations and interactions in a professional social networking setting. With this kind of data, the model can help create content far better suited to users trying to achieve their career goals via LinkedIn. Also, I’m sure we’ll begin seeing the next generation of LinkedIn influencers later this year, powered by the platform’s new generative AI, so it should be interesting.
Transparency and data training: a necessary balance
The challenge with LinkedIn’s new generative AI model isn’t the model itself, the use cases, or how it will improve productivity. As an AI CEO myself and a professor of generative AI, I’m all for new tools, applications, and pushing things forward. The primary issue is that LinkedIn automatically opted all of its users into the training data without properly informing them of what was going on, putting the burden on them to turn data training off. That makes us automatic cabbages in one sense, because we’re being harvested without our consent, and guinea pigs at the same time, because LinkedIn will test the effectiveness of these models on us through our interactions with generated content no matter what.
I think a better approach would have been to offer the generative AI service to users with the option of opting in and selecting which LinkedIn data they wanted the model trained on. Perhaps even a little explainer video covering the benefits, privacy, data rights, and so on relevant to the model LinkedIn is developing, giving users the appropriate level of education to consent (or not) to their data being used for training.
Take back control of your data
Opting out is quick and easy. Here are the steps to take back control:
Log in to LinkedIn and click on your profile photo (you’ll find it in the top right corner).
Select “Settings & Privacy” from the dropdown menu.
In the “Data privacy” tab, scroll down to the section labeled “How LinkedIn uses your data.”
Wave your wand at your screen and say, “Not so fast.”
Find the option “Data for generative AI improvement.” Click and turn it off.
Congrats! You are no longer a cabbage.
While you’re at it, you might also want to check the “Social, economic, and workplace research” section (previously “Data research”). Turning that off prevents LinkedIn from using your information for social, economic, or labor market research. A solid move if you want to stay in control of how and why your data gets used.
Ultimately, while we do have the chance to stop ourselves from becoming involuntary cabbages, we most likely won’t be able to stop ourselves from occasionally being digital guinea pigs. But at least now you’ll know what you’re in for the next time you sign on to LinkedIn, and hopefully it will keep you more aware of other platforms’ plans.
Originally published in Spanish for Fast Company Mexico: