What Companies Can Learn from OpenAI, LinkedIn, X, and Reddit
As AI technology continues to evolve, businesses must increasingly navigate the complexities of how platforms handle user data for model training and product development. Companies leveraging these technologies—or considering integration—should be aware of how major platforms like OpenAI (ChatGPT), LinkedIn, X (formerly Twitter), and Reddit approach the collection, use, and sharing of user data. At Omnian Legal, we advise companies on the importance of understanding these policies, especially as they relate to compliance and mitigating legal risks.
In this article, we’ll explore how these platforms disclose three critical aspects of their AI data practices: the types of data used, the purposes for which that data is processed, and the transparency around third-party involvement.
We examine the language used in each platform's AI privacy settings (or, in Reddit's case, its Privacy Policy disclosure), replicated below. Consent, notice, and timing are separate discussions and considerations that may vary based on the circumstances of the processing and the entities and jurisdictions involved.
Disclosure of Data Types Used
One of the first things businesses should assess is how platforms disclose the types of data they collect for AI training. Knowing whether these platforms are sufficiently transparent about data collection is essential for understanding the privacy and compliance implications of using these tools in your own operations.
OpenAI (ChatGPT) takes a broad approach by referring to "content" as the data collected for model improvement. While this covers user input, it does not specify whether additional data, such as metadata or interaction history, is also used. This can create a compliance gap if your company needs to identify precisely which categories of data are processed.
LinkedIn, in contrast, goes slightly further by disclosing the collection of both "personal data" and "content." However, the company doesn’t break down these categories into more granular data types, leaving some ambiguity. If your business uses LinkedIn's generative AI tools, understanding the full extent of "personal data" will be crucial for compliance with the GDPR, the CCPA, or other applicable privacy laws.
X (Twitter) offers more detailed insight, specifying that "posts," "interactions," "inputs," and "results" are used for training purposes. This level of specificity is helpful for businesses assessing the potential privacy risks of using AI tools on the platform, as it clearly identifies the types of data involved.
Reddit also provides a relatively specific disclosure by naming user-generated content such as "posts," "comments," and "chat messages" as the primary data used. Additionally, Reddit includes metadata like the "username" and "time of submission." Companies integrating with Reddit or analyzing its data should be aware that the platform is quite open about the public nature of this information.
Disclosure of Processing Purposes
The second critical area to examine is how platforms disclose the purposes for which they process user data. For businesses concerned with regulatory compliance and user trust, knowing how data is used—and whether it aligns with your own privacy policies—is key.
OpenAI (ChatGPT) provides limited transparency here, stating only that data is used for "model training." This general language leaves open the question of whether data might also be used for other purposes, such as improving products or services. Companies relying on OpenAI should be mindful that broader or undefined uses of data could create risks when measured against their own internal privacy practices.
LinkedIn is somewhat more specific by stating that user data is used for "training content creation AI models." While this provides more detail than OpenAI, LinkedIn does not elaborate on any additional purposes for processing, such as personalization, which could be important for companies to consider in their data protection impact assessments (DPIAs).
X (Twitter) adds more clarity by specifying that data is used for "training and fine-tuning" models. This two-step process implies distinct phases in data processing, which can help businesses better understand the life cycle of their data on the platform. However, X still does not disclose whether there are additional purposes beyond these technical ones.
Reddit provides the least explicit disclosure of processing purposes. While Reddit’s content is public by default, the platform does not directly state that user data is being used for AI model training or other purposes. For businesses, this ambiguity means that further diligence is required when assessing how Reddit content may be used by third parties, including for AI applications.
Disclosure of Third Parties and Specific Features
The third key consideration for businesses is how platforms disclose the involvement of third parties and specific AI features that rely on user data. Understanding this aspect can help companies determine whether data sharing practices meet their own standards and regulatory requirements.
OpenAI (ChatGPT) does not disclose specific third-party partners or the particular AI features that make use of the collected data. This lack of detail may be a concern for businesses seeking clarity on whether external entities could access their proprietary or customer data.
LinkedIn similarly does not name any third-party partners or external features. The platform’s AI privacy settings focus on internal data processing without specifying if and how third parties are involved, leaving companies to interpret these gaps in relation to their own third-party management policies.
X (Twitter) improves on this by naming a specific AI feature, "Grok," as the product that user data is used to train and fine-tune. While X does not mention third-party partners, naming a distinct AI tool allows businesses to more easily evaluate whether its use aligns with their own privacy and security standards.
Reddit is the most transparent in this category, specifically stating that user content may be used by AI chatbots such as OpenAI’s ChatGPT. This direct disclosure allows businesses to understand that their publicly available content could be incorporated into third-party AI systems. However, Reddit does not elaborate on other potential partners, so businesses should consider the full implications of this disclosure.
Key Takeaways for Businesses
As companies increasingly incorporate AI tools into their operations, it’s critical to understand how platforms handle user data for AI training and development. The platforms vary in how transparently they disclose data types, processing purposes, and third-party involvement:
X (Twitter) and Reddit stand out for providing more specific details on the types of data they use and naming distinct AI features or third-party applications. These disclosures can aid businesses in ensuring compliance and managing risk.
LinkedIn and OpenAI (ChatGPT) remain more general in their descriptions, requiring companies to conduct additional due diligence if using their platforms for business or customer interactions.
For businesses looking to mitigate legal risks and remain compliant with privacy regulations like the GDPR or CCPA, understanding these disclosures is essential. At Omnian Legal, we can guide your company through the complexities of AI data policies, ensuring that your business not only complies with existing privacy laws but is also prepared for future regulatory changes as AI technology evolves.
OpenAI (ChatGPT) - Privacy Setting
X (Formerly Twitter) / Grok, xAI - AI Setting
LinkedIn - AI Setting
Reddit - Privacy Policy
Disclosures
The content provided in this article is intended for informational purposes only and should not be construed as legal advice or a substitute for consulting with a licensed attorney. While we strive to provide accurate and current information, laws and regulations are subject to change, and there is no guarantee that the information contained herein is up to date or applicable to your specific situation. We recommend seeking professional legal counsel for any legal matters. This article does not create an attorney-client relationship between the reader and the law firm. For personalized advice, please contact our office directly: info@omnianlegal.com