
Discussions on the Use of Data in Artificial Intelligence Models

In-depth discussions on topics such as data usage in artificial intelligence models, ethics, privacy, and reliability. This content addresses the impact of data sources and their role in artificial intelligence applications.

A recent investigation by Proof News has sparked significant debate by reporting that several major technology companies, including Apple, Nvidia, Salesforce, and Anthropic, trained their artificial intelligence models on a dataset of YouTube subtitles collected without the creators’ permission. The report has brought ethical and legal questions about AI training data to the forefront.

The dataset, called “YouTube Subtitles,” was prepared by the nonprofit EleutherAI and contains subtitles from more than 170,000 YouTube videos, including videos from popular content creators like MrBeast. Its use has strengthened claims that technology companies are profiting from creators’ work without their consent. Apple’s OpenELM models are reported to be among those trained on this data.

Apple told 9to5Mac that the OpenELM model is not used in Apple Intelligence or in any of its other artificial intelligence or machine learning features. In other words, according to Apple, the YouTube Subtitles dataset was not used to build the features of Apple Intelligence.

OpenELM is a family of open-source models released earlier this year. Apple describes OpenELM as a “state-of-the-art open language model,” and states that it was released with the aim of “empowering and supporting the open research community, paving the way for future open research efforts.” The model is accessible from various sources, including Apple’s Machine Learning Research website.

However, Apple also told 9to5Mac that it has no plans to develop future versions of the OpenELM model. The company had previously stated that it does not use “users’ private personal data or user interactions” when training its Apple Intelligence models.

On the other hand, Apple does train on licensed data and on data that its web crawler collects from websites, unless a site’s publisher explicitly opts out. Apple describes the process as follows: “We train our foundational models on licensed data, as well as public data collected by our web crawler, AppleBot, in addition to selected data to enhance certain features. Web publishers have the option to disable the use of their web content for Apple Intelligence training through data usage controls.”
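
As a rough illustration of what such a data usage control can look like, the sketch below checks a robots.txt file for an opt-out rule. It rests on assumptions that do not come from this article: the user-agent token Applebot-Extended is taken from Apple's public Applebot documentation, while the robots.txt content, the example.com URL, and the helper name is_allowed are hypothetical. The check uses Python's standard robots.txt parser, which may not match how Apple's own crawler interprets the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site. It opts out of AI training
# use (assumed token: Applebot-Extended) while leaving every other agent,
# including the regular Applebot crawler, allowed.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules let user_agent use the given URL."""
    parser = RobotFileParser()
    # Parse the rules from a string instead of fetching them over the network.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"
    for agent in ("Applebot-Extended", "Applebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

With rules like these, a publisher's pages would stay reachable for ordinary crawling while signalling that the content should not be used for model training.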
