AI Models Use Canadian Journalism Without Proper Attribution, Studies Reveal

A recent investigation by the Quebec-based Centre for Media, Technology and Democracy has found that artificial intelligence companies are using Canadian journalism to train their chatbots, yet these systems frequently omit source attribution. The two studies warn that this practice threatens the sustainability and integrity of the journalism sector.

Extensive Testing Reveals Widespread Lack of Attribution

In the initial study, researchers from McGill University conducted an extensive analysis involving four major AI models: ChatGPT, Gemini, Claude, and Grok. They tested these models on 2,267 authentic Canadian news stories, available in both English and French, resulting in a total of 18,134 queries. The primary objective was to assess how much information the models had absorbed from their training data and whether they provided appropriate credit to the original sources.

The findings were alarming. According to the policy brief titled "AI News Audit: AI, Canadian Journalism, and Paths for Policy Action," the models failed to attribute sources in 82 percent of cases when queried about Canadian news events derived from their training data. This lack of transparency undermines the foundational principles of journalistic credibility and accountability.

AI Models Often Serve as Substitutes for Original Reporting

The second phase of the research enabled web search for the same AI models. Researchers posed questions about 140 specific recent articles from seven prominent Canadian news outlets to evaluate whether the models could produce viable substitutes for current journalism and whether they would properly credit the sources.

The results indicated that the models covered enough detail from the original reporting to act as substitutes in 54 to 81 percent of instances. However, attribution rates remained critically low. While the models linked to Canadian news sites in 29 to 69 percent of responses, they explicitly named the originating outlet in the response text in only 1 to 16 percent of cases. This discrepancy highlights a systemic issue in which AI systems consume content without adequately acknowledging its creators.

Improvement Possible with Direct Prompting

The studies noted that when researchers specifically named the outlet and requested citations, attribution rates improved dramatically, reaching between 74 and 97 percent. This suggests that the models are capable of providing proper credit but often fail to do so without explicit prompting, which points to either a design flaw or a deliberate design choice.

Impact on News Visibility and Industry Equity

The research also shed light on how AI models influence public access to news. As AI becomes an increasingly important channel for news consumption, the lack of source attribution means that readers are rarely directed back to the original journalism. Links provide a pathway back to the newsroom, the researchers emphasized, but readers who only see the AI response remain unaware of where the reporting originated.

Furthermore, the studies revealed a bias in AI visibility towards large, free, and nationally prominent organizations such as CBC, CTV, and the Globe and Mail. In contrast, paywalled and regional outlets, which often conduct substantial original reporting, received disproportionately low representation. For instance, the Toronto Star was named as a source only 11 times across more than 18,000 responses, while the Montreal Gazette received a single mention.

Compounded Challenges for French-Language Journalism

French-language journalism faces additional hurdles. Although French stories were absorbed into training data at rates comparable to English ones, French outlets appeared in citations merely 10 percent of the time. Radio-Canada and La Presse dominated the limited number of French citations, whereas widely read publications like the Journal de Montreal were nearly invisible to AI systems. "Their content is ingested. Their contribution is erased," the researchers noted, underscoring the loss of cultural and linguistic contributions in the AI landscape.

Broader Implications for Policy and Media Sustainability

These findings raise urgent questions about the ethical use of journalistic content in AI training and the need for robust policy interventions. The Centre for Media, Technology and Democracy advocates for measures that ensure fair compensation and recognition for news organizations, particularly as AI models continue to evolve and integrate into daily information consumption. Without such actions, the future of journalism, especially for smaller and regional outlets, remains precarious in the face of advancing technology.