GLOSSARY: Making Sense of the NYT Lawsuit


TAUSIF ALAM & AMIT RAJA NAIK

Tuesday, Jan 2, 2024

___________________________________________________________

Last month, The New York Times filed a lawsuit against OpenAI and Microsoft over the use of copyrighted work by AI models. The case echoes concerns raised by Union Minister Rajeev Chandrasekhar, who advocates fair revenue sharing by tech companies with digital publishers. The lawsuit also underscores a broader debate about the monetisation of internet-scraped content versus the rights of copyright holders.

Interestingly, OpenAI recently recorded $1.6 billion in annualised revenue from its ChatGPT product, up from $1.3 billion in mid-October last year. It seems the company is either not too concerned about the NYT or believes there is no copyright case to begin with.

https://analyticsindiamag.com/wp-content/uploads/2023/12/using-times-data-to-train-llms-1536x864.jpg.webp

In the wake of this lawsuit, computer scientist Subbarao Kambhampati briefly explained “approximate retrieval” in language models, covering why they cannot guarantee exact text retrieval, how they differ from databases and information retrieval (IR) systems, and what this implies for copyright and search functionality.

“When they try to argue the NYT lawsuit, they will no doubt push on the fact that LLMs don't do exact retrieval and so there is no copyright infringement,” said Kambhampati, explaining the technical nuances.

In other words, LLM developers might argue in court that LLMs don’t perform exact retrieval and thus do not violate copyright. At the same time, with extensive training and a large context window, LLMs can sometimes closely mimic or ‘memorise’ passages, producing outputs that resemble existing content such as NYT articles, Kambhampati explained.

This ‘memorisation’ (aka ‘plagiarism’) is a key point in the NYT lawsuit, as it raises concerns about copyright infringement.

Read the full story here.

Indian Startup's Level 5 Breakthrough

Indian autonomous driving company Swaayatt Robots recently announced that it had achieved the world’s first Level 5 autonomous driving capability. In the demonstration, its autonomous vehicle (a Mahindra Bolero) learned to negotiate complex traffic dynamics at a toll plaza and successfully crossed the toll gates.

Read more here.

Churning out (Non-dropout) Entrepreneurs

Many universities now offer a program that allows students to take a year off from their studies to focus on their entrepreneurial ventures. Prominent institutions embracing the initiative include BITS Pilani, IIT Madras, IIT Hyderabad, DIT University, IIT Bombay, and IIT Kharagpur, among others.

Given the current dynamics of the Indian startup ecosystem, is this initiative relevant? Read to find out more.

GitHub's AI-Powered Security

https://analyticsindiamag.com/wp-content/uploads/2024/01/p-1536x864.jpg.webp

In an exclusive interview with AIM, Jacob DePriest, VP and deputy chief security officer at GitHub, discussed the company's use of AI, particularly LLMs, to enhance software development and security. At GitHub Universe 2023, the company announced this integration, which is meant to enable early detection and resolution of security vulnerabilities.

Read the full interview here.