
Thursday, September 7, 2023

Oil Money is All You Need


Those who have been working with LLMs now know that attention is not all you need. As parameter counts and token counts grow, so do the need for fine-tuning and the reliance on GPUs. And to do all of this at scale, you would need lots and lots of money – approximately $100 billion.


Sam Altman previously suggested that OpenAI may try to raise as much as $100 billion in the coming years to achieve its aim of developing AGI, alongside improving its model capabilities. But, as of now, there hasn’t been any update, and the total funding stands at $11.3 billion.


So far, OpenAI has spent close to $4.6 million to train GPT-3, a 175-billion-parameter model. Now, with GPT-4 rumoured to have about 1.76 trillion parameters, the cost of building the model comes to roughly $46.3 million, assuming a linear increase in cost per parameter. Again, this is a simplified estimate, and the actual cost may vary based on various aspects, including research and development, talent, hardware improvements, and more.
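The extrapolation above is simple enough to check in a few lines. This sketch reproduces the back-of-the-envelope math, taking the $4.6 million GPT-3 training figure and the rumoured (unconfirmed) 1.76 trillion GPT-4 parameter count at face value and assuming cost scales linearly with parameter count:

```python
# Linear extrapolation of training cost by parameter count.
# Inputs are the article's figures; the GPT-4 count is a rumour.
gpt3_cost_usd = 4.6e6    # reported GPT-3 training cost
gpt3_params = 175e9      # GPT-3 parameter count
gpt4_params = 1.76e12    # rumoured GPT-4 parameter count

cost_per_param = gpt3_cost_usd / gpt3_params
gpt4_cost_est = cost_per_param * gpt4_params
print(f"GPT-4 cost estimate: ${gpt4_cost_est / 1e6:.1f} million")
# prints: GPT-4 cost estimate: $46.3 million
```

Note that linear scaling in parameters is the weakest assumption here: training cost also depends on token count and hardware efficiency, both of which changed between the two models.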


Now, OpenAI is trying really hard to sell its products to developers (the APIs), enterprises (ChatGPT Enterprise) and educational institutions (ChatGPT for teachers) in the hope of earning billions in annual revenue.


Ergo, OpenAI DevDay 2023, announced just yesterday, is a much-needed step to get developers and enterprise customers to explore new tools and exchange new ideas.

Building GPT-5 or GPT-4.5 would cost money, and OpenAI is clearly struggling with that and seems to have run out of ideas too. Besides, it has most probably exhausted Microsoft's $10 billion investment, with no sign of new investment yet.


That explains why the company has been shying away from releasing GPT-4's multi-modal capabilities to the public or disclosing its parameter count, which the team seems to be hiding deliberately to avoid tarnishing its image or attracting unwanted attention.


But one institution that seems to be ahead of OpenAI, and has lots and lots of oil money backing it, is the Technology Innovation Institute (TII), part of the Abu Dhabi government's Advanced Technology Research Council, which oversees technology research in the emirate.


Recently, it introduced Falcon 180B. The new model now ranks number one on the Hugging Face leaderboard for open-access LLMs. Its 180 billion parameters were trained on 3.5 trillion tokens, using four times the compute resources of Meta's Llama 2.


Interestingly, Falcon 180B ranks just behind OpenAI's latest GPT-4 and is on par with the performance of Google's PaLM 2 – despite being half its size.


Should OpenAI and Google be worried? Read to find out. 



India’s LLM Shines on Hugging Face Leaderboard


GenZ 70B, an instruction fine-tuned model developed by Indian firm Accubits Technologies, is shining bright at the top spot on Hugging Face's leaderboard of instruction-tuned LLMs. It also ranks No. 6 among open LLMs across all categories. This is the first time we have seen such a development from India. The company, in collaboration with Bud Ecosystem, has open-sourced its fifth large language model, GenZ 70B.


Know more about this open-source model here



Should Ethicists Stay Away From arXiv? 


Emily M Bender, a professor and member of the Distributed Artificial Intelligence Research Institute (DAIR) – founded by Timnit Gebru – recently called arXiv "a cancer" that promotes the dissemination of "junk science" in a format indistinguishable from real publication.


Both ACL and DAIR believe that papers posted on arXiv need to be reviewed by them and other groups before being published on the website. However, computer scientists and academic researchers disagree. "Any policy that obstructs arXiv is just silly," said Meta AI chief Yann LeCun.


Read more about this latest development here



Catch ‘em Young 


Three years ago, CBSE partnered with IBM to introduce an AI curriculum for high school students in Classes XI & XII. This initiative is part of CBSE’s SEWA program and has been implemented in about 200 schools across 13 Indian states, including Karnataka, Tamil Nadu, Uttar Pradesh and others. 


Developed with Macquarie University and local partners like Learning Links Foundation and 1M1B, this program aims to equip students with AI knowledge and skills to build AI models for real-life use cases. 


Discover IBM's efforts to boost AI talent through education partnerships here.



TAUSIF ALAM & AMIT RAJA NAIK

Thursday, Sep 7, 2023

Stay Connected

info@analyticsindiamag.com

© 2023 Analytics India Magazine

Analytics India Magazine | 280, 2nd floor, 5th Main, 15 A cross, Sector 6, HSR layout Bengaluru, Karnataka 560102
