Sunday, February 18, 2024

The Most Historic AI Week of 2024


THE WEEKLY NEWSLETTER OF AIM.

Sunday, Feb 18, 2024

By Amit Naik

This week was surely a teaser for what 2024 holds. Just as Google seemed to have put its foot on the ‘AI-celerator’ by releasing Gemini 1.5, OpenAI stole its thunder by dropping its text-to-video generation model, Sora.


In a tit-for-tat, Google used Gemini 1.5 to critique a video created by Sora, flagging it as fake and pointing out significant inconsistencies.

ChatGPT Moment? Sora turns text prompts into breathtaking, hyper-realistic videos of a kind the world has not seen before. It can craft videos of up to 60 seconds with highly detailed scenes, complex camera motion and lifelike characters. Sora uses a transformer architecture that operates on visual patches instead of traditional text tokens.
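To make the idea of "visual patches" concrete, here is a minimal sketch of cutting a tiny video into flat spacetime patches, each of which could then be treated like a token. OpenAI has not released Sora's code, so the function name `patchify`, the patch sizes and the raw-pixel input are all assumptions made for illustration; a production model would most likely patchify a compressed latent representation rather than raw pixels.

```python
def patchify(video, pt, ph, pw):
    """Split a video into flat spacetime patches.

    `video` is a list of T frames, each an H x W grid of pixel values.
    `pt`, `ph`, `pw` are hypothetical patch sizes along time, height, width.
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    if T % pt or H % ph or W % pw:
        raise ValueError("video dimensions must be divisible by the patch size")

    patches = []
    for t0 in range(0, T, pt):
        for y0 in range(0, H, ph):
            for x0 in range(0, W, pw):
                # Flatten one pt x ph x pw block into a single vector ("token").
                patch = [video[t][y][x]
                         for t in range(t0, t0 + pt)
                         for y in range(y0, y0 + ph)
                         for x in range(x0, x0 + pw)]
                patches.append(patch)
    return patches


# A toy 2-frame, 4x4 video whose pixel value encodes its (t, y, x) position.
video = [[[t * 100 + y * 10 + x for x in range(4)] for y in range(4)]
         for t in range(2)]
patches = patchify(video, pt=2, ph=2, pw=2)
print(len(patches), "patches of length", len(patches[0]))
```

The resulting sequence of patches plays the role that a sequence of text tokens plays in a language model.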


“A good way to think about Sora is it’s basically the GPT-3 of video models. Stable Video Diffusion etc are like GPT2,” said Stability AI Founder Emad Mostaque. “The ChatGPT, GPT-4, Llama and Mistrals will come over the next few years.” 


Sora builds on past research in DALL·E and GPT models. It uses the recaptioning technique from DALL·E 3, which involves generating highly descriptive captions for the visual training data. 


It has multiple features, such as animating DALL·E images, extending generated videos, video-to-video editing and connecting videos. Beyond video generation, Sora can also simulate some aspects of people, animals and environments from the physical world.


Check out the mesmerising videos created by Sora here


Context Window Matters: Google wasn't to be outdone. Gemini 1.5 features a staggering context window of up to 1M tokens, surpassing not only GPT-4 Turbo's 128K but also Anthropic Claude 2.1's 200K. This means 1.5 Pro can process vast amounts of information in one go, including 1 hour of video, 11 hours of audio, codebases with over 30,000 lines of code, or over 700,000 words.


Gemini 1.5 is built on the Transformer and Mixture-of-Experts (MoE) architectures. While a traditional Transformer functions as one large neural network, MoE models are divided into smaller “expert” neural networks.
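As a rough illustration of the "smaller expert networks" idea, the sketch below routes an input to the top-scoring experts and blends their outputs. Google has not published Gemini 1.5's routing details, so this is generic top-k gating under stated assumptions: the names `moe_forward`, `experts` and `gate_weights` are hypothetical, and the experts here are toy functions rather than neural networks.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, experts, gate_weights, top_k=2):
    """Send input vector `x` to the `top_k` best-scoring experts only."""
    # Router: one score per expert, the dot product of its gate row with x.
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    # Pick the indices of the top_k highest-scoring experts.
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalise the selected scores into mixing weights.
    probs = softmax([scores[i] for i in top])
    # Weighted sum of the chosen experts' outputs; the others are never run.
    out = [0.0] * len(x)
    for p, i in zip(probs, top):
        y = experts[i](x)
        out = [o + p * yi for o, yi in zip(out, y)]
    return out, top


# Three toy "experts" and a hand-written gate, purely for illustration.
experts = [
    lambda v: [2 * a for a in v],
    lambda v: [a + 1 for a in v],
    lambda v: [-a for a in v],
]
gate_weights = [[1, 0], [0, 1], [-1, -1]]
output, chosen = moe_forward([3.0, 1.0], experts, gate_weights, top_k=2)
print("experts used:", chosen)
```

The point of the design is that only the selected experts run for a given input, so the model can hold many parameters while spending compute on a fraction of them.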


Don’t Forget Meta: With OpenAI and Google going all guns blazing, how could Meta stay out? It released a new AI model called Video Joint Embedding Predictive Architecture (V-JEPA).


V-JEPA improves machines’ understanding of the world by analysing interactions between objects in videos. The model aligns with the vision of Yann LeCun, Meta’s chief AI scientist, of creating machine intelligence that learns the way humans do.


Unlike Sora, V-JEPA is a non-generative model that learns by predicting missing or masked parts of a video in an abstract representation space. LeCun believes that the generation of mostly realistic-looking videos from prompts *does not* indicate that a system understands the physical world. 


Earlier in the week, Andrej Karpathy, the developer community's favourite, left OpenAI to follow his own path. “My immediate plan is to work on my personal projects and see what happens,” he said on X. He recently open-sourced ‘minbpe’, clean code for the (byte-level) Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
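For readers unfamiliar with byte-level BPE, the following is a minimal sketch of the general algorithm: start from raw UTF-8 bytes and repeatedly merge the most frequent adjacent pair into a new token id. It is not Karpathy's minbpe code; the function names `train_bpe`, `merge` and `decode` are chosen here for illustration.

```python
from collections import Counter


def get_pair_counts(ids):
    """Count how often each adjacent pair of token ids occurs."""
    return Counter(zip(ids, ids[1:]))


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges, starting from the 256 byte values."""
    ids = list(text.encode("utf-8"))
    merges = {}
    for step in range(num_merges):
        counts = get_pair_counts(ids)
        if not counts:
            break
        pair = counts.most_common(1)[0][0]  # most frequent adjacent pair
        new_id = 256 + step                 # new ids start after the byte range
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges


def decode(ids, merges):
    """Rebuild the original text by expanding merged ids back to bytes."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (a, b), new_id in merges.items():  # insertion order = training order
        vocab[new_id] = vocab[a] + vocab[b]
    return b"".join(vocab[i] for i in ids).decode("utf-8")


text = "aaabdaaabac"
ids, merges = train_bpe(text, num_merges=3)
print(len(text.encode("utf-8")), "bytes ->", len(ids), "tokens")
```

Each merge shortens the sequence, which is why a trained tokenizer represents common text in far fewer tokens than raw bytes.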


Keeping up with all this, Meta should release Llama 3 before GPT-5 arrives to keep things spiced up and keep the open source community thriving.


     
     

Best Firms for Data Scientists >>

     

[Best Firms for Data Scientists is one of India's biggest workplace certification platforms in data science. To nominate your organisation, you can fill out the form here.]


     

TOP STORIES OF THE WEEK >>

     

Jensen Huang & His Newfound Obsession

Jensen Huang, the CEO of the world’s third-most valued company, has been on a side quest to spread one message: every nation, regardless of its size or resources, must develop its own ‘sovereign AI’. The man in black has travelled to over a dozen geographies. Click here to know why.


Why Do Big Tech LLM Chatbots Have the Worst Possible Names?


Big tech majors appear to be betting on fresh monikers for their LLM chatbots, moving away from human-like names such as Alexa, Siri and Cortana to more non-human (read: boring) ones. Read on to find out why.


VFX Industry Will Make Their Own ‘Sora-Like’ GenAI Tools


While Sora is impressive, will it contribute to the growing concerns about potential job losses for VFX artists? Click here to find out.


     

PEOPLE & TECH >>

     

CoRover.ai is the Silent Winner of Indian LLM Race

Ankush Sabharwal, the co-founder of CoRover.ai, has had a busy few months building BharatGPT. Most recently, his company launched an educational tablet called Milkyway, which is powered by CoRover’s BharatGPT virtual assistant and includes video and chatbots for students.


In an exclusive interview with AIM, Sabharwal spoke about how CoRover.ai’s BharatGPT was built and what exactly it offers. Check out the complete interview here.


How Epsilon is Navigating DE&I in Tech

The journey to equality has been a bit of a tedious trek for the LGBTQ+ community in tech (and elsewhere). Sharing a similar story is Joseleen Princy C, a senior business system analyst at Texas-based advertising and technology company Epsilon. Click here to know more about her journey.


     

AIM VIDEOS >>

     

The Beginning of ISRO's success story: Story Kya Hai


The renewed 'Story Kya Hai' focuses on fresh hopes, cutting-edge technologies, and new initiatives in India's AI, space, Agri-tech and Data Centre sectors. 


In this episode, learn about the ongoing revolution shaping the future of India's space exploration through a historical lens, from the early days of initiatives such as Earth observation data, ground stations and launch capabilities to present-day advancements.


     

AIM EVENTS >>

     

The Rising 2024 


The Rising 2024 stands as a powerful testament to diversity and inclusion in tech, which is not limited to an organisation's work culture. It goes beyond processes, products and services. To find out more about the event and book your tickets for the most exciting conference of 2024, click on this link.


Location: Hilton Convention Center, Manyata Tech Park, Bengaluru

Date: April 4-5, 2024 

     



© 2023 Analytics India Magazine Pvt Ltd and/or its affiliates. All rights reserved. For more information, email info@analyticsindiamag.com

   
   
 
Analytics India Magazine | 280, 2nd floor, 5th Main, 15 A cross, Sector 6, HSR layout Bengaluru, Karnataka 560102
