
Thursday, 10 April 2025
By Vishal Mathur

Good morning!

De-extinction. A term we are certain to hear more of in the coming months and years. Humanity has perhaps found its latest look-good project, or maybe there’s more to this. A question to which an answer isn’t exactly clear, but may become so in due course — what do humans want to achieve with de-extinction? The reason I wanted to chat about something that isn’t exactly technology (this is more science, but the intersections are blurring) is that earlier this week, American biotechnology and genetic engineering company Colossal Laboratories & Biosciences staked a claim that the dire wolf has become the first animal to be resurrected from extinction (I have a particular fascination with one day photographing wolves in the forests of Poland; don’t ask). Romulus and Remus, their first photos released now that they are a bit older than 6 months, are undoubtedly a furry bundle of cuteness. But are we doing things right with nature?


Are humans approaching this (I talk broadly about de-extinction; I’m not questioning Colossal’s intent or undeniable achievements) as a way to learn, heal ecosystems, perhaps atone for past mistakes? Or is this designed for profit? Think about it, as I circle back to Colossal’s approach. “Leading edge of genetic engineering and restorative biology” is how they define their work (CRISPR genome editing is crucial, as are cloning techniques such as somatic cell nuclear transfer), and the plan is to bring the Woolly Mammoth back within the next five years. In a way, this may be a great service for our generation and the ones to follow.

The dire wolf (Aenocyon dirus) roamed the earth (it’s believed to have predominantly populated North America) during the Pleistocene epoch, going extinct between 10,000 and 13,000 years ago, at the end of the last Ice Age. This predator, incredibly powerful and larger than today's gray wolves, has seen multiple depictions in popular culture.

There can of course be an argument that Romulus and Remus aren’t thoroughbred dire wolves, and that argument does have legs to stand on. Are they the "real" extinct species or just hybrids? This de-extinction methodology may well be limited to species with recoverable DNA and close living relatives. Romulus and Remus, along with their sister Khaleesi, are genetically modified gray wolves (Canis lupus), created with the specific goal of replicating the extinct dire wolf’s physical appearance.

Weight may prove critical in answering whether the hybrid approach works. Romulus and Remus reportedly already weigh around 80 pounds; by the time they are fully mature, they may each tip the scales at around 140-150 pounds (roughly 63-68 kilograms), which would be closer to what scientists believe dire wolves weighed in peak health. For context, typical gray wolves generally weigh between 70 and 145 pounds, depending on the subspecies.

HUMAN-LIKE?

OpenAI’s GPT-4.5 and Meta’s Llama models have passed the Turing Test, a benchmark proposed by Alan Turing in the 1950s to assess whether a machine can exhibit intelligent behaviour indistinguishable from a human’s. Researchers Cameron R. Jones and Benjamin K. Bergen from the University of California San Diego found that GPT-4.5 performed so convincingly that judges identified it as human 73% of the time — significantly more often than they correctly identified actual human participants. Meta’s Llama-3.1-405B achieved a 56% success rate, essentially matching human performance (around 50%). That leads me to another question, of which there seem to be many this week — is AI really behaving and responding in a more human-like manner, or have we tuned our minds in such a way that AI gains more acceptability in our lives and workflows if we feel it is more human?
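If you’re curious how a figure like that 73% comes about, here is a minimal sketch of how a win rate could be tallied in the three-party setup the researchers used, where a judge chats with a human and an AI witness and then picks which one is the human. The verdicts below are invented purely for illustration; they are not the study’s data.

# A minimal sketch of tallying a Turing-test "win rate" from judge verdicts.
# The verdicts are invented for illustration, not the study's actual data.
from collections import Counter

# Each entry records which witness the judge picked as "human" in one trial
# where the AI witness was, hypothetically, GPT-4.5 with a persona prompt.
verdicts = ["ai", "ai", "human", "ai", "human", "ai", "ai", "ai", "human", "ai"]

counts = Counter(verdicts)
win_rate = counts["ai"] / len(verdicts)
print(f"AI judged to be the human in {win_rate:.0%} of trials")
# A rate near 50% means judges are effectively guessing; well above 50%
# (as reported for GPT-4.5) means the model is picked as human more often
# than the actual human is.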

Read more about the Turing Test, and what it means…

(Premium): Two AI models pass benchmark Turing Test, blurring line between human, machine

Human-like behaviour key to AI models passing the Turing Test

AI PERCEPTION

If you are using Gemini or Copilot on your phone, there’s a big change on the way. Google has now unlocked some new smarts with Gemini Live — basically, an ability to talk live with the Gemini AI about anything you see, whether that’s something you point at in the physical world using your phone’s camera, or whatever is on your phone’s screen at the time. A while ago, I had termed this AI vision as both fascinating and terrifying. For now, these Gemini Live features are available across Google’s Pixel 9 series, as well as Samsung’s Galaxy S25 smartphones — and without needing a Gemini Advanced subscription plan. Expect a wider rollout in the coming weeks, though Gemini Live as it stands can be accessed on any other recent Android phone if you have a subscription in place.

It’s a similar tale of versatility for Microsoft’s Copilot Vision, which makes its way to Android phones, the Apple iPhone, as well as Windows PCs (more so, as a native app). The vision (no pun intended) is for users to interact with their surroundings in real-time using their phone's camera, or through their screen on Windows PCs. On mobile, it will be able to analyse a real-time video feed, or photos, to provide information and suggestions (e.g., identifying plants, offering design tips). The native Windows app allows users to call upon Copilot while working across multiple applications, browser tabs, or files, enabling tasks like searching, changing settings, organising files, and collaborating on projects without switching apps. Why am I not too enthusiastic about Microsoft, Windows and AI?

Read our coverage of phones and PCs adopting AI smarts…

Nothing Phone 3(a) series’ graceful evolution blends AI with pristine refinement

Apple MacBook Air and Mac Studio refresh sets M4 and M3 Ultra groundwork for AI

We want Apple Intelligence to be locally relevant: Apple’s Bob Borchers

Intel’s AI chips rest hopes on Asus Vivobook Flip 14’s versatile consistency

In a Copilot+ PC era, HP’s flagship EliteBook Ultra bets on unique AI smarts

OPEN AND REASONING

Late last month, Google rolled out Gemini 2.5, which they claim is their “most intelligent AI model” yet. Gemini 2.5 Pro Experimental, they insist, leads the benchmarks quite significantly. A few days later, this model, which was first rolled out only for premium subscribers, was made available to everyone willing to try it — albeit with rate limits. And Google says they’re looking at ways to make this available on the Gemini app too — for now, it’s desktop only.
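If you’d rather poke at it programmatically, here is a minimal sketch using Google’s google-generativeai Python SDK. The model name and the input file are assumptions on my part, and the rate limits on the experimental tier would still apply.

# A minimal sketch, assuming API access via the google-generativeai SDK.
# The model name below is an assumption; check the current model list.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-2.5-pro-exp-03-25")

# count_tokens shows how much of the large context window a prompt consumes
long_document = open("quarterly_report.txt").read()  # hypothetical file
print(model.count_tokens(long_document))

# A single call can then reason over the whole document at once
response = model.generate_content(
    f"Summarise the key risks discussed in this report:\n\n{long_document}"
)
print(response.text)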

This model is positioned as a "thinking model" with enhanced reasoning and coding capabilities, defined by an ability to deploy multi-step logic and nuance, and, for the sake of benchmarks, maths and coding. It has a significantly large context window of up to 1 million tokens in its experimental form — simply put, this means it can process and understand massive amounts of information in a single prompt. Interesting to note, Gemini 2.5 Pro achieved a leading score on the "Humanity's Last Exam" (HLE) benchmark, designed to assess complex reasoning and expert-level thinking.

I’ll talk about this for a moment. Humanity's Last Exam (HLE) is a recent benchmark whose primary focus is to evaluate the advanced reasoning and knowledge capabilities of AI models across a number of academic disciplines. It is designed to be a significantly more challenging test than existing benchmarks have proved to be, keeping in tune with the rapid evolution of AI. Scores are based on accuracy, as well as calibration error. On this benchmark, Gemini 2.5 Pro led the way with an 18% score, with OpenAI’s GPT-4o (14%), GPT-4.5 (6.4%) and Claude 3.7 Sonnet 64k Extended Thinking (8.9%) following it.
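To make those two quantities a little more concrete, here is a minimal sketch with invented answers and confidences. HLE’s own calibration metric is more involved (it bins answers by confidence), but the underlying idea is the same: confident wrong answers get penalised.

# A minimal sketch of the two quantities HLE reports: accuracy, and a
# calibration error comparing a model's stated confidence with how often
# it is actually right. The answers and confidences below are invented.
responses = [
    # (did the model answer correctly?, model's self-reported confidence)
    (True, 0.9), (False, 0.8), (True, 0.6), (False, 0.7), (True, 0.95),
]

accuracy = sum(correct for correct, _ in responses) / len(responses)

# One simple formulation: the average gap between confidence and correctness.
calibration_error = sum(
    abs(conf - (1.0 if correct else 0.0)) for correct, conf in responses
) / len(responses)

print(f"accuracy: {accuracy:.0%}, calibration error: {calibration_error:.2f}")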

SCOUTS, MAVERICKS AND BEHEMOTHS

Things began well for Meta when they announced a complete collection of Llama 4 models. There’s the Llama 4 Scout, which is claimed to be a small model capable of “fitting in a single Nvidia H100 GPU” (with a 10M context window), the Llama 4 Maverick, which will rival the GPT-4o and Gemini 2.0 Flash models, and the still-being-trained Llama 4 Behemoth, which Meta CEO Mark Zuckerberg claims will be the “highest performing base model in the world”.

I’ll go back to the Llama 4 Scout for a moment, and specifically the 10M context window (10M being 10 million tokens). Higher is better for processing complex, successive sequences of information and analysis. Compared to the Llama 4 Scout’s 10 million-token context window, Anthropic's Claude 3.5 Sonnet and Claude 3.7 Sonnet have a 200,000-token context window, while OpenAI’s GPT-4.5 and the o1 family offer 128,000-token context windows; incidentally, that’s the same as Mistral Large 2 and DeepSeek R1. Impressive?
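To put those numbers in perspective, here is a rough back-of-the-envelope sketch converting tokens into approximate words and pages. The conversion factors are common rules of thumb, not exact figures for any particular tokeniser.

# A rough comparison of the context windows named above.
# ~0.75 words per token and ~500 words per page are rules of thumb.
context_windows = {
    "Llama 4 Scout": 10_000_000,
    "Claude 3.5 / 3.7 Sonnet": 200_000,
    "GPT-4.5 / o1 / Mistral Large 2 / DeepSeek R1": 128_000,
}

for model, tokens in context_windows.items():
    words = tokens * 0.75
    pages = words / 500
    print(f"{model}: {tokens:,} tokens ≈ {words:,.0f} words ≈ {pages:,.0f} pages")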

That is where things went pear-shaped for Meta. AI researchers began to dig through the benchmarking done by the open platform LMArena. Turns out, there was fine print that even the benchmark platform wasn’t aware of; they released a statement later. Meta had shared with them a Llama 4 Maverick model that was “optimised for conversationality”. Not exactly the spec customers would get, is it?

“Meta’s interpretation of our policy did not match what we expect from model providers. Meta should have made it clearer that ‘Llama-4-Maverick-03-26-Experimental’ was a customised model to optimise for human preference. As a result of that we are updating our leaderboard policies to reinforce our commitment to fair, reproducible evaluations so this confusion doesn’t occur in the future,” read LMArena’s statement; about as factual as it gets, amidst their understandable frustration.



Written and edited by Vishal Shanker Mathur. Produced by Md Shad Hasnain.
