- Big Data News Weekly
📐 Evaluating Perplexity on Language Models
🦾Plus: 💸 SoftBank completes $40B OpenAI investment

Hey folks! Let’s get into Big Data and AI craziness…
In today's edition: What's Shaping the Future of Data?
🔮Top 20 Data Science Trends and Predictions For 2026
⚡Speculative Decoding Models
📊10 Must-Know Concepts Every Analyst Should Know
🤖PyTorch OpenEnv: Environments for Agentic RL training
✨ Satya Nadella: AI to shift from ‘spectacle’ to ‘substance’
💡 AI Tutorial: How to create on-brand animations with AI
🤖 AI Tools and Data Tools to check out

A language model is a probability distribution over sequences of tokens. When you train one, you want to measure how accurately it predicts human language use. That is hard to judge directly, so you need a metric to evaluate the model. In this article, you will learn about the perplexity metric. Specifically, you will learn what perplexity is and how to compute it, and how to evaluate the perplexity of a language model with sample data.
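As a rough illustration of the metric, here is a minimal sketch that uses the standard definition of perplexity: the exponential of the average negative log-probability the model assigned to each token that actually occurred. The function name and the sample probabilities are made up for this example; they do not come from the linked article or from any real model.

```python
import math


def perplexity(token_probs):
    """Perplexity of one sequence.

    token_probs holds the probability the model assigned to each token
    that actually appeared, in order. Values here are hypothetical.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token (natural log).
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Exponentiating turns it back into an "effective number of choices".
    return math.exp(avg_nll)


# Hypothetical per-token probabilities for two imaginary models.
confident_model = [0.5, 0.5, 0.5, 0.5]
unsure_model = [0.25, 0.25, 0.25, 0.25]

print(perplexity(confident_model))
print(perplexity(unsure_model))
```

Lower is better: a model that gives every correct token probability 0.5 behaves as if it were choosing between 2 equally likely options at each step, while one that gives 0.25 behaves as if it were choosing between 4.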

The rise of data science as a field of study and practical application over the last century has driven advances in technologies such as deep learning, natural language processing, and computer vision.

SpecBundle Phase 1 is a set of production-ready EAGLE-3 checkpoints trained with industry partners to improve real-world speculative decoding. The release focused on instruct-tuned models and shipped alongside SpecForge v0.2, which added major system refactors and multi-backend support.

I put together 10 foundational principles that shape how data scientists reason about and communicate data. Think of them as mental shortcuts that help interpret the world through data. Some you might remember from university; others are probably unwritten rules passed down in Slack threads and analysis readouts, the things every data scientist somehow just knows.

In this studio, we are building and running agentic reinforcement learning (RL) environments using OpenEnv, an open-source framework by Meta’s PyTorch team. It provides a standard for interacting with agentic execution environments through simple Gymnasium-style APIs — step(), reset(), and state().
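To give a feel for what a Gymnasium-style loop built on step(), reset(), and state() looks like, here is a minimal toy sketch in plain Python. It does not import or use OpenEnv; the GuessNumberEnv class, the StepResult type, and the run_episode helper are all hypothetical names invented for this illustration, and the real framework's types and signatures may differ.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    # Hypothetical result type, not OpenEnv's actual class.
    observation: str
    reward: float
    done: bool


class GuessNumberEnv:
    """Toy environment: the agent must guess a hidden integer."""

    def __init__(self, target: int, max_steps: int = 5):
        self._target = target
        self._max_steps = max_steps
        self._steps = 0

    def reset(self) -> StepResult:
        # Start a fresh episode.
        self._steps = 0
        return StepResult("guess a number", 0.0, False)

    def step(self, action: int) -> StepResult:
        # Apply one action and report observation, reward, and termination.
        self._steps += 1
        if action == self._target:
            return StepResult("correct", 1.0, True)
        hint = "higher" if action < self._target else "lower"
        return StepResult(hint, 0.0, self._steps >= self._max_steps)

    def state(self) -> dict:
        # Episode metadata, separate from what the agent observes.
        return {"steps": self._steps, "max_steps": self._max_steps}


def run_episode(env, lo=0, hi=100):
    """A simple binary-search 'agent' driving the environment."""
    result = env.reset()
    total_reward = 0.0
    while not result.done:
        guess = (lo + hi) // 2
        result = env.step(guess)
        total_reward += result.reward
        if result.observation == "higher":
            lo = guess + 1
        elif result.observation == "lower":
            hi = guess - 1
    return total_reward, env.state()


reward, info = run_episode(GuessNumberEnv(target=37, max_steps=10))
print(reward, info)
```

The point of the pattern is that the agent only ever talks to the environment through those three calls, so the same training loop can be pointed at a different environment without changes.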
👨💻 Data Tools, Libraries
Stirrup (GitHub Repo)
Stirrup is a framework for building agents that lets models choose their own approach to completing tasks.
ExecuTorch (GitHub Repo)
ExecuTorch is a solution for deploying AI models on-device. Built by PyTorch for privacy, performance, and portability, ExecuTorch powers Meta's on-device AI across Instagram, WhatsApp, Quest 3, Ray-Ban Meta Smart Glasses, and more.
git-backup
git-backup is a command-line tool for backing up your Git repositories to Amazon S3 or any S3-compatible storage.
AI News:

SoftBank has reportedly completed its $40B investment in OpenAI, according to CNBC — wiring the final $22B+ last week after months of asset sales and fundraising to pull together the largest single bet on the AI race. To fund the deal, Masayoshi Son sold SoftBank’s entire $5.8B Nvidia stake, $4.8B of T-Mobile shares, and also slowed his Vision Fund dealmaking.

Microsoft’s CEO Satya Nadella just shared his 2026 outlook, arguing that AI is entering a phase where we can distinguish between “spectacle” and “substance,” and that success will be measured less by model breakthroughs and more by real outcomes. Nadella said AI is shifting from discovery to diffusion, with capabilities outpacing our ability to turn them into real impact, creating a “model overhang.”

OpenAI has rolled out a real Thinking mode for its ChatGPT Android app, giving users control over how long the AI processes a request before responding. The update replaces the earlier toggle that only simulated extended reasoning and now matches the desktop’s “Extended Thinking” feature.

OpenAI's average stock-based compensation for its roughly 4,000 employees is about $1.5 million per employee. The company's equity awards, aimed at helping it keep its lead in the AI race, are inflating its heavy operating losses and diluting existing shareholders. OpenAI recently announced the discontinuation of a policy that required employees to work at the company for at least six months before their equity vests.

The U.S. granted Samsung and SK Hynix annual licenses to import chipmaking equipment to their China facilities for 2026. Samsung and SK Hynix rely on Chinese operations for traditional memory chips, whose prices have surged on AI data center demand and tight supply.
AI Tutorial
How to create on-brand animations with AI

Visit Pomelli and click ‘Get Started.’
Add your website or product link to auto-import your brand.
Review and tweak your Business DNA (colors, fonts, tone, messaging).
Describe your campaign.
Sample Prompt: “Launch an AI writing assistant that saves marketers hours each week.”
Pick a generated concept.
Edit text, visuals, colors, and resize for any channel. Use Fix Layout to clean it up.
Click ‘Animate’ and export your on-brand animation.
🔥Top AI tools to increase productivity:
Nectar AI is an AI companion platform where users can create and roleplay
FeatureShark is an all-in-one platform designed to revolutionize how you collect and manage customer feedback.
BeadPattern, AI-powered perler bead pattern maker with smart color matching.
Brandmaven is a brand intelligence platform for marketers, powered by AI.
SEOzilla - Let AI Transform Your SEO: Publish Articles That Rank Every Day
SellerPic is an AI SaaS platform purpose-built for e-commerce sellers
Undress AI Tool is a website that offers a deepnude application, allowing users to create modified images
Screenloop is the ultimate Talent Operations Platform, seamlessly integrating a next-gen ATS
View our database of all the best AI tools for your needs: aitoolsup.com
Have cool resources to share? Submit AI tool
A.I. Generated Image of the Day
👀 The Borge!

SPONSOR US
Get your product in front of Big Data & AI enthusiasts
Our newsletter is read by thousands of tech professionals, investors, engineers, managers, and business owners around the world.
Interested in Sponsoring the Big Data News Weekly Newsletter? Get in touch today




