🤖 The Complete Guide to Inference Caching in LLMs
🦾Plus: 🎨 Anthropic Launched Claude Design

Hey folks! Let’s get into Big Data and AI craziness…
In today's edition: Claude's 1M context window changes how you build. Cloudflare ships persistent agent memory.👇
🤖 Top 15 AI Trends to Watch in 2026
📖 How to Use Claude’s 1M Context
🧠 Cloudflare Ships Agent Memory
🎓 GRPO: Train Gemma 4 with Reinforcement Learning
✌️ OpenAI is Losing Two of its Ambitious Architects
💡 AI Tutorial: How to Use Chrome’s New AI-Powered ‘Skills’
🤖 AI Tools and Data Tools to Check Out

Calling a large language model API at scale is expensive and slow. A significant share of that cost comes from repeated computation: the same system prompt processed from scratch on every request, and the same common queries answered as if the model has never seen them before. In this article, you will learn how inference caching works in large language models and how to use it to reduce cost and latency in production systems.
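To make the "same common queries answered from scratch" problem concrete, here is a minimal sketch of an exact-match response cache. It is an illustration under stated assumptions, not any provider's API: `ResponseCache` and `fake_model` are hypothetical names, and `fake_model` stands in for a real (slow, paid) model call. Provider-side prompt caching of a shared system prompt works differently and is not shown here.

```python
import hashlib
from typing import Callable, Dict


class ResponseCache:
    """Exact-match cache keyed on a hash of (system prompt, normalized query)."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        # call_model is a hypothetical stand-in for a real API client.
        self._call_model = call_model
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(system_prompt: str, query: str) -> str:
        # Lowercase and collapse whitespace so trivially different
        # spellings of the same query map to the same cache entry.
        normalized = " ".join(query.lower().split())
        payload = system_prompt + "\x00" + normalized
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def ask(self, system_prompt: str, query: str) -> str:
        key = self._key(system_prompt, query)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: pay for one model call, then remember the answer.
        self.misses += 1
        answer = self._call_model(system_prompt, query)
        self._store[key] = answer
        return answer


def fake_model(system_prompt: str, query: str) -> str:
    """Placeholder for an expensive model call (assumption for this sketch)."""
    return f"answer to: {query.strip().lower()}"


if __name__ == "__main__":
    cache = ResponseCache(fake_model)
    cache.ask("You are a helpful assistant.", "What is caching?")
    cache.ask("You are a helpful assistant.", "  what is   CACHING? ")
    print(f"hits={cache.hits} misses={cache.misses}")
```

The system prompt is part of the key on purpose: the same question under a different system prompt may deserve a different answer, so it should not share a cache entry.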

Artificial Intelligence is becoming an integral part of many organizations’ business plans. Digital transformation has already accelerated, thanks both to Machine Learning and Artificial Intelligence and to the pandemic. The full impact of giving machines the ability to make decisions, and so letting decisions be made far more quickly and accurately than humans ever could, is very difficult to conceive right now.

Anthropic breaks down when to continue, rewind, compact, clear, or spin up subagents in Claude Code, with a cheat sheet for managing context rot. Claude Code also gained /usage, a new slash command that helps you understand your usage. The feature was informed by a number of conversations with customers.

A new managed service from Cloudflare extracts facts, events, and instructions from agent conversations and retrieves them on demand, so agents stop losing context every time the conversation is compacted.

Unsloth now supports GRPO for Gemma 4, so you can RL fine-tune Google's latest model on a consumer GPU. A free Colab notebook from Unsloth walks through the full Sudoku training setup end to end. The model learns through trial and error with reward signals instead of memorizing solutions.
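For readers wondering what "reward signals" turn into during GRPO, the core idea is that several completions are sampled for the same prompt and each one is scored relative to the rest of its group. The sketch below shows only that group-relative advantage step. It is not Unsloth's code: the function name is hypothetical, and it assumes the population standard deviation, whereas real implementations may normalize differently.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against the other samples for the same prompt.

    Completions that beat the group average get a positive advantage,
    and those below it get a negative one. eps avoids division by zero
    when every completion in the group received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std, not sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical rewards for four sampled attempts at one puzzle:
    # 1.0 if the attempt solved it, 0.0 otherwise.
    rewards = [1.0, 0.0, 0.0, 1.0]
    print(group_relative_advantages(rewards))
```

If every attempt in a group earns the same reward, all advantages come out as zero, which is why a reward function needs to separate better attempts from worse ones for the model to learn anything.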
👨💻 Data Tools, Libraries
NativePHP is a framework for building desktop applications using PHP. It allows PHP developers to create cross-platform, native apps using familiar tools and technologies.
TypeScript Execute is Node.js enhanced with esbuild to run TypeScript and ESM.
Llama 2 is the next generation of Meta's open source large language model. It is available for free for research and commercial use.
AI News:

Maybe Figma and Canva should be worried after all. Anthropic announced the launch of Claude Design today, a new experimental product that lets users create visuals like prototypes, slides, one-pagers, and more using Claude. That said, Anthropic claims Claude Design is intended to complement other design tools, not replace them.

Have you ever been in a Zoom meeting with an AI-generated human? Well, now you’ll have a way to find out. Meeting platform Zoom announced a partnership with World, Sam Altman’s human ID verification company, to ensure that the people attending meetings are actually human and not AI-generated imposters.

OpenAI is losing two of the people responsible for its most ambitious projects: Kevin Weil, who led the company’s science research initiative, and Bill Peebles, the researcher behind AI video tool Sora, both announced their departures today. The departures follow OpenAI’s decision to cut back on “side quests,” including customer-facing bets like Sora and OpenAI for Science.

Well, next month might be tough for some. Meta is reportedly planning job cuts for May that could affect around 10% of its workforce, or around 8,000 employees. It’s also just the beginning… additional cuts are reportedly planned for later in 2026 that could lead Meta to cut as much as 20% of its workforce.

Google wired its AI Mode search directly into Chrome. Clicking a link now opens the page side-by-side with AI Mode instead of hijacking your tab. You can pull multiple open tabs, images, and PDFs into a single query.
AI Tutorial
How to Use Chrome’s New AI-Powered ‘Skills’

Go to the ‘Chrome Web Store’ and download the ‘Gemini in Chrome’ extension.
Once done, log in with your Gemini account and start using the Gemini sidebar.
Write any prompt you want to reuse (e.g., “Summarize this page” or “Turn this into a LinkedIn post”).
After sending the prompt, save it as a ‘Skill’ and give it a clear name.
When browsing any webpage, trigger your saved Skills by typing ‘/’ or ‘+’ and selecting a Skill from the list.
The Skill will automatically run your saved prompt on that page's content.
Use Skills for repetitive tasks like summarizing articles, drafting emails, analyzing tabs, or generating posts.
🔥Top AI tools to increase productivity:
Zoice is the single platform for every creator. Transcribe, generate, and animate your content.
Floowed is a flexible, no-code AI credit workflow automation platform
BookSwift is a modern appointment booking platform for providers
Marketsy.ai provides a smart e-commerce experience supported by a powerful admin panel.
StrideFuel - Built for weight loss success—especially GLP-1 users
WorldEngen is an AI copilot for 3D production that helps professional teams
AppWizzy is an AI tool that helps you build and host full-stack web applications
SongGuru.AI: An AI-Based Music Creation and Audio Processing Platform
View our database of all the best AI tools for your needs: aitoolsup.com
Have cool resources to share? Submit AI tool
A.I. Generated Image of the Day
👀 Rapture but it never fell





