Tokenizers in Language Models
Plus: 68% of tech support to be handled by AI

Hey folks! Let's get into Big Data and AI craziness…
In today's edition: What's Shaping the Future of Data?
40 Best Free and Open Source NoSQL Databases
Async Dev Agents with Memory, RAG & MCP
Speeding up LLMs with Turn-Based Reasoning
Graph Concepts in Java With Eclipse JNoSQL
ManusAI Launches "Slides"
Meta's AI Hits 1 Billion Users, Moves into Military Tech
AI Tutorial: Automate Educational Content Processing with Zapier
AI Tools and Data Tools to check out

Tokenization is a crucial preprocessing step in natural language processing (NLP) that converts raw text into tokens that can be processed by language models. Modern language models use sophisticated tokenization algorithms to handle the complexity of human language. In this article, we will explore common tokenization algorithms used in modern LLMs, their implementation, and how to use them.
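To make the idea concrete, here is a minimal sketch of one such algorithm. The blurb above does not name a specific method, so as an assumption this example uses byte-pair encoding (BPE), a widely used subword approach. The function names (`apply_merge`, `train_bpe`, `tokenize`) and the toy corpus are made up for illustration; production tokenizers typically work on bytes, pre-split text more carefully, and are far more optimized.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Return `symbols` with every adjacent occurrence of `pair` fused into one symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a whitespace-split corpus."""
    # Each distinct word starts as a tuple of single characters, with its count.
    words = Counter(tuple(word) for word in corpus.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pair_counts = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite the vocabulary with the most frequent pair fused.
        new_words = Counter()
        for symbols, freq in words.items():
            new_words[tuple(apply_merge(symbols, best))] += freq
        words = new_words
    return merges


def tokenize(word, merges):
    """Split `word` into characters, then replay the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return symbols


# Toy corpus (hypothetical), three merges only.
merges = train_bpe("low low low lower lowest newer newer", num_merges=3)
print(tokenize("lower", merges))   # ['low', 'er']
print(tokenize("slower", merges))  # ['s', 'low', 'er']
```

Note how "slower" never appears in the toy corpus but still splits into known pieces. That is the main appeal of subword tokenization: unseen words fall back to smaller units instead of an unknown token.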

According to the Stack Overflow Developer Survey 2025, PostgreSQL took over the first-place spot from MySQL. Professional developers are more likely than those learning to code to use PostgreSQL (50%), while those learning to code are more likely to use MySQL (54%).

Factory AI has launched Droids, a team of software development agents that don't just code but handle the entire engineering cycle. Each Droid is built for a specific task like debugging, spec writing, incident response, deep codebase search, or managing tickets. They connect to your

This paper by Apple and Duke researchers introduces a new framework where large language models alternate between reasoning steps and answer generation. This approach uses RL to reduce latency by 80% and improve accuracy by up to 19% on complex reasoning benchmarks.

We live in an era of polyglot persistence, where the guiding principle is to use the most appropriate data model for each use case. This article focuses on graph databases, their structure, practical applications, and how Java developers can leverage Eclipse JNoSQL and Jakarta Data to work seamlessly with them.
Data Tools, Libraries
Vibetest-use MCP: An MCP server that launches 10+ Browser-Use agents to test a vibe-coded website and flag UI bugs, broken links, accessibility issues, and other technical problems.
MCPglue: Lets agents create and run their own tools by wrapping multiple API endpoints into single, reusable functions that stay stable even when the underlying APIs change.
AI News:

Hume has been quietly working on something ambitious: a voice AI that doesn't just sound human, but feels human. With EVI 3, they may have just cracked it. EVI 3 is Hume's third-gen speech-language model, blending transcription, language, and speech into one.
Anduril and Meta have teamed up to make the world's best AR and VR systems for the United States Military.
Leveraging Meta's massive investments in XR technology for our troops will save countless lives and dollars.
- Palmer Luckey (@PalmerLuckey)
4:30 PM • May 29, 2025
Meta's AI ambitions now extend across scale and strategy, with its assistant quietly crossing 1 billion monthly users across WhatsApp, Instagram, Facebook, and Messenger. This milestone follows the launch of a standalone AI app and a roadmap that leans into voice, personalized help, and entertainment. Beyond the consumer and research layers, Meta is extending its AI capabilities into defense through a partnership with Anduril,

A Cisco report highlights that agentic AI (AI agents combining conversational skills with tool interaction) is set to transform the IT industry, especially customer service. Surveying nearly 8,000 business leaders worldwide, the report projects that by 2028, 68% of customer support interactions with tech vendors could be automated via agentic AI, with 93% believing this will improve personalization and efficiency.
Today, we are introducing Manus slides!
Manus creates stunning, structured presentations - instantly. With a single prompt, Manus generates entire slide decks tailored to your needs. Whether you're presenting in a boardroom, a classroom, or online, Manus ensures your message
- ManusAI (@ManusAI_HQ)
3:06 PM • May 29, 2025
Manus creates structured presentations instantly. With a single prompt, Manus generates entire slide decks tailored to your needs. Whether you're presenting in a boardroom, a classroom, or online, Manus ensures your message lands. Want edits? Just click and adjust.

Odyssey just showed off a demo of its new "interactive video" AI model. It lets users control and interact with AI-generated videos in real time, like stepping into a playable movie. The system renders a 360-degree video frame every 40 milliseconds (about 25 frames per second), enabling users to navigate immersive spaces with simple directional commands, no traditional game engine required.
AI Tutorial
Automate Educational Content Processing with Zapier

In this tutorial, you will learn how to create an automated system with Zapier Agents that transcribes lecture recordings, generates study materials, and builds quiz questions.
Step-by-step:
1. Visit Zapier Agents, click the plus button, and create a New Agent.
2. Configure your agent to trigger when new recordings are uploaded to a "Lectures" folder in Google Drive.
3. Add the essential tools: Google Drive to retrieve the file, ChatGPT to create a transcription and generate educational materials, and Google Docs to compile everything into organized documents.
4. Test your setup with a sample lecture and activate your agent.
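If it helps to picture the data flow, the steps above can be sketched in plain Python. This is not Zapier's API; Zapier Agents are configured in the web UI. Every name here (`StudyPack`, `process_lecture`, and the `fake_*` stand-ins) is hypothetical and only illustrates the order of the stages: retrieve a recording, transcribe it, then generate study notes and quiz questions from the transcript.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StudyPack:
    """Everything the agent is asked to produce for one lecture."""
    transcript: str
    study_notes: str
    quiz_questions: List[str]


def process_lecture(
    recording: bytes,
    transcribe: Callable[[bytes], str],
    write_notes: Callable[[str], str],
    write_quiz: Callable[[str], List[str]],
) -> StudyPack:
    """Run the stages in order; notes and quiz are both built from the transcript."""
    transcript = transcribe(recording)
    return StudyPack(
        transcript=transcript,
        study_notes=write_notes(transcript),
        quiz_questions=write_quiz(transcript),
    )


# Stand-in stages so the sketch runs offline. A real setup would call a
# speech-to-text model and an LLM here instead (assumption, not Zapier code).
def fake_transcribe(recording: bytes) -> str:
    return recording.decode("utf-8")


def fake_notes(transcript: str) -> str:
    return "Key point: " + transcript.split(".")[0].strip() + "."


def fake_quiz(transcript: str) -> List[str]:
    sentences = [s.strip() for s in transcript.split(".") if s.strip()]
    return [f"True or false: {s}?" for s in sentences]


pack = process_lecture(
    b"Tokens are subword units. BPE merges frequent pairs.",
    fake_transcribe,
    fake_notes,
    fake_quiz,
)
print(pack.study_notes)
print(pack.quiz_questions)
```

Passing the stages in as functions keeps the sketch testable with a sample lecture, which mirrors the last step of the tutorial: try the whole flow on one file before switching it on.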
Top AI tools to increase productivity:
Trolly AI: Revolutionizing SEO Content Creation with Advanced AI Technology.
Videotok is the perfect tool if you want to create viral videos without wasting time editing.
Marblism generates a fully functional web application from a single prompt.
Clipwing: A tool for cutting long videos into dozens of short clips.
aiPDF is an innovative, multi-modal tool designed to work with a wide array of inputs, including ebooks, web articles, YouTube videos, and podcasts.
Architecture Helper is a platform for analyzing and exploring various architectural styles.
View our database of all the best AI tools for your needs: aitoolsup.com
Have cool resources to share? Submit AI tool
A.I. Generated Image of the Day
Ancient Valyrian Wedding
