🤖 What AI Engineers Should Know about Search

🦾Plus: Binding Skin Tissue to Humanoid Robots 🤖

Hey folks! Let’s get into Big Data and AI craziness…

In today's edition:

  • 💡 Meta's LLM Compiler

  • 🔍 Building search-based RAG

  • 📚 A Guide for Beginner Data Scientists

  • 🧠 Beyond the Basics of Retrieval for Augmenting Generation

  • 🚀 Google launches Gemma 2, Gemini upgrades

  • 📰 OpenAI Partners with TIME Magazine

  • 🤖 AI Tools and Data Tools to check out

In the ever-evolving landscape of data analytics, staying ahead of the curve is essential to unlock new insights, drive informed decisions, and gain a competitive edge. From advanced algorithms to innovative technologies, the realm of data analytics is continuously breaking new ground, offering unprecedented opportunities for organizations to harness the power of data.

200+ hours of research on AI tools & hacks packed in 3 hours

The only AI Crash Course you need to master 20+ AI tools, multiple hacks & prompting techniques in just 3 hours.

Trust us, you will never waste time on boring & repetitive tasks, ever again!

Get the crash course here for free (valid for the next 24 hours only!)

This course on AI has been taken by 1 Million+ people across the globe, who have been able to:

  • Automate 50% of their workflow & scale their business

  • Make quicker & smarter decisions for their company using AI-led data insights

  • Write emails, content & more in seconds using AI

  • Solve complex problems, research 10x faster & save 16 hours every week

What should AI engineers know about search? At least 50 things :). You might be new to all this. I probably don’t need to discuss bi/cross-encoders, etc. A lot of great content is out there on those topics, especially from folks like Pinecone, targeting the AI / LLM / RAG crowd. But maybe you want to quickly get some high-level, historical, lexical search context… Well, I’m here for ya!

The Meta Large Language Model Compiler is a suite of models designed to emulate the compiler, predict optimal passes for code size, and disassemble code.

Retrieval Augmented Generation (RAG) enhances large language models by incorporating external knowledge through search engines to answer questions accurately. Simon Willison implemented this with Claude 3.5 Sonnet, using SQLite full-text search in Datasette and Val Town for prototyping, completing the task in a live coding session.
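To make the idea concrete, here is a minimal sketch of the "search, then prompt" loop behind search-based RAG. It is not Simon Willison's implementation: plain term-overlap scoring stands in for SQLite full-text search, the tiny `DOCS` corpus and the function names `search` and `build_prompt` are made up for illustration, and the final call to a model is left out.

```python
import re
from collections import Counter

# Hypothetical mini-corpus; a real setup would query a full-text index instead.
DOCS = {
    "gemma": "Google launched Gemma 2, an open lightweight model series.",
    "compiler": "Meta LLM Compiler models emulate the compiler and disassemble code.",
    "time": "OpenAI partnered with TIME to access its article archives.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(question, docs, k=2):
    """Return up to k document ids, ranked by how often question terms occur."""
    q_terms = set(tokenize(question))
    scored = []
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in q_terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken alphabetically by id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


def build_prompt(question, docs, k=2):
    """Assemble the retrieved passages and the question into one prompt string."""
    hits = search(question, docs, k)
    context = "\n".join(f"- {docs[doc_id]}" for doc_id in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("What does the Meta compiler do?", DOCS))
```

The string returned by `build_prompt` is what would be sent to the language model; swapping `search` for a real full-text or vector index leaves the rest of the flow unchanged.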

Welcome to the exciting world of data science! You're about to embark on a journey of discovery, where you'll learn to extract valuable insights from data and use them to solve real-world problems. This guide will equip you with the foundational knowledge and practical tips to unlock data's full potential and become a successful data scientist.

👨‍💻 Data Tools, Libraries 

Corcel (GitHub Repo) 

Corcel is a collection of PHP classes that provides a fluent interface for connecting and getting data directly from a WordPress database.

Dorkly Feature Flags (GitHub Repo)

Dorkly is a free feature flag system for LaunchDarkly's open source SDKs. It allows developers to implement feature flagging consistently across dozens of languages.

Midday is an all-in-one tool designed to help freelancers, contractors, consultants, and solo entrepreneurs manage their business operations more efficiently. It integrates various functions typically scattered across multiple platforms into a single, cohesive system.

A cross-platform asciinema (v2) terminal session recorder for macOS, Linux, and Windows. Currently a better choice than the official one.

AI News:

University of Tokyo researchers developed a new technique to bind living human skin to robotic faces, potentially enabling more lifelike androids as well as medical applications.

Google just launched Gemma 2, the next generation of its open lightweight AI model series — alongside new upgrades to its Gemini 1.5 Pro model including a 2M token context window and enhanced coding capabilities.

OpenAI's partnership with TIME grants access to a century's worth of archives. This historical content will enhance products like ChatGPT, allowing the AI to reference TIME articles in user queries with proper attribution. In return, TIME will leverage OpenAI's technology for innovative product development.

Meta's CEO, Mark Zuckerberg, announced the integration of AI characters on Instagram, developed through Meta's AI Studio. These initial tests will surface AI chatbots from popular creators like Wasted and Don Allen Stevenson III, primarily in messaging.

Amazon's "Just Walk Out" tech at its Fresh and Go stores drew criticism when it was found to rely heavily on manual checks by workers in India. Despite Amazon's defense, this incident highlights AI washing: overstating AI's role and effectiveness.

AI Tutorial

📊 Use ChatGPT to turn docs into spreadsheets

With a simple prompt, ChatGPT can analyze documents, answer questions, perform calculations, and create a downloadable spreadsheet — all in one conversation!


  1. Log in to ChatGPT and upload your document to the chat (remember to remove any sensitive data before submitting).

  2. Ask questions about it, e.g., "How much is the security deposit?"

  3. Request calculations, like the total first month's costs, including deposits and fees.

  4. Simply prompt ChatGPT to “create a downloadable budget spreadsheet”, specifying the time frame and desired columns.

Note: Only share documents you're comfortable with. Be cautious with sensitive information and consider redacting critical details before uploading.

 🔥Top AI tools to increase productivity: 

  1. AI Writer is the only AI writing platform built to be trusted!

  2. MimicPC enables access to popular open-source AI applications from any device’s browser.

  3. Chat Thing gives you the tools to make AI assistants and bots trained on your content.

  4. ResumeBoostAI is a tool that improves resume bullet points, creates cover letters, answers common job questions and more using AI.

  5. X Headshot is an AI headshot generator that turns your selfies into professional AI headshots.

  6. Markero, an all-in-one marketing tool equipped with artificial intelligence, democratizes advanced marketing techniques.

  7. DOMYSHOOT revolutionizes product photography with cutting-edge AI technology.

  8. Owlbot offers a cutting-edge AI-powered chatbot service that seamlessly integrates with your data to provide instant responses.

View our database of all the best AI tools for your needs: aitoolsup.com 

Have cool resources to share? Submit an AI tool


A.I. Generated Image of the Day

👀 Alabama Avengers (source)

AI Tools Up Newsletter

Receive a weekly email with updates on new AI tools, helpful prompts, and the latest AI developments. Join over 8,000 professionals from Google, OpenAI, Notion, Apple, and more.


Get your product in front of Big Data & AI enthusiasts

Our newsletter is read by thousands of tech professionals, investors, engineers, managers, and business owners around the world.

Interested in Sponsoring the Big Data News Weekly Newsletter? Get in touch today

Read news on Big Data | Data Science | AI | ML | NoSQL | ChatGPT | IoT | Cloud

What did you think of today's email?

Your feedback helps me create better emails for you!
