By Anastasia Grinevich October 5, 2023 October 9th, 2023
This week’s AI news roundup highlights the latest developments in AI solutions for businesses of all sizes.

Let’s dive in!

AI models and capabilities

OpenAI introduces new voice and image capabilities for ChatGPT

OpenAI is adding voice and image capabilities to ChatGPT, so users can hold spoken conversations with it and share images for various tasks.

With voice, users can ask questions, give instructions, and discuss topics much as they would with a person. With images, users can share a picture and get information about it, such as the objects and scenes it shows, or have ChatGPT generate a text description.

IBM rolls out new generative AI features and models

IBM has announced new generative AI models and capabilities for its Watsonx data science platform. The new models, called the Granite series, can summarize, analyze, and generate text. IBM is also launching Tuning Studio, a tool for tailoring generative AI models to customers' own data, and a synthetic data generator for tabular data.

Starting in Q4 2023, customers will be able to discover, augment, visualize, and refine data for AI through a self-service tool. Watsonx.data will also gain a vector database capability to support retrieval-augmented generation (RAG).

Meta is reportedly working on a new AI model to rival GPT-4

Meta is developing a more powerful chatbot that could be as sophisticated as OpenAI’s GPT-4. The company is buying more AI training chips and building out data centers to support the project. Meta also assembled a group earlier this year to build the model, with the goal of accelerating the creation of AI tools that can emulate human expressions.

AI advancements in healthcare

Paige and Microsoft build AI model to beat cancer

Microsoft and digital pathology provider Paige are teaming up to build the world’s largest image-based AI model for identifying cancer. The model is training on an unprecedented amount of data, including billions of images, and can identify both common and rare cancers. Researchers hope it will help doctors contend with staffing shortages and growing caseloads.

The model is orders of magnitude larger than existing image-based AI models and is training on 4 million slides.

AI in language processing

Tencent unveils its AI model, offering enterprise access

Tencent debuted its foundation model, Hunyuan, at its Global Digital Ecosystem Summit. Hunyuan offers strong Chinese language processing, advanced logical reasoning, and reliable task execution. It supports a wide array of functions, including image creation, copywriting, text recognition, and customer service.

Enterprises can use Hunyuan to build powerful tools and train their own unique large models. Hunyuan has over 100 billion parameters and was pre-trained on more than two trillion tokens.

Open-source AI models

Alibaba opens AI model Tongyi Qianwen to the public

Alibaba has opened its AI model Tongyi Qianwen to the public, signaling regulatory approval to mass-market the model. The company said organizations including OPPO, Taobao, DingTalk, and Zhejiang University have reached agreements to train their own large language models or develop applications based on Tongyi Qianwen. An open-source version will be available for free commercial use in the near future.

Math problem-solving AI

MathGPT Pro: AI-powered math problem solver

MathGPT Pro is a new AI tool that solves math problems, from basic arithmetic to complex calculus. It generates step-by-step solutions, making it ideal for students learning new concepts.

MathGPT Pro is easy to use. Simply enter a problem in the search bar and click “Solve.” MathGPT Pro will display the solution and a step-by-step explanation.

It also improves over time: as more users solve problems with MathGPT Pro, its solutions become more accurate.

AI-powered tools for content creation

Hugging Face presents CLIP Interrogator 2.1

CLIP Interrogator 2.1 is a tool that helps users generate effective prompts for creating new images that resemble an existing image. This version is specialized in producing prompts compatible with Stable Diffusion 2.0 and uses the ViT-H-14 OpenCLIP model.

Stable Audio: Fast timing-conditioned latent audio diffusion

Stability AI’s Stable Audio is a new tool that lets users create original audio by entering a text prompt and a duration. It uses a latent diffusion model for audio to generate high-quality, 44.1 kHz stereo audio and was trained on data from AudioSparx, a leading music library.

HeyGen offers AI-powered video translation

HeyGen’s new tool can translate videos up to 5 minutes long into different languages, clone the speaker’s voice, and adjust lip movements accordingly. Early tests show promising results, with high-quality translations and realistic-looking lip sync. However, the synthetic voice still sounds slightly robotic and the face appears brighter after translation.

DiffBIR: Blind image restoration with generative diffusion prior

DiffBIR is a computer vision project that aims to restore degraded images without knowing the specific degradation process. It relies on a generative diffusion prior: a pretrained diffusion model whose learned sense of what natural images look like guides the restoration.

DiffBIR can improve images affected by noise, blur, or other imperfections. It is a valuable tool for researchers and developers working on image processing and restoration tasks.

AI-powered website building

Vzy: AI-powered website builder for everyone

Vzy is an AI-powered website builder that lets users create websites quickly without coding or design skills. It offers AI integration, mobile editing, SEO readiness, secure hosting, and various customization options. Users build sites from Vzy’s templates and tools, which are designed to be user-friendly and efficient for individuals and businesses looking to establish an online presence.


