Building a Pipeline for Natural Language Processing using RedisGears

Architecture Diagram

Abstract

In this tutorial, you will learn how to build a pipeline for Natural Language Processing(NLP) using RedisGears. For this demonstration, we will be leveraging the Kaggle CORD19 datasets. The implementation is designed to avoid running out of memory, leveraging Redis Cluster and RedisGears, where the use of RedisGears allows for processing data on storage without the need to move data in and out of the Redis Cluster—using Redis Cluster as data fabric. Redis Cluster allows for horizontal scalability up to 1000 nodes, and together with RedisGears, provides a distributed system where data science/ML engineers can focus on processing steps, without the worry of writing tons of scaffoldings for distributed calculations.

Publication
developers.redislabs.com
Dr Alexander Mikhalev
Dr Alexander Mikhalev
AI/ML Architect

I am highly experienced technology leader and researcher with expertise in Natural Language Processing, distributed systems including distrbuted sensors and data.

Related