Turning my CORD19 competition project into an open source project
For several months now I have been planning to turn my RedisHack/Kaggle competition project into a set of smaller open source projects.
Here is the first one: a RedisGears-based NLP pipeline (https://github.com/applied-knowledge-systems/the-pattern-platform) that turns text into a knowledge graph (stored in RedisGraph) using a medical ontology (Metathesaurus). Why Redis (Gears)? This pipeline takes about 6 hours to process 50K articles, with a peak of 80 GB of RAM. Processing the same 50K articles with Python's scispaCy (and loading the results into Neo4j) takes about a week. UI, API, BERT QA and BERT summary deployments will follow.
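To put the two timings side by side, here is a rough back-of-the-envelope calculation. It only uses the approximate figures quoted above and assumes that "about a week" means seven full days of processing; the function and variable names are illustrative and not part of the project.

```python
# Rough throughput comparison based on the approximate figures in this post.
ARTICLES = 50_000
GEARS_HOURS = 6          # "about 6 hours" with the RedisGears pipeline
SCISPACY_HOURS = 7 * 24  # "about a week", assumed to mean 7 full days


def throughput(articles: int, hours: float) -> float:
    """Return the number of articles processed per hour."""
    return articles / hours


gears_rate = throughput(ARTICLES, GEARS_HOURS)
scispacy_rate = throughput(ARTICLES, SCISPACY_HOURS)
speedup = SCISPACY_HOURS / GEARS_HOURS

print(f"RedisGears: ~{gears_rate:.0f} articles/hour")
print(f"scispaCy:   ~{scispacy_rate:.0f} articles/hour")
print(f"Speedup:    ~{speedup:.0f}x")
```

Under those assumptions the difference is a bit more than an order of magnitude, paid for with the 80 GB peak memory footprint.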
This is my attempt to turn the competition project into something that will hopefully be useful to others. Leave a comment, or open a PR or an issue, to help me keep working on it.
Written on December 28, 2020 by Alex Mikhalev.
Originally published on Medium