How to Search Through 800 Billion Records in Real Time
Big (Huge) Data
- 60B+ files in the threat repository
- 150M+ files with new metadata every day
- 500+ interesting features per file
Advanced Search
Aggregating in ClickHouse
Just Make It Real-Time
A Completely Different Approach
The New Pipeline
Too Much Data
Processing Loop
Deduplicating Messages
Timed Cooldown
Delayed Processing
Acknowledge When Processed
Keep The Polls Going
Not Acking Intermediate Messages Is Safe
When All You Have Is A Hammer